DIALOGUES IN MUSIC THERAPY AND MUSIC NEUROSCIENCE: COLLABORATIVE UNDERSTANDING DRIVING CLINICAL ADVANCES

EDITED BY: Julian O'Kelly, Jörg C. Fachner and Mari Tervaniemi

PUBLISHED IN: Frontiers in Human Neuroscience and Frontiers in Neuroscience

#### *Frontiers Copyright Statement*

*© Copyright 2007-2017 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

> *The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-137-1 DOI 10.3389/978-2-88945-137-1


# **DIALOGUES IN MUSIC THERAPY AND MUSIC NEUROSCIENCE: COLLABORATIVE UNDERSTANDING DRIVING CLINICAL ADVANCES**

Topic Editors:

**Julian O'Kelly,** WHO Collaborating Centre for Mental Health Services Development & Royal Hospital for Neuro-Disability, UK

**Jörg C. Fachner,** Anglia Ruskin University, UK

**Mari Tervaniemi,** University of Helsinki, Finland

Cover image by Aiju Salminen

Music is a complex, dynamic stimulus with an unparalleled ability to stimulate a global network of neural activity involved in attention, emotion, memory, communication, motor co-ordination and cognition. As such, it provides neuroscience with a highly effective tool to develop our understanding of brain function, connectivity and plasticity. Increasingly sophisticated neuroimaging technologies have enabled the expanding field of music neuroscience to reveal how musical experience, perception and cognition may support neuroplasticity, with important implications for the rehabilitation and assessment of those with acquired brain injuries and neurodegenerative conditions. Other studies have indicated the potential for music to support arousal, attention and emotional regulation, suggesting therapeutic applications for conditions including ADHD, PTSD, autism, learning disorders and mood disorders.

In common with neuroscience, the music therapy profession has advanced significantly in the past 20 years. Various interventions designed to address functional deficits and health care needs have been developed, alongside standardised behavioural assessments. Historically, music therapy has drawn its evidence base from a number of contrasting theoretical frameworks. Clinicians are now turning to neuroscience, which offers a unifying knowledge base and frame of reference to understand and measure therapeutic interventions from a biomedical perspective. Conversely, neuroscience is becoming more enriched by learning about the neural effects of 'real world' clinical applications in music therapy. While neuroscientific imaging methods may provide biomarking evidence for the efficacy of music therapy interventions, they also offer important tools to describe time-locked interactive therapy processes and feed into the emerging field of social neuroscience. Music therapy is bound to the process of creating and experiencing music together in improvisation, listening and reflection. Thus the situated cognition and experience of music developing over time and in differing contexts is of interest in time series data.

We encouraged researchers to submit papers illustrating the mutual benefits of dialogue between music therapy and other disciplines important to this field, particularly neuroscience, neurophysiology, and neuropsychology.

The current eBook consists of the peer-reviewed responses to our call for papers.

**Citation:** O'Kelly, J., Fachner, J. C., Tervaniemi, M., eds. (2017). Dialogues in Music Therapy and Music Neuroscience: Collaborative Understanding Driving Clinical Advances. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-137-1

# Table of Contents


Jana Strahler and Urs M. Nater

*Benefits of listening to a recording of euphoric joint music making in polydrug abusers*

Thomas Hans Fritz, Marius Vogt, Annette Lederer, Lydia Schneider, Eira Fomicheva, Martha Schneider and Arno Villringer

## **Lessons from Investigations with Non-clinical Populations**


## **Reviews and Commentaries**


Helen Shoemark, Deanna Hanson-Abromeit and Lauren Stewart

*Theory-guided Therapeutic Function of Music to facilitate emotion regulation development in preschool-aged children*

Kimberly Sena Moore and Deanna Hanson-Abromeit

*The pleasures of sad music: a systematic review*

Matthew E. Sachs, Antonio Damasio and Assal Habibi

# Editorial: Dialogues in Music Therapy and Music Neuroscience: Collaborative Understanding Driving Clinical Advances

Julian O'Kelly <sup>1, 2</sup>\*, Jörg C. Fachner <sup>3</sup> and Mari Tervaniemi <sup>4</sup>

<sup>1</sup> Unit for Social and Community Psychiatry, WHO Collaborating Centre for Mental Health Services Development, London, UK, <sup>2</sup> Research, Royal Hospital for Neuro-Disability, London, UK, <sup>3</sup> Music and Performing Arts, Anglia Ruskin University, Cambridge, UK, <sup>4</sup> Cicero Learning, University of Helsinki, Helsinki, Finland

Keywords: music therapy, neuroscience methods, neuro-rehabilitation, neuro-degenerative diseases, psychoimmunology, EEG/ERP, fMRI, neurophysiology

**Editorial on the Research Topic**

#### **Dialogues in Music Therapy and Music Neuroscience: Collaborative Understanding Driving Clinical Advances**

Over 30 years of neuroscientific investigations of music perception and cognition have developed an understanding of music involving and supporting global brain processing in virtually every sphere of human activity (Levitin and Tirovolas, 2009; Särkämö et al., 2013). Increasingly sophisticated neuroimaging technology is able to provide objective biomedical evidence of musical activity supporting neuroplasticity (Münte et al., 2002; Pantev and Herholz, 2011), which has been shown to underpin recovery and rehabilitation in stroke (Schlaug et al., 2009; Särkämö et al., 2014). Furthermore, the therapeutic potential of musical activity has been evidenced by neuroscience methods in relation to the areas of processing shared between speech, memory, attention and motor activity (Schlaug et al., 2009; Besson et al., 2011; Patel, 2011), in how it influences arousal (Pelletier, 2004; O'Kelly et al., 2013), and through the modulation of wide-ranging neurochemical activity involved in stress, immunity, social affiliation, and reward (Chanda and Levitin, 2013; Fancourt et al., 2014).

The effective use of music with clinical populations by credentialed music therapists is also an area of increasingly robust research activity, as evidenced by recent Cochrane reviews of music therapy with autism (Geretsegger et al., 2014), depression (Maratos et al., 2008), schizophrenia (Mössler et al., 2011), and acquired brain injury (Bradt et al., 2010). Various interventions designed to address functional deficits and health care needs have been developed (e.g., Thaut and Hoemberg, 2014), alongside valid and reliable scales sensitive to the effects of music in assessment and treatment with diverse clinical populations, including disorders of consciousness (Magee et al., 2014), parent–child relationships (Jacobsen and McKinney, 2015), and neurodegenerative conditions such as dementia (McDermott et al., 2014) and Huntington's disease (O'Kelly and Bodak, 2016). Historically, music therapy has drawn its evidence base from a number of contrasting theoretical frameworks, wherein an abundance of heterogeneous, sometimes contradictory theoretical approaches are hard to generalize to a wider multidisciplinary and international milieu (Hillecke et al., 2005). Clinicians are now turning to neuroscience, which offers a unifying knowledge base and frame of reference to understand and measure therapeutic interventions from a biomedical perspective.

Neuroscience methods offer exciting opportunities to understand the effects and mechanisms involved in music therapy practice through (i) in situ studies, where measures are used during music therapy sessions to explore underlying neurological processes, (ii) empirical comparisons where neuroimaging and neurological measures provide biomarkers of general changes in brain processes pre and post interventions, and (iii) approximations, where methods are focussed on the effects of specific musical features, and findings explored to identify brain based action mechanisms in the music therapy process (Fachner, 2016).

Edited and reviewed by: Srikantan S. Nagarajan, University of California, San Francisco, USA

> \*Correspondence: Julian O'Kelly julian.okelly@elft.nhs.uk

Received: 24 May 2016 Accepted: 03 November 2016 Published: 22 November 2016

#### Citation:

O'Kelly J, Fachner JC and Tervaniemi M (2016) Editorial: Dialogues in Music Therapy and Music Neuroscience: Collaborative Understanding Driving Clinical Advances. Front. Hum. Neurosci. 10:585. doi: 10.3389/fnhum.2016.00585

Whilst music therapy is benefiting from neuroscience collaborations, neuroscience is becoming more enriched by learning about the neural effects of "real world" clinical applications in music therapy. Not only do neuroscientific imaging methods provide biomarking evidence for the efficacy of music therapy interventions, they also offer important tools to describe time-locked interactive therapy processes, feeding into the emerging field of social neuroscience. Music therapy is bound to the process of creating and experiencing music together in improvisation, listening and reflection. Thus the situated cognition and experience of music developing over time and in differing contexts is of interest in time series data (Fachner, 2014, 2016).

This research topic developed as a consequence of the editors' shared commitment to promoting fruitful dialogues between music therapists, psychologists, neuroscientists, and other medical professionals, at a time when these professions are increasingly sharing the same platforms at conferences (e.g., Luck and Brabant, 2013; Bigand and Tillmann, 2015; O'Kelly et al., 2014), and in collaborative research topics such as this, and its predecessor of a similar theme (Tervaniemi, 2014, for music and brain plasticity; Särkämö et al., 2016 for music in neurorehabilitation). Our research topic features the work of 115 authors in 18 papers across the titles Frontiers in Human Neuroscience and Frontiers in Auditory Cognitive Neuroscience. Whilst the majority of papers detail music therapy and neuroscience collaborations with specific clinical populations, two further areas are covered: (i) commentary on the challenges and opportunities of music therapy and neuroscience collaborations, and (ii) studies with healthy populations offering both insights into how we process music and transferable lessons for music therapy.

A diverse range of clinical populations is covered by the topic, reflecting the many areas of health care where music therapy and neuroscience collaborations are providing important new understandings. Two areas of stroke rehabilitation are detailed in the topic. Street et al. outline feasibility, efficacy, and patient experience of a music therapy treatment protocol aimed at promoting measurable changes in upper limb function in hemiparetic stroke patients. The authors detail the use of a neurologic music therapy (NMT) technique designed for this purpose: Therapeutic Instrumental Music Performance. Similarly, Cortese et al. explore the effectiveness of "Melodic-Rhythmic Therapy" in the treatment of aphasia with six stroke patients. Bukowska et al. combine several NMT techniques ("Therapeutic Instrumental Music Performance," "Rhythmic Auditory Stimulation," and "Pattern Sensory Enhancement") in a pilot study examining mobility and stability with Parkinson's patients. Using a range of novel methods including 3D Movement Analysis, the authors demonstrate significant improvement in the majority of the spatiotemporal gait parameters in the experimental group (n = 30) compared to the control group (n = 25).

Neurological populations feature in several other studies in the topic, such as Baker et al.'s exploration of flow and meaningfulness using songwriting with those with traumatic brain injuries or spinal cord injury. Whilst they found the intervention was positively associated with well-being outcomes in the 10 participants, they also observed that those who found that the songwriting process had strong personal meaning experienced increased anxiety and depression in the process of accepting their emotions. Steinhoff et al. (2015) detail a pilot music therapy study with five individuals in an unresponsive wakefulness state (or "vegetative state") using Positron Emission Tomography to measure changes in brain activity, finding increases in tracer uptake across the frontal, hippocampal, and cerebellar regions of the brain of four patients receiving music therapy for 5 weeks. Krick et al. (2015) use structural brain scanning (magnetic resonance imaging, MRI) methods to examine the effects of the Heidelberg model of music therapy on tinnitus at a cortical level. The authors found increases in gray matter volume of a range of brain areas dedicated to auditory processing concurrently with decreased symptoms. Finally, neuro-developmental issues in autistic spectrum disorder (ASD) are addressed in a study of the effect of sung speech on socio-communicative responsiveness by Paul et al. Using an adapted single subject design with three autistic children, the authors concluded that sung directives may play a useful role in engaging children with ASD, serving as an effective intervention for promoting socio-communicative responsiveness.

Ramirez et al. (2015) detail a pilot study using a novel musical EEG neurofeedback system to treat depression in the elderly. The authors found an average improvement of 17% on depression scores (BDI), concurrent with significant decreases of relative alpha activity in the left frontal lobe (p = 0.00008) in subjects receiving the intervention. As with all the studies in the topic featuring pilot level data, the small sample (n = 10) in the study suggests findings must be cautiously interpreted, but points to the potential benefits of this novel technology. Linnemann et al. detail a study of voluntary (not therapist initiated) music listening with 30 female participants with fibromyalgia syndrome, which is characterized by chronic pain. Whilst neuro-chemical measures of stress response (cortisol and alpha-amylase) did not indicate significant effects, significantly improved perceived control over pain was observed using VAS type Ecological Momentary Assessment Items. The final study with a clinical population was provided by Fritz et al., in their study of the psychological effects of listening to music that 22 polydrug abusers had made themselves during a prior musical feedback intervention. The study design compared scores on a range of scales (e.g., PANAS) completed after listening to pre-recorded drum and bass music or listening back to music co-created with other participants on exercise machines ("Jymmin") capable of modulating musical sounds, producing a similar, but original co-created music. They found a positive effect of listening to the recording of joint music making on self-efficacy, mood, and a readiness to engage socially, proposing participants were influenced by "recapitulating intense pleasant social interactions during the Jymmin intervention" (p. 1).

As detailed, a range of review and commentary papers explore the challenges and opportunities afforded by neuroscience and music therapy collaborations. Magee and Stewart frame this discussion by exploring existing and potential collaborations, whilst highlighting the misconceptions from both parties that may impede further expansion of the field. In a similar vein, Hunt comments on the boundaries of research methods employed in the neurosciences with regard to capturing inter-subjective, holistic experiences in music therapy, highlighting the potential of emerging technologies providing methods for delivering clinically relevant information for music therapists. Further to providing an overview of our neuroscientific understanding of auditory processing in premature infants, Shoemark et al. build the case for music-based interventions, including a hypothetical vignette from their shared clinical experience. Similarly, Moore and Hanson-Abromeit review the neuroscience of emotional regulation development in childhood to frame the rationale for "Musical Contour Regulation Facilitation," an interactive intervention for emotional regulation. Finally, Sachs et al. provide a systematic review of the "Pleasures of Sad Music," concluding that such pleasures exist where music is perceived as nonthreatening, is aesthetically pleasing, and where it produces psychological benefits such as mood regulation. The authors continue by exploring the neural mechanisms involved in producing sadness which can also induce positive affective states, with implications for informing music therapy work in this field.

A range of research papers explore the therapeutic potential of music through neuroscientific investigations with non-clinical participants. Here, the emotional effects of music are explored from a range of perspectives. Psychologists Sharman and Dingle explore the conception that listening to music from the extreme metal genre might have a causal relationship with anger behaviors. Thirty-nine extreme metal fans were tested for heart rate and positive/negative feelings of affect on the PANAS scale after an "anger induction" followed by listening to 10 min of either preferred music or silence. Contrary to expectations, participants reacted to their preferred extreme music with stabilized heart rate and positive emotions. The role of music listening in emotional regulation receives further attention from Carlson et al., who investigated relationships between music listening behaviors ("Discharge" or "Diversion") and gender, levels of depression, anxiety and neuroticism in a large nonclinical sample (n = 123). Interestingly, in the context of Sharman and Dingle's findings, Carlson et al. found on psychological scales (MADRS, BFQ, and HADS-A) that Discharge (using music to express negative emotions) was related to increased anxiety and Neuroticism, particularly in males. However, comparisons can only be made cautiously given the different samples, methods and range of genres involved in Carlson's study. Carlson et al. also present brain imaging (functional magnetic resonance imaging, fMRI) findings highlighting decreases in medial prefrontal cortex activity in high Discharge males, with increases for females preferring music listening for Diversion, exploring these findings in relation to the neurological and psychological impact of maladaptive listening practices.

Focussing on more active music making in non-clinical populations, Keeler et al. investigated the effects of improvised and pre-composed choral singing on experiences of flow, engagement, and neurobiological measures of social affiliation and arousal (oxytocin and adrenocorticotropic hormone/ACTH). Significant increases on a validated measure of flow for both singing conditions were observed concurrently with decreases in ACTH, which were significant for pre-composed music. Whilst the small sample of one choral quartet with a limited range of material indicates caution, the authors propose group singing as effective in reducing stress and arousal, highlighting the importance of flow states in this process.

In setting up this research topic, we aimed to investigate the following question from different viewpoints: what can we understand about the musical, therapeutic, relational, or creative processes in music therapy from a neuroscience perspective, and how can this perspective advance music therapy practice? Moreover, we aimed at introducing various experimental approaches designed recently in order to investigate the efficacy and underlying principles of music therapy. As we hoped, authors working with those with acquired, developmental or neurodegenerative neurological and psychiatric conditions submitted empirical research, systematic reviews, and case studies adopting neuroscientific methods. Furthermore, the richness, challenges, and potentials of this field have been explored in commentary, position statement, and theoretical papers. Through convergent thinking and research activity, this volume illustrates how much music therapy and neuroscience have to learn from each other. The authors wish to thank all authors, peer reviewers, and participants in the research featured here for the important contribution to this evolving field they have made in this topic. The papers featured show the great potential for more important and synergistic collaborations to benefit the wellbeing of both clinical populations, and those interested in harnessing the therapeutic power of music in their everyday lives.

## AUTHOR CONTRIBUTIONS

JO drafted the first version of the editorial; MT and JF revised the first draft and made contributions from their areas of expertise.

### REFERENCES

and practice: summary report and reflections on a two-day international conference. In Voices World Forum Music Ther. 14. doi: 10.15845/voices.v14i1.742


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 O'Kelly, Fachner and Tervaniemi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Home-based neurologic music therapy for upper limb rehabilitation with stroke patients at community rehabilitation stage—a feasibility study protocol**

*Alexander J. Street <sup>1</sup>, Wendy L. Magee <sup>2</sup>, Helen Odell-Miller <sup>1</sup>, Andrew Bateman <sup>3, 4, 5, 6</sup> and Jorg C. Fachner <sup>1</sup> \**

*<sup>1</sup> Music and Performing Arts, Music for Health Research Centre, Anglia Ruskin University, Cambridge, UK, <sup>2</sup> Music Therapy Program, Boyer College of Music and Dance, Temple University, Philadelphia, PA, USA, <sup>3</sup> Department of Psychiatry, University of Cambridge, Cambridge, UK, <sup>4</sup> National Institute for Health Research, Collaborations for Leadership in Applied Health Research and Care, Cambridgeshire and Peterborough NHS Trust, Cambridge, UK, <sup>5</sup> Oliver Zangwill Centre for Neuropsychological Rehabilitation, Ely, UK, <sup>6</sup> Cambridgeshire Community Services NHS Trust, St Ives, UK*

**Background:** Impairment of upper limb function following stroke is more common than lower limb impairment and is also more resistant to treatment. Several lab-based studies with stroke patients have produced statistically significant gains in upper limb function when using musical instrument playing and techniques where rhythm acts as an external time-keeper for the priming and timing of upper limb movements.

#### *Edited by:*

*Lutz Jäncke, University of Zurich, Switzerland*

#### *Reviewed by:*

*Julià L. Amengual, Institut du Cerveau et de la Moelle Épinière, France Michael Thaut, Colorado State University, USA*

#### *\*Correspondence:*

*Jorg C. Fachner, Anglia Ruskin University, East Road, Cambridge CB1 1PT, UK jorg.fachner@anglia.ac.uk*

*Received: 23 March 2015 Accepted: 17 August 2015 Published: 23 September 2015*

#### *Citation:*

*Street AJ, Magee WL, Odell-Miller H, Bateman A and Fachner JC (2015) Home-based neurologic music therapy for upper limb rehabilitation with stroke patients at community rehabilitation stage—a feasibility study protocol. Front. Hum. Neurosci. 9:480. doi: 10.3389/fnhum.2015.00480*

**Methods:** For this feasibility study a small sample size of 14 participants (3–60 months post stroke) has been determined through clinical discussion between the researcher and study host in order to test for management, feasibility and effects, before planning a larger trial determined through power analysis. A cross-over design with five repeated measures will be used, whereby participants will be randomized into either a treatment (*n* = 7) or wait list control (*n* = 7) group. Intervention will take place twice weekly over 6 weeks. The ARAT and 9HPT will be used to measure for quantitative gains in arm function and finger dexterity; pre/post treatment interviews will serve to investigate treatment compliance and tolerance. A lab based EEG case comparison study will be undertaken to explore audio-motor coupling, brain connectivity and neural reorganization with this intervention, as evidenced in similar studies.
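The allocation step described in the Methods can be illustrated with a short sketch. The protocol summary above does not state how the randomization is carried out, so the following assumes simple randomization by shuffling; the function name `allocate`, the participant labels and the seed are hypothetical and serve only to make the example reproducible.

```python
import random


def allocate(participant_ids, seed=None):
    """Randomly split participants into two equal-sized arms.

    Assumption: simple randomization by shuffling the full list and
    cutting it in half. The study's actual procedure is not specified here.
    """
    if len(participant_ids) % 2 != 0:
        raise ValueError("an even number of participants is needed for equal arms")
    rng = random.Random(seed)          # local generator, so the seed is reproducible
    shuffled = list(participant_ids)   # copy, leaving the caller's list untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "wait_list_control": sorted(shuffled[half:]),
    }


# Hypothetical labels for the 14 participants named in the Methods.
participants = [f"P{i:02d}" for i in range(1, 15)]
groups = allocate(participants, seed=2015)

for arm, members in groups.items():
    print(arm, len(members), members)
```

With 14 participants this always yields two arms of seven; in a cross-over design the wait list control arm would later receive the same intervention.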

**Discussion:** Before evaluating the effectiveness of a home-based intervention in a larger scale study, it is important to assess whether implementation of the trial methodology is feasible. This study investigates the feasibility, efficacy and patient experience of a music therapy treatment protocol comprising a chart of 12 different instrumental exercises and variations, which aims at promoting measurable changes in upper limb function in hemiparetic stroke patients. The study proposes to examine several new aspects including home-based treatment and dosage, and will provide data on recruitment, adherence and variability of outcomes.

**Keywords: stroke, hemiparetic, therapeutic instrumental music performance (TIMP), music-supported therapy, ARAT, community rehabilitation, feasibility study**

## **Background**

Stroke affects approximately 152,000 people in the UK every year (British Heart Foundation, 2012), causing more disability in adults than any other disease or condition. More than 50% of these report severe disability (Adamson et al., 2004b) and face long-term dependency on others for support with daily activities in their home (Adamson et al., 2004a). The mean length of stay in hospital for stroke patients in the UK has fallen from 32 days in 2000 to 20 days in 2010 (British Heart Foundation, 2012). Community services, sometimes referred to as "early supported discharge teams," and other community based rehabilitation teams are reported to improve outcomes for stroke patients, but an audit in 2010 recorded that only 36% of hospitals in the UK were providing such services (Department of Health, 2010). A shortfall in spending on chronic stroke rehabilitation is also reported in the US (Miller et al., 2010), despite the fact that studies have shown improvements in outcomes for patients when interventions continue from acute care into the community up to five years after stroke (Fens et al., 2013).

Weakness on one side, or hemiparesis, is the most commonly encountered sensorimotor impairment following ischaemic or haemorrhagic stroke (Sabini et al., 2013), occurring in 80% of patients (Adey-Wakeling and Crotty, 2013). Hemiparesis has a profound effect on patients' ability to perform activities of daily living (ADLs) such as washing, dressing, cooking and eating, and is extremely resistant to rehabilitation treatments. The total financial costs resulting from stroke in 2009, including direct health care costs, productivity loss and informal care were £3,741,682 (British Heart Foundation, 2012). Other estimates put the annual cost figure at 7 billion with 2.8 billion comprising direct healthcare costs (Bhatnagar et al., 2010).

Research beginning in the 1990s into rhythm driven interventions for gait training following stroke and traumatic brain injury (Thaut et al., 1993, 1997, 2007; Prassas et al., 1997; Hurt et al., 1998), in Parkinson's disease (Thaut et al., 1996; McIntosh et al., 1997), and with cerebral palsy (Kwak, 2007; Kim et al., 2011, 2012) has resulted in a well evidenced intervention known as Rhythmic Auditory Stimulation (RAS). RAS is reported to improve gait parameters including stride length and symmetry with stroke patients, with further research recommended into rhythm driven interventions in neurorehabilitation (Bradt et al., 2010). Building on this research, Thaut et al. (2002) and Malcolm et al. (2009a) found evidence for the application of rhythm driven interventions in upper limb rehabilitation, with participants making significant improvements in movement trajectories and quality of arm movement. Motivation is a major factor that, when lacking, can hinder engagement in rehabilitation programs, and a number of other studies illustrate the use of music and the inclusion of music therapy within multidisciplinary rehabilitation in order to improve patient mood and enhance motivation (Nayak et al., 2000; Jochims, 2004; Magee et al., 2006; Craig, 2008; Särkämö et al., 2008; Magee and Baker, 2009; Street, 2012). Using electronic drums supported with live music from the music therapist, Paul and Ramsey (1998) found clinical (but not statistical) significance in increased active shoulder and elbow range for stroke participants. Sharing some features with this study, Music Supported Therapy (MST) is a recently researched intervention in which participants play through a series of increasingly complex musical exercises using electronic drum pads and keyboard.
Results from these studies have consistently shown statistically significant improvements for participants' upper limb function, also evidencing neural reorganization using EEG and fMRI technology (Schneider et al., 2007; Rojo et al., 2011; Altenmüller et al., 2009; Grau-Sánchez et al., 2013). EEG was recorded during playing, i.e., hitting a key or a drum pad, which would indicate an event in the EEG. Pre-post therapy results in the music group showed an increase of Event-Related-Desynchronization and coherence in the beta band indicating reorganization of motor patterns (Altenmüller et al., 2009). Rojo et al.'s case study indicated that music patterns that were listened to before they were played by participants showed increased activation of motor and auditory regions when listening after the patterns had been learned, at the end of treatment (Rojo et al., 2011). Evidence suggesting that the music generated by the participants' playing during these studies has induced neural reorganization, whereby the auditory cortices appear to be incorporated into motor circuits, has prompted use of the term "audio-motor coupling" (Rojo et al., 2011), a phenomenon also observed within minutes of novice piano players beginning to practice (Classen et al., 1998). Musical motor performance involves the same brain regions as other motor tasks, those being the: motor, premotor, supplementary motor area (SMA), the cerebellum and the basal ganglia, as well as somatosensory, auditory, emotional, temporal, and memory loops (Altenmüller, 2001; Lotze et al., 2003; Meister et al., 2004). Musicians perform complex movement patterns, which are informed by continuous auditory feedback from their playing (Altenmüller et al., 2009), and feedback from movements is fundamentally important in order to inform and control them (Carpenter and Reddi, 2012).

Therapeutic Instrumental Music Performance (TIMP) is a Neurologic Music Therapy intervention (NMT) used in neurorehabilitation which employs external audio cues during music based activities in which the selection and spatial arrangement of instruments facilitates improved upper limb movement trajectories and arm kinematics (Thaut, 2008). Jeong and Kim (2007) suggest that the combination of rhythmic music and movement attuned to it creates a powerful neurological stimulus that may increase the plasticity of the nervous system. TIMP is one of a number of NMT interventions applied to sensorimotor, cognitive and communication rehabilitation (Thaut, 2008). It involves the planning of functional therapeutic musical experiences to meet functional physical goals, set within the multidisciplinary team, with the aim of transferring the therapeutic learning into real-world applications. Whilst evidence has emerged from the aforementioned studies regarding the effects of either rhythm or musical instrument playing on neural reorganization and upper limb movement trajectories, there have been very few that combine these elements to form a unified treatment protocol matching that of TIMP. Lim et al. (2011) investigated its effects on perceived exertion and fatigue, with positive findings, but did not measure for any physiological change. Paul and Ramsey's (1998) study matches the TIMP protocol, but was delivered in a group setting. Yoo (2009) conducted a study using TIMP in a lab setting with three chronic stroke patients and found evidence of improved wrist and hand function, as well as increased movement velocity.

**Abbreviations:** TIMP, Therapeutic Instrumental Music Performance; MST, Music Supported Therapy; RAS, Rhythmic Auditory Stimulation; NMT, Neurologic Music Therapy; ARAT, Action Research Arm Test; 9HPT, Nine Hole Peg Test; Bpm, Beats per Minute; CCS NHS Trust, Cambridgeshire Community Services National Health Trust; ADLs, Activities of Daily Living.

Music therapy is not commonly associated with, nor found within, neurorehabilitation settings; in 2005 only four neurorehabilitation units in the UK employed a music therapist (Magee et al., 2006). Musical instrument playing is not widely recognized as a feasible and effective, short-term intervention for treating movement disorders resulting from stroke, a patient group within which there is a high level of heterogeneity as regards upper limb hemiparesis, cognitive, sensory and communication impairments. Yoo's study, which included three participants, was conducted at Colorado State University, a major center for NMT research and training. Participants were recruited from a facility managed by their center for biomedical research in music. Heterogeneity influences decisions regarding inclusion criteria: if the criteria are too specific, recruitment can be slow; if they are too broad, heterogeneity introduces more variables, which in turn may skew statistical outcomes. In either case, a further influence is the size of the pool from which patients will be recruited, which depends on the geographical area and on whether the study is single or multi-site.

Home based and combined home/clinic training programs for sensorimotor treatment have been trialed previously using RAS gait training with Parkinson's patients (Thaut et al., 1996), rhythmic auditory cueing for upper limb reaching kinematics with stroke patients (Malcolm et al., 2009b), and computer gaming (King et al., 2012); however, all other research relating to this topic has been laboratory based. There is a lack of research investigating sensorimotor interventions with patients at the home-based community stage of rehabilitation. Previous studies investigating musical instrument playing have included in-patients, who were, on average, approximately 2 months poststroke (Schneider et al., 2007; Altenmüller et al., 2009; Grau-Sánchez et al., 2013). One study using a rhythm and music-based therapy program included participants at 1–5 years post stroke (Bunketorp Kall et al., 2012). This study will include participants 3–60 months post stroke, defined as being at the chronic stage of recovery (Barrett and Meschia, 2013).

Frequency of therapy sessions in existing studies has been predominantly 5 days per week for 3–4 weeks (Schneider et al., 2007; Altenmüller et al., 2009; Malcolm et al., 2009a; Rojo et al., 2011; Amengual et al., 2013), which is comparable with typical modified constraint induced movement therapy (mCIMT) delivery (Earley et al., 2010), and the music therapy treatment has usually been compared with other forms of standard care or combined music therapy/standard care. Early versus late treatment using RAS in gait training has been trialed (Hayden et al., 2009), but music therapy treatment for upper limb rehabilitation has not been investigated using a wait list design. The feasibility of delivering RAS for stroke patients as part of standard care has been explored (Hayden et al., 2009). Owing to the innovative nature of this intervention and recruitment of participants from within an NHS trust where neurologic music therapy is not recognized or available, participants will be recruited after discharge from community stroke rehabilitation services.

The study reported here will build upon the existing knowledge of music's effect on neuroplasticity (Schneider et al., 2007; Altenmüller et al., 2009; Rojo et al., 2011; Amengual et al., 2013) and translate this knowledge into a clinical protocol that may improve patient outcomes. Thus, it will add to limited, existing research into musical instrument playing, rhythm and upper limb rehabilitation following stroke. It will also address questions concerning dosage, setting and the timing of treatment delivery. Whereas most of the research to date on this topic has been laboratory based, this study provides a novel intervention that will be delivered one-to-one, in participants' homes. It will therefore examine the feasibility of home treatment delivery at the end of standard community care. In addition, participant experience of TIMP recorded via semi-structured interview will provide data regarding motivation, access and compliance to treatment. Frequency of sessions will be reduced compared to previous studies, in order to determine whether it is still effective at a lower dosage and to ensure that the sample size can be treated within the timeframes and resources available for this research.

#### **Study Aims and Objectives**

The aims of this crossover study are to investigate whether TIMP is a feasible and effective home-based intervention for upper limb hemiparesis following stroke, when delivered at a frequency of twice weekly for a period of 6 weeks. Additional qualitative data gathered will explore the participant experience of this treatment with specific focus on feasibility of treatment delivery in the home, patient motivation, and patient preference with regard to the treatment methods under investigation. This biomedical research study is registered with ClinicalTrials.gov, number NCT02310438, and has also been approved by the Integrated Research Application System (IRAS) and Anglia Ruskin University Ethics Boards.

## **Methods**

## **Study Design**

A cross-over design with repeated measures will be used, with participants being randomized into either a wait list or treatment group (see **Figure 1**). Assessments for each participant will take place at the same time points after baseline measure, as illustrated in **Figure 1**, with the baseline and end assessments immediately before and after the 6 weeks of TIMP being conducted by a therapist who is blind to participant allocation.

#### **Procedure**

Each participant will have a total of 12 individual music therapy sessions in their home, delivered twice weekly over the course of 6 weeks. They will not be required to perform any practice or exercises set by the researcher between sessions. Whilst in wait list, before the intervention begins, participants will not receive any community or privately employed rehabilitation interventions for upper limb hemiparesis. Each participant will also receive the Action Research Arm Test (ARAT) (Lyle, 1981) and Nine Hole Peg Test (9HPT) (Kellor et al., 1971) assessments at the same five time points after baseline measure, over an 18 week period as follows: timepoint 1 at week 1 after randomization; timepoint 2 at week 6; timepoint 3 at week 9; timepoint 4 at week 15; and timepoint 5 at week 18. The design will allow analysis of treatment and no treatment by comparing treatment group with wait list group data. It will also be possible to compare early versus late intervention following discharge from the community rehabilitation treatment, as the wait list participants will have a delay of 9 weeks between community discharge and beginning TIMP. Data collected from wait list group participants prior to TIMP treatment will be analyzed to determine whether there has been any spontaneous change in upper limb function, which can occur as an independent covariate (Kwakkel et al., 2006). Subject data from the ARAT and 9HPT will also be individually analyzed in order to discuss group results.

Although unusual in a randomized cross-over design study, qualitative data will be collected from each participant in order to explore patient preference with regard to using music to support exercises, and factors that might provide insights into patient tolerance and compliance. These aspects are important given the innovative treatment being used, the dosage and the setting i.e., within the home environment. Qualitative data will be collected by the researcher immediately before and after the 6 weeks of TIMP, by using a semi-structured interview that has been devised for this study and comprises five questions regarding the participant's experience of playing the instruments and playing to the music. This data will provide an overall impression of the feasibility for this treatment protocol. Participant responses to the post semi-structured interview will specifically provide data regarding motivational effects. In addition, the researcher will gather information in a research journal during and after each session, which will describe emotional responses; no direct quotes from participants will be used. There is also a five point Likert scale for recording how much participants feel the treatment will help them and, at the end of the 6 weeks of TIMP, how much they feel it actually has helped them in their ADLs. The researcher will record, in written form, participants' responses during the interview for later thematic analysis. Open questions will be used to give participants the opportunity to express their preconceptions and communicate their experience of the treatment, with the possibility of other themes arising. The quantitative data gained with the Likert response scale is used to record a time series of treatment responses (control for auto-correlation is not intended).

#### **Recruitment**

Participants will be recruited from three geographically separate community stroke rehabilitation teams in the south of England. Patients discharged from community rehabilitation who meet the inclusion criteria will be invited by the host NHS trust to participate in this study. It will not be possible to control for the length of time that each participant receives statutory community rehabilitation prior to joining the study, as within the host NHS trust this is extremely variable. Music therapy is also not recognized as an intervention for upper limb hemiparesis following stroke in the UK. As such, the host NHS trust cannot agree to facilitate any disruption or adjustment to standard treatment for their patients for music therapy research purposes. Such bodies as the Care Quality Commission (CQC) (Care Quality Commission, 2011) and National Institute for Clinical Excellence (NICE) (National Institute For Health and Care Excellence, 2013) do not recommend music therapy as an intervention with stroke patients. However, following a review of music therapy and traumatic brain injury by the Cochrane library, music therapy is now listed as a possible intervention within neurorehabilitation (Bradt et al., 2010). Potential participants who meet the inclusion criteria and, following invitation by the host NHS trust, have expressed an interest in the study will be visited in their home by the researcher in order to demonstrate the treatment methods, including the playing patterns, and answer any questions. All participants will be required to give informed consent, which will be recorded.

#### **Participants**

Fourteen adult participants, 3–60 months post stroke, with hemiparesis will be recruited who have been discharged from community rehabilitation and can consent to treatment. The age range for inclusion will be 18–90. As with the MST studies (Schneider et al., 2007; Altenmüller et al., 2009; Rojo et al., 2011; Amengual et al., 2013) participants must be able to lift their affected hand up on to a table whilst seated, unaided by their unaffected side and have some finger movement in the affected hand. In a meta-analysis of literature reporting prognostic variables in upper limb recovery (Coupar et al., 2012), inconclusive evidence was found for time since stroke being a predictor. Comparisons between lesion sites also revealed no predictive value in upper limb recovery. The most significant finding was that patients recovered significantly more upper limb function if there was less impairment in their upper limb initially caused by the stroke. The inclusion criteria for this feasibility study have been determined based on this research and those set for the MST studies earlier cited. On entry into the study participants will be randomized to either the treatment or wait list group, with a list of numbers randomly generated by an independent statistician using a computer, which will be concealed using opaque sealed envelopes. The assessor will be blinded to participant allocation. In order to maintain blinding the researcher will deliver a script for all participants and the assessor at each assessment time-point.
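As an illustration only, the computer-generated allocation list described above could be produced along the following lines. The protocol does not state which randomization scheme the independent statistician will use; the permuted blocks of two, the fixed seed, and the names `make_allocation_list` and `allocation_list` are assumptions introduced for this sketch.

```python
import random


def make_allocation_list(n_participants=14, block_size=2, seed=2015):
    """Return one group label per participant, in enrolment order.

    Assumption: permuted-block randomization, so that group sizes stay
    balanced. The protocol itself only specifies a computer-generated list.
    """
    if block_size % 2 or n_participants % block_size:
        raise ValueError("block_size must be even and divide n_participants")
    rng = random.Random(seed)  # fixed seed so the list can be regenerated and audited
    allocations = []
    for _ in range(n_participants // block_size):
        block = ["treatment", "wait list"] * (block_size // 2)
        rng.shuffle(block)  # random order within each balanced block
        allocations.extend(block)
    return allocations


allocation_list = make_allocation_list()
# Each entry would be placed in a numbered, opaque, sealed envelope.
for envelope_number, group in enumerate(allocation_list, start=1):
    print(f"Envelope {envelope_number:02d}: {group}")
```

Blocking is used here because, with only 14 participants, simple coin-flip randomization could easily leave the two groups unequal in size.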

#### **Sample Size**

The sample size of 14 stroke patients was determined not by a power calculation but through discussion of clinical matters between the researcher and the NHS trust hosting the study. The discussion was based on defining an appropriate number of stroke patients that would represent the heterogeneity of upper limb impairment and facilitate a report on the feasibility, management and efficacy of the intervention under the stated research conditions (unknown music based treatment, in participants' homes, twice weekly). The sample size is considered feasible for the researcher to deliver with the available time scale and resources within the host NHS trust, including staff assisted access to patient records, identification of suitable patients and invitation to participate from NHS community stroke team therapists, and completion of data collection within a limited period of time. Risks, benefits and the logistics of delivering treatment in the home, given that it is labor intensive and delivered by the researcher alone, will be observed and reported, together with participant compliance. The home environment will introduce variables that cannot be controlled for, such as space available to set up equipment, management of seating equipment to optimize positioning for the intervention, and distractions in sessions such as the activities of family members or other residents present in the home. The instruments will not be set up permanently, as would be the case in a research lab, where instrument height and distance from participant can be maintained and standardized. In this study TIMP will be delivered at a frequency much lower than has been the case in existing research of this nature, so effects at this dosage are not known. All of these logistical factors warrant detailed examination in a smaller feasibility study prior to moving to a study with a larger number of participants.

#### **Measures**

A wide range of assessment tools and technology has been used to record outcomes in upper limb post-stroke rehabilitation. The selection of assessment tools for this feasibility study has been determined by two major factors: (1) the restrictions enforced by the ethics committee who approved the study according to UK ethics procedures for research within the NHS via the Integrated Research Application System (IRAS), and (2) the availability of assessment tools and training requirements for their application. The Action Research Arm Test (ARAT) will be the primary outcome measure for this study; the 9HPT will measure finger dexterity, and semi-structured interviews will allow for the collection of qualitative data on the participant experience. The ARAT has also been used in constraint-induced movement therapy studies (Kitago et al., 2013), and a study using rhythm and music with stroke patients (Bunketorp Kall et al., 2012). It has excellent interrater reliability (Hsieh et al., 1998; van der Lee et al., 2002), excellent intra-rater reliability (van der Lee et al., 2002), and excellent convergent validity against the Fugl-Meyer (De Weerdt, 1985). The ARAT is a timed, 19 item measure that is divided into four categories: grasp, grip, pinch and gross movement. Each item, or action, is performed by the participant while seated with their back against the chair, a measured distance from the table (15 cm), where all of the assessment items are individually placed for each task (see **Figure 2**). The test recreates the movements or sequences of movements required to perform many ADLs, such as reaching up onto a shelf to obtain a food ingredient or pouring liquid from one container to another. A table map for the ARAT is laid out flat on the table top and this has markers on it to indicate the start and end position for each object used in the test, thus optimizing consistency between patients and settings. 
The assessment can take up to 30 min if the patient needs to complete all items in every subcategory.

**FIGURE 2 | The action research arm test.**

The 9HPT, more specifically measuring finger dexterity, is widely used in stroke rehabilitation and related research (Kellor et al., 1971). It has excellent inter-rater reliability and adequate intra-rater reliability (Oxford Grice et al., 2003). It is a timed test, using standardized equipment comprising a rectangular plastic tray with a rounded, concave tray at one end containing nine small white pegs and at the other end nine holes into which the participant must (one at a time) place the nine pegs from the tray and then remove them, placing them back into the tray as quickly as possible. The participant practices with the unaffected hand first, then the affected side. The participant is seated while performing the test, which takes no more than 2 minutes to administer, depending on the age of the patient and their existing degree of finger dexterity.

#### **Electroencephalography (EEG) Recording**

The EEG recording procedure will include rest, listening to the TIMP music patterns (that will be learned in the sessions), and a Go/No Go task, after which another resting state EEG will be recorded. This study will use a BrainAmp DC (Brainproducts) EEG system with 64 active electrodes and artifact channels (EOG, EMG). Artifact control will be guided by video recording of the participant whilst undergoing EEG. The video footage will provide information about body movement and eye-blinks. After visual inspection and indexing of the continuous EEG traces, data pre-processing will utilize the Neuroguide Artifact toolbox (Thatcher et al., 2009) for the resting EEG and the music listening task. EEGLAB and ePrime will be used for the analysis of the Go/No Go task and will focus on the short episodes around the clicks. After visual inspection, ICA will be applied to pre-process the data, and identified artifacts will be excluded.

#### **Semi-structured Interviews**

Using the principles and processes described in Interpretative Phenomenological Analysis (IPA) (Smith and Osborn, 2008), material from the semi-structured interviews will be analyzed into themes. The questions have been designed as open questions to encourage participants to offer descriptions of their experience of playing the instruments and playing to the supporting music. In keeping with IPA principles, the underlying aim of the questions is to offer participants a chance to explain how they feel the treatment will affect them and to describe how they are experiencing, feeling and thinking about the processes involved. As such, the interviews will provide data regarding motivation and emotional response. Patient feedback in these areas will also feed into questions around feasibility of delivery for this treatment protocol.

#### **Statistical Analysis**

For the primary outcome measures, for the appropriate analysis of the crossover design with repeated measures, a linear mixed model approach will be used. This will be undertaken using the computer program R and will employ the R package lme4, which is sufficiently flexible to provide detailed analysis for this type of design, including the accommodation of missing values. The main result of this analysis will be the assessment of whether the music therapy has had an effect. To avoid the need to make strong assumptions about the distribution of the data, namely, that fitted model residuals are Normally distributed, computer-intensive methods will be employed for the statistical inferences. This will include bootstrap approaches to calculating 95% confidence limits, and permutation tests to obtain statistical significance test *P*-values.
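The computer-intensive inference described above can be illustrated with a deliberately simplified sketch. It does not fit the linear mixed model (which the protocol assigns to R and lme4); instead it applies a sign-flip permutation test and a percentile bootstrap to paired post-minus-pre differences, purely to show how both methods avoid assuming Normally distributed residuals. The function names and the example scores are hypothetical and are not study data.

```python
import random
import statistics


def sign_flip_permutation_p(differences, n_permutations=10000, seed=1):
    """Two-sided permutation p-value for the mean of paired differences."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(differences))
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in differences]
        if abs(statistics.fmean(flipped)) >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


def bootstrap_ci(differences, n_resamples=10000, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for the mean difference."""
    rng = random.Random(seed)
    n = len(differences)
    means = sorted(
        statistics.fmean(rng.choices(differences, k=n)) for _ in range(n_resamples)
    )
    lower_index = int(((1 - level) / 2) * n_resamples)
    upper_index = int((1 - (1 - level) / 2) * n_resamples) - 1
    return means[lower_index], means[upper_index]


# Hypothetical post-minus-pre ARAT changes, invented purely for illustration.
example_changes = [4, 6, -1, 3, 5, 2, 7, 0, 3, 4, 1, 5, 2, 6]
p_value = sign_flip_permutation_p(example_changes)
ci_low, ci_high = bootstrap_ci(example_changes)
print(f"p = {p_value:.4f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

In the planned analysis the same resampling logic would be applied to the fitted mixed model rather than to raw differences, so that the repeated measures and the crossover structure are respected.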

### **EEG Analysis**

EEG case analysis of the pre/post *resting state* EEG recordings will be carried out with NeuroGuide Software (www.appliedneuroscience.com; Version 2.6.6) including an age, gender and condition-matched (*N* = 678 matched controls) LORETA normative EEG database (Thatcher et al., 2005). Continuous, artifact-free, raw EEG data will be subjected to a power spectral analysis (PSA) to calculate raw and z-scored spectral values, topography (Absolute power and Current Source Density), electrode correlation, burst metrics (burst number, amplitude, duration, and interval), instantaneous connectivity and coherence patterns, especially beta coherence. Due to the small sample size (*N* = 2) we do not expect significant differences between subjects' resting state displayed on central and temporal leads in beta power *z*-score topography and coherence data in the early compared to delayed intervention. However, we do expect intra-subject pre/post intervention z-score decreases in temporal and central leads. Topography will explore post-therapy spectral power increases or decreases in central and temporal areas; coherence data of central and temporal leads will provide information about post-therapy connectivity decreases or increases between motor and auditory cortex (as seen in beta coherence; Altenmüller et al., 2009). Pre/post intervention paired *t*-tests will indicate the probabilities and directions of change. We expect the post-intervention measures to show a lowering of *z*-scores, i.e., normalization, and an increase of brain connectivity between central and temporal regions.
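The two quantities named above, a z-score against a normative database and a paired *t* statistic for pre/post change, can be written out as a short sketch. This is not the NeuroGuide implementation; the function names and all numerical values are hypothetical and serve only to make the arithmetic explicit.

```python
import math
import statistics


def z_score(value, norm_mean, norm_sd):
    """Express a spectral value in standard deviations from the normative mean."""
    return (value - norm_mean) / norm_sd


def paired_t(pre, post):
    """Return the paired t statistic and degrees of freedom for post minus pre."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical beta-power z-scores at four central/temporal leads.
pre_z = [2.4, 2.1, 1.9, 2.6]
post_z = [1.6, 1.8, 1.2, 1.9]
t_stat, dof = paired_t(pre_z, post_z)
print(f"t = {t_stat:.2f} on {dof} degrees of freedom")
```

A negative *t* statistic in this sketch corresponds to the expected lowering of z-scores, i.e., movement toward the normative mean after the intervention.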

Low Resolution Electromagnetic Tomography (LORETA), a specific mathematical solution to EEG source localization (Pascual-Marqui et al., 1994), will provide information about raw and z-transformed current density means and their pre/post differences (subtraction/individual paired *t*-test), especially in the beta range. Further raw and z-transformed Region of Interest (ROI) correlations of primary and pre-motor [Brodmann Area (BA) 4, 6, precentral gyrus], auditory (BA 22, 41, 42, superior and transverse temporal gyrus) and frontal (BA 44, 45, inferior frontal gyrus) cortices will provide information about post-treatment related neural reorganization in audio-motor coupling, expected to be shown as increased ROI correlations.

Previous studies observed post-therapy changes in motor and auditory activity during music listening (Altenmüller et al., 2009; Rojo et al., 2011). In this study we will compare the pre/post *music listening data* of TIMP patterns employed in the intervention. In Rojo's study, when the participant listened to the music played in sessions after the MST treatment period, they displayed an increase of motor and auditory responses when compared to listening before they had received any MST. We hope to demonstrate the same tendencies by utilizing topographic EEG mapping and LORETA. Differences in power means and probabilities of change will inform us about differences in responses to the particular TIMP patterns.

We will analyse the particular raw pre/post means and differences of listening sequences (see **Table 1**) on topography shifts (Absolute power and CSD), electrode correlation (PCC), burst metrics, instantaneous connectivity, phase and coherence patterns, particularly in the beta range (Altenmüller et al., 2009). Altenmüller further investigated alpha ERD/S measured after hitting a trigger pad and was able to show differences in the response latencies. Increases in frontal midline theta (FMT) triggered by emotional responses during music listening have been reported in several music therapy studies (Sammler et al., 2007; Lee et al., 2012; Fachner et al., 2013; O'Kelly et al., 2013). We expect to see greater increases in FMT after the 6 weeks of TIMP treatment. We will apply LORETA ROI correlation of beta, alpha and theta frequency power means for each TIMP pattern (see **Table 1**) to indicate task related pre-post changes in auditory, frontal and motor regions (see rest EEG analysis).

A Go/No Go *task* will be performed to track improvements in reaction time (measuring participant's response to visual stimuli with a button press) and changes in contingent negative variation (CNV) between signals as a marker of attention processes. We expect the reaction time to have shortened more in the early intervention group.

#### **Interventions**

#### **Therapeutic Instrumental Music Performance (TIMP)**

TIMP, a NMT technique which has not been widely researched, is a defined intervention for upper limb rehabilitation, which comprises three essential elements: (1) Musical structure: clearly pulsed music, with melodic, harmonic and dynamic structures, which cue the organization of movements in time, space and force dynamics; (2) Choice of instruments and mode of playing; (3) Positioning or spatial arrangement of instrument/s to facilitate the target movement/s (Thaut, 2008; Thaut and Hoemberg, 2014). TIMP is an intervention that can be delivered following specialist neurologic music therapy training. It involves playing musical instruments or digital music equipment in a way that demands specific movement patterns. Musical equipment is positioned to practice those target movements that patients find difficult, for example elbow flexion and extension (reaching and playing a cymbal), or shoulder abduction (playing a drum to the side of the participant). Specific qualitative aspects of movement such as trajectory smoothing and variability, priming, timing, and movement range are targeted and the music prepared for this study, which can be performed live by the therapist or played in identical, pre-recorded format from a tablet, supports these aspects. All music is set to a metronome beat and each musical pattern that accompanies each exercise is comprised of strongly pulsed, simple repeated patterns, which provide a predictable temporal framework within which participants are able to plan and execute each movement, achieving a high number of repetitions. The aim of the music is to provide an auditory mirroring of movement patterns using melodic contour to support movement direction, and tempo, which is set to the existing speed of participants' movements (see Figures A1–A3 in Supplementary Materials Section).

**TABLE 1 | TIMP chart.**

<sup>1</sup>This exercise is described and illustrated in Baker et al. (2006), p79 and adapted for this exercise.

#### **Instrument Choice**

The instruments played by participants in this study have been selected for their portability, flexibility in offering various spatial arrangements, and the quality and range of audio feedback that they can offer. These are important considerations for a treatment that is being delivered in the home environment, where access and space might prevent the use of many conventional acoustic instruments. Percussion instruments are accessible to non-musicians and require a wide range of movements and movement sequences, potentially employing all muscle groups (Thaut, 2008). They can also be positioned for unilateral and bilateral playing, and played using hands, fingers and other finger joints such as the knuckles, or with beaters and drumsticks. There is also a playing pattern that facilitates grip and release finger movements (see pattern 11 in the TIMP chart, using hand held percussion). Computer tablet touch screen instruments are also accessible to non-musicians and offer the appeal of more contemporary sound-worlds, with which some participants may identify and be more motivated by than with the acoustic instruments. Audio feedback and quality from the tablet touch screen instruments will be enhanced through the use of a "Jawbone Jambox" Bluetooth, wireless speaker, mounted on the microphone stand that holds the tablets; also saving space and eliminating the need for cables (with the exception of TIMP 8, which requires two tablets). The speaker is extremely resonant and will also be used to provide tactile feedback by placing it on table surfaces as participants play exercises whilst seated at a table. The touchscreen instruments and speaker will not provide the same quality or degree of tactile and acoustic feedback as acoustic instruments, but for this study they were considered to be most suitable to meet the need for a wide enough range of visual targets for fine motor exercises, whilst being portable and offering a variety of motivating instrumental sounds. 
Playing techniques for tablets do not require technique acquired through musical training and are easily accessible using finger tips, finger and thumb joints and movements not commonly associated with the sounds that they produce; such as that of the "smartpiano," which requires fingers to be moved vertically up and down across bars on the screen that represent and produce piano chords (see **Figure 3**), with the bass notes in the lower portion of each bar. Playing these touch screen instruments also requires more shoulder stability and controlled upper limb abduction, adduction, flexion and extension movement patterns than is the case with the larger acoustic instruments which have much larger target areas that are easier to hit.

#### **Equipment**

The instruments and equipment used in this study will be: bongos on an adjustable stand; a 14-inch cymbal on a boom stand; two computer tablets, mounted on a single microphone boom stand using two clamp holders that can present various angles for playing; a small Bluetooth speaker, mounted on the boom stand with the tablets; Garageband music software; ThumbJam music software; three cabasas (small, medium, and large); a selection of standard and adapted drum sticks and beaters; a pair of drum sticks made for playing tablet touch screen drums; a set of finger picks; and a plectrum made for playing tablet touch screen guitar.

Adjustments to the tablet settings (see Appendix in Supplementary Materials) will be made in order to ensure that when participants play the touch screen instruments, the screen does not change or move, but the instrumental sounds are triggered, thus alleviating any frustration that may be caused by technical issues with the tablet on top of participants' existing motor control problems.

Smartpiano chords offer opportunities for participants to practice finger flexion and extension and other fine motor movements using a wide range of finger combinations, including thumb only (see patterns 6–10 in the TIMP chart). Smartbass and Smartguitar will be used to practice these movements, in addition to pinch grip by holding the tablet plectrum and strumming notes and chords. The "sustain" switch for Smartpiano will be set to "on" in order to provide participants with more sustained harmony and auditory feedback before they go on to play the next chord. Chords for the touchscreen instruments (selected from the Garageband instrument menu) will be set for some patterns so that each one is separated by a blank chord space in order to minimize error in participants' playing (see **Figure 3**).

#### **Spatial Arrangement Set-up Time and Transportation of the Instruments**

A total minimum area of two meters squared will be required to set up the cymbal on boom stand and bongos on stand, including space for the participant to sit. The area increases depending on how far the instruments will be moved back from or to either side of the participant in order to facilitate greater range of upper limb movements.

Fifteen minutes are required for setting up all equipment, then a further 15 min for packing away, making a total of 30 min of setting-up and packing-away time for each session in addition to the time for intervention. It is important to ensure that all equipment is transported with minimum risk of damage when carried in and out of the car and into the various properties visited. A travel bag with extending handle and wheels will be used to transport the bongos, cymbal and all hardware, beaters and drum sticks. Altogether this weighs 16.7 kg. A shoulder bag will be used to transport the two tablets, metronome and all paperwork. The microphone stand with tablet brackets attached will be carried separately without a case, and the classical guitar will be transported in a robust, hard case.

#### **Using the TIMP Chart**

The extensive detail for TIMP intervention presented here (**Table 1**) was developed and refined through the course of delivering treatment to a volunteer stroke patient. It therefore represents an intervention that was refined through patient collaboration, prior to recruiting participants via the host NHS trust database, in order to maximize patient compliance.

Based on the TIMP chart, all musical patterns, which will be played live by the researcher to support participants whilst they play the instruments, have also been recorded onto one of the tablets using "Garageband" music software. They were recorded at metronome settings of 50 and 60 beats per minute (bpm), using the "audio recorder" selected from the Garageband instrument menu and input via the tablet's built-in microphone. This offers two tempo settings for participants to try each exercise whilst the therapist physically guides arm movements in cases where hand-over-hand support is required. Following this, the researcher and participant will play together, with the therapist playing the supporting music live, in time to a metronome that is heard via an earpiece and adjusted to a tempo which supports each participant's current frequency of movement.

Most of the TIMP patterns have variations, where participants will follow alternative finger patterns, or be given various beaters, drum sticks, plectrums and finger picks to use with the instruments as required. These items serve one of two functions: they either facilitate improved access to the instruments and improved sound quality and auditory feedback from participants' movements and playing, or they require from the participant more complex finger movements, bilateral playing patterns and additional grasp, grip or pinch movements. Some participants may struggle to grip beaters initially and be more able to access instruments using hands and fingers only, with the focus more on gross motor movement. In this way, the musical tasks demand maximum physical performance.

### **Metronome Settings**

Pacing of movements can be problematic in hemiparetic movement disorders. Tapping exercises performed to precisely paced external auditory cues provide opportunities for rehearsal of movement timings. Using a metronome with a "tap" facility, each participant's playing tempo can be calculated by tapping into the metronome in time with their playing. Following this, the researcher plays the music in a manner that strongly accents each beat. If the movements involve a high level of compensatory movements, for example from the trunk or shoulder, then the metronome speed will be reduced and the pulsed music played more slowly until the participant can be observed to have more time to plan movements between each beat, and to move (playing the instrument/s) in time with the music, or with more controlled and better quality movements. Once the performance of exercises is seen to become more fluent and the timing of playing more in keeping with the music, then increases of approximately 10 bpm can be made, provided that the movement quality is not compromised. For further reference to tempo and motor learning see Massie and Malcolm (2012) and Furuya et al. (2013).
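The tap-tempo calculation described above can be illustrated with a minimal Python sketch. It is not part of the study protocol: the function names `tempo_from_taps` and `next_tempo`, the example tap times, and the use of the mean inter-tap interval are assumptions made for illustration. Only the increase of approximately 10 bpm, conditional on preserved movement quality, comes from the text.

```python
from statistics import mean


def tempo_from_taps(tap_times_s):
    """Estimate tempo in bpm from tap timestamps given in seconds.

    Assumption: tempo is taken as 60 divided by the mean inter-tap interval.
    """
    if len(tap_times_s) < 2:
        raise ValueError("at least two taps are needed to estimate a tempo")
    intervals = [later - earlier
                 for earlier, later in zip(tap_times_s, tap_times_s[1:])]
    return 60.0 / mean(intervals)


def next_tempo(current_bpm, movement_quality_ok, step_bpm=10.0):
    """Raise the tempo by about 10 bpm only if movement quality is preserved.

    The size of any tempo reduction is not specified in the text, so this
    hypothetical helper simply keeps the current tempo otherwise.
    """
    return current_bpm + step_bpm if movement_quality_ok else current_bpm


# Hypothetical taps, one every 1.2 s, i.e., about 50 bpm.
taps = [0.0, 1.2, 2.4, 3.6]
estimated_bpm = tempo_from_taps(taps)
print(round(estimated_bpm, 1), next_tempo(estimated_bpm, movement_quality_ok=True))
```

In practice the decision about movement quality rests on the therapist's observation; the boolean flag here only stands in for that clinical judgment.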

### **Monitoring Patient Performance**

Initially, each exercise will be played by participants for periods of up to 2 min (a timer will be used), after which the researcher will stop and ask the participant if they would like to continue or have a rest. The researcher will also ask more specific questions to determine if the participant is experiencing any discomfort or pain possibly related to each exercise, for example in the back, neck, shoulder, elbow, wrist, fingers, not normally present. If the participant feels that they are experiencing pain or discomfort related to the musical exercises then treatment will be paused and these symptoms discussed, before either continuing or considering any potential need for a GP or physiotherapy consultation.

For all exercises participants will be encouraged to keep their feet flat on the ground in order to provide support for their back and core muscles and optimize movement control when playing the instruments. This instruction to participants has become a part of the TIMP protocol for the study following review of video footage with members of the host NHS trust team and academic supervisors, which was taken during sessions with a volunteer stroke participant prior to this study.

## **Discussion**

At the center of this study is the aim of testing whether the 12 different TIMP playing patterns (**Table 1**) and their variations, which have been developed following the TIMP protocol, can be effectively delivered in the home environment and improve upper limb function across a small sample size of participants with hemiparesis following stroke. The TIMP protocol, whilst sharing some attributes with MST and stemming from scientific research into the effects of rhythm on movement kinematics, has not been clinically or scientifically researched to a great extent. Whilst MST uses protocolized musical exercises, it has not explored any additional effects of using rhythm and music, which would be derived from existing scientific evidence for its role in supporting the priming, timing, trajectory and muscle force requirements for the upper limb movements within each exercise pattern.

Research into the effects of musical instrument playing and rhythm supporting movements has been based on a model of daily treatment, 5 days per week, which has produced statistically significant results (Schneider et al., 2007; Altenmüller et al., 2009; Malcolm et al., 2009b; Rojo et al., 2011; Amengual et al., 2013). Studies with a lower frequency of treatment have not been widely conducted and with such a reduction in frequency it is not known what the treatment effect will be.

There is great heterogeneity of upper limb impairment within this patient group and the ARAT has been developed as a tool that can capture change within these parameters by recreating a protocol combining tasks commonly performed within ADLs. The table of TIMP exercise patterns (see **Table 1**) developed for this study describes the target arm movements for each instrumental exercise in the first column, then the instrument/s and equipment to be used, the positioning of each instrument and how it should be played. It can be seen that the musical exercises require arm, hand and finger movements that are the same or similar to those required in order to perform tasks in the ARAT and 9HPT.

ARAT and 9HPT data will not inform about treatment effects on audio-motor coupling and neural re-organization as demonstrated in other studies with fMRI, TMS, and EEG (Altenmüller et al., 2009; Rojo et al., 2011; Rodriguez-Fornells et al., 2012). In order to estimate the feasibility of neurometric EEG measures (John, 1989) as an imaging tool for determining cerebral changes related to the TIMP intervention, we plan to visualize audio-motor coupling (Rodriguez-Fornells et al., 2012) with continuous EEG using one participant from each group. A mathematical solution of the inverse problem of EEG sources allows the creation of a low resolution electromagnetic tomography (LORETA) of brain regions (Pascual-Marqui et al., 1994), and we can correlate the estimated EEG source activity of the raw and *z*-score transformed means (Thatcher et al., 2005). EEG cannot visualize activity in the midbrain, but it can visualize activity in the cortex, which is where we expect most lesions after stroke.

Utilizing these imaging tools we also plan to explore a comparison of differences between early and late intervention. Two participants will undergo EEG, but we are aware that the imaging results will not have as high a resolution as those provided by fMRI. Furthermore, with only two participants, no resulting statistical differences are sought or expected. EEG measures changes in electrical current in the brain and has been utilized in studies on the recovery of stroke patients (Giaquinto et al., 1994), prefrontal-to-motor cortex connectivity (Picazio et al., 2014), post-movement beta-event-related-synchronization (PMBS) in stroke patients with mild hemiparesis (Eder et al., 2006) and current studies utilizing MST in stroke rehabilitation (Altenmüller et al., 2009; Rodriguez-Fornells et al., 2012). Exploring the limitations and advances of the imaging techniques proposed for this study, the manageability of the measurement process, and the cost-effectiveness of low-cost, portable EEG apparatus and analysis compared to more expensive, lab-based fMRI measures is a legitimate goal for a feasibility study.

Thus, the intention of this feasibility study is to provide and test a platform, via the TIMP playing patterns, for breaking down movement sequences, facilitating a high level of repetition of specific movements within an activity that is interactive and enjoyable and that is clearly linked to movements required for ADLs. Exercises are performed within clearly structured and repeated rhythmic, musical frameworks, the like of which are evidenced as potential drivers of neural reorganization specifically in the realm of stroke hemiparesis rehabilitation, and also found to reduce learned misuse or compensatory motor behaviors.

Although in this feasibility study, with a small sample size, we do not predict significant group outcomes, we still expect to report on feasibility of the delivery and efficacy of the intervention. We do not intend to apply non-parametric statistical analysis and do not expect greater generalizability of the data (which would be increased by applying non-parametric testing), but want to explore tendencies achievable with parametric data analysis strategies, research design and conditions that would apply with a larger and adequately powered sample size.

MST and trials investigating the effects of rhythm and music on upper limb kinematics have taken place in research laboratories and included, predominantly, inpatients 2 months post stroke. This TIMP study includes participants up to 5 years post stroke, where community rehabilitation in their home has been completed. The majority of rehabilitation for stroke patients in the UK takes place in patients' homes and does not target upper limb hemiparesis alone, but mobility and independent living skills in a more holistic model. To date there have been no reports on feasibility for this type of intervention in participants' homes. As such the study will make a contribution to new knowledge in the field that could influence future service design.

## **Author Contributions**

AS developed the treatment protocol following the TIMP guidelines set out by Michael Thaut, conducted the literature review and drafted the manuscript. HO, WM, and JF advised on the initial overall design, ethics, and timing of clinical and research protocols. JF advised on feasibility of treatment frequency and sample size. JF, WM, and HO edited draft manuscripts and advised on structure and content. AB facilitated the hosting of the study, advised on recruitment sites and procedures and enabled blind assessment.

## **Acknowledgments**

Andrew Bateman is supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care East of England at Cambridgeshire and Peterborough NHS Foundation Trust. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

### **Supplementary Material**

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnhum.2015.00480




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Street, Magee, Odell-Miller, Bateman and Fachner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Rehabilitation of aphasia: application of melodic-rhythmic therapy to Italian language

Maria Daniela Cortese<sup>1</sup> , Francesco Riganello<sup>1</sup> \*, Francesco Arcuri <sup>1</sup> , Luigina Maria Pignataro<sup>1</sup> and Iolanda Buglione<sup>2</sup>

1 Intensive Care Unit, S. Anna Institute and Research in Advanced Neurorehabilitation, Crotone, Italy, <sup>2</sup> Casa di Cura Villa Margherita, San Giuseppe Moscati Institute, Benevento, Italy

Aphasia is a complex disorder, frequent after stroke (with an incidence of 38%), with a detailed pathophysiological characterization. Effective approaches are crucial for devising an efficient rehabilitative strategy, in order to address disability in everyday and professional life. Several rehabilitative procedures are based on psycholinguistic, cognitive, psychosocial or pragmatic approaches; among those with a neurobehavioral approach is Melodic Intonation Therapy (MIT). Van Eeckhout's adaptation of MIT to the French language (Melodic-Rhythmic Therapy: MRT) has extended the training strategy by adding a rhythmic structure reproducing French prosody. The purpose of this study was to adapt MRT rehabilitation procedures to the Italian language and to verify their efficacy in a group of six chronic patients (five males) with severe non-fluent aphasia who had received no specific aphasia treatment during the previous 9 months. The patients were treated 4 days a week for 16 weeks, with sessions of 30–40 min. They were assessed 6 months after the end of the treatment (follow-up). The patients showed a significant improvement on the Aachener Aphasie Test (AAT) in different fields of spontaneous speech, with superimposable results at follow-up. Albeit preliminary, these findings support the use of MRT in rehabilitation after stroke. Specifically, MRT seems to benefit from a stronger structure than the available stimulation-facilitation procedures and allows a better quantification of rehabilitation efficacy.

#### Edited by:

Julian O'Kelly, Royal Hospital for Neuro-disability, UK

#### Reviewed by:

Alexander J. Street, Anglia Ruskin University, UK Philippa Derrington, Queen Margaret University, UK

#### \*Correspondence:

Francesco Riganello, Intensive Care Unit, S. Anna Institute and Research in Advanced Neurorehabilitation, 11 Via Siris, Crotone, KR 88900, Italy f.riganello@istitutosantanna.it; francescoriganello@gmail.com

> Received: 01 April 2015 Accepted: 07 September 2015 Published: 24 September 2015

#### Citation:

Cortese MD, Riganello F, Arcuri F, Pignataro LM and Buglione I (2015) Rehabilitation of aphasia: application of melodic-rhythmic therapy to Italian language. Front. Hum. Neurosci. 9:520. doi: 10.3389/fnhum.2015.00520

Keywords: melodic intonation therapy, melodic rhythmic therapy, aphasia, broca, music therapy

## Introduction

A frequent event after stroke is aphasia (with an incidence of 38% of cases; Pedersen et al., 1997; Engelter et al., 2006), which is multifaceted because of the structural and functional brain processes dedicated to, or involved in, language (Mesulam, 1990; Bachman and Albert, 1991; Bookheimer, 2002; Démonet et al., 2005; Jung-Beeman, 2005). Linguistic and non-linguistic processes (e.g., attention, memory, sensory or motor subroutines) are functionally related and their damage results in language impairment at varying levels of complexity. Accordingly, aphasia is a complex disorder (Huber et al., 1997; McNeil and Pratt, 2001); detailed pathophysiological characterization and proper approaches are mandatory for an efficient rehabilitative strategy to be devised and for disability in everyday and professional life to be compensated for (Black-Schaffer and Osberg, 1990; Holland et al., 1996; Paolucci et al., 1997; Robey, 1998; Tilling et al., 2001).

Some of the most common varieties of aphasia are classified in two major forms: fluent and non-fluent. The fluent form is generally characterized by an impaired ability to grasp the meaning of spoken words, while the ease of producing connected speech is not so critically affected. Wernicke's aphasia is therefore referred to as a "fluent aphasia". However, speech is far from normal. Sentences do not hang together and irrelevant words intrude, sometimes to the point of jargon in severe cases. Reading and writing are often severely impaired (Stringer and Green, 1996).

The second form of aphasia is characterized by a severe reduction of speech output, limited mainly to short utterances of fewer than four words. Vocabulary access is limited and the formation of sounds by persons with Broca's aphasia is often laborious and clumsy. The person may understand speech relatively well and be able to read, but be limited in writing (Stringer and Green, 1996). Broca's aphasia is often referred to as a "non-fluent aphasia", characterized by anomia (i.e., word-retrieval difficulty), agrammatism (i.e., grammar and syntax deficit), and apraxia of speech (AOS; a motor speech disorder affecting the planning or programming of speech movements; American Academy of Neurology, 1994; Ballard et al., 2000). However, while anomia is the core symptom of aphasia and is present in all aphasic syndromes, agrammatism and AOS are clinical markers used to differentiate Broca's from other aphasias. MIT (Albert et al., 1973) has shown little effect on agrammatism. Moreover, the hypothesis that MIT could be effective in Broca's aphasia rests on its action on the deficit in motor planning or programming of speech movements.

There are no indications for the application of MIT to patients with global aphasia, which is characterized by the production of no or few recognizable words and no or poor comprehension of spoken language. Moreover, patients with global aphasia can neither read nor write (Stringer and Green, 1996).

There is widespread consensus on the efficacy of the different rehabilitative approaches of the aphasic (Brain Injury Interdisciplinary Special Interest Group; American Congress of Rehabilitation Medicine; European Federation of Neurological Societies; Cappa et al., 2005; Ciceron et al., 2011). However, efficacy can vary among subjects. Variability depends on the applied rehabilitative procedure as well as on the intensity of treatment (Brindley et al., 1989; Poeck et al., 1989; Teasell et al., 2003), with better recovery after intensive and prolonged rehabilitation (Bhogal et al., 2003). Several rehabilitative procedures are available based on psycholinguistic (Schwartz and Fink, 1997; Lesser and Milroy, 2014), cognitive (Holland, 1994), psychosocial or pragmatic (Holland, 1991; Lyon et al., 1997; Elman, 1998) approaches. However, major limitations of, and source of criticism to the rehabilitation of the aphasic, rest on the inadequate tailoring of rehabilitative procedures to the individual patient's needs.

Among the rehabilitative procedures with neurobehavioral rationale, the Melodic Intonation Therapy (MIT) designed by Albert and co-workers (**Table 1**; Albert et al., 1973; Sparks et al., 1974; Sparks and Holland, 1976) has been rated promising (class III) by the Therapeutics and Technology Assessment Subcommittee of the American Academy of Neurology (American Academy of Neurology, 1994) and transferred with comparable results to non-English linguistic populations such as Romanian, Persian, and Japanese (Seki and Sugishita, 1983; Popovici and Mihailescu, 1992; Baker, 2000; Bonakdarpour et al., 2003). Van Eeckhout's adaptation to the French language (Melodic-Rhythmic Therapy, MRT; Van Eeckhout and Bhatt, 1984) has extended the training strategy by adding a rhythmic structure reproducing French prosody, i.e., with a melodic interval of a 4th and the presentation of sentences in rhythmic epochs matching the syntactic-semantic structure (Tranel, 1987; Rossi, 1999; Hind, 2002; Van Eeckhout, 2002). The approach proved efficient, which suggests that it could be applied to other languages. The purposes of this study were to adapt MRT rehabilitation procedures to the Italian language and to verify its efficacy in a group of patients with severe non-fluent aphasia.

## Materials and Methods

#### Adaptation

Rhythmic-temporal and melodic-intonative plans are considered the characteristic features of the prosodic, or suprasegmental, aspects of language. In particular, in speech, rhythm refers to the prominent elements rather than to the phonetic string, while tone refers to variations in pitch and loudness (Marotta, 2009). The basic unit of rhythm is the syllable (phonetically and phonologically defined as an agglomeration of phonic elements around an intensity or loudness peak), and the alternation of strong and weak syllables underlies the analysis and the creation of a rhythmic pattern. Prominence, or force, is determined by the accent, i.e., an increase of intensity, duration and pitch with respect to the adjacent elements (Savy, 2009).

#### TABLE 1 | Melodic intonation therapy (MIT).


MIT is a "formal" treatment with hierarchical structure developed in four levels, assuming a key role of the right hemisphere in the control of the speech accent, intonation and melodic pattern.


Depending on the characteristics of their rhythm, natural languages have been divided into stress-timed (with regular intervals between the accents) and syllable-timed (with constant syllabic duration), such as Italian. In addition, other elements allow languages to be classified, such as the compressibility of unstressed syllables: languages that allow it are defined as "compensation languages" (e.g., English), whereas languages that do not allow it are defined as "check languages" (e.g., Italian; Romito and Trumper, 1993).

Among the models developed to analyze the rhythmic characteristics of languages, a recent one, the Control/Compensation Index, groups languages so as to assign each of them to the syllable- or stress-timed group (Bertinetto and Bertini, 2008). Unlike other prosodic elements, the tone is intrinsically significant. Even Italian, although it is not a tonal language (such as Mandarin), can be represented as a sequence of two types of discrete tones, High and Low; e.g., a decreasing tone characterizes a "declaration", and an increasing tone characterizes a "question" (Bertini and Bertinetto, 2007).

A peculiarity of MRT in its French version is the use of a core melodic sequence based on two notes (high and low, respectively), with stressed accentuation and a slow scanned rhythm (Van Eeckhout et al., 1995; **Table 2**; **Figures 1C,D**). The French language is characterized by a consonant at the end of most words, a tonic accent falling most often on the last vowel, a high prolonged note at the beginning of most sentences, and the subdivision of sentences into syntactic-semantic units (**Tables 3**, **4**). The Italian language is instead organized in "tonal units" (Hart et al., 2006) that contribute to the rhythmic scan of the sentence and also to the communication of meaning (Cresti, 1987). MRT adaptation to the Italian language was therefore performed by adjusting to the tone and prosody properties of this language (**Tables 3**, **4**; **Figures 1**, **2A,B**) and by taking into due consideration the role of these properties in spontaneous linguistic communication (Chapallaz, 1964; Austin, 1975; Bertinetto, 1981; Vayra and Fowler, 1987; De Dominicis and Vineis, 1992; Bertinetto and Magno Caldognetto, 1993; Savy et al., 2004). To this end, the tonal interval of a major 3rd has been selected (Romano, 2001; Romano and Interlandi, 2005), with the high and low notes positioned where the tonic accent falls and a low note at the end of the sentence (with the exception of words with the last syllable stressed or interrogative sentences).

#### TABLE 2 | MRT parameters.


#### FIGURE 1 | French and Italian interrogative vs. declarative sentences.

(A) and (C) interrogative sentences in Italian and French Language respectively; (B) and (D) declarative sentences in Italian and French Language respectively.

#### TABLE 3 | Intonation/prosody peculiarities of Italian language.


#### Patients

Six patients (five males) with ischemic stroke in the dominant hemisphere and non-fluent Broca's aphasia rated severe on the Aachener Aphasie Test (AAT; Huber et al., 1983; Luzzatti et al., 1996) were admitted to the study at least 9 months after brain injury. Age was 59.8 ± 9.3 years (range: 53–71 years); education ranged from grammar school to university. In all cases, brain damage was unilateral; spontaneous speech, word articulation and repetition of single words were impaired; comprehension of spoken language was maintained; acoustic perception was documented across a wide range of sound frequencies; patients were motivated and emotionally stable. All subjects had been treated by conventional speech therapy rehabilitation procedures for 3–17 months before entering the study. Upon admission to the study, they underwent a baseline standard assessment of their residual language performance (AAT), which proved to be superimposable on the last evaluation, made at the end of the traditional treatment. The sample size and follow-up evaluation met the requirements of the "American Academy of Neurology, Therapeutics and Technology Assessment Subcommittee". The study has been approved by the local public health care Ethical Committee. Subjects were informed in full detail about the study purpose and experimental procedures; moreover, the codes and policies for research (Resnik, 2011) and the ethical principles of the Declaration of Helsinki (1964) by the World Medical Association concerning human experimentation were followed.

#### MRT Rehabilitation Protocol

Patients were treated intensively (Schlaug et al., 2009; Wan et al., 2014) 4 days a week for 16 weeks. Each session lasted 30–40 min; apraxia was treated for about 10 min at the beginning of the rehabilitation session. In all cases, the following procedures were applied in hierarchical sequence:


progressively reduced with the patient improving repetition of all sentences by him/herself. Each sentence was attributed the score 1 (if adequately repeated and understandable) or zero. Sentences difficult to pronounce were presented again after focusing on the problem. Upgrading to phase 3 was allowed when 90% of sentences were properly reproduced without the therapist's help and visual contact.

3. In the third phase, sentences were drawn from the patient's daily life (life in hospital, news, etc.) and the subjects were required to use the rhythmic-melodic scan to communicate (**Table 5**).
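The scoring rule for progression to the third phase (each sentence scored 1 or 0, with upgrading once 90% of sentences are properly reproduced) can be sketched as follows. This is an illustrative Python sketch only: the function name `may_advance_to_phase_3` and the example scores are hypothetical, and reading "90%" as "at least 90%" is an assumption.

```python
def may_advance_to_phase_3(scores, threshold_pct=90):
    """Return True when at least `threshold_pct` percent of sentences scored 1.

    `scores` holds one entry per sentence: 1 if adequately repeated and
    understandable without the therapist's help, 0 otherwise.
    Assumption: the criterion is "at least 90%", not "exactly 90%".
    """
    if not scores:
        raise ValueError("no sentences were scored")
    if any(score not in (0, 1) for score in scores):
        raise ValueError("each sentence must be scored 0 or 1")
    # Integer comparison avoids floating-point rounding at the threshold.
    return 100 * sum(scores) >= threshold_pct * len(scores)


# Hypothetical session: 9 of 10 sentences reproduced correctly.
session_scores = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
print(may_advance_to_phase_3(session_scores))
```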

## Follow-Up

The efficacy of MRT was assessed by means of the AAT both at the end of rehabilitation and 6 months later.

### Data Analysis

The differences between baseline and end of treatment, as well as the differences between end of treatment and follow-up, were tested statistically by Wilcoxon's exact test (Siegel, 1956; Gibbons and Chakraborti, 2011), which is more accurate for small samples or when the tables are sparse or unbalanced (Tanizaki, 1997; Mundry and Fischer, 1998; Gibbons and Chakraborti, 2011). The effect size (r; i.e., the index measuring the magnitude of difference or change between two conditions, in this case baseline vs. end of the protocol; Rosenthal, 1991) was calculated as r = z/√N (where N is the number of observations on which z is based) and will hereafter be referred to as not relevant (r < 0.1), small (0.1 < r < 0.3), medium (0.3 < r < 0.5), or large (r > 0.5; Hemphill, 2003).
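As a worked illustration of the effect size formula, the following Python sketch computes r from a z value and labels its magnitude. Several points are assumptions rather than statements from the study: N = 12 (six patients assessed under two conditions) is inferred because it reproduces the reported r values to within rounding, the absolute value of z is used because the reported r values are positive, and the handling of values falling exactly on a threshold is arbitrary, since the text uses strict inequalities. The function names are hypothetical.

```python
import math


def effect_size_r(z, n_observations):
    """Effect size r = |z| / sqrt(N), with N the number of observations."""
    return abs(z) / math.sqrt(n_observations)


def magnitude_label(r):
    """Label r with the thresholds quoted in the text (Hemphill, 2003).

    Assumption: a value exactly on a threshold goes to the higher category.
    """
    if r < 0.1:
        return "not relevant"
    if r < 0.3:
        return "small"
    if r < 0.5:
        return "medium"
    return "large"


# z reported for the semantic-lexical structure; N = 12 is an assumption.
r = effect_size_r(-2.220, 12)
print(f"{r:.3f}", magnitude_label(r))
```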

## Results

At baseline, the patients' speech was restricted to a few fragmentary and scarcely understandable sentences. Anomias, agrammatisms, phonemic paraphasias, neologisms and perseverations were observed, indicating only partial efficacy of conventional rehabilitation on spontaneous speech. Spontaneous speech (as measured by the AAT test; **Figure 3**; **Table 6**) was


improved at the end of MRT, specifically in the semantic-lexical structure (Wilcoxon exact test: z = −2.220, p = 0.031, r = 0.640), phonemic structure (Wilcoxon exact test: z = −2.226, p = 0.031, r = 0.642), speech automatism (Wilcoxon exact test: z = −2.332, p = 0.031, r = 0.673), prosody (Wilcoxon exact test: z = −2.333, p = 0.031, r = 0.673) and communication (Wilcoxon exact test: z = −2.264, p = 0.031, r = 0.653). Moreover, improvements were found in the correct repetition (Wilcoxon exact test: z = −2.207, p = 0.031, r = 0.637), naming (Wilcoxon exact test: z = −2.201, p = 0.031, r = 0.635), and comprehension (Wilcoxon exact test: z = −2.201, p = 0.031, r = 0.635) subtests (**Figure 3**; **Table 7**). The number of words pronounced per unit of time increased, and phonemic structure and syntax also improved. At follow-up, the AAT ratings for spontaneous speech and for all subtests were comparable to those recorded at the end of rehabilitation (z ≤ −1.633, p ≥0.125).

## Discussion

The emerging research field of music and neuroscience has shown that sound envelope processing (Kotz and Schwartze, 2010; Patel, 2011; Peelle and Davis, 2012) and synchronization and entrainment to a pulse may help to stimulate brain networks for human communication (Fujii and Wan, 2014). The circuits that may help to stimulate these networks could be: (1) the auditory afferent circuit, consisting of brainstem, thalamus, cerebellum, and temporal cortex, for precise encoding of sound envelope and temporal events (Kotz and Schwartze, 2010); (2) the subcortical–prefrontal circuit for emotional and reward-related processing (Koelsch, 2014); (3) the basal ganglia-thalamo-cortical circuit for processing beat-based timing (Kotz and Schwartze, 2011); and (4) the cortical motor efferent circuit for motor output (Meister et al., 2009a,b).

The cortical functional re-organization underlying recovery of language remains poorly understood. PET or fMRI studies have related the recovery of impaired language to increased activation of cortical areas of the right hemisphere involved in language (Cappa et al., 1997; Thulborn et al., 1999; Gold and Kertesz, 2000), an activation that others regard as the result of inadequate compensation processes (Rosen et al., 2000; Perani et al., 2003; Naeser et al., 2004), while other studies indicate activation of perilesional areas in the left hemisphere as the key mechanism for an efficient recovery to occur (Karbe et al., 1998; Cao et al., 1999; Heiss et al., 1999). PET studies on aphasic patients with no spontaneous recovery undergoing MIT rehabilitation (Belin et al., 1996; Warburton et al., 1999) have documented activation of left Broca's area and concomitant inhibition of contralateral Wernicke's area. Compensatory re-activation, in response to MIT/MRT rehabilitation, of the left hemisphere structures involved in language (e.g., Heschl's gyrus, temporal pole, angular gyrus, Broca's area and adjacent prefrontal cortex) is therefore a workable hypothesis. Further investigation is required to correlate the efficacy of MRT rehabilitation with the extent of brain damage at baseline and changes in the

#### TABLE 6 | AAT speech language.


#### TABLE 7 | AAT sub-test.


brain functional organization as documentable e.g., by advanced neuroimaging techniques.

#### Conclusion

Use of MRT in neo-latin countries required adjustment to the language metrics. Adaptation to Italian language rhythm and prosody (Pöchhacker, 1994; Hart et al., 2006) according to metrics criteria (**Tables 3**,**4**) proved successful: impaired speech improved in our sample of chronic patients. Albeit preliminary, these findings support the use of MRT in rehabilitation after stroke. Specifically, MRT seems to benefit from a stronger structure than the available stimulation-facilitation procedures and allows a better quantification of the rehabilitation efficacy. In this regard, it compensates in

#### References


part for the current problems in documenting the individual patient's improvement in his/her interacting environment (unit, family, everyday life) and appears more reliable than standard evaluation scales (AAT, BDAE etc.). The observation of improved written language suggests possible selective application in the treatment of this deficit and calls for further research on the mechanisms underlying writing and its impairment in aphasia.

### Author Contributions

All authors were involved in the conception and design or analysis and interpretation of data, have contributed to the drafting and revisions of the manuscript, and have approved the submitted version.


of poststroke aphasia. Ann. Neurol. 45, 430–438. doi: 10.1002/1531-8249(199904)45:4<430::aid-ana3>3.0.co;2-p


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Cortese, Riganello, Arcuri, Pignataro and Buglione. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neurologic Music Therapy Training for Mobility and Stability Rehabilitation with Parkinson's Disease – A Pilot Study

*Anna A. Bukowska1,2\*, Piotr Krężałek3, Elżbieta Mirek2,4, Przemysław Bujas5 and Anna Marchewka2*

*<sup>1</sup> Department of Occupational Therapy, The University of Physical Education in Krakow, Krakow, Poland, <sup>2</sup> Department of Clinical Rehabilitation and Laboratory of Pathology of the Musculoskeletal System, The University of Physical Education in Krakow, Krakow, Poland, <sup>3</sup> Department of Physiotherapy, The University of Physical Education in Krakow, Krakow, Poland, <sup>4</sup> Section of Rehabilitation in Neurology and Psychiatry, The University of Physical Education in Krakow, Krakow, Poland, <sup>5</sup> Department of Theory of Sport and Kinesiology, The University of Physical Education in Krakow, Krakow, Poland*

Idiopathic Parkinson's Disease (PD) is a progressive condition with gait disturbance and balance disorder as the main symptoms. Previous research studies focused on the application of Rhythmic Auditory Stimulation (RAS) in PD gait rehabilitation. The key hypothesis of this pilot study, however, assumes the major role of the combination of all three Neurologic Music Therapy (NMT) sensorimotor techniques in improving spatio-temporal gait parameters and postural stability in the course of PD. The 55 PD-diagnosed subjects invited to the study were divided into two groups: 30 in the experimental and 25 in the control group. Inclusion criteria included Hoehn and Yahr stages 2 or 3, the ability to walk independently without any aid and stable pharmacological treatment for the duration of the experiment. In order to evaluate the efficacy of the chosen therapy procedure the following measures were applied: Optoelectrical 3D Movement Analysis System BTS Smart for gait, and Computerized Dynamic Posturography CQ Stab for stability and balance. All measures were conducted both before and after the therapy cycle. The subjects from the experimental group attended music therapy sessions four times a week for 4 weeks. Therapeutic Instrumental Music Performance (TIMP), Patterned Sensory Enhancement (PSE) and RAS were used in every 45-min session for practicing daily life activities, balance, pre-gait, and gait pattern. Percussion instruments, the metronome and rhythmic music were the basis for each session. The subjects from the control group were asked to stay active and perform daily life activities between the measures. The research showed that the combination of the three NMT sensorimotor techniques can be used to improve gait and other rhythmical activities in PD rehabilitation. The results demonstrated significant improvement in the majority of the spatiotemporal gait parameters in the experimental group in comparison to the control group.
In the stability tests with eyes closed, substantial differences were revealed, indicating improvement of proprioception (the sense of body position and movement). These findings suggest a new compensatory strategy for movement and postural control through the use of the auditory system.

Keywords: Parkinson's disease, gait, stability, neurologic music therapy

#### *Edited by:*

*Julian O'Kelly, Royal Hospital for Neuro-disability, UK*

#### *Reviewed by:*

*Urs Nater, University of Marburg, Germany Alexander J. Street, Anglia Ruskin University, UK*

> *\*Correspondence: Anna A. Bukowska annabookowska@gmail.com*

*Received: 27 July 2015 Accepted: 18 December 2015 Published: 26 January 2016*

#### *Citation:*

*Bukowska AA, Krężałek P, Mirek E, Bujas P and Marchewka A (2016) Neurologic Music Therapy Training for Mobility and Stability Rehabilitation with Parkinson's Disease – A Pilot Study. Front. Hum. Neurosci. 9:710. doi: 10.3389/fnhum.2015.00710*

## INTRODUCTION

Idiopathic Parkinson's disease (PD) is one of the most common, progressive, degenerative conditions of the central nervous system. It causes significant physical disability and leads to impairment in cognitive functions. PD affects 1% of people over the age of 60 in all developed countries. About 90,000 cases are currently registered in Poland (De Rijk et al., 1995; Friedman, 2005; Drapała et al., 2014; Opara, 2014).

The impairment of automatic and rhythmic movements is described in the pathophysiology associated with PD. The disease is caused by the loss of dopamine production and leads to changes in the central nervous system, impairing the neural networks including the basal ganglia and supplementary motor areas. The consequences of progressing PD are increasingly felt by patients as an abnormal gait pattern, balance problems and difficulties in performing daily life activities. It leads to alterations in functioning, which results in a lowering of the quality of life (Freeman et al., 1993; Bergman and Deuschl, 2002; Martin et al., 2002; Purves et al., 2004). Typical changes in walking patterns among the PD population are described as gait slowness with the reduction or absence of reciprocal arm swing and increased double support. Furthermore, motion amplitude scaling is impaired, resulting in the inability to generate sufficient step length. Gait changes in PD also include asymmetry between the sides of the body and a reduction of bilateral coordination. Loss of the ability to produce an internal walking rhythm causes an increase of asymmetry between steps (Blin et al., 1990; Hausdorff, 2009; Peterson et al., 2012). According to the Hoehn and Yahr (1967) scale, slight changes in balance are noticeable from the very beginning of the disease progression. However, considerable postural instability, based on proprioception, is described in the progression of the disease between the second and third H&Y stages. It engenders balance impairment, not only during gait but also in the standing position (Hoehn and Yahr, 1967; Konczak et al., 2009).

Despite the degenerative nature of PD, pharmacological treatment and proper rehabilitation enable a patient to function well and participate actively in daily life for years after the diagnosis. Research and clinical experience indicate the significance of the appropriate selection of therapeutic interventions, exercises, and stimulation for an optimal effect which can be translated into a better quality of life for PD patients (Keus et al., 2004, 2007; Aragon and Kings, 2010). The current data shows that approximately 1% of people of 60 years of age in Poland are annually diagnosed with PD. The aging population may lead to a significant increase in numbers of patients with this disease in the future (Drapała et al., 2014). Therefore, it is crucial to search for new therapeutic solutions.

The use of music in neurorehabilitation is grounded in neurophysiological theories, and research on the influence of music on cognitive processes and motor learning principles (Bayona et al., 2005; Kitago and Krakauer, 2012). The therapeutic approach established 20 years ago in the US called Neurologic Music Therapy (NMT) is known as an effective approach in neurorehabilitation (Thaut, 2005). NMT concepts distinguish three sensorimotor techniques, with motor skills improvement as an overall goal. The first one, Rhythmic Auditory Stimulation (RAS), is a technique that aims to develop and maintain a physiological rhythmic motor activity (gait) through rhythmic auditory cues. This technique has been proven effective for gait rehabilitation in PD (McIntosh et al., 1997; Thaut et al., 1998; Thaut and Rice, 2014). The second technique is Patterned Sensory Enhancement (PSE). The objective of this technique is to facilitate movements associated with the activities of daily life, not necessarily rhythmical in nature. PSE uses complex music elements (pitch, dynamics, harmony, meter, and rhythm) to enhance and organize movement patterns in time and space, and to favorably affect the activity, muscle coordination, strength, balance, postural control and range of motion (Thaut, 2014a). The last technique, Therapeutic Instrumental Music Performance (TIMP), employs musical instruments as task-oriented training to simulate and facilitate functional movements. The technique most commonly uses percussion instruments, playing them in a traditional or non-traditional way to improve range of motion, limb coordination, postural control, dexterity, body perception, and sensation (Thaut, 1999; Mertel, 2014). In order to optimize the music therapy process, NMT uses the Transformational Design Model (TDM) to translate theoretical knowledge into clinical practice. It promotes effective assessment, design and implementation of therapeutic music interventions (Thaut, 2005, 2014b).

The purpose of this pilot study was to evaluate the efficacy of music and rhythm for mobility and balance in a group of patients with PD. The key hypothesis assumed the major role of the combination of all three NMT sensorimotor techniques in improving spatio-temporal gait parameters and postural stability in the course of PD. The research question asked how effective the combination of TIMP, PSE, and RAS is in the motor treatment of PD patients.

## MATERIALS AND METHODS

## Participants

Fifty-five subjects diagnosed with PD participated in the research project. All of them were recruited from the database of the Neurology and Neurosurgery Clinic of Jagiellonian University in Krakow. The inclusion criteria were: informed consent for participation in the experiment, Idiopathic PD diagnosed by a neurologist, stages two and three of the disease according to Hoehn and Yahr (1967; H&Y), the ability to walk independently without any aid for at least 6 m × 8 m distance, and stable pharmacological treatment for the duration of the experiment. The exclusion criteria were: lack of informed consent for participation in the experiment, musculoskeletal injuries (e.g., fractures and prosthesis), diagnosis of dementia including Alzheimer's disease (*MMSE <* 25), and frequent changes in medications. The subjects displayed similar symptoms due to the duration and severity of the condition and were randomly assigned to two groups. The experimental group E (*n* = 30 individuals) was involved in the NMT program. The average age in this group of patients was 63.4 years. In the control group K (*n* = 25 individuals) participants were asked to maintain their daily life activities (changing of position, walking, climbing stairs). The average age of the control group was 63.44 years. The characteristics of each group are presented in **Table 1**.

The NMT sessions were held in the Department of Rehabilitation at the Neurology and Neurosurgery Clinic of Jagiellonian University in Krakow. The research was registered in The Ministry of Science and Higher Education, Poland. The clinical trial was approved by the Bioethical Committee of The Supreme Medical Council in Krakow, Poland.

## Methods

In order to evaluate the efficacy of the chosen therapy procedure the following measures were applied: Optoelectrical 3D Movement Analysis System BTS Smart to obtain gait parameters and Computerized Dynamic Posturography CQ Stab to measure stability. All measures were conducted directly before and directly after 4 weeks for the experimental group (therapy cycle) and for the control group (Motion Analysis Lab, Department of Physiotherapy, University of Physical Education in Krakow, Poland).

## Gait Assessment

To measure the temporal and spatial gait parameters (stance and swing phase, double support, stride time and cadence, step and stride length, velocity and step width) Optoelectrical 3D Movement Analysis System BTS Smart was used (Davis et al., 1991; Marchewka, 2005; Rutowicz et al., 2005; Cygoń, 2011). BTS SMART captures movement at a frequency of 70 Hz. The system consists of six analog cameras with infrared light and BTS SMART-Analyzer software. Before the survey, anthropometric measures such as body height and weight, width of pelvis, depth of pelvis, diameter of knee, width of ankle, and the length of lower limbs were taken. According to the Davis Protocol, 22 passive reflective markers were attached to every subject's body. All anthropometric measures were taken and all markers were attached by the same previously trained person. The BTS System was calibrated before the measurement, which enabled a spatial assessment of distance between the markers. The gait assessment consisted of 10 s of static capture and six walking trials (to increase reliability) on the 8 m track. Walking at the participant's spontaneous velocity took place without any stimulation during movement; only a starting command was used. Collected static and dynamic data were used to generate a report containing the temporal and spatial gait parameters.
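As an illustration of how temporal and spatial parameters of this kind relate to one another, the following sketch derives stride time, cadence, and velocity from heel-strike events sampled at 70 Hz. It is not the algorithm of the BTS SMART-Analyzer software, which the text does not describe; the function names, the heel-strike frame numbers, and the stride length are hypothetical.

```python
FRAME_RATE_HZ = 70.0  # capture frequency reported in the text


def stride_times(heel_strike_frames, frame_rate_hz=FRAME_RATE_HZ):
    """Stride times (s) between consecutive heel strikes of the same foot."""
    return [(later - earlier) / frame_rate_hz
            for earlier, later in zip(heel_strike_frames, heel_strike_frames[1:])]


def cadence_steps_per_min(mean_stride_time_s):
    """One stride contains two steps, so cadence = 2 * 60 / stride time."""
    return 120.0 / mean_stride_time_s


def velocity_m_per_s(stride_length_m, stride_time_s):
    """Walking velocity as stride length divided by stride time."""
    return stride_length_m / stride_time_s


# Hypothetical frame numbers of successive right-foot heel strikes.
right_heel_strikes = [10, 87, 164, 241]
times = stride_times(right_heel_strikes)
mean_stride_time = sum(times) / len(times)
cadence = cadence_steps_per_min(mean_stride_time)
velocity = velocity_m_per_s(1.1, mean_stride_time)  # hypothetical 1.1 m stride
print(round(mean_stride_time, 2), round(cadence, 1), round(velocity, 2))
```

Under these definitions a shorter stride time raises cadence, and velocity rises when stride length increases or stride time shortens.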

## Stability Assessment

In Romberg's test, which assessed stability, Computerized Dynamic Posturography CQStab was applied (Świerc, 2009). The recording was made twice in the static position; each recording lasted 30 s and was carried out first with eyes open and then with eyes closed (Błaszczyk and Czerwisz, 2005; Ocetkiewicz et al., 2006; Strzecha et al., 2008).
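Posturographic parameters such as sway path and mean velocity of the center of pressure (COP) are commonly derived from the recorded COP trajectory. The sketch below shows one common way to compute them for a 30 s recording. It is an assumption-laden illustration, not the CQStab implementation: the exact definitions used by that system are not given in the text, the COP samples are hypothetical, and treating the second coordinate as the sagittal (anterior-posterior) axis is a convention chosen here.

```python
from math import hypot

RECORDING_S = 30.0  # duration of each recording reported in the text


def sway_path_mm(cop_points):
    """Total sway path: summed distance between consecutive COP samples."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(cop_points, cop_points[1:]))


def axis_path_mm(values):
    """Sway path along a single axis (frontal or sagittal plane)."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def mean_velocity_mm_per_s(path_mm, duration_s=RECORDING_S):
    """Mean COP velocity: path length divided by recording time."""
    return path_mm / duration_s


# Hypothetical COP samples in mm: (medio-lateral x, anterior-posterior y).
cop = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0), (0.0, 6.0)]
total_path = sway_path_mm(cop)
sagittal_path = axis_path_mm([y for _, y in cop])
frontal_path = axis_path_mm([x for x, _ in cop])
print(total_path, sagittal_path, frontal_path,
      round(mean_velocity_mm_per_s(total_path), 3))
```

A shorter path and a lower mean velocity over the same recording time indicate less sway.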

## Neurologic Music Therapy Procedure

The therapeutic program for the experimental group included 4 weeks of individual 45-min sessions of NMT four times a week. At the time of the sessions the subjects were in an "on" phase (the best possible mobility). Participation in the sessions did not require any previous knowledge or musical skills. Each of the therapy sessions took place according to the same scheme. It comprised practicing activities of daily living, balance, pregait and gait training by using sensorimotor NMT techniques: TIMP, PSE, and RAS. For planning the therapy sessions the TDM was employed. Percussion instruments (cajon, conga, drums, maracas, and tambourine), a metronome and recorded rhythmic music were the basis for each session. Different sizes, shapes and sounds of the instruments provided numerous possibilities for motor activity stimulation.

Applied music, through its main elements (pitch, dynamics and harmony, meter, tempo and rhythm), supported the organization of movement in time and space, introduced movement fluency, gave an impulse to the muscle and provided rhythmic instructions to initiate and continue the activity. Rhythmic music, mostly African and Indian, was selected by the music therapist. The rhythmic structure of the music gave a temporal cue for movement independently of the participants' music preferences (pure sensorimotor stimulation). For this reason participants were not asked about their music preferences. MP3 recordings were played during the session, and the volume was adjusted individually to the auditory perception of the participant. The metronome provided an additional analog auditory cue to feed the exact tempo and rhythm during pre-gait and gait exercises. In order to enhance the effect, the metronome tone was embedded


into the music (Thaut, 1999; Mertel, 2014; Thaut and Rice, 2014).

## Research Protocol (NMT Training)

Warm-up exercises provided the initial part of the session. The aim was to increase the range of motion of the trunk and limbs, muscle tension adjustment (reduction of stiffness) and preparation of the whole body for further activities. Participants performed rotations of the upper and lower body, stretching exercises, and movements of the upper and lower limbs with TIMP.

Exercises of the activities of daily living (ADL) were addressed in the next part of the session. Movements similar to those performed in daily life were simulated through PSE and TIMP, including turning, changing of position (moving from lying to sitting, from sitting to standing, from standing to sitting), reaching for an object to the front, above the head, reaching back, stepping up and down.

Following ADL exercises, the session focused on pre-gait training. In this section TIMP and PSE facilitated gait phases (stance or swing), step length, shifting body weight to the side and forward (balance reactions), coordination and reciprocal movements of the upper and lower limbs.

Stimulation of gait pattern provided the final part of the session. RAS was used for improving gait speed, step length, walking up and down the stairs with assistance of metronome and music. Advanced walking exercises were also practiced: initiating and stopping to the musical cues, turning, jumping, braiding, and backward gait. The session concluded with breathing exercises accompanied by relaxing and calming music (examples of NMT training – **Supplementary Figures S1–S3**).

## Statistical Analysis

A power calculation following Cohen's (1992) methodology was used for this pilot study. The Shapiro–Wilk test was applied to assess the normality of distribution. Since the data were not normally distributed in either group, non-parametric tests were employed for all comparisons. In order to assess the difference in gait and stability parameters between trial I and II, a Wilcoxon matched-pairs test was applied. A Mann–Whitney *U* test was used to evaluate the significance of differences between the examined groups. Statistical analysis was performed using Statistica 10 Statsoft and the statistical package R 3.1.2 (R Development Core Team, 2009).
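For readers unfamiliar with the between-group test, the following sketch computes the Mann–Whitney *U* statistic for two independent samples, here imagined as change scores (trial II minus trial I) in each group. It is a plain-Python illustration and not the Statistica or R procedure used in the study: the function name and the data are hypothetical, and the p-value relies on a normal approximation without tie or continuity correction, which is only rough for samples this small.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U for x, z, two-sided p) using a normal approximation.

    Sketch only: midranks are used for ties, but the variance carries
    no tie correction and no continuity correction is applied.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i+1 .. j (1-based) share the same value: use their midrank.
        midrank = (i + 1 + j) / 2.0
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2.0))
    return u_x, z, p


# Hypothetical change scores (e.g., change in step length, cm).
experimental_change = [5.0, 6.0, 7.0, 8.0]
control_change = [1.0, 2.0, 3.0, 4.0]
u_value, z_value, p_value = mann_whitney_u(experimental_change, control_change)
print(u_value, round(z_value, 3), round(p_value, 4))
```

When every value in one sample exceeds every value in the other, *U* reaches its maximum of n1 × n2, as in this hypothetical example.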

## RESULTS

Prior to the research the distribution of characteristics within both groups selected for the study was compared. All *p*-values were higher than 0.05; thus, no significant differences were demonstrated between the two groups with regard to gender, age, duration of disease, H&Y stage scale and location of the weaker side (**Table 1**).

After the first examination of the groups, differences in terms of obtained gait and stability parameters were analyzed, presenting no significant differences between measured parameters. The summary of the results attained from the measures (performed twice in each group) is displayed in the tables and figures below. To address the research question, the outcomes of the first and second tests were compared. Then the significance of differences in obtained parameters between the groups was examined.

## Gait

In order to examine the effect of applied NMT on motor performance, changes in all the measured spatial and temporal gait parameters were thoroughly analyzed. The comparison of results of temporal and spatial parameters from trial I and II is presented in **Tables 2** and **3**.

## Analysis of Temporal Gait Parameters

In the experimental group, the second measure of the temporal gait parameters was significantly higher than the first for the duration of swing phase and cadence. The second measure of duration of stance phase, the double support time and the stride time was significantly lower than the first. The comparison of the results of temporal gait parameters in the control group did not reveal any statistically significant differences between trial I and II (**Table 2**).

## Analysis of Spatial Gait Parameters

In the experimental group, the second measure of spatial gait parameters was significantly higher than the first one for the step length, velocity and the stride length. Analysis of the results for the same parameters in the control group also demonstrated significant differences, but the significance level was lower than in the experimental group. Step width did not differ significantly in either of the examined groups (**Table 3**).

## Comparison of Changes in Temporal Parameters Between the Groups

In the experimental group the shortening of the parameters such as stance phase, double support time, and stride time was more considerable than in the control group. Moreover, for the subjects from the experimental group the extension of the swing phase and increasing of cadence was significantly higher than for the controls (**Table 4** and **Figure 1**).

## Comparison of Changes in Spatial Parameters Between the Groups

A significance level lower than 0.001 indicated a statistically significant difference in the elongation of both step and stride length in the experimental group in comparison to the control group. The increase of velocity in the experimental group was also higher than in the control group. Furthermore, the increase of step width was significantly higher in the control group than in the experimental group (**Table 5** and **Figure 2**).

**Figures 1** and **2** graphically depict significant differences in the changes of gait parameters between the groups, demonstrating the efficacy of the applied therapy. The second

#### TABLE 2 | Temporal parameters (I and II trials).


gait measure in the experimental group was significantly different for the majority of parameters in comparison to the first one. There were also statistically significant differences in the comparison of both groups. These outcomes indicate the effectiveness of the research protocol employed to improve the temporal and spatial parameters of gait in PD.

### Stability

In order to determine the effect of NMT on stability, the changes in the parameters obtained in Romberg's test (with eyes open and eyes closed) were examined. As in the case of the gait parameters, the results of trial I and II were compared within the experimental and the control group. The differences between the groups were also analyzed.

## Stability Parameters from the Test with Eyes Opened

Out of 17 measured parameters in the test with eyes open, only two changed significantly in the experimental group: the center of pressure mean frequency (MF-EO) measured in Hz (*p* = 0.019) and the amount of sway of the center of pressure in the sagittal plane (LWAP-EO; *p* = 0.003). The same parameters did not change significantly in the control group, where only one parameter changed significantly (**Figure 3**): the mean velocity of the center of pressure in the frontal plane (MVML-EO; *p* = 0.049).

## Stability Parameters from the Test with Eyes Closed

The comparison of the test results with eyes closed in the experimental group showed significant changes in five parameters. In the second trial the total sway path (SP-EC) calculated in both planes (*p* = 0.032), the sway path (SPAP-EC) in millimeters calculated in the sagittal plane (*p* = 0.019), the mean velocity of the center of pressure (MV-EC) in both planes (*p* = 0.035), the mean velocity of the center of pressure (MVAP-EC) in the sagittal plane (*p* = 0.023) and the amount of sway (LWAP-EC) in the sagittal plane (*p* = 0.021) were significantly reduced (**Figure 4**). For the control group, the same parameters showed no significant difference, whereas statistically significant differences were observed in the sway path (SPML-EC) in the frontal plane (*p* = 0.031) and the mean velocity of the center of pressure (MVML-EC) in the frontal plane (*p* = 0.025).

## Group Comparisons

The comparison of changes in parameters of Romberg's test between the two groups displayed almost no significant differences, either in the test with eyes open or with eyes closed. The only parameter that was significantly reduced in the experimental group (compared to the controls) was the amount of sway (LWAP-EO) in the sagittal plane in the test with eyes open.

The results obtained from Romberg's test with eyes opened changed significantly in the experimental group only with regard to two parameters. Due to these outcomes it is impossible to confirm the effectiveness of the research protocol in the improvement of the stability of people with PD.


#### TABLE 3 | Spatial parameters (I and II trials).

Significant changes in the five parameters with eyes closed in the sagittal plane might indicate an improvement of proprioception (a part of somatic sensory system responsible for the sense of body position and movement – Purves et al., 2004) and body perception in the same group of patients.

## DISCUSSION

The key hypothesis of this project assumes the major role of the combination of all three NMT sensorimotor techniques in improving spatio-temporal gait parameters and postural stability in the course of PD. During 4 weeks of a rehabilitation program based on sensorimotor NMT techniques conducted with PD patients, functional movements, stability and locomotion were stimulated with methods incorporating the use of music with a strong sense of rhythm. To confirm the first part of the hypothesis, nine spatio-temporal parameters of gait were calculated and compared by the Optoelectrical 3D Movement Analysis System BTS Smart. The comparison of these results in





#### TABLE 5 | Spatial parameters – groups comparison (E and C).

length (*p <* 0.001) was observed in comparison to the control group. In the control group only the step width (*p* = 0.034) was higher than in the experimental group. This may be a sign of increasing balance disorders in the control group, resulting in the widening of the base of support. The obtained outcomes confirm the high efficiency of the applied NMT. The sensorimotor techniques employed in this study significantly differentiated the experimental group and the control group in terms of the measured gait parameters.

Currently, the foundation of gait treatment in PD is the application of pharmacology, supported by training with cueing. Research shows that reduced joint amplitude caused by the illness can be normalized through receiving levodopa and cues, which in turn results in the improvement of motor behavior and a bypassing of the damaged motor mechanisms in the basal ganglia (Morris et al., 2005). In this study, rhythmic auditory cues were applied in the experiment to improve gait. It should be noted that it is very important to choose an appropriate rhythmic stimulation. Optimal rhythmic stimulation, adjusted to the patient's preferred velocity, ranges from 60 to 150 beats per minute. A metronome tempo set below each client's preferred velocity reduces stability, whilst a pace above this range negatively affects step length (Ebersbach et al., 1999; Arias and Cudeiro, 2008). Similarly, rhythmic cueing is most effective when music is selected to provide clear rhythmic instructions that control the pace and cadence of gait (Brown et al., 2009). The music of the patient's preference is not always suitable for RAS; therefore, the selection is usually made by a music therapist. For the purposes of this study, the selection of suitable music for physical stimulation was also conducted by the leading music therapist. The ability of the human brain to entrain to the rhythm of movement allowed the subjects to react to the beat, even if the selected music did not quite match the taste of the participants.
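The relationship between a patient's cadence and a metronome setting can be made concrete with a small calculation. The sketch below assumes one metronome beat per step, which is an assumption of this example rather than a rule stated in the text; the function names and the walking data are hypothetical, and only the 60 to 150 beats-per-minute range comes from the paragraph above.

```python
CITED_RANGE_BPM = (60.0, 150.0)  # range of rhythmic stimulation cited in the text


def cadence_to_bpm(steps, seconds):
    """Steps per minute, used here as the cue tempo (one beat per step)."""
    return steps * 60.0 / seconds


def beat_interval_s(bpm):
    """Time between two consecutive metronome beats."""
    return 60.0 / bpm


def within_cited_range(bpm, cited_range=CITED_RANGE_BPM):
    """True if the tempo lies inside the range cited in the text."""
    low, high = cited_range
    return low <= bpm <= high


# Hypothetical observation: 55 steps counted during 30 s of walking.
preferred_bpm = cadence_to_bpm(55, 30.0)
print(preferred_bpm, round(beat_interval_s(preferred_bpm), 3),
      within_cited_range(preferred_bpm))
```

A tempo derived this way only gives a starting point; the choice of the actual cue remains a clinical decision made by the therapist.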

Previously, most researchers tended to focus on only one of the NMT techniques – RAS – for gait facilitation in PD (McIntosh et al., 1997; Freedland et al., 2002; Hausdorff et al., 2007), whereas the authors of this study evaluated the effectiveness of an NMT protocol that combined RAS with the two remaining sensorimotor NMT techniques: TIMP and PSE. The authors decided not to measure the gait parameters during rhythmic stimulation, which would have revealed the immediate effect of the rhythmic auditory cues. Instead, they examined the effect of musical and rhythmic stimulation after a 4-week therapy cycle. Consequently, during the first and second measures patients were moving independently, without any cues during the movement; they received only the starting command. The second measure was taken the day after the therapy was concluded. According to available publications, the effect of RAS is maintained for several weeks after the completion of this form of therapy. This was noted, among others, in the Benoit et al. (2014) study. The measure of gait performed a month after the completion of the applied rhythmic therapy still demonstrated statistically significant changes in the velocity and length of steps. McIntosh stated that the application of RAS temporarily improves gait parameters. The patient is still able to follow the rhythm, even after the RAS is removed. This effect persists for up to 6 weeks (McIntosh et al., 1997; Thaut et al., 1998).

In addition to the forward gait analysis, it is worth examining turning during gait. Due to the close relationship between gait and balance disorders, turning often presents considerable difficulties for PD patients. Unfortunately, the pathophysiology of these difficulties is still poorly understood (Dibble et al., 2008). Huxham et al. (2008) analyzed problems with turning during walking. Twenty participants (10 with PD and 10 healthy subjects of similar age) were examined with the use of a 3D motion analysis system. The study focused on the spatiotemporal regulation of steps during turns of 60–120°. Significant differences in most factors were noted in the PD group. The spatial regulation of turning was similar, even if slightly reduced; however, the velocity of turning was slower than in the control group. In general, steps are shorter during walking in PD, but there was a significant additional reduction during turning; there were small but significant differences in the regulation of the steps in time. The differences between the groups, more visible in the performance of larger turns, may reflect balance and coordination impairment in people with PD during demanding gait tasks (Woollacott and Shumway-Cook, 2002). These findings prompted the authors of this study to formulate the research question concerning the changes of stability under the influence of applied NMT. According to another study (Koh et al., 2008), the main changes in velocity and step length can be related to problems with postural stability that appear with the development of the disease. The gait assessment in PD patients with postural instability showed a significant reduction in velocity and step length in comparison to a group with tremor. The range of motion of the pelvic and lower limb joints was also significantly reduced in this group. Reduced pelvic motion range may be a compensatory strategy for postural instability which provokes shortening of step length in PD patients, which in turn affects the velocity of gait.

The second part of this study, concerning the influence of the research protocol on stability, could not be confirmed. There were no significant differences between the first and the second posturography measure within the experimental group, or between the groups. This suggests that the application of NMT primarily leads to an improvement of rhythmical movements, with the center of mass transferred forward or backward (walking, daily activities). The static stability tested with eyes open did not change significantly for the majority of measured parameters. Significant differences in the experimental group were reported only for the center of pressure mean frequency (MF-EO), measured in Hz, and the amount of sway in the sagittal plane (LWAP-EO). Therefore, this type of training may not be so effective for improving stability parameters with visual control. However, during the research cycle the stability did not deteriorate. This type of therapy may help to maintain a satisfactory level of stability, thereby delaying the balance disorder that often occurs at the beginning of the third stage of PD on the H&Y scale.

What was particularly interesting in the study was the occurrence of statistically significant differences (*p <* 0.05) in the stability tests with eyes closed in the experimental group within five parameters: the total sway path (SP-EC) calculated in both planes, the sway path (SPAP-EC) in millimeters calculated in the sagittal plane, the mean velocity of the center of pressure (MV-EC) in both planes, the mean velocity of the center of pressure (MVAP-EC) in the sagittal plane, and the amount of sway (LWAP-EC) in the sagittal plane. The transfer of body weight during the performance of many everyday activities, including walking, is also organized in the sagittal plane. The improvement of stability with the exclusion of a strong visual component can be a sign of enhanced proprioception and body perception through auditory stimulation, which are necessary to maintain balance, posture, and motor control mechanisms.

Drawing on Koh et al.'s (2008) investigation of the impact of poor postural control on stride length and velocity, it can be expected that the improvement of these spatial parameters of gait positively affects postural control processes based on proprioceptive information. It is possible that the application of auditory stimulation alone is not sufficient for stability and balance training in PD.

Stożek et al. (2003) evaluated a rehabilitation program focused on improving balance in PD. They applied a combination of cues including verbal, visual, auditory, and proprioceptive stimulation. Posturography was employed to assess stability and balance. The stability tests were conducted in eyes-open and eyes-closed conditions, and the body weight transfer was measured in six directions. As with the results obtained by the authors of this article, the comparison of the outcomes from three different measures showed statistically significant differences in the stability parameters only with eyes closed. These differences were observed between the first, the second, and the third measure. The last measure was carried out a month after the completion of the training. The changes remained significant, which indicated that the treatment effects were maintained. Therefore, the development of training with closed eyes in the context of improving stability and balance in PD appears to be worthy of further exploration.

The latest scientific research carried out in the animal model of PD (Farley et al., 2008) illustrated that physical exercises with different stimuli are as effective as a physiotherapy method, modifying the course of the disease and contributing to the functional improvement of patients. Furthermore, researchers place emphasis on the significant influence of music and rhythm on motor function, gait, mobility, and patients' quality of life (Pacchetti et al., 2000; Howe et al., 2003; Moore et al., 2006; Leins et al., 2009; De Bruin et al., 2010; Pasek et al., 2010). Therefore, the extensive application of sound and rhythmic stimulation in the basic rehabilitation program for patients with PD is supported by these pilot level findings.

This pilot level study has some limitations which merit attention. In particular, it is impossible to distinguish the individual effects of the applied techniques and indicate the most efficient for gait and balance training. Given that the effectiveness of RAS has been established (McIntosh et al., 1997; Thaut et al., 1998; Hausdorff et al., 2007), future research focused on the comparative effectiveness of PSE and TIMP individually and in combination is indicated, to understand more about the characteristic effects and mechanisms of each treatment and how they might work together. Furthermore, as these data only provide an indication of short-term effects, further studies with a longitudinal design are needed to determine what long-term NMT treatment effects might be achieved. The hypothesis assumed only the efficacy of NMT in PD treatment, but did not indicate it as a superior method. A second step of the research could compare the effectiveness of NMT with that of another chosen approach.

## CONCLUSION

This pilot study indicates that NMT sensorimotor techniques may be employed to improve gait and other rhythmical activities for individuals with PD. Changes in stability without visual control indicate improvement of proprioception, giving a new compensatory strategy for movement and postural control through the auditory system. The confirmation of the research hypothesis in this study builds the evidence base for a therapeutic strategy based on the use of rhythmic music for the improvement and maintenance of a good functional state. These changes may be effective in improving patients' ability to perform activities of daily living and engage in social activities important for their quality of life.

This study illustrates that by connecting the disciplines of music therapy, physiotherapy and occupational therapy, through the work of task orientation, including transfers, activities of daily living and locomotion, as well as providing an appropriate level of difficulty and number of repetitions, it may be possible to decrease the damaging effects of PD. However, the question of how to prolong the obtained effect of rhythmic and musical stimulation requires further investigation.

## ACKNOWLEDGMENTS

This research was funded by the National Science Center, Poland, under the *Preludium* project (grant agreement no. 2012/05/N/NZ7/00651).

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnhum.2015.00710

FIGURE S1 | Balance exercise - reaching back with the trunk rotation.

FIGURE S2 | Gait preparation - reciprocal movement of upper and lower extremities.

FIGURE S3 | Gait preparation - steps forward-backward with arms swing.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Bukowska, Krężałek, Mirek, Bujas and Marchewka. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Flow and meaningfulness as mechanisms of change in self-concept and well-being following a songwriting intervention for people in the early phase of neurorehabilitation**

#### *Felicity Anne Baker<sup>1</sup>\*, Nikki Rickard<sup>2</sup>, Jeanette Tamplin<sup>1</sup> and Chantal Roddy<sup>2</sup>*

*<sup>1</sup> The University of Melbourne, Melbourne, VIC, Australia, <sup>2</sup> Monash University, Melbourne, VIC, Australia*

Anecdotal evidence suggests that songwriting assists people with spinal cord injury (SCI) or acquired brain injury (ABI) to explore threats to self-concept, yet studies that explore the mechanisms of change have not been reported. In a pilot study, we explored the correlations between changes in self-concept and well-being, with mechanisms of flow and meaningfulness of songwriting. Five people with ABI (all male) and five with SCI (4 males, 1 female) (mean age 38.90 years, SD = 13.21), an average of 3 months post-injury, participated in a 12-session songwriting program that targeted examination of self-concept. Measures of self-concept, depression, anxiety, emotion regulation, affect, satisfaction with life, and flourishing were collected pre-, mid-, and post-intervention, and compared with repeated measures of flow and meaningfulness of songwriting. Medium effects were found for changes in self-concept (*d* = 0.557) and depression (*d* = 0.682) and approached a medium effect for negative affect (*d* = 0.491). Improvements in self-concept over time were associated with decreases in depression (*r*<sup>p</sup> = *−*0.874, *n* = 9, *p <* 0.01), anxiety (*r*<sup>p</sup> = *−*0.866, *n* = 9, *p <* 0.01), and negative affect (*r*<sup>p</sup> = *−*0.694, *n* = 10, *p <* 0.05), and an increase in flourishing (*r*<sup>p</sup> = +0.866, *n* = 9, *p <* 0.01) and positive affect (*r*<sup>p</sup> = +0.731, *n* = 10, *p <* 0.05). Strong experiences of flow were not positively correlated with positive changes to self-concept and well-being, whereas deriving high levels of meaning was associated with increased negative affect (*r*<sup>p</sup> = +0.68, *p <* 0.05), increased anxiety (*r*<sup>p</sup> = +0.74, *p <* 0.05), and reduced emotional suppression (*r*<sup>p</sup> = *−*0.58, *p <* 0.05). These findings suggest that the targeted songwriting intervention is positively associated with enhanced well-being outcomes. However, the findings also suggest that people who find the songwriting process has strong meaning for them might be more likely to start accepting their emotions and as a result experience an increase in anxiety and depression, although full, mediated regression analyses with larger sample sizes are required to explore this further. Acknowledging their changed circumstances may nonetheless assist people with SCI and ABI to grieve their losses and facilitate the building of a healthy post-injured self-concept. We propose that there may be other mechanisms more critical in facilitating the positive changes in self-concept and well-being than flow and meaning, such as the role of story-telling and the impact of music in facilitating the consolidation of self-concept explorations in memory.

#### *Edited by:*

*Julian O'Kelly, Royal Hospital for Neuro-disability, UK*

#### *Reviewed by:*

*Karen Burland, University of Leeds, UK*

*Melody Schwantes, Appalachian State University, USA*

> *\*Correspondence: Felicity Anne Baker felicity.baker@unimelb.edu.au*

*Received: 10 March 2015 Accepted: 10 May 2015 Published: 26 May 2015*

#### *Citation:*

*Baker FA, Rickard N, Tamplin J and Roddy C (2015) Flow and meaningfulness as mechanisms of change in self-concept and well-being following a songwriting intervention for people in the early phase of neurorehabilitation. Front. Hum. Neurosci. 9:299. doi: 10.3389/fnhum.2015.00299*

**Keywords: songwriting, self-concept clarity, spinal cord injuries, acquired brain injury, depression and anxiety disorders, well-being, flow theory**

## **Introduction**

The self-concept is a set of beliefs that, when combined, enables people to have a sense of who they are in the world. The self-concept is derived from an integration of self-schemas constructed by temporal frameworks that encompass the past, present, and future selves (Markus and Nurius, 1986) and is multidimensional, comprising the personal, physical, family, social, academic/vocational, and spiritual/moral self domains (Fitts and Warren, 1996). Recent research suggests that the self-concept has neuropsychological and structural properties that, when combined, form a continuum from unhealthy-fragmented to healthy-integrated self-concept (Gana, 2012). Importantly, the self-concept forms a nomological network of relationships that involve an individual's personality, psychological adjustment, and achievement (Gana, 2012). Emotional or significant life events may be catalysts for personal examination of the self-concept, particularly when these stimulate opportunities to reassess purpose and meaning in life and construct an alternative future self that aligns with realistic possibilities (Habermas and Bluck, 2000). People who report having a strong sense of self tend to thrive because encountering failures in life does not dramatically affect their positive view of themselves. However, for those with a more negative self-concept, there is a risk that they will not achieve in life because they tend to avoid opportunities for growth by avoiding risk (Fitts and Warren, 1996).

The concept of the self is constantly evolving throughout life as a consequence of maturation and encountering new people and experiences (MacKinnon and Heise, 2010). However, in some circumstances, a significant and traumatic life event interrupts this gradual process and demands a more focused review of the past, present, and future selves. Acquiring a neurodisability calls for a reappraisal of the self and may involve a process of grieving for components of the self that have been damaged or lost as a result of the trauma (Hinkebein and Stucky, 2007). Finding meaning, purpose, and fulfillment in life is challenging for people with acquired neurodisabilities (Vickery et al., 2005) and there is a risk for people with acquired brain injury (ABI) or spinal cord injury (SCI) that the lens through which they frame and experience life will be dominated by the "disabled self" if an integrated and balanced self-concept is not constructed post-injury (Lennon et al., 2014). A number of studies have concluded that people with ABI or SCI report incongruities in past, present, and future selves that do not improve naturally over time when compared to the normal population (Anson and Ponsford, 2006; Kelly et al., 2013).

Engaging people with ABI or SCI in a narrative process that explores the residual self alongside the disabled self enables them to grieve the lost self and construct a new and healthy present and future self (Feinstein and Krippner, 2008). When people are provided with opportunities to tell and retell their stories, and have a listener support and gently challenge their perceptions of themselves, alternative selves are identified and long-term integration of the self-concept is more likely (Obodaru, 2012).

Therapeutic songwriting is a method that has been extensively used across a range of clinical and non-clinical populations as a medium for people to tell their stories (Baker et al., 2008). In a study of song lyrics created by people with ABI, inductively derived themes illustrated that their songs focused on past, present, and future selves (Baker et al., 2005a,b,c). Songs about the past included descriptions of relationships with significant others and past events (16.9%). Songs about the present comprised reflections on or sending messages to family and friends (22.6%) or expressions of their adversity in relation to their physical impairments and their efforts in rehabilitation (9.6%). Of particular interest, 28% of the lyrics focused on self-reflections, including questioning life's meaning and describing what makes them happy. A small percentage of the lyrics (7.4%) also focused on the future self. The songs reviewed in this study were, however, not created according to any specific protocol, as they were drawn from the large collection of songs that had been accumulated over a number of years of therapeutic practice and later analysed. We have recently developed and piloted a songwriting protocol specifically designed to explore the past, present, and future self for people with ABI or SCI using a narrative approach (Tamplin et al., 2015). Constructing the most effective protocol is, however, also likely to be dependent upon understanding the mechanisms of change. Currently, no songwriting study has specifically tested which mechanisms are active during the songwriting process.

#### **Theory of Mechanisms of Change**

In an earlier article (Tamplin et al., 2015), we presented a theory of possible mechanisms active during the songwriting process for people with neurodisability that enabled them to successfully integrate multiple injured and non-injured narratives. We proposed that songwriting accommodates the memory impairments typical in people with ABI because of the strong links between music, memory, and emotions, which enable exploration of the self to be consolidated more effectively in memory (Cahill and McGaugh, 1996; Judde and Rickard, 2010) and stimulate autobiographical memories that are important in raising awareness of the residual self (Janata, 2009). Our mechanisms of change theory suggests that because engagement in music-based activities activates the neural "pleasure" network in the brain (e.g., Menon and Levitin, 2005; Salimpoor and Zatorre, 2013), songwriting has the potential to enhance mood and coping and decrease or prevent depression and anxiety. Through achieving this positive-affect shift, people may access the inner strength needed to face the challenges associated with processing and revising their self-concept post-injury.

Flow theory is of particular importance in our theory of mechanisms of change because of its clear links with well-being (Csikszentmihalyi, 2008; Seligman, 2011). Songwriting studies by Baker and MacDonald (2013a,b) found that creating songs about positive or negative personal experiences generates strong experiences of flow in healthy populations and that participants derive meaning from both the songwriting process and the song product they created. A regression analysis determined that a predictive relationship existed between meaning and flow during songwriting experiences. More recently, Silverman et al. (under review) examined the relationship between flow, meaning, and health and well-being during songwriting interventions in a group of adults in a psychiatric unit (study 1, *N* = 54) and in adults undergoing detoxification (study 2, *N* = 170). Although these songwriting approaches were not specifically tailored to address self-concept, correlational and multiple regression analyses determined that flow and meaningfulness of songwriting were significantly correlated and that strong flow experiences were predictors of increases in hope (study 1) and readiness to change (study 2).

As strong flow experiences and meaning activated by songwriting are predictors of readiness to change and hope in adults with substance addictions and with acute psychiatric illnesses, in the current study, we aimed to explore whether songwriting activates flow and creates meaning for people with ABI and SCI, and whether these mechanisms of change correlate with changes in well-being indicators.

## **Study Hypotheses**

Hypothesis 1: greater improvements in self-concept will be positively correlated with lower levels of anxiety, depression, and negative affect, and increased levels of satisfaction with life, sense of flourishing, and emotion regulation.

Hypothesis 2: greater improvements in self-concept and well-being will be positively correlated with stronger feelings of flow and meaningfulness of the songwriting experience.

## **Materials and Methods**

#### **Design**

A non-randomized, quasi-experimental design with repeated measures (pre–mid–post-intervention) was employed to determine: (a) whether there was a therapeutic effect (outcome measures) and (b) what mechanisms might explain this effect (mechanism measures). Outcome measures were collected at baseline, mid-point (between sessions 6 and 7), and post-intervention. Mechanisms of change (flow and meaning measures) were collected after the completion of each song during the 12-session songwriting program (see **Figure 1**).

The study was reviewed and approved by Human Research Ethics Committees at The University of Melbourne (1339728), Monash University (CF13/2098 – 2013001081), and Austin Health (H2013/05038).

#### **Participants**

Over the study period, 16 inpatients with either SCI or ABI were identified as meeting the inclusion criteria and were invited to participate in the study. Inclusion criteria comprised: (1) inpatient status at Royal Talbot Rehabilitation Centre from the ABI, Spinal or Neurology wards; (2) diagnosis of SCI or ABI (including traumatic brain injury, stroke, brain tumor, and substance abuse); (3) aged between 18 and 65 years; (4) <12 months post-injury/onset; (5) cognitive capacity sufficient to complete self-report measures; (6) without significant language or hearing impairments; and (7) not in posttraumatic amnesia.

A member of each patient's treating team (not one of the researchers) was responsible for informing the patient of the study and obtaining consent. Two patients declined to participate, one female patient was recruited but found the self-report measures too emotionally confronting and dropped out before treatment commenced, and three other participants had substantial amounts of missing data and were subsequently excluded from the analysis. Ten participants (five ABI and five SCI) completed the study; nine males and one female aged between 20 and 64 years of age (Mean 38.9, SD 13.2). The time since injury or incident ranged from 30 to 157 days (Mean 89.6, SD 44.29).

#### **Procedure and Music Therapy Approach**

Following recruitment, participants completed a battery of tests via iPad before engaging in a 12-session targeted songwriting program (Tamplin et al., 2015). During the 12 (twice weekly, 1 h) sessions, the therapist and participant co-created three songs using a narrative songwriting approach (Baker, 2015). Each song incorporated the various domains of self-concept: personal, social, family, physical, academic, and moral/spiritual self. Song 1 (sessions 1–4) was focused on these domains for the past self, song 2 (sessions 5–8) was focused on these domains for the present self, and song 3 (sessions 9–12) was focused on these domains regarding the conceptualized future self. The therapist worked carefully with the participants to ensure that their stories of self were authentically represented in each song both musically and lyrically so that they could make meaning from their stories and self-descriptions. Further details of the intervention and the role of the therapist in facilitating the song creations are presented in a previously published paper (Tamplin et al., 2015).

#### **Outcome Measures**

Therapeutic outcomes were determined by collecting data pre–mid–post-intervention using the following battery of measures.

#### Self-Concept

As self-concept was the primary outcome of interest in this study, it was measured using the 20-item *Head Injury Semantic Differential Scale* (HISDS; Tyerman and Humphrey, 1984). The scale uses adjective pairs, such as unfeeling-caring and worried-relaxed, rated along a 7-point Likert Scale, to determine participants' views of various aspects of self. This measure focuses on perception of personal attributes with scores ranging from 20 to 140, where higher scores indicate a healthier, more positive, self-concept. The HISDS has been used previously in studies with people who have ABI (Vickery et al., 2005).

#### Well-being Measures

Well-being data were collected using seven different measures to evaluate sense of flourishing, life satisfaction, coping, affect, depression, and anxiety.

*The Flourishing Scale* is an eight-item measure of psychological well-being, specifically self-perceived success in areas such as relationships, self-esteem, purpose, and optimism. Statements are rated across a 7-point Likert Scale with scores ranging from 8 to 56; higher scores indicate a stronger sense of flourishing. The measure has good psychometric properties with Cronbach's α of 0.87, temporal stability of 0.71, and construct validity ranging from 0.43 to 0.70 (Diener et al., 2010). Mean flourishing scores for healthy populations range from 42.2 (Singaporean sample, SD = 6.4) to 46.6 (New Jersey sample, SD = 5.0). The Flourishing Scale has been used previously in studies with people who have an ABI (White, 2014).

The *Satisfaction with Life Scale* (SWLS; Diener et al., 1985) is a five-item scale designed to measure satisfaction with life. The items are scored using a 7-point Likert scale with scores ranging from 5 (low satisfaction) to 35 (high satisfaction). It has good construct validity (ranging from 0.5 to 0.75), test–retest reliability (0.82–0.84), and internal consistency (0.87) (Diener et al., 1985). Normative means for adults range from 23.6 to 27.9 (Pavot and Diener, 1993/2009). Normative data have been collected for people at 1 and 2 years post-ABI with mean scores of 20.32 (SD = 8.13) and 20.80 (SD = 8.42), respectively, and 22.7 (SD = 7.28) for people with SCI <60 days post-injury (Fortmann et al., 2013).

*The Emotion Regulation Questionnaire* (ERQ; Gross and John, 2003) is a 10-item questionnaire designed to assess individual differences in the habitual use of two emotion regulation strategies: cognitive reappraisal (six items) and expressive suppression (four items). Items are rated on a scale from 1 (strongly disagree) to 7 (strongly agree) with scores ranging from 6 to 42 (Reappraisal) and 4 to 28 (Suppression). Higher scores on each subscale indicate that the reappraisal or suppression strategy is more strongly endorsed. Psychometric testing of the ERQ showed that both the cognitive reappraisal (α = 0.79) and expressive suppression (α = 0.73) subscales have high internal consistency, and test–retest reliability across 3 months was 0.69 for both subscales (Gross and John, 2003). Good convergent validity has been reported with the COPE scales (Carver et al., 1989), as has discriminant validity with the 44-item Big Five Inventory (see Gross and John, 2003). Normative data for Reappraisal are 28.92 (6.27) and 28.48 (6.29) for females and males, respectively, and for Suppression, 13.12 (4.99) and 14.91 (4.67) for females and males, respectively (Melka et al., 2011).
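The subscale ranges quoted above follow directly from the item counts and the 1–7 response scale. The following is a minimal sketch of that arithmetic, not the scoring procedure used in the study: the function names and the example responses are hypothetical, and the caller is assumed to supply the responses already grouped by subscale, since the item-to-subscale key is not given in the text.

```python
def score_subscale(responses, n_items, low=1, high=7):
    """Sum the Likert responses of one subscale after basic validation."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r < low or r > high for r in responses):
        raise ValueError(f"responses must lie between {low} and {high}")
    return sum(responses)


def score_range(n_items, low=1, high=7):
    """Smallest and largest possible total for a subscale."""
    return n_items * low, n_items * high


# Hypothetical responses from one participant (illustrative, not study data).
reappraisal = score_subscale([5, 6, 4, 5, 7, 6], n_items=6)
suppression = score_subscale([2, 3, 1, 4], n_items=4)

print(score_range(6), score_range(4))  # possible ranges per subscale
print(reappraisal, suppression)
```

The same range check applies to any summed Likert measure described in this section, given its item count and response scale.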

The *Positive Affect and Negative Affect Scale* (PANAS-20; Watson et al., 1988) is a 20-item scale measuring the hedonic aspect of well-being using 10 positive items [Positive Affect scale (PA)] and 10 negative items [Negative Affect scale (NA)]. Each item is scored on a 5-point Likert scale, and scores on each of the positive and negative scales range from 10 to 50. Normative data indicate a mean score of 31.31 (SD = 7.65) and 16.00 (SD = 5.90) for the PA and NA scales, respectively (Crawford and Henry, 2004). Internal consistency was 0.89 (95% CI = 0.88–0.90) for the PA scale and 0.85 (95% CI = 0.84–0.87) for the NA scale. The PANAS has convergent validity when correlated with the Depression and Anxiety Stress Scale [*t*(986) = 7.523, *p <* 0.001] and the Hospital Anxiety and Depression Scale [*t*(737) = 7.667, *p <* 0.001] (Crawford and Henry, 2004). PANAS has been used in both ABI (e.g., Juengst et al., 2014) and SCI research (Salter et al., 2013).

The *Patient Health Questionnaire-9* (PHQ-9; Kroenke et al., 2001) is a nine-item scale that screens for severity of depression. Each item is scored from 0 to 3 with the total scale scores ranging from 0 to 27. Higher scores are indicative of moderately severe (15–19) and severe (20–27) levels of depression. Internal reliability (Cronbach's α of 0.86–0.89) and convergent validity as measured against the SF-20 Health-related Quality of Life Scale (*p <* 0.05 for most pairwise comparisons) were good (Kroenke et al., 2001). The PHQ-9 has been validated on both ABI (Fann et al., 2005) and SCI samples (Bombardier et al., 2012).

The *Generalized Anxiety Disorder* scale (GAD-7; Spitzer et al., 2006) is a 7-item measure of generalized anxiety for use with the general population, measured along a 4-point Likert scale. Scores range from 0 to 21, with scores of *≥*5, *≥*10, and *≥*15 representing mild, moderate, and severe anxiety symptom levels, respectively (Löwe et al., 2008). The internal consistency of the GAD-7 was excellent (Cronbach's α = 0.92) and test–retest reliability was also good (intraclass correlation = 0.83). Construct validity was good, as evidenced by a strong association between increasing GAD-7 scores and worsening function on the Short-Form Health Survey (SF-20), and convergent validity was good when compared with the Beck Anxiety Inventory (*r* = 0.72) (Spitzer et al., 2006).
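The GAD-7 cut-offs described above can be expressed as a simple lookup. This is an illustrative sketch only: the function name is hypothetical, items are assumed to be scored 0 to 3 (consistent with seven items on a 4-point scale and a 0 to 21 total), and the label "minimal" for totals below 5 is an assumption, since the text names no band below the mild threshold.

```python
def gad7_severity(item_scores):
    """Return (total, band) for seven GAD-7 items, each scored 0-3."""
    if len(item_scores) != 7:
        raise ValueError("GAD-7 has exactly 7 items")
    if any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    total = sum(item_scores)
    # Thresholds as stated in the text: >=5 mild, >=10 moderate, >=15 severe.
    if total >= 15:
        band = "severe"
    elif total >= 10:
        band = "moderate"
    elif total >= 5:
        band = "mild"
    else:
        band = "minimal"  # assumed label; not named in the text
    return total, band


# Hypothetical responses (illustrative, not study data).
print(gad7_severity([1, 1, 1, 1, 1, 1, 1]))
```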

#### **Mechanisms of Change Measures**

To capture changes in flow and meaningfulness of the songwriting experience throughout the process, three measurement tools were used.

#### Short Flow Scale and Core Flow Scale

After completion of each of the three songs (after sessions 4, 8, and 12), participants completed the 9-item Short Flow Scale (SFS) and the 10-item Core Flow Scale (CFS) to measure their subjective absorption and motivation during the songwriting task (Martin and Jackson, 2008). High levels of flow indicate that the activity is engaging and meaningful to the participant. While flow may have been experienced during any songwriting session, we chose to measure flow only after sessions 4, 8, and 12 to minimize assessment burden for participants.

The SFS measures the strength of the nine dimensions of flow (challenge-skill balance, action-awareness merging, clear goals, unambiguous feedback, concentration on task, sense of control, loss of self-consciousness, time transformation, and autotelic experience). When their scores are combined, they represent a measure of the psychological state of flow. The CFS, however, measures the strength of the lived experience of flow, rather than a psychological state. Both the SFS and CFS have demonstrated internal validity (CFI = 0.97; SRMR = 0.05). Construct validity has been tested across several domains (work, sport, and music) and has acceptable reliability (Cronbach's α = 0.82; Martin and Jackson, 2008). Internal validity for SFS flow in work and sport was good, and more so for music (χ<sup>2</sup> = 136.78, 112.38, and 44.11, respectively), and external validity was also good (χ<sup>2</sup> = 6088.56, 4479.03, and 4056.76, respectively).

#### Meaningfulness of Songwriting Scale

The *Meaningfulness of Songwriting Scale* is a 21-item scale constructed to measure the meaningfulness of the songwriting process and the song product post-creation (Baker et al., under review). This measure was administered to participants after the completion of each song (after sessions 4, 8, and 12). Developed by Baker and MacDonald (2013a,b), the self-report scale measures 11 domains of meaning relevant to songwriting experiences and the song product: enjoyment, discovery/self-reflection, arousal of emotions, creativity, engagement, challenge, understanding context, associations, achievement, personal value, and identity. Items are measured on a 5-point Likert scale with total scores ranging from 21 to 105. Larger numbers are indicative of stronger meaning derived from the songwriting experience and song product. The measure has good face validity, strong internal consistency (Cronbach's α = 0.96), test–retest reliability (ICC = 0.89–0.93), and construct validity (*r* = 0.56–0.68) when used with people with acute mental illness and with people who have substance use disorder (Baker et al., under review). Its psychometric properties have not been measured with people with neurodisability. We have noted anecdotally that people with ABI and SCI find songwriting a meaningful experience (Baker et al., 2005d; Tamplin, 2006). The questions in the scale were intentionally worded as simply and clearly as possible to make the scale appropriate for use with people with mild cognitive impairments.

#### **Analyses**

The data set was first screened for missing data, and occasional missing data were found for one time-point for at most two participants on any variable. Given the missing data points were not systematic in any way, analyses were conducted on the available remaining data for each variable. Distributions for all variables were generally normal and no outliers were identified, and despite the small sample sizes, parametric analyses were performed to maintain sufficient sensitivity (although caution is advised in interpreting significant findings due to the small sample sizes).

Self-concept and well-being outcome measures were operationalized as the change in each measure from baseline to mid-point (time 2 − time 1), mid-point to post-intervention (time 3 − time 2), or baseline to post-intervention (time 3 − time 1). Flow and meaningfulness of songwriting measures were obtained by averaging ratings provided for the three songs. Pearson's bivariate correlations (two-tailed, α set at 0.05) were performed to test the relationship between measures for each hypothesis.
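The operationalization above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, missing observations are represented as `None`, and correlations use only the participants with both values present (one reading of "analyses were conducted on the available remaining data"). Averaging the three song ratings over whichever are available is likewise an assumption.

```python
from math import sqrt


def change_scores(later, earlier):
    """Per-participant change (later - earlier); None if either value is missing."""
    return [None if a is None or b is None else a - b
            for a, b in zip(later, earlier)]


def mean_available(values):
    """Mean of the non-missing values, e.g. ratings for the three songs."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def pearson_r(x, y):
    """Pearson's r over complete pairs only; returns (r, n_pairs)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant variable")
    return sxy / sqrt(sxx * syy), n
```

A baseline to post-intervention change would be `change_scores(time3, time1)`, and two such change vectors can then be passed to `pearson_r`. The two-tailed significance test at α = 0.05 refers the statistic t = r·sqrt((n − 2)/(1 − r²)) to a t distribution with n − 2 degrees of freedom, which a statistics package would normally supply.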

## **Results**

### **Potential Confounds**

As both age and time since head injury could potentially influence self-concept and well-being outcome measures independent of the songwriting intervention, Pearson's correlations were first performed and showed no significant covariates.

The measures of self-concept and well-being at baseline, mid-intervention, and post-intervention are detailed in **Table 1**. The mean self-concept at baseline was 94.30 (SD = 25.88) out of a possible range 20–140, indicating a moderate view of the self-concept. Participants did not have a very poor or negative self-concept, but it was not particularly positive either. Mean self-concept improved across time and the effect size of this improvement in self-concept was medium (*d* = 0.557). Baseline levels of depression (M = 9.7, SD = 6.15) were bordering on moderate depression (10–14) and decreased to mild depression at post-intervention, and also had a medium effect (*d* = 0.682). Anxiety levels at baseline were at the lower end of the moderate anxiety range and decreased to the mild anxiety range at post-intervention. Negative affect at the baseline (M = 22.90) was higher than a normative sample (M = 16.00, Crawford and Henry, 2004) but also decreased at post-intervention, with a small effect size (*d* = 0.461). Positive affect (M = 34.5) was slightly above the normal range (M = 31.31). Baseline Satisfaction with Life Scale data were below normative levels but moved toward this over time (M = 21.78). Emotion regulation (suppression) was below the normative data set and the appraisal was higher.

*d reflects effect sizes for baseline to post-intervention changes only.*

*\*p < 0.05.*

#### The Association Between Changes in Self-Concept and Well-being Across the Intervention Period

**Table 2** shows means and SD for the change in self-concept and well-being measures and the correlations between changes in the self-concept measure and each of the well-being measures, at each period between time-points (baseline, mid-intervention, and post-intervention).

It can be seen from **Table 2** that across the entire intervention period there were significant negative correlations between improvements in self-concept scores and decrease in ratings of negative affect, depression, and anxiety, and significant positive correlations with improvement in positive affect, flourishing, and satisfaction with life scores. From baseline to the mid-intervention point, there was a significant negative correlation between improvements in self-concept score and decreases in both negative affect and anxiety ratings, and the correlation with depression ratings approached significance. A positive correlation was also observed in change in self-concept scores over this period and satisfaction with life ratings. From mid-intervention to end of intervention, a negative correlation was observed between self-concept scores and ratings of both depression and anxiety, and a positive correlation was observed between change in self-concept score and flourishing ratings.

**TABLE 2 | Descriptive statistics and Pearson's bivariate correlations between self-concept and well-being outcomes**.

### The Association Between Self-Concept and Well-being Outcome Measures and Mechanisms of Change Measures (Flow and Meaning)

**Table 3** shows the means and SD for change in self-concept and well-being measures, and for mechanisms of change (flow and meaningfulness of songwriting) averaged across the entire intervention period (that is, from baseline to post-intervention), and the bivariate correlations between these variables.

**Table 3** reveals that the flow measures did not correlate with any of the changes in self-concept or well-being variables. The Meaningfulness of Songwriting Scale correlated positively with increased negative affect, anxiety and suppressive emotion regulation, and the negative correlation with increased positive affect approached significance.

## **Discussion**

## **Changes to Self-Concept and Well-being**

Research indicates that integrating a past, present, and future self is difficult for people with ABI and SCI, and the self-concept does not improve naturally over time (Anson and Ponsford, 2006; Kelly et al., 2013). This study found that self-concept changed from baseline to post-intervention [mean change 14.70 (39.27)] supporting our theory that songwriting can strengthen the positive aspects of the self-concept and make a difference in how people with disability view themselves at a critical time in their rehabilitation process (Tamplin et al., 2015).

The data revealed that the largest changes in self-concept emerged from baseline to mid-point (first six sessions). During this phase of the program, the participants were exploring the past (first four sessions) and present self (sessions 5 and 6). This positive change in self-concept suggests that they had a more positive view of themselves when compared with baseline. The theory underpinning this study proposed that by exploring the past self, participants were guided to explore who they were as people prior to their injury – a focus on the residual self that may have been forgotten or hidden by the more prominent issue of the current disabled self (Lennon et al., 2014). The process of focusing on who they were prior to their injury (song about the past self), and which parts of themselves remain the same (song about the present self), led to a stronger, healthier self-concept by the mid-point assessment. As participants continued to explore the present self (sessions 7–8) and then the future self (sessions 9–12), there was a further strengthening of the self-concept, although this change was less marked than during the initial six sessions. This might indicate that, although there were additional improvements in self-concept, exploring the future has fewer benefits than reflecting on the past and the here-and-now. Perhaps for some participants, the future remains too uncertain at this early stage in their rehabilitation journey and therefore creating songs about the future self should be introduced at a later period of time post-injury. Alternatively, it may be that after an initial rapid improvement in self-concept, further improvements are more gradual. In this case, greater improvements might be detected if a follow-up measurement was performed at a later stage.

**TABLE 3 | Descriptive statistics for mechanisms of change variables, and correlations with change (from baseline to post-intervention) in self-concept and well-being variables**.

*cSC, change in self-concept; cNA, change in negative affect; cDep, change in depression; cAnx, change in anxiety; cSupp, change in suppression emotion regulation; cPA, change in positive affect; cFlour, change in flourishing; cSWL, change in satisfaction with life; cReapp, change in reappraisal emotion regulation.*

*Ns range from 7 to 10.*

*\*p < 0.05.*

Correlations between changes in self-concept and other well-being indicators were significant from baseline to post-intervention in all cases except for the Emotion Regulation Subscales (suppressive and reappraisal). The data showed that improvement in self-concept was positively correlated with an enhanced sense of flourishing, positive affect, and satisfaction with life, and was also significantly correlated with reductions in anxiety, depression, and negative affect. This indicates that as self-concept improved during the songwriting process, other well-being measures also improved. Low levels of self-concept have been associated with higher levels of depression and anxiety (Anson and Ponsford, 2006; Kelly et al., 2013). Therefore, it is noteworthy that our songwriting intervention had the reverse effect of improving self-concept and simultaneously reducing levels of anxiety and depression.

Improvements in affect, sense of flourishing, and satisfaction in life, and reductions in depression, anxiety, and negative affect are all important goals during the initial months post-injury. This is the time when recovery is most rapid and focused attention on rehabilitation is imperative (Schultz and Tate, 2013). It could be hypothesized that a process that involves grieving the past self, and facing the present and imagined future self might lead participants to overly focus on their disabilities and in doing so, negatively affect well-being. However, this was not the case in our study. The songwriting process enabled participants to be reminded of the residual self and this led to positive well-being outcomes. While other music therapy studies have not examined songwriting's impact on affect at baseline-mid-post, other music therapy interventions addressing other rehabilitation goals have been found to facilitate an improvement in affect and mood in people with SCI (Tamplin et al., 2013a) and ABI (Baker and Wigram, 2004; Tamplin et al., 2013b).

### **Mechanisms of Change**

It was hypothesized that strong experience of flow and high levels of meaningfulness of the songwriting experience are mechanisms active in the songwriting process that would contribute to a change in self-concept and other well-being indicators (Tamplin et al., 2015). The findings in this study did not support the hypothesis that strong flow was associated with improved self-concept and well-being indicators. In other words, having a stronger sense of flow had no bearing on whether the participant had a greater change in self-concept or well-being when compared with participants who reported low levels of flow. The strength of meaning derived from the songwriting experience and song product did significantly correlate with some well-being indicators; however, these correlations were in the opposite direction to that expected. These correlations suggest that, as the songwriting experience becomes more meaningful, individuals' levels of anxiety and negative affect increase, while suppression of emotion decreases. In trying to understand these unexpected findings, we propose that positive songwriting experiences within the context of a therapeutic relationship with a highly skilled music therapist may have enabled individuals to start accepting their emotions, which led to an increase in anxiety and negative affect. Being authentic and honest with oneself in times of stress and grief can be challenging. However, when a process such as songwriting is meaningful and enables a person with an ABI or SCI to feel safe to explore aspects of their self that they might otherwise suppress, the initial effect may be positive, but as a person reflects on the content of their songs over time, it may cause negative feelings to emerge into consciousness. Our finding is not necessarily an unfavorable outcome, as it is not possible to process fears and anxieties until they are acknowledged. The music therapist has specialist skills in enabling people to explore painful issues within the safety of a therapeutic relationship and within the safety of musical experiences so that these fears and anxieties can be addressed.

## **Overall Findings**

When considering the two hypotheses of this study, there seem to be some contradictory findings. First, songwriting positively affected self-concept over time and this was, as hypothesized, correlated with positive changes in well-being. However, higher states of flow and more meaning derived from the songwriting experience were not significantly correlated with positive changes to self-concept and well-being, and at times, the trends were in the opposite direction to that predicted. One explanation for why our second hypothesis was rejected stems from the possibility that participants were completing flow scales that had not previously been psychometrically tested with this population. It is unclear whether participants were able to reflect on their experiences well enough to be able to rate their experience of flow. The same may be said for the Meaningfulness of Songwriting Scale, which has to date only been psychometrically tested in a mental health population (Baker et al., under review). Hence, it is possible that this measure may not have accurately captured the meaningfulness of the experience for our study's populations. Further, it is possible that the timing of the flow measures affected the results. If flow had been measured after each of the sessions rather than at the completion of each song, stronger flow experiences during the lyric writing or music creation process may have been identified. This would have provided a deeper understanding of whether flow is stronger at different points during the songwriting process.

An alternative explanation for the absence of positive correlations between the mechanisms of change (flow and meaning) and self-concept and well-being could be that other mechanisms are more significant contributors to a change in self-concept and well-being. Given that the songwriting protocol systematically facilitates an exploration of the full range of self-concept domains (physical, personal, family, social, moral, and academic self), perhaps this self-exploration and narrative approach (that just so happens to incorporate a songwriting experience) is the critical, mediating factor that enables the multiple aspects of self to be more integrated (Feinstein and Krippner, 2008). Similarly, the role of the therapist in offering support when the participant grieves lost parts of the self, challenging a participant's self view, or presenting potential alternative perspectives (Obodaru, 2012), might have a strong impact on changes to self-concept and well-being over time. If this is the mediating mechanism of change, the role of songwriting is therefore to provide a supportive yet challenging and stimulating context in which the narrative experience may evolve. Songs are an age-appropriate and culturally accepted medium for communicating people's stories (Baker, 2015). They provide a framework where key events, feelings, or self-perspectives can be highlighted in a chorus, thereby encouraging further processing, and more effective consolidation into memory (Cahill and McGaugh, 1996; Judde and Rickard, 2010). Finally, our findings indicated a positive change in mood and emotional well-being across the 12 sessions, supporting our earlier proposed ideas (Tamplin et al., 2015) that songwriting – a music-based intervention – engages the mesolimbic system in the brain and in doing so affects mood, depression, anxiety, and coping (Menon and Levitin, 2005; Salimpoor and Zatorre, 2013).

The proposed theory that (a) music facilitates consolidation of the self-exploration process into memory and (b) the role of the narrative process is pivotal in addressing self-concept deserves further investigation. A study that compares the effects of narrative therapy with narrative songwriting on self-concept and well-being with cognitively compromised people (issues of ongoing memory) may shed light on the role of the songwriting process in reconstructing the self post-injury.

#### **Limitations of the Study**

This study comprised a small sample size of two cohorts (ABI and SCI) whose data were pooled. The sample size was insufficient to allow separate examination of the cohorts. Larger sample sizes would enable population differences to emerge regarding the effects of songwriting on self-concept and well-being, as well as the mechanisms of change. While measures for some outcome variables have been psychometrically tested, or at least been used, in other studies with people who have ABI or SCI (HISDS, SWLS, Satisfaction with Life Scale, Flourishing Scale, PANAS, and PHQ-9), the ERQ and GAD-7 have not. It is unclear whether these scales are valid for use with people who have SCI and ABI. Similarly, the Meaningfulness of Songwriting Scale is a newly developed scale and only has data on two samples (patients in detoxification for substance use disorder and acute psychiatric patients). Finally, given the small sample sizes, non-parametric analyses may have been more cautious, but they would have reduced power considerably, and parametric analyses were therefore retained. The significant findings should nevertheless be interpreted as preliminary in nature and require replication with a larger sample size.

This study did not have a comparative or control condition to determine whether the changes in self-concept and well-being were due to natural recovery or were indeed an outcome of the songwriting intervention. As self-concept has not been found to improve naturally over time (Kelly et al., 2013), we have made an assumption that the songwriting intervention effected this change; however, this cannot be confirmed until a larger study with sufficient power has been implemented using a comparative or control condition with random assignment. Further, it is possible that the songwriting program was a distraction from thinking about their losses and thus led to the positive change in well-being. This is unlikely, however, because the program was directing participants to reflect on the self rather than distract them from thinking about the self and their future.

It is likely that strong flow experiences were evident across each of the 12 sessions of songwriting. Measuring flow after each of the 12 songwriting sessions may have yielded more data about how flow was experienced over the whole songwriting process. Such data may have enabled stronger correlations between the flow experiences and changes in self-concept and well-being measures to be captured. It is therefore recommended in future research that flow is measured after each songwriting session to provide a more complete picture of the experience of flow throughout the songwriting process.

## **Conclusion**

This study has examined the impact of a therapeutic songwriting program on the self-concept and well-being of people with ABI and SCI, with a specific focus on measuring hypothesized mechanisms of change. Our songwriting protocol was specifically designed to explore the various domains of the self-concept via the creation of three songs about the past self, present self, and future self. We found that changes to self-concept and well-being facilitated by the intervention were highly correlated and changed in a positive direction, indicating that people currently undergoing rehabilitation for SCI or ABI benefit from such a strategic songwriting approach. We found no correlations between levels of flow and self-concept or other well-being measures, but did find correlations with meaningfulness in the direction opposite to that hypothesized. In particular, as the strength of the meaningfulness of the songwriting experience increased, levels of anxiety and negative affect increased and emotional suppression decreased. We propose that there may be other mechanisms more critical in facilitating the positive changes in self-concept and well-being that emerged in this study, such as the role of story-telling and the impact of music in facilitating the consolidation of self-concept explorations in memory.

## **References**


after quadriplegia: a randomized controlled trial. *Arch. Phys. Med. Rehabil.* 94, 426–434. doi:10.1016/j.apmr.2012.10.006


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Baker, Rickard, Tamplin and Roddy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A pilot study into the effects of music therapy on different areas of the brain of individuals with unresponsive wakefulness syndrome

Nikolaus Steinhoff<sup>1</sup>, Astrid M. Heine<sup>2</sup>, Julia Vogl<sup>3</sup>, Konrad Weiss<sup>4</sup>, Asita Aschraf<sup>5</sup>, Paul Hajek<sup>4</sup>, Peter Schnider<sup>5</sup> and Gerhard Tucek<sup>2</sup>\*

*<sup>1</sup> OptimaMed Neurological Rehabilitation, Kittsee, Austria, <sup>2</sup> Department of Music Therapy, IMC University of Applied Sciences, Krems, Austria, <sup>3</sup> Department of Social and Cultural Anthropology, University of Vienna, Vienna, Austria, <sup>4</sup> Department of Nuclear Medicine, Regional Hospital Wiener Neustadt, Wiener Neustadt, Austria, <sup>5</sup> Department of Neurology, Regional Hospital Hochegg, Grimmenstein, Austria*

#### Edited by:

*Julian O'Kelly, Royal Hospital for Neuro-Disability, UK*

#### Reviewed by:

*Rita Formisano, Santa Lucia Foundation, Italy*

*Jeanette Tamplin, University of Melbourne, Australia*

#### \*Correspondence:

*Gerhard Tucek, Music Therapy Program, Department of Health Sciences, IMC University of Applied Sciences Krems, Piaristengasse 1, 3500 Krems, Austria gerhard.tucek@fh-krems.ac.at*

#### Specialty section:

*This article was submitted to Auditory Cognitive Neuroscience, a section of the journal Frontiers in Neuroscience*

> Received: *11 April 2015* Accepted: *03 August 2015* Published: *21 August 2015*

#### Citation:

*Steinhoff N, Heine AM, Vogl J, Weiss K, Aschraf A, Hajek P, Schnider P and Tucek G (2015) A pilot study into the effects of music therapy on different areas of the brain of individuals with unresponsive wakefulness syndrome. Front. Neurosci. 9:291. doi: 10.3389/fnins.2015.00291*

The global cerebral network allows music "to do to us what it does." While the same music can cause different emotions, the basic emotion of happy and sad songs can, nevertheless, be understood by most people. Consequently, the individual experience of music and its common effect on the human brain is a challenging subject for research. Various activities such as hearing, processing, and performing music provide us with different pictures of cerebral centers in PET. In comparison to these simple acts of experiencing music, the interaction and the therapeutic relationship between the patient and the therapist in Music Therapy (MT) provide us with an additional element in need of investigation. In the course of a pilot study, these problems were approached and reduced to the simple observation of pattern alteration in the brains of four individuals with Unresponsive Wakefulness Syndrome (UWS) during MT. Each patient had three PET investigations: (i) during a resting state, (ii) during the first exposure to MT, and (iii) during the last exposure to MT. Two patients in the MT group received MT for 5 weeks between the 2nd and the 3rd PET (three times a week), while two other patients in the control group had no MT in between. Tracer uptake was measured in the frontal, hippocampal, and cerebellar region of the brain. With certain differences in these three observed brain areas, the tracer uptake in the MT group was higher (34%) than in the control group after 5 weeks. The preliminary results suggest that MT activates the three brain regions described above. In this article, we present our approach to the neuroscience of MT and discuss the impact of our hypothesis on music therapy practice, neurological rehabilitation of individuals in UWS and additional neuroscientific research.

Keywords: positron emission tomography (PET), music therapy, human brain, brain areas, activity alteration

## Introduction

During the 1980s and 1990s neuroscientific research predominantly used electroencephalography (EEG) to show music-related activities in the brain (Pape, 2005). Today, 30 years later, more elaborate methods of investigation offer the opportunity to show cerebral processes related to music. Functional and structural changes are shown quite clearly using single or combined measurement techniques. Magnetic and functional magnetic resonance tomography (MRT, fMRT) brain mapping, positron emission tomography (PET) as well as magnetoencephalography (MEG) and other techniques are used to explore focal brain activities. These studies developed the evidence base for understanding how listening to music is a complex process that involves multiple brain regions. Besides the auditory cortex, music increases activity in frontal, temporal, parietal and subcortical regions (Koelsch, 2009; Altenmüller and Schlaug, 2015; Brown et al., 2015). Thus, music has a wide range of effects on emotion (Blood and Zatorre, 2001; Boso et al., 2006; Koelsch, 2006, 2009, 2015; Koelsch and Jentschke, 2010; Pereira et al., 2011; Vuilleumier and Trost, 2015), cognitive functions such as attention and memory (Särkämö et al., 2008; Baird and Samson, 2015; Castro et al., 2015), motor functions (Limb, 2006; Koelsch, 2009; Levitin and Tirovolas, 2009; Schaefer and Overy, 2015) and mood (Särkämö et al., 2008; Radstaak et al., 2014; Zatorre, 2015).

Due to the influence on the brain, listening to music and music therapy are often used in neurological rehabilitation of disorders of consciousness (Gustorff and Hannich, 2000; O'Kelly et al., 2013; Magee et al., 2014; Verger et al., 2014; Magee and O'Kelly, 2015). Unresponsive Wakefulness Syndrome (UWS) belongs to the disorders of consciousness and is one of the most severe neurological impairments. The damage in several brain regions leads to an inability to respond to the environment even though patients show clear signs of wakefulness (Adams et al., 2001; Gosseries et al., 2011). As a consequence, the severity of UWS manifests itself to those interacting with the patient in a sudden impossibility to communicate via the usual means. While most professionals try to support the detection and recovery of functional communication, music therapy additionally tries to find new ways of connecting and communicating within the framework of the patient's capabilities. Music therapy has been used to support the neural and behavioral rehabilitation of individuals with UWS for more than 20 years.

An increase in music therapy research in this field points to its importance (Gustorff and Hannich, 2000; O'Kelly et al., 2013; Magee et al., 2014; Magee and O'Kelly, 2015). Combined with research on the neurological impact of music, music therapy research leads to a better understanding of its benefits for patients with brain damage. Still, evidence of music therapy's impact on the neurological rehabilitation of individuals with UWS is rare. To improve our understanding of the impact of music therapy on the neurological rehabilitation and its neural processing, we propose to take a closer look into the brain during music therapy as a complex process.

Understanding music as a language that transports its own distinct neuropsychological and emotional codes (Spreckelmeyer et al., 2013), we follow the hypothesis that enhanced activity and functional augmentation in the cerebral regions for emotion, learning, motion planning, and cognition can be expected after music therapy and shown by PET. Furthermore, from a music therapy perspective, our hypothesis is that individual, live music therapy in a setting of a therapeutic relationship promotes the neurological rehabilitation of individuals with UWS and boosts their brain activity. Our approach may be considered as part of a developing approach exploring brain activation by music therapy, but with a particular focus on an individualized and open investigation format.

Our aim is to trigger new investigations as a dialogue between music therapy and neuroscience in an effort to heighten our understanding of the function of music therapy, its way of activating the brain and its implementation in neurorehabilitation. These investigations could also help improve the individual approach to each patient.

## Theoretical Framework

What is the meaning of music therapy in its "whole complexity"? In our understanding, the complex effect of music therapy on the neuro-rehabilitation of individuals with UWS can be summarized in three aspects: the musical stimulus, the therapeutic relationship and the emotional exchange between the patient and therapist. To investigate the effect of music therapy, all three aspects need to be considered, not just the musical stimulus. For a better understanding of our hypothesis, the concept of music therapy as it is applied in Krems (Tucek, 2014; Tucek et al., 2014) needs a further explanation.

Music itself, as described in the introduction, has an impact on the human body, including brain, emotion, and movement, and so lends itself as an appropriate therapeutic medium in this and other fields. Even though studies show general neurological effects of music as a stimulus, the inter- and intra-individual meaning of this stimulus is different. Various aspects of listening to or performing music, such as personal preference, experience and the current mood are responsible for the formation of the personal meaning of music. Theories of embodiment (Csordas, 2002; Storch and Tschacher, 2014), emerging from anthropological studies, describe the engagement of culture and individuals through sensory perception and experience. Therefore, the meaning of music in therapy develops within the therapeutic session as a specific tool of communication between the patient and the therapist. To paraphrase Simon Rattle (2004), "music is not just what it is, but is that what it means to the people." To perceive and respond to the personal meaning and individual reactions of patients, the therapist empathically observes the patient and constantly adapts the music and the whole interaction to the reactions of the patient (Eisenberger et al., 2003). This leads to a constant exchange between the patient and the therapist that forms the therapeutic process as well as shapes brain activity.

The foundation of this interaction is the therapeutic relationship. From early childhood, experiences of bonding and attachment enhance the growth and connectivity in the neural network (Schore, 1994), whereas social isolation increases the risk for morbidity and mortality (Cacioppo and Hawkley, 2003) and the potential for aggression (Eisenberger et al., 2003). Thus, interpersonal relationships are a basic need (Insel, 2001; Cozolino, 2006). Gustorff and Hannich (2000) emphasize that every living individual has the need and ability for perception and interpersonal communication. Although we do not know how patients with UWS perceive their environment, it is important to see them from a holistic perspective as social individuals. The therapeutic relationship has to be initiated and maintained actively in every session. Within the therapeutic relationship, we try to connect with the patients by observing their reactions to the performed music and by considering even the smallest physiological changes. Live music therapy can address the individual needs of patients and offer adjusted stimuli for the support of rehabilitation. We therefore propose that the experience of a therapeutic relationship within music therapy also promotes the connectivity in the neural networks in these patients.

Studies found that patients with UWS show emotional processing of auditory and visual information (Coleman et al., 2009; Yu et al., 2013). Music itself can evoke emotions (Koelsch, 2015). Additionally, emotional auditory stimuli, like listening to one's name or the mother's voice, activate anterior and posterior midline cortex in patients with UWS (Laureys et al., 2004; Demertzi et al., 2010). Emotion is a key component of how we experience our environment (Sharon et al., 2013). Emotional stimuli receive privileged access to attention and awareness, and thus are more likely to capture one's attention (Vuilleumier, 2005; Phelps, 2006). In particular autobiographic memories lead to emotional responses and involve widespread functions of the brain (Svoboda et al., 2006; Cabeza and St Jacques, 2007; Piolino et al., 2009). Music therapy uses this knowledge by applying familiar songs, singing names of individuals and using entrained music in therapy to reach the patients more directly and to promote reactions suggestive of awareness (Magee and O'Kelly, 2015).

A study on sensory stimulation revealed that, by inviting responses, we could pass from stimulation (which promotes arousal and attention) to rehabilitation (which promotes and reinforces behavioral responses) (Abbate et al., 2014). This statement supports our hypothesis that by combining musical stimuli, the therapeutic relationship and emotional approach, individual live music therapy encourages multisensory, behavioral and physical responses. These, in turn, promote the rehabilitation of individuals with UWS.

Until now, research on the neural effect of music therapy was limited to the observation of musical stimuli in the brain. To strengthen our understanding of the effect of music therapy in its complexity and to pretest our hypothesis, we started the first of a series of investigations of individual live music therapy. While the original research results will be published at a later date, part of the pilot study is presented in this article to describe our approach.

Even though inter-individual brain activity of the patients differs due to different levels of cerebral lesions, we expected to see functional and structural augmentation in the cerebral regions for emotion, learning, motion planning, and cognition (Schlaug et al., 2005; Hyde et al., 2009). In our pilot study, only patients in UWS with hypoxic brain injury following cardiopulmonary resuscitation (CPR) were chosen, because a more homogeneous pattern of brain damage could be expected. While traumatic brain injury leads to heterogeneous regions of cortical damage with patterns of several foci, non-traumatic causes affect thalamic and cortical functions through hypoxic nerve cell lesions (Markl et al., 2013). Consequently, the fronto-temporo-parietal network also shows a decrease of activity in patients with UWS (Jennett, 2002; Laureys et al., 2004; Demertzi et al., 2010, 2013; Laureys and Schiff, 2012). However, for the first pilot study, we narrowed our focus by limiting the examination to three brain areas that are thought to be crucial to the success of music therapy and to cognitive functions: the frontal regions, the hippocampus and the cerebellum.

Following our hypothesis, the aim of this pilot study was to examine whether differences can be detected in the brain between individuals after hypoxic brain lesions who received music therapy and individuals with no music therapy in neurological rehabilitation.

## Material and Methods

### Study Organization

The pilot study was conducted at the IMC University of Applied Sciences Krems, under the direction of the corresponding author. The practical work with the patients was carried out at the Intermediate Care Unit (IMCU), specialized in rehabilitation of patients with disorders of consciousness at the Provincial Hospital of Hochegg, Austria. Ethical approval was given by the official Ethics Committee of Lower Austria. The study was financially supported by the Lower Austrian Health and Social Fund (NÖGUS) and Lower Austrian Provincial Hospital Holding (Landeskrankenhaus Holding). However, the sponsors had no role in study design, data collection, analysis, and interpretation.

### Participants

In this pilot study, we included patients with UWS after CPR who stayed at the IMCU. Patients were diagnosed before admission to the IMCU, and UWS was confirmed after admission following the common rules of diagnosis (Adams et al., 2001). The participants' legal representatives gave their written consent after being informed in person. Patients were randomly assigned to either the music therapy group or the control group by drawing lots. For the first evaluation of our hypothesis, we examined four participants, two in the music therapy group and two in the control group.

### Methods

To show the activity of the brain during and after music therapy, we used PET investigations (Siemens Biograph 16 HiRez PET-CT Scan). PET is still the only method to study the relation between cognitive processes and neurotransmission by showing radiation of nuclear medical tracers in active brain areas (Pape, 2005; Akanuma et al., 2015).

Patients in both groups had three PET scans within 6 weeks: the first one (week 1) was a standard PET scan by the hospital in a resting state, without any stimulation. The second (week 2) and third (week 6) were performed with individual, live music therapy right before the PET scan and during the tracer application. The participants were transported from the hospital bed to the nuclear medical investigation at the Central Radiological Institute in Wiener Neustadt and received an intravenous <sup>18</sup>F-FDG tracer application (230 MBq) during music therapy or a resting state in the PET room. By measuring <sup>18</sup>F-FDG tracer uptake in the brain, PET shows the activity of brain regions. The advantage of this form of investigation is that the tracer is applied during music therapy or resting state, and we can see which brain regions are active and compare the results of different situations and times.

Patients in the music therapy group received live and individual music therapy for 5 weeks between the 2nd and the 3rd PET scan, three times a week. The sessions were conducted by a trained music therapist using various instruments and the therapist's voice. The approach to the patients in the therapy sessions is consistent with the theoretical framework described above. A key element of the therapeutic work was the attunement to the patient. The therapies started with an initial touch on the arm or shoulder and humming, singing, or playing in the rhythm of breath. The manner of breathing allows for the interpretation of the patient's current constitution, to which the therapy is adapted. Autobiographical information, such as favorite songs or artists, was included in the therapy, as was singing the patient's name. The therapist carefully observed the patient the entire time, including physical (e.g., tonicity, facial expression, eyes) and physiological (e.g., breath, heart rate, oxygen saturation) actions and reactions, and adjusted to these. For example, to support relaxation the therapist played improvisations entrained to the rhythm of the patient's respiration, or, in order to help the patient relieve tension, the therapist increased the amount of smooth tactile contact. To invite reactions and to avoid excessive demand, music and speech were provided in a basal, slow and adjusted manner and filled with pauses. The average therapy duration was 27 min. All sessions were recorded on video and documented in protocols for further analysis.

Patients in the control group had no music therapy during those 5 weeks. However, all participants received standard care (physical, occupational and speech therapy, neuropsychological treatment), as the pilot study took place at the IMCU Hochegg, specializing in neurological rehabilitation of individuals with UWS (**Table 1**).

Three brain areas were analyzed in this pilot study, namely the frontal areas, the hippocampus and the cerebellum.

The frontal regions are known for processing cognitive and motor functions. For example, frontal premotor areas are involved in the perception and production of rhythm (Limb, 2006; Levitin and Tirovolas, 2009), while other frontal regions are responsible for cognitive tasks, impulsion, memory, and social functions (Trepel, 2008).

The hippocampus is a part of the limbic system, hence is involved in emotional processes, social bonding, and relationships (Koelsch, 2012). Therefore, it is hypothesized that therapeutic relationship may have an influence on the hippocampus. Its activity increases while listening to music that is associated with positive emotions (Brown et al., 2004; Koelsch, 2009; Levitin and Tirovolas, 2009). Additionally, it plays a crucial part in learning, memory, spatial orientation and the processing of sensory information (Blood and Zatorre, 2001; Brown et al., 2004; Eldar et al., 2007; Trepel, 2008; Levitin and Tirovolas, 2009; Koelsch, 2012).

#### TABLE 1 | Course of the study.

The cerebellum is involved in several motor functions, such as posture, tonicity, and arbitrary movement (Trepel, 2008). Due to the connection to the limbic system it is also involved in cognitive and emotional processes. In musical tasks it is responsible for the perception and production of rhythm as well as emotional reactions to music (Blood and Zatorre, 2001; Limb, 2006; Levitin and Tirovolas, 2009; Trost et al., 2012; Akanuma et al., 2015).

This pilot study was based on the hypothesis that the brain is activated by individual music therapy. We assume that PET can be used to show that music therapy reliably activates the human brain and enhances neurological rehabilitation. However, our aim was not to prove our hypothesis, but observe the brain of individuals with UWS during music therapy and develop our understanding of this in the context of a neuroanthropological approach.

## Results

Quantitative data of uptake values were generated automatically using the Syngo Scenium Ver.1.2.0.13 Siemens Medical Solutions software. To avoid misinterpretation caused by metabolic variations, all results were adjusted to the uptake values of a reference region (calvaria). For further analysis the differences between the three PET scans were calculated for each patient individually (numerical value and percentage calculation) and then compared to each other. As the numerical values of the differences vary considerably due to the severity of the patients' brain lesions, the changes in the uptake values are presented in percentages. This allows a better comparison of the results in the two groups.
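The calculation described above can be illustrated with a minimal sketch. It assumes that "adjusted to the uptake values of a reference region" means dividing regional uptake by reference uptake, and that percentage change is taken relative to the earlier scan; neither detail is spelled out in the text. All function names are hypothetical and the numbers are invented for illustration; they are not data from the study.

```python
from statistics import mean


def normalized_uptake(region_uptake: float, reference_uptake: float) -> float:
    """Express regional tracer uptake relative to the reference region.

    Assumption: the adjustment is modeled as a simple ratio.
    """
    if reference_uptake <= 0:
        raise ValueError("reference uptake must be positive")
    return region_uptake / reference_uptake


def percent_change(earlier: float, later: float) -> float:
    """Relative change from an earlier to a later scan, in percent."""
    if earlier == 0:
        raise ValueError("earlier value must be non-zero")
    return (later - earlier) / earlier * 100.0


def patient_changes(scans: dict) -> dict:
    """Changes between consecutive scans for one patient and one region.

    `scans` maps "PET1", "PET2", "PET3" to (region_uptake, reference_uptake).
    """
    ratios = {name: normalized_uptake(*values) for name, values in scans.items()}
    return {
        "PET1->PET2": percent_change(ratios["PET1"], ratios["PET2"]),
        "PET2->PET3": percent_change(ratios["PET2"], ratios["PET3"]),
    }


def group_mean_change(patients: list, key: str) -> float:
    """Mean of the individually computed percentage changes in a group."""
    return mean(changes[key] for changes in patients)


# Hypothetical values for illustration only; NOT data from the study.
patient_a = patient_changes({"PET1": (6.0, 2.0), "PET2": (5.4, 2.0), "PET3": (7.2, 2.0)})
patient_b = patient_changes({"PET1": (4.0, 2.0), "PET2": (4.0, 2.0), "PET3": (5.0, 2.0)})
group = [patient_a, patient_b]
print(round(group_mean_change(group, "PET2->PET3"), 1))
```

Computing the change per patient first and averaging afterwards mirrors the order described in the text, and it keeps a patient with high absolute uptake from dominating the group mean.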

The results of the first evaluation show an increase in tracer uptake in PET 3 in all three areas in music therapy patients, while it decreased in the control group patients. In both groups tracer uptake was lower in PET 2 than in PET 1 (mean value: MT-Group: −1%; CG: −12%). **Figure 1** shows the mean values of the changes in the course of the study.

After 5 weeks of music therapy tracer uptake in PET 3 increased by 37% in frontal regions, 28% in hippocampus, and 38% in cerebellum in the music therapy group. The control group shows different results. While activity increased in PET 3 by 7% in frontal areas, 4% in hippocampus and 3% in cerebellum, tracer uptake was still lower than in PET 1. **Figure 2** shows the mean value of changes from PET 2 to PET 3.

The goal of the investigation was not to describe different states of consciousness or the awakening after music therapy but to show changes in brain activity before, during and after the therapy. This was documented clearly through simple PET investigation. However, we did not conduct further statistical analyses in this pilot study due to the small number of participants.

FIGURE 1 | Mean values of the changes in brain activity in the course of the study.

## Interpretation

Considering the low number of subjects, we want to handle the interpretation with care. However, there is a considerable difference between the two groups, which supports our hypothesis. The increase of tracer uptake can be interpreted as an increase in brain activity. Patients in the music therapy group show a higher brain activity than control group patients. However, we have to take into account that PET 3 is also a scan of music therapy as a stimulus. We cannot yet explain the decrease of tracer uptake in PET 2, as patients received the first music therapy during this situation and four cases provide insufficient data for interpretation.

This pilot study represents a first step in a series of investigations as a dialogue between music therapy and neuroscience. It shows that these research methods may open the way to getting more definite results on the effect of music therapy on the neuro-rehabilitation of individuals with UWS. While this pilot study focused on the activity in targeted areas of the brain, further research will provide more room for interpretation of the neurological rehabilitation. Additionally, more patients and statistical analysis of the PET results may help clarify our results.

## Discussion

This pilot study was a primary step into the very complex field of music therapy in the neuro-rehabilitation of patients with UWS. Examining four patients was the first attempt to evaluate whether our hypothesis can be tested with the chosen methods. Further research is currently in progress and the following steps are planned. In light of the complexity of music therapy as discussed previously, the focus on three brain areas is a limiting factor. Nevertheless, it was a stepping stone for developing research methods under almost-bedside conditions in order to bridge the gap between research and practice.

During rehabilitation of individuals with UWS, neurologists and music therapists have a long history of interdisciplinary cooperation. Particularly when working with patients who cannot communicate what they experience, it is important to find indications of the effect of music therapy. Studies on behavior observation are crucial for our practical work; however, they only capture what is observable from the outside of UWS patients. Neuroscience provides deeper insight into neurological processes and neurological rehabilitation and has in recent years helped gain a better understanding of the effect of music as a stimulus in the brain. In order to achieve a better understanding of music therapy, it is important to find a more complex approach, combining video analysis, neuroanthropological methods, psycho-vegetative parameters (e.g., heart rate variability) and brain imaging.

Neuroscience helps music therapy gain knowledge about the physiological effects of musical elements, which is useful for the theoretical foundation of music therapy. However, we have to be aware that research on music therapy needs a broader approach and interpretation of results than research on music. As the concept of music therapy in Krems, as described above, derives from an anthropological perspective, our approach to research is influenced by a neuroanthropological one (Vogl et al., 2015). Neuroanthropology combines neuroscientific and anthropological research by investigating the interaction between brain, environment and culture. It allows a broader perspective on music therapy by collecting quantitative data as well as qualitative information on the patient's cultural background, environmental influences and the therapeutic relationship between the patient and the therapist. To achieve a careful interpretation of behavioral reactions and imaging results, neuroanthropology encourages a reflective process at any time of a research project and poses profound questions on the meaning of results. PET scans, for example, are important to gain insight into the physiological correlation to music therapy, but results give no answer to the question about the meaning of music therapy. What do neural changes mean for patients with UWS? Does the increase in brain activity show an effect in their behavior? What are the advantages of higher brain activity for these patients? And more generally, what does music therapy really do for them?

We should not forget that adapting to the new situation after the lesion of the brain and coping with UWS can pose an emotional challenge and cause a "reorientation syndrome" (Steinhoff, 2012) for the patient as well as their relatives. To bridge the gap between research and practice, our studies are accompanied by a neuroanthropologist, who focuses on cultural and environmental influences on the brain activity of patients with UWS. Further explanations and first examples of the neuroanthropological approach are published by Vogl et al. (2015).

From an anthropological perspective, the aim of music therapy is to transform the foreign, clinical environment (Umwelt, "around-world") of patients to their contemporaries (Mitwelt, "with-world") (Binswanger, 1963; Prinds et al., 2013). By addressing the patient individually and opening up to individual needs and reactions, music therapy is formed not only for the patient, but with the patient. Given that the therapeutic relationship and the interaction within music therapy promotes rehabilitation, the therapist presents another unexplored element.

In an interdisciplinary team, we deviate from common patterns of investigation and try to find new ways to examine the effect of music therapy on patients with UWS. As described above, music therapy is more than listening to musical stimuli. Therefore, studying its effect needs to include all its elements. An interdisciplinary approach may help find new methods to get answers. Furthermore, a dialogue is necessary between all people and professions involved in a study: physicians, care team, other therapists, anthropologists, nuclear physicians as well as participants' relatives. Everyone can provide information which is beneficial for a good course of the study.

Summing up, music therapy practice can be advanced by neuroscience opening itself up to individual real-life settings and integrating all elements of music therapy, because its benefit may lie exactly in its complexity. Music therapy is a multisensory, emotional, physical and social approach and therefore involves many neurological functions. If we want to meet the individual needs of the patients, music therapy cannot be standardized. Therefore, it is crucial to have research methods within the frame in which the investigation of individual music therapy takes place. Opening up to this complexity requires new ways of thinking which can be enhanced by an interdisciplinary dialogue. Particularly in music therapy, whose theory and methods are based on the combined knowledge of various disciplines, a dialogue with neuroscience can support the evidence for our practical work and provide insight into deeper processes in our patients. Hence, the dialogue between music therapy and neuroscience is seen as an important, fruitful advantage for both disciplines.

## Acknowledgments

We acknowledge the contributions of the Region of Lower Austria, the Lower Austrian Hospital Holding and the Lower Austrian Health and Social Fund (NÖGUS) and all contributors of the different related institutions, who made this study possible by personal or organizational effort. We also want to thank Denise Kleiss, Ben Freid, and Veronika Dornhofer for help with the English editing.

### References

longitudinal study. The neurosciences and music III: disorders and plasticity. Ann. N. Y. Acad. Sci. 1169, 182–186. doi: 10.1111/j.1749-6632.2009.04852.x


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Steinhoff, Heine, Vogl, Weiss, Aschraf, Hajek, Schnider and Tucek. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Cortical reorganization in recent-onset tinnitus patients by the Heidelberg Model of Music Therapy

#### *Christoph M. Krick<sup>1</sup>\*, Miriam Grapp<sup>2</sup>, Jonas Daneshvar-Talebi<sup>1</sup>, Wolfgang Reith<sup>1</sup>, Peter K. Plinkert<sup>3</sup> and Hans Volker Bolay<sup>4</sup>*

*<sup>1</sup> Department for Neuroradiology, Saarland University Hospital, Homburg, Germany*

*<sup>2</sup> German Center for Music Therapy Research (Victor Dulger Institute) DZM, Heidelberg, Germany*

*<sup>3</sup> Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital for Ear, Nose, and Throat, University of Heidelberg, Heidelberg, Germany*

*<sup>4</sup> Music Therapy Tinnitus Outpatient Department, German Center for Music Therapy Research (Victor Dulger Institute) DZM, Heidelberg, Germany*

#### *Edited by:*

*Jörg Christfried Fachner, Anglia Ruskin University, UK*

#### *Reviewed by:*

*Martin Schecklmann, University of Regensburg, Germany*

*Derek James Hoare, University of Nottingham, UK*

#### *\*Correspondence:*

*Christoph M. Krick, Department for Neuroradiology, Saarland University Hospital, Kirrberger Straße, D-66421 Homburg, Germany e-mail: christoph.krick@uniklinikum-saarland.de*

Pathophysiology and treatment of tinnitus are still fields of intensive research. The neuroscientifically motivated Heidelberg Model of Music Therapy, previously developed by the German Center for Music Therapy Research, Heidelberg, Germany, was applied to explore its effects on individual distress and on brain structures. This therapy is a compact and fast application of nine consecutive 50-min sessions of individualized therapy implemented over 1 week. Clinical improvement and long-term effects over several years have previously been published. However, the underlying neural basis of the therapy's success has not yet been explored. In the current study, the therapy was applied to acute tinnitus patients (TG) and healthy active controls (AC). Non-treated patients were also included as passive controls (PTC). As predicted, the therapeutic intervention led to a significant decrease of tinnitus-related distress in TG compared to PTC. Before and after the study week, high-resolution MRI scans were obtained for each subject. Assessment by repeated measures design for several groups (Two-Way ANOVA) revealed structural gray matter (GM) increase in TG compared to PTC, comprising clusters in precuneus, medial superior frontal areas, and in the auditory cortex. This pattern was further applied as a mask for general GM changes as induced by the therapy week. The therapy-like procedure in AC also elicited similar GM increases in precuneus and frontal regions. Comparison between structural effects in TG vs. AC was calculated within the mask for general GM changes to obtain specific effects in tinnitus patients, yielding GM increase in right Heschl's gyrus, right Rolandic operculum, and medial superior frontal regions. In line with recent findings on the crucial role of the auditory cortex in maintaining tinnitus-related distress, a causative relation between the therapy-related GM alterations in auditory areas and the long-lasting therapy effects can be assumed.

**Keywords: tinnitus, cerebral reorganization, brain plasticity, auditory cortex, MRI, voxel-based morphometry (VBM), Heidelberg Model of Music Therapy, gray matter**

#### **INTRODUCTION**

Tinnitus is one of the most common symptoms in ENT medicine (Pilgramm et al., 1999). Whereas short and transient phantom noise seems to be a ubiquitous phenomenon in the general population, 5–15% of people are affected by a persisting manifestation, and up to 2% of these are severely restricted in their quality of life (Axelsson and Ringdahl, 1989; Khedr et al., 2010; Shargorodsky et al., 2010). Beyond that, the phantom noise often carries additional psychiatric, psychosocial, or psychosomatic comorbidities such as anxiety and depression, concentration and attention deficits, as well as sleep disorders (Jacques et al., 2013). This epidemiological profile points to the value of investigating the origins of tinnitus more thoroughly in order to gain a better understanding of its nature and of the potential for remediation. Tinnitus is thought to be triggered in many cases by cochlear damage, resulting in abnormal or missing afferent input to the auditory cortex (Moller, 2007). However, this specific defect seems not to be sufficient to explain the whole genesis. To date, there are many methods available to explore brain involvement in phantom noise (Galazyuk et al., 2012; Langguth et al., 2013; Noreña and Farley, 2013; Zhang, 2013), including Transcranial Magnetic Stimulation (Theodoroff and Folmer, 2013), Independent Component Analysis of brain potentials (De Ridder et al., 2011b) and morphometric measurements (Schecklmann et al., 2013). Recent neuroimaging studies of tinnitus indicate the involvement of wide-spread brain networks for perception, attention, memory, and emotional aversive processes (Adjamian et al., 2009; Lanting et al., 2009). In this context De Ridder et al. (2013) proposed a neuronal model of phantom perception and its emotional coupling to distress based on a previous model proposed by Jastreboff (1990).
According to the authors, missing signals by sensory deafferentation cause high-frequency, gamma band, synchronized neuronal activity in the sensory cortex. This activity only reaches awareness when it co-activates brain networks that are related to self-perception and salience (amygdala, anterior cingulate, anterior insula, and precuneus). Once it has become conscious in this way, the phantom percept also activates a non-specific distress network that in turn overlaps with salience coding, resulting in an emotional coupling of tinnitus to the experience of distress. Structural brain analysis observing neuronal plasticity has been found to be suitable for understanding correlations between such mental sensations and neural mechanisms (Valkanova et al., 2014). Of critical importance are seven studies so far that specifically investigated anatomical deviations in tinnitus (Mühlau et al., 2006; Landgrebe et al., 2009; Schneider et al., 2009; Husain et al., 2011; Leaver et al., 2012; Boyen et al., 2013; Schecklmann et al., 2013). Notably, even though they all consistently report structural differences, their results largely differ regarding both the localization and the direction of changes. However, there is growing consensus on the involvement of gray matter (GM) alterations in auditory brain areas when suffering from tinnitus distress. For instance, Schecklmann et al. (2013) found such an interrelation in the course of a large cross-sectional morphometric study (*n* = 257), and cross-validated the results in an independent second sample (*n* = 78). More precisely, tinnitus distress correlated negatively with GM volume in bilateral auditory areas, pointing to higher individual tinnitus distress with lower gray matter volume. Similarly, Schneider et al. (2009) found gray matter loss associated with tinnitus in the Heschl's gyri, again indicating a close relationship between tinnitus and auditory cortices. Leaver et al. (2012) also revealed reduced gray matter next to auditory areas in tinnitus patients, and additionally observed a substantial GM decrease in medial frontal cortex (dmPFC). The authors argued that the latter tinnitus-related alterations in dmPFC might not be related to distress, but to individual loudness of tinnitus sensation. Boyen et al. (2013) also observed changes in auditory areas due to tinnitus, but, in contrast to the previous findings, these pointed to an increase of GM.

Suffering from tinnitus does not necessarily mean feeling diseased due to the phantom noise. Quite the contrary, neither perceived loudness nor tinnitus frequency seem to correlate with mental strain, but it is the emotional correlate of tinnitus, that is, tinnitus distress, which may trigger such feelings of diminished well-being (De Ridder et al., 2011a). In line with this assumption, GM alterations in auditory cortex were not correlated with tinnitus sound, but with severity of tinnitus-related distress (Schecklmann et al., 2013). Hence the existence of tinnitus *per se* (by phantom noise) does not require any therapeutic intervention. But since tinnitus often co-occurs with considerable emotional decline among affected patients, there is still demand for therapeutic assistance. However, many available therapies resulted in relatively small effects or lacked improvement in tinnitus load (Pichora-Fuller et al., 2013).

In case of acute tinnitus manifestation, existing treatment options may be considered unsatisfactory. On the one hand, several pharmacological approaches (Patterson and Balough, 2006) have been established considering tinnitus to be equivalent to sudden sensorineural hearing loss (Hesse and Laubert, 2010) or to any cochlear damage (Shim et al., 2011). However, none of these treatment methods have proven to be effective after replication in controlled trials (Elgoyhen and Langguth, 2011). On the other hand, different types of psychotherapeutic intervention supporting and accompanying medical treatment have also been designed (Schildt et al., 2006; Gerhards and Brehmer, 2010). These adjuvant psychotherapeutic interventions consist of one or more of the following elements: psycho-educative counseling, relaxation training, and general and tinnitus-related stress management. Different approaches designed to manage or to habituate the phantom noise have been established (Tinnitus Retraining Therapy, Cognitive Behavioral Therapy, Progressive Tinnitus Management, Biofeedback, Education, and Relaxation Therapies), partially resulting in persistent therapy success (Herraiz et al., 2007; Hesser et al., 2011; Folmer et al., 2014; Grewal et al., 2014; Myers et al., 2014). Whereas psychological strategies are intended to modulate attention and emotion toward tinnitus, noise maskers and hearing aids instead interact with acoustic sensation to suppress tinnitus perception. Tinnitus sound masking was developed in the early 1970s (Coles et al., 1984) and is still being used in mild cases, because a lasting improvement can be achieved as long as the external noise is applied. The devices led to reduced tinnitus distress, especially when combined with hearing aids amplifying the impaired frequency range (Oz et al., 2013). 
Direct modulation of tinnitus-related activity is also the aim of both Transcranial Direct Current Stimulation (tDCS) and repetitive Transcranial Magnetic Stimulation (rTMS) (Langguth and De Ridder, 2013). In most of these studies the primary auditory cortex has been targeted for tinnitus treatment by cortical stimulation (Simon et al., 2012). However, benefits from rTMS therapy have not been shown to persist over time (Theodoroff and Folmer, 2013).

Any effective therapy for tinnitus requires a fundamental understanding of its physiological and neural background. For instance, the "Heidelberg Model of music therapy for Tinnitus" refers to scientific evidence for cerebral circuits of tinnitus enhancement (Argstatter et al., 2008) (for details see Procedure section). This treatment approach strives for an integration of strategies to manage the psychological state and to possibly reverse the underlying neuronal reorganization. For this purpose, complementary music- and psychotherapeutic interventions, comprising emotional regulation of tinnitus load and exercises of frequency discrimination in the spectral range of tinnitus noise, have been organized into several modules, resulting in a manualized short-term music therapeutic treatment concept whose separate treatment modules and long term effects are described in detail by Argstatter et al. (2012). The authors also reported the high clinical efficacy and long-term effects of this approach in chronic tinnitus patients. Corresponding clinical therapeutic effects in patients with acute tinnitus have been previously reported by Grapp et al. (2013). The authors of this study measured a decrease of tinnitus-related mental load in treated compared to untreated patients after 1 week of therapy. This improvement on tinnitus distress by the aforementioned therapy concept formed the starting point of our research. We aimed to investigate the corresponding neural correlates of this distress-related improvement in tinnitus patients more thoroughly. Thus, we sought to gain a deeper insight into the complex brain etiology and into the possibility of cortical reorganization in tinnitus.

#### **HYPOTHESIS**

Based on previous studies on structural plasticity, we expected a neural correlate of the therapy effect to be most prominent within auditory areas (Heschl's gyri), as tinnitus distress is highly related to structural GM loss in these regions. Whereas microstructural regeneration processes on a cellular level (Kwok et al., 2011) cannot be directly detected by MRI, corresponding effects on brain tissue (Kleim et al., 2004) seem to be reliably detectable by Voxel Based Morphometry (VBM) (Ashburner and Friston, 2000). By conflating specific evidence for structural changes after therapy (Seminowicz et al., 2013) with assumptions about the rapid intervention-induced expansion of GM as general principle of human neural plasticity (Driemeyer et al., 2008; Taubert et al., 2010; Tavor et al., 2013), we hypothesized a GM alteration also with the Heidelberg Model of Music Therapy after a short-term treatment interval of 1 week.

#### **METHODS AND PARTICIPANTS**

#### **PARTICIPANTS**

In this study, we included participants who were diagnosed with acute tinnitus persisting for a maximum of 3 months, without significant symptom change after an initial medical intervention according to AWMF guidelines (glucocorticoids or rheological drugs). Before including the participants in music therapy, a waiting period up to 4 weeks was warranted in order to prevent both delayed drug response and the influence of possible spontaneous remission. After completion of this pharmacological treatment during the first weeks after tinnitus onset, tinnitus patients underwent a pre-participation evaluation for participation in the music therapy. In addition to standard audiological testing and otolaryngological examination, important demographic and tinnitus-related data were collected. Patients were excluded if the tinnitus was related to anatomic lesions of the ear, to retrocochlear lesions or to cochlear implantation. Further exclusion criteria comprised clinical diagnosis of a co-morbid severe mental disorder, clinical diagnosis of Menière's Disease, severe hyperacusis or severe hearing impairment more than 40 dB beyond the affected tinnitus frequencies. The latter criterion was chosen to exclude interaction between music therapy and hearing aids for the present.

Fifty patients with experience of a recent tinnitus onset (between 6 and 12 weeks prior to the intervention) were invited to participate in the music therapy study subsequent to treatment according to the standard clinical protocol for acute tinnitus in the University Hospital for Ear, Nose, and Throat at the University of Heidelberg. All patients had an age-appropriate hearing level and reported no otological or psychological comorbidity. At the time point of the pre-participation evaluation (T0) the patients were randomly divided into two groups, a treatment group (TG) and a waiting group for passive tinnitus controls (PTC). The time span between tinnitus onset and T0 was 5.10 (SD 2.14) weeks in TG and 4.63 (SD 2.01) weeks in PTC. For ethical reasons, PTC patients also received the therapeutic intervention, but following the study period. Participants of both groups were instructed about the MRI measurements and their noise level. All participants were insured for any health impairment and accidents. They gave written informed consent in accordance with the Declaration of Helsinki. The study was in accordance with the requirements of the ethics review board of Saarland.

After the period of the standard clinical treatment protocol, 7 patients were excluded from music therapy due to disappearance of tinnitus. Two further patients were excluded because of claustrophobia. Thus, the effective sample comprised 19 patients in the TG and 22 patients in the PTC. The patient groups did not differ in age, sex or in level of distress (see **Table 1**). The mean delay between tinnitus onset and therapy start (T1) was 8.14 (SD 1.85) weeks in TG and 8.10 (SD 1.45) weeks in PTC.

After recruitment of tinnitus patients, a group of 22 healthy participants was included in the music therapy condition, serving as active controls (AC). They were matched in age and sex to the patient groups. They underwent the same therapy protocol as implemented in TG. This study protocol consisted of 9 consecutive 50-min sessions of individualized therapy over 5 days, comprising acoustic training for frequency discrimination, auditory attention control tasks, and guided exercises for mindfulness and distress regulation.

In total, data from 63 participants from the three groups were included in the analysis. The three samples used for Voxel Based Morphometry (VBM) did not differ in biometric data in sex [χ²(*df* = 2) = 0.99, *p* = 0.95] or in age profile [χ²(*df* = 2) = 1.76, *p* = 0.42].
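As a quick plausibility check on the age comparison, the upper-tail probability of a χ² statistic with two degrees of freedom has the closed form exp(−x/2). The minimal sketch below is illustrative only and is not part of the original analysis; the function name `chi2_sf_df2` is an assumption introduced here. Applied to the reported age statistic, it agrees with the reported *p* = 0.42 to within rounding of the test statistic.

```python
import math


def chi2_sf_df2(x: float) -> float:
    """Upper-tail probability of a chi-square variable with df = 2.

    For df = 2 the chi-square distribution is an exponential distribution
    with mean 2, so the survival function is exp(-x / 2).
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.exp(-x / 2.0)


# Age comparison across the three groups, statistic as reported in the text.
p_age = chi2_sf_df2(1.76)
print(f"p = {p_age:.3f}")
```

For other degrees of freedom the closed form no longer applies, and a statistics library would be needed instead.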


#### **Table 1 | Patient-related as well as tinnitus-related data in an overview.**

#### **STUDY PROTOCOL**

Therapy effects on tinnitus severity and individual tinnitus related distress were assessed by Tinnitus Questionnaire (TQ) developed by Goebel and Hiller (1998). The TQ refers to both tinnitus-related functional disabilities (such as concentration difficulties or hearing impairment) and emotional impairments (such as fear, anger or frustration due to tinnitus). TQ scores were obtained at three different times, during inclusion examinations, before start of treatment, and after the therapy week. The preceding TQ assessment as part of the inclusion examination was integrated into the experimental setup to exclude novelty effects from further evaluation of therapy effect.

All participants underwent two MRI sessions on two subsequent weekends. Between these MRI sessions, TG and AC were treated with music therapy according to the Heidelberg Model. Participants of PTC did not receive any intervention during this time. MRI scans were performed at the Department for Neuroradiology in Homburg using a "Skyra" Siemens 3-Tesla scanner and a 20-channel head coil. Each MRI session consisted of three parts: a functional measurement during a continuous performance task previously used in attention studies (Schneider et al., 2010), a high-resolution anatomical T1-weighted scan, and a functional measurement of emotional processing of tinnitus-related (idiosyncratic) and other affective and neutral verbal stimuli (Golm et al., 2013). However, in this paper only the results from the anatomical scans will be reported. The Magnetization Prepared Rapid Acquisition Gradient Echo (MPRAGE) protocol (Mugler and Brookeman, 1990) was used, resulting in a resolution of 0.9 × 0.9 × 0.9 mm isotropic voxel size covering the whole head.

#### **STATISTICAL ANALYSIS**

MRI scans were performed twice, once before (A-image) and once after (B-image) the 1-week period, for the purpose of Voxel Based Morphometry (VBM) (Ashburner and Friston, 2000) as realized for longitudinal measurements by the VBM8-Toolbox (Christian Gaser, University of Jena, http://dbm.neuro.uni-jena.de/vbm). Brain compartments of white and gray matter were segmented, DARTEL-normalized to MNI space using the IXI template (Ashburner, 2007), and smoothed with a Gaussian kernel of 10 mm radius.

Comparisons of structural changes were calculated by "flexible factorial design" as implemented in SPM8 (Wellcome Trust Centre for Neuroimaging, London, 2010). The numerical procedure was carried out as a Two-Way ANOVA calculating the influence of the three participant groups and the two dependent time points, scanned before and after the study week, respectively. A comparison between treated and untreated patients (TG vs. PTC) was performed to examine differential therapy-induced effects on structural change. Resulting structural findings from this contrast were further used as spatial mask for general effects of music therapy. Specific tinnitus-related therapy effects were calculated by contrast between TG and "treated" AC in conjunction with selected brain clusters from general effects (TG vs. PTC). This step was implemented to separate tinnitus-related effects from general therapy-related effects. All obtained clusters of each comparison were corrected *post-hoc* by extent threshold of 125 contiguous voxels and reported after family-wise error (FWE) correction on cluster-level of 5% alpha error.
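The logic of a group-by-time contrast such as TG vs. PTC can be illustrated for a single voxel. In the simplest case of two groups and two time points, the interaction amounts to comparing the within-subject change (B-image minus A-image) between the groups with a pooled-variance *t*-test. The following is a minimal sketch under that assumption, not the SPM8 flexible factorial implementation; the function name `change_score_t` and all values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance


def change_score_t(pre_a, post_a, pre_b, post_b):
    """Student's t (pooled variance) for group A vs. group B on the
    within-subject change (post - pre). Returns (t, degrees of freedom)."""
    diff_a = [post - pre for pre, post in zip(pre_a, post_a)]
    diff_b = [post - pre for pre, post in zip(pre_b, post_b)]
    n_a, n_b = len(diff_a), len(diff_b)
    pooled_var = (
        (n_a - 1) * variance(diff_a) + (n_b - 1) * variance(diff_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(diff_a) - mean(diff_b)) / standard_error, n_a + n_b - 2


# Hypothetical GM values at one voxel (arbitrary units), NOT study data.
treated_pre = [0.50, 0.52, 0.48, 0.51]
treated_post = [0.52, 0.53, 0.50, 0.52]
untreated_pre = [0.49, 0.51, 0.50, 0.52]
untreated_post = [0.49, 0.51, 0.51, 0.51]

t_value, df = change_score_t(
    treated_pre, treated_post, untreated_pre, untreated_post
)
print(f"t({df}) = {t_value:.2f}")
```

A positive *t* here indicates a larger increase in the first group. In a whole-brain analysis the same comparison is repeated at every voxel, which is why the cluster-extent threshold and FWE correction described above are needed.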

Revealed clusters from GM contrasts between groups were anatomically assigned to brain structures using the cytoarchitectonic maps as published in Morosan et al. (2001) by application of the Anatomy Toolbox (Eickhoff et al., 2006) supplemental to SPM.

#### **MUSIC THERAPY AND ASSESSMENT OF CLINICAL THERAPY EFFECTS**

The music therapy according to the Heidelberg Model of Music Therapy for tinnitus (Argstatter et al., 2008) is a manualized short term treatment approach lasting for nine consecutive 50 min sessions of individualized therapy. Therapy takes place over five consecutive days with two therapy sessions per day. The therapy was carried out by a team of two expert therapists, usually one music therapist and one psychotherapist.

Treatment by music therapy was characterized by several distinctive features:


The treatment modules are described in more detail by Argstatter et al. (2012).

#### **RESULTS**

#### **CLINICAL THERAPY EFFECT AS ASSESSED BY TQ**

Tinnitus-related mental load in terms of distress or psychiatric disorders was measured by TQ at T0 (pre-participation evaluation), at T1 (therapy start) and therapy end (T2). The resulting therapy effect was assessed by the difference of TQ scores between T1 and T2. Treatment by the compact approach of the Heidelberg Model of Music Therapy over 1 week added up to 450 min of therapy sessions (9 × 50 min). TQ effects between TG and untreated PTC patients were assessed by General Linear Model (*df* = 1; *F* = 22.9; *MSE* = 1374) for repeated measures using SPSS21 (IBM Corp.). The 1-week therapy resulted in a significant (*p* < 0.00005) effect on change in TQ scores (see **Figure 1**) between both groups of tinnitus patients: Compared to a slight test-retest effect (about 1.8 TQ scale points) in PTC, in TG a significant (*T* = −5.7, *df* = 18, *p* < 0.0001) decrease of 17.7 (SD 13.6) TQ scale points was measured. In PTC, the TQ score did not significantly change over the observation period of 1 week.
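The within-group test statistic for TG can be reproduced from the summary values given above. The sketch below assumes that the reported SD of 13.6 refers to the individual TQ change scores (T2 minus T1) of the 19 treated patients; the function name `paired_t_from_summary` is introduced here for illustration only.

```python
import math
from typing import Tuple


def paired_t_from_summary(
    mean_change: float, sd_change: float, n: int
) -> Tuple[float, int]:
    """Paired (one-sample) t statistic and degrees of freedom computed
    from the mean and SD of the change scores and the sample size."""
    if n < 2 or sd_change <= 0:
        raise ValueError("need n >= 2 and a positive SD")
    standard_error = sd_change / math.sqrt(n)
    return mean_change / standard_error, n - 1


# TG: mean TQ change of -17.7 points (SD 13.6) in 19 patients, as reported.
t_value, df = paired_t_from_summary(-17.7, 13.6, 19)
print(f"t({df}) = {t_value:.1f}")  # prints "t(18) = -5.7"
```

The result matches the reported *T* = −5.7 with *df* = 18, which supports reading the SD as that of the change scores.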

#### **THERAPY-RELATED CORTICAL ALTERATION: GM INCREASE IN TG vs. PTC**

Over the observation period of 1 week, the Heidelberg Model of music therapy was applied to TG patients, whereas PTC patients were not treated during this time span. Comparing structural alterations between T1 and T2, several brain regions revealed increased GM density in treated vs. untreated tinnitus patients (see **Figure 2**), yielding clusters in precuneus, supplementary motor area (SMA), medial superior frontal sulci, prefrontal areas, right Rolandic operculum and right Heschl's gyrus (see **Table 2**). The highest effect was measured in the right Rolandic operculum, resulting in a GM increase of about 1.7% in treated patients. The resulting contrast was subsequently considered as a mask for general effects of the therapy situation, comprising training tasks, relaxation sessions, and specific auditory exercises.

#### **CORTICAL ALTERATIONS IN "TREATED" AC vs. UNTREATED PTC**

In "treated" AC, an increase of GM density also occurred in precuneus and medial superior frontal areas (see **Figure 3**) when contrasted with untreated PTC. This result overlapped with clusters revealed by the contrast between TG and PTC. However, the contrast in AC did not reach the magnitude of the effect observed in TG (see **Table 3**). No clusters in temporal areas were observed.

#### **TINNITUS-SPECIFIC GM ALTERATIONS: CONTRAST BETWEEN TG vs. AC**

Comparison of TG and AC, both of which experienced the therapy week but which differed in tinnitus presence, was performed to assess specific tinnitus-related structural effects due to the therapeutic intervention. This contrast was calculated within the clusters from general effects of the music therapy as calculated by contrast between TG and untreated PTC. Thus, this calculation is regarded as a selection of tinnitus-related effects within therapy-related structural GM alterations. **Figure 4** shows the resulting clusters in right auditory and medial superior frontal areas, covering the right Heschl's gyrus and the right Rolandic operculum (see **Table 4**).

**Table 2 | Clusters showing increase of gray matter density in TG vs. PTC over 1 week.**

*\*\*p* < *0.01/\*p* < *0.05 after FWE correction.*

#### **DISCUSSION**

The present study was very innovative due to the homogeneity of patient samples. Most previous studies on neural correlates of tinnitus distress have been carried out on patients with chronic tinnitus in which tinnitus duration was not taken into account. Importantly, in this paper, these potentially moderating effects of tinnitus duration were explicitly controlled for by including only patients with a recent onset of tinnitus persisting for a maximum of 3 months. On the whole, only a few studies so far have engaged in systematic measurements of such neural alterations in *acute* tinnitus. Among them, Job et al. (2012) found neural hyperactivities in attention- and emotion-related areas, especially in the insula, the ACC and the PFC, in military adults with acute acoustic trauma and consequent tinnitus. In addition, Vanneste et al. (2011) examined the differences of the neural network between tinnitus of recent onset and chronic tinnitus. Their results indicate that the neural structures detected in both acute and chronic tinnitus were identical (comprising auditory cortices, insula, dorsal ACC and premotor cortex) but they also revealed different activity and connectivity patterns within this network.

**AC vs. PTC over 1 week.**

*\*p* < *0.05 after FWE correction.*

In line with the previous findings of Argstatter et al. (2012) as well as of Grapp et al. (2013), a significant clinical improvement by the Heidelberg Model of music therapy was quantified using TQ. Thus, the neuro-music therapy approach according to the "Heidelberg Model" seems to provide an effective treatment option for patients with acute tinnitus if initial medical treatment fails to induce remediation. In these studies, both a significant improvement in subjectively perceived tinnitus distress and GM changes were evident immediately after the treatment. The improvements in tinnitus distress not only concerned the patients' cognitive and emotional strategies dealing with tinnitus, but also its intrusiveness and subsequent auditory perception difficulties. Compared to most other therapy options for tinnitus patients, the Heidelberg Model of Music Therapy goes far beyond a pure symptom management. At the core of this treatment approach, the patients are "confronted" actively with their individual tinnitus sounds and are instructed to deal with their tinnitus explicitly instead of trying to ignore it.

**Table 4 | Separation of tinnitus-related from therapy-related GM alteration (\**p* < 0.05 after FWE correction) by calculating contrast TG > AC within mask of general therapy effect (TG > PTC).**

In the present study, results revealed that, consistent with the clinical effects of music therapy, GM increased substantially in treated patients (TG) as well as in active controls (AC) compared to untreated patients (PTC). Both TG and AC experienced the same exercises and therapeutic sessions. However, GM increase in treated patients covered more brain areas and yielded higher effect sizes compared to the AC. One may speculate from these findings that healthy controls did not profit from the therapeutic interventions to the same extent as tinnitus patients. While tinnitus distress is treated both by relaxation techniques and by frequency discrimination exercises, healthy subjects probably experienced these approaches as "mental wellness" only, owing to their general influence on the distress network (De Ridder et al., 2013). This may help to explain the different effect sizes despite an equal training schedule. However, this difference was expected due to a specificity of therapeutic effects on patients suffering from tinnitus (compensation view).

The tinnitus-related structural effects, or the therapy-induced GM alterations, respectively, could be consistently located on areas that are considered to be most sensitive for tinnitus-related distress (Leaver et al., 2012; De Ridder et al., 2013; Schecklmann et al., 2013). However, findings on the direction of the structural therapy effects revealed in our study were not in line with previous findings on tinnitus distress: Whereas mental tinnitus load had been previously associated with GM loss in Heschl's gyri and in dorsomedial frontal location, improvement by music therapy intervention resulted instead in GM increase in these areas.

Most probably music therapy was able to influence and reinforce auditory sensation of those frequencies that were disrupted by a partial hearing impairment. Many patients reported a variability of tinnitus pitch in the course of therapy sessions and lower tinnitus loudness after the therapy week (Hutter et al., 2014). As this partial hearing loss is considered to cause the phantom noise (Lanting et al., 2009), exercises of frequency discrimination in the spectral range of tinnitus might be involved in the reduction of distress and loudness. Although a rapid direct cochlear regeneration can be deemed implausible, further compensation strategies using overtone or envelope characteristics of musical harmonics can be trained to enhance signal extraction for auditory processing. A higher perceptual efficiency regarding the defective frequencies can then more strongly activate those areas within auditory cortex that formerly sustained a loss of GM due to lack of signal. More neuronal activation in turn modulates reconstruction processes in the neuronal network, for example, by a down-regulation of restrictive factors for neuronal contact in the peri-neuronal net (Wang and Fawcett, 2012). Although VBM is not able to directly depict cellular activity, one may assume that these regeneration processes are similar to re-innervation mechanisms, involving the peri-neuronal net proteins (Kwok et al., 2011). Results from training studies in a mouse model indicate a subsequent growth of synaptic bulk accompanied by a dilatation of the neuronal network on tissue level over several days (Kleim et al., 2004). On a macroscopic level, this tissue augmentation can be detected as a rapid increase of GM density by structural MRI (Warraich and Kleim, 2010). Driemeyer et al. (2008) also underlined these temporal dynamics of structural plasticity by training in humans. They observed a major increase of GM as early as after 7 days during a continuous motor coordination training task. Recent studies even report a reduction of training-induced cortical reorganization despite ongoing exercises over longer observation times (Tennant et al., 2012). This also may explain our results of striking GM alterations due to the compact therapy over 1 week.

In contrast to bilateral results of Schecklmann et al. (2013), the music therapy influenced the right auditory areas only at the reported statistical level. This lateralization may be based on functional lateralization in auditory processing (Warrier et al., 2009), indicating that the right Heschl's gyrus might be more involved in spectral-related acoustic information. In general, the left primary auditory cortex is more active in right-handed subjects, but it shows more sensitivity to temporal stimulus variation compared to frequency variation (Izumi et al., 2011). However, a frequency discrimination task requires more involvement from the right auditory cortex (Doeller et al., 2003). Thus, the Heidelberg Model of music therapy comprising exercises of frequency discrimination in the impaired spectral range was able to specifically repair the tinnitus-related GM loss in the right Heschl's gyrus.

#### **LIMITATIONS**

Limitations of the study should be discussed. The TQ scores of included tinnitus patients ranged from 7 to 67 with an average score of 37.3 ± 16. This value corresponds to mild or moderate tinnitus-related distress. Further, only patients with general hearing impairment less than 40 dB were included for the present. Therefore, therapy success in severe cases cannot be predicted by this study.

Another limitation generally concerns the measurement of gray matter alterations by SPM and VBM. Due to the use of nonlinear deformation, there is some residual imprecision in the overlap of gyri and sulci between individual brains. Although we calculated with repeated intra-subject measures, the assignment of individual contrasts to the standard space must be critically regarded. The relative quality of the DARTEL normalization used in the study has been compared with several other methods by Klein et al. (2009), resulting in an acceptable rating.

Further limitations are related to possible interpretations of VBM contrasts indicating a shifted probability of focal GM or WM proportion. It is hard to decide whether its origin may be found in some growth within certain brain structures or slightly shifted segmentation results due to certain tissue alteration.

#### **CONCLUSION**

The Heidelberg Model of Music Therapy was able to reveal both rapid clinical improvements related to tinnitus distress and evidence of this specific therapeutic effect on brain areas suspected to play a role in sustaining tinnitus-related distress. When taking into account that the Heidelberg Model of Music Therapy has been shown to provide long-lasting effects (Argstatter et al., 2012), the observed structural brain plasticity can be assumed to be causative. Due to the rapid intervention in acute tinnitus, this therapy may be able to prevent tinnitus from becoming chronic (Grapp et al., 2013).

#### **ACKNOWLEDGMENTS**

Financial support: The study was supported by KTS Klaus Tschira Stiftung gGmbH. Many thanks to Sandra Dörrenbächer and Dr. Carrie Ankerstein for stylistic and linguistic improvement of this paper.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 September 2014; accepted: 04 February 2015; published online: 19 February 2015.*

*Citation: Krick CM, Grapp M, Daneshvar-Talebi J, Reith W, Plinkert PK and Bolay HV (2015) Cortical reorganization in recent-onset tinnitus patients by the Heidelberg Model of Music Therapy. Front. Neurosci. 9:49. doi: 10.3389/fnins.2015.00049*

*This article was submitted to Auditory Cognitive Neuroscience, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2015 Krick, Grapp, Daneshvar-Talebi, Reith, Plinkert and Bolay. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The effect of sung speech on socio-communicative responsiveness in children with autism spectrum disorders

Arkoprovo Paul<sup>1</sup>, Megha Sharda<sup>1,2</sup>, Soumini Menon<sup>3</sup>, Iti Arora<sup>3</sup>, Nayantara Kansal<sup>3</sup>, Kavita Arora<sup>3</sup> and Nandini C. Singh<sup>1</sup>\*

*<sup>1</sup> National Brain Research Centre, Gurgaon, India, <sup>2</sup> International Laboratory of Brain, Music and Sound Research (BRAMS), University of Montreal, Montreal, QC, Canada, <sup>3</sup> Children First Mental Health Institute, New Delhi, India*

There is emerging evidence to demonstrate the efficacy of music-based interventions for improving social functioning in children with Autism Spectrum Disorders (ASD). While this evidence lends some support in favor of using song over spoken directives in facilitating engagement and receptive intervention in ASD, there has been little research that has investigated the efficacy of such stimuli on socio-communicative responsiveness measures. Here, we present preliminary results from a pilot study which tested whether sung instruction, as compared to spoken directives, could elicit a greater number of socio-communicative behaviors in young children with ASD. Using an adapted single-subject design, three children between the ages of 3 and 4 years participated in a programme consisting of 18 sessions, of which 9 were delivered with spoken directives and 9 with sung directives. Sessions were counterbalanced and randomized for three play activities: block matching, picture matching and clay play. All sessions were video-recorded for *post-hoc* observational coding of three behavioral metrics, which included performance, frequency of social gesture and eye contact. Analysis of the videos by two independent raters indicated increased socio-communicative responsiveness in terms of frequency of social gesture as well as eye contact during sung compared to spoken conditions, across all participants. Our findings suggest that sung directives may play a useful role in engaging children with ASD and also serve as an effective interventional medium to enhance socio-communicative responsiveness.

Edited by:

*Julian O'Kelly, Royal Hospital for Neuro-disability, UK*

#### Reviewed by:

*Pamela Heaton, Goldsmiths University of London, UK Kate Simpson, Griffith University, Australia*

#### \*Correspondence:

*Nandini C. Singh, National Brain Research Centre, NH 8, Manesar, Gurgaon 122 050, Haryana, India nandini@nbrc.ac.in*

Received: *01 April 2015* Accepted: *22 September 2015* Published: *29 October 2015*

#### Citation:

*Paul A, Sharda M, Menon S, Arora I, Kansal N, Arora K and Singh NC (2015) The effect of sung speech on socio-communicative responsiveness in children with autism spectrum disorders. Front. Hum. Neurosci. 9:555. doi: 10.3389/fnhum.2015.00555*

Keywords: autism, socio-communicative responsiveness, song, joint attention, eye contact

## Introduction

Impairments in the socio-communicative domain are a hallmark feature of Autism Spectrum Disorders (ASD) (Kanner, 1943a; Zwaigenbaum et al., 2005; American Psychiatric Association, 2013). These impairments are reflected in difficulties with social orienting, the understanding and use of social gestures, gaze following, eye contact, and imitation, as well as with the capacity to initiate and/or respond to joint attention. An extensive body of research has established these early emerging social behaviors as important building blocks for a typical developmental trajectory (Mundy et al., 1990; Charman et al., 2003). More specifically, these behaviors are critical in initiating and maintaining social relationships and verbal language development. A number of studies have shown that there are significant challenges in the development of skills associated with these socio-communicative behaviors in children with autism (Mundy et al., 1986; Charman et al., 1997; Dawson et al., 1998, 2004; Lozier et al., 2014). Consequently, such behaviors are important targets for early intervention in children with ASD (Warreyn et al., 2005; Jones et al., 2006; Leekam and Ramsden, 2006; Whalen et al., 2006).

An emerging practice for targeting socio-communicative impairments in ASD is the use of music- and song-based interventions (Lim, 2010; Wan et al., 2011; Simpson et al., 2013). Historically, the use of music has always been associated with increased engagement and a preserved domain of functioning. In the earliest scientific account, Kanner had noted the exceptional musical capacity of children with autism (Kanner, 1943a,b). Subsequent investigations further confirmed that children with autism showed a preference for musical stimuli (Thaut, 1988; Buday, 1995). These were also accompanied by numerous anecdotal reports that described the unique and profound effect music has on children with autism (Sacks, 2007). Other studies of musical abilities have demonstrated enhanced skills such as perfect pitch and good melodic memory in children with ASD (Heaton et al., 2008; Molnar-Szakacs and Heaton, 2012; Ouimet et al., 2012). In the domain of affect and music, Heaton et al. (1998) showed that children with autism had a good understanding of the affective implications of musical mode and were able to pair happy and sad faces with excerpts of music in major and minor keys, suggesting that the inability to identify emotions in social stimuli like faces, did not apply to the musical domain. It is also important to note that musical preferences in individuals with autism develop early in life (Allen et al., 2009) and responsiveness to music is found to remain preserved in adults on the autism spectrum, though it is often underestimated due to their reduced ability to articulate it (Allen et al., 2013). However, integrated reviews of the literature on music therapy (MT) interventions have consistently noted music's potential to support the social and affective development of young children with autism (Whipple, 2004; Kaplan and Steele, 2005; Gold et al., 2006; Accordino et al., 2007; Simpson and Keen, 2011).

Recently, studies from the neuroimaging domain have also provided compelling biological evidence showing preserved neural activity for music processing in children with ASD (Lai et al., 2012; Sharda et al., 2015). For instance, a neuroimaging study by Caria et al. (2011) showed that individuals with ASD recruit regions involved in emotion and reward processing while listening to happy and sad musical excerpts, similar to neurotypical controls. In addition, two studies (Lai et al., 2012; Sharda et al., 2015) showed that brain regions with decreased activation during speech stimulation in ASD vs. controls showed greater activation during song stimulation. The study by Sharda et al. (2015) also demonstrated that fronto-temporal connectivity in the brain remains intact during perception of sung but not spoken words in children with ASD. These findings provide robust neurobiological support for the use of music and song stimuli for therapeutic purposes and suggest that the sung stimulus might be a powerful medium to engage a child with ASD.

Since ASD presents a unique condition where sociocommunicative impairments and enhanced music perceptual abilities coexist, clinicians have often attempted to capitalize on the musical strengths of individuals to compensate for their social difficulties (Alvin, 1978; Alvin and Warwick, 1992; Vaiouli et al., 2015). Recently, MT has been classified as an emerging evidence-based practice, useful in teaching individual skills or goals through the use of specific musical components, such as songs, rhythm, and movement (Geretsegger et al., 2014; Thaut et al., 2015). Although MT has long been used for rehabilitation of neurological disorders (Wan et al., 2010b) and cognitive development (Paul et al., 2012), the validation of its potential to improve social, cognitive and motor skills in individuals with autism (American Music Therapy Association, 1999, 2003; Kaplan and Steele, 2005; Molnar-Szakacs and Heaton, 2012) is still at an early stage. The literature pertaining to the use of music as an interventional medium in ASD has focused predominantly on socio-communicative behaviors, with music being consistently used to explore the development of social skills in children (Duffy and Fuller, 2000; Finnigan and Starr, 2010). More recently, a novel music intervention based on auditory-motor mapping has been developed to aid expressive language development for non-verbal children with ASD (Wan et al., 2011). Another study (Simpson et al., 2015) compared infant-directed speech with infant-directed song in terms of engagement and learning outcomes, using spoken and sung conditions embedded in a computer-based communication intervention developed to teach receptive labeling to children with autism.
Taken together, the above studies provide both behavioral and neurobiological motivation for the use of music, especially song, as an effective tool for improving sociocommunicative responsiveness in individuals with autism (Gold et al., 2006; Simpson et al., 2013; Geretsegger et al., 2014).

Based on this premise, the aim of the current study was to further investigate the effects of singing on sociocommunicative responsiveness in children with ASD. More specifically, the efficacy of sung directives in improving eye contact and social responsiveness in children with ASD was studied, and the potential of intoned vocalizations and singing as an interventional medium, suited for the clinic and easily adaptable for home and classroom settings, was examined. In contrast to previous cross-over or group level designs, we employed an adapted single subject research design to control for within subject variability (Barlow and Hayes, 1979; Barlow and Herson, 1984; Scruggs et al., 1987; Horner et al., 2005; Kennedy, 2005). The main goal was to assess the efficacy of song as a medium of intervention in ASD, given its intrinsic motivational value. We hypothesized that sung instructions may act as a communicative scaffold for children with ASD and consequently be more engaging and elicit a greater number of socially responsive behaviors in participants, as compared to spoken directives.

## Methods

### Participants

Three children, all boys (mean age = 3.36 years, SD = 0.21) participated in this study. All three children were diagnosed using the Diagnostic and Statistical Manual of Mental Disorders-5 (DSM 5, American Psychiatric Association, 2013) and International Classification of Diseases-10 (ICD 10, World Health Organization, 1992) criteria by experienced medical professionals. Standard assessment measures including the Social Responsiveness Scale (SRS 2, Constantino and Gruber, 2012), Childhood Autism Rating Scale (CARS II, Schopler et al., 1980), parental reports and direct child observations were used to confirm the diagnoses. The Vineland Adaptive Behavior Scale (VABS II, Sparrow et al., 2005) was administered to assess adaptive behavior and socio-communicative skills. Detailed demographics are provided in **Table 1**. To be eligible for participation in the study, the participants had to be (1) formally diagnosed with ASD by a practicing physician, (2) chronologically aged between 3 and 5 years, (3) able to participate in the 18 sessions of the research programme, and (4) without any other comorbid neurological or psychiatric diagnosis. The participants were selected based on parental consultation and informed consent procedures approved by the Institutional Ethics Committee. Detailed information for each child is provided below.

#### Child A

Child A was 3 years 4 months old when the study commenced. He had a small repertoire of words, such as "hello" and the names of objects and people, which he could use in 2–3 word combinations for social greetings and need-based communication. He was a socially-oriented child with evident joint attention in high to moderate interest activities. He also displayed delayed echolalia and template language, and was quick to adapt to repetitive routines and patterns in play and social interaction. Child A was the highest functioning of the three participants, with a CARS score of 41 (mild to moderate) and an SRS score of 75. He often showed neutral affect and occasionally produced echolalic words or phrases.

#### Child B

Child B was 3 years 7 months old at the start of the programme. He had extremely low functional language skills and frequently uttered vocalizations without any apparent communicative intent. Child B had a CARS score of 52 (severe). He displayed low joint attention in social interactions and often avoided eye contact. He engaged primarily in solitary play. He had sensory processing difficulties and emotional dysregulation, which often manifested in disruptive behaviors. His awareness of self and others was low, and his sitting tolerance and ability to attend to table-top activities were limited.

#### Child C

Child C was 3 years 2 months old when the study started. He was averbal at the beginning of the study and showed minimal to no signs of communicative intent via verbal or vocal modalities. Child C was severely affected by autism with a CARS score of 53. He usually maintained a neutral disposition and avoided eye contact. He was reported to be a child who tended to remain in a world of his own and did not show interest in any social activity. His responses could occasionally be elicited in a therapeutic setting by high interest sensory routines.

#### TABLE 1 | Behavioral profile and standardized test scores for all participants.

*Summary of behavioral profile of the participants.*

### Procedure

The study used an adapted single subject research design of AB type (Barlow and Hayes, 1979; Barlow and Herson, 1984; Scruggs et al., 1987; Horner et al., 2005; Kennedy, 2005). In a single subject design participants serve as their own controls. Because individuals with ASD vary widely in their behavioral profiles, this design was preferred in order to account for within-subject variability across the two conditions: (A) spoken directives (considered the baseline condition) and (B) sung directives (considered the treatment condition). This was a "proof of concept" study to test our hypothesis that song may be more effective than spoken directives in acting as a communicative scaffold and enhancing socio-communicative responsiveness in young children with ASD.

### Programme

The programme consisted of 18 sessions over a period of 3 months for each child. Each session consisted of a (A) spoken or (B) sung condition. Three activities were used for all sessions. Each activity was used in both sung and spoken conditions. Each session was of 3–4 min duration per condition. Similar directives such as "Hello," "Look at me," "Let's match pictures," "Let's play with blocks" etc. were used in all sessions. Both spoken and sung sessions contained similar semantic content and only differed in the intonation of the directives. This was to ensure that any differences in behavior could be attributed to the musical nature of the directives used. Representative spectrograms for sample stimuli [refer to **Supplementary Audio Clips S1, S2** for (A) spoken and (B) sung conditions] are attached in **Supplementary Figure S1** to further illustrate this. Every child took part in 9 sessions with spoken directives and 9 sessions with sung directives, counterbalanced and randomized for the three play activities such as block matching, picture matching and clay play. All conditions and activities were further randomized to account for day-to-day variability in each child's performance.
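The allocation described above (9 spoken and 9 sung sessions spread over three play activities, in randomized order) can be illustrated with a minimal sketch. It assumes that each condition-activity pair occurs exactly three times, which the text does not state explicitly; the names `build_schedule`, `REPEATS`, and the seed value are hypothetical and are not part of the study materials.

```python
import random
from collections import Counter

CONDITIONS = ("spoken", "sung")
ACTIVITIES = ("block matching", "picture matching", "clay play")
REPEATS = 3  # assumption: 3 sessions per condition-activity pair, i.e. 9 per condition


def build_schedule(seed=None):
    """Return a randomly ordered list of 18 (condition, activity) sessions."""
    sessions = [
        (condition, activity)
        for condition in CONDITIONS
        for activity in ACTIVITIES
        for _ in range(REPEATS)
    ]
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(sessions)
    return sessions


schedule = build_schedule(seed=1)
per_condition = Counter(condition for condition, _ in schedule)
per_pair = Counter(schedule)
```

Whatever the seed, `per_condition` holds 9 sessions for each condition and `per_pair` holds 3 sessions for each of the six condition-activity pairs; only the order changes.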

The activities were chosen as the preferred play activities as reported by the therapists. The materials used for these play activities consisted of colored wooden blocks of different shapes, picture matching board games and synthetic modeling clay. Each session took place in a secluded room at the intervention clinic with the participant and the trained therapist seated across from each other at a table. A second caregiver videotaped the sessions from an adjacent position to the participant using a video camera and played no role in conducting the sessions. The therapist delivered the spoken and sung directives during the spoken and sung sessions, respectively while engaging the child in play activities. All sessions were video-recorded for posthoc observational coding of three behavioral metrics including performance, frequency of social gesture, and eye contact as described below.

### Independent Variables

The study examined the participant's socio-communicative responsiveness within two experimental conditions: (A) the baseline spoken directive condition and (B) the treatment sung directive condition. In both conditions each participant was presented with bids within a play context by the therapist with the goal of having the child respond in a socially appropriate manner. The participant-therapist interaction took place in the following format: (1) the therapist would greet and/or present a preferred play material and initiate a communicative bid; (2) the participant was expected to respond; and (3) the participant's response, if correct/appropriate, would be reinforced by applause. The only difference between (A) the baseline spoken and (B) the treatment sung condition was the intonation of directives used by the therapist as illustrated in the **Supplementary Figure S1**; while all other elements of the session such as the semantic content, session structure, and settings were unaltered.

### Dependent Variables

Several dependent variables were operationally defined in order to characterize the participants' response to the experimenter's communicative bids. (1) "Performance" on each session was measured as a percentage of correct responses with respect to the total number of instructional directives presented to the child during that session. This measure was used as a nonsocial measure of responsiveness to assess the participants' overall performance and comprehension abilities associated with each play session. Socio-communicative responsiveness was measured using two distinct behaviors–social gesture and eye contact. (2) "Social gesture" was defined as the child's physical response to social greeting such as "hi five" and was measured as a percentage of instances of such social touch with respect to the total number of opportunities received from the experimenter. (3) "Eye contact" was measured as the percentage of frequency of eye contact made by the child with respect to the total number of occurrences of name calling by the experimenter. All these measures were evaluated using videos for each session by a trained rater using a custom-made rating scheme (Hooker, 2013).
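All three measures share the same form: the number of target responses divided by the number of opportunities offered, expressed as a percentage. The sketch below illustrates that calculation only; the helper name `percent` and the tallies are hypothetical and are not data from the study.

```python
def percent(responses, opportunities):
    """Percentage of opportunities that drew the target response."""
    if opportunities <= 0:
        raise ValueError("at least one opportunity is required")
    if not 0 <= responses <= opportunities:
        raise ValueError("responses must lie between 0 and opportunities")
    return 100.0 * responses / opportunities


# Hypothetical tallies for a single session (illustration only).
session = {
    "performance": (7, 10),    # correct responses / instructional directives
    "social_gesture": (3, 4),  # "hi five" responses / greeting opportunities
    "eye_contact": (2, 5),     # eye contacts / instances of name calling
}
scores = {name: percent(hits, total) for name, (hits, total) in session.items()}
```

With these made-up tallies, `scores` holds 70.0 for performance, 75.0 for social gesture, and 40.0 for eye contact.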

### Reliability

An independent second rater trained in behavioral coding but blind to the purpose of the study rated 30% of the video recordings. These videotapes were randomly selected and the three different behavioral measures defined above were coded from each video. Cohen's Kappa value was calculated for all three behavioral measures to assess the inter-rater reliability. There was substantial agreement on looking behavior (kappa = 0.69) and social gesture (kappa = 0.70) whereas the kappa value on performance (kappa = 0.82) represented almost perfect agreement (Viera and Garrett, 2005). Only the behavioral measures recorded by the primary rater were used for data analysis.
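For readers unfamiliar with Cohen's Kappa, the statistic compares the observed agreement between two raters with the agreement expected by chance from each rater's label frequencies. The sketch below is a generic implementation, not the authors' analysis script. It assumes event-level categorical codes (for example, eye contact present or absent after each name call), a unit of coding that the article does not specify, and the rating lists are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for ten name-calling events (1 = eye contact, 0 = none).
primary = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
second = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
kappa = cohens_kappa(primary, second)
```

In this made-up example the raters agree on 8 of 10 events (observed agreement 0.80) while chance agreement is 0.52, giving a kappa of about 0.58.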

## Results

The results of measured behaviors for all three participants in (A) baseline spoken vs. (B) treatment sung conditions are shown in **Figure 1**. All participants scored higher in the treatment (sung) condition compared to the baseline (spoken) on all measures including performance, social gesture and eye contact. Child A performed much better in sung condition with a mean of ∼78% correct responses to instructional directives when compared to 48% mean correct responses for spoken sessions. Child B was a little lower on accuracy with a mean of ∼52% in the spoken condition and a mean of 42% in the sung condition. The performance of Child C was quite low and comparable across both spoken and sung conditions (with a mean of 33% in spoken and a mean of 31% in sung condition).

The data also indicate that there was a trend of enhanced responsiveness to social gestures in the sung condition, as compared to the baseline spoken condition, for all three participants. Child A responded to social gestures with a mean of 77% in spoken and ∼89% in sung conditions. Child B showed ceiling effects with a very high level of responses in this behavioral category, both in spoken (a mean of 91%) and sung (a mean of 100%) conditions. Finally, Child C also showed a similar pattern, with lower responses in spoken (mean of 41%) compared to sung (mean of 60%) conditions, although the variability in the spoken condition (SD = 33%) was very high.

A similar trend of increased frequency of eye contact in response to name calling across the sung sessions was observed. Child A made an average of 38% eye contact in the spoken sessions compared to a mean of 62% in sung sessions. Child B responded with a mean of 7.5% eye contact in the spoken condition as compared to a mean of ∼34% in the sung condition, though with high variability (SD = 20). Child C also showed an increase in eye contact from a mean of 24% in the spoken condition to a mean of 33% in the sung condition. Overall, the observational analysis of the videos indicated increased socio-communicative responsiveness, in terms of both frequency of social gesture and eye contact, during the sung as compared to the spoken condition, across all 3 participants.

FIGURE 1 | Measured behaviors for all three participants in (A) baseline spoken vs. (B) treatment sung conditions, showing an overall increase in sung sessions compared to baseline spoken conditions.

Nevertheless, there was a high degree of variability in the data, revealed by the trajectory of performance for all the participants across 18 sessions, comprising 9 spoken sessions and 9 sung sessions (**Figure 2**). Visual inspection was used to examine changes in measured behavior as it is considered to be the most appropriate and most commonly used method of analysis in single-subject design research (Horner et al., 2005; Kennedy, 2005). For all participants, the scores for all measures in the sung sessions were greater than (or equal to) those in the spoken sessions in the majority of sessions: performance (Child A, 7 out of 9 sessions; Child B, 7 out of 9 sessions; Child C, 7 out of 9 sessions), social gesture (Child A, 8 out of 9 sessions; Child B, all sessions; Child C, all sessions), and eye contact (Child A, 6 out of 9 sessions; Child B, 6 out of 9 sessions; Child C, 6 out of 9 sessions). Moreover, the means of all three behavioral measures across the sessions revealed an overall increase in sung sessions compared to baseline spoken conditions (**Figure 1**).

To further characterize the profile of participants and assess responsiveness to sung vs. spoken directives, VABS socialization and communication domain scores and SRS social communication and interaction (SCI) scores for each child were compared with their overall "responsiveness to sung words" for performance, social gesture and eye contact measures (**Figure 3**). This measure of responsiveness was calculated as a difference score: [(sung – spoken)/(sung + spoken)] for all three measures. Child B, with the higher standardized test score in VABS socialization and SRS SCI domains, showed an increased responsiveness to sung directives as reflected by the difference score for socio-communicative responsiveness in comparison with the other two participants. Interestingly, Child C, who had comparatively lower standardized test scores in VABS socialization and communication and SRS SCI domains, also showed comparable responsiveness to sung directives for social gesture, eye contact, and performance.
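The difference score defined above is a normalized contrast: for non-negative scores it is bounded between −1 and 1, equals 0 when both conditions yield the same score, and is positive when the sung score is higher. A minimal sketch follows, using condition means quoted in the Results text as inputs; the function name `difference_score` is hypothetical.

```python
def difference_score(sung, spoken):
    """Normalized sung-vs-spoken contrast: (sung - spoken) / (sung + spoken)."""
    total = sung + spoken
    if total == 0:
        raise ValueError("undefined when both condition scores are zero")
    return (sung - spoken) / total


# Condition means (in percent) quoted in the Results text.
child_a_eye_contact = difference_score(sung=62, spoken=38)
child_b_social_gesture = difference_score(sung=100, spoken=91)
```

Child A's eye contact means give (62 − 38)/(62 + 38) = 0.24. Child B's social gesture means give 9/191, roughly 0.05 from the rounded means, which is close to the 0.04 reported in the Discussion; the small gap presumably reflects rounding of the published means.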

## Discussion

Findings from the current study indicate the effectiveness of using singing and song-based directives in improving sociocommunicative responsiveness of young children with ASD. Such song-based directives can be implemented as a medium of communication in interventional programmes at home, in clinics, and in school-based settings to facilitate communication and interaction between individuals with ASD and their parents and caregivers, and to help build upon their socio-communicative development.

Previous literature on ASD has shown that behaviors such as coordinated eye contact, joint attention (Mundy and Crowson, 1997; Warreyn et al., 2005; Whalen et al., 2006) and dyadic orienting (Leekam and Ramsden, 2006; Koegel et al., 2009a) are important precursors for communication and socialization. In the current study, we were able to evoke social behaviors using sung directives, which may serve as a simple albeit effective interventional medium to enhance social interaction and communication in children with ASD. Our findings show that singing-based directives not only improved sociocommunicative behaviors such as social gesture ("hi five") and eye contact, but also improved non-social behaviors such as performance on a play activity. This suggests that song may not only be engaging, but may also provide a communicative scaffold for children with ASD and help in the development of their social skills. It further suggests that sung speech may play an important role for children with ASD by engaging them in interactive play activities and increasing attention, compliance, and sociocommunicative skills. The findings from our study corroborate the results obtained in previous research that has used song as a tool for increasing social skills in children with autism (Stevens and Clark, 1969; Buday, 1995; Brownell, 2002; Pasiali, 2004; Kern and Aldridge, 2006; Finnigan and Starr, 2010).

FIGURE 2 | Session-wise scores for the measured behaviors (including eye contact in response to name calling) across 9 sung (red) and 9 spoken (blue) sessions for all 3 activities, randomized and counterbalanced across 18 sessions, for Child A. The lower panels show the same for Child B and Child C, respectively. For all participants, the scores for all measures in the sung sessions were greater than (or equal to) the spoken sessions: performance (Child A, 7 out of 9 sessions; Child B, 7 out of 9 sessions; Child C, 7 out of 9 sessions), social gesture (Child A, 8 out of 9 sessions; Child B, all sessions; Child C, all sessions), and eye contact (Child A, 6 out of 9 sessions; Child B, 6 out of 9 sessions; Child C, 6 out of 9 sessions).

Since ASD is conceptualized largely as a disorder of social impairment leading to delay in communication and other developmental milestones (Garfin and Lord, 1986; Koegel et al., 1992), most standard therapeutic interventions in ASD aim to enhance the development of these delayed skills. However, to learn any skill which is not driven by innate motivation, the child is required to engage with the therapist who leads the intervention. This in itself has been and remains an obstacle facing many therapeutic approaches.

Considering the rehabilitative potential of music therapies in facilitating neural plasticity as well as their intrinsic reward value, recent research in neuroscience has provided a robust biomedical perspective for clinical investigation of music therapies in various populations with psychiatric disorders. However, to date only a few studies have attempted to translate neuroimaging findings into a behavioral context and measure the efficacy of such interventions (e.g., Wan et al., 2010a, 2011; Wan and Schlaug, 2010). For instance, Wan et al. (2011) tested the efficacy of music making on expressive communication in non-verbal children with ASD, using a novel method called Auditory-Motor Mapping Training. This was motivated by previous neuroimaging studies that had suggested that the mirror neuron system responsible for imitative behaviors is implicated in ASD (Hadjikhani et al., 2006). In contrast to the Wan et al. (2011) study, which focused on expressive communication and speech output in non-verbal children with ASD, our current investigation emphasized the use of spoken and sung conditions in the receptive domain, particularly socio-communicative responsiveness, contingent on engagement and motivation of the participants. This study was motivated by recent neuroimaging research which showed that neural pathways are preserved for sung word perception in children with ASD (Sharda et al., 2015) and was a direct follow-up from its findings. As noted earlier, another independent study found that the neural networks for song processing remain intact and are more effectively engaged by song than by spoken words in the autistic brain (Lai et al., 2012). The findings from our current behavioral study reaffirm such neurophysiological explanations for enhanced behavioral response to sung directives as compared to spoken instructions.

FIGURE 3 | Responsiveness to sung directives is defined as the "difference score" [(sung − spoken)/(sung + spoken)]. The difference scores for performance and socio-communicative responsiveness such as social gesture and eye contact (shown in blue) are plotted against socio-communicative skills or standardized test scores such as VABS and SRS (shown in yellow) for all three participants. Child B, with the higher standardized test score in VABS socialization and SRS SCI domains, showed an increased responsiveness to sung directives as reflected by the difference score for socio-communicative responsiveness in comparison with the other two participants. Child C, who had comparatively lower standardized test scores in VABS socialization and communication and SRS SCI domains, also showed comparable responsiveness to sung directives for social gesture, eye contact and performance. VABS, Vineland Adaptive Behavior Scale (subscales: Soc, socialization; Com, communication). SRS, Social Responsiveness Scale (subscales: SCI, social communication and interaction T score; RRB, Restricted Interests and Repetitive Behavior T score).

Future studies exploring the potential of song-based interventions could benefit from building upon the findings from this study. Despite the potential of our findings, there were some limitations of this study. Specifically, despite being a powerful design to conduct preliminary studies, a single-case design in which each participant acts as his own control (Barlow and Hayes, 1979; Barlow and Herson, 1984; Scruggs et al., 1987; Horner et al., 2005; Kennedy, 2005) cannot account for the generalization of results to other settings such as home, classroom or community. Therefore, it is not known whether these improvements would generalize and skills would transfer to other domains, since generalization to a new situation is of particular difficulty for children with autism (Jordan and Powell, 1995). Secondly, there was considerable variability in the data collected for each condition (**Figure 1**). Consequently, the treatment sung condition did not show any stable trend within the duration of the program, which might be due to the participants' volatility and other factors (**Figure 2**). Thirdly, since there was no clear order of ability between child A, B, and C, any trends that were observed were hard to interpret and depended on the measure (VABS vs. SRS) used (**Figure 3**). Therefore, it was difficult to make any generalized conclusions regarding the relationship between overall socio-communicative functioning and responsiveness to sung stimuli. Additionally, child B showed ceiling effects with very high responses particularly in the social gesture behavioral category (91% in spoken and near 100% in sung conditions), which was reflected in the low difference score of 0.04. A larger sample would help to clarify the situation in future research and lead to more statistically robust findings as indicated in previous music intervention studies (Geretsegger et al., 2014).

Future studies could replicate the current findings using larger samples to establish the validity of song as a therapeutic interventional medium to improve social responsiveness and communication. An individualized strategy that uses preferred melodies holds promise, since preferred activities might provide a motivating and engaging context for children with autism (Koegel et al., 1987; Koegel and Koegel, 2006). In addition, future studies might also focus on determining which acoustic or musical elements of singing, such as pitch, rhythmic pattern or tempo, are most salient in evoking a differential response from children with ASD. Overall, the present study provides further empirical support for the anecdotal claims that children with autism tend to be more engaged by music and songs than by speech. Further explorations in this direction could lead to the development of song as a simple and effective interventional tool for children with ASD.

## Acknowledgments

We would like to acknowledge and thank Ms. T. A. Sumathi for assistance in making the spectrographs of speech stimuli and Ms. Sandra Jose for independent rating of the recorded videos.

We would like to thank all of the children and parents who participated in this project and the staff and clinicians at Children First Mental Health Institute for assistance in data acquisition. We would also like to thank National Brain Research Centre and Department of Science and Technology for supporting the research project with generous funding.

## Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnhum.2015.00555

Supplementary Figure S1 | Representative spectrograms for spoken vs. sung directives. Spectrographic representations of spoken vs. sung directive ("Hi five!") reflect the similarity of content in overall structure but differences in spectral distribution, in particular, the increased tonality of the sung as compared to the spoken directive.

Supplementary Audio Clips S1, S2 | Representative audio clips for verbal directive in spoken and sung conditions. Audio clips of verbal directive "Hi five!" in spoken condition (S1) and sung condition (S2).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Paul, Sharda, Menon, Arora, Kansal, Arora and Singh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Musical neurofeedback for treating depression in elderly people

#### Rafael Ramirez<sup>1</sup>\*, Manel Palencia-Lefler<sup>2</sup>, Sergio Giraldo<sup>1</sup> and Zacharias Vamvakousis<sup>1</sup>

<sup>1</sup> Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain, <sup>2</sup> Department of Communication, Universitat Pompeu Fabra, Barcelona, Spain

We introduce a new neurofeedback approach, which allows users to manipulate expressive parameters in music performances using their emotional state, and we present the results of a pilot clinical experiment applying the approach to alleviate depression in elderly people. Ten adults (9 female and 1 male, mean = 84, SD = 5.8) with normal hearing participated in the neurofeedback study consisting of 10 sessions (2 sessions per week) of 15 min each. EEG data was acquired using the Emotiv EPOC EEG device. In all sessions, subjects were asked to sit in a comfortable chair facing two loudspeakers, to close their eyes, and to avoid moving during the experiment. Participants listened to music pieces preselected according to their music preferences, and were encouraged to increase the loudness and tempo of the pieces, based on their arousal and valence levels. The neurofeedback system was tuned so that increased arousal, computed as beta to alpha activity ratio in the frontal cortex corresponded to increased loudness, and increased valence, computed as relative frontal alpha activity in the right lobe compared to the left lobe, corresponded to increased tempo. Pre and post evaluation of six participants was performed using the BDI depression test, showing an average improvement of 17.2% (1.3) in their BDI scores at the end of the study. In addition, an analysis of the collected EEG data of the participants showed a significant decrease of relative alpha activity in their left frontal lobe (p = 0.00008), which may be interpreted as an improvement of their depression condition.

Keywords: music, neurofeedback, emotions, expressive performance, depression, electroencephalography, elderly patients
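The feedback mapping summarized in the abstract (arousal driving loudness, valence driving tempo) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the use of channels F3 and F4, the reading of "relative" right-vs-left alpha as a simple difference, the band-power values, and the calibration ranges passed to `scale` are all hypothetical.

```python
def arousal(frontal_alpha, frontal_beta):
    """Beta-to-alpha power ratio over the frontal channels."""
    return sum(frontal_beta.values()) / sum(frontal_alpha.values())


def valence(frontal_alpha, right="F4", left="F3"):
    """Right-minus-left frontal alpha power (assumed form of 'relative' activity)."""
    return frontal_alpha[right] - frontal_alpha[left]


def scale(value, low, high):
    """Clamp value to [low, high] and map it linearly onto [0, 1]."""
    clamped = min(max(value, low), high)
    return (clamped - low) / (high - low)


# Hypothetical band powers for one analysis window (arbitrary units).
alpha = {"F3": 4.0, "F4": 5.0}
beta = {"F3": 3.0, "F4": 6.0}

arousal_level = arousal(alpha, beta)  # (3 + 6) / (4 + 5)
valence_level = valence(alpha)        # 5 - 4

# Hypothetical calibration ranges; higher arousal raises loudness,
# higher valence raises tempo.
loudness_control = scale(arousal_level, low=0.5, high=1.5)
tempo_control = scale(valence_level, low=-2.0, high=2.0)
```

The two control values lie in [0, 1] and would still need to be mapped onto the playback engine's actual loudness and tempo ranges, which the abstract does not describe.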

#### Edited by:

Julian O'Kelly, Royal Hospital for Neuro-disability, UK

#### Reviewed by:

Lutz Jäncke, University of Zurich, Switzerland

Eric Miller, Montclair State University, USA

#### \*Correspondence:

Rafael Ramirez, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Roc Boronat 138, 08018 Barcelona, Spain

rafael.ramirez@upf.edu

#### Specialty section:

This article was submitted to Auditory Cognitive Neuroscience, a section of the journal Frontiers in Neuroscience

Received: 01 May 2015 Accepted: 17 September 2015 Published: 02 October 2015

#### Citation:

Ramirez R, Palencia-Lefler M, Giraldo S and Vamvakousis Z (2015) Musical neurofeedback for treating depression in elderly people. Front. Neurosci. 9:354. doi: 10.3389/fnins.2015.00354

## Introduction

There is ample literature reporting on the importance and benefits of music for older adults (Ruud, 1997; Cohen et al., 2002; McCaffrey, 2008). Some studies suggest that music contributes to positive aging by promoting self-esteem, feelings of competence and independence while diminishing the feelings of isolation (Hays and Minichiello, 2005). Listening to music appears to be rated as a very pleasant experience by older adults since it promotes relaxation, decreases anxiety, and distracts people from unpleasant experiences (Cutshall et al., 2007; Ziv et al., 2007; Fukui and Toyoshima, 2008). It can also evoke very strong feelings, both positive and negative, which very often result in physiological changes (Lundqvist et al., 2009). These positive effects seem to be also experienced by people with dementia (Särkämö et al., 2012, 2014; Hsu et al., 2015). All these findings have led many researchers to be interested in the contribution of music to the quality of life and life satisfaction of older people (Vanderak et al., 1983). Music activities (both passive and active) can affect older adults' perceptions of their quality of life, with participants valuing highly the non-musical dimensions of being involved in music activities such as physical, psychological, and social aspects (Coffman, 2002; Cohen-Mansfield et al., 2011). Music experiences, led by music therapists or by other caregivers, besides being a source of entertainment, seem to provide older people with the benefits mentioned above (Hays and Minichiello, 2005; Solé et al., 2010).

Music has been shown to be beneficial in patients with different medical conditions. Särkämö et al. (2010) demonstrated that, in stroke patients, merely listening to music and speech after neural damage can induce long-term plastic changes in early sensory processing, which, in turn, may facilitate the recovery of higher cognitive functions. The Cochrane review by Maratos et al. (2008) highlighted the potential benefits of music therapy for improving mood in those with depression. Erkkilä et al. (2011) showed that music therapy combined with standard care is effective for depression among working-age people. In their study, patients receiving music therapy plus standard care showed greater improvement in depression symptoms than those receiving standard care only.

Neurofeedback has been found to be effective in producing significant improvements in medical conditions such as depression (Kumano et al., 1996; Rosenfeld, 2000; Hammond, 2004), anxiety (Vanathy et al., 1998; Kerson et al., 2009), migraine (Walker, 2011), epilepsy (Swingle, 1998), attention deficit/hyperactivity disorder (Moriyama et al., 2012), alcoholism/substance abuse (Peniston and Kulkosky, 1990), and chronic pain (Jensen et al., 2007), among many others (Kropotov, 2009). For instance, Sterman (2000) reports that 82% of the most severe, uncontrolled epileptics demonstrated a significant reduction in seizure frequency, with an average reduction of 70% in seizures. The benefits of neurofeedback in this context were shown to lead to significant normalization of brain activity even when patients were asleep. The effectiveness of neurofeedback was validated against medication and placebo (Kotchoubey et al., 2001). Similarly, Monastra et al. (2002) found neurofeedback to be significantly more effective than Ritalin in changing ADD/ADHD, without patients having to remain on drugs. Other studies have found improvements comparable to those produced by Ritalin with 20 h of neurofeedback training (forty 30-min sessions) (Fuchs et al., 2003), and even after only twenty 30-min sessions of neurofeedback (Rossiter and La Vaque, 1995). In the context of depression treatment, several clinical protocols are used to apply neurofeedback: shifting the alpha predominance in the left hemisphere to the right by decreasing left-hemispheric alpha activity or increasing right-hemispheric alpha activity; shifting an asymmetry index toward the right in order to rebalance activation levels in favor of the left hemisphere; and reducing theta activity (4–8 Hz) in relation to beta activity (15–28 Hz) in the left prefrontal cortex (i.e., decreasing the theta/beta ratio in the left prefrontal cortex) (Gruzelier and Egner, 2005; Michael et al., 2005; Ali et al., 2015).
Dias and van Deusen (2011) applied a neurofeedback protocol that simultaneously trains alpha asymmetry and an increased beta/theta ratio in the left prefrontal cortex.

A still relatively new field of research in affective computing attempts to detect users' emotional states from electroencephalogram (EEG) data (Chanel et al., 2006). Alpha and beta wave activity may be used in different ways for detecting emotional (arousal and valence) states of mind in humans. For instance, Choppin (2000) proposes using EEG signals for classifying six emotions with neural networks. Choppin's approach characterizes valence, arousal and dominance from EEG signals. He characterizes positive emotions by a high frontal coherence in alpha and high right parietal beta power. Higher arousal (excitation) is characterized by a higher beta power and coherence in the parietal lobe, plus lower alpha activity, while dominance (strength) of an emotion is characterized as an increase in the beta/alpha activity ratio in the frontal lobe, plus an increase in beta activity in the parietal lobe. Ramirez and Vamvakousis (2012) characterize emotional states by computing arousal levels as the prefrontal cortex beta to alpha ratio and valence levels as the alpha asymmetry between lobes. They show that by applying machine learning techniques (support vector machines with different kernels) to the computed arousal and valence values it is possible to classify the user's emotion into high/low arousal and positive/negative valence emotional states, with average accuracies of 77.82% and 80.11%, respectively. These results show that the computed arousal and valence values do contain meaningful information about the user's emotional state.

In this paper we investigate the potential benefits of combining music (therapy), neurofeedback and emotion detection for improving elderly people's mental health. Specifically, our main goal is to investigate the emotional reinforcement capacity of automatic music neurofeedback systems, and its effects for improving depression in elderly people. With this aim, we propose a new neurofeedback approach, which allows users to manipulate expressive parameters in music performances using their emotional state. The users' instantaneous emotional state is characterized by a coordinate in the arousal-valence plane decoded from their EEG activity. The resulting coordinate is then used to change expressive aspects of music such as tempo, dynamics, and articulation. We present the results of a pilot clinical experiment applying our neurofeedback approach to a group of 10 elderly people with depression.

## Materials and Methods

## Participants

Ten adults (9 female and 1 male, mean age = 84, SD = 5.8) with normal hearing participated in the neurofeedback study, which consisted of 10 sessions (2 sessions per week) of 15 min each. All participants gave written informed consent, and procedures were positively evaluated by the Clinical Research Ethical Committee of the Parc de Salut Mar (CEIC-Parc de Salut Mar), Barcelona, Spain, under the reference number: 2015/6343/I. EEG data were acquired using the Emotiv EPOC EEG device. Participants were either residents or day users in an elderly home in Barcelona and were selected according to their cognitive capacities, sensitivity to music and depression condition: all of them reported listening to music regularly and presented with a primary complaint of depression, which was confirmed by the psychologist of the center. Four people abandoned the study toward its end due to illness.

#### TABLE 1 | Selected pieces for each participant.

### Materials

#### Music Material

Prior to the first session, the participants in the study were interviewed in order to determine the music they liked and to identify particular pieces to be included in their feedback sessions. Following the interviews, for each participant a set of 5–6 music pieces was collected from commercial audio CDs. During each session a subset of the selected pieces was played to the participant. **Table 1** shows the selected pieces for each participant.

#### Data Acquisition and Processing

The Emotiv EPOC EEG system (Emotiv, 2014<sup>1</sup>) was used for acquiring the EEG data. It consists of 16 wet saline electrodes, providing 14 EEG channels, and a wireless amplifier. The electrodes were located at the positions AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4 according to the international 10–20 system (see **Figure 1**). Two electrodes located just above the subject's ears (P3, P4) were used as reference. The data were digitized using the embedded 16-bit ADC with 128 Hz sampling frequency per channel and sent to the computer via Bluetooth. The EEG signals were band-pass filtered with Butterworth 8–12 Hz and 12–28 Hz filters. The impedance of the electrode contact to the scalp was visually monitored using Emotiv Control Panel software.
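The two frequency bands and the 128 Hz sampling rate described above can be illustrated with a small sketch. Note that the study used Butterworth band-pass filters inside OpenViBE; the code below swaps in a naive discrete Fourier transform to estimate band power, purely for illustration. The function name `band_power` and the half-open band edges (12 Hz is counted as beta) are assumptions, not details taken from the paper.

```python
import cmath
import math

FS = 128                   # sampling rate in Hz reported for the Emotiv EPOC
ALPHA_BAND = (8.0, 12.0)   # Hz
BETA_BAND = (12.0, 28.0)   # Hz


def band_power(samples, band, fs=FS):
    """Sum of squared, length-normalized DFT magnitudes for bins in [low, high).

    This is a stand-in for band-pass filtering followed by a power estimate;
    it is not the OpenViBE processing chain used in the study.
    """
    n = len(samples)
    low, high = band
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n  # centre frequency of DFT bin k
        if low <= freq < high:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(samples))
            power += abs(coeff) ** 2 / n ** 2
    return power


# One second of a synthetic 10 Hz oscillation (inside the alpha band).
one_second = [math.sin(2 * math.pi * 10 * t / FS) for t in range(FS)]
alpha_power = band_power(one_second, ALPHA_BAND)
beta_power = band_power(one_second, BETA_BAND)
```

For the synthetic 10 Hz signal, essentially all of the estimated power falls in the alpha band and almost none in the beta band, which is the behavior the band definitions are meant to capture.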

The Emotiv EPOC EEG system is part of a number of low-cost EEG systems, which have been recently commercialized [a usability review of some of them can be found in (10)]. These systems are mainly marketed as gaming devices and the quality of the signal they capture is lower than the signal captured by more expensive equipment. However, recent research on evaluating the reliability of some of these low-cost EEG devices for research purposes has suggested that they are reliable for measuring EEG signals (Debener et al., 2012; Thie et al., 2012; Badcock et al., 2013). In the case of our study, the Emotiv EPOC device has provided several important pragmatic advantages compared with more expensive equipment: the setting up time of the Emotiv EPOC system at the beginning of each session is considerably shorter than that of an expensive EEG system (for which an experienced clinical professional can take up to an hour to place the electrodes on the patient's scalp, which results in long and tedious sessions). Furthermore, expensive EEG systems typically require the application of conductive gel in order to create a reliable connection between each electrode and the patient's scalp (the gel attaches to the patient's hair and can only be properly removed by washing the entire head at the end of each session). Setting up the Emotiv EPOC takes a few minutes (typically 3–5 min) and conductive gel is not necessary for the Emotiv EPOC's wet saline electrodes. However, the inferior signal quality of the Emotiv EPOC device is a limitation of this study, and thus it should be emphasized that future studies should involve the use of a more accurate EEG device.

<sup>1</sup>Emotiv Systems Inc. Researchers. (2014). Available online at: http://www.emotiv.com/researchers/

We collected and processed the data using the OpenViBE platform (Renard et al., 2010). In order to play and transform the music feedback through the OpenViBE platform, a VRPN (Virtual-Reality Peripheral Network) to OSC (Open Sound Control protocol) gateway was implemented and used to connect OpenViBE with Pure Data (Puckette, 1996). OSC is a protocol for networking sound synthesizers, computers, and other multimedia devices for purposes such as musical performance, while VRPN is a device-independent and network-transparent system for accessing virtual reality peripherals in applications. The VRPN-OSC-Gateway connects to a VRPN server, converts the tracking data and sends it to an OSC server. Music feedback was played by the AudioMulch VST-host application, which received MIDI messages from Pure Data, and in which a tempo transformation plugin (Mayor et al., 2011) was installed. The plugin performs pitch-independent real-time tempo transformations (i.e., time stretch transformations) using audio spectral analysis-synthesis techniques. The plugin parameters were controlled using the MIDI messages sent by Pure Data. Music tempo and loudness were controlled by assigning each to the corresponding MIDI control message coming from Pure Data. Both data acquisition and music playback were performed on a laptop with an Intel Core i5 2.53 GHz processor and 4 GB of RAM, running the Windows 7 64-bit operating system and using the laptop's internal sound card (Realtek ALC269). Music was amplified by two Roland MA150U loudspeakers.
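To make the gateway step more concrete, the following sketch shows how a pair of floating-point values could be packed into a single OSC message, following the OSC 1.0 encoding rules (null-terminated strings padded to a multiple of four bytes, big-endian 32-bit floats). It is not the authors' VRPN-OSC-Gateway; the address `/emotion` and the helper names are hypothetical.

```python
import struct


def _osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to 4-byte multiple."""
    data = text.encode("ascii") + b"\x00"
    padding = (-len(data)) % 4
    return data + b"\x00" * padding


def osc_message(address, *values):
    """Build an OSC message whose arguments are all 32-bit floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", value) for value in values)
    return _osc_string(address) + _osc_string(type_tags) + payload


# Hypothetical address carrying (arousal, valence) for the audio engine.
packet = osc_message("/emotion", 0.5, 1.0)
```

The resulting bytes could be sent over UDP to any OSC-aware receiver such as Pure Data; the transport itself is left out of this sketch.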

#### Methods

Participants were treated individually. At the beginning of each feedback session, participants were informed about the experiment procedure, were asked to sit in a comfortable chair facing two loudspeakers, to close their eyes, and avoid moving during the experiment. Participants listened to preselected music pieces according to their music preferences for 15 min. Within these 15 min music pieces were separated by a pause of 1 s. Participants were encouraged to increase the loudness and tempo of the pieces so the pieces sounded "happier." As the system was tuned so that increased arousal corresponded to increased loudness, and increased valence corresponded to increased tempo, participants were encouraged to increase their arousal and valence, in other words to direct their emotional state to the high-arousal/positive-valence quadrant in the arousal-valence plane (see **Figure 2**). At the end of each session, participants were asked if they perceived they were able to modify the music tempo and volume. Pre and post evaluation of participants was performed using the BDI depression test.
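The mapping described above (arousal drives loudness, valence drives tempo) can be sketched as a simple linear scaling onto the 0 to 127 range of a MIDI controller value. The input ranges used below and the dictionary keys are hypothetical; the paper does not report the actual scaling used in the system.

```python
def to_midi_cc(value, low, high):
    """Linearly map value from [low, high] to a MIDI controller value 0..127.

    Values outside the range are clamped. The range itself is an assumption.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    ratio = (value - low) / (high - low)
    ratio = min(1.0, max(0.0, ratio))
    return round(ratio * 127)


def feedback_controls(arousal, valence,
                      arousal_range=(0.5, 1.5),    # hypothetical calibration
                      valence_range=(-1.0, 1.0)):  # hypothetical calibration
    """Higher arousal gives a larger loudness value; higher valence a larger tempo value."""
    return {
        "loudness_cc": to_midi_cc(arousal, *arousal_range),
        "tempo_cc": to_midi_cc(valence, *valence_range),
    }


controls = feedback_controls(arousal=1.2, valence=0.4)
```

Under this sketch, a participant moving toward the high-arousal/positive-valence quadrant would hear the piece become both louder and faster.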

FIGURE 2 | Arousal-valence plane. By being encouraged to increase the loudness and tempo of musical pieces, participants were encouraged to increase their arousal and valence, and thus direct their emotional state to the top-right quadrant in the arousal-valence plane.

No artifact detection/elimination method was applied to the measured EEG signal. Both electrooculographic (EOG) and electromyographic (EMG) artifacts were minimized by asking participants to close their eyes and avoid movement. No control of the interface through eye or muscle movement was observed during the experimental sessions. However, it must be noted that it is important to extend/redo the reported study using artifact detection methods.

The EEG data processing was adapted from Ramirez and Vamvakousis (2012). Based on the EEG signal of a person, the arousal level was determined by computing the ratio of the beta (12–28 Hz) and alpha (8–12 Hz) brainwaves. The EEG signal was measured at four locations (i.e., electrodes) in the prefrontal cortex: AF3, AF4, F3, and F4 (see **Figure 1**). Beta waves β are associated with an alert or excited state of mind, whereas alpha waves α are more dominant in a relaxed state. Alpha activity has also been associated with brain inactivation. Thus, the beta/alpha ratio is a reasonable indicator of the arousal state of a person. Concretely, arousal level was computed as follows:

$$Arousal = \frac{\beta F3 + \beta F4 + \beta AF3 + \beta AF4}{\alpha F3 + \alpha F4 + \alpha AF3 + \alpha AF4} \tag{1}$$
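As a worked illustration of Equation (1), the sketch below computes the arousal ratio from per-channel band powers. The dictionary-based interface and the example power values are hypothetical and are not data from the study.

```python
PREFRONTAL_CHANNELS = ("AF3", "AF4", "F3", "F4")


def arousal_level(alpha, beta, channels=PREFRONTAL_CHANNELS):
    """Equation (1): summed beta power divided by summed alpha power.

    alpha and beta map channel names to band power for the current window.
    """
    beta_sum = sum(beta[ch] for ch in channels)
    alpha_sum = sum(alpha[ch] for ch in channels)
    if alpha_sum == 0:
        raise ValueError("alpha power must be non-zero")
    return beta_sum / alpha_sum


# Hypothetical band powers, for illustration only.
alpha = {"AF3": 2.0, "AF4": 2.0, "F3": 2.0, "F4": 2.0}
beta = {"AF3": 1.0, "AF4": 1.0, "F3": 1.0, "F4": 1.0}
example_arousal = arousal_level(alpha, beta)  # 4.0 / 8.0 = 0.5
```

A value below 1 means alpha power dominates (a more relaxed state), whereas a value above 1 means beta power dominates (a more alert state).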

In order to determine the valence level, activation levels of the two cortical hemispheres were compared. A large number of EEG studies (Henriques and Davidson, 1991; Davidson, 1992, 1995, 1998) have demonstrated that the left frontal area is associated with more positive affect and memories, while the right hemisphere is more involved in negative emotion. F3 and F4 are the positions most commonly used for examining the alpha/beta activity related to valence, as they are located in the prefrontal lobe, which plays a crucial role in emotion regulation and conscious experience. Valence values were computed by comparing the alpha power α in channels F3 and F4. Because alpha activity is associated with cortical inactivation, higher alpha power at F4 than at F3 indicates relatively greater left frontal activation and thus more positive valence. Concretely, valence level was computed as follows:

$$Valence = \alpha F4 - \alpha F3 \tag{2}$$
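Equation (2) can be illustrated in the same way. The sketch below takes per-channel alpha power and returns the right-minus-left difference; the function name and the example values are hypothetical.

```python
def valence_level(alpha):
    """Equation (2): alpha power at F4 (right) minus alpha power at F3 (left).

    A positive result means less alpha, and hence relatively more activation,
    over the left frontal area, which is read as more positive valence.
    """
    return alpha["F4"] - alpha["F3"]


# Hypothetical alpha powers, for illustration only.
positive_case = valence_level({"F3": 1.0, "F4": 1.5})   # 0.5
negative_case = valence_level({"F3": 1.5, "F4": 1.0})   # -0.5
```

Together with the arousal ratio, this value gives the coordinate in the arousal-valence plane that drives the music feedback.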

Valence and arousal computation was adapted from Ramirez and Vamvakousis (2012), where the authors show that the computed arousal and valence values do contain meaningful information about the user's emotional state.

Computed arousal and valence values are fed into an expressive music performance system, which calculates appropriate expressive transformations on timing, loudness and articulation (in the present study, however, only timing and loudness transformations are considered). The expressive performance system is based on a music performance model, which was obtained by training four models using machine learning techniques (Mitchell, 1997) applied to recordings of musical pieces performed with four emotions: happy, relaxed, sad, and angry (each corresponding to a quadrant in the arousal-valence plane). The coefficients of the four models were interpolated in order to obtain intermediate models (in addition to the four trained models) and corresponding performance predictions (**Figure 3**). Details about the expressive music performance system and our approach to expressive performance modeling can be found in Ramirez et al. (2010, 2012), Giraldo (2012), Giraldo and Ramirez (2013), and Marchini et al. (2014). In order to model expression in music performances we characterized each performed note by a set of inter-note features representing both properties of the note itself and aspects of the musical context in which the note appears (**Figure 4**). Information about the note included note pitch (Pitch), note duration (dur), and note metrical strength (MetrStr), while information about its melodic context included the relative pitch and duration of the neighboring notes (PrevPitch, PrevDur, NextPitch, NextDur), i.e., previous and following notes, as well as the music structure (i.e., Narmour groups) in which the note appears (Narmour, 1991). We also extracted the amount of legato with the previous note, the amount of legato with the following note, and mean energy.
We applied machine learning techniques to train linear regression models for predicting duration and energy deviations, expressed as a ratio of the values specified in the score (for energy, which is not specified in the score, we take the score value to be the average energy of all the notes in the piece). For instance, for duration a predicted value of 1.14 represents a prediction of 14% lengthening of the note with respect to the score. In the case of energy, it indicates that the note should be played 14% louder than the average energy in the piece.
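The interpolation of the four trained models and the ratio-based predictions can be sketched as follows. The paper does not specify the interpolation scheme, so the bilinear blend below is an assumption, as are the normalization of arousal and valence to the interval [0, 1], the function names, and all coefficient values.

```python
def interpolate_coefficients(models, arousal, valence):
    """Bilinear blend of four linear-model coefficient vectors (an assumed scheme).

    arousal and valence are normalized to [0, 1], where 0 is low/negative and
    1 is high/positive. models maps 'happy', 'angry', 'relaxed', 'sad' to
    equally long coefficient lists.
    """
    if not (0.0 <= arousal <= 1.0 and 0.0 <= valence <= 1.0):
        raise ValueError("arousal and valence must lie in [0, 1]")
    weights = {
        "happy": arousal * valence,                  # high arousal, positive valence
        "angry": arousal * (1.0 - valence),          # high arousal, negative valence
        "relaxed": (1.0 - arousal) * valence,        # low arousal, positive valence
        "sad": (1.0 - arousal) * (1.0 - valence),    # low arousal, negative valence
    }
    size = len(models["happy"])
    return [sum(weights[name] * models[name][i] for name in weights)
            for i in range(size)]


def predict_ratio(coefficients, features):
    """Linear regression prediction: intercept plus weighted note features."""
    intercept, *slopes = coefficients
    return intercept + sum(w * x for w, x in zip(slopes, features))


def apply_ratio(score_value, ratio):
    """A ratio of 1.14 lengthens (or makes louder) the score value by 14%."""
    return score_value * ratio


# Hypothetical coefficients: [intercept, weight of one note feature].
models = {
    "happy": [1.2, 0.1], "angry": [1.0, 0.3],
    "relaxed": [0.8, 0.1], "sad": [0.6, 0.3],
}
blended = interpolate_coefficients(models, arousal=0.5, valence=0.5)
performed_duration = apply_ratio(0.5, 1.14)  # a 0.5 s note lengthened by 14%
```

At the corners of the plane the blend reduces to one of the four trained models, and in between it produces intermediate models, which mirrors the role of the interpolation described above.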

The general proposed emotion-based musical neurofeedback system is depicted in **Figure 5**. The system consisted of a real-time feedback loop in which the brain activity of participants was processed to estimate their emotional state, which in turn was used to control an expressive rendition of the music piece. The user's EEG activity is mapped into a coordinate in the arousal-valence space that is fed to a pre-trained expressive music model in order to trigger appropriate expressive transformations to a given music piece (audio or MIDI).

## Results

Seven participants completed training, which comprised ten 15-min sessions (2.5 h) of neurofeedback, with no other psychotherapy provided. Four people abandoned the study toward its end due to health problems. Pre and post evaluation of six participants was performed using the BDI depression test (one participant was unable to respond to the BDI depression test at the end of the treatment for serious health reasons). The BDI evaluation showed an average improvement of 17.2% (1.3) in BDI scores at the end of the study. Pre–post changes on the BDI test are shown in **Figure 6**.

We computed average valence and arousal values at the beginning of the first session and the beginning of the last session of the study. The obtained average valence values were 0.74 (0.22) and 0.83 (0.26) for the beginning of the first session and the beginning of the last session, respectively, while the obtained average arousal values were 0.97 (0.14) and 0.98 (0.21) for the beginning of the first session and the beginning of the last session, respectively (**Table 2**).

**Figure 7** shows the within-session correlation between time (in 1-min periods) and the computed arousal and valence values. For valence we obtained a correlation of r = 0.919 (p = 0.000171), while for arousal we obtained a correlation of r = 0.315 (p = 0.375335).
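For readers who wish to reproduce this kind of analysis, the sketch below computes a Pearson correlation between time (one value per 1-min period) and a per-minute average, together with the corresponding t statistic. The paper does not state which test produced the reported p-values, so the t statistic is offered only as the standard companion to Pearson's r; turning it into a p-value requires a Student t distribution with n − 2 degrees of freedom, which is not implemented here. The data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Hypothetical per-minute valence averages, for illustration only.
minutes = [1, 2, 3, 4, 5, 6]
valence_by_minute = [0.70, 0.74, 0.73, 0.79, 0.82, 0.85]
r = pearson_r(minutes, valence_by_minute)
t = t_statistic(r, len(minutes))
```

A strongly positive r, as reported for valence, indicates that the value tended to rise steadily over the course of a session.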

## Discussion

Five out of six participants who responded to the BDI test made improvements in their BDI, and one patient improved from depressed to slight perturbation in the BDI scale. One participant, who initially scored as not depressed in the BDI pre-test (score = 1), did not show any improvement in her BDI post-score (score = 4), which was also in the non-depressed range. Either the participant was not depressed at the beginning of the study, or her responses to the BDI tests were not reliable. Excluding her from the BDI test analysis, the mean decrease in BDI scores was 20.6% (0.06). These differences were found significant (p = 0.018).

EEG data obtained during the course of the study showed that overall valence level increased at the end of the treatment compared to the starting level. The difference between valence values at the beginning and end of the study is statistically significant (p = 0.00008). This result should be interpreted as a decrease of relative alpha activity in the left frontal lobe, which may be interpreted as an improvement of the depression condition (Henriques and Davidson, 1991; Gotlib et al., 1999). Arousal values at the beginning and at the end of the study showed no significant difference (p = 0.33). However, in this study the most important indicator was valence since it reflects changes in negative/positive emotional states, which are directly related to depression conditions.

Correlation between valence values and time within sessions was found significant (p < 0.00018) but that was not the case for the correlation between arousal values and time. The fact that arousal-time correlation was not significant is not a negative result since valence is the most relevant indicator for depression.

Taking into account the obtained within- and cross-session improvements in valence levels, and the limited duration of both the individual sessions (i.e., 15 min) and the complete treatment (i.e., 10 sessions), it may be reasonable to think that further improvements in valence could have been reached if sessions and/or treatment had been longer. We plan to investigate the impact of treatments with longer duration in the future.

Very few studies in the literature have examined the long-term effect of neurofeedback, but the few that did found promising results (Gani et al., 2008; Gevensleben et al., 2010). Both Gani et al. (2008) and Gevensleben et al. (2010) showed that after the end of their studies, improvements were maintained and some additional benefits could be noted, suggesting that patients were still improving even after the end of treatment. In the current study, apart from the post-study BDI test, no follow-up of the participants was conducted in order to examine the long-term effect of our approach. This issue should be investigated in the future.

As is the case in most of the literature on the use of neurofeedback to treat depression, which mainly consists of uncontrolled case study reports, no control group was considered in this pilot study. In order to quantify the benefits of combining music and neurofeedback compared to other approaches, ideally three groups should have been considered: one group with music therapy, one group with neurofeedback, and one group with the proposed approach combining music therapy and neurofeedback. In this way it would have been possible to quantify the added value of combining music therapy and neurofeedback. However, due to the limited number of participants this was not possible.

TABLE 2 | Arousal and valence values at the beginning and at the end of the study.


Some researchers have shown that listening to music regularly during the early stages of rehabilitation can aid recovery, maintaining attention and preventing depressed and confused moods in stroke patients (Särkämö et al., 2008). Särkämö et al. conclude that in addition to these effects, music listening may also have general effects on brain plasticity, as the activation it causes in the brain is in both hemispheres, and more widely distributed than that caused by verbal material alone. In the current study, we propose a new neurofeedback approach, which combines emotion-driven neurofeedback with (active) music listening. In the light of the mentioned benefits of music listening/receptive music therapy (Grocke et al., 2007), it is reasonable to think that incorporating music listening in a neurofeedback setting can only improve the benefits of traditional neurofeedback systems. Furthermore, we argue that the combination of neurofeedback and receptive music therapy provides the benefits of both techniques, while eliminating potential drawbacks of each separate technique. When considered as separate methods, the advantages of receptive music therapy and neurofeedback are clear: both are noninvasive methods with no contraindications. In addition, neurofeedback encourages patients to self-regulate their brain activity in order to promote beneficial activity patterns, while receptive music therapy relies on the emotional therapeutic effects of listening to music. These positive properties of both techniques are clearly preserved by the proposed approach. On the other hand, neurofeedback procedures can often be tedious and consist of tasks involving visual or auditory feedback with little or no emotional content (e.g., moving a car on a computer screen). Furthermore, a drawback of neurofeedback is that it is based on traditional EEG rhythms (e.g., theta, alpha, beta), which are functionally heterogeneous and individual (Hammond, 2010). Receptive music therapy methods, in turn, face the difficulty of selecting music material corresponding to the individual needs of the patient (MacDonald, 2013).
These shortcomings are avoided by the proposed system: The system provides attractive feedback consisting of music material specially selected by the individual participants, and it is based on high-level descriptors (i.e., arousal and valence) representing the emotional state of users.

The results obtained in the current study seem to indicate that music has the potential to be a useful component in neurofeedback treatment. However, future research needs to explore individual differences in responses to music through direct experimental comparison. Future investigation of individual variables, such as music sensitivity (e.g., music experience/familiarity) and the impact of depression severity, in addition to more stringent methodology, is required.


In summary, we have introduced a new neurofeedback approach, which allows users to manipulate expressive parameters in music performances using their emotional state, and presented the results of a neurofeedback clinical pilot study for treating depression in elderly people. The neurofeedback study consisted of 10 sessions (2 sessions per week) of 15 min each, initially involving 10 participants from a residential home for the elderly in Barcelona. Participants were asked to listen to music pieces preselected by them according to their music preferences, and were encouraged to increase the loudness and tempo of the pieces, based on their arousal and valence levels, respectively: arousal was computed as the beta to alpha activity ratio in the frontal cortex, and valence was computed as relative frontal alpha activity in the right lobe compared to the left lobe. Pre and post evaluation of six participants was performed using the BDI depression test, showing an average improvement of 17.2% (1.3) in their BDI scores at the end of the study. Analysis of the participants' EEG data showed a decrease of relative alpha activity in their left frontal lobe, which may be interpreted as an improvement of their depression condition. The positive results of our clinical experiment suggest that new research with the proposed music neurofeedback approach is worthwhile.

## Acknowledgments

This work has been partly sponsored by the Ministerio de Economia y Competitividad under Grant TIN2013-48152-C2-2-R (TIMuL Project).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Ramirez, Palencia-Lefler, Giraldo and Vamvakousis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The effects of music listening on pain and stress in the daily life of patients with fibromyalgia syndrome

*Alexandra Linnemann<sup>1</sup>, Mattes B. Kappert<sup>1</sup>, Susanne Fischer<sup>2</sup>, Johanna M. Doerr<sup>1</sup>, Jana Strahler<sup>1</sup> and Urs M. Nater<sup>1</sup>\**

*<sup>1</sup> Department of Psychology, University of Marburg, Marburg, Germany, <sup>2</sup> Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK*

Music listening is associated with both pain- and stress-reducing effects. However, the effects of music listening in daily life remain understudied, and the psycho-biological mechanisms underlying the health-beneficial effect of music listening remain unknown. We examined the effects of music listening on pain and stress in daily life in a sample of women with fibromyalgia syndrome (FMS; i.e., a condition characterized by chronic pain) and investigated whether a potentially pain-reducing effect of music listening was mediated by biological stress-responsive systems. Thirty women (mean age: 50.7 ± 9.9 years) with FMS were examined using an ecological momentary assessment design. Participants rated their current pain intensity, perceived control over pain, perceived stress level, and music listening behavior five times per day for 14 consecutive days. At each assessment, participants provided a saliva sample for the later analysis of cortisol and alpha-amylase as biomarkers of stress-responsive systems. Hierarchical linear modeling revealed that music listening increased perceived control over pain, especially when the music was positive in valence and when it was listened to for the reason of 'activation' or 'relaxation'. In contrast, no effects on perceived pain intensity were observed. The effects of music listening on perceived control over pain were not mediated by biomarkers of stress-responsive systems. Music listening in daily life improved perceived control over pain in female FMS patients. Clinicians using music therapy should become aware of the potential adjuvant role of music listening in daily life, which has the potential to improve symptom control in chronic pain patients. In order to study the role of underlying biological mechanisms, it might be necessary to use more intensive engagement with music (i.e., collective singing or music-making) rather than mere music listening.

Keywords: ecological momentary assessment, fibromyalgia syndrome, music listening, pain, stress

## Introduction

Music listening has been shown to alleviate pain. Thus, it would appear to be a promising means of symptom reduction and health promotion, given that it is an activity of daily life that is popular, cost-effective and easily accessible. However, the mechanisms underlying its potential pain-reducing effect remain to be elucidated. We posit that the pain-reducing effect of music is mediated by a reduction in the activity of stress-responsive systems in the body.

#### *Edited by:*

*Mari Tervaniemi, CICERO Learning – University of Helsinki, Finland*

#### *Reviewed by:*

*Karsten Specht, University of Bergen, Norway; Boudewijn-Van Houdenhove, KU Leuven, Belgium*

#### *\*Correspondence:*

*Urs M. Nater, Department of Psychology, University of Marburg, Gutenbergstraße 18, Marburg 35032, Germany; nater@uni-marburg.de*

> *Received: 04 May 2015 Accepted: 16 July 2015 Published: 30 July 2015*

#### *Citation:*

*Linnemann A, Kappert MB, Fischer S, Doerr JM, Strahler J and Nater UM (2015) The effects of music listening on pain and stress in the daily life of patients with fibromyalgia syndrome. Front. Hum. Neurosci. 9:434. doi: 10.3389/fnhum.2015.00434*

Music listening is associated with reduced subjective stress levels and affects physiological markers of stress (Pelletier, 2004; Chanda and Levitin, 2013; Koelsch, 2014). Koelsch (2014) proposes a model of music-evoked emotions in which music is initially processed in the central nervous system, then further impacts endocrine, autonomic and immune activity, and leads to the experience of a broad range of emotions. A stress-reducing effect of music listening may be explained by music particularly affecting activity in the hippocampal formation, which in turn influences the activity of the hypothalamus–pituitary–adrenal (HPA) axis (Koelsch, 2014). The HPA axis constitutes one of the major stress-responsive systems in the body, and its activation leads to the secretion of the steroid hormone cortisol (Hellhammer et al., 2009). Research has linked music listening to a down-regulation of HPA axis activity, as mirrored in lower concentrations of cortisol (Chanda and Levitin, 2013). This has led to the conclusion that the health-beneficial effect of music listening is mediated by a reduction in stress (Thoma and Nater, 2011). Another stress-responsive system of high relevance is the autonomic nervous system (ANS). Music listening has also been associated with a down-regulation of ANS activity, which is reflected by both lower blood pressure and lower heart rate (Kreutz et al., 2012).

Interestingly, stress has been discussed as an etiological factor in the manifestation of chronic pain (Fitzcharles and Yunus, 2012), giving rise to the conclusion that music listening has the capacity to positively influence pain. Indeed, the term 'audioanalgesia' was coined in this context (Gardner and Licklider, 1959). Nevertheless, the exact mechanisms underlying the pain-reducing effect of music remain unclear. One open question concerns whether music listening can reduce pain *per se* (i.e., direct effect) or whether it facilitates coping with pain (i.e., indirect effect). Bernatzky et al. (2012) state that music exerts effects in the brain that directly impact on the relevant pain circuits, which in turn reduce the perception of pain intensity. However, the empirical evidence is not consistent in this regard, as there are also studies showing no music-induced reduction in perceived pain intensity (e.g., MacDonald et al., 2003). Similarly, Mitchell and MacDonald (2006) did not find a reduction in pain intensity, but did find an increase in control over pain, and thus propose that music listening is successful in reducing pain by specifically improving control over pain (Mitchell and MacDonald, 2012). While the exact mechanisms remain unclear, it has been discussed that improved control over pain may be achieved via distraction from pain or via induced relaxation (MacDonald et al., 2003; Mitchell and MacDonald, 2006; Mitchell et al., 2006).

When experiencing pain, information is forwarded from nociceptive receptors in the periphery of the body to cortical areas in the brain. Different brain mechanisms process information either serially or in parallel, with certain brain circuits being involved in the sensory-discriminative components of pain and others being involved in the affective-motivational components of pain (Treede et al., 1999). Research examining the neurobiological mechanisms underlying the pain-reducing effect of music identified the limbic system – which is involved in the affective-motivational modulation of pain – as a key structure in the brain that is affected by listening to music (Bernatzky et al., 2012). As the cortico-limbic pathways exert an inhibitory effect on pain, it is likely that music listening exerts a pain-reducing effect. At the same time, the limbic system, especially the hippocampal formation, is closely associated with the modulation of the HPA axis (Jankord and Herman, 2008). In this way, music listening is thought to exert a stress-reducing effect (Koelsch, 2014). Bernatzky et al. (2012, p. 270) state in this regard that a 'hypothalamic changeover' leads to music-induced distraction and relaxation in the context of pain. In other words, on a neurobiological level, music listening exerts effects in the central nervous system that are critical to the modulation of both pain and stress. The limbic system can be regarded as a key structure in this context, which further impacts on neuroendocrine and autonomic functioning.

However, most research on music listening and pain has examined acute pain in clinical contexts. What remains understudied so far is the effect of music listening in chronic pain conditions (Finlay, 2014). In contrast to acute pain, which requires a pain stimulus to be present, the perception of pain in chronic pain occurs even though no current pain stimulus is present (Lee et al., 2011). This persistent experience of pain is explained by long-lasting alterations in both the central and the peripheral nervous system as nociceptive information transmission is impaired (Lee et al., 2011). Therefore, patients with chronic pain show an altered processing of sensory input, which could be linked to an effect on the limbic system (Bennett, 1999). Inhibitory effects of the thalamus on limbic structures seem to be impaired, thereby increasing the experience of pain (Bennett, 1999; Tracey and Mantyh, 2007). As chronic pain – in contrast to acute pain – is associated with these physiological alterations, it is of utmost importance to examine whether music listening can affect pain in chronic pain conditions as well.

Fibromyalgia syndrome (FMS) is a medically unexplained condition mainly characterized by chronic widespread pain impacting heavily on patients' quality of life (Woltman et al., 2012). Stress seems to play a major role in the manifestation of pain symptoms in FMS patients: it was demonstrated that patients with FMS show alterations in HPA axis (Tak et al., 2011) as well as in ANS functioning (Martinez-Lavin, 2002). We previously investigated the role of both HPA axis and ANS in the relationship between stress and pain in daily life (Fischer et al., in revision). In that work, we showed that stress exacerbates pain in FMS patients, with HPA axis activity (but not ANS activity) being associated with pain. As music listening has been shown to reduce stress and stress system activity, the question arises whether music listening can also have a positive impact on pain in FMS patients. The still limited number of studies points toward a pain-reducing effect of music listening in these patients (Onieva-Zafra et al., 2013; Garza-Villarreal et al., 2014). In their experimental study, Garza-Villarreal et al. (2014) found pain-reducing effects, which they hypothesized to be due to cognitive and emotional processes, as music listening might have helped to distract from the pain and lead to a state of relaxation. However, the study was set in a highly artificial laboratory context, thus limiting the generalizability of its results. Onieva-Zafra et al. (2013) conducted an intervention study set in daily life, in which FMS patients were instructed to listen to pre-recorded music CDs almost daily for 14 consecutive days. Although music listening was associated with alleviated pain intensity levels, the underlying mechanisms remain unclear, as no biological markers were assessed in this study.

Taken together, the currently available literature suggests that music listening has pain- and stress-reducing effects. The underlying mechanisms, however, remain to be elucidated. Since most of the previous studies were set in experimental contexts examining acute pain, an investigation of the psycho-biological effects of music listening on chronic pain in daily life is warranted.

#### Research Question

Our overarching research question is whether patients with FMS benefit from mere music listening in daily life, and what the mechanisms underpinning the potential health-beneficial effect of music listening are. Based on the evidence summarized above, we hypothesized that music listening reduces pain intensity and increases control over pain (Hypotheses 1 and 2), that music listening has a stress-reducing effect both on subjective stress levels and markers of HPA and ANS activity (Hypothesis 3), and that these biomarkers mediate the pain-reducing effect (Hypothesis 4).

## Materials and Methods

#### Participants

A total of 30 women meeting the Fibromyalgia Research Criteria (Wolfe et al., 2011; mean age = 50.7 ± 9.9 years, range: 27–64 years) were recruited via specialized clinics and various advertising outlets from the general population (e.g., newspapers, self-help groups, internet) as part of a larger study on everyday life stress in patients with functional somatic symptoms. The eligibility criteria were as follows: diagnosis of fibromyalgia according to the Fibromyalgia Research Criteria, female sex (as the majority of FMS patients are female; Wolfe et al., 1995), German as native language, age between 18 and 65 years, body mass index (BMI) in the range of 18–30, regular menstrual cycle or being post-menopausal, no acute or chronic unmedicated conditions influencing biological stress markers, no current pregnancy or breast-feeding, no lifetime psychotic or bipolar disorder, no eating disorder within the past five years, no substance abuse within the past two years, and no current episode of major depression. As compensation, participants received 80 Euro. The participants were informed about the aims of the study and gave written informed consent. The study was approved by the Ethics Committee of the Department of Psychology at the University of Marburg, Germany.

#### Procedure

The study was designed as an ecological momentary assessment study (Shiffman et al., 2008). Participants were examined on 14 consecutive days. Initially, potential participants underwent a telephone-based interview in order to check the eligibility criteria. If they fulfilled the criteria, they received a detailed study description by post. Subsequently, participants were invited to the laboratory for an introductory session. For the duration of the study, patients received an iPod® touch (Apple, Cupertino, CA, USA), on which they were required to complete six daily assessments on each day by means of the software iDialogPad (G. Mutz, University of Cologne, Germany). The participants were instructed to switch on the iPod® upon waking up the following morning, not to turn it off from then on, and to recharge it every night. The assessment period began on the day following the introductory session. Each day, participants were asked to trigger the first assessment themselves directly after awakening. An activated timer prompted participants to complete a second set of questions 30 min afterwards (to determine the cortisol awakening response, which was not used in the current analyses). Further fixed assessments followed at 11.00, 14.00, 18.00, and 21.00 h. Participants were instructed to provide a saliva sample after each assessment for the later analysis of salivary cortisol and salivary alpha-amylase, for which they were provided with pre-labeled tubes. After completing the 14 days of assessment, the participants returned to the laboratory to hand over the saliva samples and iPod®. On this occasion, they also completed online-based questionnaires. Additionally, a post-monitoring interview was conducted, checking for problems in handling the iPod® or problems with the collection of saliva samples and asking for any unusual events during the measurement period.
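The daily sampling protocol described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `daily_schedule` and the example wake time are hypothetical, and the study software (iDialogPad) is not modeled here. The sketch simply lists the self-triggered awakening assessment, the timer-based assessment 30 min later, and the four fixed assessments.

```python
from __future__ import annotations

from datetime import datetime, timedelta

# Fixed assessment times named in the text (hour, minute).
FIXED_TIMES = ((11, 0), (14, 0), (18, 0), (21, 0))


def daily_schedule(wake: datetime) -> list[datetime]:
    """Return the six assessment times for one study day.

    `wake` is the self-triggered awakening assessment. The sketch does
    not handle wake times later than the first fixed assessment.
    """
    times = [wake, wake + timedelta(minutes=30)]
    for hour, minute in FIXED_TIMES:
        times.append(wake.replace(hour=hour, minute=minute, second=0, microsecond=0))
    return times


# Hypothetical example day with awakening at 07:00.
schedule = daily_schedule(datetime(2015, 1, 1, 7, 0))
```

Each entry in `schedule` would be paired with one saliva sample, giving six samples per participant per day.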

#### Measures

##### Ecological Momentary Assessment Items

At all assessments, except for directly after awakening, participants had to complete items regarding their music listening behavior, their momentary pain experience, and their momentary stress level.

*Music listening behavior* Participants were asked whether they had deliberately listened to music since the last assessment, which was defined as actively deciding to listen to music (i.e., by turning on the radio or a music listening device). If they reported deliberate music listening, they were then required to answer further items concerning their listening to music. They were asked to rate the perceived valence on a visual analog scale (VAS) ranging from 0 ('sad') to 100 ('happy') and the perceived arousal on a VAS ranging from 0 ('relaxing') to 100 ('energizing'). In the context of music, pain, and stress, the dimensions of valence and arousal are of special relevance. While the kind of music that is effective in reducing pain is still subject to debate, there is accumulating evidence that music which is positive in valence is especially effective, independent of its arousal (Roy et al., 2008). However, in the context of stress, the reverse pattern seems to be true, with music which is low in arousal being especially effective in reducing stress (Kreutz et al., 2012; Chanda and Levitin, 2013). Based on the occurrence of deliberate music listening, time points were classified as 'music episodes' if deliberate music listening was reported.

Subsequently, participants were asked to indicate the reasons why they had listened to music by choosing from among the following options: 'relaxation,' 'activation,' 'distraction,' and 'reducing boredom.' These have been stated as the most common reasons for listening to music in previous studies (Juslin et al., 2008; Linnemann et al., 2015b). Further, in the context of pain, it has been long discussed whether music listening is effective in reducing pain by distracting from pain and/or by inducing relaxation (Mitchell and MacDonald, 2012). This therefore highlights the importance of investigating the link between reasons for music listening and potential pain- and stress-reducing effects.

*Pain* Perceived pain intensity (Myles et al., 1999) and control over pain (Haythornthwaite et al., 1998) were measured as indicators for the experience of pain. Perceived pain intensity was measured using a VAS ranging from 0 ('At the moment, I am in no pain') to 100 ('At the moment, I am in the most intense pain possible'). Perceived control over pain was measured using the item 'I had the feeling that I was in control of the pain,' which was rated on a 6-point Likert scale ranging from 0 to 5, with low values indicating low control over pain and high values indicating high control.

*Stress*

*Subjective stress* Subjective stress was measured using the item 'At the moment, I feel stressed,' which was rated on a 5-point Likert scale ranging from 0 ('not at all') to 4 ('very much'). In order to keep the number of items to a minimum, we decided to use a single-item measure for stress, which has proved to have sufficient psychometric qualities (Elo et al., 2003).

*Physiological stress* After each assessment, participants were asked to collect a saliva sample for the later analysis of salivary cortisol (sCort) and salivary alpha-amylase (sAA) which we measured as biomarkers of stress. High cortisol levels generally indicate high levels of stress (Hellhammer et al., 2009). The secretion of cortisol follows a marked diurnal rhythm, with a rise in the morning and a subsequent decrease of cortisol towards the evening. sAA is an enzyme which is secreted from the salivary glands in the oral cavity. As its secretion is regulated by the ANS, sAA is also regarded as an indirect indicator of autonomic activation (Nater and Rohleder, 2009). Like cortisol, alpha-amylase follows a distinct diurnal pattern, with a decrease within 60 min after awakening and a steady increase of activity during the course of the day. Both sCort and sAA can be considered as valid physiological markers of stress system activity (Nater et al., 2013).

Measures of sCort and sAA were obtained from the unstimulated whole saliva samples collected during the 14-day measurement period. The participants were instructed to collect the samples as follows: first, they should rinse their mouth with water and then swallow all remaining saliva before collecting saliva for 2 min, which they should then transfer into SaliCap® tubes via a straw. They were also asked to refrain from eating, smoking, brushing their teeth, or drinking anything but water for 1 h prior to sample collection. Participants were asked to store samples in a freezer or a fridge as soon as possible at home. Upon returning to the laboratory, the samples were stored at −20°C at the Biochemical Laboratory of the Department of Psychology, University of Marburg, until analysis. sCort levels were measured using a commercially available enzyme-linked immunoassay (IBL, Hamburg, Germany). sAA activity was measured using a kinetic colorimetric test and reagents obtained from Roche (Fa. Roche Diagnostics, Mannheim, Germany). Inter- and intra-assay variance for both sCort and sAA was below 10%.

##### Paper–Pencil Questionnaires

During the introductory session, participants were asked to complete a questionnaire on their socio-demographic background. Furthermore, after having completed the study protocol, participants completed the music preference questionnaire once online (MPQ; Nater et al., 2005). The MPQ provides insight into participants' habitual music listening behavior and is based on their subjective experience of their music listening behavior. The first item covers the preference for the most common music genres (e.g., 'pop,' 'rock,' 'classical music') by asking participants to indicate which music genre they prefer, using a scale ranging from 1 ('not at all') to 5 ('very much'). Subsequent items evaluate common reasons for music listening (e.g., 'relaxation,' 'activation,' 'distraction'), with participants being asked to indicate how frequently they listen to music for the respective reasons on a scale from 1 ('never') to 5 ('very often'). Further items cover participants' music-making experience (i.e., information about playing an instrument or singing in a choir). Finally, the personal significance of music for participants is rated on a scale ranging from 1 ('not important') to 5 ('very important').

#### Statistical Analyses

In order to account for the nested structure of the data, analyses were performed using two-level hierarchical linear modeling (HLM; Raudenbush et al., 2004). Perceived pain intensity, perceived control over pain, subjective stress, sCort, sAA, and items on music listening such as music listening (yes/no), valence, arousal, and reasons for music listening were considered as level-1 variables. Furthermore, at the momentary level (level-1), the time of day was entered into analyses concerning the relationship between biological parameters and music listening due to the known diurnal patterns of both sCort and sAA. However, as only time points at which music listening was reported were entered into analyses concerning valence/arousal and reasons for music listening (thus reducing the number of level-1 observations), we did not include time of day as predictor in these analyses. Exploratory analyses revealed that entering time of day as predictor did not alter the results. At the individual level (level-2), the intercept (β0) was modeled as a function of number of music episodes (γ01) and a residual component (u0). In analyses concerning biological parameters, both age and BMI were entered additionally at the individual level (level-2) due to their known associations with HPA axis and ANS regulation.

The total number of music episodes varied between 0 and 53 per participant per measurement period, with participants listening to music once per day on average (*X* = 13.80 ± 13.34). Therefore, we controlled for the total number of music episodes on level-2, although this only turned out to be a significant predictor in one model (Model 2a). A total of 70 measurements were available per participant (five time points per day for 14 consecutive days), making a total of 2100 possible observations. As HLM uses a listwise deletion procedure in the case of missing values, the degrees of freedom (df) vary between the tested models. In models concerning subjective reports of stress, a total of 1883 observations were entered into the analyses. With regard to neuroendocrine and autonomic markers of stress, a total of 1773 observations were entered into the analyses. As music listening occurred in 21.2% of daily assessments, the number of degrees of freedom was further reduced to 412 when subjective stress reports were considered as the outcome and 379 when biological markers of stress were considered as the outcome, respectively. As neither sCort nor sAA was normally distributed, we natural log-transformed both sCort and sAA [ln (*x*) + 10]. Two participants had not listened to music at all during the 14-day period of assessment, and were therefore not entered into analyses in which valence/arousal as well as reasons for music listening were examined.
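The counts in this paragraph can be checked with simple arithmetic. The sketch below reproduces the number of possible observations, expresses the analyzed observations as a share of that total, and applies the log transformation. The reading of the bracketed expression as ln(x) plus a constant of 10 is an assumption based on how the text prints it, and all variable and function names are hypothetical.

```python
import math

N_PARTICIPANTS = 30
TIME_POINTS_PER_DAY = 5  # assessments that carried music, pain and stress items
N_DAYS = 14

per_participant = TIME_POINTS_PER_DAY * N_DAYS            # 70
possible_observations = N_PARTICIPANTS * per_participant   # 2100

# Observations entered into the models, as reported in the text.
share_subjective = 1883 / possible_observations
share_biological = 1773 / possible_observations

# Mean number of music episodes per participant, spread over 14 days.
episodes_per_day = 13.80 / N_DAYS


def log_transform(x: float, offset: float = 10.0) -> float:
    """Natural log plus a constant offset (assumed reading of ln(x) + 10)."""
    if x <= 0:
        raise ValueError("the natural log is undefined for non-positive values")
    return math.log(x) + offset
```

Under these numbers, roughly 90% of possible observations entered the subjective stress models and roughly 84% entered the biomarker models, and the mean of 13.80 episodes corresponds to just under one episode per day.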

Hypothesis testing was performed in accordance with Woltman et al. (2012). Therefore, both unconditional and conditional models were specified. The comparison between these models was undertaken by means of χ<sup>2</sup> statistics, comparing the reduction in deviance as a measure of model fit. As an indicator of explained variance, pseudo-*R*<sup>2</sup> is reported, calculated in accordance with the formula $\left(\sigma^2_{\text{unconditional growth model}} - \sigma^2_{\text{subsequent model}}\right) / \sigma^2_{\text{unconditional growth model}}$ (Singer and Willett, 2003). Mediation analyses were performed using the stepwise procedure recommended by Kenny et al. (2003) and Korchmaros and Kenny (unpublished).
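As a minimal sketch of the two fit statistics named above, the following code computes pseudo-*R*<sup>2</sup> from two variance components and converts a deviance reduction into a *p*-value. It is not the HLM software's implementation: the function names are hypothetical, the variance values in the example are made up for illustration, and the χ<sup>2</sup> tail probability uses the closed form that holds only for even degrees of freedom.

```python
import math


def pseudo_r2(sigma2_unconditional: float, sigma2_subsequent: float) -> float:
    """Proportional reduction in variance relative to the unconditional model."""
    return (sigma2_unconditional - sigma2_subsequent) / sigma2_unconditional


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability, closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


# Illustrative (made-up) variance components: 2.0 before, 1.5 after.
example_r2 = pseudo_r2(2.0, 1.5)

# Deviance reduction reported in the Results for Model 2a: chi2 = 14.59, df = 4.
p_model_2a = chi2_upper_tail_even_df(14.59, 4)
```

Rounded to three decimals, `p_model_2a` agrees with the *p* = 0.006 reported for Model 2a in the Results.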

*P*-values of ≤0.05 were considered significant. For all analyses, unstandardized coefficients (UC) are reported.

## Results

Participants reported having suffered from FMS for a mean of 120 ± 86 months. The mean BMI was 25.24 ± 2.90. Four participants had conditions which potentially influenced sCort or sAA: one had a BMI of 31.2, one suffered from Hashimoto's thyroiditis, and two suffered from inflammatory respiratory diseases. As there is no reason to assume that these conditions affect subjective data, these participants were included in all analyses regarding subjective data. For analyses in which biological markers were included, we initially excluded these four participants, although as their exclusion did not alter the pattern of results, we decided to include these patients in all analyses for the sake of consistency.

Data from the MPQ (*n* = 29) on participants' habitual music listening behavior revealed that patients listened to music on average for 2.0 ± 2.2 h per day. The reasons most commonly stated for listening to music were: 'activation' (*X* = 3.62), 'relaxation' (*X* = 3.52), and 'distraction' (*X* = 3.31). Concerning the importance of music for their lives, on a scale ranging from 1 ('not at all important') to 5 ('very important'), patients reported a moderate level of importance (*X* = 3.38 ± 1.37). A total of seven participants (=24.1%) reported that they were currently actively making music. Half of the participants (*n* = 15) stated that they had actively made music in the past.

With regard to the ecological momentary assessment data, music listening was reported in 21.2% of daily assessments. The music that was listened to was rated as rather positive in valence (*X* = 72.9 ± 14.1) and high in arousal (*X* = 61.42 ± 22.7). The reasons most commonly stated for music listening were: 'relaxation' (48.8%), 'distraction' (34.5%), 'activation' (25.8%), and 'reducing boredom' (12.3%).

The overall mean pain intensity, rated on a scale ranging from 0 to 100, was *X* = 47.5 ± 25.0. On a scale from 0 to 5, patients reported a moderate level of perceived control over pain (*X* = 2.82 ± 1.1). Perceived stress was rated with a mean of *X* = 1.5 ± 1.0 on a scale from 0 to 4.

#### Does Music Listening in Daily Life Reduce Perceived Pain Intensity (Hypothesis 1)?

We first examined whether music listening was associated with reduced levels of perceived pain intensity (Model 1a). The unconditional model included only perceived pain intensity as dependent variable at level-1 and number of music episodes at level-2, and no music listening. We then specified the following conditional model: perceived pain intensity levels were modeled as a function of music listening (yes/no; β1j) and a residual component (*r*ij). At the individual level (level-2), both the intercept (β0) and the slope (β1) were modeled as a function of number of music episodes (γ01*,* γ11) and a residual component (u0). However, there were no associations between music listening and perceived pain intensity [*UC* = 1.64, *t*(1918) = 1.001, *p* = 0.317]. As music listening *per se* did not affect pain intensity, we tested whether music characteristics, i.e., perceived valence and arousal of the selected music, influenced perceived pain intensity (Model 1b). We specified the following conditional model: perceived pain intensity levels were modeled as a function of valence (0–100) (β1j), arousal (0–100) (β2j), and a residual component (*r*ij). At the individual level (level-2), the intercept (β0) was modeled as a function of number of music episodes (γ01) and a residual component (u0). Neither valence [*UC* = −0.03, *t*(384) = −0.328, *p* = 0.743] nor arousal [*UC* = −0.09, *t*(384) = −1.221, *p* = 0.223] was associated with any changes in perceived pain intensity. Concerning the reasons for music listening (Model 1c, **Table 1**), 'activation' was the only reason associated with a reduction in perceived pain intensity. 
The reduction in deviance (from the model including 'relaxation,' 'distraction,' and 'reducing boredom' as predictors to the model additionally including 'activation' as predictor) was marginally significant (χ<sup>2</sup> = 12.11, *df* = 6, *p* = 0.059), with 'activation' explaining 1.86% of the variance in perceived pain intensity.
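The conditional model for Model 1a, as described in words above, can be written out as a pair of level-1 and level-2 equations. This is a sketch in standard HLM notation, not taken from the original: the fixed intercepts γ<sub>00</sub> and γ<sub>10</sub>, the subscript *i* for the measurement occasion and *j* for the participant, and the variable labels are assumptions. Because the text names only one level-2 residual (u<sub>0</sub>), the slope equation is shown without a random term.

```latex
\begin{align*}
\text{Level 1:}\quad & \text{PainIntensity}_{ij} = \beta_{0j} + \beta_{1j}\,\text{MusicListening}_{ij} + r_{ij} \\
\text{Level 2:}\quad & \beta_{0j} = \gamma_{00} + \gamma_{01}\,\text{MusicEpisodes}_{j} + u_{0j} \\
                     & \beta_{1j} = \gamma_{10} + \gamma_{11}\,\text{MusicEpisodes}_{j}
\end{align*}
```

Read this way, γ<sub>10</sub> would carry the average within-person association between music listening and pain intensity, and γ<sub>11</sub> would indicate whether that association varies with how often a participant listened to music.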

#### Does Music Listening in Daily Life Increase Control Over Pain (Hypothesis 2)?

Next, we examined whether music listening was associated with increased levels of perceived control over pain. The unconditional model included only perceived control over pain as dependent variable at level-1 and number of music episodes at level-2, and no music listening. The conditional model (Model 2a) examined whether the level of perceived control over pain varied as a function of music listening (yes/no; β1j) and a residual component (*r*ij) at level-1. At the individual level (level-2), both the intercept (β0) and the slope (β1) were modeled as a function of number of music episodes (γ01, γ11), and a residual component (u0). In line with our expectations, music listening was associated with higher levels of perceived control over pain [*UC* = 0.30, *t*(1458) = 3.548, *p* < 0.001]. Furthermore, the number of music episodes influenced the association between perceived control over pain and music listening [*UC* = 0.01, *t*(1458) = 2.047, *p* = 0.041], with those reporting more music episodes experiencing more control over pain. The reduction in deviance was significant (χ<sup>2</sup> = 14.59, *df* = 4, *p* = 0.006), with 0.83% of the variance in control over pain being explained by our predictors.

We then examined whether perceived control over pain varied as a function of the perceived valence and arousal of the selected music (Model 2b). Therefore, we specified a conditional model with valence (0–100) and arousal (0–100) being included as predictors at level-1, and at level-2 the intercept was modeled as a function of number of music episodes. Valence [*UC* = 0.01, *t*(292) = 2.719, *p* = 0.007], but not arousal [*UC* = 0.00, *t*(292) = 0.285, *p* = 0.776], was associated with higher levels of perceived control over pain. The reduction in deviance was significant (χ<sup>2</sup> = 10.78, *df* = 4, *p* = 0.029), with 2.35% of the variance in perceived control over pain being explained. Next, we examined whether reasons for music listening were associated with the level of perceived control over pain. Therefore, we specified a conditional model in which the reasons for music listening ['relaxation' (yes/no), 'activation' (yes/no), 'distraction' (yes/no), and 'reducing boredom' (yes/no)] were included as predictors at level-1 (Model 2c, **Table 1**). We found that 'relaxation' and 'activation' were associated with an increase in perceived control over pain, with 3.79% of the variance in control over pain being explained by reasons for music listening (χ<sup>2</sup> = 17.08, *df* = 6, *p* = 0.009).

#### Does Music Listening in Daily Life Reduce Stress in Patients with FMS (Hypothesis 3)?

##### Subjective Stress

We examined whether music listening was associated with reduced levels of subjective stress (Model 3a). The unconditional model included only subjective stress as dependent variable at level-1 and number of music episodes and a residual component at level-2, and no music listening. In the conditional model, subjective stress levels were modeled as a function of music listening (yes/no; β1j) and a residual component (*r*ij). At the individual level (level-2), both the intercept (β0) and the slope (β1) were modeled as a function of number of music episodes (γ01, γ11) and a residual component (u0). There was no association between music listening and subjective stress [*UC* = −0.12, *t*(1918) = −1.204, *p* = 0.229]. Subsequently, we examined whether valence or arousal was associated with subjective stress, adjusting the conditional model accordingly with valence and arousal at level-1, and only the intercept being modeled as a function of number of music episodes at level-2 (Model 3b). However, neither valence [*UC* = −0.00, *t*(384) = −0.855, *p* = 0.393] nor arousal [*UC* = 0.00, *t*(384) = 0.752, *p* = 0.453] was associated with subjective stress. Concerning the reasons for music listening (Model 3c, **Table 1**), 'activation' was the only reason for music listening which was associated with lower subjective stress levels. The reduction in deviance (calculated as the reduction in deviance from the unconditional model to the conditional model) was significant (χ<sup>2</sup> = 12.75, *df* = 6, *p* = 0.047), with 2.42% of the variance being explained by reasons for music listening.

TABLE 1 | Hierarchical linear models predicting repeatedly measured perceived pain intensity (70 measures; Model 1c), repeatedly measured perceived control over pain (70 measures; Model 2c) as well as repeatedly measured subjective stress (70 measures; Model 3c) by reasons for music listening using full maximum likelihood.


<sup>1</sup>*UC, Unstandardized Coefficients.* <sup>2</sup>*SE, Standard Error.* <sup>3</sup>*SD, Standard Deviation.* <sup>4</sup>*Number of music episodes: total number of music episodes per participant per measurement period (0–53).* <sup>5</sup>*(0/1): 0 = no, 1 = yes.* <sup>6</sup>*VC, Variance Component.*

##### Physiological Parameters of Stress

Music listening was associated neither with sCort [*UC* = −0.06, *t*(1816) = −0.694, *p* = 0.488] nor with sAA [*UC* = −0.01, *t*(1807) = −0.078, *p* = 0.938]. Furthermore, valence and arousal ratings did not affect sCort concentrations [valence: *UC* = 0.00, *t*(352) = 1.034, *p* = 0.302; arousal: *UC* = 0.00, *t*(352) = −1.244, *p* = 0.214] or sAA activity [valence: *UC* = 0.00, *t*(352) = 0.927, *p* = 0.354; arousal: *UC* = 0.00, *t*(352) = 0.743, *p* = 0.458]. None of the reasons for music listening was associated with sCort secretion [*UC* < 0.11, *t*(350) < 1.092, *p* > 0.276] or with sAA activity [*UC* < −0.11, *t*(350) < −0.789, *p* > 0.431].

### Is the Pain-Reducing Effect of Music Listening Mediated by the Biological Stress-Responsive Systems (Hypothesis 4)?

As our analyses above identified relations among control over pain, the reason of 'relaxation,' and the reason of 'activation,' we only tested whether the increase in perceived control over pain when music was listened to for the reason of 'relaxation' and 'activation' was mediated by the biological stress-responsive systems. However, as none of the reasons for music listening was associated either with sAA or with sCort [see Does Music Listening in Daily Life Reduce Stress in Patients with FMS (Hypothesis 3)?], conditions for mediation analyses were not met and we therefore refrained from performing mediation analyses.

## Discussion

We found a beneficial effect of listening to music on how FMS patients coped with pain in daily life: whereas the perceived pain intensity was not affected by listening to music, perceived control over pain was significantly increased after having listened to music. This effect was especially profound in participants who listened to music more often, with an increase in the number of music episodes being associated with an increase in the pain-reducing effect of music listening. The perceived valence and the reasons for music listening ('relaxation' and 'activation' emerged as the most important factors) seem to be especially relevant, as they predicted increases in control over pain. We did not find a specific stress-reducing effect of music in FMS patients. The reason of 'activation' again predicted successful reduction in subjective stress, but the biological stress-responsive systems were not affected by music listening in daily life in these patients. Thus, the pain-reducing effect of music listening did not prove to be mediated by a reduction in levels of stress biomarkers.

Mirroring the heterogeneity in findings on the effects of music listening on pain intensity (Mitchell and MacDonald, 2012), we were not able to find an effect of music listening on pain intensity in daily life. It is assumed that music listening exerts its effects in the central nervous system, which then triggers emotional and cognitive processes as well as alterations in the peripheral nervous system (Koelsch, 2014). Evidence supports the notion that music-induced analgesia is produced by the central nervous system, but does not translate into the peripheral nervous system, i.e., by affecting nociceptive receptors (Garza-Villarreal et al., 2014). Especially as FMS is characterized by a sensitization to pain combined with dysregulated pain-inhibitory pathways, our results support the notion that mere music listening in daily life does not yield improvements in pain intensity in patients with FMS. Although Onieva-Zafra et al. (2013) did find an effect of music listening on pain intensity in patients with FMS, these patients were instructed to listen to music at least once a day for 30 min in the context of a trial. In our study, participants did not receive any instructions regarding their music listening. Our sample listened to music once a day on average while going on with their daily lives; half of our participants listened to music more than once a day and the other half listened to music less than once a day. The effects of an intervention in which patients are guided to listen to music on a daily basis might therefore be beyond the effects of our study, as we did not manipulate the music listening behavior of our participants, but rather studied the short-term effects of music listening as it happens in daily life. More profound effects of music listening might be observable if people were specifically instructed to engage with music on a regular basis.

Pothoulaki et al. (2012) argue that music interventions cannot affect physiological processes in chronic diseases but can yield improvements in quality of life, with increasing perceived control as one mechanism through which music listening has health-beneficial outcomes. Turner et al. (2007) support this notion, as they showed that improvements in pain coping as achieved by cognitive-behavioral therapy (CBT) are mostly mediated by increases in control over pain. We were able to show that control over pain was improved after listening to music – especially for those who reported listening to music more regularly. It is assumed that characteristics both of the music and of the listener contribute to the pain-reducing effect of music listening (Mitchell and MacDonald, 2012; Pothoulaki et al., 2012). Here, we replicated the finding that music rated as high in valence was effective in reducing pain (Roy et al., 2008). However, the arousal of the music did not play a major role, in contrast to evidence from previous studies in which music low in arousal was used as a stimulus (Garza-Villarreal et al., 2014; Picard et al., 2014). To the best of our knowledge, we were the first to show that reasons for music listening seem to be especially relevant to the health-beneficial effect of music listening (Linnemann et al., 2015a). In the current study, especially 'activation' and 'relaxation' predicted an increase in control over pain, suggesting that music listening might lead to successful pain management. At first glance, these two reasons for music listening seem to be contradictory, but these findings can be reconciled: In the management of pain, both relaxation and activation are well-known strategies that have the potential to reduce pain. With regard to activation, CBT therapists instruct patients to reduce avoidance behavior as this is thought to maintain the persistence of chronic pain (Philips, 1987; McCracken and Samuel, 2007). On the other hand, relaxation techniques were shown to be associated with improved coping with pain (Turner et al., 2007). Therefore, both activation and relaxation can lead to improvements in pain management. Interestingly enough, the reason of 'distraction' was not found to decrease pain in our study. This is striking, especially as distraction is thought to be a major contributor to music-induced analgesia (Mitchell and MacDonald, 2012; Pothoulaki et al., 2012). Although we cannot rule out that patients actually perceived distraction while pursuing activation and relaxation when listening to music, distraction in daily life seems to be inefficient for coping with pain in chronic pain patients.

Our model proposes that the health-beneficial effect of listening to music is mediated by a reduction in stress (Thoma and Nater, 2011). There is evidence in the literature that the stress- and pain-reducing effects of listening to music are interrelated (Garza-Villarreal et al., 2014). Some experimental studies have shown that music listening affects both pain and stress (Good and Ahn, 2008; Bauer et al., 2011). However, others have demonstrated that music listening exclusively affects pain management and does not affect stress (Good et al., 2013). Nevertheless, these studies were predominantly conducted in experimental settings, examining acute pain either in healthy controls or in patients undergoing surgery. Furthermore, patients had to listen to music that was mostly chosen by the experimenter. This procedure is questionable, as it is known from the literature that self-selected music exerts the greatest stress-reducing effects (Chanda and Levitin, 2013). Therefore, we chose not to influence the music listening behavior of our participants but examined the effects of mere music listening in daily life. In our study, the pain-reducing effect of listening to music was not mediated by biological stress-responsive systems. This might be due to the already distorted ANS and HPA axis functioning in patients with FMS (Martinez-Lavin, 2002; Tak et al., 2011). Our results suggest that mere music listening may not be capable of impacting these chronic alterations in HPA and ANS functioning. Nevertheless, it remains open to further investigation whether, by means of event-based sampling methods, effects of music listening on acute stressors might be found. Furthermore, it might be possible that a more intense engagement with music (i.e., more music listening, active music-making) might be necessary in order to affect these systems in the body. In this context, music intervention studies set in daily life should examine whether the disease-induced alterations in HPA and ANS functioning can be positively affected.

Although this is the first study to examine the effects of music listening on pain and stress in patients with FMS in an everyday life setting, combining both psychological and biological outcomes, this study is not without limitations. First, the timing of the assessment relative to the music listening does not allow us to draw any conclusions about immediate effects of music listening on pain and stress. As there were, in part, up to 4 h between music listening and the assessments, we cannot rule out that music listening had acute effects on subjective pain and stress reports which were not sufficiently persistent to be captured by the later assessment. Second, we only asked about deliberate music listening. Thus, we cannot rule out that participants were exposed to background music during episodes that we coded as no-music episodes. Future studies are necessary to examine the effects of background music on health-related outcomes. Third, our sample size was moderate, although we did use a repeated measures design, which leads to a high number of observations per participant. Fourth, our hypotheses are based on within-person observations. A comparison group (i.e., participants with acute pain and/or healthy participants) would allow the testing of between-subject hypotheses that examine the potentially different mechanisms underlying the pain-reducing effects of music listening between a population with chronic pain versus a population with acute pain. Finally, as we only assessed women, our results do not allow generalizations to male FMS patients. This should be a focus of future research.

## Conclusion

Fibromyalgia syndrome is a complex and in part poorly understood chronic pain disorder. An optimal management of pain is not yet available, although a multimodal approach including music interventions has been suggested (Onieva-Zafra et al., 2013; Picard et al., 2014). We were able to show that mere music listening in daily life has beneficial effects on control over pain. It seems to be relevant why one listens to music, as in our study, listening to music for the reason of 'activation' or 'relaxation' predicted successful pain coping. We did not find this pain-reducing effect to be mediated by stress-responsive systems. Future studies need to examine acute effects of music listening on pain and stress by means of event-based sampling methods. Furthermore, it remains open to further investigation whether a more intense engagement with music in daily life can positively impact these stress-responsive systems by means of music interventions in daily life.

## Author Contributions

All authors contributed to the study design. Data collection was performed by MK, SF, and JD. JS and SF conducted the bio-chemical analyses of saliva samples. AL, MK, JS, and UN performed the data analysis and interpretation. AL and UN drafted the manuscript, and MK, JS, SF, and JD provided critical revisions. All authors approved the final version of the manuscript for submission. All authors agree to be accountable for all aspects of the work ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

## Acknowledgments

JS, JD, and UN acknowledge funding by the Volkswagen Foundation. UN and SF acknowledge funding by the Swiss National Science Foundation. We thank the University of Marburg for the funding of participant reimbursements and the Universitaetsstiftung of the University of Marburg for funding the bio-chemical analyses. Further, we thank Elvira Willscher for conducting the bio-chemical analyses of saliva samples and Miriam Rauch, Laura Sanchez, Anna Tepe, and Jean Thierschmidt for assistance in data collection.

## References


to Experience as Indicator for Open-Earedness in Young Adulthood? Individual Differences and Stability of Music Preference]. *Musikpsychologie* 24, 198–222.


Woltman, H., Feldstain, A., Mackay, C. J., and Rocchi, M. (2012). An introduction to hierarchical linear modeling. *Tutor. Quant. Methods Psychol*. 8, 52–69.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Linnemann, Kappert, Fischer, Doerr, Strahler and Nater. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Benefits of listening to a recording of euphoric joint music making in polydrug abusers

Thomas Hans Fritz<sup>1,2,3</sup>\*, Marius Vogt<sup>1</sup>, Annette Lederer<sup>1</sup>, Lydia Schneider<sup>1</sup>, Eira Fomicheva<sup>1</sup>, Martha Schneider<sup>1</sup> and Arno Villringer<sup>1</sup>

<sup>1</sup> Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany, <sup>2</sup> Department of Nuclear Medicine, University of Leipzig, Leipzig, Germany, <sup>3</sup> Institute for Psychoacoustics and Electronic Music (IPEM), Gent, Belgium

Background and Aims: Listening to music can have powerful physiological and therapeutic effects. Some essential features of the mental mechanism underlying beneficial effects of music are probably strong physiological and emotional associations with music created during the act of music making. Here we tested this hypothesis in a clinical population of polydrug abusers in rehabilitation listening to a previously performed act of physiologically and emotionally intense music making.

Methods: Psychological effects of listening to self-made music that was created in a previous musical feedback intervention were assessed. In this procedure, participants produced music with exercise machines (Jymmin) which modulate musical sounds.

#### Edited by:

Mari Tervaniemi, University of Helsinki, Finland

#### Reviewed by:

Stefan Elmer, University of Zurich, Switzerland Jennifer Grau-Sánchez, University of Barcelona, Spain

#### \*Correspondence:

Thomas Hans Fritz, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstrasse 1A, 04103 Leipzig, Germany fritz@cbs.mpg.de

Received: 26 December 2014 Accepted: 11 May 2015 Published: 11 June 2015

#### Citation:

Fritz TH, Vogt M, Lederer A, Schneider L, Fomicheva E, Schneider M and Villringer A (2015) Benefits of listening to a recording of euphoric joint music making in polydrug abusers. Front. Hum. Neurosci. 9:300. doi: 10.3389/fnhum.2015.00300

Results: The data showed a positive effect of listening to the recording of joint music making on self-efficacy, mood, and a readiness to engage socially. Furthermore, the data showed the powerful influence of context on how the recording evoked psychological benefits. The effects of listening to the self-made music were only observable when participants listened to their own performance first; listening to a control music piece first caused effects to deteriorate. We observed a positive correlation between participants' mood and their desire to engage in social activities with their former training partners after listening to the self-made music. This shows that the observed effects of listening to the recording of the single musical feedback intervention are influenced by participants recapitulating intense pleasant social interactions during the Jymmin intervention.

Conclusions: Listening to music that was the outcome of a previous musical feedback (Jymmin) intervention has beneficial psychological and probably social effects in patients that had suffered from polydrug addiction, increasing self-efficacy, mood, and a readiness to engage socially. These intervention effects, however, depend on the context in which the music recordings are presented.

#### Keywords: music therapy, agency (psychology), exercise, mentalizing, mood disorders, addiction

Patients with drug-related disorders are known to have a substance abuse related malfunction of the reward system as a consequence of habituation to high levels of reward-mediating neurotransmitters, which deplete faster from the synaptic gap (Koob and Le Moal, 2005; Volkow et al., 2011). It has been shown that one consequence of this neurological malfunction is a diminished self-efficacy (the perception of having control over one's life; Beck and Lindenmeyer, 1997).

Polydrug abusers also often exhibit antisocial behavior and become involved in criminal activity (Gandossy et al., 1980; Prichard and Payne, 2005). They often have deficits in social skills, including the development of stable interpersonal relationships (Batra, 2011), and patients in a drug clinic have been reported to benefit from interventions fostering social skills (Hawkins et al., 1986; Wittchen, 2006). Making music is known to increase group commitment in therapeutic settings (Cassity, 1976; Henderson, 1983), and may be used to improve social competence (Gooding, 2011). It seems that polydrug abusers respond well to music therapy; as such, it is commonly used as a treatment in patients with substance-related disorders (Gallagher and Steele, 2002; Winkelman, 2003). For this purpose, several musical interventions have been described, including rhythm activities (Cevasco et al., 2005), improvisation (Murphy, 1983), song-writing (Freed, 1987), lyric analysis, where participants label passages in musical lyrics with which they identify (Walker, 1995), and playing instruments (Miller, 1970).

A positive attitude to life can have a huge range of other psychological effects that are probably indirectly beneficial to the health and well-being of the individual (Scheier et al., 1989; Karademas, 2006). For therapeutic application it is therefore generally desirable to systematically evoke beneficial memories and associations that lead to a positive attitude. Music listening can evoke such positive and potentially therapeutically beneficial effects through strong emotional responses in music listeners (Sloboda, 1991; Jäncke, 2008), and accordingly music has also been shown to modulate physiological arousal, which has been regarded as a basic dimension of emotion in a variety of dimensional emotion models (Husain et al., 2002).

Listening to a recording of one's own musical performance, however, has been shown to be even more effective at evoking emotional and physiological experiences, through association with the original performance (Sutherland et al., 2009). On a motor level, listening to a previously practiced piano melody has been shown to lead to associations of the movements used to create the respective melodies (Lahav et al., 2005). Such newly created auditory-motor associations have been successfully used in therapy to enhance motor performance in patients (Sutherland et al., 2009), and passively listening to recordings of such musical auditory-motor associations has been shown to increase motor learning (Wan et al., 2011).

It is as yet unclear whether passively listening to recordings of previous performances also has clinically applicable psychological effects, such as those investigated in the current study with polydrug abusers. Here we used a memory induction method, stimulating previous motor and other physiological and psychological experiences during a musical feedback paradigm with Jymmin machines. This intervention allows participants to express themselves musically by exercising on fitness machines that have been modified to transform physical movements into musical sounds and thus allow music to be played interactively in a group. This combination of musical expression and extreme physiological arousal has been shown to create an intense musical and emotional experience that correlates with a decreased sense of exertion (Fritz et al., 2013a) and an enhanced mood (Fritz et al., 2013b).

The present study aimed to assess the psychological effects in polydrug abusers of listening to a recording that they had made 1 week before in an emotionally and physiologically intense Jymmin musical feedback intervention. We hypothesized that, given the physiological and emotional associations with music created during the act of music making, listening to the recording would lead to greater psychological benefits in terms of increased mood and internal locus of control than listening to a similar piece of music that was not self-created.

## Methods

### Participants

Twenty-two participants (19 male) were tested; ages ranged from 20 to 47 in males (mean = 31.11) and from 27 to 43 in females (mean = 37.5). Participants used the fitness machines in groups of three. None of the participants were professional body builders, musicians, or athletes. Clinical data showed that 77.7% of the participants consumed more than two drugs regularly during the pre-clinic period (polydrug use). 63% were involved in criminal activities and therefore incarcerated for between 1 and 96 months. The majority of the investigated population had been sentenced to jail at the time of intervention and were doing the rehabilitation program during their prison sentence (§35 in the German Controlled Substances Act). 48.1% of them had ADHD or related hyperkinetic disorders as a comorbid diagnosis.

Informed consent was obtained from all of the subjects and the experiment was conducted in accordance with the Declaration of Helsinki's ethical principles for research involving humans. It conformed to internationally accepted policy statements regarding the use of human subjects and was approved by the ethics committee of the University of Leipzig.

### Experimental Design

The experiment included two conditions. In the first condition, participants listened to a recording of their previously performed interactive musical feedback session with Jymmin machines. Note that in the current manuscript we use the word Jymmin not in terms of a genre, but to describe the specific configuration used for the musical feedback system. In the second condition (control condition), they listened to a commercially available music piece—a commercial drum and bass track—that was similar in style to the Jymmin pieces. In order to enhance the ecological validity of the control condition, the control stimulus was selected from a variety of commercially available drum and bass tracks so that each participant listened to a different control excerpt. We controlled for the effects of sequence of condition by counterbalancing condition orders. Half of the participants listened to the Jymmin piece first (sequence one), followed by the control piece. The other half listened to the control piece first and then listened to the recording of their own music session (sequence two).

### Experimental Procedure

The participants were recruited from a drug rehabilitation clinic. The musical feedback session was recorded 5 days prior to the current experiment. In this music session, participants worked out on fitness machines (stomach trainer, stepper and cable lat pulldown) that were enhanced with sensors and a sound system to produce musical sounds through workout movement (Fritz et al., 2013a,b). Musical parameters that were modulated by the group were cutoff-filter and pitch of different tracks of a largely predefined musical piece. The music was performed in groups of three for 10 min. The participants had been shown how to perform the movements on the fitness machines in a sport-physiologically correct way, and were given a brief demonstration of the musical sounds each machine could create. They received the instruction: "Please use the fitness machine in a way that you are physically comfortable with". When working out, participants created the musical piece in a group performance. This paradigm had previously been shown to be effective at evoking euphoric musical experiences through a combination of musical expression and physical exertion (for a detailed description, see Fritz et al., 2013a,b).

All participants filled out a questionnaire following each of the two conditions (see Section Experimental Design). No baseline measurements were conducted before presenting either of the two conditions. Participants filled out general information items on gender and age. The questionnaire contained the PANAS scale (Krohne et al., 1996), the internal vs. external locus of control short scale (Kovaleva, 2012), and a subscale of the Multidimensional Mood Questionnaire (MDMQ; Mehrdimensionaler Befindlichkeitsfragebogen; Steyer et al., 1997). The following seven self-designed items were presented after each experimental condition ("When I listen to the music piece, I have positive thoughts", "I feel relaxed", "I would like to perform another 'Jymmin' session", "I am in the mood for exercise", "I feel peaceful", "I felt creatively inspired", "I have negative thoughts"). Six additional self-constructed items were assessed after the Jymmin condition ("When I listen to the music piece, I want to engage in social activities with my former workout partners", "I think of the 'Jymmin' session", "I consider my former training partners to be interesting", "I think of my former training partners", "I consider my former training partners to be nice people", "I want to perform another 'Jymmin' session with my former training partners").

The combined questionnaire was filled out following each of the two conditions. Participants took a break of 5 min after the first condition in order to fill out the questionnaire. After that, they were immediately exposed to the second condition. After the second condition, they were asked to fill out the second questionnaire. All of the participants were given commercially available MP3 players with stereo headphones to listen to the previously recorded musical feedback sessions and the control music piece.

The PANAS scale assesses positive and negative affect using 20 items with a five-point Likert scale from 1 ("hardly or not at all") to 5 ("extremely"). The 20 items consist of adjectives that describe a mental state (e.g., "active", "interested", or "nervous"). Participants rate their current emotional state by choosing the answer that applies best. Within the test, positive and negative affects are interpreted as independent dimensions. The PANAS is a well-established measure of psychological well-being and has been proven valid and reliable, even though the independent two-factorial structure of negative and positive affect has been questioned (Krohne et al., 1996; Crawford and Henry, 2004). However, with an internal consistency (Cronbach's Alpha) of α = 0.84 and α = 0.86 for negative vs. positive affect (Watson et al., 1988), it has good reliability.

The locus of control short-scale measures a self-regulation variable with four items and a five-step Likert scale ranging from 1 ("does not apply") to 5 ("fully applies"). The first and second items measure internal loci (e.g., "If I work hard, I'll succeed."), whereas the third and fourth assess external loci (e.g., "My plans are controlled by fate."). The locus of control concept is a key variable in the social cognitive theory of personality (Rotter, 1954). An internal locus of control indicates the general expectation to have control over pleasant events in one's life (e.g., social recognition, goal achievement) and to avoid unpleasant punishments (e.g., pain, hunger). The external locus of control is a stable belief that pleasant sources of reinforcement cannot be controlled by the individual (e.g., professional failure is unavoidable). A high internal locus of control correlates with the assumption that the individual is personally accountable for their own positive behavioral outcomes, whereas a high external locus goes along with the belief that behavioral outcomes are controlled by external sources such as coincidence or fate. The short-scale measure showed an estimated internal consistency that was sufficient (McDonald's Omega (ω) varying between 0.71 and 0.73; Kovaleva, 2012).

The MDMQ is considered a mental state measurement of current mood (Steyer et al., 1997). The questionnaire contains three bipolar subscales ("good vs. bad mood", "calmness vs. agitation" and "alertness vs. tiredness") each with eight items. Each subscale consists of eight adjectives, of which four belong to the negative (e.g., "bad", "uncomfortable") and four to the positive (e.g., "well", "satisfied") pole. For our study, we only used the "good" vs. "bad" mood subscale. The items are rated on a five-point Likert scale ranging from 1 ("not at all") to 5 ("very"). Despite the short length of the measure (eight items), reliability is high with an internal consistency (Cronbach's alpha) between α = 0.73 and α = 0.89 (Heinrichs and Nater, 2002).

### Data Analysis

The behavioral ordinal-scaled data were analyzed using SPSS 21 (IBM). Missing values were indicated in SPSS and not included in the analysis. In order to obtain mean scores for the subscales of the PANAS, locus of control and mood questionnaire, responses for each subscale were averaged. The PANAS consists of two subscales describing the participant's positive and negative affect. Ten positive items, such as "powerful", "interested", and "excited" were averaged for the subscale of positive affect. Out of the remaining 10 negative items, e.g., "angry", "ashamed", and "hostile", a mean score was formed indicating the negative affect. The first two items of the four-item locus of control scale measure internal locus, whereas the second half of the scale indicates external locus. For each subscale averages were formed. Items of the MDMQ belonging to the negative pole (e.g., "bad" or "uncomfortable") were recoded before data analysis so that higher scores on the negative pole were related to good mood. All of these mean scores range from 1–5, reflecting the 1–5 Likert scale used to rate each item.
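The scoring steps described above can be sketched as follows. This is an illustration rather than the authors' SPSS syntax: the item responses are made-up values, the helper names are hypothetical, reverse-coding is assumed to use the usual 6 − x mapping for a 1–5 scale, and missing items are assumed to be skipped when averaging.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5  # five-point Likert scale used for every item

def reverse_code(score: int) -> int:
    """Flip a 1-5 rating so that 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - score

def subscale_mean(responses):
    """Average the answered items; None marks a missing value and is skipped."""
    answered = [r for r in responses if r is not None]
    return mean(answered) if answered else None

# Hypothetical responses of one participant (not data from the study).
mood_positive_items = [4, 5, 4, 4]   # e.g. "well", "satisfied"
mood_negative_items = [2, 1, 2, 1]   # e.g. "bad", "uncomfortable"
locus_items = [5, 4, 2, 1]           # items 1-2 internal, items 3-4 external

# Negative-pole items are recoded first so that higher always means better mood.
mood_score = subscale_mean(
    mood_positive_items + [reverse_code(s) for s in mood_negative_items]
)
internal_locus = subscale_mean(locus_items[:2])
external_locus = subscale_mean(locus_items[2:])

print(mood_score, internal_locus, external_locus)
```

Because every item is rated from 1 to 5 and negative items are flipped before averaging, each resulting subscale mean also lies between 1 and 5.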

A Kruskal-Wallis H Test was performed to test the effect of condition order (listening to Jymmin music first, listening to control music first) on the mean scores of PANAS, locus of control and MDMQ. Condition order varied between subjects. Follow-up Mann-Whitney U Tests for independent samples were performed for each of the significant subscales to determine the direction of the effect. In order to correct for multiple comparisons, the Bonferroni correction was used and the significance (alpha) level was set to 0.017. A Spearman's correlation matrix was generated to explore correlation effects between self-designed items and scores of locus of control and MDMQ in the Jymmin condition.
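The corrected threshold follows from dividing the nominal alpha by the number of comparisons. The sketch below assumes a nominal alpha of 0.05 and three comparisons; the text reports only the resulting value of 0.017, so both inputs are assumptions. The example p-values are those reported in the Results section.

```python
NOMINAL_ALPHA = 0.05   # assumed family-wise error rate
N_COMPARISONS = 3      # assumed; 0.05 / 3 rounds to the reported 0.017

def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_comparisons

threshold = bonferroni_alpha(NOMINAL_ALPHA, N_COMPARISONS)
print(round(threshold, 3))

# p-values as reported in the Results section for the sequence comparisons.
reported = {
    "internal control, Jymmin condition": 0.010,
    "mood, control condition": 0.010,
    "mood, Jymmin condition": 0.035,
}
for label, p in reported.items():
    verdict = "significant" if p < threshold else "not significant"
    print(f"{label}: p = {p:.3f} -> {verdict}")
```

This makes explicit why a result with p = 0.035 is treated as non-significant here even though it falls below the uncorrected 0.05 level.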

## Results

A Kruskal-Wallis H Test revealed that internal control in the Jymmin condition differed significantly between participants of sequence one (Jymmin song first) and sequence two (control song first), χ<sup>2</sup>(1) = 6.599, p = 0.010. A follow-up Mann-Whitney U Test for independent samples showed that the score of internal control was significantly higher in the Jymmin condition of sequence one (Mdn = 4.50) compared to the Jymmin condition in sequence two (Mdn = 3.50), U = 22, Z = −2.57, p = 0.010, r = −0.55 (see **Figure 1**). However, there were no significant differences in the score of internal control between the Jymmin and control music condition of sequence two.

Furthermore, the Kruskal-Wallis H Test showed that mood differed significantly between participants of the first and second sequence in the control condition (χ<sup>2</sup>(1) = 6.548, p = 0.010). A follow-up Mann-Whitney U Test for independent samples showed that participants of sequence one had an increased mood score (Mdn = 4.50) compared to participants of sequence two in the control condition (Mdn = 4.00), U = 21.5, Z = −2.56, p = 0.010, r = −0.55 (see **Figure 2**). However, participants of sequence one (Mdn = 4.56) and of sequence two (Mdn = 4.19) in the Jymmin condition did not differ significantly in their mood scores at the corrected alpha level (χ<sup>2</sup>(1) = 4.421, p = 0.035).

A Kruskal-Wallis H Test revealed no significant differences with regard to the positive and negative affect (PANAS scale) of participants of the first and second sequence, either in the Jymmin condition (χ<sup>2</sup>(1) = 3.297, p = 0.069 and χ<sup>2</sup>(1) = 0.502, p = 0.478) or in the control condition (χ<sup>2</sup>(1) = 2.121, p = 0.145 and χ<sup>2</sup>(1) = 4.805, p = 0.028).
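The role of the corrected threshold in the Kruskal-Wallis results above can be made explicit with a short sketch. It assumes that the reported alpha of 0.017 corresponds to a conventional 0.05 divided by three comparisons (0.05 / 3 ≈ 0.0167); the text states only the rounded value. The labels are shorthand introduced for this example, while the p-values are those reported above.

```python
FAMILY_ALPHA = 0.05   # assumed uncorrected alpha
N_COMPARISONS = 3     # assumed number of comparisons behind the reported 0.017

corrected_alpha = FAMILY_ALPHA / N_COMPARISONS

# p-values of the Kruskal-Wallis tests as reported in the text.
reported_p = {
    "internal control, Jymmin condition": 0.010,
    "mood, control condition": 0.010,
    "mood, Jymmin condition": 0.035,
    "positive affect, Jymmin condition": 0.069,
    "negative affect, Jymmin condition": 0.478,
    "positive affect, control condition": 0.145,
    "negative affect, control condition": 0.028,
}

# A test counts as significant only if p falls below the corrected alpha.
significant = {label: p < corrected_alpha for label, p in reported_p.items()}

for label, is_sig in significant.items():
    verdict = "significant" if is_sig else "not significant"
    print(f"{label}: p = {reported_p[label]:.3f} -> {verdict}")
```

Under this threshold, p = 0.035 and p = 0.028 fall below the conventional 0.05 but are not treated as significant, which matches the way they are reported.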

A Mann-Whitney U Test for independent samples revealed significant differences between participants of both sequences with regard to the following dependent variable: participants of sequence one reported an increased desire to do sports in general (Mdn = 3.0) compared to participants in sequence two (Mdn = 2.5), U = 26.0, Z = −2.53, p = 0.012, r = −0.54, after listening to the Jymmin music piece.
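The effect sizes r reported alongside the Mann-Whitney tests are consistent with the common conversion r = Z / √N. The sketch below checks this under the assumption N = 22. The total sample size is not stated in this passage, and 22 is simply a value that reproduces the three reported r values to two decimals, so it should be read as illustrative rather than as the study's confirmed sample size.

```python
import math

ASSUMED_N = 22  # assumption: the sample size is not stated in this passage


def effect_size_r(z, n):
    """Convert a standardized test statistic Z into the effect size r = Z / sqrt(N)."""
    return z / math.sqrt(n)


# (Z, reported r) pairs taken from the Mann-Whitney results above.
reported = [(-2.57, -0.55), (-2.56, -0.55), (-2.53, -0.54)]

computed = [round(effect_size_r(z, ASSUMED_N), 2) for z, _ in reported]

for (z, r_reported), r_computed in zip(reported, computed):
    print(f"Z = {z}: reported r = {r_reported}, computed r = {r_computed}")
```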

Spearman's correlation matrices show that participants who felt more content, happy, and comfortable also reported perceiving their training partner as a more likeable (r<sub>s</sub> = 0.722, p < 0.001) and interesting individual (r<sub>s</sub> = 0.702, p < 0.001). Additionally, an increased mood correlated with the desire to take part in another Jymmin session with the same training partners, r<sub>s</sub> = 0.774, p < 0.001, and the desire to perform general activities with one's former training partners, r<sub>s</sub> = 0.695, p < 0.001 (see **Figure 3**).

Spearman's correlation matrices show that participants who had a better mood after listening to their Jymmin recording also reported a higher score on the internal locus of control scale (r<sub>s</sub> = 0.495, p < 0.019; see **Figure 4**).

## Discussion

The present study investigated the psychological benefits of passively listening to self-made recordings of a musical feedback intervention (Jymmin) in polydrug abusers, compared to listening to a control piece of music. Results showed that listening to one's own performance had a positive influence on several self-reported psychological variables, but that these observed effects were largely dependent upon context.

Of primary interest was the finding that listening to self-made music had a positive influence on internal locus of control and on the desire to engage in sports activities. We also found an influence on mood in relation to the participant's readiness to engage socially. We propose two potential factors contributing to these psychological benefits. Firstly, the physical exertion associated with the musical feedback during Jymmin provides a strong physiological experience associated with the music performance. We suggest that the high arousal created by the strenuous performance may, in the course of the previous Jymmin session, have been interpreted as emotional arousal, an association that can later, when listening to the music again, recreate some of the original emotional experience. Secondly, participants strongly recapitulate pleasant social interactions during the Jymmin intervention, as demonstrated by the positive correlation between participants' mood and their desire to engage in social activities with their former training partners after listening to self-made music (see **Figure 3**).

These psychological benefits might be of particular therapeutic use in the context of polydrug abusers. Evidence suggests that polydrug abusers suffer from a decreased control belief and struggle to engage in long-lasting prosocial interactions (Hawkins et al., 1986; Beck and Lindenmeyer, 1997; Peters and Wexler, 2005). It is possible that a self-made music piece acts as a "reminder" of an experience in which the individual was in control, thus re-activating positive thoughts. This might be beneficial for building up positive internal beliefs about one's ability to control one's situation, which may explain the present finding of an increased internal locus of control after listening to self-made music. This idea is further substantiated by the finding that control belief after listening to a recording of a previous Jymmin session correlated with mood such that the higher the control belief after listening to the recording, the better the mood of the participant (see **Figure 4**).

Importantly, our data revealed a powerful influence of context on the aforementioned psychological benefits of the physiological and emotional associations formed during this specific form of musical intervention. The fact that self-made music provided a psychological benefit exclusively when it was presented in the first block is an indication of context dependency. When participants were presented with a control music piece first, the psychological benefits of listening to a recording of the interactively performed music deteriorated. Playing physical instruments like the drum or the bass is usually already associated with body movements. This may partly underlie the finding that even passive music listening can lead to substantial rehabilitation effects in stroke patients (Särkämö et al., 2008), although these rehabilitation effects were probably also due to emotional and motivational effects when patients listened to music they enjoyed. The deterioration of the effect after listening first to other, commercially available music may be due to an activation of already existing associations with the commercially available music. This might lead to a diminished influence of the newly formed associations with one's own musical performance. Listening to the commercially available music may lead to auditory-motor associations and a number of extra-musical symbolic associations that probably strongly determine musical meaning (see e.g., Fritz et al., 2013c). Such associations could, for example, include previous scenarios of drug abuse in environments where music was played. It is unclear if participants might have guessed the experimental manipulation and if this might have biased how they assessed the stimuli.
However, if this were the case then one would expect that such a bias would influence the assessment of stimuli irrespective of context, that is the sequence in which the conditions (experimental condition, control condition) were presented, which was not the case.

We observed a significant difference in mood in the control condition before listening to the Jymmin music (sequence two) and after listening to Jymmin music (sequence one). This effect on mood, measured only in the control condition, is rather unclear, and we can only speculate about possible explanations. The increase in mood in the control condition of sequence one (after listening to the Jymmin recording) may correspond to a sustained mood effect of listening to a recording of Jymmin, similar to effects observed in a previous study where participants actually performed Jymmin (Fritz et al., 2013b). In this previous research it was shown that alterations in mood evoked by Jymmin lasted for a substantial time (more than 15 min) and were therefore probably hormonally mediated (Fritz et al., 2013b). In the current study we did not observe any significant effects on mood after listening to a Jymmin recording (only in relation to the readiness to engage socially, see **Figure 3**). However, we did observe a descriptive difference between median values after listening to the Jymmin recording and listening to control music. It seems plausible that the lack of statistical significance may be due to the relatively small sample size investigated here (especially when comparing the results to previous studies that measured more participants; Fritz et al., 2013a,b).

The data also show a correlation of positive mood and perceiving the training partners as more likeable and interesting. Positive mood furthermore correlated with the desire to take part in another Jymmin session with the same training partners and the desire to perform general activities with the former training partners. This seems to indicate positive social effects of listening to the previously recorded self-made music.

It is yet unclear if the transience of the psychological benefits of listening to self-made music is exclusive to polydrug abusers. However, it is reasonable to speculate that this may relate to physiological adaptations in the reward system of substance abusers (Koob and Le Moal, 2005), such that the sensitivity of previously formed physiological and emotional musical associations to context (sequence in which conditions were performed in the experiment) observed in the present study might reflect a malfunction in the reward system in polydrug abusers.

A limitation of the present study is that while we chose a similar musical style (drum and bass) for the control condition, and while we selected a different commercially available music piece for every participant to ensure ecological validity, it would have been advantageous to control for tempo and other acoustic features, given that these have a strong influence on, for example, arousal and mood (Husain et al., 2002). A further limitation is that the beneficial effect of listening to a recording of Jymmin seemed to be present only when the Jymmin recording was presented without previously presenting other control music, which might constitute a potential problem for validating and implementing the intervention.

In conclusion, listening to a recording of their own music performance seemed to have a positive influence on self-efficacy (internal locus of control), mood, and the desire to engage in sports activities in polydrug abusers in rehabilitation. However, we also observed a strong influence of context such that the effects of listening to the self-made music were only observed when participants listened to their own performance first. The data furthermore seem to indicate positive social effects, showing a correlation between an influence of the music recording on the participants' mood and their desire to engage in social activities with their former training partners, as well as how likeable and interesting these partners were perceived to be. The context dependence of the observed effects is discussed in terms of other motor and symbolic associations of the commercially available control condition music that may temporarily have overwritten the associations created in the previous musical feedback intervention.

## References


Wittchen, H.-U. (2006). Klinische Psychologie and Psychotherapie: Mit 122 Tabellen. Heidelberg: Springer.

**Conflict of Interest Statement**: The Max Planck Society has a pending patent application on the specific combination of musical expression and exercise workout referred to in the manuscript.

Copyright © 2015 Fritz, Vogt, Lederer, Schneider, Fomicheva, Schneider and Villringer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Extreme metal music and anger processing**

#### *Leah Sharman<sup>1</sup> and Genevieve A. Dingle<sup>1,2</sup> \**

*<sup>1</sup> School of Psychology, University of Queensland, Brisbane, QLD, Australia, <sup>2</sup> Centre for Youth Substance Abuse Research, University of Queensland, Brisbane, QLD, Australia*

The claim that listening to extreme music causes anger, and expressions of anger such as aggression and delinquency, has yet to be substantiated using controlled experimental methods. In this study, 39 extreme music listeners aged 18–34 years were subjected to an anger induction, followed by random assignment to 10 min of listening to extreme music from their own playlist, or 10 min silence (control). Measures of emotion included heart rate and subjective ratings on the Positive and Negative Affect Scale (PANAS). Results showed that ratings of PANAS hostility, irritability, and stress increased during the anger induction, and decreased after the music or silence. Heart rate increased during the anger induction and was sustained (not increased) in the music condition, and decreased in the silence condition. PANAS active and inspired ratings increased during music listening, an effect that was not seen in controls. The findings indicate that extreme music did not make angry participants angrier; rather, it appeared to match their physiological arousal and result in an increase in positive emotions. Listening to extreme music may represent a healthy way of processing anger for these listeners.

#### *Edited by:*

*Julian O'Kelly, Royal Hospital for Neuro-disability, UK*

> *Reviewed by: Ricarda I. Schubotz, Westfälische Wilhelms-Universität Münster, Germany; Moshe Bensimon, Bar-Ilan University, Israel*

#### *\*Correspondence:*

*Genevieve A. Dingle, School of Psychology, The University of Queensland, St Lucia, QLD 4072, Australia dingle@psy.uq.edu.au*

> *Received: 17 February 2015 Accepted: 27 April 2015 Published: 21 May 2015*

#### *Citation:*

*Sharman L and Dingle GA (2015) Extreme metal music and anger processing. Front. Hum. Neurosci. 9:272. doi: 10.3389/fnhum.2015.00272*

**Keywords: metal music, anger, emotion processing, heart rate, arousal**

## **Introduction**

Music is a widely available form of media with the ability to influence attitudes and manipulate emotions (Juslin and Sloboda, 2010; Wheeler et al., 2011), and listeners are drawn to music that reflects or improves their emotional state (Saarikallio, 2011; Thoma et al., 2012; Papinczak et al., 2015). Heavy metal, emotional (emo), hardcore, punk, screamo, and each of their subgenres form the category of "extreme" music. Extreme music is characterized by chaotic, loud, heavy, and powerful sounds, with emotional vocals, often containing lyrical themes of anxiety, depression, social isolation, and loneliness (Shafron and Karno, 2013). Perhaps due to these musical characteristics, it has been claimed that extreme music leads to anger, and expressions of anger such as aggression, delinquency, drug use, and suicidal acts (Selfhout et al., 2008). Certainly, evidence is available regarding the effect of a listener's emotional state on their choice of and preference for music, even when angry. Research on anger processing has found that approach motivation (defined as the impulse to move forward) may be activated by anger (Carver and Harmon-Jones, 2009), such that after experiencing anger we then look to act out approach-motivated behaviors, for example, angry facial expression and physical retaliation. Considering the highly arousing nature of the music, along with negative themes commonly contained in the lyrics, extreme music has been interpreted as *eliciting* anger among its listeners, which in turn may activate aggressive behaviors (Gowensmith and Bloom, 1997). It is equally plausible, however, that extreme music may be chosen when a listener is angry, because the arousing nature of the music may match the already present internal arousal of the listener and allow him/her to explore and process this emotional state.
This study will explore these alternative hypotheses about the influence of extreme music listening on anger processing in a sample of extreme music listeners under controlled experimental conditions.

## **Extreme Music**

Extreme music genres began to emerge in the early 1970s with the decline of the "free love" and optimistic culture of the 1960s (Stack et al., 1994). Due to the consequences of the 1960s era of drug experimentation, decline of peaceful protest movements, and the continuation of the Vietnam War, angry and pessimistic themes began to emerge in new genres of music (Reddick and Beresin, 2002). Thus, punk and heavy metal music were dedicated to notions of anarchy and destruction (Stack et al., 1994; Reddick and Beresin, 2002; Lozon and Bensimon, 2014). Following the rise of punk and heavy metal, a range of new genres and subgenres surfaced. Hardcore, death metal, emotional/emotional-hardcore (emo), and screamo appeared throughout the 1980s, gradually becoming more a part of mainstream culture. Each of these genres and their subgenres are socio-politically charged and, as mentioned earlier, are characterized by heavy and powerful sounds with expressive vocals.

At the forefront of controversy surrounding extreme music is the prominence of aggressive lyrics and titles, such as "Pure Hatred" by Chimaira and "Violent Revolution" from the band Kreator. In a series of five experiments involving first year psychology students and student volunteers (unselected in terms of demographic characteristics or musical preference), Anderson et al. (2003) played musically equivalent songs with and without violent lyrics to the participants. They found that listening to songs with violent lyrics increased participants' state hostility relative to listening to non-violent songs. However, this effect was fleeting and it was disrupted when the participants did intervening tasks. Other research shows that lyrical content is one of the mechanisms linking music with emotional response, although many other musical variables, contextual variables, and individual listener variables also play a role (Juslin and Västfjäll, 2008; Juslin et al., 2008).

The powerful vocals that exist in the most extreme genres such as screamo, where nearly all lyrics are screamed at the listener, may account for the perception by outsiders that this music is angry. From this stems a stereotype that extreme music fans, and especially heavy metal fans, are more aggressive, agitated, and more aroused than the general public (Arnett, 1991; Alessi et al., 1992). Furthermore, extreme music has been held responsible for social problems like depression, suicide, aggressive behavior, and substance misuse (Shafron and Karno, 2013). Some researchers have used the term "problem music" in reference to these genres, meaning music that is associated with psychological vulnerability and social deviance (North and Hargreaves, 2006; Bodner and Bensimon, 2014; Lozon and Bensimon, 2014). In the case of substance use, for example, a correlational study of 7,324 Dutch adolescents found that when all other factors were controlled, preferences for punk/hardcore, techno/hardhouse, and reggae music were associated with more substance use, whereas preferences for pop and classical music were linked to less substance use. A preference for rap/hip-hop only indicated elevated smoking among girls and, interestingly, a preference for heavy metal was associated with *less* smoking among boys and *less* drinking among girls (Mulder et al., 2009). This evidence does not support a causal view. Extreme music typically does not contain themes of illicit drug use, although some songs do contain lyrics related to alcohol use. Indeed, the movement known as "straight edge" is a subgenre of hardcore punk, whose adherents refrain from using alcohol, tobacco, and other recreational drugs. Furthermore, there are documented examples of rap music being used in therapeutic ways with samples of people who misuse substances (Baker et al., 2012; Lightstone, 2012).

A review by Baker and Bor (2008) found a relationship between various genres of music and antisocial behaviors, vulnerability to suicide, and drug use among young people. However, there was no evidence in these studies for a causal link, and it was instead suggested that music preference is a reflection of emotional vulnerability in these young listeners. More recently, Bodner and Bensimon (2014) investigated personality traits and uses of music to influence emotions among 548 middle class university students aged 18–43 years, who were subdivided into two groups based on their preference for "problem music" genres (*N* = 255 fans of heavy metal, punk, alternative rock, hip-hop, and rap) or "non-problem music" (*N* = 293 who did not endorse any of these in their top three musical genres). There were no differences between the two samples across the big five personality dimensions (extraversion, neuroticism, openness to experience, conscientiousness, and agreeableness). In terms of uses of music to influence emotions, there were no differences between groups in their use of music for entertainment and strong sensation; however, there were small differences in use of music for revival, diversion, emotional discharge, mental work, and solace. In each case, the problem music fans used music for emotion regulation slightly more than the non-problem music fans. The authors interpreted their findings to mean that listening to these types of music allows problem music fans to regulate their mood in a more sublimated way, instead of externalizing negative emotions, which in turn could lead to engaging in antisocial acts.

### **Extreme Music and Anger**

Some evidence is available regarding the effect of listeners' emotional states on their choice and preference for music listening when angry. Shafron and Karno (2013) examined music preferences in a sample of 551 university students and divided the sample into two groups: those who preferred heavy metal and hard rock genres (57%) and the rest. The heavy music fans showed significantly higher symptoms of depression and anxiety than the non-fans; however, there was no difference between the two groups on trait anger. Gowensmith and Bloom (1997) found that heavy metal fans did not show an increase in anger after listening to heavy metal music. In this study, heavy metal music was highly arousing to both fans and non-fans, and in fact, measured state arousal was greater among heavy metal listeners. Despite the arousing influence of the music, heavy metal fans displayed no difference in self-reported anger whether they were listening to a non-preferred music genre (country) or heavy metal. Non-fans, on the other hand, did display greater self-reported anger after listening to heavy metal. It is unclear whether the non-fans were angry as a result of the musical characteristics, or because they were being asked to listen to something they did not enjoy. So, although there is evidence that heavy metal increases state arousal (Stack et al., 1994; Gowensmith and Bloom, 1997), there is as yet insufficient evidence that it causes increased anger.

In a more naturalistic study, Labbé et al. (2007) found that after experiencing a state of induced stress or anger, participants listening to classical music chosen by the experimenter or their own self-selected "calming" music (of any genre) showed significant reductions in anger and anxiety. These reductions were evident in both self-reported ratings and in reduced physiological arousal (heart rate, respiration, and skin conductance) during music listening. In contrast, participants who listened to heavy metal after the stress induction did not reduce self-reported negative emotional states or physiological arousal. However, it is important to note that heavy metal was not a preferred music genre for these participants. This finding highlights the importance of personally selected music in determining the emotional response. Although this research suggests that a song considered relaxing by the listener should reduce anger and stress in the presence of a stressor, it remains to be seen whether this effect generalizes to extreme music genres.

#### **Considering the Case of Music and Sadness**

Related research on another negatively valenced emotion, sadness, might help to shed some light on music and anger processing. Some studies show that people listen to sad music when they are sad in order to improve their mood (Saarikallio and Erkkila, 2007). For instance, Papinczak et al. (2015) showed in both qualitative and quantitative studies with participants aged 15–25 years that they used music to immerse in negative moods such as sadness – a strategy that helped to process their sadness and to feel better. Similarly, a study of 65 adults from five countries found that when they were feeling sad, sad music helped these individuals to connect with their emotions through the music to fully experience sadness and consequently improve their affect (Van den Tol and Edwards, 2013). Despite evoking sadness, Finnish university students reported that they enjoy listening to sad music, and this effect was partly explained by personality traits such as openness to experience and empathy (Vuoskoski et al., 2012). On the other hand, some studies have reported that listening to sad music results in a more depressed mood among participants (Chen et al., 2007; Dillman Carpentier et al., 2008; Garrido and Schubert, 2015) – an effect that may be related to participants' use of maladaptive emotion regulation strategies such as rumination. So, the influence of negatively valenced music on listeners appears to depend on the listening context, their current mood, and moderation by other personality traits.

#### **Study Aims and Hypotheses**

To summarize the literature reviewed here, research on music and emotion supports the function of music to convey and elicit strong emotion. However, to date there has been a limited amount of research on extreme music genres and anger, with the exception of correlational studies showing an association, and one series of experiments claiming that listening to extreme music increases state hostility (Anderson et al., 2003). Thus, the current study sought to explore this question by recruiting extreme music listeners for an experimental study on the effects of extreme music listening (compared to a no music control condition) on anger processing. Given that personally selected music is capable of determining emotional responses (Labbé et al., 2007), participants were asked to bring along their personal music players to the experiment. In contrast to Labbé and colleagues' study in which the participants were instructed to bring along music that they found relaxing, in the current study participants were allowed to listen to any music from their personal listening device that they preferred at the time.

Anger was operationalized in this study in terms of both subjective ratings of hostility and irritability and physiological recording of heart rate, which were expected to increase when participants experienced an increase in anger. The cardiovascular system is complex and has multiple regulatory subsystems from central and peripheral autonomic nervous systems and humoral influences (Bernison et al., 2007). Resting heart rate may be influenced by an individual's age, aerobic fitness, posture, and activity levels. This is less of a concern with within-subjects designs such as the one used in the current study, where the participant-related factors are kept constant while the experimental factor (e.g., music listening or silence) is varied. Nevertheless, an increase in heart rate may reflect various psychological states including anger, stress, excitement, or fear. Heart rate should therefore be interpreted in combination with participants' subjective ratings (of these psychological states) for a more accurate assessment of emotional response (Bernison et al., 2007).

According to the "problem music causes anger" line of reasoning, extreme music listeners who are angry would be expected to experience an *increase in anger* during music listening (as shown in an increase in heart rate during music listening and an increase in subjective anger ratings immediately following music listening). Thus, the first two hypotheses for investigation are:

Hypothesis 1a: that on a self-report measure of music and emotions, participants will endorse the statement that they listen to extreme music to fully experience their anger but will *disagree* with the statement that they listen to music to calm themselves down when feeling angry; and

Hypothesis 1b: that the participants' subjective ratings and physiological measure (i.e., heart rate) of anger will increase during the anger induction and *will continue to increase* during music listening, and relative to participants in the no music (control) condition.

Another body of research indicates that listeners are drawn to music that is concordant with their current emotional state, and are able to use music as an emotion regulation technique (Saarikallio, 2011; Thoma et al., 2012; Papinczak et al., 2015). According to this "music regulates anger" line of reasoning, angered extreme music fans would be expected to listen to music that matches their anger and helps them to process it and feel better. Further, in their review on problem music, Lozon and Bensimon (2014) also concluded that listeners of music containing themes of aggression and suicidal ideation seemed to feel alleviated of angst and aggression after listening. Thus the alternative hypotheses are:

H2a: that on a self-report measure of music and emotions, participants will *agree* with the statements that they listen to music to fully experience anger, and that listening to music helps them to calm down when they are angry;

H2b: that the participants' subjective ratings and physiological measure (i.e., heart rate) of anger will increase during the anger induction but *will not continue to increase* during music listening, and relative to participants in the no music (control) condition.

H2c: that, in accordance with the idea that extreme music may be a method for processing anger, participants in the music listening condition will feel better after music listening compared to the no music control participants, as shown by their endorsement of positively valenced emotions such as "relaxed" and "inspired."

A secondary aim for the study was to analyze what the participants in the music condition selected from their own playlists to listen to when they were angry. This analysis will investigate the features of their chosen music in terms of genre, whether the songs contained angry lyrics, and the speed of tempo (beats/min).

H3: it was predicted that angry participants would select extreme music from their playlists that matched their anger in terms of high tempo and angry lyrics.

## **Materials and Methods**

#### **Participants**

There were 40 people recruited to the study; however, one person's data were unusable so the final sample consisted of 39 participants (72% male), with ages ranging from 18 to 34 years (*M* = 22.36, SD = 3.19 years). Advertisements requested participants for a study of the potential benefits of extreme music listening. They specified that participants should enjoy one or more extreme genres of music, such as heavy metal, punk, hardcore, and screamo, and listen to these at least 50% of the time they chose to listen to music. When individuals confirmed their participation, they were asked to bring along their personal music listening device to the laboratory. Three quarters of the participants (74%) were born in Australia, with the remainder born in New Zealand, USA, New Caledonia, South Africa, Indonesia, Sweden, and Oman. Seven participants were recruited via the online recruitment site (SONA) at the University of Queensland, receiving course credit for respective first year psychology courses. The remaining participants were recruited from the wider community via word of mouth and advertising on social media and community websites. They received a \$10 iTunes voucher as compensation for their time and interest.

With regard to musical involvement, 41% of the participants currently played a musical instrument or sang, 51% attended live concerts on a regular basis (at least once a month), 44% composed music, and 23% had taught music, although it was not the same subsample engaging in all of these musical activities. Of the six activities included in the questionnaire, participants engaged in an average of three, which is similar to other research conducted in unselected adult samples (*authors, unpublished research*). The average number of years playing an instrument or singing was 6.19 years (SD = 5.22 years). The most commonly reported musical preferences were: classic metal 60%, death metal 17.5%, progressive metal 15%, punk 12.5%, power metal 7.5%, melodic metal 7.5%, folk metal 5%, black metal 5%, thrash metal 5%, death core 5%, and hard core 5%. Note that, as most participants indicated more than one preferred genre, the overall figure is above 100%. **Table 1** shows means and SDs on the demographic, musical, and mood variables for the two conditions (music listening and control), and *t*-tests indicated no differences between the two conditions on these variables.

#### **TABLE 1 | Sample characteristics of participants in the music and the control conditions**.

#### **Procedure**

Participants were randomly assigned to either the music or control condition before the study began. To avoid extraneous influences on heart rate, participants were asked to refrain from smoking, exercise, and drinking caffeinated and alcoholic beverages for at least 3 h before participating (this was checked with questions in the questionnaire). For the baseline heart rate recording, participants were given a diagram and instructions on how to attach their recording electrodes, and then asked to sit silently for 5 min and "not to think about anything in particular." Following this, participants were asked to complete the first set of Positive and Negative Affect Scale (see PANAS in Measures) questions (T1). The experimenter then conducted the 16-min anger interview. Following this, participants completed the second set of PANAS questions (T2). Those assigned to the music condition were instructed to select song(s) of their preference from their personal music device, and were instructed to listen for 10 min. Although all participants were asked to bring their music devices to the experiment, this was the first moment that participants were told they would be listening to music. This was done to ensure that participants would select songs that they would typically listen to when feeling angry. Participants in the control condition were asked to "wait quietly for the next part of the experiment" and sat in silence for the next 10 min. All participants then completed the PANAS items for a third time (T3) followed by a structured interview about the emotional influence of music and the final questionnaires, which included the emotional influence of music questions, DASS, and demographic and musical involvement questionnaire (refer to measures). Participants were then debriefed. The average time for experiment completion was 50 min. Ethical clearance for the procedures and materials was granted through the university ethics committee.

#### **Measures**

#### Demographics and Musical Involvement

Participants responded to demographic questions such as age and gender. Participants' musical background and current musical involvement were assessed in a questionnaire consisting of seven dichotomous (yes/no) questions, such as "do you attend concerts or live music on a regular basis (i.e., at least once a month)?" Participants were also asked to identify the number of years (to the nearest 6 months) that they had played an instrument or sung during their lifetime. These seven items have been used in previous research by the authors and colleagues (*blinded for review*), and found to have good internal consistency (Cronbach's α = 0.76).

#### Physiological Measure of Emotion

Heart rate was recorded according to published guidelines (Berntson et al., 2007): 10-mm pre-gelled Ag/AgCl disposable electrodes were attached over the lower rib on the left side of the torso and to the participant's chest on both the right and left to record a lead III electrocardiogram (ECG). These leads were attached to an MP150 Biopac ECG system. Signals were digitized at 1000 Hz and saved for offline analyses. Heart rate, expressed as the number of beats per min (bpm), was sampled across the following time periods: 5 min baseline, 12 min anger induction, 10 min music or no music listening, and the final 2 min of music listening or silence, to yield an average heart rate (bpm) for each segment.
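The segment averaging described above can be sketched in a few lines. This is a minimal illustration only, assuming that R-peaks have already been detected and are available as sample indices at the 1000 Hz rate stated above; the function name `mean_bpm`, its parameters, and the synthetic beat train are hypothetical and do not represent the authors' analysis pipeline.

```python
FS_HZ = 1000  # sampling rate reported in the text


def mean_bpm(r_peak_samples, start_s, end_s, fs_hz=FS_HZ):
    """Average heart rate (bpm) for beats falling in [start_s, end_s).

    r_peak_samples: sorted R-peak positions as sample indices (assumed input).
    """
    times = [s / fs_hz for s in r_peak_samples]
    beats = [t for t in times if start_s <= t < end_s]
    if len(beats) < 2:
        raise ValueError("need at least two beats in the segment")
    # Inter-beat intervals in seconds between consecutive R-peaks.
    ibis = [later - earlier for earlier, later in zip(beats, beats[1:])]
    return 60.0 / (sum(ibis) / len(ibis))


# Synthetic example: one beat every 800 samples (0.8 s) for 10 min.
synthetic_peaks = list(range(0, 600_000, 800))
# Hypothetical 5-min baseline window starting at time zero.
baseline_bpm = mean_bpm(synthetic_peaks, 0, 300)
```

Calling the same function with different window boundaries would give one average per experimental segment, which is the form of the heart rate values entered into the analyses.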

#### Modified Positive and Negative Affect Scale

Repeated measures of participants' subjective emotional state were assessed with a modified version of the PANAS (Watson et al., 1988) using the time instructions "at the moment". Participants were instructed to indicate how they felt at that very moment and rate 10 emotional words on a 5-point Likert scale from 1 (very slightly or not at all) to 5 (extremely). Five emotions had positive valence (e.g., "inspired" and "enthusiastic"), and five emotions had negative valence (e.g., "irritable," "hostile," and "guilty").

#### Emotional Influence of Music

Participants were asked nine dichotomous (yes/no) questions during a structured interview, regarding the extent to which they listened to extreme music in order to change an emotion (e.g., "*when you are sad, do you listen to music that improves your mood?*") or to fully experience an emotion (e.g., *"when you are angry, do you listen to music to fully experience that anger?").* The emotions were: happy, sad, angry, and anxious, with two extra items relating to "in love" and "well-being". This questionnaire was adapted from a Likert-type scale version used in an international survey of 394 adults (*authors blinded for review*), which found an adequate internal consistency of items on the Change Emotions subscale of α = 0.73 and on the Experience Emotions subscale of α = 0.71 (Papinczak et al., 2015).

#### Depression, Anxiety, and Stress Scale

The Depression, Anxiety, and Stress Scale (DASS) is a 42-item questionnaire assessing symptoms of depression, anxiety, and stress over the past week (Lovibond and Lovibond, 1995). Questions were measured on a 4-point Likert-type scale from 0 (did not apply to me at all) to 3 (applied to me very much, or most of the time), with seven items summed to produce each of the three subscale scores: depression, anxiety, and stress. In our sample, the internal consistency values were: depression (α = 0.91), anxiety (α = 0.84), and stress (α = 0.90). Assessment of the DASS (42) on a non-clinical sample (*N* = 1771) (Crawford and Henry, 2003) found means for depression, anxiety, and stress to be 5.55 (SD = 7.48), 3.56 (SD = 5.39), and 9.27 (SD = 8.04), respectively.
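The internal consistency values reported here are Cronbach's α coefficients. As a reminder of what that statistic computes, the following is a minimal sketch of the standard formula, α = k/(k − 1) × (1 − sum of item variances / variance of the total score), where k is the number of items. The function name and the toy response matrix are hypothetical and unrelated to the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: four respondents answering three items on a 0-3 scale.
toy_responses = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3]]
alpha = cronbach_alpha(toy_responses)
```

Values closer to 1 indicate that the items of a subscale vary together, which is the sense in which the subscales above are described as internally consistent.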

#### Anger Interview

The stress interview proposed by Dimsdale et al. (1988), and modified by Lobbestael et al. (2008), was used for anger induction. The interview involved participants describing, over a period of 16 min, one or more events that produced a strong feeling of anger. Participants were presented with a list of topics to help prompt their recall of angering scenarios, based on those used by Dimsdale et al. (1988), such as "partner/spouse," "work/work colleagues," and "finances." Other researchers, such as Burns et al. (2003) and Malatesta-Magai et al. (1992), have demonstrated the effectiveness of this technique in their respective studies, finding effects with only a 10-min interview.

#### Music and Headphones

As mentioned, participants were asked to bring in their personal music players to the laboratory, and those in the music listening condition were asked to play music from their own collection during the listening phase. Those in the experimental condition listening to their preferred music were provided with Sennheiser HD201 closed headphones.

## **Results**

## **Self-Report Results**

The means and SDs on the DASS (**Table 1**) show that symptoms of depression, anxiety, or stress were in the normal range, and there were no differences between participants in the two conditions. Responses to the nine questions about extreme music influence on emotions are displayed in **Table 2**. A majority agreed with the statements that they listened to extreme music to fully experience anger (79%) and to calm themselves down when feeling angry (69%). They also listened to extreme music to improve other negative moods such as sadness (74%) and less commonly, anxiety (33%). An overwhelming majority stated that they listen to extreme music to enhance their happiness (87%) and to enhance their well-being (100%).

**TABLE 2 | Proportion (%) of participants responding "yes" to the music and emotional influence items**.


## **Experimental Results**

Means and SD on all of the emotion measures for participants in the two conditions are displayed in **Table 3**.

#### **Heart Rate Analyses**

A 2 (Condition: Music vs. Silence) × 3 (Time: baseline, after anger induction, after music listening/silence) mixed repeated measures ANOVA was conducted to assess changes in heart rate during experimental periods (see **Figure 1**). A significant main effect of Time was revealed, *F*(2, 74) = 8.54, *p* < 0.001, $\eta_p^2$ = 0.18. There was no main effect of Condition; however, a significant Condition × Time interaction was found, *F*(2, 74) = 6.36, *p* = 0.003, $\eta_p^2$ = 0.15. Tests of the simple effect of Time within each Condition revealed a significant effect in the Silence condition, *F*(2, 18) = 11.02, *p* = 0.001, $\eta_p^2$ = 0.55. Simple comparisons revealed no significant difference between Time 1 and Time 2, or between Time 1 and Time 3. However, heart rate at Time 2 was significantly higher than at Time 3 (*p* = 0.001). A significant simple effect of Time within the Music condition was also observed, *F*(2, 17) = 5.73, *p* = 0.013, $\eta_p^2$ = 0.40. Simple comparisons found a significant difference in heart rate between Time 1 and Time 2, *p* = 0.008, where heart rate increased during the anger induction. Surprisingly, no significant difference was found between Time 1 and Time 3, *p* = 0.068. There was also no significant difference among music listeners between heart rate at Time 2 and Time 3, *p* > 0.999, indicating that the increased heart rate following the anger induction was sustained for the music listeners, but not for those in the silence condition.
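The effect sizes accompanying each *F* statistic above appear to be partial eta squared. When only *F* and its degrees of freedom are available, partial eta squared can be recovered from the standard identity (*F* × df effect) / (*F* × df effect + df error). The sketch below applies that identity to three of the reported tests; the function name is hypothetical, and the check is offered as an illustration, not as part of the authors' analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom taken from the heart rate analysis above.
interaction = partial_eta_squared(6.36, 2, 74)      # Condition x Time interaction
silence_effect = partial_eta_squared(11.02, 2, 18)  # simple effect of Time, Silence
music_effect = partial_eta_squared(5.73, 2, 17)     # simple effect of Time, Music
```

Rounded to two decimals, these three results match the reported values of 0.15, 0.55, and 0.40.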

#### Subjective Ratings

A series of 2 (Condition: Music vs. Silence) × 3 (Time) mixed ANOVAs were conducted to compare PANAS self-reported emotions with Condition as the between-subjects factor. Self-reported ratings relevant to anger (PANAS hostile, irritable, and stress) were analyzed. To assess discriminant changes, the positively valenced emotions relaxed, active, and inspired were also analyzed. Where sphericity assumptions were not met, Greenhouse-Geisser corrected tests are reported. Means and SDs for self-reported emotions at each time point for each condition are shown in **Table 3**.
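A Greenhouse-Geisser correction multiplies both degrees of freedom of a within-subjects *F* test by an estimate ε of the departure from sphericity (ε = 1 means no correction), which is why some of the tests below carry fractional degrees of freedom. The sketch shows only that scaling step, assuming the design described here (39 participants, 2 conditions, 3 time points). The function name is hypothetical, and the ε in the example is back-calculated from the degrees of freedom reported for the hostile ratings, not taken from the text.

```python
def gg_corrected_df(epsilon, n_levels, n_participants, n_groups):
    """Scale the uncorrected mixed-ANOVA df for a within-subjects effect by epsilon."""
    df_effect = n_levels - 1
    df_error = (n_participants - n_groups) * (n_levels - 1)
    return epsilon * df_effect, epsilon * df_error


# Uncorrected df for a 2 x 3 mixed design with 39 participants.
uncorrected = gg_corrected_df(1.0, n_levels=3, n_participants=39, n_groups=2)

# Hypothetical epsilon, inferred as 55.63 / 74 from the hostile ratings result.
epsilon_hostile = 55.63 / 74
corrected = gg_corrected_df(epsilon_hostile, n_levels=3, n_participants=39, n_groups=2)
```

Estimating ε itself requires the covariance matrix of the repeated measures, which is not available from the summary statistics reported here.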

#### *Hostile*

A significant main effect was found for Time, *F*(1.50, 55.63) = 27.48, *p* < 0.001, $\eta_p^2$ = 0.43. There was no main effect for Condition and no Condition × Time interaction. Pairwise comparisons of time points revealed no significant difference between Time 1 and Time 3; however, significant differences were observed between Time 1 and Time 2 (*p* < 0.001), and between Time 2 and Time 3 (*p* < 0.001). The greater ratings of hostility at Time 2, compared to Time 1 and Time 3, indicated that the anger induction worked, and that both music listening and silence resulted in decreased hostility. However, music listening was no different to silence (see **Figure 2**).

#### **TABLE 3 | Means and SDs for heart rate and PANAS ratings**.

#### *Irritable*

A similar pattern of results emerged for PANAS irritable ratings. A significant main effect was revealed for Time, *F*(1.69, 62.45) = 22.62, *p* < 0.001, $\eta_p^2$ = 0.38, with no significant main effect for Condition and no Condition × Time interaction. Pairwise comparisons of Time found no difference between Time 1 and Time 3; however, there were differences between Time 1 and Time 2 (*p* < 0.001), and between Time 2 and Time 3 (*p* < 0.001), such that greater ratings of irritability were observed at Time 2 compared to Time 1 and Time 3.

#### *Stress*

Baseline ratings of PANAS stress were higher than those for hostile and irritable, although the pattern of changes across time was consistent for the three PANAS emotions. A significant main effect of Time was found, *F*(1.69, 62.61) = 28.98, *p* < 0.001, $\eta_p^2$ = 0.54, with no main effect of Condition and no Condition × Time interaction. Pairwise comparisons of Time found no difference between Time 1 and Time 3; however, the difference between Time 1 and Time 2 was significant (*p* < 0.001), as was the difference between Time 2 and Time 3 (*p* < 0.001), with greater ratings of stress at Time 2 compared to Time 1 and Time 3.


#### *Relaxed*

An inverse pattern of results was found for the PANAS relaxed ratings (see **Figure 3**). No main effect was observed for Condition, and there was no Condition × Time interaction. A significant main effect of Time, however, was found, *F*(2, 74) = 22.62, *p* < 0.001, $\eta_p^2$ = 0.38. Pairwise comparisons for Time found no significant difference between Time 1 and Time 3. However, there were differences observed between Time 1 and Time 2 (*p* < 0.001), and between Time 2 and Time 3 (*p* < 0.001), with participants reporting less relaxation at Time 2 compared to Time 1 and Time 3.

#### *Active*

A significant main effect was revealed for Time, *F*(2, 74) = 20, *p* < 0.001, $\eta_p^2$ = 0.36, modified by a Condition × Time interaction, *F*(2, 74) = 6.98, *p* = 0.002, $\eta_p^2$ = 0.16. No main effect for Condition was found. Tests of the simple effects of Time at each Condition were also conducted. The Silence group displayed a significant simple effect of Time, *F*(2, 38) = 15.25, *p* < 0.001, $\eta_p^2$ = 0.45, and this was located between Time 1 and Time 2 (*p* = 0.002), and between Time 2 and Time 3 (*p* < 0.001), with participants feeling more active at Time 2 compared to Time 1 and Time 3 in the Silence condition. In the Music condition, the simple effect of Time was also significant, *F*(2, 36) = 12.54, *p* < 0.001, $\eta_p^2$ = 0.41. The key differences were found between Time 1 and Time 2 (*p* = 0.003), and between Time 1 and Time 3 (*p* < 0.001), with music listeners feeling more active after the anger induction and remaining active after music listening.

#### *Inspired*

A significant main effect of Time was revealed, *F*(2, 74) = 4.74, *p* = 0.012, $\eta_p^2$ = 0.11, as well as a significant Condition × Time interaction, *F*(2, 74) = 7.22, *p* = 0.001, $\eta_p^2$ = 0.16. No main effect was found for Condition. A pairwise comparison for Time found no significant difference between Time 1 and Time 2, or between Time 1 and Time 3. However, inspiration ratings were greater at Time 3 compared to Time 2, *p* = 0.022 (see **Figure 4**). The simple effects of Time at each Condition revealed no effects for Silence. The Music group, however, displayed a significant effect of Time, *F*(2, 36) = 10.71, *p* < 0.001, $\eta_p^2$ = 0.37. A simple comparison found no significant difference between Time 1 and Time 2. However, significant differences were observed between Time 1 and Time 3 (*p* = 0.021), and between Time 2 and Time 3 (*p* = 0.002), indicating that participants felt inspired after listening to their music.

#### Analysis of Music Selections

An analysis of the 46 pieces of music that the participants chose to listen to when angry is displayed in **Table 4**. One song was removed because it could not be found; it was assumed that the song title and/or artist were incorrectly recorded. Of the accurate song titles provided, 100% of the music chosen was classified as "extreme" genre. Of the songs containing lyrics, only 50% contained aggressive themes or conveyed lyrics relating to anger, with the remaining songs containing lyrical themes including, but not limited to, isolation and depression. The tempo of the selected songs ranged from 80 to 181 beats per min, with over half (61%) of selections >100 bpm, representing high tempo music expected to have an energizing or arousing effect on the listeners. In all, less than a third of musical selections (28%) were both high arousal (>100 bpm) and contained themes of anger or aggression.
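The tempo and lyric tallies above amount to two simple classifications per song. A minimal sketch of that bookkeeping follows, assuming each selection is recorded with its tempo and a yes/no coding for angry lyrics. The `Song` record, its field names, and the four example entries are hypothetical and are not taken from Table 4.

```python
from dataclasses import dataclass

HIGH_TEMPO_BPM = 100  # threshold used in the text for high-arousal tempo


@dataclass
class Song:
    title: str
    tempo_bpm: int
    angry_lyrics: bool


def summarize(songs):
    """Percentage of songs that are high tempo, and high tempo with angry lyrics."""
    n = len(songs)
    high = [s for s in songs if s.tempo_bpm > HIGH_TEMPO_BPM]
    both = [s for s in high if s.angry_lyrics]
    return {
        "high_tempo_pct": 100 * len(high) / n,
        "high_tempo_and_angry_pct": 100 * len(both) / n,
    }


# Hypothetical selections, for illustration only.
example_songs = [
    Song("Track A", 181, True),
    Song("Track B", 80, True),
    Song("Track C", 140, False),
    Song("Track D", 96, False),
]
summary = summarize(example_songs)
```

Keeping the tempo threshold as a named constant makes it easy to see how sensitive the proportions are to the choice of cut-off.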

#### **TABLE 4 | Analysis of the music participants played when angry**.


*<sup>a</sup>Lyrical themes of anger/aggression, but contextually not aggressive.*

## **Discussion**

#### **Extreme Music and Anger**

The purpose of this research was to test two alternative sets of hypotheses regarding the relationship between extreme music and anger under controlled experimental conditions. The first set of hypotheses followed an "extreme music causes anger" line of reasoning, and the second set of hypotheses followed an "extreme music matches and helps to process anger" line of reasoning. The results overall were supportive of the latter. Among our sample of extreme music fans in the normal range on symptoms of depression, anxiety, and stress, the majority reported that they listened to extreme music for a range of emotional effects – most pertinently to fully experience anger and to calm themselves down when feeling angry.

These reports were supported by the experimental results. The anger induction was successful, as shown in increased ratings of hostility and irritability and increased heart rate at the end of the anger interview. Those who listened to music when angry did not show an increase in heart rate or subjective hostility and irritability. Rather, they showed a decrease in subjective hostility and irritability that was equivalent to that of those who sat in silence. Heart rate stabilized but did not continue to rise, suggesting that the music that participants selected when angry matched their physiological arousal and allowed them to fully experience it. In the silence condition, heart rate reduced after the anger interview, returning to baseline. These findings are consistent with Gowensmith and Bloom's (1997) finding that heavy metal music was highly arousing to both fans and non-fans but did not cause an increase in subjective anger in fans. The findings are counter to the claims that extreme music causes anger and promotes aggressive behavior (Stack et al., 1994; Arnett, 1996).

In addition, the results showed that listening to metal music relaxed participants as effectively as sitting in silence. Ratings of relaxation decreased during the anger induction but increased again during music listening or silence. This result expands on earlier research by Labbé et al. (2007) who reported that personally selected music of any genre is just as relaxing as (experimenter selected) classical music. Unfortunately, because a similar relaxation response was found in both conditions, it is unclear whether it was the music or simply the passage of time after the anger induction that may have increased feelings of relaxation. Nevertheless, ratings on two other positive emotions, active and inspired, further demonstrate that music listening helped participants to feel these positively valenced emotions. Active feelings increased in all participants during the anger induction, consistent with the idea that anger activates approach motivation (Carver and Harmon-Jones, 2009). Active feelings then decreased for participants in the silence condition; yet, they continued to increase in the music listeners. Ratings of feeling inspired were relatively flat from baseline to anger induction for both conditions and were unchanged for those who sat in silence. In contrast, participants who listened to their selected extreme music experienced a significant increase in feelings of inspiration. These effects of extreme music on increasing physiological arousal and subjective inspiration are echoed in other research showing that music can evoke the experience of power – an effect that appears to be independent of musical genre and whether or not the music contains lyrics (Hsu et al., 2015). Taken together, the findings support the view that extreme music listeners use music to regulate their anger and to feel active and inspired. 
This emotion regulation effect is similar to that found in some research on sad music listening (Saarikallio and Erkkila, 2007; Vuoskoski et al., 2012). For instance, Van den Tol and Edwards (2013) found that people often engaged in sad music listening when sad in order to fully experience their negative affect and to enhance their mood. Indeed, participants in our study also reported listening to extreme music to improve their mood when feeling sad.

#### **What Did Angry Participants Listen To?**

A secondary aim for the study was to analyze what participants in the music condition selected from their own playlists to listen to when they were angry. It was predicted that angry participants would select extreme music from their playlists that matched their anger in terms of high tempo and angry lyrics. The analysis confirmed that all participants chose to listen to extreme music after the anger induction. The tempo and lyric findings were interesting in that half of the chosen songs contained lyrical themes of anger or aggression, with the remainder containing other themes including, but not limited to, isolation and sadness. It is difficult to account for this finding without knowing the detailed content of the angry memories that participants evoked during the anger interview. It is possible that their memories incorporated complicated feelings including anger and sadness and that their selected music matched those feelings. It is also possible that many participants did not select music on the basis of the lyrics, but rather on the basis of the instrumental sounds or other musical characteristics. In terms of tempo, the chosen songs had a range of tempos, with only 61% having a tempo that would be considered highly arousing (100 beats per min or over). Furthermore, less than a third of all songs possessed both angry themes and high arousal tempo. Potentially, other mechanisms may have linked the music with participants' emotional response, such as episodic memory, emotional contagion, or a brain stem response to the acoustic characteristics of the music (Juslin and Västfjäll, 2008; Juslin et al., 2010).

Unfortunately, it was not possible to conduct an analysis directly linking participants' heart rate to the songs they listened to because we wanted participants to engage in naturalistic music listening and they listened to multiple songs (with varying tempos) for various lengths of time during the 10 min period. We did not have markers on the heart rate recording of which songs were listened to for which periods, and therefore the only analysis available was a summary analysis of the music they listened to (unlinked to their heart rates). Further research is required to explore whether there is a direct relationship between song tempo and heart rate among angry extreme music fans, as has been found in other samples (e.g., Etzel et al., 2006).

Extreme music fans reported using their music to enhance their happiness, to immerse themselves in feelings of love, and agreed that their music enhanced their well-being. What each of these responses indicates is that extreme music listeners appear to be using their music listening for positive self-regulatory purposes. Although this effect cannot be generalized to non-fans, it nevertheless lends support to a growing body of research about everyday music listening and emotion regulation (Saarikallio, 2011; Thoma et al., 2012; Papinczak et al., 2015).

#### **Practical Implications**

Given that some correlational studies have reported an association between extreme music and anger, aggression and delinquency, it is understandable that some parents, teachers, and health practitioners have been concerned about their clients or students listening to extreme music and what this might mean. Earlier studies showed that an individual's music preference is capable of biasing clinical judgment – for example, Rosenbaum and Prinsky (1991) contacted clinicians at 12 psychiatric hospitals posing as a concerned parent of a (fictitious) adolescent male who listened to heavy metal but they made no mention of symptoms of any mental illness. Ten of the services (83%) recommended admitting the adolescent to hospital. The results of our study indicate that responses like these are unjustified. On the contrary, the results show that extreme music may be used to recover from anger and to enhance emotional and mental health.

Practically, this research has various uses in applied settings. For example, greater understanding of anger processing through music may be beneficial within schools. Young people, in particular adolescents, are the greatest consumers of music (North and Hargreaves, 1999; North et al., 2000). Thus, allowing students who are angry and upset to listen to their preferred music (including extreme genres) for 10 min may assist in self-regulation of these moods and result in increased positive affect. Moreover, these findings are extremely useful in clinical settings. Music-based interventions have been found to be effective in the treatment of a range of disorders that commonly involve emotional volatility including the psychoses (Gold et al., 2009), post-traumatic stress disorder (Zoteyva et al., 2015), and substance misuse (Baker et al., 2012; Short and Dingle, 2015). The use of extreme music in therapy may also result in increased engagement and participation in therapy for fans of these genres (Dingle et al., 2008).

#### **Limitations and Future Directions**

Although these results showed that extreme music matches and helps to regulate anger, this effect may be particular to fans of extreme music who are not experiencing any symptoms of distress. Further research is required to examine whether the findings generalize to fans experiencing psychological or behavioral problems. It is also important to note that the study was carried out in a laboratory under controlled conditions and with only the participant and experimenter present. Further, as participants were recruited with an advertisement for the "potential benefits" of extreme music, the advertisement partially revealed the study aims, possibly leading to bias. In light of the results, it would have been beneficial to have included a third condition in which participants listened to a non-problem music genre in order to control for the general arousing effects of listening to music of any kind.

It is unknown what might happen to participants' emotions if they listened to extreme music for prolonged periods, or what their emotional and arousal levels were half an hour or more after listening had ceased. The study would need to be replicated and extended to include a fourth time point in order to clarify this question. It is not clear from these findings how a naturalistic setting (such as at a social gathering or concert) might influence the link between extreme music listening and anger processing. Further research adopting experience sampling methods might shed light on this (Juslin et al., 2008). Finally, we did not measure individual difference factors such as personality, tendency to ruminate, and other emotion regulation strategies in this study, factors that have been implicated in emotional responses to music in other research (Chin and Rickard, 2014; Garrido and Schubert, 2015). Such musical, contextual, and listener variables may all contribute in some way to listeners' emotional responses, as has been found in previous research (Juslin and Sloboda, 2010).

What may be of interest for future research is how extreme music fans use music listening to process other emotions such as sadness and anxiety. Just over half of the participants in this study indicated that they listen to extreme music to fully experience sadness, and three quarters said they listen to improve their mood when feeling sad. However, there is currently a lack of research putting this to a direct test using experimental manipulation of sad mood. Only a third used music to calm down when anxious, which may reflect the highly arousing nature of the music. It would be interesting to find out if extreme music fans use other genres of music or other non-musical strategies (such as exercise or talking to someone) to regulate their anxiety (Thayer et al., 1994).

## **Conclusion**

This study found that extreme music fans listen to music when angry to match their anger, and to feel more active and inspired. They also listen to music to regulate sadness and to enhance positive emotions. The results refute the notion that extreme music causes anger but further research is required to replicate these findings in naturalistic social contexts, and to investigate the potential contributions of individual listener variables on this relationship between extreme music listening and anger processing.

## **Acknowledgments**

The authors would like to express their sincere thanks to the participants who were involved in the study, and to Dr. Eric Vanman for the supported use of his social neuroscience laboratory and for expert comments on the draft manuscript.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Sharman and Dingle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Maladaptive and adaptive emotion regulation through music: a behavioral and neuroimaging study of males and females

Emily Carlson<sup>1</sup>\*, Suvi Saarikallio<sup>1</sup>, Petri Toiviainen<sup>1</sup>, Brigitte Bogert<sup>2</sup>, Marina Kliuchko<sup>1,2</sup> and Elvira Brattico<sup>2,3,4</sup>

<sup>1</sup> Center for Interdisciplinary Music Research, Department of Music, University of Jyväskylä, Jyväskylä, Finland, <sup>2</sup> Cognitive Brain Research Unit, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland, <sup>3</sup> Helsinki Collegium of Advanced Studies, University of Helsinki, Helsinki, Finland, <sup>4</sup> Advanced Magnetic Imaging (AMI) Center, Aalto University, Espoo, Finland

Music therapists use guided affect regulation in the treatment of mood disorders. However, self-directed uses of music in affect regulation are not fully understood. Some uses of music may have negative effects on mental health, as can non-music regulation strategies, such as rumination. Psychological testing and functional magnetic resonance imaging (fMRI) were used to explore music listening strategies in relation to mental health. Participants (n = 123) were assessed for depression, anxiety and Neuroticism, and uses of Music in Mood Regulation (MMR). Neural responses to music were measured in the medial prefrontal cortex (mPFC) in a subset of participants (n = 56). Discharge, using music to express negative emotions, related to increased anxiety and Neuroticism in all participants and particularly in males. Males high in Discharge showed decreased activity of mPFC during music listening compared with those using less Discharge. Females high in Diversion, using music to distract from negative emotions, showed more mPFC activity than females using less Diversion. These results suggest that the use of the Discharge strategy can be associated with maladaptive patterns of emotional regulation, and may even have long-term negative effects on mental health. This finding has real-world applications in psychotherapy and particularly in clinical music therapy.

#### Edited by:

Julian O'Kelly, Royal Hospital for Neuro-disability, UK

#### Reviewed by:

Arun Bokde, Trinity College Dublin, Ireland Stefan Gebhardt, University of Marburg, Germany

#### \*Correspondence:

Emily Carlson, Center for Interdisciplinary Music Research, Department of Music, University of Jyväskylä, P.O. Box 35, FI-40014 Jyväskylä, Finland emily.j.carlson@jyu.fi

> Received: 31 March 2015 Accepted: 10 August 2015 Published: 26 August 2015

#### Citation:

Carlson E, Saarikallio S, Toiviainen P, Bogert B, Kliuchko M and Brattico E (2015) Maladaptive and adaptive emotion regulation through music: a behavioral and neuroimaging study of males and females. Front. Hum. Neurosci. 9:466. doi: 10.3389/fnhum.2015.00466

Keywords: music, emotion regulation, fMRI, prefrontal cortex, gender differences, mental health

## Introduction

Although the ability of music to express and induce emotions seems so essential as to be obvious, it is only research from the last decades that allows us to state with scientific confidence that music can communicate specific emotions to listeners and induce them (Balkwill and Thompson, 1999; Juslin, 2003; Juslin and Laukka, 2004; Juslin and Västfjäll, 2008; Fritz et al., 2009; Brattico and Pearce, 2013). This ability to induce emotions makes music listening a potentially powerful means of affect regulation, an important aspect of mental health, and could therefore have clinical applications. Indeed, music therapy is currently being used in the treatment of mental health disorders in clinical settings, with some evidence for its effectiveness, though there is a paucity of well-controlled studies to explain the mechanisms by which desired effects take place (Maratos et al., 2008). The need for improvements in current treatment of mental health disorders is not trivial. Current pharmacological treatment options, such as selective serotonin reuptake inhibitors (SSRIs) for depression, are far from perfect in their efficacy (Kirsch et al., 2008). The World Health Organization (WHO) has reported that up to 50% of those with psychiatric illness worldwide, including the perniciously common (and commonly comorbid) mood and anxiety disorders, do not receive adequate treatment (Demyttenaere et al., 2004), leaving a substantial gap that could be partially filled by the validation and development of music-based treatments, specifically for these more common disorders.

Music therapists are all too familiar with the need for clinical trials to support the use of such treatments, and most are equally familiar with the difficulties inherent in carrying out such trials, including gaining ethical permissions regarding vulnerable populations, finding financial means within stretched budgets, and simply having the time to consider research while meeting the demands of client care. Conducting clinical research with psychiatric clients can be especially difficult due to drop-out rates, distinctions between session attendance and active participation, and significant differences between treatment facilities where research may be conducted (Silverman, 2008). However, not all research need be done with clinical populations to be relevant to music therapy. In proposing a model of using research to develop therapeutic music interventions, Thaut (2002) emphasizes the importance of identifying shared and parallel brain functions between musical and non-musical psychological processes, allowing in turn for the development of musical experiences that have predictable therapeutic effects. Thaut further suggests that research regarding music therapy should follow four steps in order: (1) defining psychological, neurological and physiological responses to music; (2) defining non-musical processes which are functionally parallel to these musical responses; (3) defining the influence of music on non-musical responses and behaviors; and (4) defining the effects of music therapy on therapeutic outcomes (p. 116–119). The current study seeks to contribute to the first step of this research process, by defining parallels between non-musical and musical behaviors in affect regulation by examining parallels in outcomes between non-musical and musical strategies. 
We will also address the second step by examining neural responses to music in areas of the brain that have previously been associated with non-musical affect regulation processes, with particular attention to maladaptive responses.

Given music's ability to affect emotion, the most immediately relevant psychiatric disorders for which music may be explored as a therapeutic treatment are those overtly characterized by disruptions of normal affective functioning, such as clinical depression and anxiety (Juslin, 2003; Juslin and Västfjäll, 2008; Fritz et al., 2009). Following the terminology set by Juslin and Västfjäll (2008), this paper will use affect as an umbrella term to include both emotion and mood where possible for the sake of succinctness, still recognizing that mood and emotion are distinct yet interconnected both conceptually and functionally.

Affect regulation may be defined as a process by which an individual maintains or modifies his internal emotional or mood state, and includes behavioral and autonomic facets (Thayer et al., 1994). Deficits in an individual's affect regulation ability, including the use of maladaptive strategies, have been linked with vulnerability to depression and anxiety (Fernandez-Berrocal et al., 2006; Gross et al., 2006; Gross and Thompson, 2007; Joormann and D'Avanzato, 2010; Joormann and Gotlib, 2010). Efficacious emotion regulation strategies, such as distraction and positive reappraisal, correlate negatively to undesirable outcomes such as depression (Oikawa, 2002; Gross and John, 2003; Garnefski et al., 2004), whereas inefficient emotion regulation strategies such as venting, suppression and rumination relate positively to depression and other mood disorders (Gross and John, 2003; Garnefski et al., 2004; Joormann and D'Avanzato, 2010). Habitual use of emotion regulation has furthermore been shown to affect neural responses to stimuli (Kanske et al., 2012). Previous research has found differences in how clinical psychiatric samples use music for affect regulation compared to non-clinical samples (Gebhardt and von Georgi, 2007; Gebhardt et al., 2014a,b).

Research has shown that there are differences between healthy and mentally ill populations in non-musical cognitive affect regulation strategies for dealing with negative stimuli. Cognitive reappraisal, a process of reassessing a stimulus as being less negative than originally perceived, has been associated with decreased risk of depression (Troy et al., 2010). Effective cognitive reappraisal is associated with increased activation of prefrontal and striatal areas in females, but with decreased amygdala response in males, suggesting important gender differences in affect regulation in the brain (McRae et al., 2008). Rumination, on the other hand, involves repetitive cognitive focus on the negative aspect of a situation, without attempts to alter the perception of the situation, and has been associated with the increased risk of depression and anxiety (Papadakis et al., 2006; Moulds et al., 2007; Arnone et al., 2009). Neuroimaging studies lend support to the existence of distinct neural affect regulation processes (Ochsner and Gross, 2007) and furthermore suggest that regulation strategies, when habitual, can affect neural responses to stimuli (Kanske et al., 2012). Differing regulation strategies, such as distraction or cognitive reappraisal, activate close but distinct regions of the prefrontal cortex (PFC), namely the lateral and medial prefrontal cortex (lPFC and mPFC), with the right lPFC and the mPFC, specifically the orbitofrontal cortex (OFC), increasing in activation when the regulation strategy requires a decrease of undesired emotion (Ochsner and Gross, 2005). Phillips et al. (2003) have posited two separate but related neural systems for emotional processing: a ventral system responsible for stimulus identification and automatic emotional responses, and a dorsal system responsible for emotional regulation. The PFC has also been shown to activate differentially in participants with depression when compared to healthy participants. 
Koenigs and Grafman (2009) found that the dorsal lPFC, associated with cognitive responses, was hypoactive in participants with depression, while the ventral mPFC was hyperactive, indicating neural irregularity in emotional regulation in depressed participants compared to healthy controls.

The process by which music can influence and regulate human affect is itself complex, and discovering the neural correlates of music-induced emotion and regulation is currently among the chief research interests in the field of music neuroscience (Levitin and Tirovolas, 2009; Brattico et al., 2013). Juslin and Västfjäll (2008) created a model that differentiates between six discrete but not mutually exclusive mechanisms by which music may change affect and which take place in distinct brain areas, with the amygdala, basal ganglia and inferior right frontal regions among those linked to the induction of basic emotions. Recently, Brattico et al. (2013) have proposed a temporally organized model in which experience of discrete emotions, as observed in activation of structures including the amygdala, the anterior cingulate cortex (ACC) and PFC, follows early feature analysis at the brainstem and primary sensory cortical levels and cognitive processing (CP) of syntactic musical rules in non-primary sensory cortices and the ventral lPFC. Discrete emotional experiences, appearing in this model around 300 ms after the stimulus, precede aesthetic judgments and emotions and determinations of liking or disliking, which the authors propose may require listening to a piece of music in its entirety (p. 9).

Individuals make use of these responses to music to regulate their affective state in a variety of ways (von Georgi et al., 2006; Gebhardt and von Georgi, 2007; Saarikallio and Erkkilä, 2007; Gebhardt et al., 2014b). von Georgi et al. (2006) developed the Inventory for Measurement of Activation and Arousal Modulation (IAAM) scale using principal component analysis (PCA) of a battery of tests including a measure of psychological and psychopathological symptoms. Results identified relaxation (RX), CP, reduction of negative affect (RA), fun seeking (FS) and arousal management (AM) as key areas of music use in affect regulation (von Georgi et al., 2006). The IAAM has successfully been used to reveal differences between clinical psychiatric and non-clinical samples (Gebhardt and von Georgi, 2007; Gebhardt et al., 2014a,b). The Music in Mood Regulation scale (MMR) is a questionnaire that was derived originally from interview data with adolescents regarding how they used music in mood regulation, and developed through confirmatory factor analysis of a large sample of adolescents (Saarikallio, 2008). The MMR is an individual self-report measurement tool that defines seven categories of music mood regulation strategy, each defined by a typical mood prior to music use, type of music activity, social aspects, and changes in mood following music use (Saarikallio, 2008). The strategies defined by the MMR are: Entertainment, Revival, Strong Sensation, Diversion, Discharge, Mental Work and Solace. Diversion, Solace and Discharge are all related to using music to cope with negative mood states. Discharge is defined by a negative mood such as anger or sadness prior to music use leading the individual to listen to aggressive or sad music, the outcome of which is that the negative feelings have been expressed; it is most similar to the IAAM scale RA, which includes items such as "I listen to music when I really need to blow off steam" (von Georgi et al., 2006). 
Solace is similar to Discharge in that it is defined by a prior negative mood and listening to music that reflects the negative mood, but has an outcome of the listener feeling comfort, while Diversion is a strategy used by listeners wanting to be distracted from negative thoughts, with an outcome of successfully forgetting the current mood (Saarikallio, 2007, p. 96).

These differences suggest that some listening strategies may be more successful than others in achieving affect regulation, a possibility which in turn may have implications for music therapy. According to the Cochrane review on the subject, in the treatment of depression, a small number of clinical studies have shown that the addition of music therapy, including a variety of methods, can offer improved results compared to standard care alone (Maratos et al., 2008). Erkkilä et al. (2011) conducted a controlled trial and found that depression symptoms, anxiety symptoms and general functioning were significantly improved in clients who participated in 3 months of clinical music improvisation sessions compared to those who received only standard care. Similar results regarding music improvisation were found by Albornoz (2011). Other treatment approaches in music therapy in psychiatric settings include assisted RX, songwriting, lyric analysis and pairing music and movement (Silverman, 2007). Since a strong dose-response relationship has been shown in the effectiveness of music therapy treatment with psychiatric illness (Gold et al., 2009), and given some of the difficulties in providing regular and consistent access to active treatment described by Silverman (2008), appropriate use of specific listening strategies by clients outside of the therapy session could prove beneficial to therapeutic outcomes by increasing clients' positive engagement with music. Such possible benefits might be parallel or complementary to psychoeducational interventions, which a meta-analysis by Donker et al. (2009) found to be effective in reducing anxiety and depression, and which recent studies indicate may be as or even more effective in live music therapy contexts (Silverman, 2014, 2015). 
Furthermore, since many approaches to music therapy stress client-preferred music to music therapists in training as having the greatest benefit (Borczon, 2004), and client-preferred music has indeed been shown to be more effective in, for example, pain management (Mitchell and MacDonald, 2006), it would be imperative for a music therapist to know whether and how a client's music listening outside of therapy might become maladaptive and even hinder treatment. The MMR, unlike the IAAM, has not yet been tested with clinical populations. However, since it distinguishes more finely between different ways listeners may use music to decrease negative affect, and because it was developed without regard to psychiatric symptoms, the MMR was considered to be more appropriate for this study, which is aimed at exploring the association of uses of music with maladaptive emotionality in the general population.

The topic of whether music-use might be unbeneficial or even counter-productive in therapy needs to be approached carefully, with an eye towards complexity derived from individual differences. In the past, concerns about adolescent suicide risks and violence led to public speculation about the negative mental health effects of specific musical genres and the violence expressed in lyrics, but the studies resulted in mixed findings (Jones, 1997; Scheel and Westefeld, 1999; Lacourse et al., 2001), highlighting the danger of examining the relationship between music and mental health only in terms of the musical properties. Currently, a major shift in music research is taking place, with researchers acknowledging that the effects of music cannot be understood only through the investigation of music as a stimulus but rather through investigating the ways that different individuals choose to engage with music (Garza Villarreal et al., 2011; Gold et al., 2013). Research now suggests that the selection of a particular affect regulation or coping strategy indeed plays an important role in defining the health-consequences of musical engagement (Miranda et al., 2012; Saarikallio, in press). Recent work has also shown that psychological features of the listener, such as stress reactivity (Thoma et al., 2012), tendencies for absorption and dissociation (Garrido and Schubert, 2011) and clinical depression (McFerran and Saarikallio, 2014), seem to moderate the connection between musical engagement and health-outcomes. For instance, some individuals listen voluntarily to sad music for pleasure, even when negative affect is sometimes experienced as a result, suggesting that personality traits affect both the likelihood that a person will choose to listen to sad music for pleasure, and whether the resulting affect is negative or positive (Garrido and Schubert, 2011, 2013). 
Research conducted with healthy populations typically suggests that listening to music that reflects one's negative mood is a fundamentally healthy act, and serves as a means for solace (Saarikallio and Erkkilä, 2007), distraction and reappraisal (Van den Tol and Edwards, 2013), and gaining increased insight into the affective state (Skånland, 2013). Depressed adolescents, on the other hand, have been shown to be prone to couple their listening to sad and aggressive music with ruminative tendencies, social isolation and inability to improve their mood (McFerran and Saarikallio, 2014). Since the use of the MMR Discharge strategy in music listening does not tend towards mood repair, it might be considered analogous to rumination as a maladaptive regulation behavior. In the MMR strategy Diversion, however, the individual uses music to distract herself from negative thoughts or emotions; the use of Diversion may therefore indirectly indicate less use of rumination, or even be considered its opposite.

The existing research thus allows us to define normal and maladaptive non-musical cognitive strategies for affect regulation, as well as normal and maladaptive non-musical neural processes of affect regulation. The goal of the current research is to further define maladaptive music listening strategies, as well as maladaptive neural responses to music, and explore connections between these two. We defined a maladaptive neural response to music as one that is associated with increased negative or decreased positive emotional experience when compared to normal listening responses, such as could be related to common affective psychiatric symptoms like prolonged negative mood. Because our study involved a normal, non-clinical sample of participants, we did not focus our attention on disruptions of neural responses related to emotion other than valence, such as responses that could be considered mania or psychosis, as these were unlikely given our sample. Our goal, per Thaut's (2002) model, was to define neural responses to music and parallels with non-musical responses, with special attention to behaviors and responses that might be maladaptive. We expected neural responses related to maladaptive emotion regulation could be seen in our non-clinical sample, as previous studies have found differences in neural activation in participants with sub-clinical levels of depression (Felder et al., 2012). The vmPFC and vlPFC have been previously associated with non-musical affect regulation, including maladaptive regulation, leading us to hypothesize that neural correlates of maladaptive musical affect regulation would be visible in these areas. Specifically, our hypothesis was that relationships would exist between Discharge scores and increased depression, anxiety and trait neuroticism, the latter being considered as an established risk factor for mood disorders (Costa and McCrae, 1992; Hayes and Joseph, 2003). 
In turn, we hypothesized that there would be no such relationship with the MMR strategy Solace, despite the similarity of the strategies, and that Diversion strategy would also correlate negatively with mental illness and its risk factors and with maladaptive brain responses. Because previous literature has suggested that notable gender differences exist in affect regulation and neural responses between males and females, we chose to examine males and females separately in our analysis to prevent such possible gender differences from obscuring these exploratory results (Thayer et al., 2003; McRae et al., 2008; Mak et al., 2009; Joormann et al., 2012).

## Materials and Methods

## Behavioral Data Collection

#### Participants

A total of 123 participants (68 females) between the ages of 18 and 55 completed psychological testing. Participants' mean age was 28.8 (SD = 8.89 years). Participants were recruited over a period of 18 months, from around the Helsinki area using fliers and email lists. The majority of these participants were non-musicians (n = 68), while others were identified as amateur musicians (n = 38) or professional musicians (n = 20). Sixty-two of these participants also provided socio-economic information, allowing for the calculation of their socio-economic status as indicated by the H index score (Hollingshead, 1975), which ranged from 17 to 66 with a mean of 36.85 (SD = 18.25), with no significant differences between males and females, t(60) = 0.880, p = 0.382. The data collection, which took place over a span of 15 months, was part of a larger project called Tunteet, including several experimental paradigms, psychological tests and even blood samples. Considering the complexity of the Tunteet protocol and the ability of participants to choose which parts of it to participate in, not all the participants of the Tunteet project could be included in this study but only those for whom we obtained the relevant measurements. The full Tunteet protocol was approved by the local ethical committees of the Institute of Behavioural Sciences, University of Helsinki, and the Coordinating Ethical Board of the Uusimaa Hospital District.

### Behavioral Testing

Participants completed self-report measures for assessing psychological functioning and musical engagement using paper and pencil. The Montgomery-Åsberg Depression Rating Scale (MADRS) was used to test for levels of depression, the Neuroticism subscale of the Big Five Questionnaire (BFQ) to test for levels of Neuroticism, and the Anxiety facet of the Hospital Anxiety and Depression Scale (HADS-A) to measure levels of anxiety. Music-related affect regulation was measured using the MMR (Saarikallio, 2008), from which the subscales for strategies of Diversion, Discharge, and Solace were used in the current study to test our hypotheses.

The BFQ assesses the traits defined by the Five Factor Theory of Personality: Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism. Participants rate their level of agreement with statements related to each domain on a five-point Likert scale (Caprara et al., 1993). Only the Neuroticism subscale results were used in this study as relevant to our hypothesis, since Neuroticism is associated with a risk of mental health problems (Hayes and Joseph, 2003). For assessing depression we used the MADRS, a diagnostic test whose scoring allows clinicians to rank depression levels based on the participant's score between 0 and 60 points. Müller et al. (2000) correlated the MADRS to the Hamilton Depression Rating Scale in order to distinguish four levels of depression: none/recovered (1–8), mild (9–17), moderate (18–34), severe (>35). Previous studies have also used the MADRS as a continuous measure (Raison et al., 2007). Although originally intended for clinical populations, previous studies have also used the MADRS to assess depressive symptoms in non-clinical populations, particularly at mild or subclinical levels (e.g., Unden et al., 2002; Van den Rest et al., 2008; Hidalgo et al., 2009; Sarkar et al., 2015). For anxiety assessment we used the HADS, a self-report measure with demonstrated validity designed to indicate the severity of depression and anxiety symptoms, and possible or probable cases of clinical disorders (Zigmond and Snaith, 1983). In this study, the HADS was translated into Finnish from Swedish, resulting in some discrepancies in meaning, identified by native Finnish speakers. Because of this, only the Anxiety subscale (HADS-A) was used for this study. The HADS-A is scored from 0 to 21; presence of anxiety is classed as mild (8–10), moderate (11–14) or severe (>15). 
The HADS-A has also previously been used to measure anxiety symptoms in non-clinical populations (Carroll et al., 2000; Asbury et al., 2006; Van den Rest et al., 2008).
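
The severity bands quoted above can be written as a small lookup. The following is a minimal sketch for illustration only, not the scoring procedure used in the study. The function names are hypothetical. The quoted bands leave a MADRS score of 0 and the boundary scores 35 (MADRS) and 15 (HADS-A) unassigned, and they do not name the HADS-A band below 8; the sketch assumes these fall under "none/recovered", "severe", "severe" and "none" respectively.

```python
def madrs_band(score: int) -> str:
    """Map a MADRS total (0-60) to the bands quoted from Müller et al. (2000)."""
    if not 0 <= score <= 60:
        raise ValueError("MADRS totals range from 0 to 60")
    if score <= 8:
        return "none/recovered"  # assumption: a score of 0 is grouped with 1-8
    if score <= 17:
        return "mild"
    if score <= 34:
        return "moderate"
    return "severe"  # assumption: 35 is grouped with the quoted ">35" band


def hads_a_band(score: int) -> str:
    """Map a HADS-A total (0-21) to the anxiety bands quoted in the text."""
    if not 0 <= score <= 21:
        raise ValueError("HADS-A totals range from 0 to 21")
    if score <= 7:
        return "none"  # assumption: the text does not label scores below 8
    if score <= 10:
        return "mild"
    if score <= 14:
        return "moderate"
    return "severe"  # assumption: 15 is grouped with the quoted ">15" band
```

Because the study also treats these scales as continuous measures, a banding of this kind would serve only to describe the sample, not to replace the raw scores in the correlational analysis.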

Psychological test scores were correlated to each other with parametric and non-parametric tests using SPSS version 22 (SPSS Inc., Chicago, IL, USA), running on Mac OS X 10.9.5.

## fMRI Data Collection

#### Participants

A subset of 63 participants also agreed to participate in a functional magnetic resonance imaging (fMRI) scanning session. Inclusion criteria were an absence of hearing or neurological problems and of psychopharmacological medication. Seven participants were excluded from the analysis due to technical issues, excessive movements during scanning, or neuroradiological abnormalities. Recruitment was continuous until the desired number of participants for the experimental paradigm described below was reached. Of the remaining 56 participants (33 female) between the ages of 20 and 53 (mean age 28.5 years, SD = 8 years), eleven participants had played a musical instrument for at least 5 years, eight of whom continued to play music actively (at least 2 h/week). Twenty-nine of these participants also provided socio-economic information; their socio-economic status as indicated by the H index scores (Hollingshead, 1975) ranged from 17 to 66 with a mean of 33.62 (SD = 16.94), with no significant differences between males and females, t(27) = −0.458, p = 0.651.

#### Paradigm

The music stimulus of 30 excerpts (10 each representing happiness, sadness, and fear) was derived from the Soundtracks dataset for music and emotion developed at the University of Jyväskylä by Eerola and Vuoskoski (2011), which is publicly available for download online.<sup>1</sup> Soundtrack music was considered appropriate for this study because it is composed with the intention of inducing emotional responses in listeners. For the current experiment, 10 representative excerpts were chosen for each of the happy, sad, and fearful categories, lasting 4 s each with 500 ms fade-in and fade-out. Short excerpts were considered appropriate for this study because previous studies have shown that emotion recognition occurs within 500 ms of hearing (Filipic et al., 2010), and such excerpts have previously been used by Aubé et al. (2015). Both implicit and explicit processing of emotions conveyed through music was also considered, as affect regulation can be both a conscious and unconscious process (Thayer et al., 1994), resulting in six listening conditions in all, which are displayed in **Table 1**.

Each participant underwent a single data collection session. In two blocks, each listened to the excerpts in a randomized order and completed a task of either identifying the emotion expressed by the music (explicit processing), or identifying how many instruments they heard in the excerpt (implicit processing). Prompted by text on the screen, participants were given three answer choices in each task: "happiness, sadness or fear" in the explicit block, and "one, two or many" in the implicit block. The questions were presented orally via an intercom prior to each block and visually on the screen. The three answers were then presented on the screen and remained on the screen throughout the entire block. Each music excerpt was followed by a 5 s answer period, during which participants answered by pressing one of three push buttons on a response box.
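
The structure of the paradigm (two task blocks, 30 excerpts per block in randomized order, a 4 s excerpt followed by a 5 s answer period) can be sketched as a trial list. This is an illustrative sketch under stated assumptions, not the presentation script used in the study: the names `Trial`, `build_session` and `block_duration_s` are hypothetical, the block order is fixed here because the text does not say whether it was counterbalanced, and the duration estimate ignores any additional intervals that the text does not describe.

```python
import random
from typing import List, NamedTuple, Optional

EMOTIONS = ("happy", "sad", "fear")
TASKS = ("explicit", "implicit")   # assumption: fixed block order
EXCERPTS_PER_EMOTION = 10
EXCERPT_S = 4                      # excerpt length given in the text
ANSWER_S = 5                       # answer period given in the text


class Trial(NamedTuple):
    task: str      # "explicit" (name the emotion) or "implicit" (count instruments)
    emotion: str   # emotion category of the excerpt
    excerpt: int   # index of the excerpt within its emotion category


def build_session(seed: Optional[int] = None) -> List[List[Trial]]:
    """Return one independently shuffled block of 30 trials per task."""
    rng = random.Random(seed)
    session = []
    for task in TASKS:
        block = [Trial(task, emotion, i)
                 for emotion in EMOTIONS
                 for i in range(EXCERPTS_PER_EMOTION)]
        rng.shuffle(block)  # randomized excerpt order within the block
        session.append(block)
    return session


def block_duration_s(block: List[Trial]) -> int:
    """Lower bound on block length: excerpt plus answer period per trial."""
    return len(block) * (EXCERPT_S + ANSWER_S)
```

Crossing the three emotion categories with the two tasks yields the six listening conditions, each represented by 10 trials per participant.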

#### fMRI Data Acquisition and Analysis

fMRI data was collected at the Advanced Magnetic Imaging (AMI) Center at Aalto University, using a 3 T MAGNETOM Skyra whole-body scanner (Siemens Healthcare, Erlangen, Germany). An interleaved gradient echo-planar imaging (EPI) sequence (TR = 2 s; echo time = 32 ms; flip angle = 75°) sensitive to blood oxygen level-dependent (BOLD) contrast was used to acquire 33 oblique slices allowing coverage of the whole brain (field of view = 192 × 192 mm; 64 × 64 matrix; slice thickness = 4 mm; spacing = 0 mm). Following the fMRI tasks, anatomical T1-weighted MR images (176 slices, field of view = 256 mm; 256 × 256 matrix; 1 mm × 1 mm × 1 mm; spacing = 0 mm) were collected.

<sup>1</sup>https://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/emotion/soundtracks/

#### TABLE 1 | Listening conditions in fMRI study.

Preprocessing and statistical analysis of fMRI data was performed using Statistical Parametric Mapping (SPM8; Wellcome Department of Imaging Neuroscience, London, UK) run using MATLAB (The MathWorks, Inc., Natick, MA, USA). Images for each participant were realigned to adjust for movement between slices, normalized spatially onto the Montreal Neurological Institute (MNI) template (6 parameters rigid body model, gray matter segmentation), and spatially smoothed using a Gaussian filter with an FWHM of 6 mm. Brain volumes were then screened to determine whether they met the criteria for high quality and scan stability as determined by small motion correction (<2 mm translation and <2° rotation). Data was filtered temporally at 128 Hz to minimize artifacts caused by the scanner. fMRI responses were modeled using a canonical hemodynamic response function (HRF), and the six movement parameters were used as regressors of no interest.
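
The motion criterion above (<2 mm translation and <2° rotation) can be illustrated with a short screening function over the six realignment parameters per volume. This is a hedged sketch, not the screening code used in the study. The function name is hypothetical; it assumes the criterion applies to displacement relative to the reference volume rather than to scan-to-scan change, which the text does not specify; and it assumes rotations are supplied in radians, as in SPM realignment parameter files, unless the caller states otherwise.

```python
import math
from typing import Iterable, Tuple

MAX_TRANSLATION_MM = 2.0   # threshold given in the text
MAX_ROTATION_DEG = 2.0     # threshold given in the text

MotionRow = Tuple[float, float, float, float, float, float]


def passes_motion_screen(params: Iterable[MotionRow],
                         rotations_in_radians: bool = True) -> bool:
    """Return True if every volume stays under both motion thresholds.

    Each row holds three translations (mm) followed by three rotations.
    """
    for tx, ty, tz, rx, ry, rz in params:
        rotations = (rx, ry, rz)
        if rotations_in_radians:
            rotations = tuple(math.degrees(r) for r in rotations)
        if max(abs(tx), abs(ty), abs(tz)) >= MAX_TRANSLATION_MM:
            return False
        if max(abs(r) for r in rotations) >= MAX_ROTATION_DEG:
            return False
    return True
```

In a pipeline of this kind the same six parameters would then be entered as regressors of no interest, as described in the text.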

Using SPM8, beta estimates for each participant's average BOLD response at each voxel were computed, and t-contrasts were used to compare whether a condition elicited a different brain activation compared to another. The BOLD response for each condition compared to baseline was computed for each participant.

To test the relation between emotion regulation in the brain and music regulation strategies, we focused our analysis on the PFC, in which emotional processes are controlled and regulated (Ochsner and Gross, 2005; Koenigs and Grafman, 2009). Regions of interest (ROIs) were calculated using the MarsBaR Toolbox (Brett et al., 2002). General linear model (GLM) analysis, done by Bogert et al. (submitted), showed significant activity in a large ROI centered on the mPFC (**Figure 1**). We used activity in this ROI for further analysis in combination with the behavioral data.

Data were analyzed using the SPSS statistical package. Behavioral data was explored using correlational analysis. In order to reduce the number of tests performed, we used ANOVA rather than correlation analysis to explore the relationships between behavioral and fMRI data, first in all participants and then separately in males and females. Mixed design ANOVAs were used, with three levels of music emotion (happy, sad or fear) and two levels of processing type (implicit or explicit) as within-participant factors, and high- and low-scoring groups according to MMR scores as between-participant factors. Two-tailed t-tests were used for more detailed assessment of between-participant effects in males and females separately.
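
The between-participant factor described above requires dividing participants into high- and low-scoring groups on an MMR subscale. The text does not state how the cut-off was chosen, so the following sketch assumes a median split, with scores equal to the median assigned to the low group; both choices and the function name `split_high_low` are assumptions made for illustration.

```python
from statistics import median
from typing import Dict


def split_high_low(scores: Dict[str, float]) -> Dict[str, str]:
    """Label each participant 'high' or 'low' relative to the sample median.

    Assumption: a median split is used, and ties at the median count as 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}


# Hypothetical Discharge subscale scores for four participants.
example = {"p01": 1.5, "p02": 2.0, "p03": 3.0, "p04": 4.5}
groups = split_high_low(example)
```

Each group label would then be crossed with the six within-participant cells (three emotions by two processing types) in the mixed-design ANOVA.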

## Results

#### Behavioral Data

MADRS, HADS-A and Neuroticism (as measured by the BFQ) results revealed a statistically normal, generally psychologically healthy sample. Twenty-three participants (19%) scored between 9 and 17 on the MADRS, meeting scale criterion for mild depression. Three participants scored above 20 on the MADRS, and could thus be considered moderately depressed. Eighteen participants (16%) scored between 8 and 10 on the HADS-A, and could thus be considered to have mild anxiety. Only one participant scored above 12, and could thus be considered to have moderate anxiety. Internal reliability of all tests was good, with Cronbach's α ranging between 0.70 and 0.85. The results and reliability scores of all tests are displayed in **Table 2**.

FIGURE 1 | Medial prefrontal cortex (mPFC) area identified by general linear model (GLM) analysis of neural activation during music listening.

Results of all tests were correlated in all 123 participants. As expected, MADRS scores were significantly positively correlated with HADS-A scores, r = 0.53, p < 0.001, and with Neuroticism scores, r = 0.54, p < 0.001. Neuroticism correlated strongly with HADS-A scores, r = 0.62, p < 0.001. MMR Discharge scores were weakly but significantly correlated with HADS-A scores, r = 0.24, p = 0.007, and similarly correlated with Neuroticism scores, r = 0.20, p = 0.02. Neither MMR Diversion nor Solace correlated with MADRS, HADS-A, or Neuroticism scores.
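As a rough illustration of how a Pearson correlation relates to its significance test, the sketch below computes r from paired scores and converts a given r and sample size into the t statistic with n − 2 degrees of freedom. It assumes the reported coefficients are Pearson correlations; the function names are hypothetical, and the p-value lookup (which needs a t distribution) is deliberately left out.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def r_to_t(r, n):
    """Convert a correlation r from n pairs into (t statistic, degrees of freedom)."""
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df


# r and n as reported in the text for MADRS vs. HADS-A in all participants.
t_stat, df = r_to_t(0.53, 123)
```

For the reported r = 0.53 with 123 participants, the t statistic is large relative to its 121 degrees of freedom, which is in line with the reported p < 0.001.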



<sup>∗</sup>Significant difference between males and females, p < 0.01.

Correlations were also determined for male and female participants separately to explore possible idiosyncrasies in uses of music and maladaptive emotionality depending on gender. Independent sample t-tests showed that there was a significant difference between females and males in Neuroticism scores, with females (M = 19.13, SD = 7.20) having a higher mean score than males (M = 15.43, SD = 7.84); t(119) = −2.69, p = 0.008. There was also a significant difference in MMR Diversion scores between females (M = 3.22, SD = 0.962), and males (M = 2.88, SD = 0.780); t(119) = −2.07, p = 0.04, with higher scores for females. There were no other significant differences between genders; in Discharge use, males (M = 2.56, SD = 1.06) were only slightly higher than females (M = 2.48, SD = 1.07).

Based on the gender differences in emotionality and in uses of music obtained in the t-tests, we performed separate correlation tests to explore associations between uses of music and emotional profiles in males and females. There were no significant correlations for female participants between MMR scores and other psychological scores. For male participants, positive correlations were found between HADS-A scores and MMR Discharge, r = 0.36, p = 0.007 (**Figure 2**), and between Neuroticism and MMR Discharge, r = 0.32, p = 0.02 (**Figure 3**).

#### Relationships Between fMRI and Behavioral Data

Since only Discharge correlated with anxiety and neuroticism scores in the previous analysis, and since only Diversion was found to be significantly different by gender, only these two strategies were included in further analysis. T-tests showed no significant differences in MADRS, HADS-A, Neuroticism or MMR scores between this subset and the rest of the group (p ranged from 0.38 to 0.93), suggesting that this group could fairly be considered representative of the whole. In this group, however, there was no significant difference between Neuroticism scores, t(1,54) = −0.62, p = 0.53.

Participants were divided into high and low-scoring groups based on the median scores for Discharge and Diversion. Mixed design ANOVAs were performed on the data, using the three music emotions and the two processing types as within-participant factors; one test with high and low Discharge scorers as between-participant groups, one test with high and low Diversion scorers as between-participant groups, to compare differences in activation in the mPFC. Participants were divided into high scoring and low scoring groups to determine whether high or low scores had a significant effect on mPFC activation. No significant relationships were found between MMR scores and mPFC activation when including all participants in the ANOVA analysis. However, since previous literature has found significant differences between genders in neural activity during emotion regulation (Thayer et al., 2003; McRae et al., 2008; Mak et al., 2009; Joormann et al., 2012), and since our own analysis showed differences between males and females in Diversion use, and correlations with anxiety and Neuroticism with Discharge for males but not females, we conducted further analysis for males and females separately.
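The median split described above can be sketched as follows. This is only an illustration under stated assumptions: the participant identifiers and scores are hypothetical, and the rule that scores equal to the median fall into the low group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def median_split(scores):
    """Label each participant 'high' or 'low' relative to the sample median.

    scores: dict mapping participant id -> strategy score (e.g., Discharge).
    Assumption: scores equal to the median are labeled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical Discharge scores for four participants.
toy_discharge = {"p1": 1.0, "p2": 2.0, "p3": 3.0, "p4": 4.0}
groups = median_split(toy_discharge)
```

The resulting labels would then serve as the between-participant factor, with the three music emotions and two processing types as within-participant factors.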

ANOVA results showed that, for mPFC activation, there was a significant main effect of Diversion score (low or high) in both females (F(1,31) = 5.66, p = 0.03) and males (F(1,21) = 5.34, p = 0.03). There was also a significant main effect of Discharge score on mPFC activation for males (F(1,21) = 8.65, p = 0.04) but not for females. **Figure 3** shows activation levels of the mPFC during the six different listening conditions for females and males separately, divided by high and low scores in Discharge and Diversion.

Further detail was obtained by performing a series of two-tailed t-tests (see **Figure 4**), comparing the mPFC activity in low and high scorers in Discharge and Diversion within each of the six listening conditions, with males and females again analyzed separately. To reduce the number of tests performed, only Diversion was tested in females, since Discharge was not significant in the ANOVA. Two-tailed t-tests showed a significant difference in mPFC activity between females scoring high or low in Diversion use during the implicit listening condition for all three types of music: happy (t(31) = −3.11, p = 0.004), sad (t(31) = −2.23, p = 0.03), and fearful (t(31) = −2.46, p = 0.02). These results indicate that females who had higher scores in Diversion experienced higher mPFC activity to emotional music.

For Discharge use in males, in turn, we did not obtain any significant differences in two-tailed t-tests. Instead, for Diversion use in males, we found a significant difference in the mPFC activation to fearful music in the explicit listening condition, t(21) = 2.56, p = 0.02. However, it can be noted that, under all conditions, males who more frequently used Discharge or Diversion had a general tendency towards decreases in mPFC activation during music listening.

## Discussion

The aim of this study was to explore potentially maladaptive behaviors and brain responses in the domain of music listening by defining brain responses to music related to mood regulation, and to explore relationships between music mood regulation strategies and mental health outcomes. Our results suggest that the use of music as affect regulation is subject to individual differences, including differences such as anxiety, depression and Neuroticism levels that are also related to mental health. Furthermore, the results of this study highlight significant differences between genders both in behavior and neural responses to music in relation to affect regulation. These findings may be useful in guiding music therapy practice, particularly in terms of developing supplemental, psychoeducational methods, as well as informing future research into this area.

#### Behavioral Measures

The results of this study indicate a weak positive correlation between participant scores in the music mood regulation strategy Discharge, Neuroticism and anxiety measures in all participants. This effect is comparable in strength to significant results found by Gebhardt et al. (2014a, p. 488), which showed an increased tendency to use music for RX in psychiatric patients with lower functioning. Although the use of music for RX and the Discharge strategy are not comparable, the results are similar in their suggestion that increased psychological distress may lead individuals to seek relief through music listening more often.

There were significant differences between male and female scores in Neuroticism, in which females scored significantly higher than males as predicted by previous literature (Costa et al., 2001). There were also significant differences between males and females in the use of Diversion as a music mood regulation strategy, with females using music for Diversion more than males. According to Saarikallio (2007), adolescents described using Diversion as a regulatory tactic to prevent their minds from straying to negative or anxious thoughts (p. 99). One explanation for this may be that females are more likely to employ avoidance coping strategies than males (Matud, 2004).

No significant correlations were found between Discharge and depression scores in all participants or when divided by gender. This may be because Discharge, defined as it is by the expression of negative emotion, might be considered an externalizing behavior, while depression is often categorized as an internalizing pathology (Miranda et al., 2012). This may also, however, be because the participants in this study were generally not depressed, or had only subclinical (mild) levels of depressive symptoms, and a clinically diagnosed sample may prove more revealing in this regard. That Discharge was significantly correlated with both anxiety and Neuroticism suggests a relationship with negative mental health outcomes. That one of its characteristics is that sad or angry music is heard, along with its correlation with anxiety and Neuroticism, may suggest a similarity between Discharge and rumination, a maladaptive strategy.

The possibility that Discharge is a maladaptive strategy is further strengthened by the fact that Solace and Diversion were not correlated with anxiety and Neuroticism, though they, too, are associated with negative affect prior to musical engagement. It is possible that Solace and Diversion are better mood regulation strategies than Discharge, and may even act as moderating or protective factors in the relationship between music listening and mental health risk. Gebhardt and von Georgi's (2007) finding that clinical patients with affective disorders specifically use music for negative affect reduction less than others may also support the interpretation that positive and effective music listening can have positive effects on mental health and its lack can reflect pathology. However, much more research would be needed to determine causation (Miranda et al., 2012).

These correlations persisted for males but not females when participants were examined by gender. McRae et al. (2008) suggested that affect regulation might be a more automatic process for males than for females. Although Discharge is not a suitable musical analog for CRA, it is worth considering that a person for whom affect repair is less conscious may be more likely to choose music to express rather than actively repair negative mood; that is, to choose Discharge over Solace or Diversion. Similarly, if affect regulation is more conscious for females, a female may be more likely to actively choose to divert attention away from negative affect. Knight et al. (2002) found that males experience greater emotional arousal and greater difficulty regulating that arousal in response to aggressive-relevant stimuli compared to females. Thus, males who choose to listen to aggressive music while in a negative affective state may indeed be prone to prolonging the negative state more than females. This suggests that, as a possibly maladaptive strategy, Discharge may be prevalent and more harmful in males.

## fMRI Results

Because a relationship between music use and risk for mental disorders was suggested by the results above, we further investigated Discharge and Diversion in relation to brain activity. Our aim was to explore neurological correlates of the use of these strategies. We focused our attention on the mPFC, which previous literature has shown to be active in the processing of emotional stimuli (Phan et al., 2005; Vuilleumier, 2005), and in the suppression of negative mood (Ochsner and Gross, 2005; Phan et al., 2005).

Mixed design ANOVAs showed a significant main effect of Diversion on mPFC activation in both females and males, and of Discharge on mPFC activation in males only. T-tests revealed that females with higher Diversion scores had higher levels of activation in the mPFC during music listening. Since PFC hypoactivity has been associated with depression (Koenigs and Grafman, 2009), including in subclinical and remitted cases (Kanske et al., 2012; Sarkar et al., 2015), increased mPFC activation in female participants who scored higher on Diversion lends support to the idea that Diversion may be considered an effective, healthy listening strategy.

The only significant t-test for male participants showed that males who scored highly in Diversion use had significantly lower activation of the mPFC while explicitly attending to fearful music. However, the significant result of the ANOVA for males based on Discharge despite no significant t-tests, and the trend towards lower mPFC activation in high scorers in Discharge, suggest a broad tendency of Discharge to be associated with decreased mPFC activity in males. The same tendency, however, was observed in male high scorers in Diversion. This may present a problem for the hypothesis that Diversion is an effective form of music mood regulation. It may also point to important gender differences in emotion processing and affect regulation. Males have previously been shown to have different neural responses to fearful stimuli compared to females (Schienle et al., 2005). Males have also been shown to have less PFC activation compared to females during affect regulation (McRae et al., 2008). More research is needed to clarify the role of gender in the current results.

That t-tests were generally not significant for male participants, although ANOVA results were significant, may suggest that the listening task was less determinant of males' mPFC activation than other factors, including Discharge use, which would be in line with previous research showing differences in neural responses based on habitual emotion regulation strategy (Kanske et al., 2012). However, because here we only used questionnaires rather than an experimental manipulation, it cannot be concluded whether these neural underpinnings are caused by repeated Discharge, anxiety and Neuroticism, or some other factor common to all three.

#### Music Therapy and Future Research

Espousing the philosophy that basic music psychology research can be used to support, strengthen and inform clinical music therapy practice (Thaut, 2002), the current study has aimed to contribute information about neural responses to music, non-clinical behavioral and cognitive engagement with music, and their respective relationships with mental health outcomes. The current results may be particularly applicable to psychoeducational treatment methods. However, further research is needed to determine the applicability of these results to other forms of music therapy.

Further research would of course be required to determine a causal link between Discharge use and mental health risk. Other types of research, including longitudinal study and qualitative investigations, could provide deeper and more nuanced understanding of the role of Discharge in psychological processes related to mental illness and mental health. Study involving clinical samples, and with careful controls for gender, could add much to the current results in terms of understanding the relative efficacy of MMR listening strategies. Finally, specific clinical music therapy studies are imperative to determining the efficacy of music therapy in applying these and further findings.

#### Limitations

Several limitations need to be noted regarding the current study. First, there were no explicit measures of non-musical mood regulation behavior in the participants; connections between MMR results and non-musical types of affect regulation have therefore been based on theory and previous findings, where empirical findings would be stronger and should be a topic of further study. As this study only examined neural responses in the mPFC, it likely presents an incomplete picture of the relationship between music mood regulation strategies and neural responses to music. Further study could also include clinical samples, which would shed more light on therapeutic applications of the use of music for mood regulation. As this sample was recruited only from the Helsinki area, it is possible that cultural homogeneity had some influence on results, although there are similarities in the CP of music across cultures (Krumhansl et al., 2000). The relatively young sample also limits the generalizability of the results for older populations. However, as suggested by Gebhardt and von Georgi (2007), the use of music to increase positive affect may be a behavior associated with younger populations in general, thus allowing the current findings to be, if not completely generalizable, acceptably relevant. The presence of professional musicians in the sample, as well as the broad recruitment measures, should also be taken into consideration in terms of the generalization of the results. Because, as McRae et al. (2008) showed, the PFC is differently activated in males compared to females during affect regulation, it cannot be said for certain whether these results merely reflect gender differences rather than Discharge use, anxiety or Neuroticism. Another limitation is that, to avoid Type II errors, we chose to limit the number of statistical tests performed as much as possible rather than to correct for multiple comparisons, which may of course have resulted in Type I errors. However, as there is theoretical support for our findings, we believe that they are unlikely to be largely spurious.
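To make the trade-off between Type I and Type II errors concrete, the sketch below shows what two standard corrections for multiple comparisons do to a set of p-values. It is an illustration only and was not part of the reported analysis. The inputs are the three p-values reported above for females in the implicit listening condition; treating just these three as the family of tests is an assumption made for the example, since more tests were run in the study overall.

```python
def bonferroni(pvals):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down adjustment, returned in the original order of pvals."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        adj = min(1.0, (m - rank) * pvals[i])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, adj)
        adjusted[i] = running_max
    return adjusted


# Reported p-values for happy, sad and fearful music (females, implicit condition).
reported = [0.004, 0.03, 0.02]
bonf = bonferroni(reported)
holm_adj = holm(reported)
```

Both corrections raise the adjusted p-values, which lowers the risk of Type I errors at the cost of power; the Holm procedure is never more conservative than Bonferroni.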

## Conclusion

This study has shown the possibility that an individual's use of music, particularly in response to negative affect, may relate to his or her mental health, as evidenced by correlations between listening tendencies and mental health outcomes. These results suggest Discharge may be an ineffective or harmful listening strategy in response to negative affect, while Solace and Diversion may provide more effective mood regulation. This result may encourage music therapists to explore healthy music-based affect regulation with their clients, to help clients identify ineffective or harmful listening strategies, and to develop and inform active interventions based on these identified cognitive strategies of music listening.

These results suggest that the mPFC is one key area in the processing of music emotions, and is activated more strongly in females who tend to use Diversion, but less strongly in males who scored highly on Discharge during the listening task overall. These differences point to possible neuro-mechanisms explaining the relative efficacy of Diversion as opposed to Discharge as an affect regulation strategy, implicate gender differences that may be reflective of different neural or psychological functioning of males and females, and require careful exploration in future research.

## Acknowledgments

We would like to thank Benjamin Gold, Johanna Nohrström, Taru Numminen-Kontti, and Mikko Heimola for help with subject recruitment and data collection, Jussi Numminen for screening the MR images, Marita Kattelus, Jurki Mäkelä and Toni Auranen for help and consultancy regarding fMRI data acquisition, Teppo Särkämö and Tiziana Quarto for assistance with scoring the psychological tests, and Mari Tervaniemi and Mikko Sams for help in several stages of this project. This work was financially supported by the Academy of Finland (project numbers 272250 and 274037) and the Signe and Ane Gyllenberg Foundation.

## References


in healthy subjects. Psychiatry Clin. Neurosci. 63, 283–290. doi: 10.1111/j.1440- 1819.2009.01965.x


mental well-being in older subjects: a randomized, double-blind, placebocontrolled trial. Am. J. Clin. Nutr. 88, 706–713.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Carlson, Saarikallio, Toiviainen, Bogert, Kliuchko and Brattico. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The neurochemistry and social flow of singing: bonding and oxytocin

Jason R. Keeler <sup>1</sup> , Edward A. Roth<sup>1</sup> \*, Brittany L. Neuser <sup>1</sup> , John M. Spitsbergen<sup>2</sup> , Daniel J. M. Waters <sup>2</sup> and John-Mary Vianney <sup>2</sup>

*<sup>1</sup> Brain Research and Interdisciplinary Neurosciences Laboratory, School of Music, Western Michigan University, Kalamazoo, MI, USA, <sup>2</sup> Department of Biological Sciences, Western Michigan University, Kalamazoo, MI, USA*

Music is used in healthcare to promote physical and psychological well-being. As clinical applications of music continue to expand, there is a growing need to understand the biological mechanisms by which music influences health. Here we explore the neurochemistry and social flow of group singing. Four participants from a vocal jazz ensemble were conveniently sampled to sing together in two separate performances: pre-composed and improvised. Concentrations of plasma oxytocin and adrenocorticotropic hormone (ACTH) were measured before and after each singing condition to assess levels of social affiliation, engagement and arousal. A validated assessment of flow state was administered after each singing condition to assess participants' absorption in the task. The feasibility of the research methods was assessed and initial neurochemical data were generated on group singing. Mean scores of the flow state scale indicated that participants experienced flow in both the pre-composed (*M* = 37.06) and improvised singing conditions (*M* = 34.25), with no significant difference between conditions. ACTH concentrations decreased in both conditions, significantly so in the pre-composed singing condition, which may have contributed to the social flow experience. Mean plasma oxytocin levels increased only in response to improvised singing, with no significant difference between improvised and pre-composed singing conditions observed. The results indicate that group singing reduces stress and arousal, as measured by ACTH, and induces social flow in participants. The effects of pre-composed and improvised group singing on oxytocin are less clear. Higher levels of plasma oxytocin in the improvised condition may perhaps be attributed to the social effects of improvising musically with others. Further research with a larger sample size is warranted.

Keywords: singing, oxytocin, social flow, ACTH, improvisation, music, bonding, trust

## Introduction

People often report a feeling of connectedness during music experiences, either as a listener or a performer. Musicians often discuss "feeling lost in the music" or "finding the groove" during improvisatory experiences (e.g., "trading fours" in a jazz performance) and audience members frequently share this sense of cohesion through a commitment to the music (Pitts, 2004; Hytönen-Ng, 2013). The colloquial phrases used to describe this engagement and connectedness within music experiences align with the theoretical construct of flow, which is an optimal psychological state in which a person is completely absorbed in the task at hand (Csikszentmihalyi, 1975). When experienced in group settings, flow holds the potential to facilitate social connection, but the present research base examining social flow and interpersonal connection in music experiences is limited (Hart and Di Blasi, 2015). Additionally, little is known about the neurochemical processes that facilitate social bonding during group music experiences. Over the past two decades, neuroscience research in music has relied heavily on neuroimaging, mapping regions of the brain active during music production and perception. It is only more recently that the neurochemical responses to music have been investigated. Chanda and Levitin (2013) review the chemical and biological effects of music and express a strong need for further research. Current evidence suggests that music's effects on health and well-being may be modulated through engagement of neurochemical systems (Chanda and Levitin, 2013; Fancourt et al., 2014). In particular, group singing has demonstrated positive effects on emotional states and biological outcomes, implicating the neuroendocrine system as a potential underlying mechanism (Kreutz et al., 2004; Kreutz, 2014; Fancourt et al., 2015). The neuropeptide oxytocin may in part be responsible for the social and health benefits of music, while adrenocorticotropic hormone (ACTH) may mediate the engagement and arousal effects of music (Chanda and Levitin, 2013; Kreutz, 2014). These physiological processes may consequently influence the subjective experience of social flow and perception of social connection during music experiences.

Edited by:

*Julian O'Kelly, Royal Hospital for Neuro-disability, UK*

#### Reviewed by:

*Graham Frederick Welch, UCL Institute of Education, UK Suzanne B. Hanser, Berklee College of Music, USA*

#### \*Correspondence:

*Edward A. Roth, Brain Research and Interdisciplinary Neurosciences Laboratory, School of Music, Western Michigan University, 1903 W Michigan Ave., Kalamazoo, MI 49008-5434, USA edward.roth@wmich.edu*

> Received: *25 June 2015* Accepted: *07 September 2015* Published: *23 September 2015*

#### Citation:

*Keeler JR, Roth EA, Neuser BL, Spitsbergen JM, Waters DJM and Vianney J-M (2015) The neurochemistry and social flow of singing: bonding and oxytocin. Front. Hum. Neurosci. 9:518. doi: 10.3389/fnhum.2015.00518*

## Oxytocin and ACTH

Oxytocin is a neuropeptide produced by large neuroendocrine cells of the supraoptic and paraventricular nuclei of the hypothalamus. Oxytocin is transported from the large neuroendocrine cells to the posterior lobe of the pituitary gland, where it is subsequently released into the bloodstream as a hormone. The paraventricular nucleus (PVN), where oxytocin synthesis is most concentrated, coordinates signals from the brain in response to stress and controls the hypothalamic-pituitary-adrenal (HPA) axis (Herman, 2012). Neurons in the PVN release corticotropin-releasing factor (CRF), which promotes the secretion of ACTH into peripheral circulation. ACTH is a neurohormone that stimulates the synthesis and release of glucocorticoids, such as cortisol, from the adrenal gland (Grossman et al., 1982). Oxytocin is colocalized with stress hormones in the PVN and has suppressive effects on the HPA axis, including ACTH (Gibbs, 1986; Windle et al., 2004; Carter, 2014).

The word oxytocin is derived from the Greek words meaning "quick birth." In humans, functions of oxytocin were originally associated with maternal behaviors such as mother-infant bonding, breast-feeding, and uterine contractions (Takahashi et al., 2013). More recent findings reveal the broader scope of oxytocin in human social and emotional behaviors, with effects that are highly dependent on context and individual traits (Bartz et al., 2011). Oxytocin mediates social behavior (Heinrichs et al., 2009) and regulates stress and anxiety (Ditzen et al., 2009). Depending on the context and the individual, it is hypothesized that oxytocin may elicit positive or negative social emotions (Bartz et al., 2011). Under optimal circumstances, oxytocin increases trust (Kosfeld et al., 2005) and is associated with a parent's social attachment to their children (Feldman et al., 2010). While oxytocin production in humans was originally believed to increase only in response to direct physical contact, mothers bonding with their infants demonstrated higher plasma oxytocin levels from vocalizations alone (Leslie et al., 2010). The extended period of nurturing facilitated by oxytocin, as well as its role in reproductive behavior and physiologic functions, indicates its importance in human social and intellectual development (Carter, 2014). Research linking social behaviors to positive health and disease outcomes implicates oxytocin as a primary physiologic mechanism (Uchino, 2006).

ACTH may mediate the engagement and arousal effects of music (Chanda and Levitin, 2013). ACTH, which mediates attention (Sandman et al., 1975, 1977) and distress (Mauri and Volpe, 1994), responds to various types of challenges or pain perceived in higher levels of the brain (Herman, 2012). While there appears to be a general consensus among studies that music listening enhances oxytocin synthesis, the role of ACTH is less clear. Preliminary evidence suggests that listening to stimulating music, such as techno, increases plasma ACTH while relaxing music reduces ACTH synthesis and circulation (Gerra et al., 1998). ACTH is examined in this study because it responds to stimuli in seconds (Weijnen and Slangen, 1970). The short duration of each singing condition in this study indicates that ACTH may be implicated in behaviors of arousal and attention. To date, no studies were found that examined both ACTH and oxytocin in active music production.

### Social Affiliation and Engagement in Music

A handful of studies have examined endogenous oxytocin during music production and perception. Postoperative patients listening to relaxing music through headphones demonstrated an increase in serum oxytocin and reported higher levels of relaxation compared to a control group with no music (Nilsson, 2009). Choral singing has been shown to increase salivary oxytocin and elicit positive emotional states (Kreutz et al., 2004; Kreutz, 2014). In professional and amateur singers, peripheral oxytocin increased after an individual 45 min singing lesson; however, music parameters were not identified and non-musical interactions during the lesson may have influenced outcome measures (Grape et al., 2002). In that same study, amateur singers demonstrated a decrease in post-singing levels of cortisol, while professional singers showed the opposite trend, pointing toward higher levels of perceived stress and arousal in the professional singers. This trend is consistent with the recent findings of Fancourt et al. (2015), where low-stress singing without an audience reduced levels of salivary cortisol and cortisone, and high-stress singing in front of a large audience increased levels of both glucocorticoids. Depending on individual traits and context, it is possible that singing may be perceived as a stressful experience with corresponding biological responses. In general, singing appears to have potential benefits on psychological health and well-being, with additional indications of potential physical benefits (Clift et al., 2010). More research is needed, however, to identify the underlying mechanisms linking singing to physical and psychological health.

While there is little information on the neurochemistry of singing, previous research has demonstrated the effects of singing on behavioral and self-reported outcomes. Group singing produced the highest scores on trust and cooperation compared to other group activities, as measured by a trust and dilemma game (Anshel and Kipper, 1988). In those with mental illness, singing has been found to increase mental health, well-being, and social skills (Clift and Morrison, 2011). Children's sense of inclusion and belonging with their peers was positively correlated with their singing abilities in a longitudinal study on the social impact of music (Welch et al., 2014). This falls in line with the theory that music has evolved as a means of social bonding, with evolutionary roots in parent-infant attachment (Freeman, 1998). Therefore, the belief that oxytocin plays a large role in the social and health benefits of music appears to be supported by previous behavioral findings.

#### Social Flow

The concept of flow, frequently referred to as flow state, was first introduced to the field of positive psychology in 1975 by Csikszentmihalyi. According to Csikszentmihalyi, to experience flow is to experience an optimal psychological state (1975). People experiencing flow find themselves completely immersed in the present activity, so intensely focused that all unrelated thoughts and emotions seemingly disappear from their conscious being, allowing for efficient yet effortless execution of thoughts and actions. Autotelic in nature, the flow experience is both enjoyable and intrinsically rewarding (Csikszentmihalyi, 1975, 1997). Though the primary focus of flow theory lies in the flow experience itself, positive consequences of flow including increased motivation, creativity, efficacy, and subjective wellbeing have been observed (Csikszentmihalyi and LeFevre, 1989; Jackson et al., 2001; Mugford, 2004; Fritz and Avsec, 2007; Engeser, 2012; Salanova et al., 2014).

In recent years, researchers have expressed a growing interest in the concept of social flow, which involves not only optimal performance, but also optimal interaction with others (Csikszentmihalyi, 1975; Bachen and Raphael, 2011; Engeser, 2012). Social flow has been studied in various contexts including athletic, occupational, and familial environments. Though none of these areas have been extensively researched, the literature currently suggests that frequency of social flow experience is positively correlated with the quality of interpersonal relationships (Rathunde, 1997; Bakker et al., 2011; Salanova et al., 2014). Underlying the neurochemistry of social flow may be oxytocin and ACTH, which we explore in this study.

#### Social Flow and Engagement in Music

The literature also indicates that various music-related tasks such as music listening, composition, and performing are conducive to facilitating solitary and social flow. Colloquial phrases such as "feeling the groove" and "lost in the music" frequently arise in everyday conversation amongst musicians. Current evidence suggests that these phrases may actually be related to flow states experienced within a musical context (MacDonald et al., 2006; Baker and MacDonald, 2013; Diaz, 2013; Wrigley and Emmerson, 2013). One of the most frequently mentioned musical genres in music-related flow literature is jazz. Jazz musicians frequently report feelings associated with flow experiences such as intense oneness with their musical product and the merging of individual musicians to form a single cohesive entity when performing, particularly when improvising and exploring new sounds (Hytönen-Ng, 2013).

Walker (2010) explored differences between flow experienced in low-interdependent and high-interdependent tasks in the context of a singles and doubles racket sport game. He found that, though the number of consecutive volleys did not differ across conditions, the high-interdependent task was rated as more challenging and the flow experience was reported to be more intense than in the low-interdependent task. In a musical context, participation in a standard performance of a precomposed piece may be considered a low-interdependent task whereas playing or singing in an improvisatory manner can be viewed as a high-interdependent task. Improvisation requires clear communication and cooperation from all group members. Each individual member is challenged to use both their technical competence and artistic instinct in creating an innovative sound within a given musical structure, maintained by the group as a whole (Sawyer, 2006; Rogers, 2013). To date, no experimental studies have been found that examined flow in the context of group music improvisation.

As a feasibility study, a primary purpose was to evaluate the design and methodology. In a recent review, LaGasse (2013) highlights the importance of pilot and feasibility studies in music therapy, especially in previously unstudied areas. The appropriateness of the methodology was assessed by utilizing a small sample size and previously validated data collection procedures. It is hoped that preliminary findings from this study will inform future researchers on the feasibility of neurochemical and flow state data collection in vocal music production. We compared a standard, pre-composed singing performance with an improvised performance of the same song, and hypothesized that vocal improvisation would elicit higher measures of social flow and bonding when compared to a pre-composed vocal performance.

## Methods

### Participants

Four participants (2 males and 2 females) were conveniently sampled based on the following inclusion criteria: jazz vocalists, students at a large Midwestern American university, and over 18 years of age. We included jazz vocalists to explore the neurochemistry and social flow of vocal improvisation within a group context. Vocal quartets are common in jazz music, and therefore, the group structure was familiar to participants and limited extraneous challenges in a controlled setting. Participants were excluded from the study if they met any of the following criteria: medical or psychiatric illness, smoking more than 15 cigarettes per day, drug or alcohol abuse, weighing less than 110 lbs., bleeding disorders (e.g., hemophilia), and pregnancy. Participants were asked to abstain from food and drink (other than water) 2 h before the experiment and from smoking, caffeine, and alcohol 24 h before the experiment.

#### Design

A mixed design using repeated measures was utilized to explore the effects of pre-composed and improvised group singing on social flow and neurochemical measures of connectedness and arousal (see **Figure 1**). Participants formed one group and performed the same song together in two conditions. The first condition was a performance of the music as it was written without improvisation or further embellishment of the melody, referred to from this point forward as "standard performance" (SP). The second condition was a performance of the music that followed the syntactical harmonic structure (chord changes) of the composition, with improvised melodies, referred to from this point forward as "improvised performance" (IP). In each condition, pre and post-tests measured plasma oxytocin and ACTH, and a post-test survey assessed the level of social flow experienced by participants. Based on the brief duration of each performance and the short half-life of plasma ACTH and oxytocin, a 30 min washout period was utilized between conditions to allow neuropeptide levels to return to baseline. The reported half-life of plasma ACTH is approximately 10 min (Yalow et al., 1964). Similarly, the half-life of plasma oxytocin is estimated at 5–10 min (Amico et al., 1987). In healthy subjects, oxytocin levels peaked after only 5–8 min of music listening, with plasma levels returning toward baseline after 7–10 min (Dai et al., 2012). In a different study examining music perception, researchers utilized a 10 min washout period prior to obtaining baseline measures of plasma ACTH (Evers and Suhr, 2000). Therefore, it was estimated that a period of 30 min between conditions would allow neuropeptide levels to return to baseline. This study was reviewed and approved by the Human Subjects Institutional Review Board at Western Michigan University. Informed written consent was obtained from all participants involved in the study.
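The washout reasoning above can be illustrated with a simple clearance calculation. The following Python sketch is illustrative only: it assumes single-exponential (first-order) decay, which the study does not state, and the function name `fraction_remaining` is hypothetical. The half-life values are the estimates cited above.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of an initial plasma elevation still present after elapsed_min,
    assuming simple first-order (exponential) clearance (an assumption)."""
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


WASHOUT_MIN = 30.0  # washout period used between conditions

# Half-life estimates quoted in the Design section.
half_lives_min = {
    "ACTH (approx. 10 min)": 10.0,
    "oxytocin (5 min, lower estimate)": 5.0,
    "oxytocin (10 min, upper estimate)": 10.0,
}

for label, t_half in half_lives_min.items():
    remaining = fraction_remaining(WASHOUT_MIN, t_half)
    print(f"{label}: {remaining:.1%} of the elevation remaining after washout")
```

Under these assumptions, 30 min corresponds to three half-lives at a 10 min half-life, leaving one-eighth of any elevation above baseline. Real clearance may deviate from a single-exponential model.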

#### Musical Considerations

All musical decisions were made in collaboration with the university's school of music vocal jazz director, who was familiar with the participants' skill level and repertoire. The jazz standard "Centerpiece" (Edison and Hendricks, 1958) served as the musical content for both the standard and improvised conditions. The vocal jazz director created two vocal quartet arrangements of the piece, one for the standard performance condition (SP) and one for the improvised performance condition (IP). The SP arrangement was sung as written, with no improvisation or embellishments. The IP arrangement began with the unison singing of the original melody and then allowed time for each participant within the group to improvise over the basic harmonic structure of the original song.

#### Blood Draws

Six milliliters of blood was drawn from the antecubital vein into 6 ml EDTA lavender top tubes containing 5.0 mg EDTA and 2.500 KIU aprotinin. A sterile field was maintained using a Vacutainer blood draw kit. Whole blood was immediately placed on ice and transported to the adjacent lab, where it was centrifuged at 1200 rpm for 12 min at 4°C. Six milliliters of whole blood yielded approximately 2 ml of plasma per participant. Plasma was aliquoted into microtubes and stored at −70°C until analysis.

#### Measures

Oxytocin and ACTH concentrations were determined using enzyme-linked immunosorbent assay kits produced by Enzo Life Sciences, Inc. (Farmingdale, NY, USA). The oxytocin kit has been previously validated for human plasma using various methods, including mass spectrometry (Carter et al., 2007). Sensitivity for oxytocin and ACTH was 15.0 pg/mL and 0.46 pg/mL, respectively. All samples were run in duplicate. Oxytocin plasma samples were diluted 1:8 with assay buffer and run unextracted. Both assays were run according to manufacturer instructions. Due to the small sample size, only one plate was needed to analyze each neuropeptide. The intra-assay coefficients of variation for ACTH and oxytocin were 15 and 9%, respectively. All tests were performed in collaboration with the Department of Biological Sciences at Western Michigan University.
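For readers unfamiliar with how an intra-assay coefficient of variation can be derived from duplicate wells, the Python sketch below shows one common approach: compute the CV (sample standard deviation divided by the mean) for each duplicate pair and average across samples. This is an assumption about the calculation, not a description of the procedure used in this study, and the function names and the example concentrations are hypothetical.

```python
from statistics import mean, stdev


def duplicate_cv_percent(a: float, b: float) -> float:
    """CV (%) of one sample measured in duplicate: sample SD / mean * 100."""
    m = mean([a, b])
    if m == 0:
        raise ValueError("mean concentration is zero; CV is undefined")
    return stdev([a, b]) / m * 100


def intra_assay_cv_percent(duplicates: list[tuple[float, float]]) -> float:
    """Average the per-sample CVs across all duplicate pairs on one plate."""
    return mean([duplicate_cv_percent(a, b) for a, b in duplicates])


# Hypothetical duplicate readings in pg/mL, for illustration only.
example_plate = [(210.0, 230.0), (180.0, 175.0), (305.0, 290.0)]
print(f"Intra-assay CV: {intra_assay_cv_percent(example_plate):.1f}%")
```

Other conventions exist (for example, pooling variances across pairs), so the convention used should be reported alongside the value.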

There is controversy surrounding the measurement of oxytocin on unextracted samples. In human fluids, there may be interference from various proteins and other substances leading to unreliable measurements (McCullough et al., 2013). However, it is also argued that extraction protocols may underestimate peripheral oxytocin concentrations, as a majority of oxytocin is lost during the extraction process (Martin and Carter, 2013). To avoid precipitation of oxytocin in the blood, samples were run unextracted using a previously validated, sensitive, and specific commercially-available kit.

Social flow was measured using the Flow State Scale-2 (FSS-2; Jackson et al., 2010), a 36-item questionnaire that assesses an individual's perceived level of flow within a specific event. The FSS-2 is a post-event assessment and was administered immediately following post-test blood draws in each condition. The 36 items (questions) in the FSS-2 reflect Csikszentmihalyi's definition of flow, with four items directly addressing each of the following nine flow dimensions: challenge-skill balance, merging of action and awareness, clear goals, unambiguous feedback, concentration on the task at hand, sense of control, loss of self-consciousness, transformation of time, and autotelic experience. Participants were directed to respond to each statement using a 5-point Likert scale in which a score of 1 indicated "Strongly Disagree" and a score of 5 indicated "Strongly Agree." The FSS-2 was scored by first calculating the total raw score for each dimension. Each raw dimension score was then divided by 4 to compute the average dimension score. The sum of all average dimension scores provides the total scale score. Group means for each dimension and total scaled scores were used to determine the level of social flow experienced. The maximum total scale score of 45 signifies the ultimate flow experience. The minimum total scale score of 9 signifies no flow. The FSS-2 has demonstrated strong construct validity and reliability across various physical activities including yoga, basketball, soccer, running, and football (Jackson and Eklund, 2002; Jackson et al., 2008). The FSS-2 has also been shown to be valid and reliable in measuring flow during live music performance, with Cronbach's alpha ranging from 0.81 to 0.92 for each of the nine dimensions (Wrigley and Emmerson, 2013).
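The scoring steps described above can be sketched in Python as follows. This is a minimal illustration, not the published scoring key: the function name `score_fss2` is hypothetical, and responses are assumed to be already grouped by dimension, because the mapping of item numbers to dimensions is not given here.

```python
DIMENSIONS = [
    "challenge-skill balance",
    "merging of action and awareness",
    "clear goals",
    "unambiguous feedback",
    "concentration on the task at hand",
    "sense of control",
    "loss of self-consciousness",
    "transformation of time",
    "autotelic experience",
]


def score_fss2(responses: dict[str, list[int]]) -> tuple[dict[str, float], float]:
    """Return (average score per dimension, total scale score).

    responses maps each dimension name to its four Likert ratings (1-5).
    """
    dimension_means = {}
    for dim in DIMENSIONS:
        items = responses[dim]
        if len(items) != 4 or any(not 1 <= r <= 5 for r in items):
            raise ValueError(f"{dim}: expected four ratings between 1 and 5")
        raw_score = sum(items)               # total raw score for the dimension
        dimension_means[dim] = raw_score / 4  # average dimension score
    total_scale_score = sum(dimension_means.values())
    return dimension_means, total_scale_score


# Hypothetical respondent who answers "Agree" (4) to every item.
example = {dim: [4, 4, 4, 4] for dim in DIMENSIONS}
means, total = score_fss2(example)
print(total)  # 36.0
```

With nine dimensions each averaging between 1 and 5, the total necessarily falls between 9 and 45, which matches the scale bounds stated above.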

#### Procedure

Participants (n = 4) formed one group together (a vocal quartet). At the beginning of the experiment, the consent form was reviewed and participants received a brief overview of the procedures. Participants were instructed to avoid physical contact during the experiment. SP pre-test blood draws were then conducted for two participants at a time in a separate room. The remaining two participants received the SP pre-test blood draw after each previous participant was finished. Given the labile nature of oxytocin and ACTH, two phlebotomists were used to expedite the blood collection process and minimize potential protein breakdown after whole blood was placed on ice. Following the SP pre-test blood draws, the vocal jazz director provided participants with brief musical instructions lasting approximately 5 min. Participants were instructed to sing their respective part of the music as it was written, without any embellishment or improvisation. Immediately following the instructional period, participants performed the standard piece together as it was written, with accompaniment provided on the piano by the aforementioned vocal jazz director. Immediately following the standard performance, which lasted 5 min and 38 s, participants were called in pairs to the separate room where individual post-test blood draws were conducted. All participants confirmed that they were able to proceed without ill effects from the blood draws and were then escorted individually to nearby but isolated rooms where they were provided with a paper copy of the FSS-2 and a pencil. Each individual was instructed to complete the FSS-2 according to the directions located at the top of the test-page and upon completion of the survey, take time to rest before completing the second performance. Following the 30 min test-and-rest period, surveys were submitted and individual pre-test blood draws for the improvised performance were conducted. 
Following the same format as the SP condition, participants received 5 min of instructions prior to the improvised performance. Participants performed the improvised piece together with extensive embellishment and improvisation. The duration of the improvised performance was 6 min and 1 s. Immediately after the improvised performance, individual blood draws were conducted and participants were again given the flow-state survey to complete in separate rooms. Following completion of the surveys, participants were provided with follow up information and thanked for their time.

## Results

Here we present initial data on social flow, oxytocin and ACTH in standard and improvised group singing. The successful implementation of procedures and collection of data indicate the feasibility of methods used in this study, which serve as our primary outcomes. Given the small sample size, this study was not sufficiently powered for statistical analyses. As such, the statistical analyses presented here are primarily for demonstrative purposes, to inform future research in this area on possible data trends and analyses. Descriptive and parametric statistical analyses were performed using SPSS (IBM) version 22.

#### FSS-2

Raw FSS-2 scores indicate that participants experienced social flow in both the standard and improvised conditions. The mean total scaled score in the SP condition, with a maximum possible score of 45, was 37.06 (n = 4). The mean total scaled score in the IP condition, with a maximum possible score of 45, was 34.25 (n = 4). Mean scores for each dimension of flow for the SP and IP conditions are displayed in **Table 1**.

TABLE 1 | FSS-2 Dimension scores in standard and improvised conditions.



TABLE 2 | Descriptive statistics of neurochemical variables, adrenocorticotropic hormone (ACTH), and oxytocin, at each time point in the standard (SP) and improvised (IP) conditions (n = 4).

#### Oxytocin and ACTH

Descriptive data for oxytocin and ACTH collected at each time point are presented in **Table 2**. Both the standard and improvised conditions revealed a mean decrease in plasma ACTH after participants sang together. The change in ACTH in the SP condition pre-to-post test (p < 0.05) was 21% greater than the pre-to-post test ACTH outcomes for the IP condition (p > 0.05). **Figure 2** depicts individual changes in ACTH at each time point. Student's t-test of oxytocin concentrations in the SP condition revealed a mean decrease of 10 pg/mL, while the IP condition demonstrated a mean increase of 27 pg/mL. **Figure 3** depicts individual changes in oxytocin concentrations across all time points. In **Figure 4**, individual changes in oxytocin and ACTH are compared at each time point to reveal an inversely proportional trend in 3 out of 4 subjects.

## Discussion

### Feasibility of Experimental Procedures

The primary purpose of this study was to evaluate the feasibility of experimental research procedures that were designed to investigate the neurochemical processes and social flow experiences associated with group singing. The successful implementation of procedures and data collection indicates the feasibility of the methods employed in this study, which may help future researchers in exploring the relationships between music improvisation, neurochemical activity, and social flow experiences.

Though the drawing of blood in between the completion of the task and the distribution of the FSS-2 is a potential confounding variable with regard to the assessment of flow state, the inclusion of this element allowed the researchers to evaluate the feasibility of efficiently and effectively gathering both self-report and neurochemical data within a single condition. Based on FSS-2 scores and participant responses during the debriefing period, the procedures were effective in facilitating flow for student musicians and the blood draw did not appear to affect the subjective reports of flow. It is also important to acknowledge that seven out of the eight flow scales were completed within 15 min of the end of the task in question, as it is recommended that the FSS-2 be administered as soon as possible following participation in the activity being assessed (Jackson et al., 2010).

One participant required additional time during the final blood draw, as the phlebotomist was unable to access the antecubital vein upon the first few attempts. While the blood draw was successful after several minutes, there are inherent challenges in drawing blood from human subjects, such as syncope, that are best controlled for with a larger sample size. We were able to determine, however, that group singing was not impeded by the controlled environment and data collection procedures.

#### Social Flow in Music Improvisation

Raw FSS-2 scores indicate that social flow was experienced in both the standard and improvised conditions. Raw scores also indicate that the experience of flow was slightly greater during the standard performance than in the improvised performance. Previous research has found that creative demand within a task is positively correlated with level of flow (Baker and MacDonald, 2013). In the present study, the improvisation task was structured in a way that required more creative input than the standard performance, but did not appear to facilitate deeper flow experiences.

One particular dimension of flow, loss of self-consciousness, appeared to contribute most to the difference in flow experience between conditions as the mean dimension score was 1.32 points lower on the 5-point Likert scale during improvisation than during standard performance. This observation aligns with the findings of Wrigley and Emmerson (2013), who found that the loss of self-consciousness dimension received the lowest mean score in comparison to the other eight flow dimensions in undergraduate vocal students. While the creative demands in improvisation tasks may appear more conducive to social flow from a challenge-skills perspective, it must also be recognized that creative output in group settings requires a certain level of self-disclosure amongst group members, which may elicit anxiety and inhibit flow. The decrease in ACTH concentrations after singing in both conditions indicates a potential relationship between flow state and stress, which warrants further investigation.

In addition to the increased levels of self-disclosure in improvisation, the participants' level of flow may have been affected by their skill level and past musical experiences. For example, one participant, during debriefing discussion, mentioned that she did not improvise as frequently as the other participants and she felt unsure of herself. However, when presented with a follow up question, she stated that she did not feel as anxious in this setting as in past experiences. This may be, in part, because participants were repeatedly told that the researchers' primary focus was on the nonmusical components rather than the quality of the musical product.

Another participant mentioned that he found himself having to put forth more effort during the improvisation, stating, "When it was spontaneous, I actually had to do a lot more thinking." This statement conflicts with previous reports of professional jazz musicians, who state that when they are improvising they frequently get "in the musical zone" and actions and sounds seem to naturally occur with little felt effort. Some professionals also describe their most memorable improvisation experiences as being surreal (Hytönen-Ng, 2013). This inconsistency between studies sheds light on the importance of accounting for the skill level and past experiences of musicians, as the experiences of student musicians such as those who participated in this study, may differ from those of professional and more experienced musicians.

#### Oxytocin and ACTH

We hypothesized that group singing in both conditions would decrease stress and arousal, as measured by ACTH, and increase social bonding, as measured by oxytocin. Despite the small sample size, there was a significant decrease in pre-to-post levels of ACTH during the standard singing performance. This is consistent with previous literature demonstrating the positive effects of music on stress and the immune system (Bittman et al., 2001; Chanda and Levitin, 2013). Singing has been shown to reduce cortisol levels depending on context and individual traits, and is generally associated with psychological health and wellbeing (Grape et al., 2002; Clift et al., 2010; Fancourt et al., 2015). This is the first time that ACTH has been examined during active music production, as opposed to music listening, and the results appear to be supported by previous findings. The improvised singing performance demonstrated a minor decrease in ACTH from pre-to-post levels, which perhaps may be attributed to low concentrations observed at the IP pre-test. This may have been caused by carry-over effects from the first condition (see recommendations for future research).

As expected, mean concentrations of oxytocin increased during the improvised condition. Vocal improvisation naturally elicited behaviors conducive to social bonding, such as listening, responding, spontaneous communication, eye contact, and cooperation. Surprisingly, the standard performance demonstrated a mean decrease in oxytocin concentrations. This should be interpreted with caution due to the small sample size and influence of individual traits on plasma oxytocin levels (Bartz et al., 2011). Studies with a larger sample size have demonstrated significant increases in peripheral oxytocin after choral singing and individual singing lessons (Grape et al., 2002; Kreutz, 2014). The high variability of oxytocin concentrations observed among participants in the present study can be addressed through a larger sample size. A sufficiently powered study using the same design as the current study (1 – β = 0.80, effect size > 0.25, alpha < 0.05) would require 82 participants based on our data.

#### Limitations

Several limitations need to be considered when interpreting the results of this study. The small sample size provides initial data; however, it does not provide enough power to make strong statistical inferences. Prior to conducting the study, participants were informed that the researchers were investigating differences between standard and improvised vocal performance. This knowledge may have influenced responses either during or following each task.

#### Conclusion and Recommendations for Future Research

This study provided support for the feasibility of conducting experimental flow research in which the structure of a musical task is manipulated. Because the present study involved a very small sample, it is recommended that a study be conducted with a larger sample size to further evaluate the experimental procedures and to gain sufficient power that would allow for effective statistical analysis of results. To avoid potential carry over effects, a longer washout period between conditions is recommended. Although previous studies indicated only a short washout period was necessary, mean plasma ACTH at baseline was significantly lower in the second (IP) condition. This may also be attributable to an order effect, as all participants sang in the IP condition following the SP condition. A repeated measures design with counterbalancing may control for any order effect observed in this study. A washout period at the beginning of the experiment may also reduce variations in neuropeptide levels caused by pre-experimental stimuli. Given the recent findings of Fancourt et al. (2015), a potential extension of this study may also include a rehearsal period and self-report data to evaluate the level of stress experienced by participants during standard and improvised singing. In the present study, the significant decrease in ACTH during the standard performance may indicate a low-stress condition experienced by participants. The stress level of the improvised performance is less clear; however, the upward trend of ACTH and informal statement from one of the participants may perhaps indicate a higher level of stress experienced while improvising.

It is also recommended that future research account for potential effects of personality traits on proneness to flow during music production tasks. For instance, it has been observed that perfectionist tendencies frequently inhibit flow (Bruya, 2010). With a small sample size, these characteristics could disproportionately influence individual and group-level data. Screening participants with a personality inventory could prove helpful in accounting for such differences and allow for comparisons of flow experience of individuals with varying personality traits.

During this study, the researchers informally noted differences in participants' behavioral tendencies between conditions. For example, a higher frequency of social interaction cues, such as eye contact, was observed during the improvisation task. Behavioral analyses combined with 7–8 formal interview questions of the participants' subjective experience would provide greater insight on social flow and affiliation in standard and improvised group singing.

As stated previously, the current study tested our procedures using typically functioning student musicians. Future researchers might employ the use of music improvisation in dyadic and group settings with non-diagnosed populations (this could include both musicians and non-musicians) first to facilitate positive social interactions and elicit desired nonmusical responses (MacDonald and Wilson, 2014). Once an understanding of biochemistry and flow during singing for typically functioning people is obtained, clinical research with appropriate populations can be pursued. For purposes of translating the research to clinical application, future researchers may also consider structuring the music production tasks in a way that more accurately reflects the current practices of music therapists. For example, many clinical experiences entail the playing of instruments as opposed to singing. Clinicians may also manipulate instruments in such a way that clients, despite age or ability, are able to participate in an error-free music-making experience (improvising modally, for example) that results in the creation of an aesthetically pleasing sound. Collaboration between practicing clinicians and researchers will be important in progressing this line of research and gaining more insight into the facilitation, experience, and outcomes of social flow and affiliation in the context of group music-making.

## Acknowledgments

We would like to thank Professor Greg Jasperse for his support in the recruitment of participants and arrangement of the musical selections. We would also like to thank Brandy King and Jessica Toth for their assistance with the phlebotomy procedures. Funding for this research was provided by the Graduate Student Research Grant from Western Michigan University and the National Science Foundation grant number DBI-1062883.

## References

Amico, J. A., Ulbrecht, J. S., and Robinson, A. G. (1987). Clearance studies of oxytocin in humans using radioimmunoassay measurements of the hormone in plasma and urine. J. Clin. Endocrinol. Metab. 64, 340–345. doi: 10.1210/jcem-64-2-340

Anshel, A., and Kipper, D. A. (1988). The influence of group singing on trust and cooperation. J. Music Ther. 25, 145–155. doi: 10.1093/jmt/25.3.145

Bachen, C. M., and Raphael, C. (2011). Social Flow and Learning in Digital Games: A Conceptual Model and Research Agenda. London: Springer.


trial. J. Clin. Nurs. 18, 2153–2161. doi: 10.1111/j.1365-2702.2008.02718.x


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Keeler, Roth, Neuser, Spitsbergen, Waters and Vianney. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The challenges and benefits of a genuine partnership between Music Therapy and Neuroscience: a dialog between scientist and therapist

#### Wendy L. Magee<sup>1</sup> \* and Lauren Stewart <sup>2</sup>

<sup>1</sup> Music Therapy Program, Boyer College of Music and Dance, Temple University, Philadelphia, PA, USA, <sup>2</sup> Department of Psychology, Goldsmiths, University of London, London, UK

Collaborations between neuroscience and music therapy promise many mutual benefits given the different knowledge bases, experiences and specialist skills possessed by each discipline. Primarily, music therapists deliver music-based interventions on a daily basis with numerous populations; neuroscientists measure clinical changes in ways that provide an evidence base for progressing clinical care. Although recent developments suggest that partnerships between the two can produce positive outcomes for both fields, these collaborations are not considered mainstream. The following dialog between an experienced professional from each discipline explores the potential for collaboration, as well as the misconceptions that may be preventing further synergies from developing.

#### Edited by:

Jörg Christfried Fachner, Anglia Ruskin University, UK

#### Reviewed by:

Antoni Rodriguez-Fornells, University of Barcelona, Spain Susanne Metzner, Hochschule Magdeburg-Stendal, Germany

#### \*Correspondence:

Wendy L. Magee, Music Therapy Program, Boyer College of Music and Dance, Temple University, 7 N Columbus Blvd, #131, Philadelphia, PA 19106, USA l.stewart@gold.ac.uk

> Received: 14 January 2015 Accepted: 08 April 2015 Published: 01 May 2015

#### Citation:

Magee WL and Stewart L (2015) The challenges and benefits of a genuine partnership between Music Therapy and Neuroscience: a dialog between scientist and therapist. Front. Hum. Neurosci. 9:223. doi: 10.3389/fnhum.2015.00223

Keywords: neuroscience, music therapy, debate, interdisciplinary, patients, clinical protocols

Two professionals from different sides of the neuroscience and music therapy debate present an informal dialog exploring realities and beliefs that have benefited or hindered collaboration. As a music therapist who has turned to neuroscience for evidence in neurological rehabilitation clinical practice, and a neuroscientist who has been motivated by the implications of her research for clinical populations, we present this dialog in an interview format. This format was chosen to encourage genuine questioning and exploration of issues that are implicit to potential collaborations, and which remain unexplored in empirical research.

**WM:** Lauren, in your view, how can music therapy contribute to the wider perspective of clinical practice and research?

**LS:** I think there is no question that the properties of music, in terms of intrinsic features, as well as the potential for engagement, emotional response and interpersonal communication, can be very powerful across a range of clinical situations. When used appropriately, music is ethically acceptable, side-effect free, can be intricately tailored to personal preferences and tastes, and in some cases may provide a cost-effective alternative to pharmacological sedation (Loewy et al., 2006). Exploiting the potential benefits of music is not only essential for advancing clinical practice, but also in elucidating and characterizing how music acts on the brain. There is much to be gained from a joint enterprise where practice and research can reciprocally inform one another.

But achieving such collaborations takes time: How do you think our respective disciplines are doing in this regard, Wendy? Are you sensing a significant productive collaboration in recent years?

**WM:** I think there are many interesting collaborations emerging that illustrate how a genuine partnership between the two professions can draw on the strengths of each to benefit research and improve clinical practice. One example is the new MANDARI collaboration (music and the neurodevelopmentally at risk infant) which has brought together researchers and clinicians from diverse disciplines to discuss the potential of music at the earliest possible state in life (http://www.gold.ac.uk/mandari/). The different disciplinary languages and frameworks are explicitly discussed to permit a platform for genuine interdisciplinary engagement, including scholarly critique of frameworks and assumptions that may be implicitly entrenched in our respective disciplines.

A number of studies also provide models for collaborations between the two disciplines. To take just a few examples: Thaut et al. (2005) examined music as a mnemonic device for learning and memory with Multiple Sclerosis patients and its effect on neuronal synchrony; Särkämö et al. (2008) examined the impact of music listening on cognitive recovery, mood and brain activation following stroke; and O'Kelly et al. (2013) explored brain responses to music in patients with disorders of consciousness who cannot show behavioral responses. Studies such as these demonstrate the potential of a combined music therapy/neuroscience approach to give insights into "how" music works and "why" we see clinical improvements. The knowledge that stems from such collaborations ultimately has the potential to improve interventions offered to patient populations.

However, I personally feel that the potential synergies between our two fields have yet to realize their full potential. I've been working in music and neurology for around 25 years and certainly I've wanted to engage with neuroscientists to a greater degree, particularly through my work with complex, brain-damaged populations. As a clinician, I have found reading the neuroscience literature invaluable for drawing out relevant information in order to both inform my own understanding of the brain and, where possible, apply it in an evidence-based way in practice with clients.

Personally, I have been able to build relationships with individual neuroscientists where we have a common interest in clinical populations. However, these relationships have not been able to develop in more systematic ways. We largely read different journals, go to different conferences and belong to different societies. Although music therapists are increasingly attending more neuroscience-based conferences and publishing in neuroscience journals, there is very little infrastructure to allow these two disciplines to interact in ways that can reciprocally inform each other. Perhaps you have thoughts on how we might advance collaborations and dialog? What do you feel has been a barrier to collaborations to date?

**LS:** As you say, there are enormous challenges to interdisciplinary working, which is easy to express support for but more difficult to realize! My recent involvement with the MANDARI collaboration showed me that not only do we speak very different languages but we also have very different motivations for our involvement, and what counts as an interesting question or goal for one person, can seem less important to others. It's hard to articulate our deep-seated motivations, but an honest exchange of where each party is coming from is vital to ensure people are not pulling in different directions without even realizing it.

Added to this is the fact that many areas of clinical practice might remain hidden to the research community, since many clinicians do not have the time or resources to conduct or publish research. They might communicate their work within their local practice-based networks only. This can provide a skewed picture of what is actually going on clinically, which often does not reflect the breadth of practice and associated theories and frameworks that are being used.

Special initiatives, such as this Frontiers issue, can provide a platform for knowledge exchange, as can seeking out opportunities to understand more about the very different worlds each of us inhabit. But ultimately, the most productive collaborations will be motivated by individuals who have a vision of how research and practice can complement one another, and who work from a grass roots level to make it happen.

Perhaps we could consider the different kinds of motivations that typically drive clinicians vs. researchers—what are your thoughts on that?

**WM:** A primary motivation of a music therapist is to improve clinical methods in order to benefit the patient. Therapists are very much at the coalface, working with people who do not have straightforward types of pathologies; this is typical in catastrophic brain injury. They do not have neat lesions in one area of the brain, they have complex problems, and they're all different.

For music therapists, the drive to do research is prompted by what happens in the therapy room during the clinical intervention. Therapists are interested in questions about "what is it that works?" and "which process works best for that patient?" Often they work so closely with the patients and their families, they have difficulty in standing back and looking at the bigger picture, which is necessary for a researcher. Lauren, do you feel this is a barrier for neuroscientists engaging with the music therapy profession in research collaborations? Perhaps it is easier for neuroscientists to do this, since they are less engaged in directly working with patients?

**LS:** As you say, one of the important issues for music therapists is obviously the individualized, tailored approach, while researchers often prefer group designs in which an intervention can be implemented in the same way across a group of patients. This may involve abstracting something personal and bespoke into a "one size fits all" approach that may, in the end, turn out to be less relevant and less effective for the patient group. So there's a tension between an intervention that may be idiosyncratic and highly personalized from one patient to the next and the need for a design that incorporates standardization and replicability. It's possible to have a design that incorporates a tailored approach and can be analyzed in a statistically robust way, but such an approach is not orthodox for most neuroscientists.

**WM:** Indeed. I should add, the kinds of well-controlled protocols that neuroscientists are accustomed to pose challenges in real-world settings on two fronts. First, if a protocol does not meet a patient-centered need that the patient or the therapist feels is most important (e.g., an emotional need over a functional need such as hand grasp), then the clinician and the patient lose motivation to continue. There are also ethical questions about using protocols that are not best suited to patient needs. Second, music is a medium that provides opportunities for spontaneity and play, which are both important features in therapy, learning and rehabilitation. These features can be challenging to incorporate into a controlled protocol.

Music Therapists in recent years have become more involved in research to generate evidence, particularly with randomized controlled trials (RCTs), which are considered one of the highest forms of "evidence" in health care. RCTs are challenging on a number of fronts, one of which concerns the difficulty of formalizing the intervention in terms of a standardized protocol. We know that this is one of the criticisms that neuroscientists have of Music Therapy. Ultimately, therapists have been trained to view each client as an individual, and tailor intervention to that individual. Adopting standardized protocols can be seen as failing to take account of individual differences and to treat each person as a unique being.

This is one reason why RCTs are difficult to do in practice and are rarely the best method for getting at complexity, for instance, researching rehabilitation after catastrophic brain injury where single-subject designs are more suitable. But, on the other hand, if we completely reject the notion of RCTs altogether, we risk missing the opportunity to engage in testing out the efficacy of music therapy interventions, using research designs that are widely recognized as the "gold standard" in health care. An alternative is to do an RCT where protocols are defined in a way that enables flexibility. For example, one protocol, which has been written for working with children with Autism spectrum disorders, defines a complex intervention of improvisational Music Therapy (Geretsegger et al., 2012). This is a challenging intervention to protocolize as it draws on musical spontaneity and play to improve specific nonverbal communicative behaviors typical with this population. The protocol manages to describe the intervention procedures with enough precision to enable a trained therapist to deliver the intervention but also allows for spontaneity in response to the client's musical and communicative behaviors.

**LS:** Another example of an RCT that has a flexible implementation can be seen in a study where parents were trained to deliver live Music Therapy in the neonatal intensive care unit (Loewy et al., 2013). Although the parents had been trained along broadly similar lines, the detail of delivery was rather different. So you don't always need to disregard the lived experience when you are doing research, you just need to be a bit clever about it.

In relation to this, I'm aware that for most scientists, the Cochrane Reviews (http://www.thecochranelibrary.com/) would be the first port of call in trying to establish whether Music Therapy was deemed effective for a particular clinical group. With their reliance on RCT designs, is there a danger that some high quality Music Therapy studies are being overlooked?

**WM:** The Cochrane Reviews are considered the "gold standard" and they evaluate all the quantitative research that has taken place on an intervention with a specific population, e.g., Music therapy for Acquired Brain Injury (Bradt et al., 2010). However, the inclusion criteria used to evaluate research studies are very narrow. This means that many studies that present a compelling argument for the effectiveness of Music Therapy in a certain clinical context are excluded from the "evidence base." The Cochrane evaluative criteria include principles of randomization, allocation concealment and double blinding in order to minimize or eliminate bias. These designs are modeled on principles of testing pharmaceuticals, which is not the best application for many therapeutic interventions. As an author of a Cochrane review, I think that it is really important for us to engage with the evidence debate.

**LS:** In our discussion so far, we have yet to touch on the distinction between Music Therapy and Music Medicine. Could you outline how those two approaches differ?

**WM:** Music Medicine involves interventions using music that have a clinical outcome in mind, but where the outcome is not reliant on the relationship between the client and the person giving the intervention. That is, the intervention does not rely on some type of human musical dialog and relationship development (or process) that is typical in a therapeutic interaction. These interventions are typically implemented by nurses, doctors and even dentists. The interventionist could simply leave the music with the client. A good example of this is the management of pre-operative pain and anxiety, where a patient is given recorded music to listen to. I believe there is a role for non-complex music interventions such as these, which pose minimal risk to the patient and can be delivered by a wide range of health professionals. Such interventions do not require training in how to deliver the intervention, or in how to enhance the interpersonal interaction or analyze the patient's responses. This contrasts with clinical scenarios that do require complex interventions. Some examples of these might be psychological difficulties where the person has trouble in developing or maintaining interpersonal relationships due to Autism spectrum disorders or an attachment disorder, or is dealing with the psychological trauma caused by bereavement, loss or abuse. These clinical needs demand a human element: another person to work with the client in order to provide them with the experience of relearning to "relate." These clinical needs demand very different musical and therapeutic interventions from simply playing a patient recorded music.

**LS:** So in some cases, is music used as a framework to facilitate a more standard type of talking therapy?

**WM:** Relationship development, through the use of music, is certainly comparable to speaking therapies. Music can be a useful medium to work on interpersonal issues for a number of reasons. Within a musical interaction, you can sing "with" a person, not simply sing "at" or "to" one another; you improvise, listen, attune and respond using imitation or reflection. With some populations it is more effective than communicating with words, particularly for those who may find it difficult to speak or perhaps those who have not yet acquired language or have lost language due to brain damage.

**LS:** I sometimes think that the skills and knowledge that music therapists have are not well understood, from the perspective of the basic science researcher. For instance, at a recent talk I attended, the presenter, who was a non-clinician scientist, was asked whether the described intervention given to a particular clinical group was administered by a music therapist or not. The response was "No, but the person delivering it was a competent musician."

**WM:** Yes, this is important to articulate. In some clinical settings, the assumption may be that a music therapist is there to simply entertain the patient in order to lift their mood. In fact, music therapists are professionals who have been trained to a high standard musically, but more importantly, they have been trained to work with clinical populations and to use music in ways to address a wide range of social, emotional, behavioral and physical needs. Most importantly, they are trained in attuning to other people, musically and emotionally, whilst maintaining strong boundaries between themselves and the client.

Simply learning a protocol through reading a theoretical research paper and attempting to apply it within a clinical setting presents many risks to the patient and the person doing the music protocol. When working with clinical populations, unexpected difficulties can arise (e.g., extreme agitation, distress, physical self-harm) in which an untrained person may not be able to manage the situation and interact with the patient safely. A music therapist has skill and expertise to a recognized standard in assessing a situation and adapting a protocol to a clinical situation.

**LS:** Perhaps one of the difficulties in understanding what music therapists do comes from the existence of several different approaches and philosophies within the profession. The kind of Music Therapy that is probably most familiar to neuroscientists is Neurologic Music Therapy (Thaut, 2005), but in music therapy circles, many other "flavors" are dominant and some of them seem to downplay functional goal-setting, which to neuroscientists, can be difficult to appreciate—could you comment on that?

**WM:** I think this point you bring up is a really important issue. As with other professions (e.g., Psychology) there are different theoretical models in music therapy that range from behavioral, to psychodynamic, to music-centered, to humanistic and so on. Each approach has its own merits and some will be more suited to certain contexts than others. However, the important thing is that the model of music therapy used is appropriate to the patient's needs, and the therapist can articulate the outcomes and rationale behind the method they are using in ways that the patient, families and colleagues can understand.

**LS:** We've covered a lot of ground here, but I wonder if I can finish up by asking you where you see Music Therapy making the biggest inroads going forward?

**WM:** I feel very excited about interdisciplinary collaborations such as that modeled by MANDARI, because these have big implications for both of our professions, and most importantly, for patient care. Interdisciplinary research with other clinical professions (e.g., nursing; medicine) is also growing and will improve research through accessing more participants who are suited to studies. Research that continues to explore music's impact on the brain with clinical populations is also a priority so that we can develop interventions that will have greatest impact, particularly when we consider Dementia and Stroke as the two largest and fastest growing populations in societies around the globe. We need to understand why and how music works and refine interventions. Tapping into populations for which we have no evidence base is also a priority, such as people with post-traumatic stress disorder, particularly those who have returned from military conflict and the devastated populations left after conflict or torture. Music Therapy's impact in this domain would be relevant both for neurological rehabilitation and for the psychological trauma that cannot be explored easily using verbal interactions. The findings potentially would be relevant for a number of populations where psychological trauma is a major factor.

## Acknowledgments

The authors would like to thank Candida Godbold for transcribing the initial conversation and for providing comments on earlier drafts. LS was supported by the Leverhulme Trust (RPG-297) and Economic and Social Research Council (RES-061-25-0155).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Magee and Stewart. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Boundaries and potentials of traditional and alternative neuroscience research methods in music therapy research

Andrea M. Hunt\*

Department of Music, Immaculata University, Immaculata, PA, USA

Keywords: music therapy, neuroscience methods, research methods, epistemology, opinion

How should music therapists engage with the enormous potential of neuroscience research? The methodological rigors usually employed in such research complicate this highly attractive arena, requiring operationalizing music and removing it from the context in which it is usually experienced (Fachner and Stegemann, 2013). Deconstructing music in this way merely addresses the neural processing of music perception and action, ignoring the holistic experience of music, which unfolds over time and is embedded in personal and situational context (Fachner, 2002). Furthermore, because music therapy by definition is an interpersonal experience involving client and therapist, and the therapy process depends "upon not merely the music, but also the client's experience of it" (p. 115, Bruscia, 2014), research methods which isolate the research subject from this interaction neglect an important component in the clinical dynamic of music therapy.

From a broader perspective, emerging research into the effects of early relationships on brain development and behavior (Schore, 2012) shows that individuals' brains have unique patterns of interacting with the world as well as perceiving and responding to the world. While cognitive neuroscience can identify some global responses to music as stimuli, the high degree of variability across individuals continues to be a serious confounding factor. In response, new research methods are exploring ways to account for individual experience in conjunction with neuroimaging (Varela, 1996) as well as how interpersonal musical interaction correlates with brain activity (Lindenberger et al., 2009).

Therefore, in this piece I will discuss researching and interpreting the behavior of the human brain in relation to music therapy contexts. I will delineate the boundaries of research methods employed in the neurosciences and discuss ways in which new, alternative methods have the potential to meaningfully elucidate clinically relevant information for music therapists.

#### Edited by:

Jörg Christfried Fachner, Anglia Ruskin University, UK

#### Reviewed by:

Sarah Faber, Anglia Ruskin University, UK Erik Christensen, Aalborg University, Denmark

#### \*Correspondence:

Andrea M. Hunt, ahunt1@immaculata.edu

Received: 01 April 2015 Accepted: 28 May 2015 Published: 09 June 2015

#### Citation:

Hunt AM (2015) Boundaries and potentials of traditional and alternative neuroscience research methods in music therapy research. Front. Hum. Neurosci. 9:342. doi: 10.3389/fnhum.2015.00342

## Cognitive Neuroscience for Music Therapy

Cognitive Neuroscience (CNS) as a discipline has generated fascinating insights into the structure and functions of the human brain, with near-daily revelations regarding the complex nature of the nervous system. Music psychologists and CNS researchers have discovered brain structures and networks related to music processing of many kinds, including music perception, emotion and music, and sensory processing and music. Other research has focused on the effects of music training on processes such as cognition, emotion, self-regulation, learning, and neuroendocrine functions. Other discoveries include the mirror neuron system, showing how the brain processes perception and translates it into action, as well as the principle of neuroplasticity, where neural pathways can shift and become stronger with repeated use and training. All of these discoveries have had major implications in music therapy practice, especially for clinicians who work primarily with brain injury and disease. The recent media coverage (e.g., Moise, 2011) of U.S. Congresswoman Gabrielle Giffords' recovery from an attacker's gunshot wound to her left temporal lobe has highlighted ways music therapists have used this knowledge in neurological rehabilitation.

#### Requirements of CNS Research

CNS research in music cognition or music therapy involves the use of complex imaging systems that require stringent controls in order to obtain reliable, valid data. This involves operationalizing the stimuli: in the context of music listening studies, the music is often considered as the stimulus, while in the context of active music making, playing music is a complex neurological task. The music, and its delivery or activity, must be clearly defined, standardized, and controlled. CNS research also demands controlled designs in order to attain the best results given the limitations and strengths of the imaging methods being used. Often subjects undergo multiple (sometimes hundreds of) trials in order to obtain a large dataset, with the results averaged in order to find universal responses to that stimulus or condition. These strict protocols are necessary to answer specific research questions, with little room for individualizing approaches to fit a clinical music therapy situation.
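The averaging step described above can be illustrated with a small simulation. The following Python sketch is not drawn from any of the cited studies: the sinusoidal "evoked response", the noise level, the trial count, and the function names (`simulate_trial`, `average_trials`, `rms_error`) are all assumptions chosen for illustration. It shows only the general principle that averaging many noisy trials recovers a response that is common to all of them.

```python
import math
import random
import statistics


def simulate_trial(n_samples, noise_sd, rng):
    """One simulated trial: a fixed (assumed) evoked response plus Gaussian noise."""
    return [
        math.sin(2 * math.pi * t / n_samples) + rng.gauss(0.0, noise_sd)
        for t in range(n_samples)
    ]


def average_trials(trials):
    """Point-by-point mean across trials; every trial must have the same length."""
    return [statistics.fmean(samples) for samples in zip(*trials)]


def rms_error(estimate, n_samples):
    """Root-mean-square distance between an estimate and the known simulated response."""
    truth = [math.sin(2 * math.pi * t / n_samples) for t in range(n_samples)]
    return math.sqrt(statistics.fmean((e - s) ** 2 for e, s in zip(estimate, truth)))


rng = random.Random(0)  # fixed seed so the illustration is repeatable
n_samples, noise_sd = 200, 2.0  # hypothetical values, not taken from any study
trials = [simulate_trial(n_samples, noise_sd, rng) for _ in range(400)]

error_single = rms_error(trials[0], n_samples)
error_averaged = rms_error(average_trials(trials), n_samples)
print(f"single trial RMS error: {error_single:.2f}")
print(f"400-trial average RMS error: {error_averaged:.2f}")
```

Under these assumptions, the error of the averaged signal shrinks roughly with the square root of the number of trials. The same operation treats any trial-to-trial or person-to-person variation as noise to be removed, which is one reason such designs leave little room for individualized responses.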

Furthermore, subjects' movements may be restricted, or they cannot use particular materials or musical instruments while undergoing an imaging study, due to the nature and constraints of the imaging equipment. For example, the magnetic field generated by an fMRI machine would preclude investigating the subject's playing of any instruments that contain metal. Nevertheless, researchers interested in active music making have found ways to work around these limitations, by using materials safe for the scanner (e.g., a non-ferromagnetic piano keyboard for use in fMRI) or having subjects play instruments that are not only compatible with the imaging method, but also do not require much head movement that would result in artifacts (e.g., saxophonists playing during EEG acquisition in Babiloni et al., 2011). Each imaging method also has its strengths and limitations in its temporal and spatial resolution; thus researchers should choose the most appropriate imaging method for the research question, given the kind of data the imaging can obtain.

## Integrating Music Therapy and CNS

Research in recent years has demonstrated ways that music therapy and neuroimaging can work well together, particularly for rehabilitation from neurologic injury. For example, Altenmüller et al. (2009) used EEG to show how music therapy can improve cortical connections and activity in stroke patients. Schlaug et al. (2009) used diffusion tensor imaging to reveal neurological changes after melodic intonation therapy for persons with left-hemisphere stroke damage. These highlights are in addition to more comprehensive reviews of music therapy in rehabilitation (Hurt-Thaut, 2009; Leins et al., 2009). More recently, researchers have used neuroimaging to discover lasting changes in brain functioning after 18 sessions of music therapy for depression (Fachner et al., 2013) and to identify brain responses to different types of music intervention for pain (Hauck et al., 2013).

## Limits of CNS for Music Therapy

While these studies are encouraging, readers must note that CNS methods have limitations, many of which have been summarized in Christensen (2012). Primarily, operationalized music "stimuli" and the resultant designs often lack ecological validity. Many studies utilize synthesized music or tones, or short segments of music. It is rare for studies to use complete pieces of music. Furthermore, imaging equipment restricts or does not work well with body movement, limiting naturalistic ways that subjects may move while listening to or playing music. Also, because equipment is expensive and specialized, it is often located in a medical setting or laboratory, a context far removed from where clients would usually encounter music therapy. In addition, many CNS studies do not adequately report the sources of the music selections used in the research, making it difficult to interpret findings.

Aside from these methodological restrictions, the epistemological assumptions of CNS research are also restrictive. Researchers assume there are universal responses to the music conditions, and dismiss outlier responses as statistical noise. There is no room for investigating unique brain responses to the experimental conditions. Music is assumed to be an object which can be operationalized as a "stimulus"; this ignores the socially-constructed meanings of music which are created within and across cultures and groups. When researchers ignore these meanings in their designs and simply create segments of tonal or rhythm patterns as their stimuli, they are really examining the brain's responses to tonal and rhythm patterns, not music.

The socially-constructed nature of music is directly related to the field of music therapy, because music therapy involves a relationship between client and therapist (Bruscia, 2014). This relationship involves the intersubjective nature of music, along with nonverbal communication and social and environmental context. These are all significant factors in the music therapy experience which must be included in order to assure ecological and sociological validity. Translating CNS research to the practice of music therapy therefore requires that the client-therapist relationship be taken into account in the research question, design, and interpretation of results.

### The Future for Music Therapy Research

For CNS to incorporate the interpersonal, subjective, and contextual factors inherent in music therapy, researchers must first be very clear about their epistemological stances in their research, while also considering other perspectives. Each perspective has strengths and limitations, and requires appropriate expertise in a research context.

The philosopher Wilber (2000) has created a model to help conceptualize phenomena in all their forms and permutations (**Figure 1**). This model, called the Four Quadrants, has been applied to music therapy as well (Bruscia, 1998) to help delineate different clinical phenomena and approaches. The quadrants are organized in a matrix of "Interior" vs. "Exterior" phenomena, combined with "Individual" vs. "Collective" phenomena. For this paper, I will focus on the top two quadrants, "Exterior-Individual" and "Interior-Individual."

Traditional scientific research, including experimental designs and any approach which involves the collection and analysis of observable, measurable data from individuals, is located in the Upper Right quadrant (the Exterior-Individual/"It" region). This is also where traditional CNS research is located: music and related behaviors are viewed as objects, and brain responses to them are objectively measured and analyzed. Music therapy approaches located in this quadrant include behavioral interventions and any intervention that focuses on observable, measurable outcomes.

The Upper Left quadrant (the Interior-Individual/"I" region) contains phenomena including the individual's subjective feelings, experiences, memories, and values. Research located in this quadrant includes phenomenology and heuristic research, while the music therapy methods here include psychoanalytic or humanistic approaches that emphasize the individual's internal experience, which typically cannot be measured or observed objectively.

The future of music therapy research needs to address several kinds of phenomena. First, it needs to account for the variability among subjects' neurological responses. For example, research should consider cultural and personal context in neurological development, which can lead to unique patterns of perception and response (Schore, 2012). Second, research needs to account for human interaction in the music therapy experience—that is, understanding "music" as a verb rather than a noun/stimulus (Small, 1998). In particular, research should attempt to address client-therapist interaction during clinical experiences. In other words, the future of music therapy research in the neurosciences should involve perspectives from quadrants other than the Upper Right.

Some research methods from these quadrants have already been developed in the neurosciences. One such method is neurophenomenology (Varela, 1996). This approach originated as a biological investigation of subjectivity and consciousness, but evolved to include other phenomena, including an integrated investigation into the biological and subjective experience of a guided imagery and music session (Hunt, 2011). The approach integrates objective data and subjective experience in individuals, holding that (1) the first-person experience is irreducible, (2) the first-person investigation must be rigorous, and (3) the first- and third-person perspectives are equally important. In the Hunt (2011) study, the researcher collected phenomenological data from each participant's imagery report of the music therapy session, and correlated them with EEG coherence data to generate integrated descriptions of biological and subjective experience during the sessions.

Hyperscanning is another research method with great potential for music therapy. Here, imaging data are collected simultaneously from two or more subjects and analyses focus on ways that the brain data synchronize with each other around shared experiences or events. For example, Sänger et al. (2012) examined the EEGs of two musicians playing guitar duets and found that phase locking indices were high when the musicians were setting the initial tempo, as well as immediately prior to, and after onset of, playing. More recent studies have utilized hyperscanning with functional near-infrared spectroscopy (fNIRS; see review in Scholkmann et al., 2013) with great success. Integrating multiple participants' data could even be used within a neurophenomenological approach in order to understand what participants undergo while making music together in music therapy.
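As a rough illustration of what a phase locking measure captures, the sketch below computes one common formulation, the phase locking value (PLV), between two series of instantaneous phases. It is a minimal example under stated assumptions, not the analysis used by Sänger et al. (2012): the function name `phase_locking_value` and the synthetic "player" data are hypothetical, and in practice the phases would first have to be extracted from band-filtered EEG (for example, with a Hilbert or wavelet transform).

```python
import cmath
import math
import random


def phase_locking_value(phases_a, phases_b):
    """Return the phase locking value (0 to 1) for two equal-length phase series in radians."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase series must be non-empty and of equal length")
    # Average the unit vectors of the phase differences. The length of the mean
    # vector is 1 when the difference is constant and close to 0 when it is random.
    mean_vector = sum(
        cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b)
    ) / len(phases_a)
    return abs(mean_vector)


# Hypothetical synthetic data (not real EEG): one phase series and two partners.
n = 1000
player_1 = [2 * math.pi * 0.01 * t for t in range(n)]
locked_player_2 = [p + 0.5 for p in player_1]  # constant lag of 0.5 rad
rng = random.Random(0)
unlocked_player_2 = [rng.uniform(0, 2 * math.pi) for _ in range(n)]  # unrelated phases

plv_locked = phase_locking_value(player_1, locked_player_2)
plv_unlocked = phase_locking_value(player_1, unlocked_player_2)
```

A value near 1 for the locked pair and near 0 for the unrelated pair is the contrast that hyperscanning analyses look for around shared musical events, although real data would also require filtering, artifact handling, and statistical testing.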

In addition to these new methodological and paradigmatic approaches, portable EEG (Wascher et al., 2013; DeVos and Debener, 2014) and wireless fNIRS (Scholkmann et al., 2013) can permit in situ neuroimaging, thereby increasing ecological validity. While these imaging options have limitations in terms of spatial resolution and standardized norms for comparison, they are relatively inexpensive compared to fMRI, PET, and SPECT technology. Research designs and questions which focus on these portable devices' imaging strengths could lead to greatly increased understanding of neurological activity during music therapy experiences.

With the advent of both innovative neuroimaging technology and new research perspectives and designs, music therapists are uniquely poised to undertake ground-breaking research into ways that music therapy affects and benefits clients. However, prevailing thinking around CNS research could divert attention away from these possibilities, and instead focus research on Upper Right phenomena alone. While this research undoubtedly has been beneficial, it has limited translatability to music therapy experience and practice. Let us not limit our understanding to one perspective; instead let us step into new perspectives, as we do with our clients, willingly looking at the world from a new place, with new eyes, and new comprehension.

## References


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Hunt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Constructing optimal experience for the hospitalized newborn through neuro-based music therapy

Helen Shoemark<sup>1,2\*</sup>, Deanna Hanson-Abromeit<sup>3</sup> and Lauren Stewart<sup>4,5</sup>

<sup>1</sup> Music Therapy, Temple University, Philadelphia, PA, USA, <sup>2</sup> Sensory Experience in Early Development, Murdoch Childrens Research Institute, Melbourne, VIC, Australia, <sup>3</sup> School of Music, University of Kansas, Lawrence, KS, USA, <sup>4</sup> Department of Psychology, Goldsmiths, University of London, New Cross London, UK, <sup>5</sup> Center for Music in the Brain, Department of Clinical Medicine, Aarhus University and The Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark

Music-based intervention for hospitalized newborn infants has traditionally been based in a biomedical model, with physiological stability as the prime objective. More recent applications are grounded in other theories, including attachment, trauma and neurological models in which infant, parent and the dyadic interaction may be viewed as a dynamic system bound by the common context of the neonatal intensive care unit (NICU). The immature state of the preterm infant's auditory processing system requires a careful and individualized approach for the introduction of purposeful auditory experience intended to support development. The infant's experience of an unpredictable auditory environment is further compromised by a potential lack of meaningful auditory stimulation. Parents often feel disconnected from their own capacities to nurture their infant, with potentially life-long implications for the infant's neurobehavioral and psychological well-being. This perspectives paper will outline some neurological considerations for auditory processing in the premature infant to frame a premise for music-based interventions. A hypothetical clinical case will illustrate the application of music by a music therapist with an infant and family in the NICU.

#### Edited by:

Julian O'Kelly, Royal Hospital for Neuro-Disability, UK

#### Reviewed by:

Andreas W. Flemmer, Ludwig Maximilian University of Munich, Germany

Joanne Loewy, Mount Sinai Beth Israel, USA

#### \*Correspondence:

Helen Shoemark, Sensory Experience in Early Development, Murdoch Childrens Research Institute, 50 Flemington Road, Parkville, Melbourne, 3052 VIC, Australia helen.shoemark@mcri.edu.au

Received: 17 April 2015 Accepted: 21 August 2015 Published: 03 September 2015

#### Citation:

Shoemark H, Hanson-Abromeit D and Stewart L (2015) Constructing optimal experience for the hospitalized newborn through neuro-based music therapy. Front. Hum. Neurosci. 9:487. doi: 10.3389/fnhum.2015.00487

Keywords: neonatal intensive care unit, music therapy, preterm infant, auditory environment, stimulation, family centered practice

## Introduction

The premature infant brain exhibits a distinctive type of white matter injury, affecting the cerebral white matter, thalamus, basal ganglia, cerebral cortex, brainstem and cerebellum (Volpe, 2009). Regional brain volumes acquired from children, aged 8 years, who were born preterm, show differences across a number of cortical and subcortical brain regions (sensorimotor, premotor, mid-temporal, occipito-parietal, basal ganglia, cerebellum, hippocampus, corpus callosum), several of which correlate with poorer cognitive outcomes (full-scale, verbal and performance IQ scores; Peterson et al., 2000). While such an abnormal neurological profile will have profound consequences for subsequent neuronal and cognitive development, this may be mitigated, at least in part, by providing the most conducive environment for promoting neurological development, given the established role of experience and environment in shaping brain structure and function (van Praag et al., 2000). The present article argues that music-based interventions, grounded in multi-faceted theories, can play a role in optimizing the experience and environment of the preterm infant, particularly when implemented by a music therapist working alongside parents.

The environment in which a premature infant is cared for is highly atypical when compared to that of a term infant born without medical complications. Certain aspects of the "normal" environment are missing: opportunities for predictable sensory stimulation as well as for social interaction (including directed linguistic input). Other aspects are uniquely present in the neonatal intensive care unit (NICU): high levels of unpredictable noise, impinging upon an auditory system that is normally "protected" by the attenuating maternal tissues, as well as frequent invasive procedures. Infants are frequently startled, sleep is disrupted, and the physiological profile is unstable (Graven, 2000). While the NICU environment is doubtless key to survival, it nonetheless constitutes a considerable source of stress for the preterm infant (Newnham et al., 2009). While we might assume that even the preterm brain will exhibit some resilience to this, owing to the presence of the fast acting sympathetic-adrenal system, as well as the slower acting hypothalamic-pituitary-adrenal (HPA) axis, current research suggests that the impact of prolonged or excessive stress has the potential to result in a dysregulated response to future stressors, as well as maladaptive changes in the hippocampus, prefrontal cortex and amygdala, areas which are important for learning, decision making and emotion regulation (Radley and Morrison, 2005).

The removal of noise alone is now considered an inadequate response to the traditional problem of inappropriate auditory stimulation in the NICU (Jobe, 2014). While attention to reducing noise is necessary, equal attention must be given to the creation of meaningful auditory stimulation (Rand and Lahav, 2013). Compared to other environmental sounds, musical sound waves have organized structural characteristics of pitch, dynamics, timbre and harmony. Thus, once noise has been reduced, carefully selected music can be used as an environmental stimulus within the recommended decibel levels for stimulation (45–50 dBA). For the infant to detect the music it will need to be sufficiently loud or have frequencies or timbres which distinguish it from the ambient environment (Dearn and Shoemark, 2014). Guiding principles to consider in terms of musical characteristics include predictable patterns in rhythm, melody, and phrasing (Trehub and Trainor, 1990; Gray and Philbin, 2004), sparse gradual changes in tempo in a lullaby style (Trainor, 1996; Rock et al., 1999), consonance (Trainor et al., 2002), smooth melodic contours (Unyk et al., 1992) and an absence of harmonies (Janata et al., 2002; Gray and Philbin, 2004; Siddiqui et al., 2008). While full-term and older infants attend to higher pitches (Trehub and Trainor, 1990; Trehub, 2001), the premature infant's auditory development may be better suited to lower pitches (Graven and Browne, 2008). However, it is essential for the characteristics of the music to be explicitly tailored and adjusted to the infant's needs (Hanson-Abromeit, 2015) as well as to acknowledge the preferences of the family to ensure a culturally acceptable and useful stimulus formulated for their family functioning. Thus the individualized application of music requires a relationship between music therapist, infant and family, which differentiates it from other music-based approaches where music may be a more standardized stimulus.

In addition to constituting a source of chronic stress for the premature infant, the unpredictable and noisy environment of the NICU has a detrimental impact on the infant's sleep patterns (Kuhn et al., 2012) and interrupts the transitioning between different physiological states which is important for the newborn infant. A study by Weisman et al. (2011) showed that infants whose sleep-state transitions were mainly characterized by shifts between quiet sleep and wakefulness exhibited the best development, including greater neonatal neuromaturation, less negative emotionality, better cognitive development, and better verbal, symbolic, and executive competences at 5 years. Music can be useful in helping the infant transition effectively between physiological states—for instance from quiet wakefulness to sleep. Early research in this field from Olischar et al. (2011) reported a trend towards more mature sleep-wake cycling in healthy infants at 32 weeks or more gestational age when exposed to a recording of a lullaby. The ability to transition effectively between sleep states, wakefulness, and stress states is also a significant benefit in relation to the many medical and diagnostic procedures that occur in the NICU. Many of these involve pharmacological sedation which can result in adverse reactions, including nausea and vomiting, respiratory depression and cardiac arrhythmias (Harvey, 1985; Greenberg et al., 1993; Paret et al., 1996). There is some evidence to suggest that the use of music, either exclusively, or in combination with a lower dose of pharmacological sedative, can provide an effective and safe alternative (Loewy et al., 2005; Schwilling et al., 2015).

While some aspects of stress in the NICU relate to the presence of noxious auditory stimuli, other aspects relate to the absence of a consistent and appropriate response from a primary caregiver, particularly given the infant's state of physiological dysregulation. This physical separation is also a source of stress for the parents, for whom the typical biological processes of feeding, nurturing and bonding with their infant are curtailed owing to the complex medical needs of the child. Problems of attachment are common, with parents enduring significant experiences of trauma (Coppola et al., 2007), putting at risk their own sensitivity to their baby, and often developing an overall hypervigilance about their baby's medical status. A secure attachment to a caregiver is important for the cognitive development of self vs. other representations, as well as for providing a template for the formation of interpersonal relationships throughout the lifespan (Borghini et al., 2006; Vrticka and Vuilleumier, 2012). One strong advantage of using music in the NICU concerns its potential for a family-based approach to care, which can promote mutual regulation of the parent-infant dyad, thus building attachment and empowering parents to be involved in the care of their child (Shoemark and Dearn, 2008), as well as providing them with a tool they can use beyond the NICU, bearing in mind that the neurodevelopmental challenges that result from prematurity can last well beyond this early period.

The use of maternal voice can hold special promise for promoting normal attachment (Milligan et al., 2003): fetuses and newborn infants can recognize and orient to their mother's voice, in comparison with a female stranger (DeCasper and Fifer, 1980); mother's voice has been found to increase preterm infant oxygen saturation levels, reducing the occurrence of critical events and inducing quiet alert states (Filippa et al., 2013) and benefitting parental stress levels (Loewy et al., 2013). When singing to infants, caregivers adopt a particular "infant-directed" style, characterized by higher pitch, slower tempo and distinctive timbre (Trehub et al., 1997). This style of singing is seen across cultures and attracts and maintains the infant's attention (Trehub and Schellenberg, 1995; Van Puyvelde et al., 2015), and ameliorates distress in older infants more effectively than speech or touching (Trehub et al., 2015), all allowing opportunities for communication and interaction, which are the basis for building secure attachment. In the NICU maternal voice is a means of social communication and emotional and physiological regulation, when physical closeness between parents and infant is often precluded.

Live aspects of maternal singing allow the care-giver to intuitively adapt aspects of their vocalizations to match or alter their infant's state (infant-directed singing), encouraging a reciprocal relationship between the dyad (Hanson-Abromeit, 2003). Singing by an attuned care-giver has been shown to produce a significant benefit for neurobehavioral development in a small group of medically complex newborn infants (Malloch et al., 2012). Integrating parents' cultural systems within the context of infant-directed singing may encourage parent attunement to their infant's cues and support attachment through responsiveness (Loewy, 2015), particularly if it is modified with consideration for the infant's neurological functioning (Stewart, 2009). The music therapist can use principles of maternal self-efficacy (Črnčec et al., 2008) to help the parent construct their singing until it begins to feel more natural, encouraging them to sing directly to their infant (Mondanaro, 2010; Shoemark, 2013).

The case study below is a hypothetical case constructed to specifically illustrate the principles outlined above in a real-world application of music-based interventions across a premature infant's hospitalization (Shoemark, 2014).

## Case Study

Cheryl and Danny's son Jake was born at 25 weeks gestation. In the first days, Cheryl felt powerless, and simply sat watching his tiny form amidst the technology. She would occasionally whisper through the porthole and hold her hand just above his fragile skin to offer him human contact. The music therapy referral was made to establish the parents' sense of self-efficacy as his parents and maintain a strong sense of their unique place as his parents until Jake was available for age-appropriate neurodevelopmental opportunities.

The music therapist explained to Jake's parents that it would be some weeks before Jake could actively respond to their voices but that exposure each day would be helpful (Doheny et al., 2012). She encouraged them to voice their words rather than whisper as this would have more value to him (Spence and Freeman, 1996) and to use the replicable pattern of song and rhymes which would eventually become familiar to him and mark their presence. Each morning his father would gently sing the opening line of his college chant, with its distinctive low-pitched and descending interval, "Rock chalk, jay hawk, K-U . . .". After a couple of weeks, Jake began to open his eyes when Danny sang. The parents later reported that this recognition of their experience together had been pivotal in their evolving attachment.

Around 36 weeks, Jake was having trouble getting to sleep and staying asleep. The nurse made a referral to the music therapist for recorded music to support state transition. Before assessing directly, the music therapist confirmed this need with Jake's mother (acknowledging her pivotal role), who agreed and added that Jake often startled at noise. So for a period after each feed and cares, the music therapist measured the ongoing ambient sound (OAS) level and noise events that were 12 dBA or more. This revealed that: (a) the OAS level was mostly at 55 dBA but there were frequent noise events from the nurses' station outside his room (phone receiver put down, calling out, chair scrapes); and (b) the noise events were detected by Jake (facial flinches, jittery limb movement, repeated disruption to evolving sleep), causing sustained arousal. Two strategies were implemented. First, Jake's bed was moved away from the immediate vicinity of the nurses' station, resulting in a reduction in the ambient sound level by 5 dBA (which is significant in perceived loudness) and a 60% reduction in the number of noise events. Secondly, the music therapist prepared a 20 min recorded music playlist played at just over 50 dBA, which provided a predictable ambient auditory field over which only a few noise events were notable. Mother and nurses reported that Jake's response was positive, with regular transition to sleep occurring within 48 h.
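To make the size of a 5 dBA change concrete, the sketch below converts a level difference in decibels into a ratio of sound power and, as a rough guide only, into a ratio of perceived loudness. It assumes the standard definition of the decibel (a 10 dB change is a tenfold change in power) and the common rule of thumb that a 10 dB change is heard by adults as roughly a doubling or halving of loudness. The function names are hypothetical, and neither the rule of thumb nor the resulting figures come from the case description.

```python
def power_ratio(delta_db):
    """Ratio of sound power for a level change of delta_db decibels."""
    return 10 ** (delta_db / 10)


def approx_loudness_ratio(delta_db):
    """Rule-of-thumb loudness ratio: each 10 dB change doubles or halves loudness."""
    return 2 ** (delta_db / 10)


change = -5  # the 5 dBA reduction described in the case
power = power_ratio(change)               # about 0.32, roughly a third of the original sound power
loudness = approx_loudness_ratio(change)  # about 0.71 under the rule of thumb
```

On these assumptions, a drop of 5 dBA removes about two thirds of the acoustic power, which is why a numerically small change can matter; how a preterm infant actually perceives that change is a separate question that this arithmetic does not answer.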

Cheryl talked to Jake each day, discovering for herself that this simple act could settle his breathing rate after the nurses took physiological observations. The music therapist introduced Cheryl to contingent singing, using her voice to consciously support Jake's state regulation and learning about his unique cues to ensure the interaction was within Jake's thresholds for stimulation (Shoemark, 2012). As he became more socially available she agreed with the music therapist that a repertoire of playsongs and lullabies would be a useful extension for their interaction. In discussion with Cheryl's mother on her next visit, the music therapist explored the family's relationship with music, to explicate the potential of song as a source of joint attention and intersubjectivity (Malloch et al., 2012). She revealed that singing was always a part of Cheryl's early years, establishing an intergenerational role of music as part of nurturing in this family. As a small child, Cheryl had been transfixed by the Bryan Adams movie theme "Everything I do, I do it for you" and would giggle and move to Whitney Houston's "I will always love you". They were repeated daily. Through this recollection the mother and grand-mother shared a moment of emotional connection which strengthened the mother's relationship with music (Tronick, 1998). With a guitar accompaniment, the music therapist quietly sang the chorus "And I-I will always love you . . ." to bridge the introduction of these songs into the NICU for the two women. They reminisced about how they copied Houston's extended "I-I-I" in the chorus, thus engaging their own musical heritage (Loewy et al., 2013). The music therapist redirected the song towards Jake asleep in his humidicrib. She encouraged mother and grand-mother to sing or hum so Jake could hear their voices. Cheryl's mother was able to sing and Cheryl leaned in through the porthole and hummed as she put her hand on Jake's torso.
The song was thus coupled with the more familiar touch, providing a smooth introduction to this new experience for Jake.

## Conclusion

This perspectives paper has considered some of the experiential and environmental issues confronting the preterm hospitalized infant. Given the significant impact of experience and environment on brain plasticity, offering an enriched environment via individualized music therapy is suggested to provide a more optimal context to facilitate neurological development. As the application of music-based interventions continues to grow in the NICU it is important to consider a theoretical framework grounded in neurological mechanisms, theories of infant-directed singing, development and attachment, as well as the clinical expertise of the music therapist. Music therapists are uniquely placed to adapt the soundscape of the NICU and encourage parental vocal communication in order to promote sensory-system maturation, to reduce the impact of environmental stresses, to facilitate transitioning between different physiological states and to foster attachment. Such music-based interventions are an important and viable option to accommodate the developing neurological status of the premature infant and have the potential to benefit both short and long-term outcomes in the infant and parent.

## References


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Shoemark, Hanson-Abromeit and Stewart. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Theory-guided Therapeutic Function of Music to facilitate emotion regulation development in preschool-aged children

#### *Kimberly Sena Moore<sup>1\*</sup> and Deanna Hanson-Abromeit<sup>2</sup>*

*<sup>1</sup> Department of Music Education and Music Therapy, Frost School of Music, University of Miami, Coral Gables, FL, USA, <sup>2</sup> Division of Music Education and Music Therapy, School of Music, University of Kansas, Lawrence, KS, USA*

Emotion regulation (ER) is an umbrella term to describe interactive, goal-dependent explicit and implicit processes that are intended to help an individual manage and shift an emotional experience. The primary window for appropriate ER development occurs during the infant, toddler, and preschool years. Atypical ER development is considered a risk factor for mental health problems and has been implicated as a primary mechanism underlying childhood pathologies. Current treatments are predominantly verbal- and behavioral-based and lack the opportunity to practice in-the-moment management of emotionally charged situations. There is also an absence of caregiver–child interaction in these treatment strategies. Based on behavioral and neural support for music as a therapeutic mechanism, the incorporation of intentional music experiences, facilitated by a music therapist, may be one way to address these limitations. Musical Contour Regulation Facilitation (MCRF) is an interactive therapist-child music-based intervention for ER development practice in preschoolers. The MCRF intervention uses the deliberate contour and temporal structure of a music therapy session to mirror the changing flow of the caregiver–child interaction through the alternation of high arousal and low arousal music experiences. The purpose of this paper is to describe the Therapeutic Function of Music (TFM), a theory-based description of the structural characteristics for a music-based stimulus to musically facilitate developmentally appropriate high arousal and low arousal in-the-moment ER experiences. The TFM analysis is based on a review of the music theory, music neuroscience, and music development literature and provides a preliminary model of the structural characteristics of the music as a core component of the MCRF intervention.

Keywords: Therapeutic Function of Music, emotion regulation development, preschooler music development, music and arousal, theory

**Abbreviations:** ER, emotion regulation; MCRF, Musical Contour Regulation Facilitation; TFM, Therapeutic Function of Music.

#### *Edited by:*

*Julian O'Kelly, Royal Hospital for Neuro-disability, UK*

#### *Reviewed by:*

*Graham Frederick Welch, University College London, UK*

*Philippa Derrington, Queen Margaret University, UK*

> *\*Correspondence: Kimberly Sena Moore ksenamoore@miami.edu*

*Received: 01 April 2015 Accepted: 30 September 2015 Published: 14 October 2015*

#### *Citation:*

*Sena Moore K and Hanson-Abromeit D (2015) Theory-guided Therapeutic Function of Music to facilitate emotion regulation development in preschool-aged children. Front. Hum. Neurosci. 9:572. doi: 10.3389/fnhum.2015.00572*

Emotion regulation is an umbrella term to describe interactive, goal-dependent explicit and implicit processes intended to help an individual manage and shift an emotional experience. The unfolding of one's ability to regulate his or her emotions can be a lifelong process (Ochsner and Gross, 2007), but the primary window of development occurs during the infancy, toddlerhood, and preschool years. In fact, these years provide the critical opportunity for adaptive ER development to occur (Bargh and Williams, 2007; Calkins and Hill, 2007; Eisenberg et al., 2007; Stegge and Terwogt, 2007; Cole et al., 2008; Röll et al., 2012). Atypical ER development is considered a risk factor for mental health problems (Hunter et al., 2011; Röll et al., 2012) and has been implicated as a primary mechanism underlying childhood pathologies (Perry and Pollard, 1998; Zeman et al., 2006; Mullin and Hinshaw, 2007; Thompson and Meyer, 2007; Röll et al., 2012), as well as childhood social competence and school adjustment (Calkins and Hill, 2007; Eisenberg et al., 2007; Jahromi et al., 2012). Furthermore, due to the use-dependent nature of neurodevelopment, structural and functional neural changes associated with atypical ER development affect the functionality of an individual's brain through adulthood (Perry et al., 1995).

Given the significance of healthy and adaptive ER development for an individual's mental health, it is important to explore strategies for facilitating its development should an individual be at-risk for maladaptive ER skills. A therapeutic music-based approach may be one way to promote healthy ER development in young children for three reasons:


Recent music therapy literature calls for clear and detailed descriptions of music-based interventions to improve the description, procedures, and rationale for music selections (Robb et al., 2011). The TFM intentionally identifies the theoretical framework to support the purpose, description and appropriateness of the musical elements in relation to intended treatment goal and targeted population (Hanson-Abromeit, 2015). Thus, the purpose of this paper is to outline the TFM Plan as a framework for constructing a music-based stimulus to target ER development with preschool-aged children. This TFM Plan provides a set of guidelines grounded in typical developmental competence to support the music therapist's structure of the music-based stimulus to foster ER development in preschoolers. These guidelines create a basis for clearer intervention reporting and an explicit theory-based rationale of how to structure the music stimulus to more intentionally facilitate therapeutic change. Moreover, the TFM creates greater clarity for the role of the music within an intervention strategy (Hanson-Abromeit, 2015), a core component of the MCRF intervention.

## EMOTION REGULATION DEFINED

Emotion is a valenced response to an internal (i.e., perceived or recalled) or external event, situation, or object (Ochsner and Gross, 2005; Gross and Thompson, 2007; Juslin and Sloboda, 2010; Damasio and Carvalho, 2013) that is deemed significant to the individual (Frijda, 2007; Gross and Thompson, 2007). Simplistically, the emotion process is instigated by a situation, which directs an individual's attention to appraise the situation, then generates a response. This is referred to as the situation-attention-appraisal-response modal model of emotion processing (Gross and Thompson, 2007). Theoretically, ER strategies are embedded in this process (Stegge and Terwogt, 2007) and can influence emotion processing at any point in this sequence (e.g., the situation, attention, appraisal, or response periods) (Gross and Thompson, 2007) through both feedback and feedforward mechanisms (Stegge and Terwogt, 2007). ER is a core characteristic of the emotion process as emotional states typically involve an ER component (Juslin and Sloboda, 2010; Lewis et al., 2010), and it functions in general self-regulation as well. Self-regulation is the mechanism through which the brain attempts to maximize an individual's functioning by minimizing distractions (Lewis et al., 2010), with a purpose towards maintaining a sense of equilibrium and homeostasis. Regulatory efforts can occur across multiple domains and it is likely that there is overlap and continuity between them (Calkins and Hill, 2007; Gross and Thompson, 2007). Indeed, it may be difficult to separate ER from other forms of regulation, such as attention regulation and behavioral regulation. With ER, the emotion serves as a signal that equilibrium is disrupted (Bargh and Williams, 2007) and emotion-based regulatory efforts are intended to return an individual to a state of homeostasis.

A single agreed-upon definition for ER does not yet exist in the literature, but authors tend to agree on several key features. ER is an umbrella term for a diverse set of processes and strategies (Beer and Lombardo, 2007; Calkins and Hill, 2007; Eisenberg et al., 2007; Mullin and Hinshaw, 2007; Gross and Thompson, 2007; Thompson and Meyer, 2007; Lewis et al., 2010; Gyurak et al., 2011). These processes and strategies can be explicit, voluntary, controlled, and conscious (i.e., they are "top–down" strategies) or implicit, reactive, automatic, and subconscious (i.e., they are "bottom–up" strategies) (Beer and Lombardo, 2007; Calkins and Hill, 2007; Gross and Thompson, 2007; Mullin and Hinshaw, 2007; Thompson and Meyer, 2007; Lewis et al., 2010; Gyurak et al., 2011). In reality, this is not an either-or situation as strategies can occur on a continuum from top–down to bottom–up (Gross and Thompson, 2007). The purpose of ER is to manage and shift emotions through dampening, intensifying, or maintaining the intensity and temporal qualities of the emotion experience (Beer and Lombardo, 2007; Calkins and Hill, 2007; Eisenberg et al., 2007; Gross and Thompson, 2007; Thompson and Meyer, 2007). Thus, ER will alter the intensity of the emotion, but not the quality of the emotion itself (Thompson and Meyer, 2007). As a goal-dependent process, how emotions are regulated will depend on an individual's goal in the given context (Cummings and Davies, 1996; Beer and Lombardo, 2007; Eisenberg et al., 2007; Gross and Thompson, 2007; Thompson and Meyer, 2007). The end result of appropriate ER, therefore, cannot be considered optimal or maladaptive (Thompson and Meyer, 2007) as it should be congruent with what an individual needs in that particular moment or situation.
Finally, ER is a dynamic and interactive process (Cummings and Davies, 1996; Calkins and Hill, 2007; Gross and Thompson, 2007; Gyurak et al., 2011) that does not rely solely on modifying emotions, but also on the individual continuing to monitor and appraise the emotions (Thompson and Meyer, 2007). Considering these key characteristics, for the purposes of this paper ER will be defined as a concept that describes interactive, goal-dependent explicit and implicit processes intended to help an individual manage and shift an emotional experience.

## EMOTION REGULATION DEVELOPMENT

The unfolding of one's ability to regulate his or her emotions can be a lifelong process (Ochsner and Gross, 2007), but the primary window of development occurs during infancy, toddlerhood, and the preschool years. Although there are multiple possible pathways to ER development, the general trajectory follows a three-stage arc: (1) simple physiologic and reflexive responses (Calkins and Hill, 2007), (2) caregiver-directed coregulatory strategies or the use of simple attentional and motor strategies (Eisenberg et al., 2007; Thompson and Meyer, 2007), and (3) active and intentional self-regulation of emotions (Zeman et al., 2006; Bargh and Williams, 2007; Calkins and Hill, 2007; Eisenberg et al., 2007; Stegge and Terwogt, 2007; Thompson and Meyer, 2007; Cole et al., 2008). Although an innate developmental template for ER exists, ER skills are also socially constructed (Thompson and Meyer, 2007). These socially constructed skills are influenced by one's cultural experiences, family environments, caregiver interactions, and gender expectations.

In the 1st year of life, ER needs are primarily centered on controlling arousal levels (Calkins and Hill, 2007), managing emotional cues, and handling external and internal stress (Feldman, 2009). They tend to involve either innate physiological mechanisms, such as a generalized approach or withdrawal response to an arousal-inducing stimulus (Calkins and Hill, 2007), or passive, caregiver-directed, mutual regulation (Thompson and Meyer, 2007; Feldman, 2009). Caregiver interactions strongly mediate the development of ER. This is an important consideration as children generally cope more adaptively and develop more appropriate ER strategies when caregivers respond supportively and sympathetically to their emotional expressions (Perry and Pollard, 1998; Gross and Thompson, 2007; Thompson and Meyer, 2007; Koole, 2009). During infancy, sensitive, flexible, and responsive caregiving behaviors become integrated into an infant's repertoire of ER responses (Calkins and Hill, 2007). Such behaviors influence the infant's ER development by demonstrating that stress can be managed (Thompson and Meyer, 2007; Cole et al., 2008) and that an adult can help with the management (Thompson and Meyer, 2007). Furthermore, it is not just through calming a distressed infant that caregivers support ER development but also through stimulating the infant by engaging in face-to-face play, an activity that emerges around 2–3 months of age (Thompson and Meyer, 2007). These calming and stimulating interactions offer multiple and frequent opportunities to practice emotion and arousal regulation (Zeman et al., 2006). They strongly influence developing ER capacities in the infant and support continued ER development during the toddler and preschool years.

During the toddler years, caregivers facilitate the shift from passive co-regulation to active self-regulation of emotions by incorporating a variety of strategies and directives designed to facilitate, prompt, model, and structure the emotion experience (Zeman et al., 2006; Calkins and Hill, 2007; Thompson and Meyer, 2007). Caregiver strategies may include preemptively structuring the environment to help control for emotional demands, providing social referencing cues, prompting the use of specific ER strategies, or providing contingencies for behaviors (Zeman et al., 2006; Thompson and Meyer, 2007). Caregiver directives are also important as they facilitate and externally "coach" the transition from passive co-regulation of emotions to more active, internal self-regulation of emotions. Caregivers utilize a variety of directives, which may include distracting the toddler, helping the toddler problem-solve, providing alternate interpretations of situations, providing social referencing cues, suggesting adaptive responses, offering alternatives for maladaptive behavior, or structuring experiences to help make emotional demands more manageable (Thompson and Meyer, 2007). In addition to the caregiver-facilitated shift toward more active self-regulation of emotions, ER development toward the end of toddlerhood is characterized by an emerging use of objects (e.g., toys), social interactions (Eisenberg et al., 2007), and cognitive strategies (Stegge and Terwogt, 2007) to facilitate ER.

Emotion regulation development in the preschool years, defined here as ages three through five years, is generally characterized by a decline in caregiver interventions and directives (Thompson and Meyer, 2007), a greater emphasis on top–down cognitive strategies (Stegge and Terwogt, 2007), a growing repertoire of behavioral ER strategies, and an increasing understanding and use of culturally defined behavioral display rules (i.e., cultural and commonly gender-based expectations for how an individual shows emotions in a given situation) (Zeman et al., 2006). The caregiver–child relationship continues to be of great importance during the preschool years, though there is a shift in how that relationship informs ER development. The "coaching" type of ER development that began during the toddler years persists during the preschool years as caregivers continue to facilitate the transition to active self-regulation of emotions. Caregivers still use a variety of strategies and directives designed to facilitate, prompt, model, and structure the emotional experience (Zeman et al., 2006; Calkins and Hill, 2007; Thompson and Meyer, 2007). These may include preemptively structuring the environment to help control for emotional demands, providing social referencing cues, prompting the use of specific ER strategies, and providing contingencies for behaviors (Zeman et al., 2006; Thompson and Meyer, 2007). However, the employment of these strategies and directives declines during this developmental period as preschoolers take a more active role in the regulation of their emotions (Thompson and Meyer, 2007).

A primary goal during the preschool developmental period is for children to begin to identify appropriate and inappropriate ER strategies and to assess the effectiveness of these strategies (Cole et al., 2008; Röll et al., 2012). Furthermore, unlike earlier years where children are unable to select or modify their environments and situations (Gross and Thompson, 2007), preschool-aged children have a greater ability to utilize strategies designed to alter a situation (Cole et al., 2008). Around age three years, children begin to develop successful inhibitory control (Bargh and Williams, 2007) and the next two years are characterized by an emerging explicit awareness of ER strategies (Cole et al., 2008). Caregiver–preschooler conversations about emotions facilitate ER development as they convey cultural values and gender expectations (Zeman et al., 2006; Thompson and Meyer, 2007) and assist the child in identifying emotions and regulating negative affect (Zeman et al., 2006).

## Childhood Stress and ER Development

Emotions can be viewed as homeostasis-disrupting events as they indicate an organism is not in a steady and calm state of equilibrium (Bargh and Williams, 2007). Furthermore, ER development in infancy centers on controlling arousal levels (Calkins and Hill, 2007) and handling internal and external stress (Feldman, 2009). In a mature individual, the classic "fight, flight, or freeze" stress response is the body's adaptive response to a stressful event, such as occurs during an emotion-inducing event (Perry and Pollard, 1998). In this instance, although the emotion offers an important clue about one's environment, it can also be considered a homeostasis-disrupting event (Bargh and Williams, 2007). Thus, one primary function of ER is to return the organism to a state of homeostasis through managing and shifting the physiological response associated with the emotion experience. When these efforts are not successful, an individual remains in a stressed, disequilibrium state and is said to be dysregulated (Linehan et al., 2007). Given these connections, linking ER development to the childhood stress response is essential.

Stress responses manifest differently in children than in adults. Instead of the classic stress response, a child's stress response generally takes one of two patterns: (1) hyperarousal, also referred to as hyperactivating or underregulation, or (2) dissociative, also referred to as deactivating or overregulation (Perry et al., 1995; Cummings and Davies, 1996; Perry and Pollard, 1998; Mikulincer et al., 2003; Mullin and Hinshaw, 2007). The type of response pattern a child utilizes is formed in infancy and influenced by caregiver–infant interactions. When an infant experiences stress, its initial reaction is hyperarousal (e.g., crying) as it seeks proximity to the caregiver (Perry et al., 1995; Perry and Pollard, 1998). If this strategy works, the infant will continue to use this strategy when he or she seeks proximity, love, and support (Mikulincer et al., 2003). If the initial hyperarousal response attempt does not work, the infant will disengage from its proximity-seeking behavior and will instead attempt to manage the stress without caregiver support. These self-soothing and "managing" behaviors are on the dissociative end of the continuum and can manifest in behaviors such as distraction, avoidance, numbing, daydreaming, and fainting (Perry et al., 1995; Perry and Pollard, 1998; Mikulincer et al., 2003). The type of stress response a child utilizes is also mediated by age and gender as younger children and females are more likely to use dissociative strategies (Perry et al., 1995).

## DEVELOPMENTAL IMPLICATIONS

Emotion regulation development occurs in infancy and early childhood and is heavily influenced by the caregiver–infant relationship (Schore, 2001; Calkins and Hill, 2007; Eisenberg et al., 2007; Thompson and Meyer, 2007). Due to the use-dependent nature of neurodevelopment (Perry and Pollard, 1998), early stress-inducing emotional experiences shape the structure and function of the developing brain. Thus, without being exposed to developmentally appropriate ER experiences, a child is at risk for developing poor ER skills and maladaptive ER strategies. This has implications for an individual's behavioral response patterns (Perry et al., 1995; Perry and Pollard, 1998; Calkins and Hill, 2007; Eisenberg et al., 2007; Mullin and Hinshaw, 2007; Feldman, 2009; Lewis et al., 2010; Jahromi et al., 2012), emotional and social health (Perry et al., 1995; Zeman et al., 2006; Calkins and Hill, 2007; Eisenberg et al., 2007; Blair et al., 2008; Feldman, 2009; Lewis et al., 2010; Jahromi et al., 2012), cognitive skills and learning (Perry et al., 1995; Schore, 2001; Calkins and Hill, 2007; Blair et al., 2008; Jahromi et al., 2012), and the potential development of psychopathology (Perry and Pollard, 1998; Zeman et al., 2006; Mullin and Hinshaw, 2007; Thompson and Meyer, 2007; Hunter et al., 2011; Röll et al., 2012). In short, the development of adaptive ER skills affects a child's mental health, behavioral and emotional responses to stress, his or her ability to develop healthy and appropriate adult–child and peer relationships, and the child's ability to learn in school.

Children are at risk of developing maladaptive ER skills regardless of whether they exhibit hyperarousal or dissociative stress response patterns. For example, a link exists between hyperarousal response patterns and uncontrolled "acting out" or aggressive behaviors. Increased amounts of stress in infancy and childhood may alter the developing HPA axis, which could result in a dysregulated stress response system (Perry and Pollard, 1998; Calkins and Hill, 2007). Increased stress that occurs prenatally and during infancy can have even more damaging and lasting effects, since the basic physiological systems implicated in the stress response are among the first systems to mature. These physiological systems are integrated into, and mediate the development of, later-developing emotional, cognitive, and behavioral systems (Calkins and Hill, 2007). A dysregulated stress response system may lead to heightened states of arousal, which are commonly found in aggressively reactive children (Mullin and Hinshaw, 2007). This type of aggressive reactivity can be accompanied by difficulty in controlling such reactivity, as these children may exhibit low levels of top–down, cognitive-based ER strategies (Eisenberg et al., 2007). In addition to increased reactivity, children who have hyperarousal response patterns may exhibit other types of externalizing behaviors, such as inattention, impulsivity, anxiety, hyperactivity, hypervigilance, and antisocial behaviors (Perry and Pollard, 1998; Mullin and Hinshaw, 2007). Such patterns place these children at risk for externalizing disorders (Zeman et al., 2006) and childhood pathologies such as Attention-Deficit/Hyperactivity Disorder (ADHD), Oppositional Defiant Disorder (ODD), and Conduct Disorder (CD) (Mullin and Hinshaw, 2007).

A child is also at risk when dysregulation occurs at the dissociative end of the childhood stress response continuum. In these instances, maladaptive ER skills are characterized by an inhibition or overcontrol of an emotion (Calkins and Hill, 2007). Children who exhibit this response pattern are thought to be as highly reactive as their hyperaroused counterparts, while their behavioral responses are characterized by poor attention regulation, poor behavior initiation (Eisenberg et al., 2007), and maladaptive self-soothing behaviors such as rocking or "cutting" (Perry and Pollard, 1998). In addition, they are prone to internalizing problems and disorders (Zeman et al., 2006; Eisenberg et al., 2007; Röll et al., 2012). That said, the link between maladaptive ER skills and internalizing problems is not as clear as it is between maladaptive ER skills and externalizing problems (Eisenberg et al., 2007).

Overall, ER significantly influences an individual's ability to function, which makes the development of ER an important area to understand. These effects have also been noted beyond childhood such that maladaptive ER may be a marker of poor mental health (Cole et al., 2008; Gyurak et al., 2011; McRae et al., 2012). How stress and adversity are experienced and handled early in life may program an adult's stress response (Hunter et al., 2011). This does not mean that every child who experiences stress is susceptible to such challenges and problems; ER development outcomes also depend on the child's natural temperament and resilience (Ochsner and Gross, 2007) and the environmental context (Mullin and Hinshaw, 2007).

A child's maladaptive ER-related behaviors serve an important communicative function for parents, caregivers, and clinicians as they indicate that the child is in a dysregulated state. Chronically maladaptive behaviors may suggest a pervasive problem that needs to be addressed. The hyperarousal or dissociative behaviors themselves, although observable, are not the primary issues of concern when facilitating ER development. The underlying problem concerns the development of maladaptive ER skills. Therefore, understanding how ER develops is key to effective intervention methods and supports the call for a more theory-based approach to clinical work, including disciplines such as music therapy (Robb, 2012). A theory-based approach should focus less on the hyperarousal and dissociative behaviors and more on the mechanisms underlying ER development—the developing stress response system, the supportive, caring environments, the predictable, safe, flexible, and loving caregiver responses, and the interactions between them. These are the mechanisms that influence brain development and should theoretically be the primary targets of interventions intended to facilitate ER development.

## MUSIC AS AN ER-FACILITATING MECHANISM

Numerous therapeutic techniques and training programs exist to help improve ER in preschoolers. However, a need still exists for approaches that incorporate a wider range of bottom–up and top–down strategies, provide in-the-moment opportunities to manage "stress" (e.g., emotionally arousing experiences), and afford opportunities for this practice to be realized in the context of a developmentally appropriate interactive adult-child relationship. One such therapeutic approach that may fit these needs is a music-based intervention.

The idea that music can induce emotions began to emerge in the scientific literature in the late 1800s (James, 1884) and continued to be mentioned and explored in subsequent literature from the psychological and anthropological fields (James, 1884; Meyer, 1956; Merriam, 1964; Sears, 1968; Berlyne, 1971; Lazarus, 1991; Zajonc, 1994; Frijda, 2007). The connection between music and emotions was largely ignored in the latter half of the 20th century; however, there has been renewed interest over the past 20 years in understanding this phenomenon (Juslin and Sloboda, 2010). It is generally agreed that music evokes emotions and stimulates physiological and behavioral responses (Habibi and Damasio, 2014). In addition, more recent neuroscience research indicates there are shared neural networks implicated in both emotion and music processing (Blood and Zatorre, 2001; Satoh et al., 2001; Brown et al., 2004; Menon and Levitin, 2005; Baumgartner et al., 2006; Koelsch et al., 2006, 2008; Bengtsson et al., 2007; Brown and Martinez, 2007; Foss et al., 2007; Kleber et al., 2007; Mitterschiffthaler et al., 2007; Mizuno and Sugishita, 2007; Berkowitz and Ansari, 2008; Limb and Braun, 2008; Lerner et al., 2009), and specifically between music and ER processing (Sena Moore, 2013). There is also evidence to support the developmentally appropriate use of music-based experiences to target ER development in preschoolers because music stimulates physiologic arousal and induces emotions, and assumes a natural role in bonding and social interactions.

## Developmental Appropriateness

Parents and professionals who work with preschool-aged children know that they are inherently musical. This connection is also well documented in the literature. Preschoolers have an unbridled enthusiasm for music (Trehub, 2006) and music has a prevalent role in their lives (Lamont, 2008). The developmental foundations for music are laid in infancy. Infants are born with a curiosity and attentiveness to musical sounds (McDonald and Simons, 1989; Tafuri, 2008), and with finer perception of frequency, timing, and timbre than what is needed at this point in their musical development (Trehub, 2003). A preschooler's first social experiences likely involve musical games (McDonald and Simons, 1989; Marsh and Young, 2006). Furthermore, music holds an important role in childcare rituals and routines, both in the home and in preschool or daycare settings (Lamont, 2008). Thus, it seems natural and acceptable to utilize as a therapeutic mechanism a medium with which preschool children are familiar and to which they are inherently drawn.

## The Music – Physiologic Arousal – Emotions Connection

The connection between music, physiologic arousal, and emotions was recorded as early as the late nineteenth century (James, 1884). More recently, neural structures involved in emotion processing, including the striatum, amygdala, ventromedial prefrontal cortex, anterior cingulate cortex (Damasio, 2000), and insula (Damasio and Carvalho, 2013), have also been implicated in music processing (Habibi and Damasio, 2014). Furthermore, listening to music produces physiologic changes associated with emotion processing. What is particularly notable is that an overt reaction is not typically required for musically induced emotions (Trainor and Schmidt, 2003).

Another notable observation is how early this connection begins. Infants have demonstrated sensitivity to sound and movement patterns and their emotional connotations (Parncutt, 2006). Caregiver–infant interaction patterns mirror one another using music-like qualities that engage infant attention, communicate affective responses, encourage social reciprocity, and facilitate language (Trevarthen and Aitken, 2001). Thus, music is connected to physiologic arousal as a means of modulating an infant's arousal level. This can be seen through caregiver–infant interactions, which often incorporate stimulating music (i.e., play-songs) and calming music (i.e., lullabies) (Trainor and Schmidt, 2003; Trehub, 2003). In addition, music has the ability to convey emotional information (Schubert and McPherson, 2006), even for young children (Trainor and Schmidt, 2003). Preschoolers are able to distinguish basic emotions, such as happiness, sadness, anger, and fear, and express them through music (Stachó et al., 2013). Thus, the connection between music, physiologic arousal, and emotion induction apparent early in the development process provides additional support for the use of music as a mechanism to facilitate ER development in preschool children.

## Music-mediated Bonding and Interactions

Perhaps the strongest argument supporting the use of music to facilitate ER development may be its role in facilitating bonding and caregiver–child interactions. From an early age children are exposed to musical interactions, first with their caregivers (Cross, 2003; Trainor and Schmidt, 2003; Trehub, 2003; Barrett, 2006; Custodero, 2006; Marsh and Young, 2006) and then through social experiences (McDonald and Simons, 1989; Marsh and Young, 2006) and interactions in a daycare or preschool setting (Lamont, 2008). Caregivers commonly use music as part of their familiar and structured parenting rituals (Barrett, 2006, 2011; Custodero, 2006; Lamont, 2008). These early music interactions help to stimulate the child, soothe the child, or allow the child to share emotional information (Trehub, 2009). Furthermore, caregiver–infant interactions seem to incorporate music-like characteristics in that they have a rhythmic and dynamic back-and-forth quality. Such music-like interactions are critically important in helping a child acquire capacities to self-regulate and to bond emotionally with another person (Cross, 2003) and they serve an essential role in nonverbal caregiver–infant communications (Trevarthen and Aitken, 2001; Marsh and Young, 2006) that begin forming before birth (Welch, 2006b; Tafuri, 2008). Grounded in these early experiences, spontaneous singing emerges during early childhood, frequently functioning as a way to express and regulate the young child's emotions (Barrett, 2006, 2011).

Even beyond infancy and the caregiver–infant relationship, a key characteristic of musical play is its importance as a form of social interaction (Marsh and Young, 2006). Interactive musical play between parents and young children can have a positive effect on the quality of parent–child communication and understanding (Welch, 2006b). Outside the home, music infuses the interactions between young children and their peers. They share musical play ideas, synchronize rhythmic movements with each other, and imitate each other's melodic ideas (Marsh and Young, 2006). As a natural component of early interactions, bonding, and social experiences, music can be an effective mechanism to facilitate therapeutic change in preschool children.

## Neural Support

There is emerging, although inconclusive, neurological support for using music as an intervention mechanism that centers on the role of the right hemisphere in music processing, attachment, and stress. Infants show a right hemisphere advantage for music processing (Trehub, 2003) and it is the right hemisphere that seems specialized for processing musically induced emotions (Peretz, 2010). In addition, the prosodic elements of speech processing, the patterns of stress and intonation that are sometimes referred to as the "music" of speech, may be processed more in the right hemisphere than in the left (Welch, 2006b). Outside of music and speech processing, the right hemisphere is implicated in an organism's ability to cope with stress (Schore and Schore, 2008). Furthermore, caregiver–infant interactions are implicated in the appropriate development of the prefrontal cortex, particularly in the right hemisphere (Calkins and Hill, 2007). Finally, although this does not address ER from a developmental perspective, music activates neural networks implicated in ER processing (Sena Moore, 2013). This means that, although a specific brain-based connection has not yet been made, evidence for a connection between music processing, attachment, stress management, and music and ER exists. Such correlations provide support for pursuing a line of inquiry that explores how music can be used to facilitate ER development.

## THERAPEUTIC FUNCTIONS OF MUSIC FOR TARGETING ER DEVELOPMENT

Evidence exists for the use of a music-based treatment approach to facilitate ER development. An important aspect of an effective music-based intervention is to determine how to structure the music stimulus intentionally for this task. One method to develop effective music-based interventions is to design a TFM Plan for the proposed strategy (Hanson-Abromeit, 2015). The TFM has been defined as "the direct relationship between the treatment goal and the explicit characteristics of the musical elements, informed by a theoretical framework and/or philosophical paradigm in the context of a client" (Hanson-Abromeit, 2013). In other words, it allows the clinician to have an explicit understanding of why and how music affects a desired change, thus informing the intentional, therapeutic use of music in clinical practice. For the purposes of this paper, the goal in outlining the TFM Plan is to help determine how to structure a music stimulus so that it is developmentally appropriate for preschool-aged children and can have either a stimulating ("high arousal") or a calming ("low arousal") effect, core components of the music-based MCRF intervention. This analysis is based on a review of music theory, music neuroscience, and music development literature resulting in a descriptive synthesis of the musical characteristics reflective of high and low arousal musical stimuli appropriate for preschool-aged children. The TFM Plan (Hanson-Abromeit, 2015) is detailed in Supplementary Table 1.

## Synthesis of Developmentally Appropriate Music for Preschool-aged Children

Developmentally appropriate music stimuli should be predictable and structured, incorporating rhythmic and melodic repetition and simple consonant harmonies. Melodies should have an easy-to-follow contour characterized by descending intervals and step-wise movements. Pitches and pitch intervals should help create "singable melodies" by falling within an octave pitch range and including skips with small-integer ratio intervals (e.g., octaves, perfect fifths, perfect fourths). The music should mostly incorporate binary rhythms, should have a simple form, and should be primarily diatonic. Stylistically appropriate music should sound like popular music, be chant-like and repetitive, or be solitary and free-flowing. Preschool-aged children can be expected to synchronize their motor movements to a beat, detect tempo changes and synchronize to them, and control and alter basic rhythmic patterns. Most will have an imprecise sense of pitch, but will possess a developing ability to sing in tune and can produce and discriminate loud and soft sounds. Furthermore, they should be able to discriminate various musical styles, timbres, and textural changes. If incorporating valence into the music-based experience (e.g., a musically induced positive or negative emotion), major modes can be used to reflect positive emotions and minor modes negative ones. Preschoolers can be expected to focus on words, so lyrics can provide a verbal prompt to use an explicit ER strategy or explore the effectiveness of an ER strategy. This synthesis is based on the following literature: Krumhansl and Keil, 1982; Trehub et al., 1986; Drake and Gérard, 1989; McDonald and Simons, 1989; Morrongiello and Roes, 1990; Schwarzer, 1997; Drake et al., 2000; Dalla Bella et al., 2001; Costa-Giomi, 2003; Drake and Bertrand, 2003; Trainor and Schmidt, 2003; Trehub, 2003, 2006, 2009; Peretz and Zatorre, 2005; Devous et al., 2006; Leighton and Lamont, 2006; Marsh and Young, 2006; Schubert and McPherson, 2006; Welch, 2006a; Marshall and Hargreaves, 2007; Miyamoto, 2007; Lamont, 2009; Patel, 2009; Stewart et al., 2009; Trainor and Zatorre, 2009; Gabrielsson and Lindström, 2010; Tsang et al., 2011.

## Synthesis of High Arousal Music

A variety of characteristics can be combined to create highly arousing music, and a music stimulus can be manipulated in various ways to make it more arousing. Highly arousing music avoids ritardandos and places accents on unstable (i.e., rhythmically unstressed) notes. For preschool children, it may include complex ternary rhythmic patterns. High arousal music can also have rising pitches and use instruments that produce extraneous harmonic "noise." It typically will be in a fast tempo with bright or sharp timbres, rising or sharp micro-intonations, fast or shallow vibrato, staccato articulations, quick and abrupt attacks, complex musical textures, or variable articulation styles. Lyrics can be created to reflect the intended arousal level.

To manipulate a music stimulus to make it more arousing, sudden and unexpected musical events or novel musical elements can be incorporated. These can be melodic (e.g., ascending intervals, wide skips, and intentional mis-tunings), pitch-related (e.g., sharp change in pitch tuning), timbral (e.g., novel timbres or unexpected timbral changes), stylistic (e.g., sudden change in musical style), rhythmic (e.g., sudden or sharp rhythmic changes), or textural (e.g., novel texture or unexpected textural changes). If incorporating valence into the music experience, it may help to reference a two-dimensional circumplex model that features the dimensions of valence and activity, or arousal. Within this model, happiness, anger, and fear are considered emotions of high arousal (Juslin and Timmers, 2010). Happy-sounding music can be loud, have small tempo variations, and vary in loudness level. Angry music can be loud and have small tempo variations. Fearful-sounding music can be soft and have substantial variability in terms of volume or tempo. Finally, creating a sense of expectation is an important element of a music-induced emotional response (Stevens and Byron, 2009) and another mechanism through which arousal can be elicited. Musically, expectation can occur by creating a pause in the melodic or rhythmic pattern or through the use of harmonic dissonance. This synthesis is based on the following literature: Trehub et al., 1986; Morrongiello and Roes, 1990; Dalla Bella et al., 2001; Costa-Giomi, 2003; Drake and Bertrand, 2003; Trainor and Schmidt, 2003; Peretz and Zatorre, 2005; Devous et al., 2006; Schubert and McPherson, 2006; Lamont, 2009; Patel, 2009; Stevens and Byron, 2009; Stewart et al., 2009; Trainor and Zatorre, 2009; Gabrielsson and Lindström, 2010; Juslin and Timmers, 2010; Mote, 2011; Yrtti, 2011; Huron, 2013.

## Synthesis of Low Arousal Music

Structuring music to create a calming, low arousal effect will incorporate the familiar developmentally appropriate musical characteristics previously outlined. Other music characteristics that may be considered low arousal for preschoolers include music in a lower-than-normal range, no changes in pitch tuning, soft loudness levels, narrow loudness variability, familiar, soft, or dull timbres, slow tempos, legato articulations, slow attacks, simpler textures, slow vibrato, and limited articulation variability. Rhythmic characteristics of low arousal music include incorporating a ritardando at the end of a song, placing accents on stable (i.e., rhythmically stressed) notes, and avoiding rhythmic change. Additionally, lyrics can be created to reflect the intended arousal level. This synthesis is based on the following literature: Trainor and Schmidt, 2003; Peretz and Zatorre, 2005; Devous et al., 2006; Schubert and McPherson, 2006; Patel, 2009; Stewart et al., 2009; Trainor and Zatorre, 2009; Gabrielsson and Lindström, 2010; Juslin and Timmers, 2010; Mote, 2011; Yrtti, 2011; Huron, 2013.

## TFM APPLICATION TO INTERVENTION: MUSICAL CONTOUR REGULATION FACILITATION

One of the primary goals of ER development during the preschool years is to practice different ER strategies in various situations and contexts. It is through practicing these strategies that ER development transitions from being explicit, top–down, caregiver-facilitated processes to implicit, automatic, self-initiated ones (Bargh and Williams, 2007). Furthermore, typical ER development is mediated in large part through the child's interactive experiences with a caregiver (Calkins and Hill, 2007; Eisenberg et al., 2007; Thompson and Meyer, 2007). A child may be at-risk for developing maladaptive ER strategies if he or she does not have a secure attachment relationship with a primary caregiver and is not exposed to experiences that help him or her practice ER strategies in an adaptive and safe way (Calkins and Hill, 2007).

Therapeutic strategies for preschool-aged children that are explicitly designed to facilitate the practice of ER strategies within the context of a trusted relationship remain extremely limited, with one notable exception (Betty, 2013). Many of the published therapeutic strategies are verbal- or behavioral-based (Webster-Stratton and Reid, 2003; Izard et al., 2008; Johnson, 2012) and although they incorporate evidence to support the efficacy of the particular treatment technique or training program, several limitations exist. First, such strategies only target verbal, top–down ER strategies, even though ER strategies can occur on a continuum from top–down to bottom–up (Gross and Thompson, 2007). Second, there is a separation between the timing of the treatment approaches and the occurrence of an emotionally stressful situation. Children cope more adaptively and acquire more positive ER strategies when caregivers respond supportively and sympathetically to their emotional expressions as they occur (Perry and Pollard, 1998; Gross and Thompson, 2007; Thompson and Meyer, 2007; Koole, 2009). In other words, current approaches are a priori therapeutic methods that may help the child learn ER strategies, but may not provide the child with an opportunity for real-time management of emotionally arousing experiences necessary to internalize the strategies. Third, ER is largely learned through caregiver–child interactions. An a priori therapeutic approach may help the child learn ER strategies, but it does not mean that the child will get the interactions needed to practice and internalize the strategies. Although these limitations are partially addressed through the parent training component included in some programs (i.e., parents are trained to provide necessary in-the-moment responsiveness to an emotionally stressed child) (Webster-Stratton and Reid, 2003; Izard et al., 2008), what may be needed is a therapeutic approach that incorporates real-time, in-the-moment opportunities to practice experiencing and managing stressful experiences.

Music may function as a mechanism to provide in-the-moment, interactive opportunities for the management and regulation of "stress" (e.g., emotionally arousing experiences) in the context of a healthy adult–child relationship. Behavioral and neural evidence supports using music as the mechanism due to music's natural role in infant and early childhood interactions (McDonald and Simons, 1989; Cross, 2003; Trainor and Schmidt, 2003; Trehub, 2003; Marsh and Young, 2006; Welch, 2006b; Lamont, 2008), caregiver-infant bonding (Cross, 2003; Marsh and Young, 2006), and developmental appropriateness (McDonald and Simons, 1989; Trehub, 2003, 2006; Marsh and Young, 2006; Lamont, 2008). Based on this support, the synthesized TFM Plan identifies the structural characteristics of the musical elements to compose preschool-aged developmentally appropriate (neutral arousal), arousing (high arousal), and calming (low arousal) musical experiences (see Supplementary Table 1). This TFM Plan provides the theory-based rationale for music stimuli purposefully constructed for an intervention context.

The MCRF intervention was designed with the intention that the contour and temporal structure of a music therapy session alternate between high- and low-arousal states in a way that theoretically mirrors the changing flow of the caregiver–infant interaction. In essence, the MCRF intervention does not seek to induce or elicit specific emotions, but to use the music stimulus to manipulate the arousal levels in preschool-aged children, exposing them to alternation of stimulating and calming experiences (Sena Moore, 2014, 2015). This strategy builds on the works of Meyer (1956), Berlyne (1971), and others (Juslin and Sloboda, 2010), as it is intended to manipulate the physiologic aspect of the emotion process (Zelazo and Cunningham, 2007; Koole, 2009; Juslin and Sloboda, 2010) through music-based experiences. Music stimuli for the MCRF intervention were composed based on guidelines that emerged from the TFM Plan (Hanson-Abromeit, 2015). An intervention manual was designed to provide a clear and detailed description of the MCRF intervention protocol (Sena Moore, 2014). Manualization of interventions improves intervention transparency and subsequent transfer of music-based interventions from research to clinical practice; thus, the information included in this manual adheres to music-specific intervention reporting guidelines (Robb et al., 2011).

It is expected that exposing preschool-aged children to alternating characteristics of the musical elements will provide opportunities for them to practice managing high and low arousal experiences in the moment. Furthermore, it is expected that over time, this practice will lead to improvements in ER skills and in their ability to manage high and low arousal situations as measured by changes in hyperarousal and dissociative stress response pattern indicators (i.e., an increased use of adaptive ER skills, increased attending behaviors, decreased emotional reactivity, and decreased aggressive behaviors). As an initial exploration, a study was conducted to provide an examination of the utility of using music as a way to facilitate ER development in preschool-aged children (Sena Moore, 2015). More specifically, as a first step in a phased research agenda, the study explored the feasibility of the MCRF intervention as a way to improve ER abilities in typically developing preschoolers. Utilizing an embedded convergent mixed methods design, the researcher examined whether the MCRF intervention showed promise of being successful, and explored its acceptability and ease of integration as perceived by parents and teachers. Preliminary findings were encouraging. Most parents and teachers noted emotion-based changes in the children following MCRF treatment and they all believed in the importance and helpfulness of music on developmental outcomes. Furthermore, analyses exploring the efficacy of the MCRF intervention indicated clinically significant improvements in ER skills in the children following treatment. Together, these findings endorse future normative and clinical study of the MCRF intervention as a way to facilitate ER development, especially as this medium is highly desired by parents and teachers and can be easily integrated in a preschool setting.

## CONCLUSION

The purpose of this paper is to provide a theoretical rationale for the TFM Plan, with specific application to a music-based ER intervention strategy for preschool-aged children, the MCRF intervention (Sena Moore, 2014, 2015). The primary window for appropriate ER development occurs during the infancy, toddlerhood, and preschool years. Atypical ER development is considered a risk factor for mental health problems and has been implicated as a primary mechanism underlying childhood pathologies, as well as childhood social competence and school adjustment. Furthermore, due to the use-dependent nature of neurodevelopment, structural and functional neural changes associated with atypical ER development affect the functionality of an individual's brain through adulthood. Current treatment approaches and training programs are primarily verbal- and behavioral-based. Most of them incorporate top–down ER strategies, which represent only one end of the explicit-to-implicit ER strategy continuum. Furthermore, a disconnect occurs between the timing of the treatment and the need to handle emotionally charged situations in the moment, as well as a lack of the caregiver-child interactive component that is central to typical ER development. Limited therapeutic options exist to provide real-time, adult-child interactive opportunities to manage emotionally arousing experiences and simultaneously practice and internalize ER strategies.

The incorporation of intentional music experiences may be one way to address these limitations. Behavioral and neural evidence supports the use of music as the mechanism for such experiences due to music's developmental appropriateness, as well as its natural role in infant and early childhood interactions and caregiver-infant bonding. The MCRF intervention was designed with the intention that the contour and temporal structure of a music therapy session alternate between high- and low-arousal states in a way that theoretically mirrors the changing flow of the caregiver–infant interaction. In essence, this method does not seek to induce or elicit specific emotions, but to use the music stimulus to manipulate the arousal levels in preschool-aged children, exposing them to alternation of stimulating and calming experiences. This paper proposes that the music characteristics can be specifically designed to provide in-the-moment, interactive opportunities for stress management (e.g., of emotionally arousing experiences) and regulation in the context of a healthy adult–child relationship.

Results from the TFM Plan provided preliminary guidelines as to how to structure and manipulate the music stimulus in the MCRF intervention. The TFM described in this paper is a theory-based construct for music stimuli specific to an intervention strategy intended to practice real-time ER within a therapist–preschooler relationship. It should be noted that the music stimulus guidelines outlined in this paper only account for Western musical practices. It is outside the scope of this review to consider other cultural music contexts, although the process for analyzing the TFM could transfer to these situations. In addition, there are individual and sociocultural influences to consider when planning how to structure the music stimulus, and the TFM Plan should be adapted accordingly for each given situation. For example, the TFM Plan in this paper is appropriate for ER experiences with typically developing children aged 3–5. Adaptations may be needed if working with children with a history of complex trauma, as they will have unique individual and culturally influenced needs that are not considered in the TFM analysis outlined in this paper.

Further study of the implementation of the MCRF intervention is warranted. Additionally, careful analysis of the music within the context of an intervention strategy will support the reliability and fidelity of the TFM Plan for ER. Greater understanding and reporting of the characteristics of the music-based stimuli as suggested through the TFM Plan (Hanson-Abromeit, 2015) will contribute to the continued refinement of the TFM for preschool-aged ER, the generalizability of the music stimuli to other intervention strategies, and a deeper understanding of the role of music as a mechanism for ER development.

## AUTHOR CONTRIBUTIONS

KSM contributed to the design, acquisition, analysis, and interpretation of the work, drafted the work, and gave approval of the version to be published and agrees to be accountable for all aspects of the work including accuracy and integrity of the work. DH-A contributed to the conception and interpretation of the work and critical revision for important intellectual content, and gave approval of the version to be published and agrees to be accountable for all aspects of the work including accuracy and integrity of the work.

## ACKNOWLEDGMENT

This project was completed in partial fulfillment of the requirements for a doctoral degree at the University of Missouri-Kansas City.

## REFERENCES


## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnhum.2015.00572


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Sena Moore and Hanson-Abromeit. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The pleasures of sad music: a systematic review

#### Matthew E. Sachs, Antonio Damasio and Assal Habibi\*

Brain and Creativity Institute, Dornsife College of Letters Arts and Sciences, University of Southern California, Los Angeles, CA, USA

Sadness is generally seen as a negative emotion, a response to distressing and adverse situations. In an aesthetic context, however, sadness is often associated with some degree of pleasure, as suggested by the ubiquity and popularity, throughout history, of music, plays, films and paintings with a sad content. Here, we focus on the fact that music regarded as sad is often experienced as pleasurable. Compared to other art forms, music has an exceptional ability to evoke a wide range of feelings and is especially beguiling when it deals with grief and sorrow. Why is it, then, that while human survival depends on preventing painful experiences, mental pain often turns out to be explicitly sought through music? In this article we consider why and how sad music can become pleasurable. We offer a framework to account for how listening to sad music can lead to positive feelings, contending that this effect hinges on correcting an ongoing homeostatic imbalance. Sadness evoked by music is found pleasurable: (1) when it is perceived as non-threatening; (2) when it is aesthetically pleasing; and (3) when it produces psychological benefits such as mood regulation and empathic feelings, caused, for example, by recollection of and reflection on past events. We also review neuroimaging studies related to music and emotion and focus on those that deal with sadness. Further exploration of the neural mechanisms through which stimuli that usually produce sadness can induce a positive affective state could help the development of effective therapies for disorders such as depression, in which the ability to experience pleasure is attenuated.

#### Keywords: sad, music, neuroimaging, music therapy, depression

## Introduction

Humans have long devoted effort and attention to the making and consuming of art that portrays and conveys misery. The ancient Greeks were known for staging tragedies that were widely popular; to this day, films and novels that deal with heartache and despair become bestsellers and garner critical attention. The phenomenon is seen across cultures and art forms. Classical music exhibits the phenomenon abundantly. Folk music, such as the Portuguese Fado (Nielsen et al., 2009) or the Irish Lament (O'Neill, 1910), often expresses sadness and grief. Sad-sounding motifs even permeate many modern-day American pop songs (Schellenberg and von Scheve, 2012).

Sadness in everyday life, however, is hardly pleasant. It is one of the six basic emotions (along with fear, happiness, anger, surprise, and disgust) and it results in feelings that most humans prefer not to experience. As is the case with other negative emotions, the importance of sadness throughout human history and across cultures can be explained through the evolutionary advantage that it confers (Ekman, 1992). Sadness results from a perceived loss, such as the loss of a valued object, the loss of health, the loss of status or of a relationship, or the loss of a loved one. It is a complex bodily and neural state, resulting in feelings of low energy, social withdrawal, low self-worth, and a sense of limited horizon of the future (Harter and Jackson, 1993; Damasio, 1999; Mee et al., 2006; Hervas and Vazquez, 2011).

#### Edited by:

Julian O'Kelly, Royal Hospital for Neuro-disability, UK

#### Reviewed by:

E. Glenn Schellenberg, University of Toronto, Canada; David Huron, Ohio State University, USA

#### \*Correspondence:

Assal Habibi, Brain and Creativity Institute, Dornsife College of Letters Arts and Sciences, University of Southern California, 3620 A McClintock Avenue, Los Angeles, CA 90089-2921, USA, assal.habibi@usc.edu

> Received: 01 April 2015 Accepted: 29 June 2015 Published: 24 July 2015

#### Citation:

Sachs ME, Damasio A and Habibi A (2015) The pleasures of sad music: a systematic review. Front. Hum. Neurosci. 9:404. doi: 10.3389/fnhum.2015.00404

Sad music can be defined objectively, based on its acoustical properties, and subjectively, based on a listener's interpretation of the emotion that the composer is assumed to have conveyed. The musical features generally associated with "sadness" include lower overall pitch, narrow pitch range, slower tempo, use of the minor mode, dull and dark timbres, softer and lower sound levels, legato articulation, and less energetic execution (Juslin and Laukka, 2004). The emotional content of music can also be described in a two-dimensional space of valence and arousal. In this view, sad music is defined as music with low valence and low arousal (Trost et al., 2012). Others classify music as sad based on either the emotion that is perceived or the emotion that is induced. This is usually determined by directly asking participants which emotion they believe is being expressed by the music or which emotion they feel when listening to the music (Guhn et al., 2007). The lyrics of popular songs and the poetry of classical pieces can play an important role in defining music as sad, as they can trigger memories that the listener associates with sadness (Van den Tol and Edwards, 2013), such as themes of regret and lost love (Mori and Iwanaga, 2013).

Given that in most circumstances sadness is unpleasant, how then can it be associated with pleasure when expressed through music? Herein lies the so-called "tragedy paradox", the seemingly contradictory idea that humans work to minimize sadness in their lives, yet find it pleasurable in an aesthetic context. The Athenian philosophers of the Pre-Christian era were the first to discuss this matter formally, proposing that art pertaining to negative emotions provides rewards that other art cannot provide. Aristotle, for example, spoke of how tragic theater allowed the audience to rapidly experience, and subsequently purge itself of, negative emotions, a beneficial outcome known as catharsis (Schaper, 1968). Philosophers and psychologists continue to explain the human attraction to sad art in terms of the psychological rewards that are associated with it.

There is room for disagreement, however, regarding the exact relationship between sad music and the associated pleasurable response. Many believe that music perceived as sad does not produce feelings of sadness and instead directly produces a positive affective state (Kivy, 1991). Others argue that, as is the case with Schadenfreude, pleasurable sadness can be viewed as a "mixed" emotion in which positive and negative affects are experienced simultaneously (Juslin, 2013). A third position is that sad music does induce feelings of sadness and that this negative affect is then made positive (Vuoskoski et al., 2011).

The recent emergence of new tools in cognitive science and neuroscience provides the possibility of investigating the relationship between perceived sadness in music and positive affect. By investigating how the brain responds to music listening, aesthetic judgment, and emotional processing, it is possible to gain a better understanding of how and why certain auditory stimuli eventually culminate in a pleasurable response.

In this article, we attempt to bring together findings from philosophy, psychology, and neuroscience in order to arrive at a framework for how sad music becomes pleasurable. We also propose ways of assessing the validity of the framework using neuroimaging and suggest how the available facts may be applicable to mental health interventions.

## The Tragedy Paradox: Philosophical and Psychological Approaches

The earliest attempts to reconcile the "tragedy paradox" came from philosophy and can be broadly organized into two main schools of thought. The "cognitivists" argue that music does not evoke real emotions, but that emotion can nonetheless be perceived in the structure of music, which, in turn, evokes reminders of the feelings associated with that emotion (Kivy, 1991). Cognitivists posit that emotive moments in music occur much too quickly to result in a full-fledged feeling of that emotion and, therefore, music can only act as a tour guide of past emotions (Hindemith, 1961).

On the other hand, the "emotivists" claim that music does induce real emotions in the listener (Levinson, 1990). Within the emotivist school of thought, however, there is still disagreement over the exact nature of the inducible emotions. Some emotivists argue that the emotional response is of a different sort than the kind experienced in everyday life. "Music-sadness" cannot be the same as "life-sadness", they contend, because the environmental conditions necessary for that emotion are not present (Hospers, 1969). Given the inherently unpleasant nature of sadness, the pure fact that music expressing negative valence can even be found pleasant is proof enough that listeners do not feel sad. Instead, one is left only with responses such as awe, transcendence, and chills, which are inherently pleasurable, but do not entail or require the clear goal-oriented action that basic emotions promote (Scherer, 2004; Konečni, 2005).

Other "emotivists", such as the philosopher Jerrold Levinson, argue that sad music does induce genuine sadness, and that this response is inherently rewarding. In his account, Levinson lists eight different benefits that can arise from the feeling of sadness evoked by music with a negative valence: catharsis, the purging of negative emotions; apprehending expression, an improved understanding of the emotions expressed in a piece of art; savoring feeling, the satisfaction that arises from simply feeling any emotion in response to art; understanding feeling, the opportunity to learn about one's feelings; emotional assurance, the confirmation of one's ability to feel deeply; emotional resolution, the knowledge that an emotional state has been, and can be, regulated; expressive potency, the pleasure that arises from expressing one's feelings; and emotional communion, a connection to the feelings of the composer or other listeners (Levinson, 1990).

More recently, large-scale surveys in which participants were asked to provide their motives for listening to sad music have revealed that people often cite similar benefits to the ones described by Levinson (Garrido and Schubert, 2011). Furthermore, when participants were specifically asked about each of Levinson's eight rewards relative to their justification for listening to sad music over happy music, they were more likely to associate sad music with the rewards of understanding feelings, emotional assurance, savoring feelings, emotional communion, and emotional resolution (Taruffi and Koelsch, 2014). Additional justifications included the trigger of specific memories, the distraction from current problems (Van den Tol and Edwards, 2013), the engagement of imaginative processes, and the experience of intense emotions without real-life implications (Taruffi and Koelsch, 2014).

Levinson's ideas, and the ensuing survey data, point to a central mechanism by which sad music can become enjoyable: by triggering a number of psychological processes that are pleasurable to begin with. However, neither can fully explain how the association between sad music and psychological rewards arises or why this association is more likely to occur with sad music than with happy music. Sad music may in fact arouse feelings of connectedness and these feelings may be inherently pleasurable, but the question of how and why sad music allows one to feel more connected to others remains.

#### Proposed Psychological Theories

A different line of research attempts to elucidate the relationship between sad music and affective response by exploring the underlying cognitive processes. Based on the notion that positive emotions, such as joy, are often linked to pleasure, while negative emotions are often linked to displeasure, Schubert (1996) proposed that negative-valence music is perceived as sad, but that this perception of negativity does not produce displeasure because the stimuli are considered to be "aesthetic" and therefore not actually harmful. In the wake of the dampened displeasure provided by the aesthetic context, a pleasurable response arises from the experience of arousal that the music produces. This theory provides a testable model for how sad music can be linked to pleasure, yet it does not clarify why other negative-valence stimuli, such as fear-inducing music, are generally not enjoyed.

In an attempt to address this question, Huron (2011) suggested that the hormone prolactin is responsible for enabling the enjoyment of sad music. Prolactin is released by endocrine neurons in the hypothalamus in response to tears and to the experience of negative emotions such as grief, sadness, and, more generally, stress (Turner et al., 2002). In such situations, its release encourages attachment and pair bonding, as suggested by the fact that levels of prolactin fluctuate when people become parents, hear their children cry, or are mourning a recently deceased spouse (Lane et al., 1987; Delahunty et al., 2007). Huron proposes that the release of prolactin serves to comfort and console, to counteract the mental pain at the root of the negative emotion. He states that music simulates real sadness, which tricks the brain into engaging a normal, compensatory response, i.e., the release of prolactin. But because the listener is aware of the fact that they are not actually in a stressful or grief-inducing situation, the consoling effect of the hormone is produced in the absence of the mental pain that normally precedes it. The fact that the enjoyment of sadness varies greatly from person to person can be explained by differences in personality, emotional reactivity, cultural norms, biology and learned associations (Huron, 2011). No study to date has tested levels of prolactin in participants listening to music that evokes other negative emotions, and thus this idea remains untested.

Like Schubert's, Huron's theory does not clarify why music is unique in its ability to produce this comforting after-effect. According to his view, other sad stimuli that simulate mental pain should be found pleasurable as well, such as sad faces or sad affective words. But existing research has suggested that this is not the case as the subjective report of experienced pleasure decreased when participants were presented with a sad photo (Wild et al., 2001).

A third proposal comes from Juslin's BRECVEMA model, which describes eight mechanisms by which music can induce emotions: brain stem reflexes, rhythmic entrainment, evaluative conditioning, contagion, visual imagery, episodic memory, musical expectancy, and aesthetic judgment (Juslin, 2013). These mechanisms can work independently and as a group. A mixed emotion, such as pleasurable sadness, can be understood as the result of two different mechanisms generating different affective responses simultaneously. A sad piece of music might evoke a negative affect through the emotional contagion mechanism, which involves feeling the emotions that are recognized in external stimuli, and might evoke a positive affect through the aesthetic judgment mechanism, which involves deciding that the piece of music is aesthetically pleasing. In this account, the sad affective response does not lead to a joyful response, but rather sad music itself produces both sorrow and joy simultaneously (Juslin, 2013).

#### Do Listeners Actually Feel Sad?

One common thread that runs through the available theories is that music that expresses sadness is enjoyed when the perceiver recognizes that the stimulus is not an immediate threat but is aesthetic instead. The fundamental disagreement concerns whether or not people actually feel sad when listening to sad music that they regard as pleasurable.

When people are directly asked the question, the responses vary. Roughly 25% say that they experience genuine sadness and the rest report that they experience some other, albeit related, emotion, most often nostalgia (Huron, 2011). However, self-reports made in the context of emotional experience may provide inaccurate results since the difference between emotional perception and emotional experience may not be clear or equal to everyone. In studies in which the researchers made a clear distinction between "perceived" and "felt", participants reported experiencing mixed emotions (Kawakami et al., 2013).

There is behavioral evidence to suggest that participants do indeed experience, as well as perceive, everyday emotions in response to music. Physiological and behavioral differences were found in participants listening to sad music vs. happy music, including decreased skin conductance, higher finger temperature, decreased zygomatic activity, and more self-reported sadness (Lundqvist et al., 2008). Vuoskoski and Eerola (2012) showed that sadness induced by music had similar bias effects on a word recall task and a picture judgment task as sadness induced by autobiographical recall. The results, then, are taken to mean that music can alter perception and judgment in a similar way to genuine sadness, even if listening to sad music was reported as more pleasant than recollecting a sad autobiographical memory. Neuroimaging has also provided some clarification, as sad music activated some of the regions associated with sad affective states (Mitterschiffthaler et al., 2003; Vytal and Hamann, 2010; Brattico et al., 2011). To date, findings suggest that both views have merit. At times, feelings of sadness are experienced in response to sad music and can result in pleasure; at other times, sad music can bypass the associated sad feelings and directly induce a pleasurable response. Which scenario occurs most likely depends on personality, mood, and learned associations with the musical stimuli. Exploring the extent to which the emotional response to sad music overlaps with the sadness experienced in everyday life is a fertile area for further research.

## The Influence of Individual Differences, Mood, and Social Context

While sad music may be associated with various psychological rewards that are inherently pleasurable, not everyone experiences the pleasurable response all the time. In addition to the acoustic features of sad music described above, personality, mood, and the surrounding social context are all important factors in determining whether or not sad music is enjoyed. Several key personality measures are correlated with the liking of sad music, including absorption, as measured by the Tellegen Absorption Scale, and scores on subscales of the Interpersonal Reactivity Index (IRI) including fantasy and empathic concern (Garrido and Schubert, 2011). Higher scores on openness to experience and lower scores on extraversion, as defined by the Big Five Model of personality traits, were shown to be associated with the liking of sad music (Vuoskoski et al., 2011; Ladinig and Schellenberg, 2012). Trait rumination, assessed by the Rumination-Reflection Questionnaire (RRQ), was also positively correlated with enjoyment of sad music, suggesting that certain people listen to sad music not because of the resulting positive feelings, but because of some maladaptive attraction to negative stimuli (Garrido and Schubert, 2011).

Situational factors are also important. People report choosing to listen to sad music more often when they are alone, when they are in emotional distress or feeling lonely, when they are in reflective or introspective moods, or when they are in contact with nature (Taruffi and Koelsch, 2014). Some individuals report that their preference for sad music is dependent on the time of day when they listen (Taruffi and Koelsch, 2014). Other studies have shown that liking of sad music increases when the listener is repeatedly exposed to the musical excerpt while distracted or mentally fatigued (Schellenberg et al., 2008) or when the music is preceded by multiple happy-sounding excerpts (Schellenberg et al., 2012). Empirical evidence that context can have an effect on one's emotional response to music was recently found in a study in which participants who listened to music alone showed greater skin conductance response compared to participants who listened to the same music in a group (Egermann et al., 2011).

Mood appears to play a role in preferences for sad music as well, though the exact nature of that role is unclear. The liking of unambiguously sad-sounding music was shown to increase after a sad-mood induction paradigm (Hunter et al., 2011). However, there is evidence to suggest that this effect may vary across individuals, as some people appear to be motivated to select music that is incongruent with their current mood (i.e., selecting happy music when they are sad) while others are motivated to select music that is congruent with their mood (i.e., selecting sad music when they are sad; Taruffi and Koelsch, 2014). Whether a person selects mood-congruent or mood-incongruent music most likely depends on individual differences and social context. A previous study looking specifically at the interacting effects of mood and personality found that people who scored higher on a measure of global empathy, as well as the fantasy and personal distress subscales of the IRI, were more likely to listen to sad music when they were in a negative mood (mood-congruent). People who scored lower on measures of emotional stability were also more likely to listen to sad music when they were in a negative mood. Interestingly, global empathy scores were positively correlated with people's preferences to listen to sad music when in a positive mood (mood-incongruent), but in this case, the perspective taking subscale, rather than personal distress, was significant (Taruffi and Koelsch, 2014). The connection between these factors and their associations with pleasurable response to sad music is summarized in **Figure 1**.

## The Neuroscience Perspective

Neuroimaging techniques, including functional magnetic resonance imaging (fMRI), can be used to identify areas of the brain that are activated in response to certain stimuli and thus help uncover some of the processes related to the tragedy paradox. To date, however, no study has explored the neural correlates of pleasurable sadness in response to music. In this section, we will simply draw relevant inferences from the literature.

### Sadness in the Brain

#### Perception of Sadness and Sad Mood

The neural correlates of the experience of sadness are often investigated through the use of sad-mood induction tasks. In order to induce the intended feelings, these experiments generally have the participants reflect on sad, autobiographical events and/or view stimuli that express sadness, such as sad faces or sad films (Vytal and Hamann, 2010).

Changes in mood states are associated with activity changes in the anterior cingulate cortex (ACC) and in the insular cortex, two of the main regions of the cerebral cortex involved in the processing of feelings (Damasio, 1999). The two regions are interconnected (Mesulam and Mufson, 1982). Several studies using positron emission tomography (PET) or fMRI have reported heightened activity in both structures during the experience of sadness (Lane et al., 1997; Damasio et al., 2000; Lévesque et al., 2003; Habel et al., 2005). The ACC is also associated with social pain as the result of social exclusion (Macdonald and Leary, 2005), and the processing of sad faces (Killgore and Yurgelun-Todd, 2004).

Collectively, the hippocampus, parahippocampal gyrus, and amygdala are presumed to be important partners in the process of emotional learning and memory. The three areas are neuroanatomically connected (Pitkänen et al., 2000), and several studies have shown that they are functionally connected during the processing of emotional stimuli (Hamann et al., 1999; Kilpatrick and Cahill, 2003).

The hippocampus, parahippocampal gyrus, and amygdala are also associated with unpleasant experience, as higher activity was found in these regions when participants viewed unhappy faces and thought about sad past events (Posse et al., 2003; Habel et al., 2005). Increased activity in the amygdala and parahippocampal gyrus was also found, however, during the happy mood induction task (Habel et al., 2005), suggesting that these regions are not involved in processing sadness specifically, but rather are involved in processing salient emotional stimuli (Phan et al., 2002).

Areas in the frontal lobe are also implicated in processing sadness. A recent meta-analysis found that the superior frontal gyrus (BA 9), as well as an area slightly anterior to it [sometimes referred to as the medial frontal gyrus (BA 10)], were repeatedly activated during various sad mood induction tasks (Vytal and Hamann, 2010). The caudate nucleus, a region that is highly innervated by dopamine neurons and modulated by the ventral tegmental area (Faggin et al., 1990), was also involved in the same tasks (Vytal and Hamann, 2010). In addition, activity in the inferior frontal gyrus (IFG, BA 47) was revealed when directly comparing sad mood induction to happy mood induction (Habel et al., 2005).

#### Brain Correlates of Music-Evoked Sadness

The regions of the brain that are involved in processing feelings of sadness, in general, also appear to be implicated in the processing of feelings evoked by music. In a study in which participants listened to familiar music that they found sad or happy, sad pieces, compared to happy pieces, were associated with increased activation in the head of the caudate nucleus as well as the thalamus (Brattico et al., 2011). Increased activation in the thalamus has also been found during the processing of sad faces (Fusar-Poli et al., 2009).

Several studies on music and emotion have reported involvement of the hippocampus, parahippocampal gyrus, and amygdala (Blood and Zatorre, 2001; Baumgartner et al., 2006; Koelsch et al., 2006; Eldar et al., 2007). Specifically, in relation to sad music, music that induced a sad mood, judged by subjective reporting, was shown to correlate with increased blood oxygen level dependent (BOLD) signal in the hippocampus and the amygdala (Mitterschiffthaler et al., 2007).

A number of functional neuroimaging studies reported involvement of these regions in the perception of negative valence in music in particular. For example, music perceived as sad, as a result of it being either in a minor mode (Green et al., 2008) or producing low arousal and valence (Frühholz et al., 2014), was shown to correlate with increased activity in the parahippocampal gyrus. That region, along with the hippocampus, was also shown to be involved in responding to dissonant music that was found unpleasant (Blood et al., 1999; Koelsch, 2014). Because of their role in the encoding of memories, the parahippocampal gyrus, hippocampus, and amygdala may also play an important role in processing emotional events related to the music (Ford et al., 2011).

The superior frontal gyrus and the medial frontal gyrus appear to be associated with the perception of emotions in music as well; both regions were shown to be activated when contrasting the response to music in a minor key to music in a major key (Khalfa et al., 2005; Green et al., 2008).

#### Aesthetic Judgments

Aesthetic judgments include both the act of deciding whether or not an auditory stimulus is aesthetic in nature, and therefore not life-threatening, and the act of deciding whether the auditory stimulus is beautiful (Jacobsen, 2006). Neuroimaging studies of aesthetic judgment generally report activation in the frontal lobe cortices and the ACC. The orbitofrontal cortex (OFC) has been shown to be involved in various decision-making processes by linking past behaviors with their emotional byproducts (Bechara and Damasio, 2005). It is not surprising, then, that this general area is repeatedly recruited during tasks of aesthetic judgment (Jacobsen et al., 2006; Ishizu and Zeki, 2011). Other areas of the frontal lobe, including the superior frontal gyrus and the medial frontal gyrus (BA 9 and 10), were activated when judging the beauty of musical rhythms (Kornysheva et al., 2010) and geometric shapes (Jacobsen et al., 2006). Greater activation in the ACC is also observed when aesthetic judgments are made about both art and music (Kornysheva et al., 2010; Ishizu and Zeki, 2011).

### Pleasure in the Brain

Activation of the ventral striatum and the nucleus accumbens during pleasurable music listening was first reported in a study by Blood and Zatorre (2001) and has since been encountered by several investigators using both fMRI (Menon and Levitin, 2005; Koelsch et al., 2006; Salimpoor et al., 2013) and PET (Brown et al., 2004; Suzuki et al., 2008). Salimpoor et al. (2011) showed that there is a direct relationship between increases in pleasure during music listening and hemodynamic activity in the right nucleus accumbens, an area that is part of the ventral striatum. The study also found that the caudate nucleus was involved in the anticipation of a pleasurable response to musical excerpts (Salimpoor et al., 2013).

In a recent fMRI study, Trost et al. (2012) found that music deemed to have positive emotional valence engages the ventral striatum selectively but in a lateralized fashion. Musical stimuli with positive valence and low arousal, those leading to tenderness, increase activity in the right ventral striatum whereas musical stimuli with positive valence and high arousal, those leading to joy, increase activity in the left ventral striatum.

Using connectivity analysis, Menon and Levitin (2005) showed significant interactions during music listening between the ventral striatum, the hypothalamus, and the ventral tegmental area of the brainstem, which is involved in the production and dissemination of the neurotransmitter dopamine. The results also suggested that activation of the ventral striatum in response to pleasurable music is modulated by activity in both the ventral tegmental area and the hypothalamus (Menon and Levitin, 2005).

Several studies have reported activity changes in the ACC and the insula during the experience of pleasure in response to musical stimuli. In their 2001 study, Blood and Zatorre demonstrated that an increase in the subjective experience of the intensity of aesthetic chills, as well as increases in physiological measures of arousal (i.e., heart rate, muscular activity and respiration rate) occurred concurrently with a rise in cerebral blood flow within the insula and the ACC. Increased activation of the insula was also observed while participants listened to pleasant musical excerpts (Brown et al., 2004; Koelsch et al., 2006).

In an attempt to identify the brain regions involved in processing specific emotions in music, Trost et al. (2012) showed that listening to classical instrumental music identified as high in arousal level and positive in valence (such as joy) led to increased respiration rate together with increased activity in the insular cortex. By contrast, listening to musical excerpts that were rated low in level of arousal, regardless of valence, correlated with increased activity in the ACC (Trost et al., 2012).

The OFC has been shown to be involved in the pleasurable response that results from music listening (Blood and Zatorre, 2001; Menon and Levitin, 2005) and the IFG was activated in response to pleasant, consonant music when compared to unpleasant, dissonant music (Koelsch et al., 2006).

In addition, there is evidence to suggest that the thalamus might be involved in the pleasurable response to emotional stimuli as increased cerebral blood flow in the region was also found to be positively correlated with intensity ratings of chills in response to pleasurable music (Blood and Zatorre, 2001) and during self-reported judgments of pleasantness across different modalities (Kühn and Gallinat, 2012).

#### Summary and Neurobiological Framework

The results from the neuroimaging experiments suggest that pleasurable sadness is a consequence of several coordinated neural processes. When a sad musical stimulus reaches the brain, its emotional valence is assessed on the basis of its acoustical properties (i.e., mode, timbre, and loudness), which depends on processing in the brainstem and primary and secondary auditory cortices (Liégeois-Chauvel et al., 1998; Pallesen et al., 2005; Juslin and Västfjäll, 2008). The experience of sadness would result from previously learned associations with the auditory stimulus, the emotional content of the associated words, and the parallel changes in body state induced by the emotional process (Baumgartner, 1992; Ali and Peynircioglu, 2006; Khalfa et al., 2008; Juslin et al., 2013). Linking past experiences with emotional content recruits the network of the parahippocampal gyrus, the hippocampus, and the amygdala (Killgore and Yurgelun-Todd, 2004), whereas feelings of the specific emotion are mediated by a set of subcortical nuclei in the brainstem and basal ganglia, as well as prefrontal, anterior cingulate, and insular cortices (Damasio and Carvalho, 2013).

The recognition of consonance or dissonance in the musical stimulus, previous associations and familiarity associated with the musical stimulus, and affective information, such as the emotions and feelings that are perceived or induced by the piece of music (Juslin, 2013), all serve as input for the making of aesthetic judgment, whose coordination depends on the frontal cortices, including those in the superior frontal gyrus, the middle frontal gyrus, the OFC, and the ACC (Jacobsen et al., 2006; Ishizu and Zeki, 2011).

It is often the case that judging a piece as beautiful leads to feelings of pleasure, yet this is not always true (Juslin, 2013). When a subsequent pleasurable response emerges, it can come in the form of increases in emotional arousal, which has been shown to be correlated with increased feelings of pleasure (Salimpoor et al., 2009), and in the form of episodic memories triggered by the music, which can also lead directly to pleasure (Janata, 2009). The experience of pleasure is correlated with activity in the ventral striatum, specifically in the nucleus accumbens, the caudate nucleus, and the orbitofrontal cortex (Berridge and Kringelbach, 2008).

## The Clinical Implications of Pleasurable Response to Sad Music

The most common of mood disorders, major depressive disorder (MDD), is characterized by persistent feelings of unhappiness and is often accompanied by an inability to experience pleasure (anhedonia) and a disturbed ability to describe or identify emotions (alexithymia). Investigating the response of depressed patients to negative-valence stimuli such as sad music could provide another perspective in understanding the paradox of pleasurable sadness.

Depression appears to influence how one perceives and experiences sadness. Participants with MDD show prolonged or heightened activity in the amygdala and ACC when they process stimuli that express negative valence (Siegle et al., 2002) and increased activity in the insula and ACC when experiencing a sad mood (Mayberg et al., 1999; Keedwell et al., 2005). Given the role of these brain regions in reward processing and emotional regulation (Langenecker et al., 2007), it is possible that this pattern of activity reflects the increased intensity and salience of negative affect that is often associated with depression.

Investigations of the listening habits of individuals diagnosed with depression have produced informative results (Bodner et al., 2007; Wilhelm et al., 2013). Depressed patients expressed an intensified response to sad-sounding music when compared to healthy controls (Bodner et al., 2007). Furthermore, such patients evaluated negative-valence music as significantly more sad and angry than did healthy controls (Punkanen et al., 2011). When depressed individuals and healthy controls were asked about their reasons for listening to music, the degree to which depressed participants referenced engaging with music in order to "express, experience, or understand emotions" was significantly higher than in healthy controls (Wilhelm et al., 2013). This difference was interpreted as evidence for the notion that bringing emotions to the forefront of attention, in this case through music listening, is a way of regulating and ultimately reducing the negative affective state that is indicative of depression (Chen et al., 2007).

Neuroimaging studies have shown that depression alters the neural response to music that is found pleasurable. Significant deactivation was found in the medial OFC and the nucleus accumbens/ventral striatum when depressed patients listened to their favorite pieces of music. Of interest, no differences were found between patients and healthy controls relative to how much they reported actually enjoying the musical excerpts (Osuch et al., 2009), suggesting that the neural processing of rewarding stimuli is still affected in patients with depression even when the feelings associated with the rewarding stimuli are not. A related study found that when listening to pleasant musical stimuli, activity in the OFC, as well as the nucleus accumbens, insula, ACC, ventromedial prefrontal cortex (VMPFC), and the lateral hypothalamus, was negatively correlated with measures of anhedonia (Keller et al., 2013).

In sum, depression is associated with varied neurobiological differences in emotional processing and experience. The fact that these differences are also seen in response to music implies that experience of pleasurable sadness to aesthetic stimuli can be influenced by mental illness. Furthermore, the distinct neural activity patterns seen in depressed patients when they respond to rewarding stimuli occurred in the regions known to be involved in processing enjoyable music. This suggests that music may be well suited to target and ameliorate the diminished experience of pleasure associated with various mood disorders (Salimpoor et al., 2013).

## Discussion

### Proposed Framework

Results from various disciplines suggest that pleasure in response to sad music is related to a combination of the following concurrent factors:

	- a. Evocation of memories related to particular musical pieces or pieces similar to them;
	- b. Personality traits;
	- c. Social context;
	- d. Current mood.

We propose that the ways in which these various factors interact to produce pleasure when listening to sad music can be understood in the perspective of homeostatic regulation. Homeostasis refers to the process of maintaining internal conditions within a range that promotes optimal functioning, well-being and survival (Habibi and Damasio, 2014). Emotions, which refer to a set of physiological responses to certain external stimuli, were selected in evolution because they favor the reestablishment of homeostatic equilibrium (Damasio and Carvalho, 2013). Feelings are experiences of the ongoing physiological state and range in their valences, from positive and pleasurable to negative and potentially painful. The valence of the feelings as well as their intensity help signify whether the associated stimulus or behavior is adaptive and should be avoided or sought in the future. Feelings are a critical interface in the regulation of life because they compel the individual organism to respond accordingly. Feelings of pleasure are a reward for achieving homeostatic balance and encourage the organism, under certain conditions, to seek out the behaviors and stimuli that produced them. Feelings of pain, in general, and mental pain specifically, on the other hand, signify homeostatic imbalance and discourage the endorsement of the associated stimuli and behaviors.

When and how music induces a pleasurable response may depend on whether a homeostatic imbalance is present at the outset and whether music can successfully correct the imbalance. There is already evidence to suggest that music has deeply rooted connections to survival (Huron, 2001). Making music encourages group cohesion and social bonding, which can lead to the successful propagation of the clan (Brown, 2000). It may also be a sign of evolutionary and sexual fitness, thus fostering mate selection (Hauser and McDermott, 2003). The fact that music listening has the capacity to communicate, regulate, and enhance emotions further suggests that music can be an effective tool in returning an organism or a group to a state of homeostatic equilibrium (Zatorre and Salimpoor, 2013).

The pleasurable response caused by listening to sad music is a possible indication that engaging with such music has previously been capable of helping restore homeostatic balance. Given that various psychological and emotional rewards (e.g., emotional expression, emotional resolution, catharsis) are shown to be associated to a higher degree with sad music than happy music (Taruffi and Koelsch, 2014), it may be that sad music, in particular, is preferentially suited for regulating homeostasis both in general physiological terms and in mental terms. This notion is further supported by the fact that listening to sad music engages the same network of structures in the brain (i.e., the OFC, the nucleus accumbens, insula, and cingulate) that is known to be involved in processing other stimuli with homeostatic value, such as those associated with food, sex, and attachment (Zatorre, 2005). This is not to say that these regions are unique to the processing of sad music or that other types of music may not be useful for homeostatic regulation. We believe that pleasurable responses to negative-valence music stimuli are best understood through their ability to promote homeostasis.

The lack of a pleasurable response to sad music might mean either that no homeostatic imbalance was present or that the musical stimuli failed to correct the imbalance. It is known that pleasure in response to higher-order stimuli (e.g., money and music) requires learning (Berridge and Kringelbach, 2008), and thus sad music may not evoke a pleasurable response if such a stimulus never became associated, through repeated exposure, with the psychological benefits that influence homeostatic regulation.

There are many ways in which a homeostatic imbalance can arise and there are numerous ways in which sad music can correct such imbalances. For example, an individual who is currently experiencing emotional distress and has an absorptive personality will be able to listen to sad music to disengage from the distressing situation and focus instead on the beauty of the music. Listening to sad music would correct the imbalance caused by emotional distress and the experience would be pleasurable. In the absence of emotional distress and the ensuing negative mood, however, a person who is highly open to experience, and prefers novel and varied stimulation, could find such diverse stimulation in sad music because of the range and variety of feelings associated with it and thus experience an optimal state of well-being (see **Figure 2** for details).

Viewing the tragedy paradox in terms of humanity's deeply rooted biological need to maintain a variety of basic psychological and physiological balances and relative stability over time should allow researchers to focus less on the individual and situational factors associated with enjoying sad music and more on how these factors interact with each other. We believe that this comprehensive focus will ultimately permit a better understanding of the questions that persist on this issue.

### Future Directions: A. Neuroimaging Research

The published neuroimaging studies on pleasurable sadness in music are complex and difficult to synthesize due to differences in methodologies, stimuli, analysis, and participant population. While there is some agreement regarding which brain regions are involved in the process, the exact role that each region plays remains unclear. Neuroimaging studies should attempt to elucidate the contribution that different parts of the brain may make to the pleasurable response induced by music by exploring three lines of research: (1) directly comparing music that is perceived as sad but not found pleasurable with music that is perceived as sad and found pleasurable; (2) exploring how the emotional response to sad music compares to the emotional response to other types of sadness, such as sadness due to the loss of a loved one or being ostracized; and (3) considering specifically how the interaction between mood and personality alters preference for sad music.

### Future Directions: B. Using Sad Music in Music Therapy

Because of its proven ability to affect a host of neural processes, including emotions, mood, memory, and attention, music is uniquely suited to serve as a therapeutic tool for psychological intervention. The concept of using music to heal has been around for centuries, but it was only in the second half of the 20th century that music therapy was first considered an established health profession with standardized academic and clinical training requirements and a board-certification program (American Music Therapy Association, 2015).<sup>1</sup> Today, music therapy is used to treat a wide range of mental and physical ailments, including acute and chronic pain (Cepeda et al., 2013), brain trauma (Bradt et al., 2010), autism spectrum disorder (Gold et al., 2006), dementia (Vink et al., 2004), schizophrenia (Mössler et al., 2011), and mood and anxiety disorders (Koelsch et al., 2006; Maratos et al., 2008). Controlled clinical trials have found that music therapy, in conjunction with standard medical care, can have a significant positive effect on various symptoms associated with these illnesses (Gold et al., 2009).

Music can be particularly useful for the treatment of depression given its ability to effectively regulate mood. In general, music therapy techniques that are currently in practice for depression intervention fall into two broad categories: active therapy, which involves playing, writing, and/or improvising music, and receptive therapy, which involves passively listening to music. In active music therapy, the patient and the therapist generally create music together and then engage in a reflective discussion regarding the meaning behind the compositional experience (Erkkilä et al., 2011). In receptive music therapy, preselected music often serves to change the patient's mood or to facilitate guided imagery, relaxation, or motivational exercises. In other forms of receptive music therapy, music is used to stimulate a therapeutic discussion regarding the thoughts, feelings, and memories that the music evokes (Grocke et al., 2007). Both active and receptive music therapy can be beneficial because they allow for various themes and emotions to be experienced and expressed indirectly and without the need for language (Erkkilä et al., 2011).

As previously stated, sad music, to a higher degree than other types of music, is associated with certain psychological rewards, such as regulating or purging negative emotions, retrieving memories of important past events, and inducing feelings of connectedness and comfort (Taruffi and Koelsch, 2014). Therefore, incorporating sad pieces that are found to be pleasurable into receptive music therapy could augment the efficacy of such treatments in ameliorating the symptoms of depression. Actively exploring, with the guidance of the therapist, the natural and spontaneous reactions to sad pieces of music in particular could help patients better comprehend and manage their response to negative stimuli in general, providing them with new ways of coping with sadness and connecting with others. Research into the ways in which sad music becomes enjoyable may inform existing music therapy practices for mood disorders by furthering the understanding of such disorders, offering possible mechanisms of change, and providing support for the use of personalized medicine in mental health care.

## Conclusion

The literature on the enjoyment of sad music is limited and at times conflicting, but it allows us to draw some general conclusions. Overall, scholars from various disciplines agree that music that conveys sadness can be found pleasurable because, in art, the immediate social and physical circumstances usually associated with the negative valence are not present. In addition, it may be that music that pertains to grief and sorrow is more often found beautiful than music that pertains to joy and happiness because it deals with eudemonic concerns such as self-expression, social connectedness, and existential meaning. Finally, sad music can help individuals cope with negative emotions in certain situations, depending on their personality, their mood, and their previous experiences with the music.

We do not yet have a detailed account of how these factors interact to produce a pleasurable response. Neuroimaging studies suggest that the response is the product of a coordinated effort between various regions of the brain known to be involved in emotional recognition, conscious feeling, aesthetic judgment, and reward processing. Future studies, in particular those that use neuroimaging techniques, should aim at manipulating mood and personality independently to determine the effect that each has on affective responses to sad music. Findings from such studies could provide new evidence for the ways in which everyday stimuli can become rewards and pave the way for new treatments of mood disorders.

<sup>1</sup>http://www.musictherapy.org/about/history/

## References


Hindemith, P. (1961). A Composer's World. New York: Doubleday and Company.

Hospers, J. (1969). Introductory Readings in Aesthetics. New York: The Free Press.


during single-trial self-induced sadness. Neuroimage 18, 760–768. doi: 10.1016/s1053-8119(03)00004-1


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Sachs, Damasio and Habibi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.