# BRAIN AND ART

TOPIC EDITORS: Idan Segev, Luis M. Martinez and Robert J. Zatorre

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2014 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

**ISSN** 1664-8714 **ISBN** 978-2-88919-360-8 **DOI** 10.3389/978-2-88919-360-8

# *ABOUT FRONTIERS*

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

# *FRONTIERS JOURNAL SERIES*

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing.

All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

# *DEDICATION TO QUALITY*

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view.

By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

# *WHAT ARE FRONTIERS RESEARCH TOPICS?*

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area!

Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# BRAIN AND ART

**Topic Editors:**

**Idan Segev**, The Hebrew University of Jerusalem, Israel

**Luis M. Martinez**, Spanish National Research Council (CSIC), Spain

**Robert J. Zatorre**, McGill University, Canada

*"Music and the Brain" – Image by Christian Gaser*

# Table of Contents

#### *Brain and Art*

Idan Segev, Luis M. Martinez and Robert J. Zatorre

# *Brain and the Visual Arts*


Elina Pihko, Anne Virtanen, Veli-Matti Saarinen, Sebastian Pannasch, Lotta Hirvenkari, Timo Tossavainen, Arto Haapala and Riitta Hari

#### *Artistic Explorations of the Brain*

Eberhard E. Fetz


Mengfei Huang, Holly Bridge, Martin J. Kemp and Andrew J. Parker


*(Artist Perspective)*


# *The Musical Brain*

*Distinct Inter-Joint Coordination During Fast Alternate Keystrokes in Pianists With Superior Skill*

Shinichi Furuya, Tatsushi Goda, Haruhiro Katayose, Hiroyoshi Miwa and Noriko Nagata


Jessica Phillips-Silver and Peter E. Keller

# *Dance in the Brain*

*The Impact of Aesthetic Evaluation and Physical Ability on Dance Perception*

Emily S. Cross, Louise Kirsch, Luca F. Ticini and Simone Schütz-Bosbach

*Practice of Contemporary Dance Promotes Stochastic Postural Control in Aging*

Lena Ferrufino, Blandine Bril, Gilles Dietrich, Tetsushi Nonaka and Olivier A. Coubard

# *Multi-Modal Artistic Processing in the Brain*


Fortunato Battaglia, Sarah H. Lisanby and David Freedberg


# Brain and art

#### *Idan Segev<sup>1</sup>\*, Luis M. Martinez<sup>2</sup> and Robert J. Zatorre<sup>3</sup>*

*<sup>1</sup> Department of Neurobiology and The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel*

*<sup>2</sup> Spanish National Research Council, Instituto de Neurociencias de Alicante, Sant Joan d'Alacant, Spain*

*<sup>3</sup> Montreal Neurological Institute, McGill University, Montreal, QC, Canada*

*\*Correspondence: idan@lobster.ls.huji.ac.il*

#### *Edited and reviewed by:*

*John J. Foxe, Albert Einstein College of Medicine, USA*

**Keywords: neuroesthetics, creativity, performing-arts, emotion, perception**

# **INTRODUCTION**

Museums, concerts, dance performances and films attract billions of people worldwide. Indeed, visual art and music have been with us essentially from the beginning of our species. This must mean that Art (and hence, artists) succeeds in tapping into particular and powerful mechanisms in our brain. This artistic success also means that understanding the phenomenon of Art is a key challenge for modern brain research.

Could we understand, in biological terms, the unique and fantastic capabilities of the human brain to both create and enjoy art? In the past decade neuroscience has made a huge leap in developing experimental techniques as well as theoretical frameworks for studying emergent properties arising from the activity of large neuronal networks. These methods, including MEG, fMRI, sophisticated data analysis approaches and behavioral methods, are increasingly being used in many labs worldwide, with the goal of exploring the brain mechanisms that correspond to the artistic experience.

The 37 articles composing this unique *Frontiers Research Topic* bring together experimental and theoretical research, linking state-of-the-art knowledge about the brain with the phenomena of Art. They cover a broad scope of topics, contributed by world-renowned experts in vision, audition, somato-sensation, movement, and cinema. Importantly, as we felt that a dialog among artists and scientists is essential and fruitful, we invited a few artists to contribute their insights, as well as their art.

In this context, we would like to highlight a key similarity between artists and scientists, in particular neuroscientists. Both art and science seek to explain—each with their own unique set of concepts and tools—the *unknown*. Both science and art often claim to seek the "truth." The focus of modern brain research is to unravel the physical basis that underlies the emerging capabilities of the brain—perception, behavior, emotions and brain-related diseases, whereas the arts elaborate intricately and persistently on these brain-related properties including diseases (see several papers in this volume on art and brain damage) exploring, and in this process also expanding, the range of brain's perceptual and emotional capacity.

This is a unique forte of the arts, which utilize the powerful capacity of the brain to adaptively (plastically) change following perception and action, and propose new ways to view and interpret the world. Indeed, as highlighted in the present volume, art may invoke new "brain states" that are otherwise less likely to be activated by our day-to-day "reality." Art therefore serves to explore and expand the potential capacity of the brain, e.g., via the recent invention of abstract art (see this volume).

In that sense, brain research and the arts are closely interlinked; modern brain researchers have much that is new to say about the phenomenon of art from a neuroscientific perspective (hence the new term "Neuroesthetics" coined by Semir Zeki), whereas artists have a lot to say about how the brain (they typically use the term "mind," as do cognitive psychologists) deals with what it perceives (consciously or unconsciously). This artistic exploration of the mind is well summarized in the surrealistic manifesto by André Breton who said "*It is by pure psychic automatism by which one intends to express verbally, in writing or by other method, the real functioning of the mind. Dictation by thoughts without any control exercised by reason, and beyond aesthetics or moral preoccupation*."

Joan Miró said that "*art is the search for the alphabet of the mind."* This volume reflects the state-of-the-art search to understand the neurobiological alphabet of the Arts. We hope that the wide range of articles in this volume will be highly attractive to brain researchers, artists and the community at large.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 05 May 2014; accepted: 07 June 2014; published online: 27 June 2014. Citation: Segev I, Martinez LM and Zatorre RJ (2014) Brain and art. Front. Hum. Neurosci. 8:465. doi: 10.3389/fnhum.2014.00465*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2014 Segev, Martinez and Zatorre. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# *Brain and the Visual Arts*


*Robert Pepperell (Artist Perspective)*


Elina Pihko, Anne Virtanen, Veli-Matti Saarinen, Sebastian Pannasch, Lotta Hirvenkari, Timo Tossavainen, Arto Haapala and Riitta Hari

#### *Artistic Explorations of the Brain*

Eberhard E. Fetz


Mengfei Huang, Holly Bridge, Martin J. Kemp and Andrew J. Parker


Edward A. Vessel, G. Gabrielle Starr and Nava Rubin


# An artistic exploration of inattention blindness†

# *Ellen K. Levy \**

*Visiting Scholar, New York University, New York, NY, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Luis M. Martinez, Universidad Miguel Hernández, Spain*

*Peter Hillman, Bloomfield Science Museum Jerusalem, Israel*

#### *\*Correspondence:*

*Ellen K. Levy, 40 East 19th Street #3R New York, NY 10014, USA. e-mail: levy@nyc.rr.com*

An experiment about inattention blindness was conducted within the context of an art exhibition as opposed to a laboratory context in order to investigate the potential of art as a vehicle to study attention and its disorders. The project utilized a flash animation, *Stealing Attention*, that was modeled after the movie by Simons and Chabris (1999) but with significant experimental differences, involving context and staging, the emotional salience of the objects depicted, and the prior art viewing experience of participants. The study involved two components: observing if viewers watching an animation in a gallery could be distracted from noticing the disappearance of stolen museum antiquities (the targets) by the overlaid flashing images of a card game (the distractors) and then observing whether repetition of the depicted targets throughout the gallery installation could facilitate a redirection of attention that allowed viewers to perceive the targets not initially noted in the animation. My findings were that, after viewing the entire installation and then re-viewing the animation, 64% of the viewers who did not initially remark on the targets in the animation were then able to see them. The discussion elaborates on these findings and then considers ways in which the implications of inattention blindness paradigms might be more fully rendered by uniting insights from the two disciplines of art and neuroscience than by either alone.

**Keywords: attention, inattention blindness, art installation, gallery, museum, art context, animation, antiquities**

# **INTRODUCTION**

In recent years considerable literature has been published on attention by art historians, historians of science, and philosophers in addition to neuroscientists (e.g., Baxandall, 1995; Crary, 1999; Hagner, 2003; Rollins, 2004; Stafford, 2007). They have made important contributions that specifically highlight attention in relationship to art, and their insights have informed our understanding of the attentional system. The fact that art, itself, is constitutive of attentional phenomena suggests why it should hold special interest for neuroscientists. My perspective as an artist has allowed me to locate a point of entry into this rich historical research through exploring inattention blindness, which is the intriguing phenomenon of not being able to see things in plain sight (Mack and Rock, 1998). This paper examines my art experiment, including its challenges and implications. It also explores the possibility that certain artworks, when engaged, can serve as an attentional training ground.

After introducing the topic of inattention blindness, I describe examples of its exploration in several scientific studies. I then relay my own experience in staging an experiment about this phenomenon in an art gallery, including methods, results, and possible confounds. The attentional system and ability to focus are subsequently considered within a broad context of learning. This is followed by a discussion of inattention blindness in art history and then by an analysis of some of the related neuroscience, such as the ability to make attention switches. Finally, I consider why inattention blindness can be more fully rendered through uniting insights from multiple disciplines.

# **INATTENTION BLINDNESS**

The phenomenon of inattention blindness or, more formally, "inattentional blindness" as coined by psychologists Mack and Rock (1998) has been examined by scientists for several decades. Inattention blindness is related to other phenomena, such as the "attentional blink" (the failure to detect a second salient target occurring in succession after the first target) and "change blindness" (the inability of our visual system to detect alterations to something staring us straight in the face); all engage similar principles although change blindness also involves memory. A variety of methods are used by neuroscientists to accomplish the visual disruption; they may insert a blank screen or use a "flicker," a "blink," or diverters like "mudsplashes." A variety of tools can implement the disruption, including stereoscopes, visual masking, and dichoptic methods. Using dynamic visual displays, a series of studies of inattention blindness were conducted in the 1970s and 1980s during which observers were asked to report on a task. As a result of the assignment, viewers often did not notice staged events, causing neuroscientists to conclude that people only remember those objects that receive their focused attention.

Other factors play a role in inattention blindness; cultural bias regarding what is noticed is, in itself, a whole area subject to extended study, as are pre-attentive processes. Repeated trials appear to make a difference with respect to perception. Vision scientists Maljkovic and Nakayama (1994) reported that in search for a singleton target, when the unique feature varies randomly from trial to trial, the deployment of focal visual attention is faster when the target feature is the same as in past trials than when it is different, a phenomenon called priming of pop out. (Note that the term "pop out" as used here differs from its use in pop-out ads on the internet, although advertisers clearly bank on the phenomenon of subliminal priming.) Performance was also enhanced when the target occupied the same spatial position on consecutive trials (Maljkovic and Nakayama, 1996). However, psychologists Treisman and DeSchepper (1996) found that ignoring a distractor on one trial made it easier to ignore the same item on subsequent trials. Inattention blindness has been explored by Neisser and Becklen (1975) and Mack and Rock (1998), and expanded upon by psychologists Simons and Chabris (1999), among others. In the latter's well known study, "Gorillas in our midst: sustained inattentional blindness for dynamic events," a movie sequence of a complex basketball scene was shown to observers who were directed to count the number of ball exchanges made in a ball game. During the movie, few viewers noticed that an actor dressed in a gorilla suit walked through the scene. On the basis of their results, Simons and Chabris suggested that the likelihood of noticing an unexpected object depends on the similarity of that object to other objects in the display and on the difficulty of the priming monitoring task. They further concluded that observers attend to objects and events; the spatial proximity of the critical unattended object to attended locations did not appear to influence detection.

<sup>†</sup> An abbreviated, earlier version of "An Artistic Exploration of Inattention Blindness" was published by *Technoetic Arts: A Journal of Speculative Research*, 8, 93–99 in 2010.

# **STAGING INATTENTION BLINDNESS IN AN ART GALLERY**

To study inattention blindness in the context of an art exhibition, I utilized an animation that resulted from my collaboration with Michael E. Goldberg, Director of the Mahoney Center for Brain and Behavior, Columbia University, NYC. My study involved two components: observing if viewers watching an animation in a gallery could be distracted from noticing the disappearance of stolen museum antiquities (the targets) by the overlaid flashing images of a card game (the distractors) and then observing whether repetition of the depicted targets throughout the gallery installation could facilitate an "attention switch" that allowed viewers to perceive the targets not initially noted in the animation when re-viewing it. The reasoning was that the informal "learning" taking place through contextual cueing might cause viewers to recognize the overlooked targets.

I became especially interested as an artist in the boundary between normality and pathology. Part of my motivation was to test first-hand whether the embodied knowledge of images, emotion, and social context that is deeply embedded in art practices is capable of supplementing neuroscience's understanding of attention and its disorders. Part of the controversy over the diagnosis of attention deficit hyperactivity disorder (ADHD) involves determining whether ADHD symptoms such as distraction fall within the bounds of normal perception. The construction of an installation and collaborative animation allowed participants to experience the constraints on the attentional system. Showing the animation within the experimental context of a gallery setting provided a way for viewers to experience a common failure of perception along with an opportunity to reflect upon this experience. The project raised four questions: (1) What does attention make possible? (2) Can attention be shifted? (3) Does art training help prevent distraction? and (4) Can art train attention? My findings showed that, after viewing the entire installation and then re-viewing the animation, 64% of the viewers who did not initially remark on the targets in the animation were then able to see them. I have used the term "remark" rather than "see" because it is possible that pre-attentive viewing had occurred but had not yet been brought to conscious awareness. I discuss the implications of these results with regard to my premise that art offers a training system for the attentional system.

Images of looted Iraqi antiquities were programmed to gradually disappear over the course of a 3-min animation, and the distraction of viewing hands with flashing cards made them hard to discern (**Figure 1**). A directive was issued at the onset of the animation to "count the number of times the Queen of Hearts appears." After one playing, viewers were questioned about what they had observed; those who did not see the targets were invited to walk around the gallery and then re-view the animation. The aim was to assess whether the repetition of images of looted objects throughout the gallery in static displays could cause the targets to become more salient and result in viewers redirecting their vision from the foreground to the background of the animation.

The design of my own artistic study was different from the scientific studies just considered. As far as I am aware, art experiments that explore inattention blindness are seldom conducted. In addition, psychophysical tests are not frequently conducted in settings apart from laboratories, and I wanted to determine if a gallery had any advantages over these situations. In fact, scientists, themselves, are increasingly investigating the operations of vision under more natural conditions. As an early example, Neisser (1982) demonstrated the value of studying animals under naturalistic conditions. Unlike such experiments, however, my art installation, *Stealing Attention*, constituted a far from neutral test. It referenced the 2003 US invasion of Iraq and conceivably aroused some of the strong emotions many Americans felt in being led into war on a misleading premise. The gallery exhibition broadened the parameters of objective scientific testing in that the art encouraged viewers to identify emotionally with the loss of the Iraqi heritage signified by the looting of antiquities.

**FIGURE 1 | An animation overlay of images of hands playing "three-card monte" (http://www.complexityart.com/subs/images/flash/stealing\_attention\_feldman.mov).**

The distractors in my art experiment were hands with cards that flashed rapidly and were intended to symbolize the game Three-Card Monte. This game is directly in line with other con games in which the card dealer's rapid hand movements keep the person placing the bet from noticing the removal of the winning card. I based some of my images in the animation and throughout the entire installation on works by Caravaggio and Georges de la Tour that dealt with the theme of card thefts. These are images well known to artists and art historians. Many museum goers will recall the mid-sixteenth century work titled *The Conjuror*, by Hieronymus Bosch, which is a study of distraction related to Three-Card Monte. As Macknik et al. (2008) pointed out, in Bosch's artwork the magician performs the shell game for a crowd in medieval Europe, while pickpockets steal the belongings of the distracted spectators.

In psychological parlance, the hand movements of the card dealer in our animation were "distractors" intended to direct attention away from the "critical stimulus" or true targets (the removal of antiquities). The animation symbolically linked the Iraqi invasion and stolen antiquities with the Bush administration's own hidden objectives. In my interpretation, the administration's false claims of weapons of mass destruction were meant to distract the public from the real targets of invading Iraq and toppling Saddam Hussein. Nevertheless, I underestimated the difficulty of choosing a disappearing, partially hidden object to be the stimulus that would capture the viewer's attention, especially when distracted.

# **MATERIALS AND METHODS**

The audience for the exhibition comprised predominantly gallery and museum goers, including scientists, students, and the general public. My study was designed to assess the effects of gallery contextualization upon attentional shifts. My assumption was that art audiences will sometimes have developed special skills; frequent gallery-goers often learn to look intensely and compare viewing works of art with prior experiences. The installation was designed to foster such informal learning through repeating the depictions of similar objects (images of both the targets and distractors) in different media as the viewer moved through the exhibition space. My exhibition offered an opportunity to try to assess the influence of an esthetic environment to promote informal learning; commercial galleries are often conditioned by trends and will rarely accommodate this kind of interest. Written materials accompanied the exhibition, including the title of the exhibition (*Stealing Attention*), signage (the names of the art works displayed and other information), and a press release; all provided minimal clues as to the content of the exhibition. Another advantage of a public exhibition for a psychophysical test is that serious art visitors will often be engaged in visual search and discrimination tasks. Although viewers are generally free to wander at will, the layout and flow through gallery spaces are often carefully crafted. For example, many artists and curators juxtapose specific objects and images to build a totality of relationships that offer more as a whole than when seen individually. To prompt viewers who did not initially see the targets in the animation after several viewings, I had placed static images of the targets throughout the installation in order to re-direct their attention to the targets when they returned to the animation. I therefore considered their possible movements through the space and installed static works that could provide repeated cues.

I asked a series of questions to determine what viewers saw before and after moving through the exhibition space<sup>1</sup>. To a limited extent I was able to assess the involvement of people by:


My study was repeated in several different contexts and with a variety of formats. The animation that I designed with Goldberg was modeled after the Simons and Chabris (1999) movie, but with significant differences. The Flash animation program was randomized both positionally and temporally and prevented the viewer from predicting what card would flash and where it would be located on the screen or from determining what antiquity, assuming it was perceived, would next be removed. In each cycle all nine images of the hands of Three-Card Monte players were displayed once and were taken from a pool of nine "cells" of images of hands "playing" cards. Going through one cycle of nine random positions took approximately 2.7 s (0.3 s × 9). One of the nine cells showed the Queen of Hearts. It stayed on the screen for about 300 ms. The construction of the animation included the additional image of a yellow circle that preceded each appearance of the image of the Queen of Hearts with which it was temporally linked. It was on view for only a moment, thus serving as a "flicker" that further distracted the viewer from noticing the disappearing relics. Every third time the yellow circle appeared, a target disappeared from one of the three depicted shelves in the background of the animation. It took 30 cycles to go from an image of 10 relics on three shelves to three empty shelves. At this point approximately 81 s had passed, and the program then displayed a gradually fading mound of rubble suggestive of the aftermath of the looting of the museum. The program then paused for 20 s before starting the next iteration. I learned by much trial-and-error what conditions would best foster recognition of the phenomenon in the context of an art exhibition. To collect as much data as possible, I created both gallery-situated and studio-situated experimental situations that allowed me to assess the presence of re-directed viewing among a sample of participants.

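
The timing figures above can be checked with a short calculation. The following is a minimal sketch in Python, not the original Flash program: all constant and function names are hypothetical, and it assumes that the yellow circle appears once per cycle (because the Queen of Hearts cell is shown once per cycle) and that the nine cells are shuffled over nine screen positions.

```python
import random

# Values taken from the description of the animation.
CELL_DURATION_S = 0.3    # each hand image stays on screen for about 300 ms
CELLS_PER_CYCLE = 9      # nine cells; one of them shows the Queen of Hearts
CYCLES_PER_REMOVAL = 3   # a relic vanishes every third yellow-circle cue
TOTAL_RELICS = 10        # relics on the three shelves at the start


def cycle_duration_s():
    """Approximate length of one cycle of nine cells, in seconds."""
    return CELL_DURATION_S * CELLS_PER_CYCLE


def cycles_to_empty_shelves():
    """Cycles needed to remove every relic (assumes one cue per cycle)."""
    return TOTAL_RELICS * CYCLES_PER_REMOVAL


def seconds_to_empty_shelves():
    """Approximate elapsed time until all three shelves are empty."""
    return cycles_to_empty_shelves() * cycle_duration_s()


def relics_remaining(cycles_elapsed):
    """Relics still visible after a given number of completed cycles."""
    removed = min(TOTAL_RELICS, cycles_elapsed // CYCLES_PER_REMOVAL)
    return TOTAL_RELICS - removed


def random_cycle(rng):
    """One cycle as (cell, position) pairs: each cell is shown exactly once,
    in shuffled order and at a shuffled screen position (an assumption)."""
    cells = list(range(CELLS_PER_CYCLE))
    positions = list(range(CELLS_PER_CYCLE))
    rng.shuffle(cells)
    rng.shuffle(positions)
    return list(zip(cells, positions))


if __name__ == "__main__":
    print(round(cycle_duration_s(), 1))          # seconds per cycle
    print(cycles_to_empty_shelves())             # cycles until shelves are empty
    print(round(seconds_to_empty_shelves(), 1))  # seconds until shelves are empty
    print(random_cycle(random.Random()))         # one randomized cycle
```

Under these assumptions, ten relics removed at a rate of one per three cycles give 30 cycles, and 30 cycles of about 2.7 s each give roughly 81 s, which agrees with the figures reported above.
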
Data came from the following sources:

<sup>1</sup>The questions were loosely adapted from a 2003 study jointly conducted by the Isabella Stewart Gardner Museum and the Institute for Learning Innovation; see http://www.gardnermuseum.org/education/research.


In its first gallery viewing at Michael Steinberg Fine Arts, the animation occupied a fully lit room that contained several mixed-media two-dimensional representations (**Figure 2**). The exhibition title and signage were intended to offer suggestive clues as to the content of the exhibition without being "giveaways." The same antiquities were depicted in these art works as those shown within the Flash animation. These mixed-media works on wood contained figure–ground reversals and Necker illusion perspective reversals, in which the depiction alternatively recedes and juts forward. The depicted setting for these "thefts" was the interior of a museum sometimes identified through applied lettering as the National Museum of Iraq.

Upon leaving this entrance space, the visitor entered a corridor that had six art works, each 30 × 24″. These consisted of a combination of real and illusory images in which some of the forms were painted to look like collage. The images depicted were of hands appropriated from either Caravaggio or de la Tour paintings. They grasped looted Eastern antiquities that were partially hidden behind playing cards (**Figure 3**). The partial transparency of the hands and cards was very similar to the transparency of the targets in the animation.

The corridor opened into a back room, which had several more of my art works and into a smaller installation room that was painted black (**Figure 4**) and featured a single empty white shelf. Suspended just above the shelf were prints from a database of looted Iraqi objects, which featured images identical to those shown in the Flash animation. If someone viewed the entire exhibition and then returned to the animation, these additional clues were designed to make it more evident that the animation showed the disappearance of stolen Iraqi antiquities. The titles of the static works also provided such clues as *Conning Baghdad*, *Graffiti in Iraq*, and *Fleeced Chariot*.

During the Michael Steinberg Fine Arts exhibition, visitors were asked what they observed. During the opening and a pre-scheduled class visit, a video camera was positioned facing out from the wall toward the viewers. I eliminated all but three of those interviewed at the opening, which, with these exceptions, did not offer a consistent testing situation. Having observed the difficulty of target detection during the opening (where there were additional distractions), after the opening I slowed the rate of the flashing hands to make target detection easier.

**FIGURE 3 |** *Disappearing Act***, painted (illusory) and real collage.**

**FIGURE 4 | Black installation room, bare shelf with print-outs of looted Iraqi antiquities.**

In my studio space I was able to approximate the viewing circumstances of Michael Steinberg Fine Arts, including static art works and database prints. This enabled me to document the responses of several different groups of visitors to my studio, including artists, art historians, and musicians.

For a third display at Ronald Feldman Fine Arts, NYC, I created a multi-unit work (**Figure 5**).

Since I did not have the entire gallery space to work with, I needed to provide more clues to the viewer within the animation. This time, instead of the header "Would you like to play Three-Card Monte?", the text in the animation asked whether the viewer would like to play Three-Card Monte with George W. Bush. When the image of rubble appeared at the end of each iteration, an almost subliminal message appeared for a brief moment that identified the scene as "The National Museum of Iraq." Given the limited space, I also needed to rely on depicted still images in one part of the three-part work as a way to contextualize the animation. The animation was placed next to a painted collage, and both were juxtaposed with an empty shelf (over the monitor) from which prints of looted objects dangled. In this way a viewer could compare the images of missing antiquities in each of the three units and flesh out the connections between them. The viewer was therefore offered several ways of assimilating and correlating information.

**Table 1 | Summary results of targets seen at three locations.**

# **RESULTS**

A total of 82 individuals, predominantly from the arts, were observed in the experiment at all three locations. More than half the participants were female; all were adults and predominantly Caucasian. Overall, 32 of these 82 (39%) remarked on the targets after their initial viewing of the animation. Of the 50 who did not initially remark on the targets, 32 (64%) did after having seen additional visual prompts (**Table 1**).
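As a quick check of the arithmetic, the percentages reported above can be recomputed from the stated counts. The following is a minimal Python sketch and not part of the original study; the variable and function names are illustrative, and the cumulative figure is derived here rather than reported in the text.

```python
# Counts taken from the Results paragraph above.
total_viewers = 82
remarked_initially = 32
remarked_after_prompts = 32

# Viewers who did not remark on the targets after the first viewing.
not_initially = total_viewers - remarked_initially


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


initial_rate = percent(remarked_initially, total_viewers)
prompted_rate = percent(remarked_after_prompts, not_initially)

# Derived figure (an assumption of this sketch, not stated in the text):
# viewers who remarked on the targets at either stage.
cumulative_rate = percent(remarked_initially + remarked_after_prompts, total_viewers)

print(not_initially, initial_rate, prompted_rate, cumulative_rate)
```

The first two rates reproduce the 39% and 64% given in the text; the cumulative rate excludes anyone who saw the targets only in the full attention trial.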

During the scheduled visit of an art history class on March 28, 2009, several groups of viewers arrived at the gallery at different times. They totaled 19 viewers, predominantly art history students along with unidentified viewers who joined the groups. Since clusters of people were involved, I asked those present to tell me privately what they saw and not to discuss their findings aloud. Of this group, 13 of the 19 viewers did not initially see the targets (the disappearing antiquities). I asked the viewers what they saw in the animation both before they walked through the entire installation and then afterward, while they re-viewed the animation. While people continued to watch the animation, I asked them to report on the cards and anything else they saw. Of these 13 viewers, six now saw the targets. For those who still did not see the targets, I explicitly asked them to ignore the distractors; all but one viewer saw the targets. While people walked around the exhibition, I would often ask them what they thought the work was about. I had opportunities to test the perceptions of other gallery-goers in similar ways. Of 31 additional viewers to the show, 18 did not initially see the targets. Of these, 10 saw the targets after moving through the exhibition and re-viewing the animation while being asked the same questions as previously. For those who still did not see the targets, when asked explicitly to ignore the distractors, all but one viewer saw the targets.

After the exhibition had concluded, a small art group of six people (experienced art goers) came to my studio; only two of the six initially saw the targets. Upon further viewing and walking around the studio to see the related still images, only two did not


the attention of engaged participants. In addition, this experiment did not disambiguate learning (possibly through a priming effect) from repetitive viewing. Further refinements for the future might include a control group unaccustomed to art exhibitions, assuming that this could be done without disrupting the esthetic environment. Another control could consist of a group led through the installation before viewing the dynamic stimulus, on the grounds that it might help distinguish whether repetitive viewing or priming enabled the seeing of the targets in the re-viewed animation. In point of fact, many viewers did go directly to the back or the middle of the exhibition, particularly when others blocked their view of the animation, but systematic questions related to perceiving the targets were not asked of them.

The mixed-media paintings featured depicted images of stolen antiquities identical to those shown in the background of the animation and primed the viewers to recognize those objects (**Figure 6**).

For some viewers, the collage paintings in my exhibition reinforced the viewer's gradual realization that perceptual issues were the subject of the installation. My process was to start by making a drawing that served as the basis for a digital print (**Figure 7**). It was deliberately made smaller than the wood on which it was mounted. A process ensued of cutting, rotating, and repositioning the print on the wood. When pulled apart, the print disrupted some of the continuity of perspective and forms (thus also disrupting the illusionism). All of the repositioning and superimposed painting created a maze of figure/ground reversals, rotations, and line displacements. The paintings thus displayed the circumstances under which illusion occurs and is destroyed. Perspectival illusions were also disrupted by mental attempts to piece the original units together, so these works served as another way to show the viewer how attention could be misdirected. As I noted previously, the complexity of these works might also have been a confound since it made instant recognition of the targets difficult. However, when coupled with the dark room installation and smaller montages that focused on the hands, cards, and targets, sufficient clues were provided to allow recognition of the targets. In addition, the incorporation of text within the large-scale works sometimes indicated that the National Museum of Iraq and looting were the subjects of the art. The role of the static art works and black room installation within the exhibition became that of "context providers" as opposed to existing solely as discrete art objects. They provided "contextual cueing" (Chun and Jiang, 1998) and served as emotional signifiers, likely prompting recognition of the targets within the animation.

# **CONDITIONS OF VIEWING**

Mack and Rock have pointed out that three kinds of conditions are generally involved in tests of inattention blindness: inattention, divided attention, and full attention. In my project, the trials were conducted as viewers watched the animation. The first trial was held after the viewer saw the first iteration of the animation and before viewing the entire installation. The second trial was held after subjects viewed the installation and while they re-viewed the animation. Both the first and second trials were inattention trials. The viewers were only asked to report on what they saw. During the second trial, as subjects continued to watch the animation, they were asked to observe the flashing cards and "anything else." This was an explicit divided attention task since the viewers were asked to report on both the distraction and the presence of something else. The divided attention trial thus provided information about the subjects' ability to see both the targets and distractors. If someone still did not see the targets, I conducted a full attention trial in which the subject was explicitly asked to disregard the distraction task (i.e., the flashing cards) and report only the presence of something else on the screen (e.g., the critical targets). With the full attention trial almost all the viewers succeeded in identifying the critical targets.

Returning to the first of the four questions (What does attention make possible?), I could now answer in agreement with the findings of Mack and Rock that attention is necessary for perception. The assigned task in the animation (count the number of times the Queen of Hearts appears) directed attention to the distractors, and at least half the viewers were effectively blind to the targets. This "blinded" group of viewers only succeeded in seeing the targets when their attention had been switched to the circumstances of either divided attention or full attention. Mack and Rock made it clear that the important scientific measure is to compare reports of the critical stimulus in the inattention trial with those in the full attention trial because this difference indicates what is contributed by attention.

**FIGURE 6 | Static work** *Fleeced Chariot***, paint (illusory) and real collage on wood, 2009.**

With regard to the second question (Can attention be shifted?), most viewers were engaged in a visual search task for the Queen of Hearts. The exceptions were those who disregarded the task, those who successfully divided their attention, and those who started viewing the animation after the counting task had been assigned and were initially unaware of the task. The assigned task guaranteed that many viewers would be looking in the general area without expecting or looking for the targets. My findings agreed with Mack and Rock's observation that attention can be shifted when the viewer realizes that something other than what is most visually obvious is at stake. In this case, the distractors were the most obvious thing. However, for more than half of the viewers who had not remarked on the targets at the first trial, the installation created a salient alternative: namely the disappearing antiquities. The way this switch might have occurred is discussed later in this paper. But it seems to me that the important point was that, by viewing the installation in its entirety, many viewers recognized my artistic intention and, as a result, could remark on the targets.

**FIGURE 7 |** *Conning Baghdad***, paint (illusory) and real collage on wood, 2009.**

The third question (Does art training help prevent distraction?) asked whether seasoned art viewers might integrate input from the animation into a framework of prior knowledge gained from their gallery or life experience and override the tendency to follow the instructions provided at the onset of the animation. Despite the fact that many viewers reading the instruction immediately started to search for the Queen of Hearts, many were able to see the targets after only a few iterations. In addition, there was evidence that some could do both operations (see the distractors and targets simultaneously). How did they accomplish this? I attributed it to the fact that most viewers in my survey were routine gallery-goers and had learned to encompass a whole visual field.

During the 1960s, psychoanalyst Anton Ehrenzweig had developed a theory that "de-differentiated" viewing was a mark of creativity, as opposed to the "gestalt-based" viewing proposed by Gestalt theorists such as Rudolf Arnheim and Ernst Gombrich, which singled out one particular area of a visual field at the expense of others (Jones, 1996, p. 325). Piaget (1930) used the term "syncretistic" while explaining how children viewed causality. A distinctive feature of children's art was to emphasize a juxtaposition of parts. Ehrenzweig (1962, 1971) similarly described syncretic vision as seeing-together, meaning vision that can ignore the distinctions between figure and ground. He championed this approach to creativity, explaining that syncretism involves the idea of looking at a field without differentiation (such as seeing the figure at the expense of the ground). He stated that no single act of attention can take in the whole of the visual field, but the mark of good art was to be able to create a work in which every detail was viewed as part of the overall structure. Findings have suggested that highly creative individuals deploy their attention in a diffuse rather than a focused manner (Ansburg and Hill, 2003). Ehrenzweig concluded that grasping the picture as an indivisible whole is accomplished by a scattering of focus and serves the vital purpose of aiding survival in the real world. According to Ehrenzweig, this de-differentiated viewing would also allow us to see the two profiles of Rubin's vases simultaneously, although he could not test this at the time (Ehrenzweig, 1971, pp. 22–23). The idea was that a viewer can be receptive and take in a mass of concrete detail without needing to consciously identify it. Another word for this visual talent is flexibility. A later study similarly concluded that "formal art training results in a global recognition of the pictorial structures involved along with narrative concerns. Attention is shifted away from local feature analysis and information gathering" (Nodine et al., 1993, p. 227). These explanations are suggestive of why one artist in my study was able to see the targets and distractors simultaneously and quickly. They also suggest that the training that artists receive is essential to developing such flexibility.

Other gallery-goers reported that they had difficulty tracking the cards and stopped counting them altogether. However, this did not seem to affect their ability to see or not see the background targets. A similar result was reported by Simons and Chabris. Michael Goldberg, who showed the animation to a group of physiology students and colleagues at Columbia (before it had been adjusted for speed and without benefit of any of the contextualization of the animation provided by the installation), noted that most of his viewers saw only the flashing hands and cards. This difference of response between the scientists (at the laboratory) and artists (at the gallery) is suggestive of the difference in training between these groups, but it is inconclusive since the animation shown was not identical. More importantly, the viewers at Columbia would have had no way to identify my artistic intentions without the contextualization from either static images or, conceivably, from sound (if rifle shots and breaking glass had accompanied the animation).

Finally, with respect to the fourth and last question (Can art train attention?), the results indicated that artworks have the potential to redirect attention and thus switch a viewer's "attention-set." At the least, most viewers expressed awareness that a perceptual problem had been staged, and a few noted that their attention was being manipulated. My results therefore suggested an affirmative answer: art offers a training ground for attention. Nevertheless, on the basis of my experiment I must qualify that affirmative response. The reasons for this qualification include the lack of a control group, occasional difficulties of recording data at the time the tests were taken, inability to control test parameters and maintain an esthetic setting, the need to speak with groups on occasion, and the lack of fully consistent circumstances of viewing.

### **IMPLICATIONS FOR LEARNING**

Psychology has investigated learning and memory by dividing them into categories such as non-associative and associative (Thompson, 1986). An example of non-associative learning is habituation, and it often involves a single event. By contrast, associative learning involves the conjunction of several events and is divided into Pavlovian conditioning (e.g., the ringing of a bell is associated with food) and instrumental conditioning (e.g., pressing a lever to obtain food). Classic psychological studies have determined that the amygdala complex impacts on the amount of attention an object receives; it assigns an emotional salience (significance) to objects or events through associative learning (Klüver and Bucy, 1997). Researchers (Gallagher and Holland, 1994) have provided evidence that a subsystem within the amygdala provides a coordinated regulation of attentional processes. This is pertinent to my study of inattention blindness because the cues that were supplied by the full installation were not neutral ones, but ones that referenced the war in Iraq and the destruction of a cultural heritage. I suggest that those viewers who made the associations between the targets and what they represented would have "learned" to associate the targets with the war and be more likely to recognize the targets when they returned to the animation. In other words, this learned association would have given a charged significance to the target and impacted the attentional system.

Posner and Petersen (1990) have shown that different operations within the attention networks are responsible for such activities as disengaging attention, shifting attention, and engaging a selective focus of attention. Routes of neuroanatomical connectivity between the amygdala and other brain systems allow some regulation over the attentional system (Gallagher and Holland, 1994). The role that emotion plays in regulating attention (and "capturing" attention through arousal) can be, and traditionally has been, capitalized upon by educators – and by artists. Greater learning occurs with salient examples and associations.

In 2007, Posner et al. described how individual differences might account for differences in the efficiency of the attentional system, reflecting both genes and experience. Posner and Rothbart (2007) have suggested that we view learning as exercise for the brain, which might strengthen the neural circuits involved with memory work and attention. The basic idea about attention training is that the repeated activation of attentional networks through such training will increase their efficiency. They pointed out that early researchers (e.g., Thorndike, 1903; Simon, 1969) dismissed the idea of attention training because they had concluded that training is domain-specific and cannot be more broadly applied to the general training of the mind. The example provided was mathematics, which was not believed to involve transferable properties. However, Posner and Rothbart demonstrated that attention is an exception to being domain-specific and that attention training can, in fact, be transferred to other areas of the brain. They claimed, "Attention involves specific brain mechanisms, as we have seen, but its function is to influence the operation of other brain networks" (Posner and Rothbart, 2007, p. 13). Posner et al. (2008) also proposed that both memory and attention in children diagnosed with ADHD can be improved through art training. They identified some of the factors involved with improvement as including enhanced motivation and the fact that there are specific brain networks involving different art forms. The more general implication may be that viewers might derive indirect benefit from certain artworks to whatever extent the actions prompted by the artworks overlap with the formalized tasks of scientific attentional training and testing.

Psychologists Ellen Winner and Lois Hetland have challenged instrumental claims that study of the arts can lead to improvement in standardized achievement tests (Winner and Hetland, 2000). Their skepticism does not, however, negate other possible benefits of art with regard to learning. Winner et al. (2006) have pointed out the necessity of understanding the actual skills that are gained through art-making. They include experimentation, expression, problem solving, observation, and evaluation, along with understanding the art culture. It seems to me that, as I found in my own art experiment, galleries and museums can also play a greater role in developing such skills.

#### **THE ROLE OF ESTHETICS**

Some objects, artworks, and performances draw attention not to informational data, but instead set in motion simulated events that may involve a qualitative transformation in the viewers. These objects can be thought of as boundary objects, which probe the way the mind works. My goal was that the installation, *Stealing Attention*, would function in this manner and help the viewer see something that was otherwise invisible. As philosophers and artists (notably Picasso) have frequently pointed out, in order to get to a truth that is invisible, art must falsify vision in some sense. An important part of an artist's training involves the ability to contrive a believable scene or event with the realization that it entails a falsification of vision. In addition, an art student must learn how to manipulate a viewer's attention. These skills are not only part of an artist's training, but must also be developed in rehabilitative work involving the senses.

A true scientific study with strict experimental parameters and controls would have destroyed an atmosphere of esthetic contemplation, and this state was an important component of my project. As Kant pointed out more than two centuries ago, the esthetic object offers viewers a way to experience pleasure through the "quickening" of their "cognitive faculties." This process involves "the active engagement of the cognitive powers without ulterior aim" (Kant, 1790/1951, p. 68). To create a minimal esthetic condition, a viewer must realize that a formal event and staging of images are intentional. It must also be recognized that the dynamics of attention actually structure what is perceived as relevant. For my study of inattention blindness, I sought a balance between the sometimes-conflicted goals of creating a moving work of art and designing an effective experiment. Despite these conflicts, what artists can provide to the study of attention are ways to design situations where self-discovery on the part of the viewer might suddenly occur as the viewer registers a moment of surprised recognition of something significant that was previously missed.

# **INATTENTION BLINDNESS AS VIEWED IN ART HISTORY**

In addition to works by Caravaggio, de la Tour, and Bosch, another example of inattention blindness, although also not explicitly labeled as such by art historians, might well be Chardin's *The House of Cards*. According to art historian Fried (2007), Chardin called attention to "the telling juxtaposition of two playing cards in the partly open drawer in the near foreground." Fried noted that in the depicted open drawer in *The House of Cards*, which marks the plane closest to us, one of the cards (the Jack of Hearts) is fully facing the viewer and open to his or her gaze. Fried pointed out that this is in contrast to the second card, which is hidden. He then concluded that Chardin's intentionality is made apparent by his creation of the fiction of a card that is hidden to the depicted figures in the art work and responsible for the work's importance. The intentionality that Fried prized in Chardin is signified by the fact that in Chardin's work, a posed, painted actor looks like he is oblivious to the hidden card and to our viewing of him (**Figure 8**). As Fried has emphasized, we, the viewers, must accept what we know cannot actually be the case, since the likelihood is that this painting, like others, was made from a posed model. Fried's interest in the artist's intentionality is shared by some scientists and philosophers. A large part of the importance of a painting is how it reveals the intentions of the artist and thus is indicative of larger patterns of conscious attentional decisions (Roskill, 1989). Philosopher Rollins (2004) has suggested that the artist's intentionality in creating an artwork marks the difference between an art object and a non-art object that has similar esthetic traits. I suggest that what Chardin staged was an occurrence of what scientists might now identify as inattention blindness. This painting then confirms that training in the manipulation of attention is something that artists have long received.

**FIGURE 8 | Jean-Baptiste-Siméon Chardin,** *The House of Cards***, ca. 1737, oil on canvas, 82.2 cm × 66 cm, National Gallery of Art, Washington, Andrew W. Mellon Collection. Source: The Yorck Project, licensed under the Creative Commons Attribution-ShareAlike 3.0 license and the GNU Free Documentation License.**

It would seem that, just as cognitive examinations can test for flexibility, art works might also foster learning. One of the tests used to help determine whether an individual has ADHD is the Wisconsin Card Sorting Test, a neuropsychological test of "set-shifting" (Berg, 1948). Stimulus cards that contain shapes of different colors, amounts, and designs are presented to the subject. The person administering the test asks the subject to match the cards by color, design, or quantity. To accomplish this, the participant is then given a stack of additional cards and asked to match each one to one of the stimulus cards, thereby forming separate piles of cards for each. The matching rules are changed unpredictably during the course of the test, and the time taken for the participant to learn the new rules and the mistakes made during this learning process are analyzed to arrive at a score. The test is considered to measure the flexibility in being able to shift mental sets, and it also assesses perseveration and abstract (categorical) thinking. It has thus been considered a measure of executive function.
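The scoring idea described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the published scoring procedure for the test: each response is labeled by the dimension the participant sorted on, and an error is counted as perseverative when it follows the rule that was in force before the most recent change. The function name `score_session` and the example data are hypothetical.

```python
def score_session(responses, rules):
    """Count errors in a simplified card-sorting session.

    responses[i] is the dimension the participant sorted by on trial i;
    rules[i] is the dimension that was correct on trial i.
    Returns (total_errors, perseverative_errors).
    """
    total_errors = 0
    perseverative_errors = 0
    previous_rule = None  # rule in force before the most recent change

    for trial, (chosen, rule) in enumerate(zip(responses, rules)):
        # Detect an unannounced rule change and remember the old rule.
        if trial > 0 and rules[trial - 1] != rule:
            previous_rule = rules[trial - 1]
        if chosen != rule:
            total_errors += 1
            # Assumption: sticking to the superseded rule is perseveration.
            if chosen == previous_rule:
                perseverative_errors += 1
    return total_errors, perseverative_errors


# Hypothetical session: the rule switches from color to design after trial 4.
rules = ["color"] * 4 + ["design"] * 4
flexible = ["color"] * 5 + ["design"] * 3   # adapts after one mistake
perseverating = ["color"] * 8               # never shifts set

print(score_session(flexible, rules))
print(score_session(perseverating, rules))
```

In this toy session the flexible participant makes a single error before shifting set, whereas the perseverating participant repeats the superseded rule on every trial after the change.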

The patient who has a frontal lobe deficit lacks a "supervisory attentive" system. According to Changeux (1994), when that patient takes the Wisconsin Card Sorting Test, he or she does not become aware of the changes in the examiner's strategy and will perseverate, repeating the same mistakes. Significantly, Changeux has compared the difficulty experienced by such patients to their inability to intuit the intentionality of an artwork. He stated, "It would appear then that the frontal cortex intervenes both in the genesis of hypotheses and in the elaboration of critical judgment, both faculties being essential for viewing a painting, as we have seen" (Changeux, 1994, p. 192). In this way Changeux makes explicit the generally unrecognized ability of an artwork to test the viewer's mental flexibility.

### **ATTENTION SWITCHING**

How might the repeated images have enabled many viewers to shift their attentional set? Perhaps art historian Jonathan Crary supplies part of the answer. In *Suspensions of Perception* (1999), he addressed the important issue of attentional alternation between engagement and fatigue. Crary's thesis was that the poles of attention and distraction can best be understood as a continuum, pointing out that attention carries within it "the conditions for its own disintegration" (Crary, 1999, p. 47). But Crary also cautioned readers against viewing Cézanne's works as the results of faithfully portraying his "subjective optical impressions" (Crary, 1999, p. 301). He viewed Cézanne as recording attention itself, during which time Cézanne's alternation between focused intensity and overall defocused viewing embodied his attentional gaze – the countless shifts, saccades, and blinks as the scene changed before the artist. To me, his insight into Cézanne's work shows the advantage that accrues to some static works like paintings. They can memorialize the eye's activities, something that could not be accomplished in the same way if the artworks were themselves in motion. In addition, still works can be contrasted and contextualized with a medium such as animation that relies on movement. There is no need to make a choice between these modes. This is why *Stealing Attention* was a multimedia exhibition, utilizing a dark installation room, an animation, and collages: it offered the viewer several ways to confront and contrast information delivered both slowly and quickly. Static images might also offer the viewer the chance to recover from the fatigue that accompanies intense viewing.

The linking of alternating engagement and fatigue with inattention blindness finds some support in science. Dehaene and Changeux (2005) developed a model for inattention blindness that takes into account a neuromodulatory substance that causes the attentional network to exhibit a surge of activation, involving synchronized gamma-band oscillations of increasing amplitude. They proposed that this corresponds to a state of vigilance and also hypothesized a second state transition, involving a temporary increase in synchronized firing. They consider that this state of activity may compete with sensory processing and lead to an extinction of sensory processing that may account for the phenomenon of inattention blindness.

Attentional selection has been distinguished as either goal-directed (top-down) or stimulus-directed (bottom-up; Lamy and Bar-Anan, 2008). Top-down selection, a volitional act, is an executive function of experience and expectations. It is an endogenous control of attention that refers to the ability of the observer's goals or intentions to determine which areas, attributes, or objects are selected for further visual processing. By contrast, bottom-up or exogenous control refers to the capacity of certain stimulus properties to attract attention. Bottom-up attentiveness originates with the stimulus and is almost impossible to ignore. Neuroscientist Charles Connor and his team have speculated, "What happens in the brain when these two processes interact?... the complex dynamic interplay between bottom-up and top-down attention determines what we are aware of from moment to moment" (Connor et al., 2004). Research has focused on the relative contributions of these two sources of guidance. At one end of the continuum, neuroscientist Jan Theeuwes proposed that "attentional priority" is entirely under the control of stimulus-driven factors, which entails that attention is directed to the "most salient object" in the visual field regardless of the observers' goals (Theeuwes, 2004). At the other end, neuroscientists Folk et al. (1992) have claimed that objects receiving attentional priority are contingent on attentional goal settings and that a salient object outside the observer's attentional set might not capture attention (i.e., a top-down approach). This issue remains controversial. More recent research has focused on the relative contributions of these two sources of guidance and investigated the extent to which the attentional set adopted by the observer can control which objects in the visual field receive attentional priority.
In the absence of any particular intention, stimuli we happen to encounter evoke tendencies to perform tasks that are habitually associated with them.

Neuroscientists have contended that the cognitive task we perform at each moment results from a complex interplay of deliberate intentions that are governed by goals and the availability and frequency of the alternative tasks afforded by the stimulus. In task switching experiments, responses to the same set of stimuli differ depending on the goals of the individual at any point in time (Monsell, 2003). What is known is that a switch from one task to another brings about increased response times and increased errors.
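As a minimal illustration of how such a switch cost is commonly quantified, the sketch below subtracts mean response time and error rate on repeat trials from those on switch trials. The trial records, field names, and the function `switch_costs` are hypothetical; they are not taken from Monsell (2003) or from the exhibition described here.

```python
from statistics import mean

def switch_costs(trials):
    """Return (rt_cost_ms, error_cost): switch-trial mean minus repeat-trial mean.

    Each trial is a dict with hypothetical keys:
      'task' (str), 'rt_ms' (number), 'correct' (bool).
    The first trial has no predecessor, so it is neither switch nor repeat.
    """
    switch, repeat = [], []
    for prev, cur in zip(trials, trials[1:]):
        # A trial counts as a switch when its task differs from the previous one.
        (switch if cur["task"] != prev["task"] else repeat).append(cur)
    if not switch or not repeat:
        raise ValueError("need at least one switch and one repeat trial")
    rt_cost = mean(t["rt_ms"] for t in switch) - mean(t["rt_ms"] for t in repeat)
    err_cost = (mean(0 if t["correct"] else 1 for t in switch)
                - mean(0 if t["correct"] else 1 for t in repeat))
    return rt_cost, err_cost

# Invented data, for illustration only.
trials = [
    {"task": "color", "rt_ms": 500, "correct": True},
    {"task": "color", "rt_ms": 480, "correct": True},   # repeat
    {"task": "shape", "rt_ms": 650, "correct": False},  # switch
    {"task": "shape", "rt_ms": 520, "correct": True},   # repeat
    {"task": "color", "rt_ms": 630, "correct": True},   # switch
]
rt_cost, err_cost = switch_costs(trials)
```

A positive `rt_cost` or `err_cost` in such a calculation would correspond to the slower and less accurate responding reported after a change of task.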

As confirmed by psychologists Arrington and Logan (2005) in discussing switch costs, "...voluntary task switching requires subjects to choose the task to be performed on a given trial and thus ensures that a top-down act of control is involved in task switching. The voluntary task switching procedure inverts the usual question in task switching experiments. Instead of asking whether switch costs reflect a top-down act of control, it asks whether a top-down act of control produces switch costs." These researchers concluded that switch costs are incurred. They determined that top-down accounts typically focused "on the processes that enabled a new configuration of subordinate processes (or task set). The enabling processes may involve updating goals in working memory... or adjusting attentional biases and priorities suggesting that the extra endogenous act of control that occurs on switch trials can be initiated, and at least partially carried out, prior to the onset of the target stimulus" (Arrington and Logan, 2005, p. 684). Task switching has been found to take place under the circumstances of divided attention and also when viewers are instructed to ignore the task in favor of another. However, even voluntary (top-down) choices appear to be influenced by bottom-up factors. Experimental psychologist Nick Yeung has stated that "...present findings suggest that bottom-up factors may be a primary determinant of the costs associated with voluntary task switching. According to this interpretation, the switch cost does not directly index the time consumed by the process of activating or enabling new task-level representations. Rather, the cost reflects a relative failure to activate such representations following a change of task, resulting in increased between-task competition and hence impaired performance" (Yeung, 2010, p. 360). 
It appears that relatively little is currently known about the extent to which bottom-up factors may contribute to voluntary switching performance. Apparently an asymmetry is involved in making a task switch; it has been attributed to "between-task interference" and explored in computational models (Yeung and Monsell, 2003). It may be easier to make a switch by performing an easier task (Mayr and Bell, 2006). It was found by some researchers that, even when more difficult in terms of the costs involved, participants favored task repetitions over task switches (Yeung, 2010).

This information pertains to the art experiment that I conducted because, in *Stealing Attention*, a task was assigned to the viewer. This made it likely that the uninitiated viewer would initially utilize top-down guidance in following the instructions. As documented, those viewers interviewed who did not initially remark on the relics disappearing (about half) were generally able to identify the disappearing antiquities after they viewed the entire installation and repeatedly viewed the animation. Apparently attention had been re-directed, although I was not in a position to determine how. My hypothesis was that the emotional salience of the images may have played a role in addition to the repetition of the images. It also seems to me that one could account for the new ability of viewers to see the targets by top-down mechanisms, bottom-up mechanisms, or combinations of both. If top-down, the viewers would now actively seek out those images of targets in the animation that were identical to those in the installation. If bottom-up, the salience of the targets would now have attracted the viewer's attention through priming. It is also recognized that task switching can occur under the circumstances of divided attention and during full attention (when viewers are instructed to disregard the distractors).

# **SALIENCE**

How can emotional stimuli direct the focus of attention? This question is very relevant to understanding how the emotional salience of looted antiquities might have helped bring about an attention switch when subjects re-viewed the animation. According to neuroscientist Rebecca J. Compton, two stages are involved in the processing of emotional information. Compton has stated, "First, emotional significance is evaluated preattentively by a subcortical circuit involving the amygdala; and second, stimuli deemed emotionally significant are given priority in the competition for access to selective attention. This process involves bottom-up inputs from the amygdala as well as top-down influences from frontal lobe regions involved in goal setting and maintaining representations in working memory" (Compton, 2003, p. 2115). To me this suggests why a study of inattention blindness might profit by including the impact of emotional as opposed to neutral kinds of stimuli. If so, it would appear that examples of art works that have emotional impact upon viewers will become increasingly pertinent to scientific studies of attention.

#### **CONSTRAINTS AND MODELS**

In McMahon's (2003) view, when normal perception occurs, our attention is generally drawn to the literal meaning of a work. But she explained that if the work exploits particular strategies, it can draw our attention to focus on the phenomena themselves. The example she offered was Pollock's exploitation of the human capacity to pick out fractal patterns. This helped me to understand why many viewers could understand my intentions in my exhibition. In my own artistic study of inattention blindness, by exploiting the conflicts inherent in attention switching, the animation allowed viewers to experience the phenomenon directly and then be able to reflect upon it.

The term "bottleneck" is often associated with attention, emphasizing the physical limits of attention. What is the actual nature of this limit? Does it involve shape at all (like a physical constraint)? If so, exactly what is constrained? According to Posner, the concept of constraint is a highly disputed idea about attentional function. Some do not believe in any physical limit but only in various forms of interference. In an e-mail exchange (2011) Posner stated, "I believe the executive system imposes a kind of limit because its widespread connectivity produces a necessity for priority. Every other kind of view (e.g., attenuators, channel capacity) has also been suggested." My own experience with staging a study of inattention blindness was also filled with many constraints; not only were there the constraints experienced by viewers, but there were also spatial and time limits during the various exhibitions. What became evident is that all learning proceeds within constraints. This may reflect the fact that constraints force prioritizing to take place if an action needs to be performed.

Computational models of inattention blindness have tried to account for the many possibilities involved. The Block model of an attention capture framework as discussed by Gu et al. (2005, p. 183) relies on the cooperation of an internally driven top-down setting and external bottom-up input. The attentional set consists of a pool of task-prominent properties that are maintained in memory. At any given moment only one object has a coherence map that can receive focused attention, and it is designated as the most compelling. This then drives a viewer's gaze. The "Contingent-Capture Hypothesis" relies on filters (Gu et al., 2005, p. 185). According to Gu et al., the attentional set held by the subject determines when an object receives attention. In addition, before an object can be considered for attention, a transient orienting response to the object must take place. This approach therefore explains why the likelihood of noticing an unexpected object increases with the object's similarity to the currently attended object.
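The interplay described above can be made concrete with a toy sketch. It is not the model of Gu et al. (2005); the feature names, the weights, and the scoring rule are assumptions chosen only to illustrate how a top-down attentional set and bottom-up salience might jointly select a single object, and why an unexpected object that resembles the attended one would score higher than one that does not.

```python
def similarity(features, attentional_set):
    """Fraction of attentional-set properties that the object shares (0 to 1)."""
    if not attentional_set:
        return 0.0
    return len(features & attentional_set) / len(attentional_set)

def priority(obj, attentional_set, w_top=0.7, w_bottom=0.3):
    """Hypothetical priority: weighted sum of top-down match and bottom-up salience.

    The weights are illustrative assumptions, not values from the literature.
    """
    top_down = similarity(obj["features"], attentional_set)
    return w_top * top_down + w_bottom * obj["salience"]

def select_focus(objects, attentional_set):
    """Only one object receives focused attention: the one with highest priority."""
    return max(objects, key=lambda o: priority(o, attentional_set))

# Invented scene: the viewer is set to track white, moving items.
attentional_set = {"white", "moving"}
scene = [
    {"name": "target",           "features": {"white", "moving"}, "salience": 0.4},
    {"name": "similar_intruder", "features": {"white", "static"}, "salience": 0.5},
    {"name": "odd_intruder",     "features": {"black", "static"}, "salience": 0.5},
]
focus = select_focus(scene, attentional_set)
scores = {o["name"]: priority(o, attentional_set) for o in scene}
```

In this sketch the two unexpected objects are equally salient, so any difference in their scores comes only from their overlap with the attentional set, which is the relationship the contingent-capture account emphasizes.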

According to Noë (2002), work on change blindness and inattentional blindness in the psychology of scene perception has provoked a new skepticism as evidenced by belief in "the grand illusion," which claims that the richness of our visual world is an illusion. Noë has pointed out that failure to notice change is a pervasive feature of our visual lives. Many of those who have investigated change blindness support the grand illusion hypothesis that the richness and presence of the world are illusions. Noë counters this attitude by pointing out that we are sometimes perceptually aware of unattended detail (amodal perception). He provides the example of our perception of solidity when experiencing a tomato as three-dimensional and round, even though we see only its facing side (O'Regan and Noë, 2001). He has concluded that the sensorimotor account can explain experience not represented in our brains. According to O'Regan and Noë (2001), mastering sensorimotor contingencies generates our conscious visual experiences. These considerations are important to artists, who tend to embed abstract concepts in the sensuality of the world. This is yet another reason that scientists might wish to further consider how the artistic staging of tasks that are rooted in salient, sensuous situations has affected the perception of viewers as compared with analogous tasks in scientific experiments that lack such embodiment.

# **CONCLUSION**

Accumulated evidence has shown that attention can be trained. The additional question explored was art's potential to serve as an attentional training ground, examining art in the context of learning and motivation. This paper analyzed inattention blindness within the context of a gallery exhibition and compared it to scientific work on inattention blindness. "Looking" was explored under more natural circumstances as opposed to laboratory conditions. It also discussed how a top-down attentional set can determine which stimuli are processed to the point of recognition. My findings were that the attentional set could be changed by some viewers by careful looking and reflection upon the targets depicted in various settings. The fact that so many viewers could re-direct their attention to locate the target after going through the entire gallery installation was, to me, suggestive that learning had taken place; they could now compare the images of the targets they had viewed in static displays to the targets in the Flash animation. I concluded that art enhances mental flexibility and the viewer's ability to identify the underlying content of an artwork.

Scientists have recently explored how emotional salience can influence attention. Although there is increasing methodological overlap between some scientific and artistic tests of attention, art works invariably stress the social and metaphoric dimensions, calling forth memories and associations that might lead to a more impassioned response on the part of the viewer. Images assume an emotional resonance, which is quite different from traditional cognitive science, which deemphasizes emotion, motivation, and context (Kenrick, 2001). Much art can be justly characterized by (1) a refusal to compartmentalize feelings from cognition and (2) assigning high value to subjective experience and social and political context. These are issues of increasing importance to neuroscientists.

A kind of coding is apparent to those versed in art's history. Science similarly has its own history and methods, which must be learned by artists who want to contribute their expertise to scientists. Just as scientists can greatly expand upon their reservoir of images, artists can also benefit from looking at the variety of methods scientists use to represent structures that they cannot see and introduce different kinds of approaches to their installations. Attention cannot be owned by a single discipline like science since it is essential to most others, particularly art. Therefore both fields derive benefit from sharing their information, but this can only take place if bridges between them are erected and discourses opened that go deep into analysis.

It seems to me that by reverse logic the Wisconsin Card Sorting Test supports the hypothesis that art has potential to train attention. This test identifies precisely those features some individuals do not have – the ability to discriminate among categories and identify artistic intentionality. These are the very qualities that art could likely promote. My own experience with testing inattention blindness suggests that these are abilities that, when engaged by a viewer, art may be capable of enhancing.

#### **ACKNOWLEDGMENTS**

The author thanks Jill Scott, Angelika Hilbeck, Roy Ascott, and David E. Levy for thoughtful reading of the manuscript in its earlier stages. I also thank the anonymous Reviewers who provided guidance on the improvement of the manuscript.

# **REFERENCES**

review of evidence from psychology and neuroscience. *Behav. Cogn. Neurosci. Rev.* 2, 115–129.

covert orienting is contingent on attentional control settings. *J. Exp. Psychol. Hum. Percept. Perform.* 18, 1030–1044.

*Rests on A priori Bases*, *from Section 12*, trans. J. H. Bernard (New York: Hafner Publishing Company).

turning tricks into research. *Nat. Rev. Neurosci.* 9, 871–879.

of visual experience. *Synthèse* 129, 79–103.

*the Arts*, eds P. Locher, C. Martindale, L. Dorfman, and D. Leontiev (Amityville, NY: Baywood Publishing Co.), 189–205.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 January 2011; accepted: 13 December 2011; published online: 06 January 2012.*

*Citation: Levy EK (2011) An artistic exploration of inattention blindness. Front. Hum. Neurosci. 5:174. doi: 10.3389/fnhum.2011.00174*

*Copyright © 2012 Levy. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **CONNECTING ART AND THE BRAIN: AN ARTIST'S PERSPECTIVE ON VISUAL INDETERMINACY**

**Robert Pepperell** *(artist perspective)*

# *Robert Pepperell\**

*Cardiff School of Art & Design, University of Wales Institute Cardiff, Cardiff, UK*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Eberhard E. Fetz, University of Washington, USA; Patrick Cavanagh, Université Paris-Descartes, France*

#### *\*Correspondence:*

*Robert Pepperell, Cardiff School of Art & Design, University of Wales Institute Cardiff, Howard Gardens, Cardiff CF24 0SP, UK.* 

*e-mail: rpepperell@uwic.ac.uk*

In this article I will discuss the intersection between art and neuroscience from the perspective of a practicing artist. I have collaborated on several scientific studies into the effects of art on the brain and behavior, looking in particular at the phenomenon of "visual indeterminacy." This is a perceptual state in which subjects fail to recognize objects from visual cues. I will look at the background to this phenomenon, and show how various artists have exploited its effect through the history of art. My own attempts to create indeterminate images will be discussed, including some of the technical problems I faced in trying to manipulate the viewer's perceptual state through paintings. Visual indeterminacy is not widely studied in neuroscience, although references to it can be found in the literature on visual agnosia and object recognition. I will briefly review some of this work and show how my attempts to understand the science behind visual indeterminacy led me to collaborate with psychophysicists and neuroscientists. After reviewing this work, I will discuss the conclusions I have drawn from its findings and consider the problem of how best to integrate neuroscientific methods with artistic knowledge to create a truly interdisciplinary approach.

**Keywords: art, neuroscience, interdisciplinary studies, visual indeterminacy, object recognition**

# **Introduction**

The last decade or so has seen substantial interdisciplinary activity between the arts and sciences, with many scientists applying knowledge and methods from their own areas in order to gain new insights into how art is made and appreciated (Journal of Consciousness Studies, 1999, 2000, 2004; Zeki, 1999; Livingstone, 2002; Solso, 2003; Martindale, 2006). Occasionally scientists have worked closely with art historians to share ideas and approaches (Freedberg and Gallese, 2007; Onians, 2008). One of the factors motivating this new collaborative spirit is the realization that artists have made certain discoveries about the way the human brain works that are only now being uncovered by scientists. According to Zeki (1999, p. 2): "…most painters are also neurologists." Cavanagh (2005), another eminent vision researcher, talked of "the artist as neuroscientist." Given that for many centuries artists have been intensively studying the way the world is perceived, it is perhaps not surprising they have come to understand certain features of the way, for example, we sense objects, color, form, or depth. Through their investigations artists have left a permanent record of their findings in all the countless works of art in museums and galleries around the world. The task of unpacking all this deposited artistic knowledge and reconciling it with our current scientific understanding of perception and cognition is vast. This is why the recent collaborative activity is to be welcomed, despite the great inter-cultural and methodological challenges it poses. The challenges will be addressed later in this paper. First, though, I will discuss my own experience as an artist collaborating with scientists in the study of art. As will be seen, the process of working closely with scientists has raised a number of issues of the kind I believe many collaborators face when trying to work across two quite distinct traditions.
As a result of this work I remain convinced not only that artists and scientists can work together effectively to create new, mutually relevant, knowledge but that it is very important they do.

The paper will start by introducing the topic that was the subject of the collaborative investigations, which I term "visual indeterminacy." This is an area of great significance in the history of art and, as I hope to show, is potentially of considerable neuroscientific interest also. In setting out the background, I will describe how I became aware of the phenomenon of visual indeterminacy, how other artists have explored its effects, and how I have tried in my own work to produce images that are indeterminate. I will then describe some of the neuroscience that relates to the phenomenon, and the collaborative studies I have undertaken with neuroscientists and psychophysicists to investigate responses to such images. Finally, I will consider some of the implications of this work for art–science interdisciplinarity in general.

# **The perceptual phenomenon of visual indeterminacy**

Visual indeterminacy is a perceptual phenomenon that occurs when a viewer is presented with a seemingly meaningful visual stimulus that denies easy or immediate identification (Pepperell, 2006). I first became aware of it as an undergraduate art student watching the silent German Expressionist film *The Cabinet of Dr Caligari* (Weine, 1920), which is known for its non-naturalistic sets and highly contrasting monochromatic lighting. About three quarters of the way through watching the film something remarkable happened: the image suddenly became unrecognizable. Although I could clearly see the screen was full of shapes (there was no problem with my vision as far as I was aware) they did not form a meaningful scene, and I was left struggling to identify the forms before me. **Figure 1** shows two stills from the movie, the first at the moment of "non-recognition" and the second from a few frames on, when a figure leans up from a desk, and I was once again able to make out what was depicted.

This 5-s sequence had a big impact on me, with repercussions that continue to this day. Unlike normal visual perception where the world is full of objects we readily recognize, in this short lapse of time my usual conceptual grip on the world failed. I remember the experience as marked by a mild form of anxiety and bewilderment combined with an active struggle to make sense of what I was seeing. I've since realized that such experiences are not uncommon. Indeed, I've spoken to many others who report similar momentary lapses of recognition. I have certainly been aware of it in my own perception many times since.

One of the most vivid televisual memories of my childhood was a segment in an early evening quiz show called *Ask the Family*, broadcast in the UK in the 1970s, which pitted two families against each other in a test of general knowledge and observation. The section of the show in question involved an everyday object being presented in close up or from an unusual angle. As the camera pulled out to reveal the object in full the families raced to identify it as quickly as possible. Part of the reason, I suspect, this piece of television trivia is remembered so readily by those who saw it is because it was one of the rare occasions in popular culture where an image was deliberately presented in such a way as to be unrecognizable. When faced with such images we seem to be compelled to determine their meaning, so paying a different kind of attention to them than we would with easily recognizable views of the same thing. These days we might even enroll the help of the online community to resolve visual conundrums of this kind. The image in **Figure 2** was posted on a university bulletin board by a confused IT manager who wanted help in identifying what his Christmas-themed biscuit represented.

Visual indeterminacy can be defined, then, as the perceptual experience occurring in response to an image that suggests the presence of objects but denies easy or immediate recognition. Anecdotal evidence suggests that being confronted with such images arouses a need to determine what is depicted, so that additional attention is given in order to resolve the conundrum.

# **A brief survey of visual indeterminacy in art history**

The allure of indeterminate images has not escaped the attention of artists, who have frequently exploited their capacity to perplex audiences. In the first major study of unrecognizable images in art, Gamboni (2002) tracked the use of indeterminacy in works stretching deep into art history, showing the way artists deliberately included elements or passages within paintings that confounded the viewers' capacity to identify what they saw. A famous example is found in a work made in the late eighteenth century by Joseph Wright of Derby titled *Experiment on a Bird in an Air Pump* (National Gallery London, 1768). The painting depicts a scientific demonstration of the effect of oxygen deprivation on a bird, and is generally rendered with immaculate clarity. Yet there is a strange object floating in a backlit jar prominently positioned in the foreground of the scene. Ever since the painting was first exhibited people have wondered what this object is. It has been variously labeled a bird's carcass, a skull, and a pickled organ, but there is still no universally agreed interpretation (Schupbach, 1989, pp. 346–347). It has generally been assumed that the artist deliberately fashioned the object in this way in order to add an extra dimension of interest for audiences.

**Figure 2 | Image of an "indeterminate biscuit" taken from a University web site, which included the caption: "What on Earth is this???? Found in a package of Cadbury's Festive Friends chocolate biscuits in the office this afternoon. What on earth is it supposed to be? Santa? Reindeer? Snowman????? Do tell if you know!"** Image reprinted with permission.

One artist who did more than most to exploit the artistic possibilities of visual indeterminacy was J. M. W. Turner, the English painter associated with the Romantic movement of the early nineteenth century and famous for his atmospheric landscapes and seascapes. In spite of his now titanic reputation, in his lifetime Turner was often vilified for producing what were seen as unreadable and indistinct works, which many critics thought flouted good taste and artistic probity (see **Figure 3**). It is surprising that the images Turner exhibited publicly and which were complained about most vociferously, such as the landscapes of the early 1800s, appear to us now as quite clear and distinct. Of one such painting another artist commented: "…so much was left to be *imagined* that it was like looking into a coal fire, or upon an Old Wall, where from many varying and undefined forms the fancy was to be employed in conceiving things" (Gage, 1975, p. 450).

Had critics seen some of the works Turner did not show in public, such as the highly indeterminate *Interior of a Great House: The Drawing Room, East Cowes Castle* of around 1830, now in the Tate collection in London, they would very likely have been bewildered. Historians are still unclear about the subject or the motive for the painting, and indeed even when inspected closely it is impossible to make out more than a fraction of the objects depicted. It has been described as being in a "state of dissolution" (Butlin and Joll, 1984, p. 282), executed with a vigor and freedom hardly seen in Turner's contemporaries.

It is interesting to speculate what was going on in Turner's mind that led him to create such works, and in the minds of the public who struggled to read even his more recognizable pieces. We do know, however, that his paintings had an impact on the 30-year-old Claude Monet, who saw Turner's work for the first time when he visited London in 1870 to avoid conscription into the Franco-Prussian War (House, 1986, p. 113). Monet soon echoed Turner's atmospheric images in a painting made in 1872, *Impression, Sunrise*, the title of which, when exhibited in 1874, gave the Impressionist movement its name. This rather sketchy rendering of Le Havre harbour in fog particularly incensed one contemporary critic, who ridiculed its imprecision and scornfully asked: "What does that canvas depict?" (Leroy, 1874).

In fact, as far as Monet was concerned the function of his painting was not to obscure but to faithfully depict the *appearance* of the world, in other words what he *saw* rather than what he *knew* to be out there in front of him. A painting like one of the many views he made of Rouen Cathedral in the 1890s is not so much a depiction of the cathedral's walls themselves as of the light reflected by those walls (House, 1986, p. 221). It is up to us as viewers, according to the theories of vision popular among artists in Monet's day, to read into those patterns of light the form of the cathedral from which they were derived using our own conceptual resources. This is what Gombrich (1960, p. 161) referred to in *Art and Illusion* as the "beholder's share" of the pictorial bargain, the contribution viewers supply to the meaning of the image from their own imaginations. Gombrich (1960, p. 250) noted that Turner's great champion, the art critic and theorist John Ruskin, urged artists to paint only what arrives at the "childish" or "innocent eye," that is, the eye as a recorder of "flat stains of color, merely as such, without consciousness of what they signify – as a blind man would see them if suddenly gifted with sight." Monet subscribed to such a view himself, remarking that he "…wished he had been born blind and then suddenly gained his sight so that he could have begun to paint in this way without knowing what the objects were that he saw before him." (Nochlin, 1966, pp. 35–36).

As a young man in 1895 the Russian artist later to be credited with introducing abstraction to European art, Wassily Kandinsky, saw one of Monet's series of paintings depicting sunlit haystacks in a Moscow gallery. Unable to recognize what the painting was of, he later recounted:

And suddenly for the first time I saw a picture. That it was a haystack (or rather, a grain stack), the catalog informed me. I did not recognize it … And I noticed with surprise and confusion that the picture not only gripped me, but impressed itself ineradicably upon my memory. Painting took on a fairy-tale power and splendor. And, albeit unconsciously, objects were discredited as an essential element within the picture. (Parsons and Gale, 1992, p. 255).

A similar experience is recounted in a passage from Kandinsky's *Reminiscences* when he returned to his studio at dusk and was astonished to see "an indescribably beautiful picture, pervaded by an inner glow" standing against the wall (Lindsay and Vergo, 1982, pp. 369–370). In it he could discern "only forms and colors" and no comprehensible objects. It was in fact one of his own rather impressionistic paintings turned on its side, the subject of which he had failed to recognize. Kandinsky realized the potential of objectless images to evoke a remarkable perceptual response. He subsequently spent many years refining a visual language through which this insight could be expressed.

Among contemporary artists, Gerhard Richter is somewhat unusual in that he works in a number of quite distinct styles. He is particularly recognized for both his photo-like images, precisely rendered, and his generally larger abstract works, which he frequently produces by an almost chance-like act of scraping, leaving the final effect to the unpredictable interaction between paint and tools.

But rather than being seen as either realistic, in the conventional sense, or abstract, in the sense of non-representational, Richter's work can be better understood as "indeterminate" in the way so far described here. What the artist is trying to produce is a sense of uncertainty, lack of fixedness, which draws the viewer in to try and resolve what they are seeing. Richter himself is very explicit about this, saying: "Pictures which are interpretable, and which contain a meaning, are bad pictures." A good picture, on the other hand, "…demonstrates the endless multiplicity of aspects, it takes away our certainty, because it deprives a thing of its meaning and its name. It shows us the thing in all the manifold significance and infinite variety that preclude the emergence of any single meaning or view." (Elger and Obrist, 2009, pp. 32–33). And in this exchange with the art critic Robert Storr he offers an insight into his own theory of indeterminate perception:

GR: I try to avoid something in the painting resembling a table or other things. It is terrible if it does because then all you can see is that object.

RS: So you allow for aspects or suggestions of images in the abstract work but not actual pictures?

GR: Not actual pictures. I just wanted to reemphasize my claim that we are not able to see in any other way. We only find paintings interesting because we always search for something that looks familiar to us. I see something and in my head I compare it and try to find out what it relates to. And usually we do find those similarities and name them: table, blanket, and so on. When we do not find anything, we are frustrated and that keeps us excited and interested until we have to turn away because we are bored. That's how abstract painting works…

RS: I am just saying that you use paintings as a way of making it difficult for people to read the image.

GR: Yes, that's right.

(Storr, 2003, pp. 178–179).

What is evident from this brief survey of visual indeterminacy in art is that artists who make hard-to-decipher images are doing so not just to be wilfully obscure or to confound their audiences. They are also acting rather like vision scientists by exploring how certain kinds of images engage the visual system and how we make sense of the world. Moreover, by heightening our visual awareness, so certain artists believe, indeterminate images in their various forms can produce interesting, even revelatory, esthetic experiences.

# **Creating indeterminate images**

Like the artists cited here, my initial interest in the phenomenon of visual indeterminacy was artistic. I became absorbed by the challenge of creating images, both still and moving, that could induce the same state of visual uncertainty in others that I had undergone myself when watching the *Cabinet of Dr Caligari* sequence. I tried many methods of achieving this using film, video, collage, fractal image generation, and digital image manipulation. In each case I was trying to produce a picture of sufficient complexity to strongly suggest the presence of some object or scene yet at the same time deny easy or immediate identification. **Figure 4** shows an early paper collage example from these experiments. I soon found the problem of "trapping" the human visual system in this way much harder than I had first anticipated. As I now realize is well known to vision scientists, the human visual system is extraordinarily effective at rapidly identifying objects in perception (Thorpe et al., 1996; Rousselet et al., 2002). Even given the scantest of clues – such as two dots and a curve – we can interpret things, like faces, almost instantaneously. Alternatively, if the information in an image is too noisy or distorted we simply categorize it as a "meaningless" abstract texture, and make no attempt to discern objects in it (see **Figure 5**).

The challenge in making artworks that are truly indeterminate, then, was to achieve a fine balance between recognizability and abstraction in order to excite the inquisitiveness of the viewer's visual system while frustrating its capacity for recognition at the same time. After many years of experimentation I gradually developed a method of drawing, and then painting, which seemed to produce this effect quite reliably. I discovered that by using a classical pictorial architecture, of the kind frequently found in European paintings made between the 1500s and early 1900s, I could create an image that incited strong expectations of recognizable objects and scenes. (This classical period was the epoch in figurative art that many people associate with recognizable depiction of forms, in contrast to later Modernism where artists turned increasingly to distortion and abstraction.) By using this overall pictorial structure but omitting, or otherwise manipulating, those features of the image that would be readily recognized I was able to achieve a consistently indeterminate image. Some examples of these paintings are shown in **Figures 6, 7, and 8**.

**Figure 4 |** *Uncertainty 4***, paper on card, 29 cm** × **15 cm, 1992.** An early attempt to create an indeterminate image using paper collage.

**Figure 5 | The image on the left is a noisy texture that does not suggest any objects and so is effectively treated as abstract.** The simple arrangement of two dots and a curve on the right shows how readily we are able to recognize objects, even from the scantiest of clues. The problem of creating indeterminate images is how to avoid both these kinds of interpretation.

In the early stages of making this work, the process of deciding what made a certain image successfully indeterminate in the terms described above was largely a matter of my personal judgment. I had to rely on my own reading of the image I was producing, and gage whether or not the forms in it were sufficiently evocative of objects or scenes, or whether they were too abstract or textural to incite the curiosity of the viewer. Increasingly I sought the opinions of others by showing the paintings in galleries or the studio and asking viewers to describe the processes occurring in their own minds as they studied the works. After doing this many times I found people tended to report they were having similar kinds of experiences. Their initial response was to think they were seeing a classical painting depicting a familiar theme, such as landscape, figure, or still life. But wherever they looked to find objects that would corroborate this initial response they failed to do so. They would fixate on an area in which they thought they saw a human limb or a piece of cloth, but would then realize that this was a false start, and would look for some other salient feature to pin their interpretation on. Many reported they were looking at certain forms within the images and sifting through the possible interpretations in their mind, testing various options in order to successfully name what it was they were looking at. Most people reported this experience in positive terms, as interesting, or visually exciting, although some did tell me the images were "disturbing" or made them feel anxious.

This process of testing the indeterminacy effect of paintings on viewers was very useful as a way of confirming or refuting my own judgments about the way the images would be read. Those paintings I felt were more effective also tended to be the same ones other people would report as having the strongest effect on them. But although useful in guiding my judgment, these viewer surveys were not carried out in any scientifically valid way. They were simply verbal reports elicited under a variety of conditions and recorded rather haphazardly. Having had a longstanding interest in the science of perception and visual consciousness I wondered if scientific methods could be usefully applied to study the effect I was investigating in a more systematic way. I also became increasingly interested in what science might have to say about the phenomenon of visual indeterminacy, and what effects the process of looking at indeterminate images might be having on the vision systems and brains of those looking at them.

**Figure 6 |** *Paralysis***, Oil on panel, 27 cm** × **33 cm, 2006.** Private collection.

**Figure 7 |** *Impulse,* **Oil on canvas, 80 cm** × **70 cm, 2006.** Collection of the University of Exeter.

**Figure 8 |** *The Flight***, Oil on paper, 30 cm** × **40 cm, 2007.** Private collection.

# **Scientific background to visual indeterminacy**

As I started to look for scientific literature relating to visual indeterminacy it became clear this was a relatively lightly investigated area of perception compared, for example, to the related phenomenon of ambiguous or reversible images. Ambiguous images, such as the Necker cube, the Duck–Rabbit illusion, or the Boring vase, are distinguished by having alternating interpretations (the image is perceived either as a duck or a rabbit) each of which is quite determinate (Kleinschmidt et al., 1998; Meng and Tong, 2004). Also well known are the issues around perceptual organization and so-called "hidden figures," exemplified in R. C. James' famous photograph of a Dalmatian dog in a dappled environment (Gregory, 1970; Palmer, 1999; Ramachandran and Hirstein, 1999). These, and other similar "puzzle pictures," direct the viewer to search for objects that are concealed in some way within the structure of the image and that, once found, are not easily lost. An example is the image of a cow first presented by Dallenbach (1951), a version of which is reproduced in **Figure 9**.

When I first saw this photograph I remember having a good deal of difficulty in finding the cow, although once I did it was very hard to see it as anything else. The experience I had prior to the point of recognition was similar, as I recall, to that occurring during the *Cabinet of Dr Caligari* sequence many years before. Both were marked by a sense of struggle in which various alternative interpretations were tried out until the flash of recognition occurred. My interest in such images was less in the moment of recognition than the preceding process of object search, and what kinds of perceptual processes might be taking place during this time.

The perceptual state of visual indeterminacy occurring prior to the moment of recognition bears similarities to the rare neurological disorder of associative visual agnosia. A notable case study of this condition, presented by Humphreys and Riddoch (1987), concerned a patient, John, who had suffered a stroke resulting in a bilateral lesion in the region supplied by the posterior cerebral artery. Much of John's capacity to see was spared, but his ability to recognize what he saw was greatly impaired. When shown a series of line drawings of everyday objects he was able to identify only a small proportion, and relied on "working out" what was depicted from specific clues within the image, such as the curliness of a pig's tail, rather than by seeing the object "as a whole" (p. 60). In arriving at their diagnosis of associative visual agnosia the authors ruled out other possible factors that could have contributed to John's inability to recognize everyday objects, including any residual deficit in his stored knowledge or visual sensation. He showed no difficulty in recognizing objects by other means, such as touch, or describing them in detail from memory and was able to make quite accurate copies of drawings, albeit slowly. **Figure 10** is a drawing made by John (on the right) copied from the picture of the owl (on the left). The authors note that John could quite accurately copy line drawings of objects, "…even when he had no idea what the object was" (p. 69). To demonstrate this they point out that John faithfully reproduced the gap on the right side of the original owl's head, not seeing it as an omission in the original drawing.

**Figure 9 | This is an image of a cow, although most people are unable to see it at first glance, or even after prolonged study.** Once seen, however, it is very difficult to see the image as it appeared prior to the point of recognition. From American Journal of Psychology. Copyright 1951 by the Board of the University of Illinois. Used with permission of the author and the University of Illinois Press.

**Figure 10 | The drawing on the right is the copy made by John, the patient with visual agnosia, of the drawing on the left.** The fact that John could make this copy showed that his capacity to "see" was intact, although he had no idea what it was he was copying. Reproduced by permission (©1987 Oxford University Press).

For Humphreys and Riddoch, John's case suggests the normal processes of object recognition involve a number of operations that occur, to some extent, independently of each other but which can be broadly grouped into two layers (pp. 101–102). The first set of processes organize the perceptual "input" data according to position and orientation, and bind multiple visual elements into wholes. A subsequent set of processes then match that input data to associations about function and meaning. The authors conclude: "In general terms, (John's) case supports the view that "perceptual" and "recognition" processes are separable…" (p. 104).

In her extensive study of visual agnosia Farah (2004) makes the same broad distinction between perceptual input and the conceptual associations involved in visual object recognition. While stressing the non-serial, multidirectional processes in vision, she summarizes: "Visual form agnosia validates the distinction implicit in the labels "early" and "intermediate" vision, on the one hand, and "high-level," "object" vision on the other, by showing the first set of processes can continue to function when the second set is all but obliterated. It shows us a kind of richly elaborated but formless visual "stuff," from which "things" can be derived" (p. 156). The phrase "richly elaborated but formless visual stuff" accurately describes the appearance of indeterminate images prior to the point of recognition. It is precisely the inability to match this "visual stuff" to one's stored memories and associations that seems to characterize the visually indeterminate state. For most of us this can occur occasionally, but for the unfortunate sufferers of visual agnosia it is a permanent condition.

My hunch was that during the period where viewers are searching for meaning among the pictorial clues something is occurring in their cognitive processing which is different from that occurring during normal recognition. This seemed to be supported by some scientific studies looking at brain responses to unrecognizable versus recognizable images. An experiment by Supp et al. (2005), for example, used EEG techniques to examine the changes in cortical networks within the time-window of the event-related potential component N400. They showed subjects sequences of recognizable and unrecognizable gray scale pictures, the latter matching the former as closely as possible in terms of size, complexity, and structure. The results showed a marked increase in cooperation in certain parts of the brain and a greater degree of overall coherence between different regions during the viewing of unrecognizable pictures as compared with recognizable ones. This, they concluded, reflected the greater demands made on the viewer's perceptual and cognitive resources and consequent "unease" involved in the task of semantically matching the undecipherable stimuli: "…the greater number of coherence increases for meaningless object processing suggests enhanced recruitment of more distributed left and right areas during unsuccessful memory search" (p. 1143). This finding seemed to corroborate my own sense of "unease" when confronted with an unrecognizable image, and the sense of mental struggle involved in trying to resolve the conundrum.

In another study that compared brain responses to recognizable and unrecognizable images, Rainer et al. (2004) measured neural activity in the V4 area of monkeys exposed to images that were increasingly degraded, from clear to abstract noise. The monkeys learned to recognize familiar images that were degraded compared to novel ones that were treated in the same way. The researchers found that monkeys exposed to indeterminate images showed significantly increased neural activity in both primary and higher cortical areas of the brain than when faced with familiar or recognizable stimuli. From this Rainer and his team drew the conclusion that not only are particular loci in the brain recruited in response to indeterminate stimuli, but that the attempt to decipher such stimuli leads to enhanced overall coordination in brain activity: "This suggests that V4 plays a key role in resolving indeterminate visual inputs by coordinated interaction between bottom-up and top-down processing streams" (p. 275).

It was while presenting a lecture on indeterminate art at the Max Planck Institute for Biological Cybernetics at Tübingen, at the invitation of Gregor Rainer, that I proposed a possible study in which the effects of looking at indeterminate paintings would be compared to paintings that looked similar but contained recognizable objects. To demonstrate this I showed a painting of my own next to a detail of Michelangelo's Sistine Chapel ceiling (**Figure 11**).

My painting had a similar visual structure and similar colors to the Michelangelo, but omitted any clearly discernible objects, such as people or clothing. Working on the basis of the intuitive hunch noted above, I proposed that comparing the brain activity of subjects exposed to these similar images might reveal some useful information about the processes involved in object recognition. It happened that in the audience were two scientists who offered, in different ways, to carry out tests using my paintings as stimuli, which led to several collaborative studies being undertaken on art and visual indeterminacy.

# **Scientific experiments on visual indeterminacy**

Working with the neuroscientist Alumit Ishai, and her team at the Department of Neuroradiology in the University of Zürich, I created a set of stimuli that included a selection of my own paintings, all of them in the indeterminate classical style described above, and the same number of paintings made by other artists, which had a similar visual appearance but were full of recognizable objects. These included works by artists such as Turner, Tintoretto, Rubens, Michelangelo, and Fuseli among others. Some samples can be seen online at: http://www.robertpepperell.com/Stimuli/Stimuli.html. These stimuli were divided equally into monochrome and color sets, and then presented in a number of behavioral experiments. Subjects with no specialist art training were shown the stimuli in random order and asked to perform a number of tasks, including deciding whether each image contained familiar objects (a measure of object recognition) and how powerfully the images affected them (a measure of esthetic response). Scientific details of the experiments and the results can be found in the published paper (Ishai et al., 2007) but I'd like to reflect here on some of the findings that I found surprising and interesting from an artistic point of view.

One of the unexpected results concerned the extent to which subjects reported seeing familiar objects in my indeterminate paintings. Given that I had striven so hard to remove any trace of recognizable objects, leaving only strong suggestions, it was interesting to discover that people were claiming to see things they recognized on average 24% of the time. (It was less surprising to me that the effect was stronger with the color images compared with the monochrome as I had always found it easier to create the effect of visual indeterminacy when making monochrome paintings; it is noticeable how readily a pinkish hue will suggest flesh or a bluish hue sky.) As one might expect, subjects reported seeing familiar objects in the other artists' work almost 100% of the time. It was also notable that the subjects gave almost identical scores for esthetic affect across all the paintings in the study, regardless of how recognizable they were. What this seems to indicate is that, in rating their esthetic response, the subjects were less influenced by the literal meaning of the images they saw than the immediate visual impact of the shapes, colors, and composition. This is despite the fact that previous studies have shown a tendency for non-art trained audiences to prefer pictures they can recognize more than abstract ones (Healey and Enns, 2002; Vartanian and Goel, 2004).

In art historical terms the distinction between the meaning of an art work and its physical appearance has been understood in terms of "content" and "form," and this distinction has given rise to prolonged and often impassioned debate among theorists of art and esthetics as to which aspect is the more significant in determining the effect of an artwork and, indeed, whether the two aspects can really be distinguished at all (Bell, 1914). What this study suggested to me as an artist is that the distinction does have some validity given that the "formal" properties of the artworks seem to be a more significant factor in their degree of esthetic appreciation than "meaning" factors, at least over the short (4 s) viewing period used in the trials. The study also showed that subjects were significantly slower to make judgments about indeterminate paintings than they were about recognizable ones, whether they saw objects in them or not, which might suggest that the attempt to find objects in the indeterminate images requires a different kind or greater degree of underlying cognitive processing than when perceiving recognizable images. This seemed to corroborate the implications of the studies cited above, where viewing indeterminate images can lead to differential activity in certain areas of the brain. Crucially, though, there was a significant correlation between the length of time taken to determine whether or not images contained objects and rating of esthetic effect, such that the longer it took to make a decision, the more powerful the image was thought to be.

It is worth elaborating on the fact that the rating of esthetic effect employed in the study was slightly unusual. Rather than using a rating of "ugly to beautiful," on the basis that esthetic experience is synonymous with the appreciation of beauty, we used one of "powerful affect" on a scale of 1–4, with 4 being the most powerful. The reason was that, as even a cursory glance at art history will show, the esthetic impact of a work of art is not necessarily linked to how beautiful, pleasurable, harmonious, or pleasant it is. Some of the most impressive art works can be quite ugly, disturbing, distorted, or dissonant. One thinks of Goya's *Saturn Devouring his Son* (1823, Prado, Spain), Picasso's *Mother and Child* (1907, Musée Picasso, Paris), or the Chapman brothers' *Hell* (1999–2000, Saatchi Collection, London). The use of the term "powerful" arguably more accurately captures the range of emotions felt by an audience in response to a work of art and is therefore more objective as a measure of affect than the more limited category of beauty alone. The fact that from this study it appears increased recognition latencies are associated with an increase in the "powerfulness" rating of the image indicates that the amount of struggle or effort needed to comprehend an image has some positive relationship to its esthetic value. This was also something I had intuitively suspected, based on my own esthetic experiences of indeterminate artworks and the fact that such images are so often revered in the canon of art.

Although firm deductions cannot be drawn from this single study, it seems that the experiments described above, coupled with the subjective reports I gathered from viewers when making the works, give good reason to believe that something is happening in the case of viewing indeterminate art works that is not happening with immediately recognizable ones. I know my own experiences of seeing indeterminate images, whether art works or not, to be moments of great vividness and highly focused attention, where the habitual operations of recognition are fleetingly suspended as the mind struggles to resolve the components of the image into something meaningful. This is by no means a straightforwardly pleasurable experience; it can sometimes be quite frustrating or disorienting, and not immediately rewarding. In esthetic terms, however, I regard the experience as being of great value, since for a few moments I am acutely aware of the visual form of the scene before me in a way I am not when the image is semantically determined. This, then, might be thought of as part of the heightened mode of perceptual experience associated with indeterminate images, and may help to explain why artists over the centuries have been so frequently drawn to making them.

A follow-up study used neuroimaging techniques to look at the activity in subjects' brains while viewing the same stimuli plus another set of artworks that were entirely abstract, that is, with no suggestion of objects at all (Fairhall and Ishai, 2008). Also included were scrambled images that were essentially visual noise. The study was designed to test the prediction that abstract, indeterminate, and recognizable images would produce a "posterior-to-anterior gradient of activation along the ventral visual pathway, with stronger response to abstract compositions in inferior occipital gyrus; stronger response to indeterminate paintings in intermediate regions in posterior fusiform gyrus; and stronger response to representational paintings in anterior fusiform gyrus" (p. 925). Using a similar object recognition task as employed in the previous study, the behavioral data recorded in the scanner revealed an even stronger propensity for subjects to report seeing familiar objects in my indeterminate paintings (now 36% of the time). They even reported seeing familiar objects in the abstract paintings 18% of the time, even though these had been chosen specifically for their lack of object-suggestive content. As expected, there were almost no reports of objects being seen in the scrambled images.

In neuroscientific terms, the results were able to partially confirm the hypothesis, and again I only want to comment here on some of the interesting implications the study had for me as an artist. It was gratifying for me to know, for example, that based on the data in this study at least there is a detectable "indeterminacy effect" produced in subjects when looking at my paintings. Comparing the level of activation between the scrambled paintings and my own revealed a significant differentiation in certain brain areas (precuneus and medial frontal gyrus), which were described in the paper as the "neural correlates of object indeterminacy." Once again, subjects took longer to decide on the question of whether the images contained familiar objects or not in the indeterminate and abstract paintings compared to the recognizable ones, which suggests a more effortful process is going on when judging ambiguous or suggestive imagery. But I was slightly surprised that the effect of seeing indeterminate images on recordable brain activation was less pronounced than I had expected. In my naivety about the way the brain works, and what the scanning process is able to detect, I had anticipated a far stronger degree of differential activation during the exposure to indeterminate images as compared to recognizable ones than was found.

But the study did confirm one of my other intuitively held beliefs about the way we perceive the visual world. Part of my anxiety, or unease, during the moment of indeterminate perception in the *Cabinet of Dr Caligari* sequence arose from the sense of compulsion I felt to make sense of what was in front of me. I have felt the same many times since when unexpectedly confronted with an indeterminate scene. The fact, confirmed in this study, that subjects reported seeing objects in images that did not contain them, even more so than in the one cited previously, is evidence of the involuntary impulse we have to turn the rich and complex visual data around us into meaningful things. As the paper concluded: "Our findings indicate that this seemingly effortless process (of recognition) occurs not only with familiar objects, but also with indeterminate stimuli that do not contain real objects. It therefore seems that the primate brain is a compulsory object viewer, namely that it automatically segments indeterminate visual input into coherent images." (p. 929) This helps to explain why indeterminate images can be so compelling.

Separate studies conducted by Wallraven et al. (2007a,b) at the Max Planck Institute used the same indeterminate paintings employed in the previous studies, but this time subjected them to a range of psychophysical tests using eye-tracking and categorization tasks. The purpose was to look at the ways subjects would react to indeterminate stimuli, and also to see if there were any empirical grounds for verifying my own intentions in making my art. Having worked so long to make successfully indeterminate paintings on the basis of intuition, guesswork, and the informally acquired reports of others, it was again fascinating for me as an artist to see what more rigorous and objective measures might reveal about, quite literally, how people looked at the work.

In one set of behavioral experiments, my indeterminate paintings and the visually similar representational paintings were submitted to a range of tests looking at subjects' responses to variations in size and orientation. Participants also undertook a categorization task in which they were asked to classify the images into one of seven genres, which were "Biblical scenes," "Landscapes with person," "Landscape without a person," "Portrait," "Still life," "Battle scene," and "None of the above." In another experiment, participants were shown the sequence of indeterminate images and, in addition to the categorization task above, were also asked to identify whether or not the images contained people; their eye movements were recorded during both tasks. Again the scientific details can be consulted in the relevant papers, but two outcomes were of particular interest to me as the originator of the indeterminate stimuli.

First, the analysis of the data produced by the experiments seemed to verify my intentions in making the indeterminate images. Wallraven et al. (2007a) proposed generally that visual information can be ordered along two parameter dimensions, namely "abstract/representational" and "unique/ambiguous." Images that score highly on the "unique" parameter are very distinct in meaning, whether they are abstract (as in the case of certain symbols or icons) or representational (as in the case of photographs or photorealistic paintings). Images that are rated as being more "ambiguous" may be almost entirely non-representational (as in the case of certain abstract art) or have multiple meanings (as in the case of certain optical illusions or surrealist artworks). These two parameter dimensions also function on two distinct layers which normally operate together in visual perception: the perceptual layer, which broadly speaking is the same "bottom-up" or "lower-level" set of processes involved in organizing visual data described above in the section on visual agnosia, and the conceptual layer, which consists in the "higher-level" or "top-down" information retrieved from stored memories and associations (**Figure 12**).

As someone interested in the way we perceive images, I find this "map" of what the authors called the visual "interpretation space" extremely useful as a way of organizing the different variables that can influence the way an image is read. One could imagine, for example, using it to visualize the whole history of recognizability in visual art, and thereby track the shifting patterns of taste across the centuries. When this same parameterization was applied to my own paintings by the team it was expected that in order to fulfill my ambition to make works that were neither fully recognizable nor fully abstract they would need to be assessed as being located roughly in the center–right of the graph, that is, avoiding the extremes of each parameter, but tending toward ambiguity. Once the genre categorization tasks had been carried out the results showed that the images were indeed distributed around a "region of indeterminacy" at the center of the graph, with a bias toward the ambiguous end of the axis. For the experimenters, this data offered a perceptual validation of the artistic program behind the work. It was also interesting to note that, as with the previous studies cited, subjects reported a relatively high percentage (37%) of my images as being "representational," despite the lack of any clear objects being depicted.

The other interesting finding from these experiments from my perspective concerned the differences in eye movements the participants displayed when engaged in the person finding task as compared with the genre categorization task. What surprised me was the extent to which the fixation maps varied between the two tasks, even though subjects were looking at the same images. As has been known since the time of the early eye-tracking experiments by Yarbus (1967), how one looks at an image is critically dependent on what is being looked for. While this is clearly well known to scientists it is not, as far as I am aware, something generally known to artists. Yet it is clearly a fundamental aspect of the way we apprehend the world, which presumably reflects the way expectation and meaning are mediated by the visual system in general and the brain in particular. It is interesting to consider, therefore, how directive clues in art, such as titles or hanging context, might affect the way audiences look at works of art. It also alerted me to how important the use of titles might be in my own work in leading the "eye" of the viewer as they try to interpret the image.

The question of how titles affect the interpretations of paintings stimulated the final art–science collaborative study I wish to mention (Wiesmann and Ishai, 2010). The study used a selection of Cubist paintings made by the artists Pablo Picasso, Georges Braque, and Juan Gris in the period before the First World War. Cubist paintings of this period are characterized as being highly indeterminate in so far as they are directly observed depictions of everyday objects – tables, fruit, newspapers, glasses, etc. – but represented in a fragmented and "exploded" manner that makes immediate identification very difficult. Art experts, and others familiar with the genre, are able to "read" these Cubist scenes and find within them the various forms and objects from which they are constructed. But those without this expertise tend to see only patterns, lines, and textures rather than distinct objects of any kind (Golding, 1988).

One part of the study looked at the extent to which descriptive titles presented alongside Cubist paintings affected the viewer's capacity to identify objects in the scene. Crucially, however, half the subjects undergoing the task of detecting familiar objects received a short training session before the trial in which they were instructed on how to "read" Cubist paintings and find objects in them. The study gathered both behavioral and fMRI data, and again the scientific methods and results are available in the published paper. Samples of the stimuli can be seen at: http://www.robertpepperell.com/Cubism/index.html. What was surprising from my own perspective was the extent to which the "trained" subjects differed from the control group in terms of the number of objects recognized. Despite the fact that the subjects were not art experts and received only a relatively brief training session (30 min) they were significantly better than the control group in recognizing familiar objects. The study also found that the descriptive titles, which effectively declared what the paintings depicted, had little effect on the control group but a marked effect in helping the trained group to find more familiar objects. To me, as both an artist and art teacher, these results were somewhat counterintuitive inasmuch as: (a) I would have expected the process of learning to read Cubist paintings to be something only acquired over many hours of study rather than the brief period of training undergone by these subjects, and (b) I would have expected meaningful titles to have had some positive effect on helping those with no training to find familiar objects more often than when looking at the same image only accompanied by the word "Untitled," as was the case here. The study also showed enhanced activation in the parahippocampal cortex of the trained subjects, the amplitude of which increased as a function of the number of objects recognized.
This suggested that the subjects had used broader contextual associations to identify the objects in the paintings rather than the cognitive resources normally linked more specifically to object recognition. It is also tempting to wonder whether subjects thus trained in recognizing objects in Cubist paintings are also then better at other object recognition tasks, and indeed whether learning to understand Cubist art can actually improve cognitive performance in other areas; it would certainly be good news for art lovers if that were the case.

# **Art, science, and the brain**

The various investigations I have undertaken with neuroscientists and psychophysicists have proved illuminating and rewarding from my artistic perspective. I initially set out to discover what science might be able to tell me about the specific issue of visual indeterminacy, and how people respond to my paintings. In doing so I have gained an enormous amount of insight into the way the visual system operates, how the brain functions, and indeed how science itself operates when investigating these phenomena. I have become aware of the great potential of the scientific method to elucidate processes that artists often work with intuitively but rarely grasp in any systematic way. But I have also seen at first hand the limitations of the scientific method when studying the experience of art, and have been reminded of the very different cultures that exist between art and science that make meaningful collaboration a sometimes demanding process. In the final section of this paper I want to briefly reflect on these issues and how future joint research between artists and scientists might benefit from these experiences.

In the first place, it is important to acknowledge the inherent limitations of the scientific method when investigating the way we perceive art – at least as they appear from an artist's view. It often goes unremarked, for example, that most if not all lab-based studies of audience responses to art will use reproductions instead of real works of art. Reproductions are not always of the highest quality, and cannot be shown in a way that properly reflects the physical properties of the work itself. When preparing the images for the Cubism study, for instance, it was necessary to conform all the images to the same scale and format due to the demands of the experimental procedure. This meant a lot of cropping and resizing, which resulted in the loss of size discrimination between large and small paintings. And there is the broader question of how valid it is to measure the effects of artworks on the basis of reproductions at all. Some empirical esthetics studies have shown significant differences in the judged hedonic or pleasure value of original artworks compared to reproductions (Locher et al., 2001). Certainly any serious scholar of art would make a point of examining the real work before arriving at any definitive evaluation of its esthetic impact. Many qualities inherent in a work of art simply do not convert into photographic media, including scale, degree of surface gloss, texture of brushwork, or the way that certain colors can change depending on the angle of viewing (as is the case, for example, in many paintings by the abstract artist Ad Reinhardt). All these are crucial esthetic properties that artists work hard to control, and their absence or impoverishment in conventional photographic reproductions restricts what many lab-based studies can tell us about the experience of looking at them.
Then, of course, there are all the well-known problems associated with subjects being placed in fMRI scanners, with the distracting noise and discomfort they create (Cooke et al., 2007). Something similar, but less intrusive, is true of eye-tracking devices that require the head to be locked in a stable position – something that clearly would not happen in a natural gallery setting. While these limitations do not, in my view, diminish the value of such studies they should perhaps be more frequently acknowledged when discussing the implications of the results.

Another issue that those wishing to study the effects of art on the brain might want to consider is the risk of what might be called "neuro-determinism," that is, the expectation that esthetic experience can be fully accounted for in terms of brain-centered processes. The neurobiologist and pioneer of the neuroesthetic approach to art–science integration, Zeki (1999, p. 217), said in his seminal book on the subject: "My aim in writing this book has been really to convey my feeling that esthetic theories will only become intelligible and profound once based on the workings of the brain…" While he has been careful elsewhere to insist he is studying the neural *correlates* of experiences like beauty and not necessarily the *causes* (Kawabata and Zeki, 2004), there is an understandable temptation to assume that some of our most uniquely human experiences, such as art appreciation, might be explicable purely in terms of certain kinds of brain activity. It is worth noting in this context a growing tendency within philosophy and psychology toward "externalist" models of perception and cognition. These models, to varying degrees, deny or resist the idea that the brain is the sole location of mental properties such as beliefs, memories, and even the mind itself (Noë, 2005, 2009; Clark, 2008; Hurley, 2008; Velmans, 2008). Allied to this is the fact that many artists and art theorists, when discussing the matter, seem to intuitively support the idea that mental properties and esthetic experiences extend beyond the head and into the world (Pepperell, 2011). The purpose of raising this issue here is to point out that certain basic assumptions about how esthetic experiences might be constituted can differ fundamentally between those making the art and those studying its biological effects. In order to achieve a fuller understanding of what the brain contributes to esthetic experience as a whole we will need to reconcile these divergent approaches.

This leads to the final point, which concerns the need to recognize how great the disciplinary gulf still is between art and science, despite all the work done in recent times to bridge it. I have been attending science conferences now for over 10 years, and working closely with scientists on and off for about five. In that time I have rarely found members of the scientific community to be anything other than generous with their time and ideas, politely inquisitive about my proposals, and forgiving of my own naivety about their specialisms. Even so, I am also constantly reminded of how different the basic conceptual categories can be between the arts and sciences, a cultural divide of the kind famously identified by Snow (1993) in the middle of the last century and still largely in force today. The difference is in part, I believe, born from the need for scientists to be explicit, analytical, and logical in their working and reporting processes. Quite often for artists the opposite is the case, their training and traditions having implanted in them a proclivity toward vagueness, synthesis, and irrationality. The Cubist painter Braque (1971) was fond of saying: "Art is meant to disturb; science reassures."

Finding common ground between two such distinct traditions is not always straightforward. It was somewhat sobering for me to discover that the constraints on the experimental equipment used in the collaborative fMRI study cited above required the subjects to express their esthetic appreciation for the artworks on a scale between 1 and 4. For those schooled in the infinite subtleties of artistic expression the idea that the merits of a great Turner or Rubens painting could be judged on such a crude scale and in as brief a moment as 3 or 4 s would border on the absurd. Yet if we are to make any progress at all in understanding art using the empirical methods of scientific enquiry these are exactly the kinds of procedures we will have to adopt, at least until more sensitive techniques of investigation become available. Just as I have had to modify my expectations about what empirical techniques are able to measure so I have been fortunate to find scientific collaborators willing to adjust their disciplinary spectacles in order to appreciate the relatively chaotic point of view of an artist. The result has been, from my point of view, a deeper understanding of what science can tell us about art, and what art can tell us about science.

Art–science collaborations work best when each discipline is enriched through the process, rather than one being parasitic on the other. There is always a risk that the compromises necessary to make progress are made at the expense of the essential values and outlooks of both approaches, which can only result in bad art and bad science. The challenge is how best to reconcile these distinct traditions without sacrificing the integrity of either. Only by meeting this challenge will we be able to create a truly interdisciplinary approach to the study of problems as complex as the way we make and appreciate art.

# **Acknowledgments**

With thanks to Glyn Humphreys, Gregor Rainer, and Christian Wallraven for their helpful comments.

# **References**

Noë, A. (2009). *Out of Our Heads: Why You Are Not Your Brain, and Other Lessons from the Biology of Consciousness*. New York: Hill and Wang.

Onians, J. (2008). *Neuroarthistory: From Aristotle and Pliny to Baxandall and Zeki*. New Haven: Yale University Press.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 February 2011; paper pending published: 01 April 2011; accepted: 30 July 2011; published online: 17 August 2011. Citation: Pepperell R (2011) Connecting art and the brain: an artist's perspective on visual indeterminacy. Front. Hum. Neurosci. 5:84. doi: 10.3389/fnhum.2011.00084 Copyright © 2011 Pepperell. This is an open-access article subject to a nonexclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **PERCEPTUAL AND PHYSIOLOGICAL RESPONSES TO JACKSON POLLOCK'S FRACTALS**

#### *Richard P. Taylor<sup>1</sup>\*, Branka Spehar<sup>2</sup>, Paul Van Donkelaar<sup>3</sup> and Caroline M. Hagerhall<sup>4</sup>*


#### *Edited by:*

*Luis M. Martinez, Universidad Miguel Hernández de Elche, Spain*

#### *Reviewed by:*

*Christoph Redies, University of Jena School of Medicine, Germany Gert Jakobus Van Tonder, Kyoto Institute of Technology, Japan*

#### *\*Correspondence:*

*Richard P. Taylor, Department of Physics, University of Oregon, Eugene, OR 9743, USA. e-mail: rpt@uoregon.edu*

Fractals have been very successful in quantifying the visual complexity exhibited by many natural patterns, and have captured the imagination of scientists and artists alike. Our research has shown that the poured patterns of the American abstract painter Jackson Pollock are also fractal. This discovery raises an intriguing possibility – are the visual characteristics of fractals responsible for the long-term appeal of Pollock's work? To address this question, we have conducted 10 years of scientific investigation of human response to fractals and here we present, for the first time, a review of this research that examines the inter-relationship between the various results. The investigations include eye tracking, visual preference, skin conductance, and EEG measurement techniques. We discuss the artistic implications of the positive perceptual and physiological responses to fractal patterns.

**Keywords: fractals, esthetics, visual preference, eye tracking, EEG**

# **Poured Complexity**

The art world changed forever in 1945, the year that Jackson Pollock moved from downtown Manhattan to the countryside of Long Island, New York. Friends recall the many hours that Pollock spent on the back porch of his new house, staring out at the scenery as if assimilating the natural shapes surrounding him (see **Figure 1**; Potter, 1985). Using an old barn as his studio, he started to perfect a radically new approach to painting. The procedure appeared basic. Purchasing yachting canvas from his local hardware store, he simply rolled the large canvases (sometimes spanning five meters) out across the floor of the barn. Even the traditional painting tool – the brush – was not used in its expected capacity: abandoning physical contact with the canvas, he dipped the stubby, paint-encrusted brush in and out of a can and poured the fluid paint from the brush onto the canvas below. The uniquely continuous paint trajectories served as "fingerprints" of his motions through the air.

These deceptively simple acts fuelled unprecedented controversy and polarized public opinion of his work. Was this painting "style" driven by raw genius or was he simply mocking artistic traditions? Sixty-five years on, Pollock's brash and energetic works continue to grab public attention and command staggering prices of up to \$200M. Art theorists now recognize his patterns as a revolutionary approach to esthetics. However, despite the millions of words written about Pollock through the years, the real meaning behind his infamous swirls of paint remains the source of fierce debate in the art world (O'Connor, 1967; Varnedoe and Karmel, 1998).

One issue agreed upon early in the Pollock story was that his paintings represent one extreme of the spectrum of abstract art, with the paintings of his contemporary, Piet Mondrian, representing the other. Mondrian's so-called "Abstract Plasticism" generated paintings that seem as far removed from nature as they possibly could be (Taylor, 2002a, 2004). They consist of elements – primary colors and straight lines – that never occur in a pure form in the natural world. In contrast to Mondrian's simplicity, Pollock's "Abstract Expressionism" speaks of complexity – a tangled web of intricate paint splatters. Whereas Mondrian's patterns are traditionally described as "artificial" and "geometric," Pollock's are "natural" and "organic" (Taylor, 2011). But if Pollock's patterns celebrate nature's organic shapes, what shapes would these be?

# **Nature's Fractals**

Since the 1970s many of nature's patterns have been shown to be fractal (Mandelbrot, 1982; Barnsley, 1993; Gouyet, 1996). In contrast to the smoothness of artificial lines, fractals consist of patterns that recur on finer and finer scales, building scale-invariant shapes of immense complexity. Even the most common fractal objects, such as the tree shown in **Figure 1**, contrast sharply with the simplicity of artificial shapes.

An important parameter for quantifying a fractal pattern's visual complexity is the fractal dimension, *D*. This parameter describes how the patterns occurring at different magnifications combine to build the resulting fractal shape (Mandelbrot, 1982). For Euclidean shapes, dimension is described by familiar integer values – for a smooth line (containing no fractal structure) *D* has a value of 1, whilst for a completely filled area (again containing no fractal structure) its value is 2. However, the repeating patterns of a fractal line cause the line to begin to occupy space. As a consequence, its *D* value lies between 1 and 2. By increasing the amount of fine structure in the fractal mix of repeating patterns, the *D* value moves closer to 2 (Mandelbrot, 1982; Taylor and Sprott, 2008).

Thus, for fractals described by a low *D* value, the small content of fine structure builds a very smooth, sparse shape. However, for fractals with a *D* value closer to two, the larger content of fine structure builds a shape full of intricate, detailed structure (Taylor and Sprott, 2008). **Figure 2** (left column) demonstrates how a pattern's *D* value has a profound effect on the visual appearance. The pattern established by clouds (left, top) has a *D* value of 1.3, while the pattern established by the trees (left, bottom) has a *D* value of 1.9. **Table 1** shows *D* values for various natural forms.

**Figure 1 | Left: Pollock's house on Long Island.** In contrast to his previous urban life in Manhattan, Pollock perfected his pouring technique surrounded by the complex patterns of nature. Right: Trees are an example of a natural fractal object. Although the patterns observed at different magnifications don't repeat exactly, analysis shows them to have the same statistical qualities.

*<sup>1</sup> Department of Physics, University of Oregon, Eugene, OR, USA*

# **Pollock's Fractals**

**Figure 2 | Examples of natural scenery (left column) and poured paintings (right column).** Top: Clouds and Pollock's painting *Untitled* (1945) are fractal patterns with low *D* values (*D* = 1.3 and 1.10, respectively). Bottom: A forest and Pollock's painting *Untitled* (1950) are fractal patterns with high *D* values (both *D* = 1.89).

In 1999, we published an analysis of Pollock's paintings that confirmed his poured patterns to be fractal (Taylor et al., 1999a). Building on this initial analysis, a number of groups have shown diverse fractal analysis techniques to be useful approaches to quantifying the visual complexity of Pollock's poured patterns (Mureika et al., 2004, 2005; Mureika, 2005; Lee et al., 2006, 2007; Graham and Field, 2007, 2008; Redies, 2007; Redies et al., 2007; Alvarez-Ramirez et al., 2008a,b; Coddington et al., 2008; Irfan and Stork, 2009; Fairbanks et al., 2010). Our initial analysis employed the well-established "box-counting" method, in which digitized images of Pollock paintings were covered with a computer-generated mesh of identical squares (or "boxes"). The statistical scaling qualities of the pattern were then determined by calculating the proportion of squares occupied by the painted pattern and the proportion that were empty. This process was then repeated for meshes with increasingly small square sizes. Reducing the square size is equivalent to looking at the pattern at finer magnification. In this way, we could compare the pattern's statistical qualities at different magnifications. Specifically, the number of squares, *N(L)*, that contained part of the painted pattern were counted and this was repeated as the size, *L*, of the squares in the mesh was reduced. The largest size of square was chosen to match the canvas size (*L* ∼ 2.5 m) and the smallest was chosen to match the finest paint work (*L* ∼ 1 mm). For fractal behavior, *N(L)* scales according to the power law relationship *N(L)* ∼ *L*<sup>−*D*</sup>, where 1 < *D* < 2. This power law generates the scale-invariant properties that are central to fractal geometry. The *D* values, which chart this scale invariance, were extracted from the gradient of a graph of log *N(L)* plotted against log *L*. Details of the procedure, along with typical graphs, are presented elsewhere (Taylor et al., 2007). We note that the standard deviation associated with fitting the data to the fractal scaling behavior is such that *D* can be determined to an accuracy of two decimal places (Taylor et al., 2007).
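The box-counting procedure described above can be illustrated with a minimal sketch. This is not the code used in the cited studies: it assumes the painted pattern of one color layer is already available as a binary NumPy array (`True` where paint is present), uses box sizes measured in pixels, and estimates *D* by a simple least-squares fit of log *N(L)* against log *L*. The function names `box_count` and `box_counting_dimension` are hypothetical.

```python
import numpy as np


def box_count(image: np.ndarray, box_size: int) -> int:
    """Count the boxes of side `box_size` (in pixels) that contain any paint."""
    painted = image.astype(bool)
    height, width = painted.shape
    # Pad with empty pixels so the image divides evenly into boxes.
    pad_h = (-height) % box_size
    pad_w = (-width) % box_size
    padded = np.pad(painted, ((0, pad_h), (0, pad_w)))
    rows, cols = padded.shape
    # Group pixels into (box_row, y_in_box, box_col, x_in_box) blocks.
    blocks = padded.reshape(rows // box_size, box_size, cols // box_size, box_size)
    occupied = blocks.any(axis=(1, 3))
    return int(occupied.sum())


def box_counting_dimension(image: np.ndarray, box_sizes) -> float:
    """Estimate D from the gradient of log N(L) against log L."""
    sizes = np.asarray(box_sizes, dtype=float)
    counts = np.array([box_count(image, int(s)) for s in box_sizes], dtype=float)
    # N(L) ~ L^(-D) implies log N = -D * log L + constant.
    slope, _intercept = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope
```

As a sanity check, a single straight line of pixels should return a value close to 1 and a completely filled square a value close to 2, matching the Euclidean limits quoted earlier. Real paintings require the additional steps the text describes, such as color separation and a careful choice of the range of box sizes.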

#### **Table 1 |** *D* **values for various natural fractal patterns.**

Many of Pollock's paintings are composed of a number of distinctly colored paint layers. One of the central challenges is to separate these layers so that each can be passed through the box-counting analysis. Colors have been filtered using the "physical" model based on red–green–blue primaries (Taylor et al., 1999a, 2007; Mureika, 2005) and also a "perceptual" model based on L\*a\*b\* color space (Mureika, 2005). The extracted patterns are labeled as color "blobs," and light-colored blobs typically have higher *D* values than darker blobs (Mureika, 2005). The question of how Pollock combined the blobs into an integrated, multi-colored visual fractal led us to investigate his painting technique in detail. We described Pollock's style as "Fractal Expressionism" (Taylor et al., 1999b; Taylor, 2011) to distinguish it from computer-generated fractal art. Fractal Expressionism indicates an ability to generate and manipulate fractal patterns *directly*. In many ways, this ability to paint such complex patterns represents the limits of human capabilities. Our analysis of film footage taken at his peak in 1950 reveals a remarkably systematic process (Taylor et al., 2002). He started by painting localized islands of trajectories distributed across the canvas, followed by longer extended trajectories that joined the islands, gradually submerging them in a dense fractal web of paint. This process was very swift with the fractal dimension rising sharply to a mid-range value of *D* = 1.5 at 20 s. He would then break off and later return to the painting over a period of several days, depositing extra layers on top of this initial "anchor" layer (Taylor, 2011). Whether or not fractal layers merge to create a combined pattern that is also fractal depends on the relative *D* values of the individual layers, their densities and their degrees of overlap (Taylor et al., 2006; Taylor, 2011). In Pollock's painting process, the combined patterns are fractal (Taylor et al., 2002; Mureika, 2005). The dark-colored, anchor layer set the initial *D* value of the painting, which was then fine-tuned by adding multiple, light-colored layers (Taylor et al., 2002). This fine-tuning process has been interpreted in terms of a mathematical union of the individual fractal layers (Vicsek, 1989), in which the combined pattern has a *D* value that matches the highest *D* of the individual layers (Mureika et al., 2005): thus as lighter-colored, higher *D* layers were added, the painting's overall *D* value rose. Pollock's multi-stage painting technique was therefore clearly aimed at generating high *D* fractal paintings (Taylor, 2011).
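The statement that a union of fractal layers takes the highest *D* of its parts can be motivated by a short counting argument. What follows is a sketch under idealized assumptions (two layers, each obeying the box-counting power law exactly down to arbitrarily small boxes), not the analysis of the cited papers; the symbols $N_1$, $N_2$, $N_{\cup}$, $D_1$ and $D_2$ are introduced here purely for illustration. A box is occupied by the combined pattern if it is occupied by either layer, so

$$
\max\{N_1(L),\, N_2(L)\} \;\le\; N_{\cup}(L) \;\le\; N_1(L) + N_2(L) \;\le\; 2\,\max\{N_1(L),\, N_2(L)\}.
$$

With $N_i(L) \sim L^{-D_i}$, the layer with the larger dimension dominates both bounds as $L$ becomes small, which gives

$$
N_{\cup}(L) \sim L^{-\max(D_1,\, D_2)}, \qquad D_{\cup} = \max(D_1,\, D_2).
$$

Over the finite magnification range of a real painting, the box size at which the higher-*D* layer begins to dominate depends on the layers' densities and overlap, which is consistent with the caveat given in the text.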

As shown in **Figure 3**, he perfected this technique over 10 years. Art theorists categorize the evolution of Pollock's pouring technique into three phases (Varnedoe and Karmel, 1998). In the "preliminary" phase of 1943–1945, his initial efforts were characterized by low *D* values. An example is the fractal pattern of the painting *Untitled* from 1945, which has a *D* value of 1.10 (see **Figure 2**). During his "transitional phase" from 1945 to 1947, he started to experiment with the pouring technique and his *D* values rose sharply (as indicated by the first gradient in **Figure 3**). In his "classic" period of 1948–1952, he perfected his technique and *D* rose more gradually (second gradient in **Figure 3**) to the value of *D* = 1.7. During his classic period he also painted *Untitled* (see **Figure 2**), which has an even higher *D* value of 1.89. However, he immediately erased this pattern (it was painted on glass), prompting the speculation that he regarded this painting as too complex and immediately scaled back to paintings with *D* = 1.7. This suggests that his 10 years of refining the pouring technique were motivated by a desire to generate fractal patterns with *D* ∼ 1.7. This distinct evolution raises an intriguing question: did these higher *D* fractal patterns hold a special esthetic quality for Pollock and, if so, do observers of his work share the same preference?

In the following sections, we will present our investigations of how observers look at Pollock's fractals. Our initial motivation for extracting the box-counting dimension for Pollock's patterns was to facilitate a direct comparison with previous investigations of human response to fractals, which focused on the relationship between their esthetic value and *D* (see later). Before we move onto these comparisons, it is important to emphasize that *D* is just one of a spectrum of dimensions that can be used to quantify the scaling properties of fractal patterns. A number of groups, including ours, have gained further insight into the rich structure of Pollock's patterns by performing a multi-fractal analysis (Mureika et al., 2005; Coddington et al., 2008; Irfan and Stork, 2009; Fairbanks et al., 2010). In addition, whereas our work focused on the colored blobs, the "edge" patterns extracted from the luminance gradients of grayscale images of Pollock's work are also fractal (Mureika et al., 2005). The *D* values extracted from these fractal edge patterns can be related mathematically to a power spectrum analysis of the grayscale images (Fairbanks and Taylor, 2011). Spectral analysis of Pollock's paintings reveals a scale-invariant power law behavior (Redies et al., 2007; Alvarez-Ramirez et al., 2008a; Graham and Field, 2008). This latter result is appealing because it facilitates a comparison between the luminance properties of Pollock's work and those of other artworks (Graham and Redies, 2010; Koch et al., 2010) and natural scenery (Switkes et al., 1978; Field and Brady, 1997; Ruderman, 1997; Billock, 2000; Billock et al., 2001). However, the relationship between the grayscale (i.e., power spectral analysis) and colored (i.e., box-counting analysis of the blobs) patterns of Pollock's fractals is not without its complexities (Fairbanks and Taylor, 2011) and our future research will continue to seek a precise characterization of this relationship. 
For the remainder of the present article, we restrict our analyses to the blob *D* values.

It is also valuable to highlight an inevitable restriction of all of the above forms of fractal analysis – that the fractal magnification range is limited. Unlike mathematical fractals, which span from the infinitesimally small through to the infinitely large, Pollock's fractals can't exceed the canvas size, nor can they be smaller than the smallest speck of paint. Concerns that Pollock's limited magnification range prevents an accurate extraction of *D* values have been successfully addressed elsewhere (Taylor et al., 2006; Taylor, 2011). Of more interest for the current study is whether this magnification range is sufficient for the fractals to induce marked responses in the human visual system. The results that follow show that a magnification factor of only 20 (i.e., the largest pattern is just 20 times larger than the smallest) is enough to trigger striking responses. Significantly, most of nature's fractals match this highly limited magnification range and Pollock's fractals exceed it.

# **How does the eye search through the visual complexity of Pollock's fractals?**

The use of eye-tracking techniques to examine the gaze of the observer is a potentially powerful approach to understanding art appreciation (Buswell, 1935; Yarbus, 1967; Nodine and Krupinski, 2003; Locher, 2006). While eye-tracking investigations of many types of artworks have revealed great inter-individual variability in scan path characteristics, several systematic findings have emerged. For example, the spontaneous gaze behavior in viewing artworks seems to follow a "coarse-to-fine" strategy where an initial global sweep of the image is followed by a later period of visual scrutiny of finer local details (Locher et al., 2007). Furthermore, the points of high salience computed in terms of local feature differences in luminance, color and orientation were found to drive eye fixations in viewing abstract and representational artworks (Wallraven et al., 2007; DiPaola et al., 2010; Foulsham and Kingstone, 2010; Fuchs et al., 2011). Given these findings, how the eye scans across Pollock's patterns, which lack obvious salient features and which scale across multiple sizes, is of obvious interest.

**Figure 4A** shows an eye-tracking system used in our study, which integrates infrared and visual camera techniques to determine the location of the eye's gaze when looking at a pattern formed on a computer screen (Hyona et al., 2002). The sizes of the fractal images displayed on the screen were 290 by 290 mm, corresponding to 1024 by 1024 screen pixels (i.e., the image resolution was 35.3 pixels/cm). The eye-tracker can locate the gaze with an accuracy of 4 pixels.

**Figure 4B** shows a magnified section of the spatial pattern traced out by the eye's gaze as it moves across the screen. As expected, the pattern is composed of long ballistic trajectories as the eye jumps between the locations of interest, and smaller motions called micro-saccades that occur during the dwell periods to ensure that the retina does not de-sensitize (Hyona et al., 2002). **Figure 4B** plots the horizontal (*x*) and vertical (*y*) locations in units of screen pixels. Micro-saccades are expected to occur over an angular range of typically 0.5°. This angle translates to a distance of 15 pixels on the screen and, as expected, this approximately matches the typical width of the dwell regions observed in **Figure 4B**.
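The screen geometry quoted above can be checked with a few lines of arithmetic. The following Python sketch reproduces the 35.3 pixels/cm resolution and then works backwards from the 0.5° and 15 pixel figures to the viewing distance they jointly imply. That distance is not reported in the text; it is only an inferred value, assuming a flat screen viewed head-on, and the function names are illustrative.

```python
import math

# Values stated in the text.
IMAGE_SIDE_MM = 290.0     # side length of the displayed image
IMAGE_SIDE_PX = 1024      # the same side measured in screen pixels
MICROSACCADE_DEG = 0.5    # typical angular range of micro-saccades
MICROSACCADE_PX = 15.0    # on-screen extent quoted for that angle


def pixels_per_cm(side_px: float, side_mm: float) -> float:
    """Screen resolution in pixels per centimetre."""
    return side_px / (side_mm / 10.0)


def implied_viewing_distance_cm(extent_px: float, angle_deg: float, px_per_cm: float) -> float:
    """Eye-to-screen distance at which angle_deg subtends extent_px (flat screen assumed)."""
    extent_cm = extent_px / px_per_cm
    return extent_cm / math.tan(math.radians(angle_deg))


def angle_to_pixels(angle_deg: float, distance_cm: float, px_per_cm: float) -> float:
    """On-screen extent, in pixels, subtended by angle_deg at distance_cm."""
    return distance_cm * math.tan(math.radians(angle_deg)) * px_per_cm


resolution = pixels_per_cm(IMAGE_SIDE_PX, IMAGE_SIDE_MM)
distance = implied_viewing_distance_cm(MICROSACCADE_PX, MICROSACCADE_DEG, resolution)
print(f"resolution: {resolution:.1f} pixels/cm")
print(f"implied viewing distance: {distance:.1f} cm")
```

Under these assumptions the implied viewing distance comes out a little under 50 cm, a plausible value for a seated observer in front of a monitor.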

**Figure 4C** shows the corresponding temporal pattern by plotting the *x* position against time *t*. The periods of relative motionlessness are the dwell periods at a given location, during which the eye is undergoing micro-saccades. The typical dwell time is approximately 0.4 s. The time scale of the individual micro-saccades is expected to be approximately 10–20 ms. We note that this is of the same order as the sampling interval of the eye-tracking equipment (approximately 16 ms at a 60 Hz sampling rate). This measurement limitation would, therefore, affect any study of the micro-saccades. However, the focus of our investigations lies with the saccades, since these larger motions are the ones that dictate the search motion, and these operate on longer time scales than the equipment's sampling interval.

**Figure 5** shows the eye's spatial patterns (red trajectories) superimposed on Pollock's fractal paintings that were displayed on the computer screen. The observer was instructed to memorize the observed painting in order to induce the search activity. The observation period lasted 60 s. The *D* values of the displayed monochrome paintings from left to right were 1.11, 1.66, and 1.89. The fourth image (right) is a Pollock painting composed of four differently colored interlocking fractal patterns, each with a *D* value of 1.6.

Details of the box-counting analysis applied to the eye spatial pattern can be found elsewhere (Fairbanks and Taylor, 2011). The results show that the eye trajectories trace out fractal patterns with *D* values that are insensitive to the *D* value of the fractal pattern being observed: the saccade pattern is quantified by *D* = 1.4, even though the underlying pattern varied over a very large range from 1.11 to 1.89. We note that this characteristic value of *D* = 1.4 also holds for observations of multi-colored fractals (far right image of **Figure 5**).
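The box-counting procedure applied to the gaze data is documented in the cited work; the following Python sketch is only a generic illustration of the technique, not the authors' implementation. It assumes the gaze trace is available as a sequence of (x, y) positions in screen pixels. The helper names `densify` and `box_counting_dimension`, the interpolation step and the box sizes are all illustrative choices.

```python
import numpy as np


def densify(path, step):
    """Linearly interpolate between consecutive samples so that the segments
    joining them are covered by points, not just the sampled positions."""
    path = np.asarray(path, dtype=float)
    pieces = []
    for start, end in zip(path[:-1], path[1:]):
        n = max(int(np.ceil(np.linalg.norm(end - start) / step)), 1)
        t = np.linspace(0.0, 1.0, n, endpoint=False)[:, None]
        pieces.append(start + t * (end - start))
    pieces.append(path[-1:])
    return np.vstack(pieces)


def box_counting_dimension(points, box_sizes):
    """Estimate D as the slope of log N(s) against log(1/s), where N(s) is the
    number of boxes of side s that contain at least one point."""
    points = np.asarray(points, dtype=float)
    sizes = np.asarray(box_sizes, dtype=float)
    counts = []
    for size in sizes:
        boxes = np.floor(points / size).astype(np.int64)
        counts.append(len(np.unique(boxes, axis=0)))
    slope, _intercept = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
    return slope


# Sanity check on a shape with a known dimension: a straight line has D close to 1.
sizes = [4, 8, 16, 32, 64, 128]
line = densify([[0.0, 0.0], [1000.0, 1000.0]], step=1.0)
d_line = box_counting_dimension(line, sizes)
print(f"straight line: D = {d_line:.2f}")
```

The interpolation step should be smaller than the smallest box so that no box crossed by the trajectory is missed. A real analysis would also need to restrict the fit to the range of box sizes over which the log-log plot is actually linear.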

To test this result further, we considered another form of fractal pattern for observers to search through – the computer-generated fractals shown in **Figure 6**. **Table 2** compares the *D* values of the spatial patterns traced out by the saccades and the *D* values of the observed fractal patterns. In each case, the *D* values of the saccades are averaged over the results from six observers, each of whom observed the nine fractal images for 30 s, separated by a checkerboard pattern observed for 30 s. The results confirm that the saccades trace out an inherent search pattern set at *D* = 1.5, regardless of the *D* values of the fractal pattern being observed (Fairbanks and Taylor, 2011).

This insensitivity to such a wide range of *D* values in the observed pattern is striking. It suggests that the eye's search mechanism follows an intrinsic mid-range *D* value when in search mode. Why would the eye adopt a fractal trajectory with a *D* value of 1.5? An appealing possible answer lies in previous studies of the foraging behavior of animals. A number of successful investigations have proposed that animals adopt fractal motions when searching for food (see, for example, Viswanathan et al., 1996). Within this foraging model, the smaller trajectories allow the animal to look for food in a small region, and the larger trajectories then carry it to neighboring regions and on to regions further away.

Significantly, such fractal motion has an "enhanced diffusion" compared to the equivalent random motion of Brownian motion. This might explain why it is adopted for both animal searches for food and the eye's search for visual information. The amount of space covered by fractal trajectories is bigger than for random trajectories, and a mid-range *D* value appears to be optimal for covering terrain efficiently (Fairbanks and Taylor, 2011). The mathematical properties of fractals, therefore, provide the explanation for why the human eye follows a fractal trajectory with an inherent *D* value set at 1.5.

**Figure 5 |** … The final pattern (right) is a colored composite of four *D* = 1.6 patterns.

This model raises an intriguing question – what happens when the eye is made to view a fractal pattern of *D* = 1.5? Will this trigger a "resonance" when the eye sees a fractal pattern that matches its own inherent characteristics? Could such a resonance lead to a peak in esthetic appeal? In the next section, we will use visual perception experiments to explore this hypothesis.

# **The esthetics of Fractals**

The prevalence of fractals in our natural environment has motivated a number of studies to investigate the relationship between a pattern's fractal character and its visual properties (Pentland, 1984; Cutting and Garvin, 1987; Jang and Rajala, 1990; Knill et al., 1990; Rogowitz and Voss, 1990; Gilden et al., 1993; Geake and Landini, 1997). Whereas these studies concentrated on perceived qualities such as roughness and complexity, other studies have focused on esthetics and the quantification of the "visual appeal" of fractal patterns (Sprott, 1993; Pickover, 1995; Aks and Sprott, 1996; Taylor, 1998, 2001; Richards and Kerr, 1999; Richards, 2001; Spehar et al., 2003; Hagerhall et al., 2004; Taylor and Sprott, 2008; Boon et al., 2011). In one of the initial experiments performed in 1994, we used a chaotic pendulum called the "Pollockiser" to generate fractal and non-fractal poured paintings (example sections from two paintings are shown in **Figure 7**; Taylor, 2011). In the perception studies that followed, participants were shown one fractal and one non-fractal pattern (randomly selected from 40 images) and asked to state a preference (Taylor, 1998, 2003). Out of the 120 participants, 113 preferred examples of fractal patterns over non-fractal patterns, confirming their powerful esthetic appeal.

**Figure 6 | An example of the computer-generated fractals (black and white) viewed by the subjects for the eye-tracking results shown in Table 2.** The red lines are the eye trajectories.

**Table 2 | A comparison of the** *D* **values of the fractal images being viewed, and the** *D* **values of the patterns traced out by the saccades.**

**Figure 7 | The chaotic pendulum (left) employed to generate non-fractal (top right) and fractal (bottom right) poured paintings.** This technique was documented by ABC television in 1998.

Given the profound effect that *D* has on the visual appearance of fractals (see, for example, **Figure 2**), do observers base esthetic preference on the fractal pattern's *D* value? Previous studies by Aks and Sprott used computer-generated fractals and reported preferred values of 1.3 (Sprott, 1993; Aks and Sprott, 1996). To determine if this was a "universal" esthetic quality of fractals, we performed an experiment incorporating all three categories of fractal pattern: fractals formed by nature's processes (natural scenery), by mathematics (computer-generated images) and by humans (Pollock paintings) (Spehar et al., 2003). Within each category of fractals (i.e., mathematical, natural, and human), we investigated the visual preference as a function of *D*. This was done using a "forced choice" paired comparison technique, in which participants were shown a pair of images with different *D* values on a monitor and asked to choose the more "visually appealing." Introduced by Cohn (1894), the forced choice paired comparison technique is well-established for securing value judgments. In our experiments, all the images were paired in all possible combinations. The presentation order was fully randomized and preference was quantified in terms of the proportion of times each image was chosen. The results, based on 220 participants, indicated that across all categories the visual preference peaks for fractal dimension between 1.3 and 1.5, whereas lower visual preferences are found for fractals outside of this range (Spehar et al., 2003).
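The scoring used in a forced-choice paired comparison, in which every image is paired with every other and preference is the proportion of times an image is chosen, can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical and the three trials are invented purely to show the bookkeeping.

```python
from collections import Counter
from itertools import combinations


def all_pairs(image_ids):
    """Every unordered pair of images, as used in a full paired-comparison design."""
    return list(combinations(image_ids, 2))


def preference_proportions(trials):
    """Proportion of presentations on which each image was chosen.

    trials is an iterable of (left_id, right_id, chosen_id) tuples.
    """
    shown, chosen = Counter(), Counter()
    for left, right, winner in trials:
        if winner not in (left, right):
            raise ValueError("the chosen image must be one of the two shown")
        shown[left] += 1
        shown[right] += 1
        chosen[winner] += 1
    return {image: chosen[image] / shown[image] for image in shown}


# Hypothetical data: three images labelled by their D values, one trial per pair.
trials = [
    (1.1, 1.3, 1.3),
    (1.1, 1.9, 1.1),
    (1.3, 1.9, 1.3),
]
proportions = preference_proportions(trials)
print(proportions)
```

With *n* images a full design needs *n*(*n* − 1)/2 pairs per participant, which is why the number of stimuli in such experiments is usually kept small.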

We have further extended these findings by measuring visual preference for the same computer-generated fractals that were used in our eye-tracking experiments. The results of this new investigation are shown in **Figure 8**.

For each panel in **Figure 8**, we show how the visual preference for fractal patterns depends on their *D* values. The four panels investigate the response to different fractal "configurations" created using the same computer generation process. The same 20 observers viewed each image configuration<sup>1</sup> and the results reveal a remarkable consistency in terms of the preference across the different configurations. For each configuration of fractal image, visual preference was significantly affected by a pattern's fractal dimension (*F*<sub>8,19</sub> = 22.16, *p* < 0.0001). As in our previous study (Spehar et al., 2003), the highest average visual preference is observed for fractal dimensions in the 1.3–1.5 range. A comparison across the four panels shows how little this preference varies for different examples of fractal patterns with the same *D* value. This "universal" character of fractal esthetics is further emphasized by an investigation indicating that gender and cultural background of participants did not significantly influence preference (Abraham et al., 2011).

The above perception experiments deliberately focused on relatively simple fractal objects. Each image featured just one form of fractal pattern (for example, the clouds or trees shown in **Figure 2**). Furthermore, the selected images provided a relatively high contrast against a uniform background, facilitating the application of the box-counting technique. An obvious step is to extend our studies to consider preferences for natural scenes, which typically feature a combination of several different fractal objects (e.g., clouds, trees, mountains, etc.). Although the characteristics of typical scenes are considerably more subtle than the simple shapes considered above, their fractal statistics are well-charted. Analysis has shown that typical scenes are scale invariant, following a power law behavior (Field and Brady, 1997; Billock, 2000; Billock et al., 2001). This behavior is thought to be due to a combination of the following factors: (i) many of the individual objects in the scene are fractal (see **Table 1**), (ii) many scenes contain a power law distribution of object sizes (Field and Brady, 1997; Ruderman, 1997), and (iii) the structure in each of the luminance edges in the scene is expected to follow a power law distribution of sizes (Switkes et al., 1978).

**Figure 8 | Visual preference for computer-generated fractals: The vertical axis in each panel corresponds to the percentage of trials for which patterns of a given** *D* **value were chosen as a function of fractal dimension (***D***).** Each of the four different panels uses a different fractal configuration to investigate this visual preference. The fractal images are shown as insets in each panel. The main effect of fractal dimension (*D*) on visual preference was significant for all four types of fractal images: *F*<sub>8,19</sub> = 22.16, *p* < 0.0001; *F*<sub>8,19</sub> = 38.01, *p* < 0.0001; *F*<sub>8,19</sub> = 15.68, *p* < 0.0001; and *F*<sub>8,19</sub> = 1.54, *p* < 0.0001 from the top to the bottom panel respectively.

<sup>1</sup> The experimental procedures were approved by the School of Psychology, the University of New South Wales Human Research Ethics Advisory Panel. Informed consent was obtained from all subjects.

Does the preference for mid-range *D* values of simple fractal objects (Sprott, 1993; Aks and Sprott, 1996; Spehar et al., 2003) extend to these more visually intricate fractal scenes? One possible approach to addressing this issue would be to concentrate on the luminance properties of the overall scene. This could be done by adopting the technique that performs a spectral analysis of the spatial frequencies of the grayscale image of the scene, as discussed earlier (Field and Brady, 1997; Billock, 2000). The appeal of this approach is that the grayscale analysis can be related to key variables in studies of spatial vision, such as Michelson contrast. Furthermore, the grayscale image conveys visual information about the "textures" of a scene and previous fractals research indicates that roughness texture is an important property for perception – in particular, a strong correlation has been found between fractal dimension and perceived roughness (Pentland, 1984; Jang and Rajala, 1990). In contrast, other research indicates that perception is determined by the edge contours of the observed fractal pattern (Rogowitz and Voss, 1990). Therefore, an alternative approach to the analysis of a fractal scene would be to select a prominent edge contour and investigate its impact on perception. The importance of edge contours to the visual system is supported by eye-tracking experiments which show that, in free viewing situations, subjects fixate on definite contours (Rayner and Pollatsek, 1992). The dominant contour in many scenes is formed by the skyline, and consequently these contours have been the focus of previous perception studies. In particular, in architectural studies of tall building skylines, the silhouette complexity significantly affected preference scores while facade complexity was of less importance (Heath et al., 2000). 
Furthermore, perception experiments using computer-generated images have investigated the impact of matching a city skyline to the background horizon formed by fractal mountains (Stamps, 2002).

Due to this inter-disciplinary interest, our investigations of fractal scenery focused on the importance of the skyline contour for determining esthetic preference of natural scenes (Hagerhall et al., 2004). The skyline contour of natural scenes has previously been found to be fractal, with the *D* value depending on the objects that define the contour (Keller et al., 1987). Our box-counting analysis of the skyline contours extracted from 80 scenes photographed in Australia, Sweden and Italy confirms this fractal behavior. The procedure for extracting the skyline contour, shown in **Figure 9**, is described in detail elsewhere (Hagerhall et al., 2004). The preference experiments, involving 119 participants from the general public, show the most preferred *D* value to be 1.3 (Hagerhall et al., 2004), indicating that the preference for mid-range *D* values revealed for simple fractal shapes (Aks and Sprott, 1996; Spehar et al., 2003) appears to extend to the fractal characteristics of more intricate fractal scenery. The preference for skyline contours with *D* = 1.3 was also confirmed in another study with 63 students in psychology and landscape architecture who rated 12 landscape silhouettes extracted from photographs of natural scenery. In addition to preference, the participants rated the "naturalness" of the silhouettes and the results showed that the perceived naturalness was also highest for the silhouettes with a fractal dimension of around 1.3 (Hagerhall, 2005).

To summarize this section, perception studies of fractals generated by nature, mathematics and art indicate that images in the range *D* = 1.3–1.5 have the highest esthetic appeal. These mid-*D* values are close to the *D* values of 1.4–1.5 predicted from the eye-tracking experiments.

# **Neurophysiological Response to Fractals**

 Does this visual appreciation for mid-range *D* values affect the physiological condition of the observer? In particular, do mid-*D* fractals also induce relaxation in the observer? This question motivated us to analyze the results of experiments performed at the NASA-Ames Research Center, in which 24 participants were seated in a simulated space station cabin, each facing one image on the wall. During continuous exposure to the image, each participant performed a sequence of mental tasks designed to induce physiological stress. Each task period was separated by a 1-min recovery period, thus creating a sequence of alternating high and low stress periods. To measure the subject's physiological response to the stress of mental work, skin conductance was monitored continuously during this sequence (Taylor, 2006). Prior studies have shown skin conductance to be a reliable indicator of mental performance stress with higher conductance occurring under high stress (Ulrich and Simons, 1986). The results showed that the mental tasks induced the smallest rise in stress when the observer was observing a fractal pattern with a *D* value of 1.4 (Taylor, 2006).

**Figure 9 | The processing steps used in the extraction of the fractal skyline contour of a natural scene.** Top: one of the natural scenes shown to subjects. Middle: an intermediate processing step used to extract the skyline contour. Bottom: the extracted skyline contour subjected to the box-counting fractal analysis.

To build on this result, we extended our studies of relaxation to include neurophysiological responses (Hagerhall et al., 2008). This was done by monitoring subjects' quantitative EEG (qEEG) response while viewing fractals with different *D* values. Previous EEG recordings by others have shown that people are more wakefully relaxed during exposure to natural landscapes than during exposure to townscapes (Ulrich, 1981) and studies of wall art in hospitals find that images with natural content have positive effects on anxiety and stress (Ulrich, 1993). However, the definition of "nature" adopted in these previous studies was vague compared to our proposal of considering the *D* dependence.

It is generally agreed that EEG is a good indicator of cortical arousal. Similarly, there seems to be agreement that, in the awake brain, power in the alpha-band (9–12 Hz) of the EEG is inversely related to activity (Laufs et al., 2003a,b; Oakes et al., 2004). Human behavior also seems to be more strongly related to the alpha-band than to the other frequency bands in the EEG (Davidson and Hugdahl, 1996), and the alpha component seems to be especially responsive to environmental stimulation (Küller, 1991; Küller et al., 2009). While the alpha component of EEG is considered to show a wakefully relaxed state, the delta component (2.25–3 Hz) is prominent during drowsiness and deep sleep. The beta component (18–24 Hz) is associated with external focus, attention and an alert state (Kolb and Whishaw, 2003). Three regions of the brain – frontal, parietal and temporal – were chosen for the qEEG recordings. Processes in these three associational zones are thought to be complementary (Kolb and Whishaw, 2003). Hence, the three selected regions are expected to reveal significant psycho-physiological impacts of exposure to fractal images.
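To make the notion of "power in a frequency band" concrete, the following Python sketch sums a simple periodogram over the delta, alpha and beta ranges quoted above. It is an illustration only and is not the qEEG pipeline used in the study: the 256 Hz sampling rate and the synthetic test signal are assumptions, and `band_powers` is a hypothetical helper name.

```python
import numpy as np

# Band edges in Hz, as quoted in the text.
BANDS_HZ = {"delta": (2.25, 3.0), "alpha": (9.0, 12.0), "beta": (18.0, 24.0)}


def band_powers(signal, sample_rate_hz, bands=BANDS_HZ):
    """Sum of periodogram power within each frequency band (arbitrary units)."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()              # remove the DC offset
    spectrum = np.abs(np.fft.rfft(signal)) ** 2  # unnormalised periodogram
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate_hz)
    return {
        name: float(spectrum[(freqs >= low) & (freqs <= high)].sum())
        for name, (low, high) in bands.items()
    }


# Synthetic signal (assumed, not recorded data): a strong 10 Hz rhythm plus a
# weaker 20 Hz component, 10 s long at an assumed 256 Hz sampling rate.
sample_rate_hz = 256.0
t = np.arange(int(10 * sample_rate_hz)) / sample_rate_hz
signal = np.sin(2 * np.pi * 10.0 * t) + 0.3 * np.sin(2 * np.pi * 20.0 * t)
powers = band_powers(signal, sample_rate_hz)
print(powers)
```

For real recordings an averaged estimate such as Welch's method would normally be preferred over a single periodogram, since it reduces the variance of the band-power estimate.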

Based on the preferences for mid-*D* fractals and the possibility, based on the skin conductance measurements, that these fractals might also induce a relaxed state (Taylor, 2006), we hypothesized that mid-*D* fractals would produce a maximal alpha response in the frontal areas. Additionally, we hypothesized that the different fractal dimensions would generate different levels of activation in the processing of the pattern, i.e., a difference in beta responses would be likely in the parietal and temporal regions.

Computer-generated fractal skylines, shown in **Figure 10**, were chosen with *D* values of 1.14, 1.32, 1.51, and 1.70. Thirty-two subjects participated in the study. The fractal images were viewed for 1 min each, interspersed with a neutral gray picture shown for 30 s. This exposure period was chosen to ensure that a relaxation effect in the subjects could occur. Half of the subjects viewed the stimuli with increasing fractal dimension and the other half with decreasing fractal dimension. During the viewing, qEEG was continuously monitored and recorded using a digital EEG recorder.

The results shown in **Figure 11** indicate that fractal images quantified by *D* = 1.3 induce the largest changes in subjects' qEEG response (Hagerhall et al., 2008). This supports the proposal emerging from perception studies that these patterns are visually unique. These fractals generated the maximal alpha response in the frontal region, consistent with the hypothesis that they are most relaxing. At the same time, they generated the highest beta response in the parietal region, indicating that this pattern also generated the most activation in the processing of the pattern's spatial properties. This points to a very interesting interplay between these brain areas for mid-*D* fractals, which requires further investigation.

Current studies are using the fMRI technique to identify more precisely the regions of the brain that are preferentially activated when observing the stress-reducing fractals. Preliminary results suggest that mid-*D* fractals activate the ventral visual stream (including the ventrolateral temporal cortex), the parahippocampal region, and the dorsolateral parietal cortex, involved with spatial long-term memory.

Interestingly, the parahippocampal area has also been discussed in relation to detection and regulation of emotional input, for instance with reactions to happy and sad classical music (Mitterschiffthaler et al., 2007). These studies belong to the emerging field of "neuroesthetics," which explores the relationship between the neural cells activated and the esthetics of the object being observed (Zeki, 1999).

**Figure 11 |** … (2.80 ± 2.36), *D* = 1.51 (2.53 ± 2.09), and *D* = 1.70 (2.39 ± 1.88). **(B)** Delta for the … (1.80 ± 1.63) and *D* = 1.70 (1.76 ± 1.54).

# **Conclusion**

Scientific experiments might appear to be a highly unusual tool for judging art. However, our preliminary experiments provide a fascinating insight into the impact that art might have on the perceptual, physiological and neurological condition of the observer. Our future investigations will explore the possibility of incorporating fractal art into the interior and exterior of buildings, in order to adapt the visual characteristics of artificial environments to the positive responses.

Our findings might apply to a remarkably diverse range of fractal patterns appearing in art, architecture and archeology spanning more than five centuries. In addition to Pollock's poured fractals, other examples of fractals include the Nazca lines in Peru (pre-seventh century) (Castrejon-Pita et al., 2003), early Chinese paintings (tenth to thirteenth century) (Voss, 1998), the Ryoanji Rock Garden in Japan (fifteenth century) (Van Tonder et al., 2002), Leonardo da Vinci's sketch *The Deluge* (1500) (Mandelbrot, 1982), Katsushika Hokusai's wood-cut print *The Great Wave* (1846) (Mandelbrot, 1982), Gothic cathedrals (Goldberger, 1996), Gustave Eiffel's tower in Paris (1889) (Schroeder, 1991), Frank Lloyd Wright's Palmer House in Michigan (1950) (Eaton, 1998), M. C. Escher's *Circle Limit III and IV* (1960) (Taylor, 2009) and Frank Gehry's proposed architecture for the Guggenheim Museum in New York (2001) (Taylor, 2001).

Is Jackson Pollock an artistic enigma? According to our results, the low *D* patterns painted in his earlier years should have more "visual appeal" than the higher *D* patterns in his later *classic* poured paintings. What was motivating Pollock to paint high *D* fractals? Should we conclude that he wanted his work to be esthetically challenging to the gallery audience? It is interesting to speculate that Pollock might have regarded the visually restful experience of a low *D* pattern as being too bland for an artwork and that he wanted to keep the viewer alert by engaging their eyes in a constant search through the dense structure of a high *D* pattern. Speculation over Pollock's preference for high *D* fractals leads us back to the fundamental question driving this article: why do most people prefer fractals in the range *D* = 1.3–1.5?

We have noted that one potential explanation lies in a "resonance" with our eye trajectories, which trace out fractal patterns characterized by *D* = 1.4–1.5 when in search mode. However, an alternative explanation was presented by Aks and Sprott (1996) when interpreting their pioneering perception experiment. They speculated that the preference for mid-range *D* values is set through exposure to nature's fractal patterns. Indeed, **Table 1** shows that many of nature's fractals cluster around *D* = 1.3. Perhaps then people's preference for mid-range fractals is based on familiarity with these *D* = 1.3 shapes? Intriguingly, **Table 1** also reveals another cluster at *D* = 1.7, matching Pollock's preference, raising the possibility that Pollock's preference was set by exposure to these more complex fractals (Taylor, 2011). Others have used traditional theories of esthetics to explain Pollock's preference. For example, Mureika appealed to the peak shift effect (Mureika, 2005), one of the "eight laws of artistic experience" (Ramachandran and Hirstein, 1999). Within this esthetics model, visual interest is strengthened by overtly enhancing key characteristics of an image, in Pollock's case the *D* value. Another esthetics model, the "principle of the esthetic middle," predicts that the viewer will be drawn to a visual scene of mid-complexity (Berlyne, 1971). Given that higher *D* values exhibit higher visual complexity through their higher content of fine structure (Sprott and Taylor, 2008), this might explain why most people prefer mid-*D* fractals, but not Pollock's quest for higher values. Recent studies that supplement *D* with other measures of complexity, such as GIF file size and algorithm length (Taylor and Sprott, 2008; Boon et al., 2011; Forsythe et al., 2011), might prove useful in addressing this issue.

The discrepancy between Pollock's preference and those of other observers might also lie in the fact that the paintings might have looked different to Pollock simply because he spent so much time generating and looking at them. Webster has argued that a prolonged exposure to any pattern or a visual environment leads to a process of adaptation, a process through which the perceptual norms are constantly adjusted (Webster, 2002). We plan to test the idea of habituation to fractals in future experiments that examine if exposure to high *D* fractals causes the observer's preference to move to these higher *D* values.

We finish with a remark made by one of Pollock's friends, Reuben Kadish, who noted, "I think that one of the most important things about Pollock's work is that it isn't so much what you're looking at but it's what is happening to you as you're looking at his particular work" (Bragg, 1987). This observation emphasizes both the importance of the long tradition of experimental esthetics (Fechner, 1876) and the value of employing modern tools for analyzing human response to art works.

# **Acknowledgments**

We thank M. S. Fairbanks, C. Boydston, N. Kawada, A. P. Micolich, D. Jonas, T. Laike, M. Küller, R. Küller, T. P. Martin, T. Purcell, B. R. Newell, C. W. G. Clifford, M. Sereno, and J. A. Wise.

# **References**


Alvarez-Ramirez, J., Ibarra-Valdez,


spectra in dynamic images and human vision. *Physica D* 148, 136–146.


*ONE* 5, e12268. doi: 10.1371/journal.pone.0012268


C. R., and Williams, S. C. R. (2007). A functional MRI study of happy and sad affective states induced by classical music. *Hum. Brain Mapp.* 28, 1150–1162.


visual art: similar to natural scenes. *Spat. Vis.* 21, 137–148.


mograms. *Fractal Image Encoding Anal. NATO ASI Ser.* 159, 279–297.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 February 2011; accepted: 09 June 2011; published online: 22 June 2011. Citation: Taylor RP, Spehar B, Van Donkelaar P and Hagerhall CM (2011) Perceptual and physiological responses to Jackson Pollock's fractals. Front. Hum. Neurosci. 5:60. doi: 10.3389/fnhum.2011.00060*

*Copyright © 2011 Taylor, Spehar, Van Donkelaar and Hagerhall. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **THE ADVANTAGE OF AMBIGUITY? ENHANCED NEURAL RESPONSES TO MULTI-STABLE PERCEPTS CORRELATE WITH THE DEGREE OF PERCEIVED INSTABILITY**

*Benjamin J. Dyson\**

*Department of Psychology, Ryerson University, Toronto, ON, Canada*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Lutz Jäncke, University of Zurich, Switzerland Burkhard Pleger, Max Planck Institute for Human Cognitive and Brain Sciences, Germany*

#### *\*Correspondence:*

*Benjamin J. Dyson, Department of Psychology, Ryerson University, 350 Victoria Street, Toronto, ON M5B 2K3, Canada.* 

*e-mail: ben.dyson@psych.ryerson.ca*

Artwork can often pique the interest of the viewer or listener as a result of the ambiguity or instability contained within it. Our engagement with uncertain sensory experiences might have its origins in early cortical responses, in that perceptually unstable stimuli might preclude neural habituation and maintain activity in early sensory areas. To assess this idea, participants engaged with an ambiguous visual stimulus wherein two squares alternated with one another, in terms of simultaneously opposing vertical and horizontal locations relative to fixation (i.e., stroboscopic alternating motion; von Schiller, 1933). At each trial, participants were invited to interpret the movement of the squares in one of five ways: traditional vertical or horizontal motion, novel clockwise or counter-clockwise motion, and, a free-view condition in which participants were encouraged to switch the direction of motion as often as possible. Behavioral reports of perceptual stability showed clockwise and counter-clockwise motion to possess an intermediate level of stability compared to relatively stable vertical and horizontal motion, and, relatively unstable motion perceived during free-view conditions. Early visual evoked components recorded at parietal–occipital sites such as C1, P1, and N1 modulated as a function of visual intention. Both at a group and individual level, increased perceptual instability was related to increased negativity in all three of these early visual neural responses. Engagement with increasingly ambiguous input may partly result from the underlying exaggerated neural response to it. The study underscores the utility of combining neuroelectric recording with the presentation of perceptually multi-stable yet physically identical stimuli, in revealing brain activity associated with the purely internal process of interpreting and appreciating the sensory world that surrounds us.

**Keywords: ambiguous stimuli, perceptual stability, perceptual intention, visual evoked responses, auditory evoked responses**

# **Introduction**

There is much that can be learnt about art by applying what we know about our perceptual and cognitive systems (e.g., Arnheim, 1974; Gombrich, 1977; Gregory, 1997; Solso, 2003). Although the phenomenological gap between the eventual experience of art and the initial sensory processes that underlie it can be significant (Dyson, 2009), common ground has been established between the gallery and the laboratory. For example, the later works of Piet Mondrian are generally ascribed artistic value, in addition to being composed of visual elements basic enough to allow for systematic experimentation. By manipulating the orientation (Latto et al., 2000), line spacing (McManus et al., 1993; Wolach and McHale, 2005), and color distribution (Locher et al., 2005) of the originals, it is possible to establish the extent to which Mondrian had established the "correct" organization of these features in accordance with a universal esthetic. Another way researchers have attempted to bridge the divide between art and science is to take advantage of the observation that differential perceptual and esthetic experiences may be derived from the same input: one only need consider proponents of optical art such as Bridget Riley and Victor Vasarely (Martinez-Conde and Macknik, 2010) to appreciate how the interpretation of ambiguous sensory information relies on an interaction between bottom-up and top-down processes (e.g., Dyson and Cohen, 2010). In this regard, examining the neurological indices associated with ambiguous experience has been an important step in understanding how purely internal processes such as perceptual interpretation might be expressed in the brain. One of the major advantages of this kind of approach is the way in which identical sensory stimulation can give rise to radically different perceptions, thereby revealing the neural correlates peculiar to conscious interpretation whilst at the same time controlling for differences in low-level stimulus processing (Jackson et al., 2008). One particularly intriguing possibility outlined by Zeki (2006) is that sensory information imbued with ambiguity might facilitate an esthetic experience by promoting brain activity and the active process of environmental interpretation (see Berlyne, 1971, for a historical antecedent of this idea). The present study will examine the extent to which variations in perceptual interpretation for the same sensory information map onto early evoked responses, and how this potential advantage of ambiguity might be expressed in neural terms.

With respect to the kinds of ambiguous stimuli deployed in the laboratory, images as diverse as the Necker cube, the Rubin face/vase, the old/young woman, ambiguous cheetahs, dot lattices, overlaid gratings, spinning wheels, rotating spheres, and stereograms (e.g., Portas et al., 2000; Leopold et al., 2002; Sterzer et al., 2002; Gepshtein and Kubovy, 2005; see Windmann et al., 2006, for further examples) have all been used to inform how any one particular conscious percept arises from the numerous interpretations available to us. Moreover, the application of fMRI (e.g., Kleinschmidt et al., 1998; Portas et al., 2000), EEG/ERP (e.g., Kornmeier and Bach, 2004, 2005; Pitts et al., 2007), and MEG (e.g., Struber and Herrmann, 2002; Kaneoke et al., 2009) technologies has contributed to our understanding of both the spatial and temporal positioning of perceptual interpretation relative to early sensory activity as well as later cognitive and response-based processes. For example, in terms of brain regions involved in the production of conscious interpretation, Rees et al. (2002) argue that whereas V1 activity does not modulate according to the content of subjective experience, activation of the ventral visual cortex seems to be a necessary (though not sufficient) condition for the advent of consciousness.

The application of ERP and MEG has been particularly useful in terms of revealing the latencies at which changes in perceptual interpretation are manifest. Both relatively early and relatively late time windows have been identified that distinguish between the experiences of perceptual change and perceptual maintenance in the absence of any low-level stimulus alteration. For example, with respect to an ambiguous Necker cube, perceptual change relative to perceptual maintenance has been marked by an increase in both positive-going and negative-going deflections 120 and 240 ms after stimulus onset, respectively, maximal at parietal and occipital scalp electrode sites (Kornmeier and Bach, 2004, 2005). Similar modulations have also been reported for other ambiguous figures such as the Rubin face/vase and Schroder's staircase (but not in an ambiguous Cheetah picture; Pitts et al., 2007). Further demarcations between perceptual stability and change have also been observed in later positive-going deflections around 300−400 ms after stimulus onset, with perceptual reversals again related to increased amplitude (Basar-Eroglu et al., 1993; Struber and Herrmann, 2002; Kornmeier and Bach, 2009). Therefore, one potential advantage of ambiguity at a neural level might be that the sensory system fails to habituate to input that has multiple interpretations. Enhanced neural responding may then provide a level of cortical engagement that facilitates the eventual development of an esthetic response (Zeki, 2006).

The current study considers this possibility using one particular type of ambiguous stimulus presentation known as stroboscopic alternating motion (after von Schiller, 1933; also known as "two-frame bi-stable apparent motion" or "bi-stable motion quartets"; Kohler et al., 2008). This particular stimulus consists of two alternating frames (see **Figure 1**), wherein two shapes (traditionally either squares or circles) are presented in both opposing vertical and horizontal locations from one another relative to fixation. In one frame (hereafter, V1), one shape is presented bottom-left whilst a second shape is presented top-right. In a second frame (hereafter, V2), one shape is presented top-left whilst a second shape is presented bottom-right. With repeated alternations of these two frames, two apparent motions have previously been reported: vertical and horizontal, with the former motion characteristically being the easier of the two percepts to maintain when vertical and horizontal variation is of an identical physical magnitude (Chaudrun and Glaser, 1991; Kaneoke et al., 2009).

The aims of the present investigation are threefold. First, I will aim to increase the range of perceptual experience that can be derived from stroboscopic alternating motion by comparing the behavioral report of perceptual stability associated with the intentional production of traditional vertical and horizontal perceived motion (Kohler et al., 2008) with two hitherto unreported interpretations of this multi-stable percept: clockwise and counter-clockwise motion. In a previous study comparing a number of static ambiguous stimuli (i.e., Rubin face/vase, Necker cube) with a dynamic ambiguous "rotating" circle stimulus in which local elements alternated between gray and black, Windmann et al. (2006) reported participants had significant difficulty in perceiving changes between clockwise and counter-clockwise motion relative to the other types of ambiguous figure, although the reasons for this difficulty were not clear. In contrast, other research comparing the speed with which participants judged the motion of an ambiguously rotating cylinder to be clockwise or counter-clockwise with the interpretation of the Rubin face/vase image found little difference between the stimuli (Takei and Nishida, 2010). The clear benefit of comparing the difficulty with which rotational motion can be generated relative to vertical and horizontal motion using the current stimuli is that differences cannot be attributed to the use of physically different stimuli, nor to the differences between static and dynamic ambiguous images. In an additional attempt to further extend the range of perceptual experience in the current experiment, participants will also take part in a fifth condition in which they intend to make the percept as unstable as possible by switching the direction of motion as often as they can (hereafter, "free-view").
Significant variations in the report of perceptual stability across these five conditions will support the view that stroboscopic alternating motion should be understood as a *multi-stable* rather than *bi-stable* stimulus, and help to evaluate the difficulty of intended rotational motion alongside potentially stable vertical and horizontal motion and potentially unstable free-view conditions with the same stimulus and, therefore, in the absence of low-level sensory differences.

Second, I will consider the impact the degree of visual perceptual instability has upon the processing of auditory stimuli. Various accounts have been put forward for the independence or interdependence in the allocation of processing resource across different modalities (see Larsen et al., 2003). The empirical data continue to be equivocal (e.g., Dyson et al., 2005; Yucel et al., 2005; see Haroush et al., 2009, for a review), with recent evidence suggesting that the relationship between visual and auditory processing relies on a number of factors such as whether variation in visual task load is perceptual or memorial in nature (Muller-Gass and Schröger, 2007), the specific auditory event-related potentials interrogated (Harmony et al., 2000), as well as moment-to-moment fluctuations in visual task load (Haroush et al., 2009). In the current design, participants will attempt visual apparent motion both in the presence and absence of simultaneously alternating yet ignored auditory stimulation (A1, A2). By subtracting the neural response to trials in which participants receive just visual stimulation from trials in which participants receive both visual and auditory stimulation [e.g., (V1 + A1) − V1; (V2 + A2) − V2], the resultant difference wave should provide some indication of the evoked responses generated by ignored auditory stimuli (although see Teder-Sälejärvi et al., 2002, for an objection to this kind of formula in the context of cross-modal integration). If the relationship between auditory and visual perceptual processing is antagonistic (e.g., Yucel et al., 2005), then those visual conditions related to high levels of perceptual instability should yield smaller auditory evoked responses.
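The subtraction logic described above can be illustrated with a minimal sketch. The function name and the waveform values are hypothetical and serve only to show the sample-by-sample arithmetic of (V1 + A1) − V1; they are not data from the study.

```python
def difference_wave(audiovisual, visual_only):
    """Subtract the visual-only ERP from the audiovisual ERP, sample by sample."""
    if len(audiovisual) != len(visual_only):
        raise ValueError("waveforms must have the same number of samples")
    return [av - v for av, v in zip(audiovisual, visual_only)]


# Hypothetical averaged waveforms in microvolts (invented numbers).
v1_plus_a1 = [0.0, 0.5, 1.5, 0.25, -1.0]
v1_only = [0.0, 0.25, 0.5, 0.5, -0.5]

# Estimate of the response evoked by the ignored auditory stimulus A1.
auditory_estimate = difference_wave(v1_plus_a1, v1_only)
print(auditory_estimate)
```

The same function would apply to (V2 + A2) − V2; as noted above, this kind of subtraction assumes the visual contribution is identical across the two trial types.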

Third, I will examine the relationship between individual variation in the subjective report of perceptual stability and underlying visual neural activity. As previous electrophysiological data suggest (Basar-Eroglu et al., 1993; Struber and Herrmann, 2002; Kornmeier and Bach, 2004, 2005; Kornmeier and Bach, 2009), change in the interpretation of an ambiguous stimulus is reflected in increased amplitude at a number of points during visual processing. However, the literature is currently silent regarding the relationship between electrophysiological recording and variation in the interpretation of stroboscopic alternating motion. As such, it is not clear whether stroboscopic alternating motion belongs to that sub-set of ambiguous images that modulate neural amplitude as a function of perceptual change (Pitts et al., 2007). Encouragingly, a recent study examining stroboscopic alternating motion under conditions of MEG recording reported differences between vertical and horizontal motion approximately 160 ms following frame onset (Kaneoke et al., 2009), placing one index of perceptual interpretation for this stimulus relatively early on in visual processing. In the current paradigm, perceptual switches will be indexed by the subjective rating of stability, with increased instability relating to an increased rate of perceptual switching. Therefore, if early visual evoked responses are predictive of perceptual performance, then enhanced visual evoked responses should be associated with trials within which participants experience a high degree of perceptual instability.

# **Materials and Methods**

# **Participants**

Twenty participants were initially run in the experiment. Four individuals had to be rejected on the basis of problems in EEG recording, and a further two individuals were rejected on the basis of their behavioral data (one due to their inability to perceive horizontal motion, and, one due to only being able to perceive intended motion on 42% of trials relative to a remaining group average of 93%). Consequently, the data from 14 participants were included in the final analyses (10 females; mean age=25.86). All participants responded using their right hand, and all participants self-reported as being exclusively right-handed apart from one participant who self-reported as being able to use both left and right hand equally well. All participants received \$10 per hour or course credit for their involvement and provided informed consent prior to investigation according to the ethical guidelines established at Ryerson University. All participants reported normal hearing and normal or corrected-to-normal vision.

# **Stimuli and apparatus**

A white square with an on-screen size of 18 mm × 18 mm and a gray fixation cross subtending 8 mm × 8 mm were generated. Visual stimuli were then differentially lateralized to generate one presentation (V1) in which one square was presented top-right and a second square was presented bottom-left relative to central fixation. A second presentation (V2) constituted one square presented top-left and a second square presented bottom-right relative to central fixation. All horizontal and vertical variations were 44 mm away from central fixation, relative to square center. Further visual stimuli were generated constituting a clockwise rotated arrow (*clock*), a counter-clockwise rotated arrow (*counter*), an up–down pointing arrow (*vertical*), a left–right pointing arrow (*horizontal*), and an up–down + left–right pointing arrow (*free*), all within a central on-screen area of 35 mm × 35 mm (see **Figure 1** for a schematic overview). Two 500 ms tones (low: 400 Hz, and high: 800 Hz) were also created, each having 10 ms linear onset and offsets. Auditory stimuli were then differentially lateralized to generate one presentation (A1) in which a high tone was presented to the right ear concurrently with a low tone presented to the left ear. A second combined presentation was also generated (A2) in which a low tone was presented to the right ear concurrently with a high tone presented to the left ear (after Deutsch, 1974). Combined auditory presentation was calibrated monaurally at approximately 70 dB SPL(C) using Sennheiser HD202 headphones and a Scosche SPL1000 sound level meter. Stimulus presentation and response collection were controlled by Presentation (Neurobehavioral Systems) and the experiment was completed in a quiet room.

# **Design**

Ten experimental conditions were established involving the orthogonal combination of visual intention cue (clockwise, counterclockwise, vertical, horizontal, free-view), with the presentation of auditory stimulation (sound, silence) concurrently with visual stimulation. Participants initially completed 10 practice trials (1 presentation × 10 conditions), followed by 3 blocks of 100 experimental trials (10 presentations × 10 conditions). Trial presentation was randomized across both condition and participant. Participants were encouraged to take breaks to avoid fatigue in between blocks, and to withhold responses to avoid fatigue within blocks.
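As a rough illustration of the design arithmetic, the sketch below builds the trial lists for the 5 × 2 design. The constant names, the function, and the seed are hypothetical, and shuffling within each block is an assumption about how the randomization was implemented rather than a description of the Presentation script actually used.

```python
import random
from itertools import product

INTENTIONS = ["clockwise", "counter-clockwise", "vertical", "horizontal", "free-view"]
AUDITION = ["sound", "silence"]


def build_block(presentations_per_condition, rng):
    """Return a shuffled list of (intention, audition) trials for one block."""
    conditions = list(product(INTENTIONS, AUDITION))  # 5 x 2 = 10 conditions
    trials = conditions * presentations_per_condition
    rng.shuffle(trials)  # assumption: randomization happens within a block
    return trials


rng = random.Random(0)  # arbitrary seed so the sketch is reproducible
practice = build_block(1, rng)  # 10 practice trials
experiment = [build_block(10, rng) for _ in range(3)]  # 3 blocks of 100 trials

# Each condition therefore contributes 30 experimental trials in total.
trials_per_condition = sum(block.count(("vertical", "sound")) for block in experiment)
print(len(practice), [len(b) for b in experiment], trials_per_condition)
```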

# **Procedure**

Each trial began with the presentation of a black screen for 500 ms, subsequently replaced by a visual cue center screen indicating the desired interpretation of motion for 2000 ms. Participants were asked to prepare to interpret the forthcoming visual stimulation according to the cue, and to ignore auditory input if it was present. This was followed by an 8 s period in which visual, or visual and auditory, stimuli were presented, alternating across two frames presented for 500 ms each, thereby resulting in eight presentations of frame one (V1, or V1 + A1) and eight presentations of frame two (V2, or V2 + A2; see **Figure 1**). Participants were encouraged to focus on a central fixation throughout stimulus presentation. Following stimulus presentation, participants were asked whether or not they were able to generate the requested visual motion for at least part of the trial, and then how stable the experienced percept was on a 7-point scale (1 = extremely stable, 7 = extremely unstable). Both post-trial prompts were on-screen until the participant responded.
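The frame timing within the 8 s stimulation period can be sketched as follows. The function and constant names are hypothetical; the only facts taken from the text are the 500 ms frame duration, the 8 s period, and the alternation beginning with frame one.

```python
FRAME_MS = 500
STIMULATION_MS = 8000


def frame_schedule(with_sound):
    """Return (onset in ms, frame label) pairs for one stimulation period."""
    n_frames = STIMULATION_MS // FRAME_MS  # 16 frames in total
    schedule = []
    for i in range(n_frames):
        frame = 1 if i % 2 == 0 else 2  # frames alternate, starting with frame one
        label = f"V{frame}+A{frame}" if with_sound else f"V{frame}"
        schedule.append((i * FRAME_MS, label))
    return schedule


sound_trial = frame_schedule(with_sound=True)
silent_trial = frame_schedule(with_sound=False)
print(sound_trial[:4])
```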

# **ERP recording**

Electrical brain activity was continuously digitized using ActiView (Bio-Semi; Wilmington, NC, USA), with a band-pass filter of 0.16−100 Hz and a 1024 Hz sampling rate. Recordings made from FPz, AFz, FC1, FCz, FC2, F1, Fz, F2, Cz, CPz, Pz, PO3, POz, PO4, Pz, Oz, M1, and M2 were stored for off-line analysis. Horizontal and vertical eye movements were recorded using channels placed at the outer canthi and at inferior orbits, respectively. Common mode sense (CMS) was taken from an independent electrode situated between P1 and PO3, while driven right leg (DRL) was situated between P2 and PO4. Data pre-processing was conducted using BESA 5.3 Research (MEGIS; Gräfelfing, Germany) using a 0.5 (6 dB/oct; forward) to 30 (24 dB/oct; zero phase) Hz band-pass filter. The contributions of both vertical and horizontal eye movements to the EEG record were reduced using the automated VEOG and HEOG artifact option in BESA. Only trials in which participants reported success in perceiving the intended motion for at least part of the trial were included in analyses, and the evoked responses from the first and last frames within each set of 16 were not included, thereby avoiding anticipatory responses. This led to the availability of a potential 210 epochs per frame per experimental condition. Individual epochs, defined as 100 ms pre-stimulus baseline and 500 ms post-stimulus activity, were rejected on the basis of amplitude difference exceeding 75 μV, gradient between consecutive time points exceeding 75 μV, or signal lower than 0.01 μV, within any channel. For the purposes of the current analyses, data were collapsed across frame and serial position, averaged, and finally baselined according to the pre-stimulus interval. Visual stimuli were analyzed according to an average reference; auditory stimuli were analyzed according to an average mastoid reference [(M1 + M2)/2].
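The epoch count and rejection criteria above can be made concrete with a short sketch. It is not the BESA implementation: the function names are hypothetical, "amplitude difference" is read here as the peak-to-peak range within a channel, and "signal lower than 0.01 μV" is read as a peak-to-peak range below that value (a flat channel). Both readings are assumptions.

```python
def reject_epoch(channels, max_range=75.0, max_step=75.0, min_range=0.01):
    """Return True if any channel violates a criterion (values in microvolts)."""
    for samples in channels:
        span = max(samples) - min(samples)  # assumed meaning of "amplitude difference"
        largest_step = max(abs(b - a) for a, b in zip(samples, samples[1:]))
        if span > max_range or largest_step > max_step or span < min_range:
            return True
    return False


def baseline_correct(samples, n_baseline):
    """Subtract the mean of the first n_baseline (pre-stimulus) samples."""
    baseline = sum(samples[:n_baseline]) / n_baseline
    return [s - baseline for s in samples]


# Epoch arithmetic: 16 frames per trial, first and last frame dropped,
# leaving 7 usable presentations of each frame type across 30 trials.
usable_per_frame_type = (16 - 2) // 2
max_epochs = usable_per_frame_type * 30
print(max_epochs)
```

Under these assumptions, the maximum of 210 epochs per frame per condition follows directly from 7 usable presentations × 30 trials.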

# **Results**

# **Behavioral data**

Both the number of trials in which participants reported perceiving the requested motion for at least some part of the trial, as indexed by the first behavioral measure, and the subjective stability of those successful trials, as indexed by the second behavioral measure, were subjected to separate, two-way within-participant ANOVAs constituting the factors of visual intention (clockwise, counter-clockwise, vertical, horizontal, free-view) and audition (sound, silence). All data were analyzed with Greenhouse–Geisser degrees of freedom correction and all significant interactions between factors were further explored using Tukey's HSD test (*p* = 0.05). The ANOVA data are summarized in **Figure 2**. The number of successful trials did not significantly differ across factors, with participants indicating successful perceived motion on approximately 28 of 30 trials per condition: intention main effect [*F*(1.7, 21.7) = 1.32, MSE = 17.23, *p* = 0.284], audition main effect [*F*(1, 13) = 2.01, MSE = 4.60, *p* = 0.179], and intention × audition interaction [*F*(3.0, 39.1) = 1.02, MSE = 1.28, *p* = 0.394]. A significant main effect of intention [*F*(1.8, 23.3) = 18.11, MSE = 2.16, *p* < 0.001] upon perceptual stability was revealed, indicating free-view (4.28) to be significantly more unstable than all other conditions, and clockwise (2.56) to be significantly more unstable than both vertical (1.32) and horizontal (1.50) perceived motion. Neither the main effect of audition [*F*(1, 13) = 2.39, MSE = 0.05, *p* = 0.146] nor the interaction between intention and audition [*F*(2.6, 33.2) = 0.81, MSE = 0.08, *p* = 0.478] reached statistical significance.

**Figure 2 | Group mean (A) trial success and (B) perceptual stability judged on a 7-point scale (1 = extremely stable, 7 = extremely unstable) of five possible interpretations of stroboscopic alternating motion under conditions of concurrent auditory stimulation or silence.** Error bars denote SE.

# **Auditory ERP response**

Auditory evoked responses were generated by calculating the difference wave between silence and sound trials (see **Figure 3A**), revealing maximal activity at fronto-central sites (Fz, FC1, FCz, FC2, Cz). P1 (maximal positivity between 60 and 160 ms post stimulus onset), N1 (maximal negativity between 110 and 210 ms post stimulus onset), and, P2 (maximal positivity between 190 and 290 ms post stimulus onset) peak latencies were sought and mean amplitudes for auditory components were quantified in terms of a 30 ms time window (15 ms either side of peak latency) and analyzed with respect to visual intention only. With respect to P1, N1, and P2, neither peak latencies [*F*(2.7, 35.9) = 0.34, MSE = 428, *p* = 0.885; *F*(3.1, 39.9) = 0.40, MSE = 232, *p* = 0.760; *F*(2.8, 36.8) = 0.65, MSE = 809, *p* = 0.581, respectively] nor mean amplitudes [*F*(3.2, 41.00) = 1.04, MSE = 0.65, *p* = 0.388; *F*(3.1, 32.58) = 0.14, MSE = 0.80, *p* = 0.942; *F*(3.4, 44.8) = 1.91, MSE = 0.62, *p* = 0.134, respectively] differed significantly as a function of visual intention.

# **Visual ERP response**

**Figure 3** shows neural responses to visual stimulation at parietal–occipital electrode sites with (**Figure 3B**) and without (**Figure 3C**) the concurrent presentation of auditory stimulation. C1 (maximal negativity between 25 and 125 ms post stimulus onset), P1 (maximal positivity between 70 and 170 ms post stimulus onset), and N1 (maximal negativity between 130 and 270 ms post stimulus onset) peak latencies were sought across sites showing maximal activity (PO3, POz, PO4, and Oz). Mean amplitudes for C1, P1, and N1 were again quantified according to a 30 ms time window. Visual ERP components were subjected to the same two-way, repeated measures ANOVA as in behavioral analyses, the results of which are shown in **Table 1**. In **Figure 4**, the left hand panels show mean amplitudes across the 10 conditions of interest and the right hand panels show the group correlation between mean amplitude and perceptual stability.

C1 (71 ms) mean amplitude revealed a significant main effect of intention (*p* = 0.001), with the free-view condition generating larger negativity (−0.95 μV) than vertical (−0.31 μV) and horizontal (−0.58 μV) motion, and clockwise (−0.70 μV) and counter-clockwise (−0.66 μV) motion generating larger negativity than vertical motion. Individual correlations between mean amplitude and perceptual stability were calculated for each participant, and the average correlation was significantly different from zero [*r* = −0.45; *t*(13) = 7.17, *p* < 0.001], in that larger negativity was associated with increased perceptual instability. P1 (118 ms) mean amplitude revealed no significant group differences with respect to visual intention or audition, but the average correlation between mean amplitude and perceptual stability was revealed to be significant [*r* = −0.28; *t*(13) = 2.36, *p* = 0.034], indicating that smaller positivity was associated with increased perceptual instability. N1 (190 ms) mean amplitude revealed significant main effects of both intention (*p* = 0.001) and audition (*p* = 0.004). In terms of visual intention, free-view (−1.81 μV) trials generated significantly larger negativity than clockwise (−1.31 μV), counter-clockwise (−1.32 μV), horizontal (−1.06 μV), and vertical (−0.82 μV) intended motion trials. The pairwise comparison between counter-clockwise motion and vertical motion was also significant. Individual correlations between mean amplitude and perceptual stability once again revealed a significant negative correlation [*r* = −0.50; *t*(13) = 5.73, *p* < 0.001], indicating that negativity increased as perceptual instability increased. Visual N1 was also larger under conditions of silence (−1.47 μV) relative to sound presentation (−1.06 μV), possibly due to the spatio-temporal interactions between parietal and occipital visual N1 and fronto-central auditory P2 components (Vidal et al., 2008).
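The individual-level analysis reported above can be sketched in two steps: a Pearson correlation per participant between mean amplitude and rated stability across conditions, followed by a one-sample *t*-test of those correlations against zero. The function names and all numbers below are hypothetical, and testing the raw *r* values (rather than, for example, Fisher-transformed values) is an assumption, since the text does not specify this.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_sample_t(values, popmean=0.0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    t = (mean(values) - popmean) / (stdev(values) / math.sqrt(n))
    return t, n - 1


# Invented example: one participant's amplitudes (microvolts) and stability
# ratings (1 = extremely stable, 7 = extremely unstable) in five conditions.
amplitudes = [-0.3, -0.6, -0.7, -0.7, -1.0]
ratings = [1.0, 2.0, 3.0, 2.5, 4.5]
r_single = pearson_r(amplitudes, ratings)  # negative: more negativity, less stability

# Invented per-participant correlations, tested against zero.
participant_rs = [-0.5, -0.25, -0.75, -0.5]
t_stat, df = one_sample_t(participant_rs)
print(round(r_single, 2), round(t_stat, 2), df)
```

With 14 participants, as in the study, the degrees of freedom would be 13, matching the *t*(13) values reported above.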

# **Discussion**

Electrophysiological and behavioral measures of performance were collected as participants engaged with stroboscopic alternating motion (von Schiller, 1933) under a variety of different intended visual motions, and under conditions of concurrent auditory stimulus presentation or silence. The data supported the idea that stroboscopic alternating motion is a multi-stable rather than bi-stable stimulus, in the revelation of two hitherto unreported interpretations: clockwise and counter-clockwise rotation. Behavioral reports of perceptual stability showed rotational motion to yield less perceptual stability than intended vertical or horizontal motion, but greater perceptual stability than intentionally unstable motion (free-view condition). Early cortical responses to ignored auditory stimuli observed at fronto-central sites did not seem to modulate as a function of visual intention. However, behavioral reports of perceptual stability were correlated with early visual exogenous responses recorded from parietal–occipital sites (C1, P1, and N1) both at a group and individual level, in that increased negativity reflected increased perceptual instability.

In terms of the relative ease with which different motions could be imposed on the stroboscopic display, the current data echoed the previous observation that when vertical and horizontal variation is of an identical physical magnitude, vertical motion is perceived more readily than horizontal motion (Chaudrun and Glaser, 1991; Kaneoke et al., 2009). This was represented by significant pairwise comparisons in the electrophysiological data, in that C1 and N1 amplitudes were reduced for vertical (but not horizontal) trials relative to counter-clockwise trials. While there are a number of possible explanations for this effect (see Chaudrun and Glaser, 1991, for a discussion), it would perhaps be important to rule out the quotidian explanation that since most contemporary experiments into stroboscopic motion take place on landscape oriented computer monitors, vertical motion brings the shapes closer to the boundary of the monitor than does horizontal motion, thereby potentially creating increased perceived distance in the former case. This could easily be achieved by turning the monitor on its side in future investigations.

The current experiment also presents evidence to suggest that two new interpretations of the previously *bi-stable* stroboscopic stimulus are available: namely, clockwise and counter-clockwise motion. The subjective report of perceptual stability suggests that these rotational motions achieved intermediate stability: significantly less stable than either vertical or horizontal motion but significantly more stable than the perception experienced under free-view conditions. Moreover, the difficulties in maintaining rotational motion cannot be attributed to the use of different stimuli or the use of static versus dynamic stimuli, as in previous studies (Windmann et al., 2006; Takei and Nishida, 2010). Further work would help to elucidate the mechanisms involved with rotational motion (cf., Jackson et al., 2008; Liesefeld and Zimmer, 2011) and at least one interpretation of rotational motion is available using a frame-by-frame analysis of the current data. Specifically, it would be possible to evaluate the idea that part of the reason for increased perceptual instability during intended rotational motion is that participants were simply mimicking clockwise and counter-clockwise rotation by recombining vertical and horizontal movements across different frames, thereby instigating a predictable perceptual switch on every trial. With respect to **Figure 1**, clockwise rotation could be mimicked by intending a vertical movement at V1 followed by a horizontal movement at V2, whereas a counter-clockwise rotation could be mimicked by intending a horizontal movement at V1 followed by a vertical movement at V2. If participants generated this pseudo-circular motion by combining vertical and horizontal movement, then a double dissociation should be predicted: the clockwise condition should be more similar to the vertical condition and the counter-clockwise condition should be more similar to the horizontal condition in V1, and these associations should be reversed in V2.

**Table 1 | Summary of separate ANOVAs for peak latency and mean amplitude for visual C1, P1, and N1 components across parietal–occipital (PO3, POz, PO4, Oz) electrodes.**

*I, intention; A, audition. Statistically significant effects in bold.*

With respect to the relationship between resource allocation across visual and auditory processing (Haroush et al., 2009), the data must remain silent as no significant differences in auditory evoked response were observed as a function of visual intention. **Figure 3A** hints at differential auditory P2 activity as a function of rotational versus non-rotational motion, but this failed to reach statistical significance. Support for the independence between visual and auditory resource cannot be based on support of the null hypothesis as in the current experiment, as it is always possible that an effect in audition could be observed by further increasing the difficulty of the visual task (Valtonen et al., 2003). As above, it is perhaps worth considering what a more detailed analysis might reveal in terms of the interactions between auditory and visual stimulation. The auditory stimuli selected in this experiment had weak synesthetic (Martino and Marks, 2001) relations with the visual stimuli on a frame-by-frame basis, in that the vertical positions of the squares were assigned a high or low pitch value, and the horizontal positions of the squares were assigned to left or right headphone presentation (cf., Mudd, 1963). In this regard, concomitant auditory stimulation for V1 (high right square with low left square) was a high pitched tone in the right ear and a low pitched tone in the left ear (A1), whilst for V2 (low right square with high left square) the auditory stimulation was a low pitched tone in the right ear and a high pitched tone in the left ear (A2). One of the reasons for pursuing these frame-by-frame interactions further is because the consistency between audio and visual information is potentially violated on half the trials as a result of an auditory *Octave Illusion* (Deutsch, 1974). 
In short, when participants are asked to report what they hear on the basis of the acoustic presentation described above, a common (although by no means sovereign) perception is of hearing a high pitched tone in the right ear for A1 but a low pitched tone in the left ear for A2. While conflicting accounts of the illusion have been put forward in terms of either suppression or fusion between ears (Chambers et al., 2004; Deutsch, 2004), it is clear that the illusion is only apparent on half the trials (A2). Therefore, in terms of congruency between auditory and visual presentation, V1 + A1 should provide a stronger sense of perceptual unity than V2 + A2. Consequently, an interaction between frame and visual evoked responses in the presence or absence of sound might provide evidence for the octave illusion even when participants are instructed to ignore the acoustic information.

The most critical finding, however, was the provision of clear evidence for the neural expression of perceptual instability. Using perceptual instability as an index of the frequency of perceptual switching in the current design, the data echo the previous findings of amplitude modulation as a function of perceptual switching in a number of other ambiguous figures (Basar-Eroglu et al., 1993; Struber and Herrmann, 2002; Kornmeier and Bach, 2009). Specifically, the data replicate the observation of a relatively late larger negativity for perceptual reversals, but not the observation of a relatively early larger positivity for perceptual reversals (Kornmeier and Bach, 2004, 2005; Pitts et al., 2007). Rather, the degree of perceptual instability across the 10 experimental conditions was characterized by increased negativity throughout the exogenous visual response, beginning around the time of C1 (approximately 70 ms) and returning to baseline around 400 ms after frame onset (see **Figures 3B,C**). Perhaps one reason why the current ERP data show an early and consistent increase in negativity during perceptually unstable trials, rather than both increases in negativity and positivity (Kornmeier and Bach, 2004, 2005), is the averaging protocol. In contrast to previous studies where electrophysiological analyses were compared between times at which participants report an intentional (or spontaneous) switch in perception compared to when the percept is stable (e.g., Kornmeier and Bach, 2009), the current analysis utilized an aggregated measure of performance whereby all frames within the trial (apart from the first V1 or V1 + A1 frame and the last V2 or V2 + A2 frame) were considered to increase signal-to-noise ratio. Therefore, it is possible that the significant modulations in neural responses reported above are the result of both sustained and transient evoked responses, although a purely sustained response account would struggle to argue for why significant neural modulation was observed at some time points and not others. Nevertheless, the recent work of Liesefeld and Zimmer (2011) provides one example of a sustained evoked response that behaves in a manner analogous to the transient evoked responses observed here. In the context of mental rotation, these researchers showed that a negative-going, parietal slow-wave component increased as the amount of rotation increased, and also increased for counter-clockwise relative to clockwise rotational requests. The association between increased negativity and increased task complexity in their data provides a useful link with the association between increased negativity and increased perceptual instability in the current study. In order to tease apart the contributions of sustained and transient evoked response in future work, the comparison between longer and shorter epochs within the same paradigm would be of particular benefit.

A final set of issues revolves around the observation that perceptual changes can be initiated under a number of conditions, such as the induction of change by presenting disambiguated versions of the image (e.g., Sterzer and Kleinschmidt, 2006; Takei and Nishida, 2010), the intention to change as a function of task demands (e.g., Kohler et al., 2008), and the spontaneous experience of change as a result of neural fatigue, attentional diversion, rate of stimulus delivery, and eye movements (e.g., Baker and Graf, 2010). Consequently, there may be subtle differences between conditions that produce perceptual instability as a result of the *failure of stable intention* (i.e., vertical, horizontal, clockwise, counter-clockwise) and conditions that produce perceptual instability as a result of the *success of unstable intention* (i.e., free-view). Although the current data suggest that neural responses during free-view trials were only quantitatively rather than qualitatively different from all other trials, these subtly different styles of processing may reveal different patterns of neural performance, perhaps in pre-frontal areas implicated in holding particular perceptual interpretations in mind (Windmann et al., 2006). Moreover, it is also important to acknowledge variability in the ratings of perceptual stability between participants and consider how certain interpretations of the external world become more or less likely as a function of long-term experience. Distinct neural differences between visual artists and non-artists (e.g., Bhattacharya and Petsche, 2005; Kottlow et al., 2011), and between musicians and non-musicians (e.g., Ohnishi et al., 2001; Schlaug, 2001), have been reported, and it is a matter for future work as to how such sensory expertise might modulate the experience of ambiguous or multi-stable stimuli.

In sum, it would appear that the more ambiguous the sensory information, the more accentuated the brain's response to it. This current observation made with stroboscopic alternating motion (von Schiller, 1933) appears to be a characteristic of some (but not all; Pitts et al., 2007) bi- or multi-stable stimuli tested in the laboratory (Windmann et al., 2006). It remains a question for future research whether the relationship between perceptual report and neural activity also applies to the sensory stimuli to which esthetic value is ascribed. For example, Ishai et al. (2007; see also Pepperell, 2011) discuss the consequences of examining indeterminate art and note that, relative to paintings containing recognizable objects, indeterminate paintings yielded longer response latencies (see also Dyson and Cohen, 2010). Therefore, ambiguity within artwork may have the net result of facilitating viewer or listener engagement, and it is interesting to consider how these effects might have their origins in neural activity. Further studies that integrate both brain and behavior metrics of esthetic reaction will be ideally positioned to evaluate how early neural responses might predict one's eventual reaction to artwork. Interacting with ambiguity then may provide a natural, neurological route to promoting deeper and richer reactions to the world (Zeki, 2006). As always, it will be the combination of bottom-up and top-down factors that generates the final perceptual experience: a carefully crafted ambiguous percept on the part of the artist combined with a willingness to invoke ambiguity on the part of the perceiver may serve to engage the brain to the fullest extent. The likelihood that the viewer or listener has an esthetic response to their sensory world may then begin with an exaggerated cortical response to it.

# **Author's Note**

Benjamin J. Dyson is supported by an Early Researcher Award granted by the Ontario Ministry of Research and Innovation. The data were presented at the 18th Annual Meeting of the Cognitive Neuroscience Society, 2nd–5th April, San Francisco, USA. The author would like to thank Idan Segev, Lutz Jäncke and Burkhard Pleger for comments on earlier versions of the manuscript.




**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 February 2011; paper pending published: 07 June 2011; accepted: 18 July 2011; published online: 23 August 2011. Citation: Dyson BJ (2011) The advantage of ambiguity? Enhanced neural responses to multi-stable percepts correlate with the degree of perceived instability. Front. Hum. Neurosci. 5:73. doi: 10.3389/fnhum.2011.00073*

*Copyright © 2011 Dyson. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **THE RIGHT HEMISPHERE IN ESTHETIC PERCEPTION**

*Bianca Bromberger, Rebecca Sternschein, Page Widick, William Smith II and Anjan Chatterjee\**

*Department of Neurology, The University of Pennsylvania, Philadelphia, PA, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Juliana Yordanova, Bulgarian Academy of Sciences, Bulgaria*

*Bernd Weber, Rheinische-Friedrich-Wilhelms Universität, Germany*

#### *\*Correspondence:*

*Anjan Chatterjee, Department of Neurology, The University of Pennsylvania, 3 West Gates, 3400 Spruce Street, Philadelphia, PA 19104, USA. e-mail: anjan@mail.med.upenn.edu*

Little is known about the neuropsychology of art perception and evaluation. Most neuropsychological approaches to art have focused on art production and have been anecdotal and qualitative. The field is in desperate need of quantitative methods if it is to advance. Here, we combine a quantitative approach to the assessment of art with modern voxel-lesion-symptom-mapping methods to determine brain–behavior relationships in art perception. We hypothesized that perception of different attributes of art is likely to be disrupted by damage to different regions of the brain. Twenty participants with right hemisphere damage were given the Assessment of Art Attributes, which is designed to quantify judgments of descriptive attributes of visual art. Each participant rated 24 paintings on 6 conceptual attributes (depictive accuracy, abstractness, emotion, symbolism, realism, and animacy) and 6 perceptual attributes (depth, color temperature, color saturation, balance, stroke, and simplicity) and their interest in and preference for these paintings. Deviation scores were obtained for each brain-damaged participant for each attribute based on correlations with group average ratings from 30 age-matched healthy participants. Right hemisphere damage affected participants' judgments of abstractness, accuracy, and stroke quality. Damage to areas within different parts of the frontal, parietal, and lateral temporal cortices produced deviation in judgments in four of six conceptual attributes (abstractness, symbolism, realism, and animacy). Of the formal attributes, only depth was affected by inferior prefrontal damage. No areas of brain damage were associated with deviations in interestingness or preference judgments. The perception of conceptual and formal attributes in artwork may in part dissociate from each other and from evaluative judgments. More generally, this approach demonstrates the feasibility of quantitative approaches to the neuropsychology of art.

#### **Keywords: aesthetics, brain damage, neuropsychology, neuroesthetics**

# **INTRODUCTION**

Historically, art and esthetics have been well ensconced in the humanities and have not been considered seriously within the sciences. Fechner (1876) began the field of empirical esthetics. More than a century later, neuroscience is playing catch-up, and is finally coming of age (Skov and Vartanian, 2009; Chatterjee, 2011). Theoretical positions and a few books linking neuroscience to art have appeared (Ramachandran and Hirstein, 1999; Zeki, 1999; Livingstone, 2002; Chatterjee, 2004a). Empirical studies using imaging techniques looking at our responses to beauty (Aharon et al., 2001; Ishai, 2007; Winston et al., 2007; Chatterjee et al., 2009) as well as to different kinds of artwork (Kawabata and Zeki, 2004; Vartanian and Goel, 2004; Jacobsen et al., 2005; Ishai et al., 2007; Cela-Conde et al., 2009) are being published. Recent conferences devoted to art and neuroscience (Nadal and Pearce, 2011) attest to the growing interest in the biology of esthetics. In this paper, we examine the state of one important aspect of neuroesthetics, the neuropsychology of art (Chatterjee, 2004b; Bogousslavsky and Boller, 2005; Zaidel, 2005). We outline reasons that this aspect of neuroesthetics has been relatively undeveloped and report our initial attempts to rectify this situation.

Since the late nineteenth century, much of our knowledge of the brain bases of cognitive and affective functions has been derived from observations of people with brain damage. From close clinical observations made by Broca, Wernicke, Lichtheim, Lissauer, and Liepmann, the basic tenets of the biology of language, visual semantics, and motor control were established. Over the twentieth century, cognitive neurology and neuropsychology benefited from methods of experimental psychology. Our understanding of memory, emotional processing, decision-making and virtually every domain of cognition advanced from analysis of patients with brain damage. Despite the recent ascendency of functional neuroimaging, the inferential power of lesion studies is robust (Chatterjee, 2005; Fellows et al., 2006). Yet, relatively little about the neuropsychology of art is known.

Many have made observations of the kind of art produced by people with neurologic disease (Bogousslavsky and Boller, 2005; Zaidel, 2005). These observations are made with the hope of inferring the neural bases for artistic production from its derangement by brain damage (Chatterjee, 2006). Despite the fact that such observations date back at least to the 1940s (Alajouanine, 1948), the field has not matured (Chatterjee, 2009). Artists with brain damage who continue to produce a body of work are rare, and it is difficult, if not impossible, to conduct large-scale group studies of artistic production. Most reports describe anecdotal observations and offer a few art examples from which inferences are drawn.

Thus, we are left with a collection of anecdotes that are fascinating by themselves, but do not contribute to a comprehensive understanding of the systems involved, or any formal tests of hypotheses. A critical obstacle to advancing this work is the lack of quantitative measures. How do we quantify a work of art? Doing so is critical if we are to measure change. How can we assess change if we do not know what is changing and cannot reliably measure this change?

To address this issue of how to measure change in artwork, we developed a test called the assessment of art attributes (AAA; Chatterjee et al., 2010). We designed the AAA keeping in mind the need for componential analysis and quantification in the neuropsychology of art. The AAA is based on the widely held assumption that artworks have formal-perceptual qualities and content-conceptual qualities (Russell and George, 1990; Woods, 1991). We selected six formal-perceptual attributes and six content-conceptual attributes based on a review of the literature with special consideration to the kinds of attributes thought to have changed in individuals with brain damage. The formal-perceptual attributes correspond to early and intermediate visual processing. They are: Color temperature (warm–cold), Color saturation (calm–vibrant), Stroke style (controlled–loose), Depth (flat–deep), Balance (low–high), and Complexity (simple–complex). The content-conceptual attributes correspond to higher/late visual processing and its contact with other domains, like semantics and emotional systems. They are: Representational accuracy (less–more), Abstractness (less–more), Realism (less–more), Animacy (less–more), Symbolism (less–more), and Emotionality (less–more). We familiarize each participant with each attribute. Their assessments are made using a Likert scale, giving quantitative form to these descriptive attributes. The paintings in the AAA were selected from the Western canon, covering different time periods. A well-known artist created each painting to ensure reasonable esthetic quality in our stimuli. However, the selected paintings were not the artists' most popular works (e.g., Hopper's *Nighthawks*) that might be familiar to even artistically naïve participants.

We have shown that the AAA can be used to assess systematic change in the art produced by people with neurological disease. Using the AAA (Smith et al., 2011), we reported that in patients with left or right focal brain damage, art becomes more abstract, distorted, and less realistic. The paintings are also produced with looser strokes, less depth, and more vibrant colors. Notably, art produced following left brain damage becomes more symbolic, a change not seen in right brain damage. By contrast, the paintings of people with Alzheimer's Disease became more abstract and symbolic and less realistic and depictively accurate (van Buren et al., 2010).

If our understanding of the nature of artistic production following brain damage has been rudimentary, our knowledge of the effects of brain damage on artistic perception is virtually nonexistent. Based on extant neuropsychological (Chatterjee, 2004b) and functional neuroimaging (Brown et al., 2011) observations, it is unlikely that we evolved perceptual and semantic representations and emotional neural systems designed uniquely for esthetic experiences. Rather, particular combinations of regional activations dedicated to general perceptions and emotions give rise to esthetic experiences. The experience of looking at and appreciating visual art likely relies on a diverse set of perceptual and cognitive processes (Chatterjee, 2004a; Leder et al., 2004; Nadal et al., 2008). From admiring the precision of a portrait to responding to the emotional resonance of a landscape, art requires the viewer to perceive many attributes while also forming judgments of liking and interest. We know little about the areas of the brain that are responsible for the perception and evaluation of visual art. The question of how to adequately quantify deviations in perception applies in the same way that it does to deviations in production. Here, we show that the AAA can be used to assess brain–behavior relationships in art perception.

We focus our investigation on the role of the right hemisphere. The right hemisphere participates prominently in visual spatial attention and representation (Heilman et al., 1993; Chatterjee, 2003). Despite limited evidence for the popular view, the right hemisphere is often considered the artistic hemisphere. For our initial attempts to investigate the neural correlates of art perception, we chose to focus on patients with right hemisphere damage to avoid confounding language comprehension with judgment in our study. For example, if a participant does not understand what the word "symbolic" means, it would be difficult to assess their judgment of symbolism in any painting. We limited our investigation to the perceptual abilities of artistically inexperienced or "naïve" brain-damaged participants, given that artistically experienced individuals may judge art differently (Cupchik and Gebotys, 1988; Hekkert and Van Wieringen, 1996; Chatterjee et al., 2010). In principle, the same study could be conducted in patients with expertise in art. In practice, such patients are less common in our population. Finally, we use contemporary lesion analysis methods in our study. Voxel-lesion-symptom mapping (VLSM) techniques allow us to formally assess the way in which damage to a brain area correlates with behavioral scores (Bates et al., 2003; Kimberg et al., 2007; Wu et al., 2007), with the advantage that one does not have to establish a deficit cut-off. Rather, behavior in VLSM is considered a continuous variable.

To summarize, neuropsychology has historically been an important aspect of cognitive neuroscience. Yet, the neuropsychology of art has been relatively underdeveloped. In our view, an important reason for this lack of development has been the lack of quantitative methods. To rectify this problem, we developed the AAA to quantify artistic attributes. We also use modern lesion analysis techniques to establish brain–behavior relationships.

# **MATERIALS AND METHODS**

# **PARTICIPANTS**

Twenty individuals with damage to their right hemisphere from stroke (mean age 58.7, 5 men, 15 women) and 30 age-matched healthy controls (mean age 58.8, 9 men, 21 women) participated in the study. Subjects were recruited from the Focal Brain Lesion Database at the Center for Cognitive Neuroscience at the University of Pennsylvania.

# **PRE-TEST SCREENING**

Participants with brain damage were given a set of visual tests including a shape detection test, a dot counting test, and a position discrimination test from the visual object and spatial perception battery (VOSP; Warrington and James, 1991), as well as an Ishihara test for colorblindness and a Grayscales test (Mattingley et al., 2004) for right–left bias. They were also given basic background neuropsychological screening tests (see **Table 1**).

Participants completed a questionnaire that indicated their experience with visual art. Items assessed included the number of art and art history classes taken, frequency of museum and gallery visits, and time spent making or reading about visual art. Based on previous data (Chatterjee et al., 2010), only participants with a score of less than 14 were deemed artistically naïve and were included in the study. Four patients did not meet this criterion and were not included in the analysis reported here.

# **ASSESSMENT OF ART ATTRIBUTES SCALE**

We used the AAA battery to obtain a quantitative measure of individuals' abilities to judge perceptual and conceptual qualities of visual art. The AAA measures one's ability to perceive 12 different attributes (6 formal and 6 conceptual) of visual art (Chatterjee et al., 2010). Participants rate 24 images of paintings from the Western art historical canon (**Table 2**) on each of the 12 scales on a 5-point Likert scale. Before beginning the AAA battery, participants look at each image to orient themselves to the range of styles of paintings they would be rating. In order to define the dimensions of each attribute scale, participants first see a training slide and two example images that illustrate the extremes of the scale. Participants are allowed to ask clarification questions before proceeding. Participants then rate each of the images of paintings on a 5-point Likert scale. Images are presented in random order and no time limit is imposed. After all images have been rated on the 12 formal and conceptual qualities, participants then evaluate the paintings. They rate each painting for preference and for interest on a 5-point Likert scale.

#### **Table 1 | Patient demographics and screening data.**

# **LESION DATA**

Every patient's lesion was drawn on a standard brain template ("Colin 27" from the MNI) by one of two senior neurologists. Using MRICron, lesions were defined with respect to anatomically defined structures (e.g., inferior frontal gyrus, angular gyrus, etc.) as well as Brodmann areas using Automated Anatomical Labeling and Brodmann Areas maps available in the MRICro software package (Rorden and Karnath, 2004). VLSM correlations were assessed by regressing behavioral scores on lesion status scores across subjects independently for each voxel. Only voxels that were damaged in at least two participants were analyzed, using a false discovery rate of 0.01 (**Figure 1**).
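The voxel-wise procedure described above can be illustrated with a minimal sketch. It is not the software used in the study. It assumes binary lesion maps flattened to one list per subject, and because the text does not state how voxel-wise p-values were obtained, it substitutes a simple permutation test before applying a Benjamini-Hochberg false discovery rate threshold. All function names and parameters (`voxel_slopes`, `permutation_pvalues`, `benjamini_hochberg`, `n_perm`) are hypothetical.

```python
import random


def voxel_slopes(lesions, scores, min_lesioned=2):
    """Regress scores on binary lesion status, one voxel at a time.

    lesions: one list per subject (1 = voxel damaged, 0 = spared).
    scores:  one behavioral score per subject.
    Returns a slope per voxel, or None where fewer than `min_lesioned`
    subjects are damaged (or no subject is spared) at that voxel.
    """
    n_subjects, n_voxels = len(lesions), len(lesions[0])
    slopes = []
    for v in range(n_voxels):
        hit = [scores[s] for s in range(n_subjects) if lesions[s][v] == 1]
        spared = [scores[s] for s in range(n_subjects) if lesions[s][v] == 0]
        if len(hit) < min_lesioned or not spared:
            slopes.append(None)
            continue
        # With a 0/1 predictor the least-squares slope is a difference of means.
        slopes.append(sum(hit) / len(hit) - sum(spared) / len(spared))
    return slopes


def permutation_pvalues(lesions, scores, n_perm=1000, min_lesioned=2, seed=0):
    """Two-sided permutation p-value per analyzable voxel (None elsewhere).

    Assumption: the permutation test stands in for whatever parametric
    test the original analysis used.
    """
    rng = random.Random(seed)
    observed = voxel_slopes(lesions, scores, min_lesioned)
    exceed = [0] * len(observed)
    shuffled = list(scores)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted = voxel_slopes(lesions, shuffled, min_lesioned)
        for v, obs in enumerate(observed):
            if obs is not None and abs(permuted[v]) >= abs(obs):
                exceed[v] += 1
    return [None if obs is None else (exceed[v] + 1) / (n_perm + 1)
            for v, obs in enumerate(observed)]


def benjamini_hochberg(pvals, q=0.01):
    """Flag p-values that survive Benjamini-Hochberg FDR control at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            cutoff_rank = rank  # keep the largest rank meeting the criterion
    keep = set(order[:cutoff_rank])
    return [i in keep for i in range(m)]
```

In use, the p-values of the analyzable voxels (those that are not `None`) would be passed to `benjamini_hochberg` with `q=0.01`, matching the false discovery rate reported in the text.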

#### **ANALYSIS AND RESULTS**

The group results of the screening tasks are shown in **Table 1**. In general, patients did well on these tasks. Data from age and art-experience matched control participants were used to develop a baseline measure of esthetic perception and evaluation. For each scale, the 24 paintings were assigned a unique rank order based on their average rating by control participants. Then, individual control participants' scores were correlated with the rank order for each scale using Spearman's Rho, as a measure of non-parametric correlation. If an individual's Rho statistic fell two SD below the average Rho on a particular scale, their ratings were not used in establishing the rank order of paintings for that attribute. See Chatterjee et al. (2010) for details of these procedures.

*\*Grayscales test for left–right bias (*+*1* = *rightward bias,* −*1* = *leftward bias).*

#### **Table 2 | List of paintings used in the AAA.**

| Artist | Painting |
| --- | --- |
| Vermeer | "The Letter" |
| Holbein | "Portrait of Dirk Tybis" |
| Hopper | "The Gas Station" |
| Pollock | "Number One" |
| Henri | "Laughing Child" |
| Garsia | "Apocalypse of Saint-Sever" |
| Cassatt | "Self Portrait" |
| Heda | "Still Life With Oysters, Rum Glass, and Silver Cup" |
| Brueghel | "Netherlandish Proverbs" |
| Kahlo | "Two Fridas" |
| Dalí | "Gala and Tigers" |
| Newman | "Eve" |
| Cassatt | "On the Balcony During Carnival" |
| Matisse | "The Blue Room" |
| Van Eyck | "Man in a Turban" |
| Cézanne | "Still Life with Kettle" |
| Rothko | "Red and Orange" |
| De Kooning | "Woman" |
| Buoninsegna | "Virgin and Child Enthroned" |
| Picasso | "Reclining Nude" |
| Pissarro | "Landscape with Flooded Fields" |
| Dewing | "The Piano" |
| Eakins | "The Gross Clinic" |
| Matisse | "Seated Riffian" |

The results of this analysis demonstrate the reliability of the AAA. Controls tended to have high average Spearman's Rho and low SE for each of the 12 attribute scales and interest and preference ratings (**Table 3**).

Once the rank order of paintings was established for each of the 12 scales, we determined the degree to which right hemisphere damaged individuals' esthetic perceptual abilities deviated from normal. Using Spearman's Rho, we correlated each brain-damaged individual's ratings on each attribute with the rank order determined by the group control data. Then, for each attribute, the individual's Rho was subtracted from the average Rho of the controls for a difference score. This difference score reflects the degree to which a brain-damaged participant's ratings of a particular attribute differed from the average ratings of the normal control participants (**Table 3**).
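The scoring steps for a single attribute can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: ties are given average ranks (the text specifies a unique rank order but not a tie-breaking rule), the two-SD exclusion uses the sample standard deviation, and every function name is hypothetical. A participant who gives all paintings the same rating would have an undefined Rho and is not handled here.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's Rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def control_baseline(control_ratings):
    """Rank paintings by mean control rating; return ranks and each control's Rho.

    control_ratings: one list of per-painting ratings per control participant.
    """
    n_paintings = len(control_ratings[0])
    painting_means = [mean(r[p] for r in control_ratings)
                      for p in range(n_paintings)]
    reference = average_ranks(painting_means)
    rhos = [spearman_rho(r, reference) for r in control_ratings]
    return reference, rhos


def low_agreement_controls(rhos, n_sd=2.0):
    """Indices of controls whose Rho falls more than n_sd SDs below the mean."""
    cutoff = mean(rhos) - n_sd * stdev(rhos)
    return [i for i, r in enumerate(rhos) if r < cutoff]


def deviation_score(patient_ratings, reference, control_rhos):
    """Average control Rho minus the patient's Rho for one attribute."""
    return mean(control_rhos) - spearman_rho(patient_ratings, reference)
```

A larger deviation score means the patient's ordering of the paintings on that attribute departs further from the control consensus. Recomputing the reference ranks after dropping any controls flagged by `low_agreement_controls` is left to the caller.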

We were interested in querying specific possible relationships between our screening tasks and the patients' judgments of different attributes. Specifically, could color perception as measured by the performance of the Ishihara plates account for deviations in judgment of either hue or tone? These correlations were not significant. Similarly, could performance on low-level perceptual tests from the VOSP account for performance on other perceptual judgments? We found a correlation between performance on shape detection and deviations on the attribute of simplicity (*r* = 0.51, *p* < 0.05).

To establish brain–behavior correlations of esthetic judgment and evaluation, two analyses were conducted. The first analysis investigated whether right hemisphere damage in general produced specific impairments in esthetic perception. The second analysis tested for more specific locations within the right hemisphere that were likely to produce impairments in esthetic perception. We should be clear that the results of the first analysis are not predictive of the second. For example, as a group, the brain-damaged individuals might be at floor performance on a specific attribute and would show group effects when compared to control subjects. Because floor performances would mean relatively little variance in performance, that behavior would probably not correlate with variance in lesion locations in a VLSM analysis. In such a scenario, the attribute is likely to be associated with brain regions in a non-linear manner, sensitive to disruption in different possible areas. The converse is also possible. For example, if right inferior parietal damage were critical in apprehending a specific attribute, participants with such damage would have impaired performances and participants sparing this location would not. As a group, the right hemisphere damage group might not be statistically different than control participants, because many of the patients perform normally. However, in this case, VLSM analysis would reveal specific brain–behavior relationships.

*\*denotes attributes in which deviation of patients is significantly different at p* < *0.003.*

For the first analysis, investigating whether right hemisphere damage in general produced specific impairments in esthetic perception, we conducted *t*-tests to test whether the group deviations were significantly different than the mean scores of the control participants, controlling for multiple comparisons at a significance level of *p* < 0.003. We found that the patients as a group were impaired in judging the content-conceptual attributes of abstractness and depictive accuracy and the formal-perceptual attribute of stroke quality.

For the second analysis, testing whether specific locations within the right hemisphere when damaged are likely to produce impairments in esthetic perception, we took the distribution of deviation scores and assessed whether the difference scores obtained for each attribute correlated with location of brain damage (**Table 4**) with a false discovery rate of 0.01 (**Figure 2**). With regard to the content-conceptual attributes, damage to the right frontal lobe, especially the inferior frontal gyrus, was associated with deviations in judgments of abstractness, realism, and animacy. Deviation in symbolism was associated with damage to posterolateral temporal cortex, especially the superior temporal gyrus. In addition, damage to the right parietal lobe was related to deviations in judgments of animacy. For the formal-perceptual attributes, damage to regions of the right temporal and frontal lobes as well as right insula was associated with deviations in perception of depth. Deviations in judgments of interest and preference were not associated with any specific regions of damage.

# **DISCUSSION**

Our study examined how brain damage affects the perception and evaluation of art. We were motivated to demonstrate that quantitative approaches in the neuropsychology of art are feasible. Our study is only a first step in this direction. In what follows, we shall mention the advantages of this approach and outline our results. We then discuss some limits of the study, and how the field might move forward.

The results demonstrate that the neuropsychology of art can be investigated in a systematic and quantitative manner. We have shown previously that art production can be approached quantitatively (Smith et al., 2011). Now, we extend this approach to art perception. Quantitative approaches have the advantage of allowing formal tests of hypotheses and replication. These advantages do not denigrate the qualitative insights one might derive from careful observation and theoretical analyses of art. However, it is hard to see how the neuropsychology of art could mature as a science without quantification (Chatterjee, 2009).

Our study incorporates quantification in two ways. First is the use of the AAA. This assessment allows quantification of specific attributes of any artwork (Chatterjee et al., 2010). There is nothing about the assessment that restricts its use to neuropsychology. The assessment could just as easily be used for other purposes, such as to compare the work of different artists or to assess the nature of change in any given artist's style over time. Second is our use of VLSM techniques (Bates et al., 2003). This method represents a general advance in lesion analyses and is being applied to the perception of art for the first time.

Our basic finding is that damage to the right hemisphere can affect the perception of selective aspects of art (see **Table 4**). This cohort of patients as a group had impaired performance when judging the content-conceptual attributes of abstractness and depictive accuracy and the formal-perceptual attribute of stroke quality. We also found that damage to lateral frontal–parietal–temporal cortices was associated with deviations in the judgment of 4/6 content-conceptual art attributes: abstractness, symbolism, realism, and animacy. Of the formal-perceptual attributes, only depth was correlated with damage to the inferior prefrontal cortex. The fact that the patients as a group were impaired in judging depictive accuracy and stroke quality, and that these attributes did not show specific brain–behavior correlations, suggests the following hypothesis. Judging the attributes of depictive accuracy and stroke quality may be especially vulnerable to right brain damage in different locations, and these attributes may be instantiated nonlinearly in the brain. No brain area was associated with deviations in judgments in evaluating preference or interestingness of these artworks.

Elsewhere, we have argued that any esthetic experience is built upon at least three components (Chatterjee, 2004a, 2011). These components are the experiences of the sensory qualities, the associated sets of meanings, and the emotional responses evoked by the esthetic object. Broadly, one might regard the formal-perceptual attributes of the AAA as probing the sensory experience, the content-conceptual attributes as probing the meaning, and the evaluative questions as probing the emotional response to these paintings. From our data, we would tentatively propose that these three components of visual esthetic experiences segregate broadly in the organization of the brain. Most of our participants had damage in the distribution of the right middle cerebral artery. This distribution of brain damage involving lateral frontal, parietal, and temporal cortices was more likely to affect judgments of conceptual attributes. We would predict that damage in the posterior cerebral artery distribution affecting ventral occipital and temporal cortices might be more likely to affect perceptual attributes. Furthermore, given the extensive data implicating the ventral striatum and orbitofrontal cortex in assigning subjective reward values (Kable and Glimcher, 2009), we would predict that damage to ventro-medial prefrontal cortices would be more likely to affect people's evaluation of paintings.

We should be clear about the limits of this study. A general limit is that we have relatively little experimental control over the ways that broad cultural and sociological factors might contribute to how people apprehend art. One would expect that cultural factors would be more likely to produce variance in judgments of content-conceptual attributes than formal-perceptual attributes (Chatterjee, 2002). Yet, we note that the content-conceptual attributes were more likely to be disrupted than the formal-perceptual attributes in this study, suggesting that the role of cultural factors in this assessment was not sufficient to obscure the effects of brain damage. However, future studies that address both cultural and biological variables will be needed to provide a rich understanding of art apprehension.

Another specific limit of this study is the sampling of brain regions. While 20 participants is a relatively large group of brain-damaged subjects, as mentioned above, we did not sample the ventral occipito-temporal or ventro-medial frontal cortices. We have studies currently underway to probe these areas. Another limit is that we restricted ourselves to people with right brain damage. Since this is the first study of its kind, we did not wish to confound our results with concomitant language comprehension deficits that follow from left brain damage. However, we have shown that art production can be profoundly affected by left brain damage (Smith et al., 2011). Given that production and perception must overlap at some representational levels, we would predict that left brain damage would also affect art perception. Again, future studies will need to sort out the role of the left hemisphere in art perception.

**FIGURE 2 | Results of the voxel lesion symptom mapping analyses showing areas where damage was associated with significant deviations of aesthetic attribute judgments.**

Finally, we recognize that the description and evaluation of art are qualitatively different. People are more likely to agree about whether or not an image has warm tones than to agree about whether or not the image is appealing. This difference is evident even in our healthy participants, in whom agreement on preference was lower than it was for any of the descriptive scales. Given that the evaluation of artwork is less stable than descriptive judgments, assessing the effects of brain damage on preference is also more difficult. The problem of separating variance inherent in individual differences from that produced by the effects of brain damage remains a methodological challenge.

To summarize, we believe that neuropsychology will play an important role in advancing neuroesthetics. However, to date most neuropsychological reports related to art have been anecdotal and qualitative in nature. For this field to mature as a science, we advocate the use of quantitative methods. Here, we offer one approach that uses both quantitative behavioral and lesion analyses examining the role of the right hemisphere in art perception and evaluation.

# **REFERENCES**

beauty. *Proc. Natl. Acad. Sci. U.S.A.* 106, 3847–3852.


an emerging field. *Brain Cogn.* 76, 172–183.


Brain systems for assessing facial attractiveness. *Neuropsychologia* 45, 195–206.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 March 2011; accepted: 14 September 2011; published online: 14 October 2011.*

*Citation: Bromberger B, Sternschein R, Widick P, Smith W II and Chatterjee A (2011) The right hemisphere in esthetic perception. Front. Hum. Neurosci. 5:109. doi: 10.3389/fnhum.2011.00109*

*Copyright © 2011 Bromberger, Sternschein, Widick, Smith and Chatterjee. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **HOW DO WE SEE ART: AN EYE-TRACKER STUDY**

**Rodrigo Quian Quiroga and Carlos Pedreira**

# How do we see art: an eye-tracker study

# *Rodrigo Quian Quiroga<sup>1,2</sup>\*† and Carlos Pedreira<sup>1</sup>†*

*<sup>1</sup> Department of Engineering, University of Leicester, Leicester, UK*

*<sup>2</sup> Leibniz Institute for Neurobiology, University of Magdeburg, Germany*

#### *Edited by:*

*Luis M. Martinez, Instituto de Neurociencias de Alicante, CSIC-Universidad Miguel Hernandez, Spain*

#### *Reviewed by:*

*Mariano Sigman, Universidad de Buenos Aires, Argentina*

*Luis M. Martinez, Instituto de Neurociencias de Alicante, CSIC-Universidad Miguel Hernandez, Spain*

#### *\*Correspondence:*

*Rodrigo Quian Quiroga, Department of Engineering, University of Leicester, LE1 7RH, Leicester, UK. e-mail: rqqg1@le.ac.uk*

*†Rodrigo Quian Quiroga and Carlos Pedreira have contributed equally to this work.*

We describe the pattern of fixations of subjects looking at figurative and abstract paintings from different artists (Molina, Mondrian, Rembrandt, della Francesca) and at modified versions in which different aspects of these art pieces were altered with simple digital manipulations. We show that the fixations of the subjects followed some general common principles (e.g., being attracted to salient regions) but with a large variability for the figurative paintings, according to the subject's personal appreciation and knowledge. In particular, we found different gazing patterns depending on whether the subject saw the original or the modified version of the painting first. We conclude that the study of gazing patterns obtained with eye-tracker technology gives a useful approach to quantify how subjects observe art.

**Keywords: eye-tracker, art, mondrian, rembrandt, visual perception**

# **INTRODUCTION**

What principles does our brain use to create an exquisite representation of the external world, one that allows us to recognize a face, reach for an object, or appreciate a piece of art? Scientists have been dealing with this type of question for quite some time, but the root of the query on how we perceive goes back to ancient philosophy. In fact, it was more than two millennia ago that Aristotle, the brilliant Greek philosopher, noted that our minds create images, internal representations of the external world, which we use for our thought (Aristotle, De Anima; 431a, 431b). Later on, in the nineteenth century, Hermann von Helmholtz developed this idea further and argued that perception involves unconscious inferences from the incomplete information we get from the different senses (von Helmholtz, 1866; Gregory, 1997, 1998). The process of seeing is, therefore, far from a reproduction of the images impinging on the retina. It is rather the result of our unique interpretation of ambiguous sensory information. The most notable confirmation of this statement is given by the existence of visual illusions, where assumptions made by our brain can trick or bias our perception, even if we are fully aware that this is happening (Gregory, 1997, 1998; Eagleman, 2001).

Visual information is converted into neural firing patterns in the retina and further processed by the cerebral cortex (Kandel et al., 2000). In cortex, the process of visual perception (i.e., extracting and inferring relevant features of what we see) starts in the primary visual area (V1), in the back of the head, and continues along the ventral visual pathway, going up to the infero-temporal cortex (IT; Logothetis and Sheinberg, 1996; Tanaka, 1996). Although much remains to be understood, after decades of research with single cell recordings in monkeys and fMRI studies in humans, converging evidence has shown that segregated areas in this processing pathway represent different aspects of the visual stimulus, from the encoding of local orientations in V1 (Hubel and Wiesel, 1962), to a more complex representation of faces in IT (Gross et al., 1969), and even concepts in higher, memory related areas (Quian Quiroga et al., 2005). Using both fMRI and single cell recordings, different studies with ambiguous or alternating percepts have clearly shown the subjectivity of perception. These studies linked the neuronal firing to the conscious perception by the subjects, which is triggered by the external stimulus but is dominated by the subjects' own internal representations (Logothetis, 1998; Leopold and Logothetis, 1999; Kreiman et al., 2002; Quian Quiroga et al., 2008).

One of the most subjective perceptual experiences is given by art, and it is, perhaps, this unique and highly variable personal experience that makes art so attractive. But in spite of this subjectivity, the perception of, for example, a self-portrait of van Gogh is ruled by similar visual perception principles as those involved in recognizing a familiar face, a cat, or an apple. We may then ask what processes in our brain make art so special and, most importantly, how we can scientifically start addressing this question. Such studies require the interaction between art and science, two fields that, with few notable exceptions, as in the case of the genius Leonardo da Vinci, have grown in parallel with only a few interactions. In spite of the impact that the scientific study of art could have, it is somewhat understandable that such an enterprise is only starting to take off (for pioneering studies see Zeki and Lam, 1994; Gregory et al., 1995; Ramachandran and Hirstein, 1999; Zeki, 1999; Livingstone, 2002; Cavanagh, 2005). On the one hand, art perception is too subjective and challenging for rigorous scientific exploration. On the other hand, artists may fear that scientists could bring a misleading reductionism that would oversimplify all the aspects involved in the appreciation of art.

In an attempt to bridge these two fields, i.e., using scientific methods to study art, in this study we used eye-tracking technology to quantify how subjects look at different art pieces. Far from taking a reductionist approach, we try to give further insights into how subjects "see art" according to their particular gazing patterns. Besides finding general common principles, we also observed a large variability according to the subjects' previous knowledge and preferences.

# **MATERIALS AND METHODS**

# **SUBJECTS AND STIMULI**

Eye-tracking recordings were done in 10 subjects (six male, four female; age 23–34) sitting comfortably in front of a 24-inch computer screen, at a distance of about 70 cm, on which different images were shown for 1 min each. Images were digital reproductions (size 1024 × 768 pixels) of original and modified art pieces from Piero della Francesca (Baptism of Christ), Piet Mondrian (Composition No. 8, and Composition II in red, blue, and yellow), Rembrandt van Rijn (Philosopher in meditation), and contemporary artist Mariano Molina (center of gaze). Modified versions of the paintings were done with Photoshop. Subjects were asked to freely view the images, which were shown in a pseudo-randomized sequence. To balance and evaluate possible learning effects, i.e., seeing the original or the modified version of a painting first, half of the subjects saw the sequence of images in the forward order and the other half in the reverse order.

#### **RECORDINGS AND DATA PROCESSING**

The eye movement data was collected with an EyeLink II system (SR Research) in remote monocular mode with a sampling rate of 1 kHz. Before each stimulus presentation a quick calibration was done using a grid of five data points: one in the center and four displaced 150 mm in the four cardinal directions.
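As a rough illustration of the calibration geometry, the 150 mm displacement of the outer calibration points can be converted to a visual angle. The sketch below assumes a flat screen perpendicular to the line of sight, the eye aligned with the central calibration point, and the approximate 70 cm viewing distance given above; the function name is hypothetical and none of this is part of the recording software.

```python
import math


def visual_angle_deg(offset_mm: float, distance_mm: float) -> float:
    """Angle (degrees) between the line of sight to the screen center
    and a point displaced offset_mm from the center, for an eye located
    distance_mm in front of the center (flat-screen assumption)."""
    return math.degrees(math.atan2(offset_mm, distance_mm))


# 150 mm calibration offset viewed from roughly 70 cm (700 mm).
calibration_angle = visual_angle_deg(150.0, 700.0)
print(f"{calibration_angle:.1f} degrees")
```

Under these assumptions the outer calibration points sit roughly 12° from the center of the screen.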

For each image presentation we obtained the spatial coordinates of each fixation. To quantify results, for each figure we defined different regions of interest (as described in the Results section) and compared the relative number of fixations in these regions, between the original and modified versions of the paintings, using *T*-tests.
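The quantification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: regions of interest are taken to be non-overlapping axis-aligned rectangles in pixel coordinates, the "relative number" of fixations is taken to be the fraction of all fixations on the image, and the comparison is written as a paired t statistic (the text does not specify whether the tests were paired). The function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_fixations(fixations, rois):
    """Fraction of fixations falling inside each rectangular ROI.

    fixations: list of (x, y) pixel coordinates.
    rois: dict mapping ROI name -> (x_min, y_min, x_max, y_max),
          assumed not to overlap.
    """
    counts = {name: 0 for name in rois}
    for x, y in fixations:
        for name, (x0, y0, x1, y1) in rois.items():
            if x0 <= x < x1 and y0 <= y < y1:
                counts[name] += 1
                break  # ROIs do not overlap, so stop at the first hit
    total = len(fixations)
    return {name: (c / total if total else 0.0) for name, c in counts.items()}


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two equal-length
    samples (assumes the paired differences are not all identical)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Toy example on a 1024 x 768 image split into two hypothetical ROIs.
rois = {"ROI1": (0, 0, 512, 768), "ROI2": (512, 0, 1024, 768)}
fixations = [(100, 100), (600, 200), (700, 300), (50, 700)]
rel = relative_fixations(fixations, rois)

# Invented per-subject relative counts for one ROI (original vs. modified).
t_stat, dof = paired_t([0.5, 0.6, 0.7], [0.4, 0.4, 0.4])
```

The t statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value, for example with a statistics package.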

# **RESULTS**

**Figure 1** shows the eye gazing pattern of a subject looking at the painting "center of gaze" by Mariano Molina, a contemporary artist from Argentina. In this painting the artist blurred most of the image, focusing the attention of the viewer on a central area. Clearly, the saccades made by this and the other nine subjects (see **Figure A1** in Appendix) were attracted to this "center of gaze." In particular, the number of fixations in this area during the 60 s of free viewing of the painting was significantly larger than that in the rest of the painting (right hand side of **Figure 1**; *T*-test: *p* < 10<sup>−3</sup>), in spite of the fact that the area of this region of interest was less than a third of the area of the rest of the canvas. The eye-tracker study in this case confirmed and quantified the degree to which the artist achieved the expected effect in his creation.

The Dutch artist Piet Mondrian (1872–1944) is unanimously acclaimed for the mastery of balance and harmony in his paintings, created with vertical and horizontal lines, and an exquisite use of primary colors (Gombrich, 1995). The upper left plot of **Figure 2** shows Mondrian's Composition II in red, blue, and yellow and the saccade patterns of a subject looking at this painting. The balance in this canvas is in part achieved by the use of a small square of blue, with a high contrast with the surrounding white rectangles, and a much larger red patch, but with a lesser contrast with the neighboring rectangles. For this painting we defined four regions of interest, as marked in the figure, and observed that the subject's gazing was attracted to these regions. The figure on the right displays a modified version of the painting, in which we altered the balance (high contrast small square vs. low contrast large square) by swapping the blue and the red color patches. Compared to the original, the modified painting looks awkward, out of balance. The eye-tracking pattern of the same subject confirmed this impression, as the fixations were basically constrained to the high contrast area of the large blue square. In fact, it looks as if the fixations were somehow trapped in this area. Although with some variability, there was an overall similar pattern of fixations for the other subjects (**Figure A2** in Appendix). To quantify this observation, we calculated the number of fixations during the 60 s presentations in the four regions of interest for all subjects (**Figure 2** bottom plots) and statistically compared the number of fixations in the large area minus the ones in the other three areas (i.e., ROI1 − [ROI2 + ROI3 + ROI4]) for the original and the modified Mondrian. The difference between the original and the modified version of the painting was significant (*T*-test: *p* < 0.05), in agreement with the observation that in the modified version the large blue rectangle attracts more fixations than the rest of the canvas.
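The contrast ROI1 − [ROI2 + ROI3 + ROI4] used above can be written as a small helper that is evaluated once per subject and per version of the painting. The sketch below is illustrative only: the function name is hypothetical and the counts are invented toy values, not data from the study.

```python
def roi_contrast(counts, target, others):
    """Fixations in the target ROI minus the summed fixations in the other ROIs."""
    return counts[target] - sum(counts[name] for name in others)


# Toy fixation counts for one hypothetical subject (not data from the study).
original = {"ROI1": 30, "ROI2": 25, "ROI3": 20, "ROI4": 15}
modified = {"ROI1": 70, "ROI2": 10, "ROI3": 8, "ROI4": 5}

others = ["ROI2", "ROI3", "ROI4"]
score_original = roi_contrast(original, "ROI1", others)  # 30 - 60 = -30
score_modified = roi_contrast(modified, "ROI1", others)  # 70 - 23 = 47
```

Collecting one such score per subject for each version gives the two samples that a t-test then compares.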

**FIGURE 2 | Original (left) and modified version (right) of Mondrian's "Composition II in red, blue, and yellow," with the fixation patterns of a subject superimposed.** For both versions of the painting we defined four regions of interest (ROI1 in blue, ROI2 in cyan, ROI3 in yellow, and ROI4 in red) and the average relative number of fixations in each region is shown on the bottom. In the modified version, the large blue patch attracts most fixations, thus breaking down the balance of the original painting.

To further study harmony and balance, as quantified by the number of fixations obtained with the eye-tracker, we used another famous Mondrian painting, the "Composition No. 8" (**Figure 3**, left hand side). In this case, the balance is obtained with a more complex combination of color (e.g., the red rectangle on the top-left corner) and the density of lines and line crossings (toward the center and bottom right of the painting). In the figure we show superimposed the saccades of a subject, who scanned most areas of the painting and went back and forth from the top-left red rectangle to the center right line crossings and the salient colors of the very bottom. The right hand side figure shows a modified version of the painting, in which the red rectangle was moved to the bottom right. In this case the subject clearly biased his attention, resulting in a larger number of fixations in the lower right corner. As in the previous case, after a simple change the balance of the painting is completely lost. With some variability, similar results were seen for the other subjects (**Figure A3** in Appendix). For this painting we defined three regions of interest, as shown in the figure, and we calculated the number of fixations in each of these regions. The bottom plots of **Figure 3** show the average relative number of fixations in each region of interest for all subjects. To quantify the bias toward the bottom right corner, we statistically compared the number of fixations in the bottom right (ROI3) with the ones in the other two areas (i.e., ROI3 − [ROI1 + ROI2]) for the original and the modified Mondrian paintings. This comparison showed a tendency toward a larger number of fixations for the most salient region (ROI3) in the modified picture (*T*-test: *p* = 0.069), which did not reach significance due to a large variability in the number of fixations to the region ROI2. However, considering only ROI3 − ROI1, the difference between the original and modified painting became significant (*T*-test: *p* < 0.05).

**FIGURE 3 | Original (left) and modified version (right) of Mondrian's "Composition No. 8," with the fixations of a subject superimposed.** In the original version the subject explored most of the canvas and in the modified version, the gaze is attracted to the bottom right corner. The color rectangles show three regions of interest (ROI1 in blue, ROI2 in green, and ROI3 in red). The relative average number of fixations for all subjects is shown in the bottom plots. There was a larger number of fixations to ROI3 for the modified version.

Next we studied two examples of figurative art. Rembrandt van Rijn (1606–1669) is considered one of the masters of the *chiaroscuro*, i.e., the use of strong contrasts between light and darkness (Gombrich, 1995). One fabulous example of his technique is given by "Philosopher in meditation" (left painting in **Figure 4**), where the black background on the left hand side increases the contrast of the white/yellow color of the window, thus giving a formidable reinforcement to the brightness of the sunlight coming through (for a detailed analysis of the use of color and contrast in this painting see Livingstone, 2002). The high contrast used by Rembrandt drives the attention to the figure of the philosopher and the rest of the scene remains somewhat in the dark. This produces an extraordinary effect in the painting, as the character in the bottom right appears to be relegated to a secondary role and the spiral stair seems to be leading to a mysterious dark room upstairs. Such descriptions are of course very subjective and hard to quantify or test under rigorous scientific experimentation. However, we hypothesized that the removal of the black background on the left and bottom (**Figure 4** right hand side) should diminish the saliency of the philosopher and should make the picture look more homogeneous, somehow creating the perception of the picture "opening up," as if the overall illumination had been increased. The gazing pattern of a subject is shown superimposed on the original and modified versions of the painting. As expected, in the original version there were a large number of fixations to the philosopher and fewer fixations to the character on the right or the rest of the scene, whereas in the modified version the distribution of fixations was more homogeneous. To quantify this observation we defined two regions of interest, around the philosopher (ROI1) and around the other character (ROI2), and calculated the difference between the fixations in these two regions (ROI1 − ROI2) both for the original and the modified versions. The bottom panels show the average relative number of fixations for all subjects, where we observe that the difference in the number of fixations between both regions was larger for the original version, with a larger number of fixations to the philosopher. However, this difference did not reach statistical significance (*T*-test: *p* = 0.33). The lack of statistical significance was due to a large variability from subject to subject, as for this painting the particular and subjective interest of each of the subjects (e.g., an interest in the philosopher, the person on the right, an exploration of the background, previous knowledge of the painting, etc.) completely modified the obtained gaze pattern (see **Figure A4** in Appendix). It is also very interesting to note that a different result was obtained for the subjects that first saw the original version, compared to those that first saw the modified version. In the former group, a larger number of fixations to the philosopher was observed both in the original and modified versions (right upper plots in **Figure A4** in Appendix). This could be attributed to the fact that, in the modified version, the subjects remembered and paid attention to the salient character of the philosopher they saw in the original version. In contrast, for those subjects who first saw the modified version, the philosopher and the person on the right were at first equally salient and a larger number of fixations to the philosopher was obtained only in the original version (right bottom plots in **Figure A4** in Appendix).
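The order effect described above amounts to splitting the subjects by which version they saw first and summarizing the ROI1 − ROI2 difference separately in each group. A minimal sketch follows, assuming one record per subject and version; the group labels, the function name, and every number are invented for illustration and are not the study's data.

```python
from statistics import mean

# Each record: (order group, version viewed, fixations in ROI1, fixations in ROI2).
# All values are invented for illustration only.
records = [
    ("original_first", "original", 40, 10),
    ("original_first", "original", 44, 12),
    ("original_first", "modified", 35, 12),
    ("original_first", "modified", 33, 14),
    ("modified_first", "original", 38, 11),
    ("modified_first", "original", 36, 9),
    ("modified_first", "modified", 20, 19),
    ("modified_first", "modified", 22, 21),
]


def mean_difference_by_group(records):
    """Mean ROI1 - ROI2 difference for every (order group, version) cell."""
    cells = {}
    for group, version, roi1, roi2 in records:
        cells.setdefault((group, version), []).append(roi1 - roi2)
    return {key: mean(values) for key, values in cells.items()}


summary = mean_difference_by_group(records)
for (group, version), value in sorted(summary.items()):
    print(group, version, value)
```

Keeping the order group as an explicit factor, rather than pooling all subjects, is what makes this kind of effect visible when the pooled comparison is not significant.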

Finally, we studied a masterpiece from the Renaissance, "The baptism of Christ" by the Italian master Piero della Francesca (1415–1492). The left hand side of **Figure 5** shows the original painting and the right hand side a modified version in which we changed the position of the dove (representing the Holy Spirit). In the original painting there is a vertical symmetry axis formed by the body and hands of Christ, his face, and the position of the dove. The superimposed eye patterns of the subject shown in the figure clearly reflect the saliency of this axis, which is not present in the modified version. In fact, in the modified version the eye pattern of the subject seems more erratic, going from one face to the other, and the harmony of the vertical symmetry axis, the key feature of this composition, disappeared. For this painting we defined four regions of interest, as shown in the figure. The average number of fixations for all subjects (**Figure 5**, bottom plots) shows a larger number of fixations to the dove (ROI3) for the original painting compared to the modified one (*T*-test: *p* < 0.05). In contrast, there were no significant differences in the number of fixations to the dove in the modified position (ROI4). This stresses the fact that the dove is more salient when it is part of the abovementioned symmetry axis, but as in the case of the Rembrandt, we also found a completely different pattern of fixations depending on whether the subject looked first at the original or the modified version of the painting. For the subjects that first looked at the modified version (**Figure A5** in Appendix, bottom two rows), the dove was not salient (i.e., there were few fixations to it) and when later looking at the original, they spent time going back and forth between the actual and previous positions of the dove. In other words, they noted the change and ended up looking more times at region ROI4 in the original (even if this was the empty space where the dove was before), compared to the number of fixations in this region when the dove was actually there in the modified version. The subjects that looked at the original first tended to follow the abovementioned vertical axis and when later looking at the modified version, they also focused on the altered position of the dove (**Figure A5** in Appendix, top two rows). This example clearly shows how the gazing patterns, which we use as a proxy to quantify and study the subjects' appreciation of art, can be completely modified by the subjects' previous knowledge, i.e., seeing the original or the modified version of the painting first.

**FIGURE 5 | Original (left) and modified version (right) of Piero della Francesca's "Baptism of Christ" with the fixations of a subject superimposed.** In the modified version the position of the dove was changed, eliminating the vertical symmetry axis of the original version, something that is clearly observed in the saccade patterns made by the subject. The bottom plot shows the average relative number of fixations in four regions of interest (ROI1 in blue, ROI2 in cyan, ROI3 in yellow, and ROI4 in red).

# **DISCUSSION**

In this study we used eye-tracker technology to quantify how subjects look at art pieces. We found that in spite of the inevitable subject-to-subject variability, there were common basic patterns of fixations. In the painting by Molina, "center of gaze," the attention of all the subjects was inevitably attracted to the area in the painting with a sharp resolution. Therefore, the manipulation implemented by the artist acted as a low level visual cue that forced a common pattern of exploration of the painting. We then studied the concept of balance in two Mondrian paintings and showed how simple alterations could destroy the delicate harmony of the composition, biasing the eye toward particular sections of the canvas. Although there was a similar pattern of response for the subjects, the manipulation was more complex than the one used in the "center of gaze" and therefore there was a much larger variability across subjects. These results are in line with previous studies showing changes in the perception of Mondrian's paintings when varying their orientation, proportional relations, and original colors (McManus et al., 1993; Latto et al., 2000; Locher et al., 2005).

In the painting of Rembrandt we varied the contrast and, although there was a tendency in agreement with what was expected, we found that there was a large influence of the subjects' knowledge, depending on whether they first saw the original or the modified version of the painting. A similar effect was also present for the painting of Piero della Francesca. These examples show how the perception of art is a very complex process conditioned by factors at different levels. On the one hand, there are basic visual principles, such as contrast and saliency, which introduce some uniformity in the gazing pattern of different subjects by driving the attention to particular areas and, on the other hand, there are also more complex cognitive factors, such as previous experience and knowledge, which introduce a large variability across subjects.

In summary, we found some common principles in the way people look at art and a large variability depending on the subjects' own interests, artistic appreciation, and knowledge. This large subject-to-subject variability makes the scientific study of art very challenging. It is indeed very difficult to find common principles but this lack of uniformity and objectivity is perhaps one of the reasons that make art so unique, personal and fascinating.

#### **ACKNOWLEDGMENTS**

This work was funded by the MRC and the AHRC Beyond Text Program.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 22 March 2011; accepted: 22 August 2011; published online: 12 September 2011.*

*Citation: Quiroga RQ and Pedreira C (2011) How do we see art: an eye-tracker study. Front. Hum. Neurosci. 5:98. doi: 10.3389/fnhum.2011.00098*

*Copyright © 2011 Quiroga and Pedreira. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **APPENDIX**

**… Mondrian.** The upper (lower) two rows show the subjects that looked at the … subjects.

**… Mondrian.** The upper (lower) two rows show the subjects that looked at … the average relative number of fixations in each region of interest for the five subjects.

**FIGURE A4 | Saccade patterns for all 10 subjects looking at the original and modified versions of "Philosopher in meditation" by Rembrandt.** The upper (lower) two rows show the subjects that looked at the original (modified) version first. The graphs on the right of each row show the average relative number of fixations in each region of interest for the five subjects.

**… Francesca.** The upper (lower) two rows show the subjects that looked at the … subjects.

# **SPECIFICITY OF ESTHETIC EXPERIENCE FOR ARTWORKS: AN FMRI STUDY**

**Cinzia Di Dio, Nicola Canessa, Stefano F. Cappa and Giacomo Rizzolatti**

# Specificity of esthetic experience for artworks: an fMRI study

# *Cinzia Di Dio<sup>1</sup>, Nicola Canessa<sup>2</sup>, Stefano F. Cappa<sup>2</sup> and Giacomo Rizzolatti<sup>1,3</sup>\**

*<sup>1</sup> Department of Neuroscience, Università degli Studi di Parma, Parma, Italy*

*<sup>2</sup> Center for Cognitive Neuroscience and CERMAC, Vita-Salute San Raffaele University, Milan, Italy*

*<sup>3</sup> Brain Center for Social and Motor Cognition, Italian Institute of Technology, Parma, Italy*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Bernd Weber, Rheinische-Friedrich-Wilhelms Universität, Germany*

*Philip D. Zelazo, University of Minnesota, USA*

*Son Preminger, Interdisciplinary Center Herzliya, Israel*

#### *\*Correspondence:*

*Giacomo Rizzolatti, Department of Neuroscience, Università degli Studi di Parma, Via Volturno 39/E, 43100 Parma, Italy. e-mail: giacomo.rizzolatti@unipr.it*

In a previous functional magnetic resonance imaging (fMRI) study, where we investigated the neural correlates of esthetic experience, we found that observing canonical sculptures, relative to sculptures whose proportions had been modified, produced the activation of a network that included the lateral occipital gyrus, precuneus, prefrontal areas, and, most interestingly, the right anterior insula. We interpreted this latter activation as the neural signature underpinning hedonic response during esthetic experience. With the aim of exploring whether this specific hedonic response is also present during the observation of non-art biological stimuli, in the present fMRI study we compared the activations associated with viewing masterpieces of classical sculpture with those produced by the observation of pictures of young athletes. The two stimulus categories were matched on various factors, including body postures, proportion, and expressed dynamism. The stimuli were presented in two conditions: observation and esthetic judgment. The two stimulus categories produced a rather similar global activation pattern. Direct comparisons between sculpture and real-body images revealed, however, relevant differences, among them the activation of the right antero-dorsal insula during sculpture viewing only. Along with our previous data, this finding suggests that the hedonic state associated with activation of the right dorsal anterior insula underpins esthetic experience for artworks.

**Keywords: neuroesthetics, sculpture, human body, insula**

# **INTRODUCTION**

Neuroesthetics is the field of cognitive neuroscience that investigates the neural bases of esthetic experience. In visual art, esthetic experience appears to be based on an initial visual encoding of the observed artwork (Kawabata and Zeki, 2004) and on subsequent processing carried out in a series of higher order cortical areas (e.g., Vartanian and Goel, 2004; Lacey et al., 2011). Spatial coding (e.g., Cela-Conde et al., 2009; Cupchik et al., 2009), motor activation (Jacobsen et al., 2006; Freedberg and Gallese, 2007), and activation of emotional centers (Jacobsen et al., 2006; Di Dio et al., 2007; Cupchik et al., 2009) are some of the processes that appear to take place during esthetic experience (for a review, see Di Dio and Gallese, 2009).

In a previous study we investigated the neural correlates of esthetic experience during the observation of masterpieces of classical and renaissance sculpture (Di Dio et al., 2007). In this study, sculpture images were presented to participants in two versions: original ("canonical") and proportion-modified. The rationale underlying proportion modification was that, in these masterpieces, proportion is strictly related to esthetic evaluation of the stimuli. By altering proportion in a controlled fashion and keeping every other factor constant, it was possible to isolate the neural correlates associated with esthetic experience for these artworks. Furthermore, in this study participants viewed the stimuli in three conditions: observation, esthetic judgment, and proportion judgment. The distinctive feature of this protocol was that it allowed participants, during the observation condition, to observe the images without expressing any *explicit judgment*. In fact, explicit judgments that require decision-making may induce specific task-related processes that could diminish the neural activation responsible for hedonic responses.

The results showed that, on the whole, independent of stimulus and condition types, the observation of images of classical and renaissance sculptures elicited activation of several visual areas, the inferior parietal lobule (IPL), the ventral premotor cortex plus the adjacent posterior portion of right inferior frontal gyrus (IFG), as well as deep structures, including the hippocampus and the insula.

Most interestingly, the contrast canonical vs*.* modified sculpture images revealed activation of a brain network that included cortical areas encoding the physical properties of the stimuli, areas encoding implied motion, and the right anterior insula. The emotional response, hallmarked by insula activation (Mesulam and Mufson, 1982, 1985; Augustine, 1996; Damasio, 1999; Damasio et al., 2000; Craig, 2003; Dupont et al., 2003; Critchley et al., 2004, 2005), was particularly strong during the observation condition, in which the participants could be said to respond most spontaneously to the presented images.

Support for the finding that the hedonic dimension of esthetic experience is related to insular activation also comes from a recent study by Cupchik et al. (2009). In this functional magnetic resonance imaging (fMRI) study, participants viewed various categories of paintings (portraits, nudes, still-life, and landscapes) that were presented in two conditions: one that required the participants to observe the images in an objective and detached manner to gather information about the content of the stimulus ("pragmatic condition"), and one that required them to observe the paintings in a subjective and engaged manner, appreciating the feelings evoked by the stimuli ("esthetic condition"). Note that, similarly to our "observation" condition, the instructions given to the participants prior to the "esthetic" condition were to experience the mood evoked by the artworks without making any explicit judgment about the stimuli. Results showed that observation of paintings under the "esthetic" condition vs. baseline condition (viewing of paintings accompanied by no explicit task-related instructions) elicited bilateral activation of the insula, suggesting that this area is crucially implicated in the hedonic feeling associated with esthetic experience.

In the present study we investigated, using fMRI, whether the hedonic response associated with esthetic experience when viewing art masterpieces occurs also during the observation of non-art biological stimuli or whether it is distinctive of esthetic experience for artworks. For this purpose, we compared the activations evoked by sculpture images with those produced by the observation of real human body (HB) images depicting young athletes. The athletes posed with body postures that resembled those portrayed in the sculpture images (for details, see Material and Methods). In order to match the body configurations across stimulus-categories, all stimuli represented male figures (see **Figure 1** for an example of stimuli).

This study was composed of two experiments. In both of them, we presented the two stimulus-categories (art vs*.* biological non-art) in two conditions: observation and explicit esthetic judgment. The main difference between the two experiments lay in the stimulus presentation protocol and in the instructions provided to participants prior to scanning. In Experiment 1, the stimuli (sculptures and real HB images) were presented intermixed in a semi-randomized order within the same functional runs. This protocol emphasized the differences between the two stimulus-categories. In Experiment 2, each stimulus-category was presented separately in different functional runs. By keeping the two stimulus-categories in separate runs we intended to highlight differences in brain activations distinctive of each stimulus-category.

The results showed a similar, yet not identical, activation pattern for the two stimulus-categories. The direct comparisons between sculpture and real HB images revealed differences at the visual and, most importantly, at the emotional level of processing. We argue that the activation pattern observed for sculpture images, inclusive of insula activation, pinpoints the hedonic aspect of esthetic experience. This type of experience is lacking when viewing non-art biological stimuli.

# **MATERIALS AND METHODS**

# **PARTICIPANTS**

Thirty-two healthy right-handed Italian students [16 females (mean age = 21.4, SD = 1.23, range = 19–25) and 16 males (mean age = 23.43, SD = 1.39, range = 21–30)] participated in Experiment 1. Twenty-four healthy right-handed Italian students [12 females (mean age = 20.28, SD = 1.16, range = 19–23) and 12 males (mean age = 22.86, SD = 3.26, range = 19–30)] participated in Experiment 2. All participants were naïve to art criticism, as assessed during recruitment. They had normal or corrected-to-normal visual acuity. None reported a history of psychiatric or neurological disorders, or current use of any psychoactive medications. They gave their written informed consent to the experimental procedure, which was approved by the Ethics Committee of San Raffaele Scientific Institute (Milan) and Local Ethics Committee of Parma.

**FIGURE 1 | Example of experimental stimuli used in this study. (A)** Images of canonical sculptures; **(B)** images of proportion-modified sculptures; **(C)** images of canonical real human bodies; **(D)** images of proportion-modified real human bodies. Proportion-modified images **(B,D)** are presented with a long trunk-short legs relation (images on the left) and with a short trunk-long legs relation (images on the right).

# **STIMULI**

Sixteen two-dimensional images of male sculptures (S) and 16 images of real male human bodies (HB) were chosen following the selection method described in Di Dio et al. (2007). For the present study, stimuli were selected out of an initial pool composed of a total of 56 images of sculptures (28 canonical and 28 modified – see below) and 56 real HB images (28 canonical and 28 modified – see below). In this preliminary behavioral study, which was aimed at stimulus selection for the fMRI experiment, we examined the relation between esthetic judgment and proportion in 22 observers naïve to art criticism. Participants of the behavioral study underwent observation, esthetic judgment and proportion judgment conditions. To assess the probability that the stimuli were perceived as either proportioned or disproportioned according to our prior categorization (canonical and modified), during proportion judgment, participants had to rate stimulus proportion on a dichotomous measure (0 – disproportioned; 1 – proportioned). During the esthetic judgment condition, on the other hand, participants had to rate the stimuli on a scale from 0 (ugly) to 7 (beautiful). By using a continuous scale, we aimed at increasing sensitivity in the assessment of the esthetic response to the stimuli, which still needed to be quantified in this preliminary behavioral stage.

The original canonical images of sculptures were chosen from classical examples that met the golden ratio criteria (proportion torso: legs (T–L) = 0.62 ± 0.01). The real-body images were selected from pictures taken specifically for this study by a professional photographer using athletes whose body proportion and figure resembled those portrayed in the sculpture images. The proportion of the selected real-body images also met the golden ratio criteria (proportion torso: legs (T–L) = 0.62 ± 0.02). Athletes were required to pose following the postures depicted in the sculpture images. All images were black and white and represented only male bodies that were comparable across categories in terms of body structure, proportion between body parts, posture, and expressed dynamism. Expressed dynamism of the *canonical* sculpture and real-body stimuli was assessed by nine independent judges during the preliminary behavioral study aimed at stimulus selection (see above). The criteria according to which the evaluators assigned the stimuli to each category were the following: sense of balance, position of the limbs, feeling of motion, direction of eye-gaze, and facial expression. Based on the judges' rating, stimuli were initially categorized into 10 dynamic and 18 static sculpture images and 12 dynamic and 16 static real-body images. With respect to this further sub-categorization, the stimuli selected for the fMRI study contained an equal number of judged-dynamic (8) and judged-static (8) images within each category.

A *modified* version of sculpture and real-body images was created by altering the proportion between torso and legs (T–L) of the original images, thus producing two new sets of stimuli identical to the originals except for proportion. Using the algorithm employed in the previous experiment (Di Dio et al., 2007), half of the images were modified by shortening the torso and elongating the legs (modification range T–L = 0.5–0.6), whereas the other half followed the opposite modification pattern, with long torso and short legs (modification range T–L = 0.64–0.75).

An example of the two stimulus-categories (original and modified) is in **Figure 1**.
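The proportion ranges quoted in this section can be summarized in a short sketch. It is only an illustration: the function name, the labels, and the treatment of values outside the quoted ranges are assumptions, and it does not reproduce the image-modification algorithm of Di Dio et al. (2007).

```python
import math

# Golden-ratio conjugate (1/phi), which the canonical T-L value of 0.62 approximates.
GOLDEN_RATIO_CONJUGATE = (math.sqrt(5) - 1) / 2  # about 0.618


def classify_tl(ratio, canonical=0.62, tolerance=0.01):
    """Label a torso:legs (T-L) proportion using the ranges quoted in the text.

    The labels and the handling of out-of-range values are assumptions
    made for this illustration only.
    """
    eps = 1e-9  # guards against floating-point error at the range edges
    if abs(ratio - canonical) <= tolerance + eps:
        return "canonical"
    if 0.50 - eps <= ratio <= 0.60 + eps:
        return "modified: short torso, long legs"
    if 0.64 - eps <= ratio <= 0.75 + eps:
        return "modified: long torso, short legs"
    return "outside the quoted ranges"


if __name__ == "__main__":
    for r in (0.55, 0.62, 0.70, 0.80):
        print(r, "->", classify_tl(r))
```

The default tolerance corresponds to the ± 0.01 reported for sculptures; passing `tolerance=0.02` mirrors the value reported for the real-body images, in which case a ratio of exactly 0.60 is labeled canonical because that check runs first.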

#### **PARADIGM AND TASK EXPERIMENT 1**

The stimuli were presented in a 2 × 2 design, with two levels of stimulus-category [sculpture (S) and real HB (HB)] and two levels of stimulus-type [canonical (C) and modified (M)]. The stimuli were presented in two separate experimental conditions [observation (O) and esthetic judgment (AJ)]. Each participant underwent eight separate fMRI runs, repeating each experimental condition twice. The condition order was kept fixed across all participants, with the observation condition first (runs 1–4) and explicit esthetic judgment last (runs 5–8). By keeping the observation runs first, we aimed at measuring unbiased (spontaneous) brain responses to the stimuli. The participants expressed their explicit esthetic judgment during the esthetic judgment condition.

Every run comprised 32 trials. Sculpture images were presented in 16 trials, and real HB images were presented in the other 16. Within each category, half of the images (eight) were presented in the canonical version and half (eight) in the modified version. To reduce possible cross-category carry-over cognitive effects, stimuli were presented in a semi-randomized order, with mini-blocks consisting of eight consecutive images of the same stimulus-category (either S or HB), never repeating the same image within a run.
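One way to generate a trial order that satisfies these constraints is sketched below. The function name and the randomization details are assumptions: the text does not state how images were assigned to the canonical or modified version in a given run, nor whether mini-blocks of the two categories had to alternate.

```python
import random


def build_run(n_images=16, block_size=8, seed=None):
    """Return a list of (category, image_id, version) trials for one run.

    Assumptions (not stated in the text): the images shown in canonical vs.
    modified version are drawn at random per run, and the four mini-blocks
    are shuffled without forcing the two categories to alternate.
    """
    rng = random.Random(seed)
    blocks = []
    for category in ("S", "HB"):
        image_ids = list(range(1, n_images + 1))
        rng.shuffle(image_ids)
        half = n_images // 2
        # Each image appears once per run: half canonical (C), half modified (M).
        trials = [(category, i, "C") for i in image_ids[:half]]
        trials += [(category, i, "M") for i in image_ids[half:]]
        rng.shuffle(trials)
        # Split the 16 category trials into mini-blocks of eight.
        for start in range(0, n_images, block_size):
            blocks.append(trials[start:start + block_size])
    rng.shuffle(blocks)
    return [trial for block in blocks for trial in block]


if __name__ == "__main__":
    run = build_run(seed=0)
    for trial in run[:8]:
        print(trial)
```

Passing a fixed `seed` makes a given order reproducible, which is convenient when the same sequence has to be regenerated for analysis.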

Participants lay in the scanner in a dimly lit environment. The stimuli were viewed via a back-projection screen located in front of the scanner and a mirror placed on the head-coil. The software Presentation 11.0 (Neurobehavioral systems, Albany, CA, USA<sup>1</sup>) was used both for stimulus presentation and for the recording of the participants' answers. At the beginning of each run, a 4 s visual instruction informed the participants about the upcoming condition. On each trial, the stimulus appeared at the center of the screen for 2.5 s and was followed by a 3 s blank-screen interval. Subsequently, a question mark instructed the participants to respond to the stimulus in accordance with the task introduced (see below). The question mark remained on screen for 400 ms and was followed by an inter-stimulus interval (ISI; white-cross fixation) whose duration was varied ("jittered") at every trial, in order to desynchronize the timings of event-types with respect to the acquisition of single slices within functional volumes and to optimize statistical efficiency (Dale, 1999). The OptSeq2 Toolbox<sup>2</sup> was used to estimate the optimal ISIs (mean ISI = 3.87 s, range = 1.5–19.750 s). Each scanning run lasted approximately 6.5 min.
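The trial structure described above can be laid out as a list of event onsets. This is a minimal sketch: the function name is hypothetical and the ISI values passed in the example are placeholders, not the sequence produced by OptSeq2.

```python
INSTRUCTION_MS = 4000   # visual instruction at the start of each run
STIMULUS_MS = 2500      # image on screen
BLANK_MS = 3000         # blank-screen interval
QUESTION_MS = 400       # question mark cueing the response


def trial_onsets(isis_ms):
    """Return one dict of event onsets (ms from run start) per trial.

    `isis_ms` holds the jittered inter-stimulus interval that follows each
    trial; the values used below are placeholders, not the OptSeq2 output.
    """
    onsets = []
    t = INSTRUCTION_MS
    for isi in isis_ms:
        stimulus = t
        blank = stimulus + STIMULUS_MS
        question = blank + BLANK_MS
        fixation = question + QUESTION_MS
        onsets.append({"stimulus": stimulus, "blank": blank,
                       "question": question, "fixation": fixation})
        t = fixation + isi
    return onsets


if __name__ == "__main__":
    for events in trial_onsets([1500, 3000, 5250]):
        print(events)
```

Times are kept in integer milliseconds so that onsets accumulate without floating-point drift.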

During observation condition (O), the participants were required to simply observe the images and, when the question mark appeared, they had to indicate whether they paid attention to the image or not. During the esthetic judgment condition, they were required to decide whether they esthetically liked the image or not. Thus, both conditions required a response from the participants. Using the index or middle finger of the right hand, the participants answered yes or no, according to the instruction presented at the start of each run. The question "did you pay attention to the image?" was introduced to make sure that participants were actually looking at the stimuli during fMRI scanning.

#### **PARADIGM EXPERIMENT 2**

Participants lay in the scanner in a dimly lit environment. The stimuli were viewed via digital visors (VisuaSTIM) with a 500,000 pixel × 0.25 square inch resolution and horizontal eye field of 30˚. The visors were applied directly on the volunteers' face. The digital transmission of the signal to the scanner was via optic fiber. The software E-Prime 2 Professional (Psychology Software Tools, Inc., Pittsburgh, PA, USA<sup>3</sup>) was used both for stimulus presentation and recording of the participants' answers.

<sup>1</sup> http://www.neurobs.com

<sup>2</sup> http://surfer.nmr.mgh.harvard.edu/optseq/

<sup>3</sup> http://www.pstnet.com

The structure of the experimental trials within each run was identical to that described for Experiment 1. Unlike in Experiment 1, in Experiment 2 the duration of each run doubled (about 12 min), so that the total number of functional runs was four. However, the main difference from Experiment 1 lay in how stimuli were presented. In Experiment 1, stimulus presentation was organized in randomized mini-blocks of eight stimuli belonging to the same category (either S or HB). In Experiment 2, instead, half of the participants (*N* = 13) were presented with all sculpture images first (runs 1–2) and then with real HB images (runs 3–4), and half of the participants were presented with the opposite order. In this way, instructions for each experimental condition (particularly for the observation condition, where we aimed at priming the proper mind-state) could be tailored more precisely to the specific stimulus-category to follow. More specifically, during the observation condition of sculpture images the volunteers were required to observe the images as if "they were in a museum." During the observation condition of real HB, they had to observe images "as if leafing through a magazine where they would have seen boys posing for photograph shots." For both stimulus-categories, participants were instructed to relax and observe the stimuli trying to explore each image in full.

# **fMRI DATA ACQUISITION**

For Experiment 1, anatomical T1-weighted and functional T2∗-weighted MR images were acquired with a 3 T Philips Achieva scanner (Philips Medical Systems, Best, NL), using an eight-channel Sense head-coil (sense reduction factor = 2). Functional images were acquired using a T2∗-weighted gradient-echo, echo-planar (EPI) pulse sequence (38 interleaved transverse slices covering the whole brain with the exception of the primary visual cortex and the posterior part of the cerebellum, TR = 3000 ms, TE = 30 ms, flip-angle = 85˚, FOV = 240 mm × 240 mm, inter-slice gap = 0.5 mm, slice thickness = 4 mm, in-plane resolution 2.5 mm × 2.5 mm). Each scanning sequence comprised 120 sequential volumes. Immediately after the functional scanning a high-resolution T1-weighted anatomical scan (150 slices, TR = 600 ms, TE = 20 ms, slice thickness = 1 mm, in-plane resolution 1 mm × 1 mm) was acquired for each participant.

For Experiment 2, anatomical T1-weighted and functional T2∗-weighted MR images were acquired with a 3 T General Electrics scanner equipped with an eight-channel receiver head-coil. Functional images were acquired using a T2∗-weighted gradient-echo, EPI pulse sequence (acceleration factor asset 2, 37 interleaved transverse slices covering the whole brain, TR = 2100 ms, TE = 30 ms, flip-angle = 90˚, FOV = 205 mm × 205 mm, inter-slice gap = 0.5 mm, slice thickness = 3 mm, in-plane resolution 2.5 mm × 2.5 mm). Each scanning sequence comprised 306 sequential volumes. Immediately after the functional scanning a high-resolution inversion recovery prepared T1-weighted anatomical scan (acceleration factor arc 2, 156 sagittal slices, matrix 256 × 256, isotropic resolution 1 mm × 1 mm × 1 mm, TI = 450 ms, TR = 8100 ms, TE = 3.2 ms, flip-angle 12˚) was acquired for each participant.

# **fMRI STATISTICAL ANALYSIS**

Image pre-processing and statistical analysis were performed using SPM8 (Wellcome Department of Cognitive Neurology<sup>4</sup>), implemented in Matlab v7.6 (Mathworks, Inc., Sherborn, MA, USA; Worsley and Friston, 1995). The first six volumes (Experiment 1) and the first four volumes (Experiment 2) of each functional run were discarded to allow for T1 equilibration effects. All remaining volumes from each participant were then spatially realigned (Friston et al., 1996) to the first volume of the first run to correct for between-scan motion, and unwarped (Andersson et al., 2001). A mean-image from the realigned volumes was created. The T1-weighted anatomical image was coregistered to this mean-image, and segmented into gray-matter, white matter, and cerebrospinal fluid. During the segmentation the gray-matter component was automatically normalized to a gray-matter probabilistic map<sup>5</sup>. The derived spatial transformations were then applied to the realigned-and-unwarped T2∗-weighted volumes, which were resampled in 2 mm × 2 mm × 2 mm voxels after normalization. All functional volumes were then spatially smoothed with an 8-mm full-width half-maximum (FWHM) isotropic Gaussian kernel to compensate for residual between-subject variability after spatial normalization.
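For readers who want to relate the smoothing kernel to a Gaussian standard deviation, the standard FWHM-to-sigma conversion is sketched below. The helper name is hypothetical and the snippet is not part of the SPM8 pipeline; it only restates the reported kernel width in other units.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full-width at half-maximum to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


FWHM_MM = 8.0    # smoothing kernel reported in the text
VOXEL_MM = 2.0   # resampled voxel size reported in the text

sigma_mm = fwhm_to_sigma(FWHM_MM)
sigma_voxels = sigma_mm / VOXEL_MM

if __name__ == "__main__":
    print(f"sigma = {sigma_mm:.2f} mm = {sigma_voxels:.2f} voxels")
```

With the values reported here, the 8-mm kernel corresponds to a standard deviation of roughly 3.4 mm, or about 1.7 voxels at the 2-mm resampled resolution.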

Statistical inference was based on a random-effects approach (Friston et al., 1999). This comprised two steps. At the first (single-subject) level, fMRI responses were modeled in a design-matrix comprising the onset-times of the following regressors: instruction, stimuli (S and HB; C and M), blank intervals, and question mark that cued overt responses. Regressors modeling events were convolved with a canonical hemodynamic response function (HRF), and parameter estimates for all regressors were obtained at each voxel by maximum-likelihood estimation. Linear contrasts were used to determine (a) common effects (sculpture vs*.* baseline and real HB images vs*.* baseline, for both canonical and modified image types within each stimulus-category), and (b) differential effects associated with the presentation of the sculptures (C–M and M–C) and of the real HB images (C–M and M–C), separately for each of the two conditions (O and AJ). Finally, differential effects were also observed across stimulus-categories, contrasting the effects evoked by sculpture images vs. real HB images (and *vice versa*) within each experimental condition. For each participant, this led to the creation of 11 contrast-images in Experiment 1, that is, one for each of the sub-conditions (2 × 2: stimulus-type × stimulus-category) for each experimental condition (O and AJ) plus three common to all conditions (instruction, blank interval, and motor response); and of 10 contrast-images in Experiment 2, that is, one for each of the sub-conditions (2 × 2: stimulus-type × stimulus-category) plus two common to all conditions (blank interval and motor response – see below).
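The convolution of event onsets with a hemodynamic response function can be sketched as follows. This is an illustration under stated assumptions: it uses a commonly used double-gamma form with assumed parameter values (peak shape 6, undershoot shape 16, undershoot ratio 1/6), which may not match SPM8's canonical HRF in every detail, and the onsets in the example are placeholders.

```python
import math


def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=1.0 / 6.0):
    """Double-gamma HRF at time t (s); parameter values are assumed defaults."""
    if t <= 0:
        return 0.0

    def gamma_pdf(x, shape):
        return x ** (shape - 1) * math.exp(-x) / math.gamma(shape)

    return gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, undershoot_shape)


def convolve_events(onsets_s, run_length_s, dt=0.1, hrf_length_s=32.0):
    """Convolve unit impulses at `onsets_s` with the HRF, sampled every `dt` s."""
    n = int(round(run_length_s / dt))
    kernel = [double_gamma_hrf(i * dt) for i in range(int(round(hrf_length_s / dt)))]
    signal = [0.0] * n
    for onset in onsets_s:
        start = int(round(onset / dt))
        for k, h in enumerate(kernel):
            if start + k < n:
                signal[start + k] += h
    return signal


if __name__ == "__main__":
    dt = 0.1
    # Placeholder stimulus onsets in seconds, not the study's actual timings.
    predicted = convolve_events([4.0, 13.77], run_length_s=60.0, dt=dt)
    peak_index = max(range(len(predicted)), key=predicted.__getitem__)
    print(f"largest predicted response at about {peak_index * dt:.1f} s")
```

In a full analysis the resulting time course would be resampled at the acquisition TR and entered as one column of the design matrix; that step is omitted here.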

These contrast-images then underwent the second step, where the regressors of interest were modeled in Flexible Factorial analyses. The models considered the pattern of activation of the two stimulus-types (C and M) vs. implicit baseline for each of the two stimulus-categories (S and HB) for each condition (O and AJ). Linear contrasts were used to compare these effects. Correction for non-sphericity (Friston et al., 2002) was used to account for possible differences in error variance across conditions and any non-independent error terms for the repeated measures.

<sup>4</sup> http://www.fil.ion.ucl.ac.uk/spm

<sup>5</sup> http://Loni.ucla.edu/ICBM/ICBM\_TissueProb.html

Within the Flexible Factorial analyses, the following contrasts were tested. First, the "common effects of stimulus-category" (S, C + M vs. baseline) and (HB, C + M vs. baseline), averaging across the two experimental conditions (O and AJ). Second, contrasts explored main and simple effects of stimulus-category, comparing activations in response to canonical sculptures (SC) vs. canonical real human bodies (HBC), and vice versa, across and within the two experimental conditions (O, AJ). Finally, every stimulus-type (canonical vs. modified) specific effect was assessed within stimulus-category ("S": C vs*.* M, M vs*.* C; "HB": C vs*.* M, M vs*.* C) separately for each condition (O, AJ).
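The logic of main and simple effects in a 2 × 2 design can be illustrated with contrast weights over the four cells. The sketch below is not the SPM8 Flexible Factorial specification; the names are hypothetical, and the example numbers are the mean esthetic ratings reported for the preliminary behavioral study, used purely as stand-ins for per-cell estimates.

```python
CONDITIONS = ("SC", "SM", "HBC", "HBM")

# Contrast weights over the four cells, in the order given by CONDITIONS.
CONTRASTS = {
    "category: S > HB": (1, 1, -1, -1),
    "type: C > M": (1, -1, 1, -1),
    "simple effect: SC > HBC": (1, 0, -1, 0),
    "simple effect: SC > SM": (1, -1, 0, 0),
}


def apply_contrast(weights, cell_values):
    """Weighted sum of per-cell values (one voxel's estimates, or cell means)."""
    return sum(w * cell_values[name] for w, name in zip(weights, CONDITIONS))


if __name__ == "__main__":
    # Mean esthetic ratings from the preliminary behavioral study, used here
    # only as stand-in numbers; they are not fMRI parameter estimates.
    ratings = {"SC": 4.1, "SM": 3.8, "HBC": 3.5, "HBM": 3.2}
    for name, weights in CONTRASTS.items():
        print(name, round(apply_contrast(weights, ratings), 2))
```

Each weight vector sums to zero, so a contrast is unaffected by any constant added to all four cells; reversing the sign of the weights gives the opposite comparison (e.g., HBC > SC).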

In order to analyze only activations above baseline, all contrast analyses (in both Experiment 1 and 2) were masked inclusively for the effect under investigation (e.g., for the contrast SC–HBC during AJ, the contrast was masked inclusively by SC\_AJ). Results were thresholded at *P* < 0.05 FWE corrected at the cluster or voxel level (cluster size estimated with a voxel level threshold of *P*-uncorrected = 0.001).
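Inclusive masking can be pictured as a voxel-wise logical AND between the contrast of interest and the masking effect. The sketch below uses toy per-voxel statistics and placeholder thresholds; the function name and all numbers are hypothetical and do not correspond to values from this study or to SPM8's implementation.

```python
def inclusive_mask(contrast_map, mask_map, contrast_threshold, mask_threshold):
    """Keep contrast values only where both contrast and mask exceed threshold.

    Maps are plain lists of per-voxel statistics; thresholds are statistic
    values supplied by the caller. Voxels that fail either test become None.
    """
    return [
        c if (c > contrast_threshold and m > mask_threshold) else None
        for c, m in zip(contrast_map, mask_map)
    ]


if __name__ == "__main__":
    # Toy statistics for five voxels (hypothetical numbers, not study data).
    sc_minus_hbc = [4.2, 3.9, 0.5, 5.1, 4.4]      # contrast SC - HBC during AJ
    sc_vs_baseline = [6.0, -2.0, 5.0, 3.5, 0.1]   # masking effect SC_AJ
    print(inclusive_mask(sc_minus_hbc, sc_vs_baseline, 3.1, 3.1))
```

In this toy example the second voxel shows a large SC–HBC difference but lies below baseline for SC, so the mask removes it; this is the kind of difference the masking step is meant to exclude.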

The location of the activation foci was determined in the stereotaxic space of MNI coordinates system. Those cerebral regions for which maps are provided were also localized with reference to cytoarchitectonical probabilistic maps of the human brain, using the SPM-Anatomy toolbox v1.7 (Eickhoff et al., 2005).

# **RESULTS**

**PRELIMINARY BEHAVIORAL RESULTS**

#### *Proportion judgment*

In the preliminary behavioral study, aimed at stimulus selection, we assessed participants' capacity to recognize proportion modifications in both sculpture and real-body images. Proportion rating was taken on a dichotomous measure (0 = disproportioned; 1 = proportioned). Non-parametric data analyses for related samples were carried out on the sum of the scores obtained within each stimulus classification [SC, modified sculptures (SM); HBC, modified human bodies (HBM)] testing probability rating between pairs of stimulus-combinations.

Wilcoxon signed-rank tests compared, separately, scores obtained for the canonical sculpture (SC) images with their corresponding modified versions (SM) and scores obtained for canonical HB (HBC) images with their modified versions (HBM). Results revealed that the probability of rating a canonical image as proportioned was greater than for the modified images in both stimulus-categories [SC–SM = 19 positive differences, two negative differences (*N* = 22); *z* = 3.13, *p* = 0.002; HBC–HBM = 19 positive differences, two negative differences (*N* = 22); *z* = 3.49, *p* < 0.001]. Additionally, analyses were carried out comparing proportion scores across categories (SC vs. HBC; SM vs. HBM). Results revealed that the probability of rating canonical sculpture images as proportioned did not differ from that for canonical real-body images [SC–HBC = 10 positive differences, 11 negative differences (*N* = 22); *z* = 0.001, *p* = 1]. Similarly, comparison between the modified versions of the stimuli across category showed no significant differences in proportion ascription [SM–HBM = 11 positive differences, eight negative differences (*N* = 22); *z* = 0.46, *p* = 0.65].
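For readers unfamiliar with the test, a minimal sketch of a Wilcoxon signed-rank computation for paired scores is given below. It uses the normal approximation without tie or continuity corrections, so it is not guaranteed to reproduce the values reported above, and the example scores are hypothetical, not the study data.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Wilcoxon signed-rank test for paired samples (normal approximation).

    Zero differences are dropped and tied absolute differences share the
    average rank. No tie or continuity correction is applied to the variance,
    so values can differ slightly from those of statistical packages.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    positives = sum(d > 0 for d in diffs)
    return {"positive": positives, "negative": n - positives,
            "w_plus": w_plus, "w_minus": w_minus, "z": z, "p": p}


if __name__ == "__main__":
    # Hypothetical per-observer sums of "proportioned" ratings, not study data.
    canonical = [12, 14, 11, 13, 15, 10, 12, 14]
    modified = [9, 10, 11, 8, 12, 11, 7, 9]
    print(wilcoxon_signed_rank(canonical, modified))
```

The returned counts of positive and negative differences correspond to the quantities listed in brackets in the text; pairs with a zero difference are excluded from both counts.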

#### *Esthetic judgment*

Esthetic ratings were provided on a scale ranging from 0 (ugly) to 7 (beautiful). Data analysis was carried out using repeated measures general linear models (GLM) and Greenhouse–Geisser values are reported when the sphericity assumption was violated (Mauchly's Test of Sphericity, *p* < 0.05).

To test for differences between canonical and modified stimuli across categories, a 2 × 2 repeated measures analysis was performed with two levels of stimulus-category (S vs*.* HB) and two levels of stimulus-type (C vs*.* M). The results revealed a main effect of stimulus-category [S > HB; *F*(1,21) = 9.33, *p* = 0.006, partial η<sup>2</sup> = 0.32, power = 0.83] as well as a main effect of stimulus-type [C > M; SC mean = 4.1, SD = 1.22; SM mean = 3.8, SD = 1.29; HBC mean = 3.5, SD = 1.1; HBM mean = 3.2, SD = 0.9; *F*(1,21) = 24, *p* = 0.0001, partial η<sup>2</sup> = 0.52, power = 0.99]. These results showed that sculpture images were rated esthetically higher than real-body images and that canonical stimuli were rated higher than their corresponding modified versions in both stimulus-categories.

#### *Dynamic vs. static*

Canonical sculpture and real-body images were classified into dynamic and static according to the criteria described in the Section "Materials and Methods." Expressed dynamism was assessed by nine independent judges (inter-rater correlation coefficient (ICC) = 0.85; *p* < 0.001).

To test for differences in esthetic rating between dynamic and static stimuli across categories, a 2 × 2 repeated measures analysis was performed with two levels of stimulus-category (S vs*.* HB) and two levels of stimulus-dynamism (Static vs*.* Dynamic). The results revealed no effects of either stimulus-category or stimulus-dynamism (*p* > 0.05).

#### *fMRI behavioral results*

Behavioral data analysis was carried out on the basis of participants' responses during AJ scanning sessions. Responses were dichotomous (see Materials and Methods). Since each stimulus was repeated twice, only responses that were consistent between repetitions were used for analysis. Overall, most of the responses were congruent between repetitions (% of congruence Experiment 1: SC = 95, SM = 93, HBC = 92, HBM = 92; % of congruence Experiment 2: SC = 95, SM = 93, HBC = 92, HBM = 92).

A 2 × 2 repeated measures GLM analysis with two levels of stimulus-category (S vs. HB) and two levels of stimulus-type (C vs. M) was carried out considering the percentage of judged-as-beautiful responses ascribed to each stimulus type/category. On the whole, data obtained from the fMRI behavioral responses replicated the results described above for the preliminary study. Results from Experiment 1 showed a main effect of stimulus-category [S > HB; *F*(1,30) = 4.29, *p* = 0.047, partial η<sup>2</sup> = 0.13, power = 0.52] as well as a main effect of stimulus-type [C > M; *F*(1,30) = 18.22, *p* < 0.001, partial η<sup>2</sup> = 0.39, power = 0.99]. Results from Experiment 2 showed only a main effect of stimulus-type [C > M; *F*(1,17) = 21.14, *p* < 0.001, partial η<sup>2</sup> = 0.55, power = 0.99] and no significant difference in esthetic rating across categories (*p* > 0.05).
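The partial η<sup>2</sup> values can be related to the F statistics through the standard identity partial η<sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies it to the F values quoted above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # F values and degrees of freedom as reported in the text.
    reported = {
        "Exp. 1, stimulus-category": (4.29, 1, 30),
        "Exp. 1, stimulus-type": (18.22, 1, 30),
        "Exp. 2, stimulus-type": (21.14, 1, 17),
    }
    for label, (f_value, df1, df2) in reported.items():
        print(label, round(partial_eta_squared(f_value, df1, df2), 2))
```

Because the reported F values are themselves rounded, effect sizes recomputed this way can differ from the published ones in the second decimal place.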

Finally, using the categorization of the *canonical* stimuli into dynamic (*n* = 8) and static (*n* = 8), we carried out a 2 × 2 repeated measures GLM analysis on esthetic rating ascribed as a function of stimulus-category (SC vs*.* HBC) and of stimulus-dynamism (dynamic vs. static). Results revealed no significant differences in either Experiment 1 or Experiment 2.

### **fMRI RESULTS**

# *Experiment 1*

*Overall effect of viewing sculpture and real human body images.* In the first fMRI analysis, we assessed, separately, the overall effect of viewing the sculpture (S) and the real HB images. In both cases, we pooled together brain activations in response to canonical (C) and modified (M) images across the two conditions (observation and esthetic judgment) and contrasted them with implicit baseline.

With respect to sculpture images, BOLD signal increase was found in the occipital lobe, inferior and middle temporal lobe, IPL/intraparietal sulcus, pre-SMA, ventral premotor cortex, and IFG. Signal increase was also observed in deep structures, including the hippocampus, amygdala, and insula. Most of the activations were bilateral, although more extensive in the right hemisphere (**Figure 2A**). The results are summarized in **Table 1A**.

With respect to viewing real HB images, BOLD signal increase was mostly found in the same areas that were activated when viewing sculpture images (**Figure 2B**; **Table 1B**). The main difference between the overall activations evoked by the two stimulus-categories lay in the lack of activation of the insular cortex when viewing real HB images (see between-category analysis below).

#### *Between-category differences*

*Canonical sculpture vs. canonical real human body images.* This analysis was carried out comparing activations associated with observation of canonical stimuli only. Direct comparison between sculpture and real HB images across experimental conditions (observation and esthetic judgment) revealed enhanced activation for canonical sculpture images in the fusiform gyrus bilaterally. Simple-contrast analyses within each experimental condition revealed additional enhanced activation of the antero-dorsal portion of the right insula during the esthetic judgment condition (**Figures 3A,B**). These results are summarized in **Table 2A**.

*Canonical real human body vs. canonical sculpture images.* Direct comparison between canonical real HB and canonical sculpture images across experimental conditions (observation and esthetic judgment) revealed enhanced activation for real HB images in right thalamus and in right superior temporal sulcus (STS, 46 50 10, *K*<sub>E</sub> = 63, *P*-uncorr = 0.008). Contrast analysis of simple effects between stimulus-categories for each experimental condition separately revealed that these activations were particularly enhanced during the esthetic judgment condition (**Figures 4A,B**; **Table 2A**).

# *Between-types differences*

*Canonical vs. modified sculpture images.* The direct comparison of canonical vs. modified sculpture images produced no significant enhanced activation for either canonical or modified images in either of the two experimental conditions (observation and esthetic judgment). These findings are in contrast with the results obtained in our former experiment (Di Dio et al., 2007), where the direct contrast of canonical vs. modified images across all experimental conditions revealed signal increase for the canonical stimuli in some cortical areas and in right insular cortex, particularly during the observation condition (see Experiment 2 below).

**FIGURE 2 | Activations for (A) sculpture and (B) real human body images vs. implicit baseline in Experiment 1 pooling together canonical and modified stimulus-types across the two experimental conditions (observation and esthetic judgment).** *P*-FWE corr < 0.05.

*Canonical vs. modified real human body images.* The direct comparison of canonical minus modified real HB images produced no significant enhanced activation in either of the two experimental conditions (observation and esthetic judgment). The opposite direct comparison (modified minus canonical real HB images), on the other hand, produced enhanced activation for the modified images in the left amygdala during the observation condition, as well as enhanced activation in a posterior cortical region straddling the inferior and middle temporal gyri during the esthetic judgment condition (**Table 2B**).

**Table 1 | Brain activations reflecting the common effects of canonical and modified stimuli (pooled together) vs. baseline across conditions (observation and esthetic judgment) observed in Experiment 1 for (A) sculpture and (B) real HB images.** The statistical significance refers to *P*-FWE corr < 0.05 at the voxel level.




**FIGURE 3 | (A)** Activation in the contrast canonical sculpture minus canonical real human body images (masked incl. by canonical sculpture images) during esthetic judgment condition in Experiment 1. **(B)** Activity profile within right insula (38 26 10) in arbitrary units (a.u.), ±10% confidence intervals (*P*-FWE corr < 0.05).

# **EXPERIMENT 2**

# *fMRI results*

*Overall effect of viewing sculpture and real human body images.* In this analysis, we assessed, separately, the overall effect of viewing both sculpture (S) and real HB images pooling together brain activations in response to canonical (C) and modified (M) images across the two conditions (observation and esthetic judgment) with respect to implicit baseline.

**Figure 5A** shows BOLD signal increase for sculpture images. Most of the activations replicated those observed in Experiment 1 (see **Table 3A**). Activated areas included occipital cortex, fusiform gyrus, lingual gyrus, posterior parietal cortex, IPL, pre-SMA, premotor cortex, and IFG. Additionally, enhanced activations were observed in deep structures, including hippocampus, amygdala, and the anterior insula. Most of the activations were bilateral. Finally, differently from Experiment 1, signal increase was also found in medial frontal areas, including right anterior cingulate cortex and left orbitofrontal cortex.

**Figure 5B** shows activations relative to viewing real HB images. Similarly to Experiment 1, BOLD signal increase was found in the same areas that were activated when viewing sculpture images, the major difference being an additional activation at the level of the basal ganglia nuclear complex (**Table 3B**).

# *Between-category differences*

*Canonical sculpture vs. canonical real human body images.* Direct comparison between canonical sculpture and canonical real HB images across experimental conditions (observation and esthetic judgment) revealed greater activations for sculpture images in lingual and fusiform gyri. Additional activations were observed from simple contrast analyses. More specifically, during the observation condition there was increased activation for canonical sculpture vs. canonical HB images in right cuneus, right IPL, right IFG pars triangularis and pars opercularis, and in the anterior dorsal part of right insula (**Table 4A**).

*Canonical real human body vs. canonical sculpture images.* The direct comparison between canonical real HB vs. canonical sculpture images across experimental conditions (observation and esthetic judgment) revealed enhanced activations bilaterally in the caudal part of the temporal lobe straddling the middle and superior temporal gyri and extending medially to include the STS. Simple contrast analyses showed that activation of left STS was particularly strong during the esthetic judgment condition (**Table 4A**).

# *Between-type differences*

**Table 2 | Brain activity reflecting the effect of stimulus (A) category (canonical sculpture and canonical real HB images) and (B) type (canonical and modified) for the two conditions (O** = **observation; AJ** = **esthetic judgment) in Experiment 1; SC** = **sculpture canonical; SM** = **sculpture modified; HBC** = **real human body canonical; HBM** = **real human body modified.** The reported statistical significance is at cluster level and refers to activations significant (*P*-FWE corr < 0.05) at the cluster and/or voxel level.

*Canonical vs. modified sculpture images.* The direct comparison of canonical vs. modified sculpture images revealed significant differences during observation condition only. More specifically, signal increase was observed for canonical images in the caudal part of right middle temporal gyrus, IFG pars triangularis, and, crucially, in right anterior dorsal insular cortex (**Figures 6A,B**; **Table 4B**).

The contrast modified vs. canonical sculpture images revealed signal increase during esthetic judgment condition in right supramarginal gyrus and right ventral premotor cortex (BA44; **Table 4B**).

*Canonical vs. modified human body images.* The comparison of canonical vs. modified HB images revealed no significant differences. The opposite contrast (modified vs. canonical HB images), on the other hand, revealed signal increase for the modified images in ventral premotor cortex (BA44) during observation condition and in superior parietal lobule, inferior temporal gyrus, and fusiform gyrus during esthetic judgment condition. All activations were lateralized in the right hemisphere (**Table 4B**).

### **DISCUSSION**

In a previous study we showed that activation of the anterior sector of the right insula is associated with the hedonic state underpinning esthetic experience during the observation of artworks (Di Dio et al., 2007). The main aim of the present study was to investigate whether this specific hedonic response is also present during the observation of non-art biological stimuli. For this purpose, we compared brain activations when participants observed sculpture images with brain activations during the observation of real HB represented by photographs of young athletes.

The global pattern of cortical activations during the presentation of sculptures and real HB was very similar. Activations included visual occipital and temporal areas, IPL/intraparietal sulcus, ventral premotor cortex, and IFG. Signal increase was also observed in deep structures, such as the hippocampus and amygdala. Most of the activations were bilateral, although more extensive in the right hemisphere. The direct comparison between SC and canonical real bodies highlighted, however, some important differences. The observation of sculpture images determined, relative to real HB images, a greater activation of *right anterior dorsal insula,* as well as activation of some visual areas and, in particular, of fusiform gyrus. The opposite contrast (HB minus sculpture images) showed a greater activation of the STS.

It is known from both monkey (see Desimone et al., 1984; Tsao et al., 2006; Gross, 2008) and human studies that portions of the inferotemporal lobe and of its human homolog (the fusiform gyrus) play a crucial role in the processing of faces (for review see McKone and Kanwisher, 2005; Gross, 2008). Furthermore, it was also shown that some sectors of fusiform gyrus encode, with nearly the same level of selectivity, images of the human body (Peelen and Downing, 2004; Schwarzlose et al., 2005). In this light, it is plausible that the fusiform activation observed in the present study reflected a more detailed visual analysis of the physical aspects of the body (e.g., size, shape, proportion) for sculpture than for real HB images.

The comparison between real HB vs. sculpture images showed a consistent activation of the STS. STS is a region known to be involved in visual processing of movement of body parts. Thus, STS activation was likely due to a matching between the observed HB images and the representation of body movement encoded in this region (see Perrett et al., 1989; Allison et al., 2000; Pelphrey et al., 2004; Thompson et al., 2005). Note that, although in the present study we used static stimuli, there is evidence that these stimuli, when implying motion, are able to activate visual areas encoding overt movements, as shown for area MT/V5 by Kourtzi and Kanwisher (2000).

In the present study, both sculpture and real-body images contained an equal number of static and dynamic stimuli. It is then likely that the activation differences observed between real-body and sculpture images are not to be ascribed to differences in stimulus properties (such as dynamism; see also behavioral results), but rather to different attention deployment in the two cases. Attention was more focused on action in the case of real human images, whilst it was more focused on the physical aspects of the body in the case of sculpture images. In turn, these different attention allocations could be related to different attitudes toward the presented images. In the case of the real HB, the implicit attitude of the observers would be that of trying to understand the meaning of the represented gestures and, possibly, the intention of the observed individuals. In contrast, the sculptures constitute an artistic representation of the HB and the spontaneous attitude of the observers would be that of exploring them with the purpose of appreciating their physical properties.

The most important finding of our study lies, however, in the activation of right insula in the contrast sculpture vs. real HB images. The activated part of the insula was located in its rostro-dorsal sector. This sector corresponds to the insular region also found activated in our previous study in the contrast canonical vs. modified sculpture images (Di Dio et al., 2007) and confirmed from the same contrast in Experiment 2 of the present study. Since canonical proportions are positively related to esthetic evaluation of sculpture images, we interpreted this activation as the hedonic signature of esthetic experience when viewing artworks.

Insula is an extremely complex and heterogeneous structure including a posterior granular (sensory part), a central large dysgranular, and a small rostro-ventral agranular (motor and vegetative parts) sector (see Mesulam and Mufson, 1982, 1985; Augustine, 1996). A recent meta-analysis of the human insula by Kurth et al. (2010) revealed four functionally distinct regions corresponding to sensory–motor, olfacto-gustatory, social–emotional, and cognitive networks of the brain. Social–emotional aspects activate the ventro-rostral part of the insula while all tested functions, except for sensory–motor function, overlap on its anterior dorsal portion. These data allow one to specify better the functional role of this region in mediating hedonic experiences when viewing artworks. This region does not encode the mere emotional aspect of the stimuli, but integrates cognitive and emotional processes to create a coherent experience of the attended stimuli. Although activation of this region is not uniquely deputed to esthetic experience (see Kurth et al., 2010), our results indicate that it plays a fundamental role in providing a hedonic quality to art processing.

**Table 3 | Brain activations reflecting the common effects of canonical and modified stimuli (pooled together) vs. baseline across conditions (observation and esthetic judgment) observed in Experiment 2 for (A) sculpture and (B) real HB images.** The statistical significance refers to *P*-FWE corr < 0.05 at the voxel level.

**Table 4 | Brain activity reflecting the effect of stimulus (A) category (canonical sculpture and canonical real HB images) and (B) type (canonical and modified) for the two conditions (O** = **observation; AJ** = **esthetic judgment) in Experiment 2; SC** = **sculpture canonical; SM** = **sculpture modified; HBC** = **real human body canonical; HBM** = **real human body modified.** The reported statistical significance is at cluster level and refers to activations significant (*P*-FWE corr < 0.05) at the cluster and/or voxel level.

One may argue that insular activation observed for sculpture images, and not for real HB stimuli, could have been triggered by the sculptures' complete nudity, a factor that was not counterbalanced between categories. In this respect, some studies investigating the neural correlates of emotional response to arousing stimuli report insula activation. Often, the arousing stimuli are film clips or photographs depicting nudes and sex scenes (e.g., Stoléru et al., 1999; Gizewski et al., 2006; Safron et al., 2007). In these studies, right ventral insula and/or left insula were found activated when attending to arousing stimuli. The rostro-ventral insular sector found activated in these studies is different from the more dorsal sector observed in our study. Anterior ventral insula is often associated with a representation of autonomic states (e.g., Critchley et al., 2002) and with the presentation of stimuli holding a socio-emotional status (see Kurth et al., 2010 for a review). Most noteworthy, our results indicate that the insular sector found activated in the contrast sculpture vs. real HB stimuli also showed a lower activation in association with the decreased esthetic valence conveyed by the proportion-modified stimuli. For this reason, we suggest that the right antero-dorsal insular activation observed for sculpture images in the present study is evoked by a hedonic state associated with the esthetic dimension of the sculptures.

Insular activation was absent in the case of observation of real HB images, irrespective of proportion modification. It is worth noting, in this respect, that behavioral data showed that proportion affected esthetic rating in both stimulus-categories; namely, the canonical images were preferred to modified images also in the case of real HB. What these data seem to suggest is that the enhanced insular activation observed for sculpture images compared to real-body images, and particularly for canonical ones, emerged from attention to specific physical properties of the sculpture images that, when altered, determined a diminished hedonic response in the viewer. This specific hedonic response was not present when judging the esthetics of real-body images. This does not imply that there is no esthetic experience associated with the viewing of real-body images. However, our data show that this experience does not have the same neural substrates as those underpinning the viewing of sculptures. Exploration of the neural correlates associated with esthetic experience for real HB was beyond the purpose of the present study and we cannot draw any conclusions on this issue.

# **CONCLUSION**

Here we tested whether the neural activations underpinning hedonic experience when viewing an artistic representation of the HB (masterpieces of classical art) are also present when observing images of non-art biological stimuli (real HB). Imaging results indicated that esthetic experience for artworks recruited the anterior sector of right dorsal insula. This sector was not activated when attending real HB images. This indicates that esthetic experience for artworks and non-art biological stimuli does not share the same neural substrate.

It would be too reductive, however, to think that esthetic experience occurs because of the activation of the antero-dorsal insula alone. Our view is that esthetic experience derives from a *joint* activity of neural cortical populations responsive to specific elementary or high order features present in works of art and neurons located in emotion controlling centers. A recent meta-analysis on the functional properties of the different sectors of the insula indicates that the insular region we found activated during the viewing of artworks does not merely mediate emotions but links emotion to cognition. We suggest that this binding plays a fundamental role in determining the hedonic dimension of esthetic experience for artworks.

# **ACKNOWLEDGMENTS**

We are grateful to Fondazione Cassa di Risparmio di Parma (CARIPARMA) for providing the facilities to conduct this study.

# **REFERENCES**

P. J. (2007). Neural correlates of sexual arousal in homosexual and heterosexual men. *Behav. Neurosci.* 121, 237–248.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 March 2011; accepted: 31 October 2011; published online: 18 November 2011.*

*Citation: Di Dio C, Canessa N, Cappa SF and Rizzolatti G (2011) Specificity of esthetic experience for artworks: an fMRI study. Front. Hum. Neurosci. 5:139. doi: 10.3389/fnhum.2011.00139*

*Copyright © 2011 Di Dio, Canessa, Cappa and Rizzolatti. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **AESTHETIC APPRECIATION: EVENT-RELATED FIELD AND TIME-FREQUENCY ANALYSES**

#### *Enric Munar<sup>1</sup>\*, Marcos Nadal<sup>1</sup>, Nazareth P. Castellanos<sup>2</sup>, Albert Flexas<sup>1</sup>, Fernando Maestú<sup>2</sup>, Claudio Mirasso<sup>3</sup> and Camilo J. Cela-Conde<sup>1</sup>*

*<sup>1</sup> Human Evolution and Cognition (EvoCog), University of the Balearic Islands, Palma (Mallorca), Spain*

*<sup>2</sup> Laboratory of Cognitive and Computational Neuroscience, Complutense University of Madrid-Polytechnic University of Madrid, Madrid, Spain*

*<sup>3</sup> IFISC, Instituto de Física Interdisciplinar y Sistemas Complejos (CSIC-UIB), University of the Balearic Islands, Palma (Mallorca), Spain*

#### *Edited by:*

*Luis M. Martinez, Universidad Miguel Hernández, Spain*

#### *Reviewed by:*

*Rodrigo Q. Quiroga, University of Leicester, UK; Manuel Molano, Instituto de Neurociencias de Alicante, Spain*

#### *\*Correspondence:*

*Enric Munar, Human Evolution and Cognition (EvoCog), University of the Balearic Islands, Ctra. de Valldemossa, km. 7'5., 07122-Palma, Illes Balears, Spain. e-mail: enric.munar@uib.cat*

Improvements in neuroimaging methods have afforded significant advances in our knowledge of the cognitive and neural foundations of aesthetic appreciation. We used magnetoencephalography (MEG) to register brain activity while participants decided about the beauty of visual stimuli. The data were analyzed with event-related field (ERF) and Time-Frequency (TF) procedures. ERFs revealed no significant differences between brain activity related with stimuli rated as "beautiful" and "not beautiful." TF analysis showed clear differences between both conditions 400 ms after stimulus onset. Oscillatory power was greater for stimuli rated as "beautiful" than those regarded as "not beautiful" in the four frequency bands (theta, alpha, beta, and gamma). These results are interpreted in the frame of synchronization studies.

**Keywords: neuroaesthetics, experimental aesthetics, MEG, ERF, visual perception, time-frequency analysis**

# **INTRODUCTION**

Neuroaesthetics is growing fast, and the neural correlates of aesthetic appreciation are now coming into focus (Chatterjee, 2011; Nadal and Pearce, 2011). Neuroimaging studies have revealed that positive aesthetic experiences, reported as high liking, preference, or beauty ratings, are associated with at least three patterns of brain activity. First, the enhancement of low- and high-level visual, somatosensory, and auditory cortical processing has been observed while people report aesthetically positive engagements with paintings or landscape photographs (Vartanian and Goel, 2004; Yue et al., 2007; Cela-Conde et al., 2009; Cupchik et al., 2009), dance movements or postures (Calvo-Merino et al., 2008, 2010), and music excerpts (Brown et al., 2004; Koelsch et al., 2006), respectively. Second, activity in cortical regions involved in top-down processing and evaluative judgment is also a common finding (Cela-Conde et al., 2004; Jacobsen et al., 2006; Lengger et al., 2007; Cupchik et al., 2009). Finally, several studies have reported activation of cortical and subcortical brain regions considered to be part of the reward circuit. These regions are related with different facets of affective and emotional processing. Namely, the orbitofrontal cortex, which seems to be involved in the representation of reward value, has been associated with positive aesthetic experiences of music (Blood et al., 1999; Blood and Zatorre, 2001), architecture (Kirk et al., 2009a), and paintings (Kawabata and Zeki, 2004; Cupchik et al., 2009; Kirk et al., 2009b). Activity in the anterior cingulate cortex, possibly related with the monitoring of one's own affective state, has also been identified while rating paintings (Vartanian and Goel, 2004; Cupchik et al., 2009), architecture (Kirk et al., 2009a) and music (Blood et al., 1999). Subcortical components of the reward circuit, such as the ventral striatum, the caudate nucleus, the substantia nigra, or the amygdala, have been shown to be involved in aesthetic experiences by a considerable number of studies (Blood et al., 1999; Blood and Zatorre, 2001; Brown et al., 2004; Vartanian and Goel, 2004; Koelsch et al., 2006; Bar and Neta, 2007; Gosselin et al., 2007; Mitterschiffthaler et al., 2007; Cupchik et al., 2009; Kirk et al., 2009b; Salimpoor et al., 2011).

Although these studies collectively provide an overall picture of the brain regions involved in aesthetic appreciation, little is known about the temporal course of the underlying neural processes. In the present study we apply novel neuroimaging data analyses, currently used in diverse areas of the neurosciences, to explore and tentatively characterize the dynamics of the neural correlates of aesthetic preference. We thus aim to overcome a common objection faced by neuroimaging data analysis: the assumption of the stationary nature of neurophysiological signals. Most spectral studies of continuous time series, such as electroencephalography (EEG) or magnetoencephalography (MEG) recordings, involve the use of a spectral analysis based on Fourier transformation. Although this technique has been extremely fruitful in the advance of neuroscience, it assumes that the neural activity under study is stationary, and thus does not allow inferences on dynamical changes. Indeed, any analysis based entirely on the classical Fourier transform ignores the dynamical aspects. New methods suited to reveal temporal variations are therefore required to study the essential role of temporal resolution.

There have been successful attempts to adapt Fourier-based methods, for example, by means of sliding windows (Bayram and Baraniuk, 1996; Lovett and Ropella, 1997; Xu et al., 1999), similarly to the classical Gabor transform (Mallat, 1999). The wavelet transform is a method of time series analysis capable of coping with complex non-stationary signals—it was, in fact, designed to do just that. Although it has been increasingly used in the field of neuroscience during the last decade, it has been part of brain signal analysis from the very beginning. The analysis of EEG recordings has been its most frequent application (see e.g., Alegre et al., 2003; Quiroga and Garcia, 2003; Castellanos and Makarov, 2006; Campo et al., 2010; Castellanos et al., 2010). The wavelet transform technique provides high temporal resolution with good frequency resolution, and offers a reasonable compromise between these parameters. These advantages fit well with the purpose of Time-Frequency (TF) estimation of a signal, allowing the study of the spectral power dynamics, and hence a detailed comparison between experimental conditions during all the steps composing a designed task (Lindsen et al., 2010).
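The contrast between a stationary Fourier spectrum and a time-resolved wavelet estimate can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: the complex Morlet wavelet, the `n_cycles` parameter, the 200 Hz sampling rate, and the synthetic test signal (10 Hz in the first second, 30 Hz in the second) are all assumptions chosen for illustration.

```python
import cmath
import math


def morlet_power(signal, fs, freq, n_cycles=6.0):
    """Power over time at one frequency, by convolution with a complex Morlet wavelet.

    n_cycles is an illustrative choice, not a value taken from the study.
    """
    sigma_t = n_cycles / (2.0 * math.pi * freq)   # Gaussian width in seconds
    half = int(round(3.0 * sigma_t * fs))         # wavelet support: +/- 3 sigma
    wavelet = []
    for k in range(-half, half + 1):
        t = k / fs
        gauss = math.exp(-(t * t) / (2.0 * sigma_t ** 2))
        wavelet.append(gauss * cmath.exp(2j * math.pi * freq * t))
    norm = sum(abs(w) for w in wavelet)           # amplitude normalization
    power = []
    for n in range(len(signal)):
        acc = 0j
        for k in range(-half, half + 1):
            idx = n - k
            if 0 <= idx < len(signal):            # zero-padding at the edges
                acc += signal[idx] * wavelet[k + half]
        power.append(abs(acc / norm) ** 2)
    return power


# Synthetic non-stationary signal: 10 Hz for the first second, 30 Hz afterwards.
fs = 200.0
signal = []
for n in range(400):
    t = n / fs
    f = 10.0 if t < 1.0 else 30.0
    signal.append(math.sin(2.0 * math.pi * f * t))

power_10hz = morlet_power(signal, fs, 10.0)
first_half_power = sum(power_10hz[:200]) / 200
second_half_power = sum(power_10hz[200:]) / 200
```

In this sketch the 10 Hz power is concentrated in the first second, a temporal change that a single Fourier spectrum of the whole segment would not localize.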

In this article, we report the results of two different analyses of MEG data recorded during a typical aesthetic appreciation task. First, we present the results of a standard event-related field (ERF) analysis. Second, we also performed a TF analysis, which, on the one hand, compared pre- and post-stimulus activity in different frequency bands and, on the other, compared the different activity related with stimuli regarded as beautiful and not beautiful in different bands and brain regions. This kind of TF analysis avoids the misleading simplification resulting from the localization of apparently static and isolated foci of neural activity. It has the potential, therefore, of making a significant step forward in the characterization of the dynamics of large-scale neural communication inherent to aesthetic appreciation, among many other complex cognitive faculties (Lindsen et al., 2010), which emerges from multifaceted cognitive processes related with neural activity in different brain structures and at different time frames.

# **MATERIALS AND METHODS**

# **PARTICIPANTS**

Ten women and 10 men volunteered to perform the aesthetic appreciation task. They were all graduate students at the Complutense University in Madrid. They all had normal or corrected-to-normal vision, and no previous training in art. All were right-handed and gave informed consent. The experiment was approved by the Ethical Committee of the Comunitat Autònoma de les Illes Balears (Spain).

# **PROCEDURE**

The resolution, size, perceived complexity, color spectrum, and luminous emittance were homogenized for the 400 stimuli used in this study. These five operations and the stimuli were described in detail by Cela-Conde et al. (2009). These images belonged to five different categories, and represented a broad enough range of contents and styles to provide each participant with positive and negative aesthetic experiences throughout the task: (1) Fifty reproductions of abstract paintings; (2) Fifty reproductions of seventeenth and eighteenth century realist paintings; (3) Fifty reproductions of Impressionist paintings; (4) Fifty reproductions of Post-impressionist paintings; (5) Two hundred photographs of landscapes, artifacts, urban scenes, and the like (true-life pictures from the Master Clips Premium Image Collection, IMSI, San Rafael, CA; the book *Boring Postcards*, London, Phaidon Press; and photographs taken by us). We used the collection *Movements in Modern Art* from the Tate Gallery, London, as a guide to select artistic styles, to which we added seventeenth and eighteenth century realist painting. To avoid the activation of facial-recognition brain mechanisms, pictures containing close views of humans were not included. Four stimuli (two artistic and two natural) were used for the participants' preliminary training.

Before going into the MEG isolated room, participants received a short briefing about the technique and the aesthetic appreciation task they were required to carry out. They were informed that it was a two-alternative forced choice (2AFC) task, with two response levels: (a) beautiful, and (b) not beautiful. Participants were asked to indicate whether they found each stimulus to be "beautiful" or "not beautiful", based on their own subjective opinion. Half of them indicated "beautiful" by raising an index finger, and the other half indicated "not beautiful" also by raising an index finger. The remaining stimuli were classified as belonging to the other condition, that is, as "not beautiful" and "beautiful," respectively. Half of the participants used the right index finger to answer and the other half used the left index finger. At the beginning of the task, a gray screen appeared for 1 s, then a stimulus was presented for 3 s, during which the participant could choose to respond or not. There was a random interstimulus interval lasting between 1000 and 1200 ms after each 3 s presentation. The same four stimuli were presented at the beginning for all participants' training trials.

During the experimental session, the stimuli were presented using a computer running the SuperLab application. The images were projected with an LCD video projector placed outside of the MEG shielded room onto a series of mirrors located inside, the last of which was suspended ≈1 m above the participant's face. The pictures subtended 1.8° and 3° of vertical and horizontal visual angles, respectively. While participants carried out the aesthetic appreciation task, MEG recordings were performed with a whole-head neuromagnetometer (Magnes 2500 WH, 4-D Neuroimaging) consisting of 148 magnetometer coils.

Raw data were collected using a sampling rate of 668.45 Hz and band-pass filtered between 0.1 and 50 Hz. MEG data were subjected to an interactive environmental noise reduction procedure. Fields were measured during a no-task eyes-open condition. Time segments containing eye movements or blinks (as indicated by peak-to-peak amplitudes in the electro-oculogram channels in excess of 50 µV) or other myogenic or mechanical artifacts were rejected, and time windows not containing artifacts were visually selected by an experienced investigator, leading to 12 s clean segments. The minimum number of trials obtained after artifact rejection was 90 for every participant. Digitized MEG data were imported into MATLAB Version 7.4 (Mathworks, Natick, MA) for analysis with custom-written scripts.
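The peak-to-peak criterion for eye-movement artifacts can be sketched as follows. This is an illustrative reconstruction, not the authors' MATLAB scripts: the function names and the toy electro-oculogram values are hypothetical, and only the 50 µV threshold comes from the text.

```python
def peak_to_peak(samples):
    """Peak-to-peak amplitude of one channel segment."""
    return max(samples) - min(samples)


def keep_clean_epochs(eog_epochs_uv, threshold_uv=50.0):
    """Indices of epochs whose EOG peak-to-peak amplitude does not exceed the threshold.

    eog_epochs_uv: list of epochs, each a list of EOG samples in microvolts.
    """
    return [
        i for i, eog in enumerate(eog_epochs_uv)
        if peak_to_peak(eog) <= threshold_uv
    ]


# Toy data (hypothetical): the second epoch contains a blink-like deflection.
eog_epochs = [
    [-5.0, 10.0, 3.0],    # peak-to-peak 15 µV, kept
    [-40.0, 30.0, 0.0],   # peak-to-peak 70 µV, rejected
    [0.0, 20.0, -20.0],   # peak-to-peak 40 µV, kept
]
clean_indices = keep_clean_epochs(eog_epochs)
```

The automatic criterion covers only the ocular artifacts; the visual inspection by an experienced investigator described above is not modeled here.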

# **DATA ANALYSIS AND RESULTS**

# **BEHAVIORAL ANALYSIS**

In order to detect any possible effect of stimuli category on participants' responses, we carried out a means comparison with the "beautiful" responses as dependent variable and stimuli category as independent variable. On average, artistic stimuli were rated as beautiful on 99.7 of the 200 possible occasions. Nonartistic stimuli, on the other hand, were rated as beautiful on 96.1 of the instances. This difference, however, was non-significant [*t*(19) = 0.69; *p* = 0.4].

With regard to the set of artistic stimuli, a one-way ANOVA was performed taking into account the four subcategories (abstract, realist, impressionist, and post-impressionist artworks). On average, participants rated 23.4 of the abstract artworks, 24.75 of the realist artworks, 26.05 of the impressionist artworks, and 24.85 of the post-impressionist artworks as beautiful. Again, there were no significant differences among the ratings for these stimuli categories [*F*(3,76) = 1.693; *p* = 0.176].

# **EVENT-RELATED FIELDS (ERF)**

ERFs were derived by averaging single trials. Average ERFs were calculated for each condition (beautiful and not beautiful), individual sensor (148), and participant (20). A period of 500 ms prior to target onset was defined as the baseline. ERFs were calculated for 100 ms periods, individual sensor and for each condition (**Figure 1**). A strong positive effect appeared in the right anterior temporal region within the 100–200 ms window. Its corresponding negative effect appeared in the contralateral hemisphere, that is, the left anterior temporal region. This effect was sustained during the subsequent time windows until 500 ms. This was observed both for beautiful stimuli (**Figure 1A**) and not beautiful stimuli (**Figure 1B**).
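The averaging step for one sensor and one condition can be sketched as below. Subtracting the mean of the pre-stimulus baseline from each trial before averaging is a common convention and is assumed here; the text states only that a 500 ms baseline was defined. Function names and the toy trial values are hypothetical.

```python
def baseline_correct(trial, n_baseline):
    """Subtract the mean of the first n_baseline samples (pre-stimulus period)."""
    base = sum(trial[:n_baseline]) / n_baseline
    return [x - base for x in trial]


def average_erf(trials, n_baseline):
    """Event-related field for one sensor: mean across baseline-corrected trials."""
    corrected = [baseline_correct(t, n_baseline) for t in trials]
    n_trials = len(corrected)
    # zip(*corrected) groups the samples that share the same time point
    return [sum(samples) / n_trials for samples in zip(*corrected)]


# Toy data (hypothetical): two trials, the first two samples are baseline.
trials = [
    [1.0, 1.0, 3.0, 5.0],
    [2.0, 2.0, 2.0, 4.0],
]
erf = average_erf(trials, n_baseline=2)
```

With real data the same function would be applied per condition, sensor, and participant, with `n_baseline` set to the number of samples in the 500 ms pre-stimulus period.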

We selected the main contributing sensors to this effect: sensors 108, 109, 110, 111, 126, 127, 128, 129, 144, 145, 146, 147, and 148 for the right hemisphere; sensors 96, 97, 98, 99, 114, 115, 116, 117, 132, 133, and 134 for the left hemisphere. Grand averages of these sensors were calculated from 0.5 s prior to stimulus onset to 1 s after stimulus onset (**Figure 2**) for both conditions. Three peaks showed the dominant effects during the aesthetic appreciation task in the pointwise ERF analysis: an early component (160–180 ms) with the largest amplitude, an intermediate component (250–280 ms), and a late component (450–480 ms). The last two peaks seemed to be a consequence of the first one, a sort of sustainment of activity.

For statistical analyses, a procedure used in other kinds of ERF and Event-Related Potentials (ERPs) studies was applied to analyze the activity of the MEG waveform, in this case, as a function of rated beauty. Parametric and non-parametric tests were calculated for each time point after stimuli onset for each individual MEG sensor in order to identify the modulation of the ERF as a function of beauty. These analyses were conducted using a significance criterion of *p* < 0.05 with Bonferroni correction based on the 1356 tests per analyzed sensor, a test for each epoch. The analysis was performed independently for each sensor. There were no significant differences between beautiful and not beautiful stimuli in the modulation of the ERF.
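With 1356 tests per sensor, the Bonferroni-corrected threshold is 0.05 / 1356, roughly 3.7 × 10⁻⁵ per time point. The sketch below illustrates such a pointwise comparison for one sensor. The text only states that parametric and non-parametric tests were used, so the paired *t*-test and the function name here are assumptions, not the authors' procedure.

```python
import numpy as np
from scipy import stats

def pointwise_bonferroni(cond_a, cond_b, alpha=0.05):
    """Paired t-test at each time point with Bonferroni correction.

    cond_a, cond_b: arrays (n_participants, n_times) holding one sensor's ERF
    per participant for each condition. The paired t-test is an assumed choice.
    Returns a boolean array marking time points that survive correction.
    """
    n_tests = cond_a.shape[1]                    # one test per time point
    _, p = stats.ttest_rel(cond_a, cond_b, axis=0)
    return p < alpha / n_tests
```

Running this independently for each of the 148 sensors mirrors the per-sensor analysis described above.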

**FIGURE 2 | Time course of activity in the right and left hemispheres related with "beautiful" and "not beautiful" stimuli from 500 ms prior to the stimulus onset to 1000 ms post-stimulus. (A)** Average activity of the 108, 109, 110, 111, 126, 127, 128, 129, 144, 145, 146, 147, and 148 right sensors. **(B)** Average activity of the 96, 97, 98, 99, 114, 115, 116, 117, 132, 133, and 134 left sensors.

# **TIME-FREQUENCY ANALYSIS**

This analysis aimed to overcome the aforementioned criticism of Fourier transformation-based spectral analysis: the assumption of the stationary nature of neurophysiological signals. The wavelet transform's advantages fit well with the purpose of offering a TF characterization of a signal, allowing the study of the spectral power dynamics and, therefore, a detailed comparison between experimental conditions, i.e., beautiful and not beautiful, during the stages of the aesthetic appreciation task. The wavelet coefficients, *W*(*p*,*z*), can be obtained as follows:

$$W(p, z) = \frac{1}{\sqrt{p}} \int_{-\infty}^{\infty} x(t) \, \Psi^{*} \left( \frac{t - z}{p} \right) dt$$

where *x*(*t*) is the recorded signal, Ψ\* denotes the complex conjugate of the wavelet function, the parameter *z* defines the time localization, and *p* is the wavelet timescale representing the period of the rhythmic component (Mallat, 1999; Torrence and Compo, 1998; Grinsted et al., 2004). TF representation of MEG data was calculated on a single trial basis for a 1500 ms time window starting from 500 ms before and ending 1000 ms after the onset of the stimulus presentation, using a Morlet wavelet function. A baseline correction was performed in order to estimate stimulus-evoked oscillations. Power in the standard frequency bands of theta (4–8 Hz), alpha (8–12 Hz), beta (12–30 Hz), and gamma (30–50 Hz) was computed. The sensors were grouped such that they related to five brain regions: Frontal (17 sensors), Right Temporal (30), Left Temporal (36), Occipital (32), and Central (33) (**Figure 3**).
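A discrete version of the wavelet integral can be written as a convolution of the signal with a scaled Morlet wavelet. The sketch below is illustrative only: the centre frequency `w0 = 6`, the ±4 timescale kernel support, the half-open band edges, and the subtractive baseline correction are all assumptions, since the text does not specify them.

```python
import numpy as np

# Band limits in Hz, treated as half-open intervals [lo, hi) in this sketch.
BANDS = {"theta": (4, 8), "alpha": (8, 12), "beta": (12, 30), "gamma": (30, 50)}

def morlet_power(x, fs, freqs, w0=6.0):
    """Wavelet power |W(p, z)|^2 of a single trial x sampled at fs Hz."""
    power = np.empty((len(freqs), x.size))
    for i, f in enumerate(freqs):
        p = w0 / (2 * np.pi * f)                  # timescale matched to frequency f
        half = int(np.ceil(4 * p * fs))           # kernel support of +/- 4 timescales
        u = np.arange(-half, half + 1) / fs
        psi = np.pi ** -0.25 * np.exp(1j * w0 * u / p) * np.exp(-0.5 * (u / p) ** 2)
        # For the Morlet wavelet, conj(psi(-u)) == psi(u), so the integral with
        # the conjugated wavelet becomes a plain convolution with psi(u / p).
        full = np.convolve(x, psi) / (fs * np.sqrt(p))
        power[i] = np.abs(full[half:half + x.size]) ** 2
    return power

def band_power(power, freqs, times, baseline=(-0.5, 0.0)):
    """Average power per band and subtract each band's mean pre-stimulus power."""
    freqs = np.asarray(freqs)
    base = (times >= baseline[0]) & (times < baseline[1])
    out = {}
    for name, (lo, hi) in BANDS.items():
        band = power[(freqs >= lo) & (freqs < hi)].mean(axis=0)
        out[name] = band - band[base].mean()
    return out
```

Averaging across the frequencies inside each band, as `band_power` does, yields one time course per band rather than a full time-frequency map.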

TF sensor representations in the four spectral bands were compared over time between pre-stimulus and post-stimulus activity using a Kruskal-Wallis test (*p* < 0.001) with a False Discovery Rate (FDR) correction. We used an FDR correction due to the exploratory character of this study, and not a Family-Wise Error Rate (FWER) correction, which is better suited for confirmatory designs (Storey, 2002; Groppe et al., 2011). Moreover, some studies have demonstrated that FDR control is useful both with "independent test statistics" and "dependent test statistics" (Benjamini and Yekutieli, 2001; Verhoeven et al., 2005) and has been applied to TF analyses in the context of clinical neuroimaging studies (Kobayashi et al., 2009). Note that we refer to time-spectra changes and not TF changes because we averaged over frequency in order to estimate the dynamical (time-dependent) changes per spectral band and not per frequency scale.
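For readers unfamiliar with FDR control, the sketch below implements the Benjamini-Hochberg step-up procedure. The text does not state which FDR variant was applied, so the choice of Benjamini-Hochberg and the function name `fdr_bh` are assumptions made for illustration.

```python
import numpy as np

def fdr_bh(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float).ravel()
    m = p.size
    order = np.argsort(p)
    # Compare the i-th smallest p-value against q * i / m.
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])          # largest rank meeting the criterion
        reject[order[:k + 1]] = True              # reject it and all smaller p-values
    return reject.reshape(np.shape(pvals))
```

Unlike a Bonferroni threshold, the cut-off here adapts to the observed distribution of *p*-values, which is why it is less conservative when many tests are run.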

Statistical differences are summarized and represented in **Figure 4**. Stimuli rated by participants as beautiful and not beautiful were considered separately. The results for the theta band are presented in the upper part of the figure, next the results for the alpha band, and thereafter those for beta and gamma bands. Within each of the bands, the results from the frontal sensors (F) are presented first, then those from the right temporal sensors (RT), then those from the left temporal sensors (LT), followed by those from the occipital sensors (O), and finally those from the central sensors (C). The resulting patterns for stimuli rated as beautiful and not beautiful were very similar. The largest activity in both conditions appeared in the alpha waveband in temporal lobe regions at close to 200 ms after stimulus onset. These results confirmed the previous ones obtained through ERF analysis, that is to say, there is a maximum activation between 100 and 200 ms, and, in addition, they specify that such activity is characterized by oscillations mainly within the alpha band.

On the other hand, the power values were also statistically analyzed using a Kruskal-Wallis test with an FDR correction to compare the spatial-TF patterns between the beautiful and not beautiful stimuli. In this case, the contrast beautiful minus not beautiful produced many more differences, and more intense ones, than the reverse contrast, i.e., not beautiful minus beautiful (**Figure 5**). For this reason, we present the results of the subtraction of the not beautiful stimuli power from the beautiful stimuli power with a *p* < 0.0005 (one possible false positive per 2000 contrasts) and the opposite results with a *p* < 0.05 (one possible false positive per 20 contrasts).
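The false-positive rates quoted for the two thresholds follow from the expected number of false positives under the null hypothesis, which is the per-contrast threshold $\alpha$ multiplied by the number of contrasts $m$. As a quick check:

$$E[\text{false positives}] = \alpha \cdot m: \qquad 0.0005 \times 2000 = 1, \qquad 0.05 \times 20 = 1$$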

The most outstanding differences between the beautiful and not beautiful conditions occurred from around 400 ms onwards. In the theta band, there was a notable power increase in the frontal and left temporal lobe regions with beautiful stimuli compared to not beautiful stimuli. In the same direction, there were noticeable differences in activity in the frontal, occipital and, to a lesser extent, left temporal regions in the alpha band. Although with lesser intensity, significant differences also appeared in the occipital, frontal, left temporal, and right temporal regions in the beta band. Finally, there were also lesser significant differences in the occipital, right temporal, and frontal regions in gamma band.

Importantly, the subtraction not beautiful minus beautiful hardly produced any significant differences (right-hand column of **Figure 5**), even accepting a very high Type I error (*p* < 0.05). This suggests that after 300–400 ms the power of the spatial-TF patterns while viewing stimuli rated as beautiful is larger than while viewing stimuli rated as not beautiful in every frequency band and in all cortical regions.

# **DISCUSSION**

In this study we explored the dynamics of neural activity underlying aesthetic appreciation in two different ways. Our ERF analysis revealed no differences between brain activity related with stimuli rated as beautiful and not beautiful, although both conditions showed a clear peak 170 ms after stimulus onset. Conversely, TF analysis showed that 300 ms after stimulus onset activity in the four frequency bands and in the five defined brain areas was greater for stimuli rated as beautiful than as not beautiful.

The brain region labels used to describe profiles of power could be subjected to small spatial deviations. A direct relation between the position of the sensor and the immediate brain region cannot, therefore, be established. However, we have grouped the signals in the sensor space into five sensor groups (F, RT, LT, O, and C) to limit this effect. In addition, the magnetic field measured with MEG is much less distorted by biological tissue than the electric potentials from EEG and, as a result, a much more direct relation between the original source and the signal captured at the sensor space can be expected.

# **EVENT RELATED FIELDS**

Neuroimaging and neurophysiological techniques have only recently and scarcely been used to characterize the dynamical nature of brain activity related with the appreciation of beauty and other aesthetic features. Although such studies are still considerably outnumbered by fMRI experiments designed to identify the spatial location of activity, the pioneering work of Jacobsen and Höfel (2003) and de Tommaso et al. (2008) constitutes an essential reference point for our findings.

In contrast with our ERF results, Jacobsen and Höfel (2003) and de Tommaso et al. (2008) obtained significant EEG differences between activity associated with beautiful and not beautiful stimuli. Jacobsen and Höfel (2003) reported that stimuli regarded as not beautiful were accompanied by a fronto-central negative deflection about 300 ms after onset. On the other hand, de Tommaso et al. (2008) found an increase in the N2m (260 ms) while participants viewed neutral pictures, unlike the beautiful ones. Interestingly, such N2m amplitude pattern was not observed in a second task where participants were required to perform a simple recognition task with items they had previously rated.

Given that there is no simple correspondence between ERF and ERP responses, our results may not be straightforwardly comparable with those of Jacobsen and Höfel (2003) and de Tommaso et al. (2008). On the one hand, ERFs are sensitive to only a subset of the neural activity that can be detected by ERPs and, on the other, selectivities that are clear in the ERFs may be diluted with ERPs (Liu et al., 2002). Besides the neuroimaging technique, there were other differences between our study and the other two that may contribute to the different results. Jacobsen and Höfel (2003) used black and white geometrical stimuli and compared participants' rating of an aesthetic feature (beauty) and a formal feature (symmetry). de Tommaso et al. (2008) used famous paintings and complex colored geometrical shapes, the stimulus presentation lasted 750 ms, and responses were collected on a 10-point scale. We have argued elsewhere that this kind of experimental design and implementation peculiarities can be a source of disparate results (Nadal et al., 2008).

Our ERF analysis showed a peak at 170 ms followed by prolonged activity with two lesser peaks: 270 and 450 ms. The positive pole of the first component was located in the right temporal region, and the negative one in the contralateral left region. The two other components were located more frontally than the first, but equally symmetrical. Previous studies have established associations between early ERF components, approximately at 170 ms, and specific cognitive processes. They have mostly related early ERFs with face processing or emotions, sometimes separately, sometimes jointly.

Most recent studies of facial processing that used ERFs have reported two early MEG responses (Halgren et al., 2000; Liu et al., 2002; Itier et al., 2006; Susac et al., 2010). The first one takes place around 100 ms and is basically circumscribed to the occipital lobe. Although it is face-selective, it does not appear to be related with facial identification. The M100 is usually considered as indexing the early encoding stage of visual information (Itier et al., 2006). The second ERF component related with face processing is the M170. Its main source is the inferior temporal area, consistent with the fusiform gyrus (Lu et al., 1991; Linkenkaer-Hansen et al., 1998; Halgren et al., 2000; Liu et al., 2000; Taylor et al., 2001; Kloth et al., 2006). Usually, it is of positive polarity over the left hemisphere and negative polarity over the right hemisphere, the opposite pattern to our findings. The M170 is involved in face identification, so its pattern is different depending on individual facial traits. It seems to reflect refined stages where identity encoding begins (Itier and Taylor, 2002; Itier et al., 2006), sensory input is transformed for future processing, sensory code is translated to cognitive processing (Halgren et al., 2000), and/or a deeper processing is carried out as a function of stimulus ambiguity (Itier et al., 2006). Liu et al. (2002) believe that the relation between M100 and M170 might be a continuum along which facial processing becomes increasingly precise.

Although studies examining the early components of neural processing of emotion with ERP analysis abound in the literature, only a few experiments have dealt with the relation between emotional processes and early MEG components (see Olofsson et al., 2008, for a review on ERPs). Using ERFs, Peyk et al. (2008) found greater activity for pleasant and unpleasant stimuli than for neutral stimuli during the 120–170 ms window, with right positive polarity and left negative polarity. The source of this activity was located in occipito-parieto-temporal regions. D'Hondt et al. (2010) reported similar results at 180 ms after stimulus onset at occipito-temporal areas, and hypothesized that this activity could result from the rapid detection of emotional content by the amygdala.

Acknowledging that our analysis was performed at sensor level, our TF findings (**Figure 4**) can help us approximate the source location for activity occurring at around 170 ms. Increased activity in the four frequency bands was located in both temporal regions, and the occipital region was the third one with greater activity. In particular, the temporal activity was most significant in the anterior superior sensors. Thus, our 170 ms component presents certain similarities and certain differences with those found in previous facial processing and emotion studies: (a) activity is located in similar brain regions; (b) the location of the most significant values identified in our study was more anterior than that reported in the aforementioned literature; (c) the hemispheric distribution observed in our study is similar to that of Peyk et al. (2008), but opposite to that shown by other facial processing studies; (d) our analysis did not produce significant differences between conditions. How can we explain these similarities and differences?

We believe that points (a) and (b) support the notion that the M170 we found is related with perceptual and content processing. Note that our stimuli included no image portraying or suggesting faces. This might be why the activity resulting from our analysis is not located as posteriorly as that reported in face and emotion processing studies. It might be more anterior because of the semantic and content analysis involved in aesthetic appraisal. Another explanation could be related with task requirements. It is conceivable that deciding whether or not a picture portrays a face or a certain emotional expression requires processes that are more primitive than deciding whether one finds a stimulus to be beautiful or not. In this scenario, the second kind of task relies more on cognitive (frontal) or semantic (temporal) resources than the first kind. In relation with point (d), decisions about beauty require a more refined and more precise processing beyond the first perceptual and cognitive stages, and the ERFs cannot detect it. Notice that the next two peaks (270 and 450 ms) had a more fronto-temporal distribution, which suggests a cognitive analysis rather than a perceptual one.

For several reasons, point (c) does not seem to be especially relevant. For instance, very few of the prior studies commented on this issue. Moreover, there was no consistency among the studies as to the association between polarity and hemisphere. In any case, the main motivation of the present study was to characterize the dynamics of brain activity related with the appraisal of stimuli as beautiful and not beautiful, and this was the reason we carried out a TF analysis after the ERF analysis.

# **TIME-FREQUENCY ANALYSIS**

In the first TF analysis we compared the baseline pre-stimulus power with the post-stimulus power. Significant differences appeared around 170 ms post-stimulus in the four frequency bands, and the pattern and power of oscillations were similar for beautiful and not beautiful stimuli (**Figure 4**). The largest difference between the baseline and the post-stimulus power appeared in the alpha frequency band. Alpha band seems to have a direct role in attention (Palva and Palva, 2007), working memory (Halgren et al., 2002), object recognition (Mima et al., 2001) and sensory awareness (von Stein et al., 2000). It is reasonable to believe that initial activity (up to 300 ms) was related with some of these processes.

In the second TF analysis, we compared brain activity related with both kinds of stimuli. The comparison of the power associated with not beautiful stimuli against that associated with beautiful stimuli revealed no significant differences (note that *p* < 0.05 for "Not Beautiful > Beautiful" in **Figure 5**). The opposite contrast also produced no significant differences before 300 ms, but it revealed that the oscillation power associated with beautiful stimuli was significantly greater than the power associated with not beautiful stimuli 400 ms after stimulus onset and beyond.

Lindsen et al. (2010) carried out a similar analysis with EEG data from a facial preference task using a 2AFC paradigm. Participants looked at one face until they decided to replace it with a second face, and then they indicated their preferred face in terms of approachability. Analysis of the power values showed that preferred faces presented in second place, but not those presented in first place, were related with an increased theta band activity around 500 ms over the fronto-central electrodes, and a decrease in the gamma band around 650 ms over central-occipital electrodes. These theta band results for the preferred second face largely coincide with ours, though other results reported by Lindsen et al. (2010) do not. Although they used a face preference task and a TF analysis very similar to ours, a number of procedural differences (especially the temporal arrangement of the stimuli and the type of decision), in addition to the important differences between registering brain activity with EEG and MEG, could account for the different results.

In addition to revealing significantly greater power for beautiful stimuli than for not beautiful stimuli from 400 ms onwards, our results indicate that these differences occur in all four frequency bands. We believe that one possible interpretation of these results can be in terms of oscillation synchronization. The frequency of oscillations depends on cellular pacemaker mechanisms and neuronal network properties. Aesthetic appreciation, like many other cognitive faculties, emerges from the coordinated interaction of mechanisms and networks distributed across different brain areas (Nadal et al., 2008; Nadal and Pearce, 2011). The precise mechanisms underlying the coordination of these interactions remain an unresolved problem in neuroscience, given that neural activity occurs at various spatial and temporal levels that must be dynamically adjusted (Uhlhaas and Singer, 2006). The synchronization of neural oscillatory activity constitutes a possible solution to this problem (Buzsáki and Draguhn, 2004; Schnitzler and Gross, 2005; Uhlhaas and Singer, 2006). Bhattacharya and Petsche (2002, 2005) have already highlighted the importance of synchronization in aesthetic tasks. Their analysis of EEG data showed significant differences in the degree of phase synchronization between artists and non-artists during visual perception of paintings (Bhattacharya and Petsche, 2002). Significantly higher synchrony was found in the high frequency beta and gamma bands in artists during the perception of the paintings. Since it has been claimed that these frequency bands are related with binding elementary visual attributes into a coherent ensemble, they interpreted their results as reflecting artists' enhanced ability for binding details of complex artworks to create internal representations.

In line with this argument, the greater oscillatory power observed in our study in relation with stimuli rated as beautiful might be due to a greater synchronization of oscillations. If this were the case, we could hypothesize that the general synchrony during the perception of beautiful stimuli would be the cause of the common effect across all bands. This interpretation, however, must be considered with caution, because, although a change in power spectrum often coincides with a change in synchronization, this is not always the case. Without losing sight of the exploratory nature of this study (which is why we used the FDR correction in the comparison tests), we believe it is interesting to ask what the functional significance of this hypothesized synchronization while viewing beautiful images could be. Could the frontal lobe be the hub of low-frequency (theta and alpha) synchronization?

Based on the Global Neuronal Workspace framework (Baars, 1993; Dehaene et al., 1998), as well as the proposals of von Stein and Sarnthein (2000) and Palva and Palva (2007), our results might be seen as suggesting that a possible specific aesthetic global neuronal workspace is established during aesthetic tasks in which processing beautiful stimuli is related with a greater synchronization of neural activity than processing not beautiful stimuli. In this workspace for aesthetic appreciation, theta band activity would reflect the coordination and communication among networks operating in several bands, and the control of working memory functions (Klimesch et al., 2010; Sauseng et al., 2010). Alpha band would reflect internal, top-down processes such as expectations or the generation of hypotheses about the viewed stimuli. Gamma band would reflect local and basic visual analysis, like Gestalt principles or binding perceptual features (Nyhus and Curran, 2010). Finally, beta band would reflect semantic and supramodal binding related with the current stimulus.

# **CONCLUSIONS**

The main objective of the present study was to describe the dynamics of brain activity during an aesthetic appreciation task. Thus, we carried out an ERF analysis and an exploratory TF analysis.

The ERF analysis revealed a peak of activity at about 170 ms independently of whether the stimulus was rated as beautiful or not beautiful. This M170 component was confirmed by the first TF analysis in which we compared activity before and after stimulus onset. This peak of activity originated in temporal regions. Previous studies have related the M170 with the beginning of the coding of object identity, with the transformation of a sensory code to a cognitive processing, and with processes involved in resolving stimulus ambiguity. In this sense, we believe that the activity peak we observed at about 170 ms reflects this perceptual-cognitive processing. Our ERF results, however, reveal no significant differences in activity related with stimuli considered beautiful and those considered not beautiful. Studies examining the neural underpinnings of emotion have found significant differences between ERFs related with neutral and non-neutral (pleasant or unpleasant) stimuli. Given the tight relation between pleasantness and beauty (Marty et al., 2003), we believe that there are no significant differences in our results because two non-neutral conditions were used. Thus, it seems appropriate to introduce a neutral response option in future experiments of aesthetic appreciation.

The TF analysis showed that oscillatory power related with beautiful stimuli was significantly greater than the power related with stimuli rated as not beautiful from 300–400 ms after stimulus onset, whereas the opposite contrast showed no significant differences. These differences appeared in the four frequency bands. Synchronization of oscillations could be a possible interpretation of those results. In earlier work (Nadal et al., 2008; Nadal and Pearce, 2011) we have argued, from evolutionary and cognitive points of view, that aesthetic appreciation emerges from the coordination of processes involving different brain regions. In light of the results presented in this paper, and in the absence of a firm candidate for the mechanism that explains such a coordinated interaction, we believe that future studies should test whether synchronization indeed functions as a coordination mechanism. The results of Bhattacharya and Petsche (2002, 2005), in fact, revealed the importance of synchronization in tasks related with aesthetic appreciation. Although, as we have already noted, our interpretation must be considered with caution, our results could suggest that a specific aesthetic global neuronal workspace is configured during aesthetic tasks in which processing beautiful stimuli is related with a greater synchronization of neural activity than processing not beautiful stimuli. In this workspace for aesthetic appreciation, different frequency bands would reflect different perceptual and cognitive processes. Although this interpretation satisfies the need to account for the distributed spatial and temporal neural activity underlying aesthetic appreciation (Nadal et al., 2008; Nadal and Pearce, 2011), it is based only on the amplitude analysis of an exploratory experiment, which provides only a partial perspective of the synchronization. Additional synchronization analyses will be necessary to confirm our proposal (Bhattacharya and Petsche, 2005).

# **REFERENCES**


# **ACKNOWLEDGMENTS**

This research was supported by grant SEJ2007-64374/PSIC from the Spanish *Ministerio de Educación y Ciencia*. The authors are grateful to Almudena Capilla for helpful and useful comments on ERF analysis.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 March 2011; accepted: 21 December 2011; published online: 05 January 2012.*

*Citation: Munar E, Nadal M, Castellanos NP, Flexas A, Maestú F, Mirasso C and Cela-Conde CJ (2012) Aesthetic appreciation: event-related field and time-frequency analyses. Front. Hum. Neurosci. 5:185. doi: 10.3389/fnhum.2011.00185*

*Copyright © 2012 Munar, Nadal, Castellanos, Flexas, Maestú, Mirasso and Cela-Conde. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **EXPERIENCING ART: THE INFLUENCE OF EXPERTISE AND PAINTING ABSTRACTION LEVEL**

**Elina Pihko, Anne Virtanen, Veli-Matti Saarinen, Sebastian Pannasch, Lotta Hirvenkari, Timo Tossavainen, Arto Haapala and Riitta Hari**

# *Elina Pihko<sup>1</sup>\*, Anne Virtanen<sup>1,2</sup>, Veli-Matti Saarinen<sup>1</sup>, Sebastian Pannasch<sup>1</sup>, Lotta Hirvenkari<sup>1</sup>, Timo Tossavainen<sup>3</sup>, Arto Haapala<sup>2</sup> and Riitta Hari<sup>1</sup>*

*<sup>1</sup> Brain Research Unit, Low Temperature Laboratory, Aalto University School of Science, Espoo, Finland*

*<sup>2</sup> Department of Philosophy, History, Culture and Art Studies, University of Helsinki, Helsinki, Finland*

*<sup>3</sup> Department of Media Technology, Aalto University School of Science, Espoo, Finland*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Lutz Jäncke, University of Zurich, Switzerland*

*Tamer Demiralp, Istanbul University, Turkey*

*\*Correspondence: Elina Pihko, Brain Research Unit, Low Temperature Laboratory, Aalto University School of Science, PO BOX 15100, 00076 Aalto, Finland. e-mail: pihko@neuro.hut.fi*

How does expertise influence the perception of representational and abstract paintings? We asked 20 experts on art history and 20 laypersons to explore and evaluate a series of paintings ranging in style from representational to abstract in five categories. We compared subjective esthetic judgments and emotional evaluations, gaze patterns, and electrodermal reactivity between the two groups of participants. The level of abstraction affected esthetic judgments and emotional valence ratings of the laypersons but had no effect on the opinions of the experts: the laypersons' esthetic and emotional ratings were highest for representational paintings and lowest for abstract paintings, whereas the opinions of the experts were independent of the abstraction level. The gaze patterns of both groups changed as the level of abstraction increased: the number of fixations and the length of the scanpaths increased while the duration of the fixations decreased. The viewing strategies – reflected in the target, location, and path of the fixations – however indicated that experts and laypersons paid attention to different aspects of the paintings. The electrodermal reactivity did not vary according to the level of abstraction in either group but expertise was reflected in weaker responses, compared with laypersons, to information received about the paintings.

**Keywords: art perception, esthetic judgment, eye-movement, electrodermal activity**

# **INTRODUCTION**

In paintings, the viewer's eye is easily caught by human figures, especially faces. Although gaze behavior during picture viewing is affected by physically salient visual features, cognitive factors, such as the given task, are also important (Buswell, 1935; Yarbus, 1967; DeAngelus and Pelz, 2009). Moreover, the viewer's internal cognitive plans or strategies may guide the gaze differently. In art schools and classes for art history, future artists and experts on art are trained to pay attention, beyond the figurative elements, to other aspects of the paintings, e.g., the historical context, different painting styles and the composition of objects, forms, and color. Thus, artists and experts on art are expected to view paintings differently from laypersons.

Differences in gaze behavior can be studied by analyzing fixations and saccades. Fixations are the periods when the eyes are relatively stable and visual information is gathered, while saccades are fast ballistic eye-movements which bring the fovea from one fixation point to another. The idea of expert cognitive strategies has prompted several studies on comparison of the eye-movements of experts vs. laypersons in different areas of expertise. For example, experienced radiologists were found to apply a "global" analysis of mammography images in detecting breast cancer; the expertise was considered to arise as a shift from detailed scanning to a holistic, gestalt-like perception (Kundel et al., 2008). Expert chess players, on the other hand, fixated beside the chess pieces and at the center of the board, whereas novices fixated more often directly at the piece they needed to recognize (Bilalic et al., 2011). The particular eye-movement behavior of the chess experts was accompanied by bilateral brain activation, in contrast to only left-hemisphere activation in the novices, and the authors suggested the right hemisphere activation to be linked to holistic processing of the stimuli.

Expertise is reflected in holistic processing also in subjects viewing art. Nodine et al. (1993) showed that untrained viewers fixate more on central and foreground figures, whereas art-trained viewers spend more time looking at background features, consistent with the idea that untrained viewers focus more on individual objects and art-trained viewers more on the relationships among the pictorial elements. Accordingly, Kapoula and Lestocart (2006) suggested that experts scan a larger surface of a painting than do laypersons. Vogt (1999) and Vogt and Magnussen (2007) provided further evidence for different viewing strategies of art-trained and untrained subjects by showing that untrained viewers spend more time on areas with recognizable objects and human features than do artists. However, differences in gaze patterns are less obvious between the groups. Illes (2008) and Kristjanson and Antes (1989) reported great individual variability in durations of fixations of both experts and laypersons viewing paintings. The artists made longer fixations while viewing familiar paintings whereas the non-artists' fixations were longer while viewing unfamiliar paintings (Kristjanson and Antes, 1989).

Expertise affects not only the viewing strategies, but also the viewers' art preferences. Representational art depicts elements that are easily recognized by most people, whereas with increasing level of abstraction the recognizable elements disappear. Non-professional art viewers prefer representational over abstract paintings. They also give higher scores on an affective scale to representational rather than abstract paintings (Uusitalo et al., 2009). Art education and frequency of visits to art galleries were linked to a tendency for positive ratings of abstract art (Furnham and Walker, 2001; Uusitalo et al., 2009). When subjects viewed postimpressionist paintings and their manipulated "abstract" versions whose content could not be identified, non-experts and industrial design students preferred the original paintings over the abstract ones, while the ratings of senior art school students did not differ significantly between the original and abstract versions (Hekkert and van Wieringen, 1996). Similarly, Illes (2008) found a clear preference for figurative paintings and dispreference for non-figurative ones in laypersons, but not in artists or experts.

An interesting, but largely overlooked, question is the relationship between the cognitive and bodily measures of experiencing art. De Jong (1972) compared the esthetic likings and skin conductances between three groups: students of art history, students of art, and non-experts. While the non-experts' likings differed from those of the experts-in-training, it was not possible to differentiate between "beautiful" and "ugly" paintings by means of skin conductance. Self-reported evaluations of valence and skin conductance responses evoked by viewing emotional pictures did not correlate with each other and were associated with activation of different brain areas (Anders et al., 2004).

Taken together, experts' viewing strategies and esthetic appreciations seem to differ from those of laypersons. However, many of the earlier studies suffer either from a small number of subjects (Yarbus, 1967; Nodine et al., 1993; Zangemeister et al., 1995), a small number of paintings (Zangemeister et al., 1995; Smith et al., 2006), or a lack of professional categorization of paintings into abstract and representational groups (Zangemeister et al., 1995; Uusitalo et al., 2009). One of the motivations for the present study was to replicate and expand earlier results on art expertise by investigating a larger number of subjects and paintings. By increasing the number of paintings, we were further able to group the paintings into subcategories along the continuum from representational to abstract. In our study, two of the authors, both experts on art history and esthetics, selected and categorized the paintings. Attention was also paid to specifying the group of experts. As discussed by Vogt and Magnussen (2007), the expertise that painters acquire by training to produce figurative art may be supported by special perceptual information-processing strategies. As these strategies are not necessarily typical for all artists or experts on art, the studied groups should be carefully defined. In the present study, the expertise was specified as the subjects' knowledge of art history.

We investigated whether the expertise acquired by professional studies in art history affects esthetic judgments and gaze patterns of subjects viewing digitized images of paintings. Specifically, we were interested in analyzing whether the continuum from representational to abstract paintings (five categories) would be reflected in these measures. As non-experts tend to dislike abstract paintings (see above), we hypothesized that the increasing level of abstraction would gradually decrease the esthetic judgments of laypersons, while those of experts would not change. Further, as laypersons spend more time looking at the figurative elements (Vogt, 1999; Vogt and Magnussen, 2007), we examined whether the increasing level of abstraction and disintegration of the figurative elements would affect the fixation parameters of the experts differently from those of the laypersons. To have a broader view, we also studied the effect of expertise on emotional reactions to the paintings by collecting self-reported evaluations of positive/negative feelings evoked by the painting and measuring electrodermal reactivity.

# **MATERIALS AND METHODS**

# **SUBJECTS**

Half of the subjects (*n* = 20) were experts on art history who had been studying art history as a major subject or esthetics as a major and art history as a minor subject at the University of Helsinki. Thus viewing and evaluating paintings had formed an important part of their training. The laypersons' group (*n* = 20) consisted of university students or graduates with no visual art studies or hobbies. In addition to the educational background, the groups were matched by gender and age. Both groups included 17 females and 3 males. The mean age was 30.2 years (range 24–49) in the expert group and 29.8 years (21–43) in the control group. All subjects had normal or corrected-to-normal vision (4/20 in the expert group and 5/20 laypersons wore eyeglasses, and 5/20 and 3/20, respectively, had contact lenses). All subjects signed an informed consent form before the experiment. The study had prior approval by the Ethics Committee of the Hospital District of Helsinki and Uusimaa.

# **STIMULI AND EXPERIMENTAL PARADIGM**

The stimuli consisted of 35 fine art paintings by renowned artists representing different styles in the Western tradition of painting from the sixteenth century up to the 1980s (**Table 1**). The paintings were downloaded from the ARTstor digital library (http://www.artstor.org/index.shtml) and selected to represent different subject categories and a continuum from representational to abstract art. The representational–abstract continuum had five categories: (I) representational paintings, (II) less representational paintings where the subject matter can be well recognized despite fewer details than in category I due to style, technique used, etc., (III) paintings in which the subject matter is difficult to understand, at least at first sight, (IV) almost abstract paintings where the style approaches full abstraction, with only a few identifiable details or with details that are difficult to recognize, and (V) abstract paintings. Each category had seven paintings. All paintings, except those in the abstract category, depicted human beings, landscapes, or urban scenes.

Digitized copies of the paintings were projected (Mitsubishi Electric HC6000) at a resolution of 1024 by 768 pixels onto a screen (112 cm width, 100 cm height) about 2.5 m in front of the subject. Due to the different formats of the artworks and some free movement of the subjects, the paintings, fully covering the screen in either direction, were seen in visual angles of 19–25˚ horizontally and 15–23˚ vertically.
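The upper ends of the reported angle ranges can be checked against the stated screen size and viewing distance. The following is a minimal sketch, assuming a flat screen perpendicular to the line of sight and a subject centered at the nominal distance of 2.5 m; the function and constant names are illustrative and not taken from the study.

```python
import math


def visual_angle_deg(extent_cm: float, distance_cm: float) -> float:
    """Full visual angle (degrees) of an extent centered on the line of sight."""
    return math.degrees(2.0 * math.atan(extent_cm / (2.0 * distance_cm)))


# Screen size and nominal viewing distance as reported in the text.
SCREEN_W_CM, SCREEN_H_CM, DISTANCE_CM = 112.0, 100.0, 250.0

full_width_angle = visual_angle_deg(SCREEN_W_CM, DISTANCE_CM)   # about 25.3 degrees
full_height_angle = visual_angle_deg(SCREEN_H_CM, DISTANCE_CM)  # about 22.6 degrees
```

A painting that fills the screen in one direction thus subtends roughly 25˚ horizontally or 23˚ vertically, in line with the upper limits given above; the smaller values would correspond to the dimension that a given painting format did not fill.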

The Presentation software (Neurobehavioral Systems, Inc.) was used for controlling stimulus presentation. The software ran on a stimulus PC which was connected to the eye-tracking PC to provide correct timing.

### **Table 1 | Paintings in the five categories from representational (I) to most abstract (V).**

*[R] indicates the "rehearsal" paintings.*

#### **Table 2 | Mean ± SD ratings of esthetic judgments and emotional evaluations.**

The experiment consisted of two viewing sequences (Parts 1 and 2) of all paintings; thus each painting was displayed twice. Subsequent to each painting, the subject had to answer questions on a printed questionnaire (see below). We used a presentation time of 10 s in Part 1 and a presentation time of 30 s in Part 2 since it is known that 10 s is sufficient to obtain an overview of a picture while 30 s is the average observation duration for an esthetic judgment when unlimited time is given (Locher et al., 2007). In the Metropolitan Museum of Art, visitors typically view paintings for less than 30 s, with a median of 17 s (Smith and Smith, 2001). In Part 1, following the 10 s of image presentation, the subjects had 30 s to answer the questions. After that, a sound indicated the end of the answering period, and the subjects had to switch their gaze back to the screen. In Part 2, subsequent to the 30-s presentation, a 25-s period was given for answering. To avoid confusion, paintings were numbered sequentially (1–35), and the respective number was printed on the questionnaire and shown on the screen during the answering period. The full experiment, including preparation time, lasted on average 1.5 h.

The paintings were shown in a fixed pre-randomized order to one half of the subjects, and in the reverse order to the other half. Both Part 1 and Part 2 began with the presentation of the same five "rehearsal" paintings – one from each category – always shown first. The rehearsal paintings were presented to acquaint the subjects with the duration of picture presentation and the time for answering the questions, as well as to give an overview of the different image categories for facilitating the subsequent ratings.

Each 30-s presentation period in Part 2, except for the five "rehearsal" paintings, was accompanied by auditory information (presented via two loudspeakers) about the painting. The rehearsal paintings were used as control pictures to separate the effect of information. For half of the paintings, the information given in Part 2 consisted of neutral facts. For example, for Gauguin's *Pool*, *Martinique*, the subjects heard that "Gauguin made the painting while staying on this Caribbean island, and the painting is from the painter's early impressionist period; hence the ambiance has been conveyed with small, distinguishable brush strokes and with pure colors." For the other half of the paintings, some "tabloid-type," emotion-evoking details were added. For example, for Macke's *Separation*, the story went: "Macke was influenced by the avantgarde movement and expressionism, and he was combining these styles in his work. He was called to join the army in the beginning of the World War I and died just few weeks later at the age of 27 years."

# **BEHAVIORAL MEASUREMENTS**

In Part 1, subjects had to first indicate if they had seen the picture before. During Parts 1 and 2, subsequent to each picture presentation, participants had to answer a questionnaire asking for esthetic evaluation ("In your opinion, is this painting a good work of art?" Scale: 1 – not at all good, 5 – very good) and the emotions evoked ("In your estimate, what is the quality of emotion evoked by this painting?" Scale: −2 very negative, +2 very positive).

# **EYE-TRACKING**

Gaze patterns were measured using a semi-portable, video-based iViewX HED4 eye-tracking device (SensoMotoric Instruments, Teltow, Germany). The sampling rate of the eye tracker was 50 Hz, the spatial tracking resolution was <0.1˚, and the gaze-position accuracy better than 1˚. The system was attached to a cap – thus allowing small head and body movements while the subject was sitting on a sofa – and connected to the eye-tracking PC, and from there via serial connection to the stimulus PC. Before Part 1, a 9-point gaze calibration was performed. When needed (in 30% of the subjects), the calibration was repeated after the rehearsal pictures or before Part 2.

# **ELECTRODERMAL ACTIVITY**

Changes in electrodermal activity (EDA) were measured between two electrodes attached to the index and ring fingers of the subject's non-dominant hand. A small amount of conductive paste was put between the electrodes and the skin. Low (0.5 V) DC voltage was applied between the two electrodes, and the conductance of the body in between them was measured. Sensors were connected to a ME6000 (Mega Electronics Ltd., Finland) data logger, which sampled the EDA at 1000 Hz. Offline, the data were transferred to MegaWin analysis software (Mega Electronics Ltd., Finland), which was used for handling and exporting the data.

# **ANALYSIS**

First, the HED4 eye tracker calculated the gaze position in the scene video coordinates. In an offline analysis with Matlab software (Natick, MA, USA), we determined the position of the projected painting in each frame of the video. We used the scale invariant feature transform (SIFT; Lowe, 1999) to extract salient key points from images and matched them to find corresponding points between images automatically. Even though SIFT features are designed to be robust to changes in lighting, straightforward matching of SIFT features between the original images and the video failed because projection and subsequent video imaging changed the images too much. To solve this problem, we picked a reference frame in the video for each painting, matched the reference frame to the painting image manually, and then matched the video frames to the reference video frame by using the SIFT feature. Finally, we mapped the eye-tracking data from each video frame to the painting image via the reference video frame. The accuracy error of the transformed data points was less than 30 pixels in the scale of the original image, and the error rate was inspected manually in several data sets. This transformation was necessary because we allowed the subjects to view the images without restricting their head movements.
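The two-step mapping described above (video frame to reference frame, reference frame to painting image) can be illustrated with chained projective transforms. This is only a sketch: the text does not state which transformation model was fitted to the matched key points, so the use of 3 × 3 planar homographies is an assumption, and the matrices below are hypothetical placeholders rather than values estimated from SIFT matches.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]
Point = Tuple[float, float]


def compose(outer: Matrix3, inner: Matrix3) -> Matrix3:
    """Matrix product outer @ inner: apply `inner` first, then `outer`."""
    return [[sum(outer[i][k] * inner[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_homography(h: Matrix3, point: Point) -> Point:
    """Map a 2D point through a homography, including the projective divide."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def gaze_to_painting(gaze: Point, frame_to_ref: Matrix3, ref_to_painting: Matrix3) -> Point:
    """Map gaze from a video frame to painting coordinates via the reference frame."""
    return apply_homography(compose(ref_to_painting, frame_to_ref), gaze)


# Hypothetical transforms: a small shift between frames, then a uniform scaling.
frame_to_ref = [[1.0, 0.0, -5.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
ref_to_painting = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
mapped = gaze_to_painting((100.0, 50.0), frame_to_ref, ref_to_painting)
```

Composing the two transforms once per video frame means that only the frame-to-reference step has to be re-estimated as the head moves, while the manually matched reference-to-painting step stays fixed for each painting.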

We then imported the raw eye-tracking data to OGAMA software (Voßkühler et al., 2008) for event detection and for preparation of statistical analysis. Detection of fixations was based on the dispersion-threshold-identification (I-DT) algorithm (Salvucci and Goldberg, 2000), with a dispersion radius of 1˚ and a minimum fixation length of 80 ms. Gaps between the fixations were classified as saccades.
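For illustration, a dispersion-threshold detector in the spirit of Salvucci and Goldberg (2000) can be sketched as follows. This is not the OGAMA implementation: the dispersion measure used here is the sum of the horizontal and vertical ranges within a window, and the text does not specify how the "dispersion radius of 1˚" is converted into OGAMA's own criterion, so `max_dispersion` is left as a free parameter. The sample layout `(time_ms, x, y)` and all names are assumptions.

```python
from typing import List, NamedTuple, Sequence, Tuple

Sample = Tuple[float, float, float]  # (time_ms, x, y), sorted by time


class Fixation(NamedTuple):
    start_ms: float
    end_ms: float
    x: float
    y: float


def dispersion(points: Sequence[Tuple[float, float]]) -> float:
    """Dispersion of a window: (max x - min x) + (max y - min y)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations_idt(samples: Sequence[Sample], max_dispersion: float,
                         min_duration_ms: float = 80.0) -> List[Fixation]:
    fixations: List[Fixation] = []
    start, n = 0, len(samples)
    while start < n:
        # Grow the initial window until it spans the minimum fixation duration.
        end = start
        while end < n and samples[end][0] - samples[start][0] < min_duration_ms:
            end += 1
        if end == n:
            break  # too few samples left to form a fixation
        window = [(x, y) for _, x, y in samples[start:end + 1]]
        if dispersion(window) > max_dispersion:
            start += 1  # no fixation starts here; slide the window by one sample
            continue
        # Extend the window for as long as the dispersion stays within the limit.
        while end + 1 < n:
            candidate = window + [(samples[end + 1][1], samples[end + 1][2])]
            if dispersion(candidate) > max_dispersion:
                break
            window = candidate
            end += 1
        cx = sum(p[0] for p in window) / len(window)
        cy = sum(p[1] for p in window) / len(window)
        fixations.append(Fixation(samples[start][0], samples[end][0], cx, cy))
        start = end + 1
    return fixations


# Synthetic 50 Hz data (20 ms steps): two stable periods separated by a saccade.
demo_samples = ([(20.0 * i, 5.0, 5.0) for i in range(10)]
                + [(200.0, 7.0, 6.0), (220.0, 10.0, 7.0)]
                + [(240.0 + 20.0 * i, 12.0, 8.0) for i in range(10)])
demo_fixations = detect_fixations_idt(demo_samples, max_dispersion=1.0)
```

At a 50 Hz sampling rate, a window has to contain at least five samples to span 80 ms under the timestamp-based criterion used in this sketch; samples that fall between detected fixations would be treated as saccades, as described above.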

The average fixation duration, average fixation count, and total length of the scanpath (sum of all saccades) were then computed for each subject and painting. Furthermore, region-of-interest (ROI) analysis was conducted to determine the total fixation duration on each predefined ROI. The ROIs, drawn manually, included heads and faces on paintings depicting human characters. ROI analysis was performed for Categories I–IV of the representational–abstract continuum; Category V was excluded because, by definition, no human figures were depicted there.
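The per-trial measures named above can be computed from a list of detected fixations roughly as follows. This is a sketch under stated assumptions: the scanpath length is approximated by the summed distances between consecutive fixation centers, and the ROI is a hypothetical axis-aligned rectangle, whereas the ROIs in the study were drawn manually and their shapes are not specified.

```python
import math
from typing import Dict, List, Tuple

Fix = Tuple[float, float, float]           # (x, y, duration_ms)
Rect = Tuple[float, float, float, float]   # (left, top, right, bottom)


def scanpath_metrics(fixations: List[Fix]) -> Dict[str, float]:
    """Fixation count, mean fixation duration, and total scanpath length."""
    count = len(fixations)
    mean_duration = sum(f[2] for f in fixations) / count if count else 0.0
    # Sum of straight-line distances between consecutive fixation centers.
    length = sum(math.dist(a[:2], b[:2]) for a, b in zip(fixations, fixations[1:]))
    return {"count": float(count), "mean_duration_ms": mean_duration,
            "scanpath_length": length}


def roi_dwell_time(fixations: List[Fix], roi: Rect) -> float:
    """Total duration of the fixations whose centers fall inside the ROI."""
    left, top, right, bottom = roi
    return sum(d for x, y, d in fixations if left <= x <= right and top <= y <= bottom)


demo = [(0.0, 0.0, 200.0), (3.0, 4.0, 300.0), (3.0, 10.0, 100.0)]
demo_metrics = scanpath_metrics(demo)
demo_dwell = roi_dwell_time(demo, (2.0, 3.0, 4.0, 11.0))
```

Here `demo_metrics` describes three fixations with a mean duration of 200 ms and a path length of 11 units, and `demo_dwell` sums the 400 ms spent on the two fixations inside the rectangle.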

Furthermore, scanpaths were analyzed as spatial and temporal sequences with the ScanMatch method (Cristino et al., 2010). For the spatial alignment of the sequences, an 8 × 8 substitution matrix was created, dividing the screen into 64 sectors of 128 × 96 pixels each; the gap penalty was set to 0. The small gap penalty value was chosen as it "benefits the global alignment of the sequences" (Cristino et al., 2010, p. 695). In addition, a temporal binning was applied with a bin size of 100 ms. Thus, in the sequence a fixation of 100 ms was counted only once while a fixation of 300 ms was counted three times.
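The spatial and temporal binning can be sketched as below, together with a global (Needleman–Wunsch type) alignment using a gap penalty of 0. This is not the ScanMatch toolbox: the substitution score used here, falling linearly from +1 for identical sectors to −1 for the most distant pair, is a simplified stand-in for the published substitution matrix, and both the rounding of partial bins and the normalization by the longer sequence are assumptions.

```python
import math
from typing import List, Tuple

GRID_COLS, GRID_ROWS = 8, 8
SECTOR_W, SECTOR_H = 128, 96   # 1024 x 768 pixels divided into 8 x 8 sectors
BIN_MS = 100                   # temporal bin size


def sector_index(x: float, y: float) -> int:
    """Index (0..63) of the screen sector containing a pixel position."""
    col = min(max(int(x // SECTOR_W), 0), GRID_COLS - 1)
    row = min(max(int(y // SECTOR_H), 0), GRID_ROWS - 1)
    return row * GRID_COLS + col


def encode_scanpath(fixations: List[Tuple[float, float, float]]) -> List[int]:
    """Repeat each fixation's sector once per full 100 ms bin (at least once)."""
    sequence: List[int] = []
    for x, y, duration_ms in fixations:
        repeats = max(1, int(duration_ms // BIN_MS))
        sequence.extend([sector_index(x, y)] * repeats)
    return sequence


def substitution_score(a: int, b: int) -> float:
    """Simplified stand-in score: +1 for the same sector, -1 for the farthest pair."""
    row_a, col_a = divmod(a, GRID_COLS)
    row_b, col_b = divmod(b, GRID_COLS)
    distance = math.hypot(row_a - row_b, col_a - col_b)
    return 1.0 - 2.0 * distance / math.hypot(GRID_ROWS - 1, GRID_COLS - 1)


def alignment_similarity(s1: List[int], s2: List[int], gap_penalty: float = 0.0) -> float:
    """Global alignment score, normalized by the length of the longer sequence."""
    n, m = len(s1), len(s2)
    if n == 0 or m == 0:
        return 0.0
    prev = [-gap_penalty * j for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [-gap_penalty * i] + [0.0] * m
        for j in range(1, m + 1):
            cur[j] = max(prev[j - 1] + substitution_score(s1[i - 1], s2[j - 1]),
                         prev[j] - gap_penalty,
                         cur[j - 1] - gap_penalty)
        prev = cur
    return prev[m] / max(n, m)


demo_sequence = encode_scanpath([(10.0, 10.0, 100.0), (500.0, 400.0, 300.0)])
demo_self_similarity = alignment_similarity(demo_sequence, demo_sequence)
```

In the example, the 100-ms fixation contributes one element and the 300-ms fixation three, so `demo_sequence` has four elements; aligning a sequence with itself gives the maximum normalized score.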

Whenever appropriate, the statistical analyses were carried out using repeated measures ANOVA (SPSS 14.0.1). Greenhouse–Geisser correction for *F* and *P* values was used if the sphericity assumption was violated.

# **RESULTS**

# **BEHAVIORAL DATA**

As expected, experts were familiar with a larger number (7.5 ± 4.6; mean ± SD) of the 35 paintings than laypersons (1.1 ± 2.1). In other words, the paintings got more "Yes, I have seen it before" answers from experts than laypersons (Mann–Whitney Test, *U* = 260, *z* = −4.4, *P* < 0.001).

In general, the esthetic ratings were higher in Part 2 than in Part 1 (Wilcoxon signed ranks test, *z* = −2.02, *P* = 0.043 for both groups). The same was true for the "rehearsal" pictures without audio (experts: *z* = −2.8, *P* = 0.003; laypersons: *z* = −2.9, *P* = 0.002).

As shown in **Figure 1A**, the level of abstraction affected the esthetic judgments differently in the two groups of participants.

The grades of the laypersons were highest in the representational Category I and lowest in the abstract Category V (**Table 2**) [Friedman's ANOVA, Part 1: χ<sup>2</sup>(4) = 41.0, *P* < 0.001, Category I vs. V: *P* < 0.001, sig = 0.0125; Part 2: χ<sup>2</sup>(4) = 44.0, *P* < 0.001; Category I vs. V: *P* < 0.001, sig = 0.0125]. In Part 1, the judgments of the experts were not affected at all by the abstraction level [Friedman's ANOVA, χ<sup>2</sup>(4) = 6.8, *P* = 0.15] and Part 2 showed a slight effect in the opposite direction [Friedman's ANOVA, χ<sup>2</sup>(4) = 9.7, *P* = 0.046]. Accordingly, the abstraction level only affected the emotional evaluations of laypersons [Friedman's ANOVA, Part 1: χ<sup>2</sup>(4) = 17.1, *P* = 0.002; Part 2: χ<sup>2</sup>(4) = 21.3, *P* < 0.001]: representational paintings evoked the most positive emotions (Part 2, Category I: 0.58 ± 0.14; mean ± SE), but they became less positive and even negative with growing abstraction (Part 2, Category V: −0.15 ± 0.11), whereas experts' grades did not change [Part 1: χ<sup>2</sup>(4) = 2.4, *P* = 0.7; Part 2: χ<sup>2</sup>(4) = 5.9, *P* < 0.2; **Figure 1B**].

#### **EYE-TRACKING DATA**

**Figure 2** shows how for both experts and laypersons, the number of fixations (main effect of Part, *F*1,38 = 93.0; *P* < 0.001) and the length of the scanpath (*F*1,38 = 37.8; *P* < 0.001) decreased from Part 1 to Part 2 (first 10 s). The mean duration of fixations increased from Part 1 to Part 2 (*F*1,38 = 56.8; *P* < 0.001).

Generally, the gaze patterns were affected by the level of abstraction in both laypersons and experts (**Figures 2** and **3**; **Table 3**). In both groups, the mean duration of fixations decreased (main effect of Category for both Part 1 and 2, *P* < 0.001) and the length of scanpath increased (main effect of Category for both Part 1 and 2, *P* < 0.001) from representational toward the more abstract categories with no group differences. The number of fixations also increased in both groups (main effect of Category for both Part 1 and 2, *P* < 0.001); this increase was stronger for laypersons in Part 2, as evidenced by the contrast interaction of Group by Category for Category I vs. V (*F*1,38 = 8.7; *P* = 0.005).

The fixations were longest for paintings depicting human beings, with no differences between experts and laypersons (main effect of Category *F*3,114 = 47.1; *P* < 0.001; *P* < 0.001 in comparison with landscapes, urban sceneries as well as abstract paintings).

From these paintings depicting human beings, ROIs including the heads and faces were selected for further analysis. In Part 1, the (total) fixation duration for faces was 12% longer in laypersons than experts (**Figure 4**; *F*1,38 = 4.4, *P* = 0.042). It was generally longest in the representational Category I, which also separated laypersons from experts (main effect of Category *F*2.4,90 = 30.8, *P* < 0.001, Group by Category interaction *F*2.4,90 = 30.8, *P* = 0.021, Category II: *F*1,38 = 6.6, *P* = 0.014), whereas toward the more abstract categories the group differences disappeared. In Part 2, the fixation durations did not differ between the groups.

To compare the scanning strategies between the groups, we calculated the average distances of the fixations from the center of the paintings. In Part 1, the distance was larger for experts than laypersons (6.8˚ ± 0.15˚; mean ± SE vs. 6.3˚ ± 0.12˚ respectively; *t* = 3.0; *P* = 0.005; see **Figure 5**); in Part 2, the distances did not differ between the groups.

Moreover, to examine the similarity of the scanpaths in the two groups, we compared for each picture all scanpaths pairwise, separately for each group, using the ScanMatch algorithm (Cristino et al., 2010). Mean similarity indices per picture showed that the scanpaths were more similar in Part 2 than in Part 1 (main effect of Part, *F*1,56 = 55.1, *P* < 0.001). In Part 1, the scanpaths were more similar in the layperson group than the expert group (Part by Group interaction, *F*1,56 = 10.5, *P* = 0.002, *t*-test between the groups in Part 1: *t* = −2.9, *P* = 0.006). When Part 2 was divided into three consecutive sequences of 10 s, a main effect for Sequence (*F*2,16 = 19.6; *P* < 0.001) indicated a higher similarity in each group during the first 10 s than the rest of the viewing time. No differences were observed regarding the similarities of scanpaths between the categories.

#### **ELECTRODERMAL ACTIVITY**

Electrodermal reactivity (the difference between maximum and minimum EDA values) was not affected by the level of abstraction in either group, neither in Part 1 nor in Part 2.
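The reactivity measure defined above reduces to a range computation over the conductance samples recorded during one painting. A minimal sketch, with a hypothetical helper name and made-up values in nanosiemens:

```python
from typing import Sequence


def eda_reactivity(conductance_ns: Sequence[float]) -> float:
    """Difference between the maximum and minimum EDA value within one trial."""
    if not conductance_ns:
        raise ValueError("at least one EDA sample is required")
    return max(conductance_ns) - min(conductance_ns)


# Hypothetical conductance samples (nS) recorded while one painting was shown.
demo_trial = [412.0, 415.5, 431.0, 428.2, 419.9]
demo_reactivity = eda_reactivity(demo_trial)  # 431.0 - 412.0
```

Because the measure uses only the extremes within a trial, it is insensitive to the slow baseline level of skin conductance but can be inflated by a single artifactual sample.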

Electrodermal reactivity was larger in both groups during Part 2 – when either neutral or tabloid-type information was given – than during Part 1 (**Figure 6**; main effect of Part, *F*1,32 = 46.8, *P* < 0.001), and the change from Part 1 to Part 2 was larger (25.9 vs. 13.0 nS) for laypersons in comparison with the experts (interaction of Part by Group, *F*1,32 = 5.2, *P* < 0.029). Furthermore, the tabloid-type information tended to have a stronger effect on laypersons than on experts (three-way interaction of Part by Group by Type, *F*1,32 = 5.4, *P* = 0.026).

### **DISCUSSION**

We examined whether and how expertise in art history would affect the self-reported esthetic and emotional ratings, eye-movements, and EDA during viewing of paintings. We were interested in how the continuum from representational to abstract paintings is reflected in these measures. As expected, the abstraction level affected the ratings of laypersons and experts differently. Esthetic judgments and emotional valence decreased with increasing abstraction level for laypersons, but not for experts. Contrary to the cognitive ratings, however, the abstraction level affected the number and duration of fixations as well as the length of the scanpath in both groups. Nevertheless, in Part 1, the fixation duration on the face areas, the distance of the fixations from the center of the picture, and the similarity of scanpaths differed between the groups and thereby indicated different viewing strategies.

The abstraction level affected both the number and duration of fixations. For the most representational category of paintings, the number of fixations was smallest and the fixation durations were respectively longest, whereas paintings of the most abstract category elicited more fixations with shorter duration. This finding is compatible with the idea that, in representational paintings, the eyes fixate longer on the figurative details than in abstract paintings where the figurative elements are lacking and the subject keeps searching for them. The paintings with human figures evoked the longest fixations in both groups. The ROI analysis of Part 1 revealed that for the most representational paintings depicting humans, laypersons had longer fixations than experts on the face and head areas, whereas for the more abstract categories the group differences disappeared. These results support the notion that while human figures are strongly salient in attracting the gaze, their effect can be inhibited by expert viewing strategies (Vogt, 1999; Vogt and Magnussen, 2007). However, the group differences were seen only in Part 1. In Part 2, the fixation durations were similar for both groups also in the representational paintings, most likely because the longer viewing time allowed subjects to concentrate on the details.

**Table 3 | Results of statistical analyses for number and duration of fixations and length of scanpaths for Part 1 and Part 2.**

Several factors tend to keep the gaze focused on the center of the screen. First, between the displays, while subjects were answering the questionnaires, an image number was displayed in the center of the screen, which may have focused the gaze toward the center at the beginning of the display of the next picture. Second, subjects have a general tendency for fixating the middle of the screen irrespective of the distribution of the image features (see Tatler, 2007). Third, in art, main figurative elements often appear in a central position (Locher et al., 2007; Tyler, 2007). However, the larger distance of fixations from the image center observed for experts indicates that expertise can inhibit the center-viewing tendency. This interpretation is in line with earlier suggestions that eye movements of experts cover wider areas of paintings than those of laypersons (Kapoula and Lestocart, 2006), or that experts generally rely more on global than on local viewing strategies compared with non-experts (Zangemeister et al., 1995). While laypersons concentrate on the details of the picture, experts also examine the spatial construction while evaluating the esthetics of the painting (Kapoula et al., 2008). However, in Part 2, the fixation distances from the center were similar in both groups. This result can be a combined effect of the longer viewing time, allowing for concentration on the details, and the given information that guided the gaze similarly in both groups (Richardson et al., 2007). Nevertheless, we want to emphasize that for this kind of investigation the number of subjects and stimuli plays an important role, since the viewing behavior varies considerably according to the artwork and the viewer, as illustrated in **Figure 5**.

The scanpaths in Part 1 were more similar within the layperson group than within the expert group. Despite the strong effect of low-level visual saliency (Koch and Ullman, 1985; Itti and Koch, 2001) in guiding saccades, semantically meaningful features attract fixations (Yarbus, 1967; Nyström and Holmqvist, 2008). We argue that, during the first viewing, both the social (human figures) and non-social saliency guided the gaze of the laypersons in a similar way, whereas the experts were using their individual training- and expertise-related strategies in scanning the pictures, resulting in a top-down inhibition of social (figurative) cues. The disappearance of the difference between similarity indices in Part 2 can be explained by the audio information that guided the viewing process similarly in both groups (Richardson et al., 2007). Interestingly, during the first 10 s of Part 2, the scanpaths were more similar within both groups than during the remaining periods. This finding agrees with earlier suggestions that early viewing is guided more by low-level processes before a stronger involvement of individual strategies comes into play (Tatler et al., 2005). Thus, with longer viewing time, the consistency of fixation locations between observers decreases. It is possible that the combined effect of the audio information and (social and non-social) saliency factors was more powerful in the beginning of the viewing period, after which the scanpaths became more individual.

The two parts in the present study are not directly comparable because of the longer duration and the additional provided information in Part 2. Some differences between the parts are discussed below.

Esthetic judgments of the paintings were higher in Part 2 than Part 1. At least repetition, longer viewing time, and information given about the paintings have to be considered as possible contributing factors to this difference. However, repetition of images of real-world scenes has been shown to lower rather than increase preference ratings (Biederman and Vessel, 2006). Regarding the effects of viewing time, the results diverge. Locher et al. (2007) found that a longer viewing time (100 ms vs. unlimited time, mean 32.5 s) raises the pleasingness of art stimuli, whereas Smith et al. (2006) found no such effect (viewing times varying between 1, 5, 30, and 60 s). Moreover, Smith et al. (2006), by either showing or omitting painting captions, noted that the information about a painting did not affect the ratings of viewers (mixed group of art-trained and lay viewers). The role of information in raising the ratings in the present experiment is improbable since the ratings were also higher for the "rehearsal" pictures in Part 2 that were not accompanied by auditory information. Thus, we argue that the ratings rose in Part 2 mainly because of the longer viewing time.

The eye-movement parameters also differed between the two parts of the experiment: in both expert and layperson groups, the duration of fixations increased from Part 1 to Part 2 (first 10 s in Part 2), while the number of fixations decreased. Even though the longer viewing time was controlled for, as both parts were analyzed for the first 10 s, the subjects knew that the viewing time was longer in Part 2 than Part 1, and they could thus take more time to examine the details as reflected in the longer fixations.

Finally, the electrodermal reactivity was stronger in both groups in Part 2 than in Part 1. The information given during Part 2 affected the EDA values of the laypersons more than those of the experts, suggesting that the laypersons were more susceptible to the information given, which is understandable as they knew less about the paintings and painters in advance.

The digitized pictures projected on the screen obviously did not have all qualities (e.g., texture, size) of the original paintings. According to Kapoula et al. (2008), the reproduction type (original or digital painting) does not affect the target of fixation. However, original paintings viewed in the gallery are rated more pleasant and interesting than their slide or computer reproductions, both by art experts and laypersons (Locher et al., 2001). In our study, the experts gave on average higher esthetic and emotional ratings than did the laypersons. This difference cannot be caused by the format of the painting, but reflects the low ratings that the laypersons gave to the more abstract paintings.

In conclusion, we found that expertise in art history strongly influences the cognitive but hardly any of the psychophysiological measures of subjects experiencing art. Esthetic judgments and emotional valence ratings given by the laypersons depended on the level of abstraction, being more positive for the representational than abstract categories, whereas those of the experts did not show this tendency. Although the gaze patterns of both groups were similarly affected by the level of abstraction, the expertise was reflected in the viewing strategies, e.g., *where* the subjects were looking. This result agrees with the global viewing strategies previously detected in various expert groups.

#### **ACKNOWLEDGMENTS**

This study was supported by the Academy of Finland (National Centers of Excellence Program 2006–2011), the aivoAALTO project of the Aalto University, Aalto MIDE programme (project UI-ART), ERC Advanced Grant #232946 to Riitta Hari, and European Commission grant (FP7-PEOPLE-2009-IEF, EyeLevel 254638) to Sebastian Pannasch. We thank Cathy Nangini for valuable suggestions on earlier versions of the manuscript and Ville Renvall for support in the experiment preparation. Image credits for **Figure 5**: Eugène Boudin (French, 1824–1898), *Villefranche*, c. 1891. Oil on panel, 16 1/16″ × 12 7/8″ (40.8 cm × 32.7 cm). (c) Sterling and Francine Clark Art Institute, Williamstown, MA, USA, 1955.547. Juan Gris (José Victoriano González Pérez, Spanish, 1887–1927) *Still Life before an Open Window, Place Ravignan,* Work Type Paintings, 1915. Oil on canvas, 45 5/8″ × 35″ (115.9 cm × 88.9 cm). Philadelphia Museum of Art, Philadelphia, PA, USA, The Louise and Walter Arensberg Collection, 1950, 1950-134-95.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 March 2011; paper pending published: 14 June 2011; accepted: 16 August 2011; published online: 12 September 2011.*

*Citation: Pihko E, Virtanen A, Saarinen V-M, Pannasch S, Hirvenkari L, Tossavainen T, Haapala A and Hari R (2011) Experiencing art: the influence of expertise and painting abstraction level. Front. Hum. Neurosci. 5:94. doi: 10.3389/fnhum.2011.00094*

*Copyright © 2011 Pihko, Virtanen, Saarinen, Pannasch, Hirvenkari, Tossavainen, Haapala and Hari. This is an open-access article subject to a nonexclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **ARTISTIC EXPLORATIONS OF THE BRAIN**

**Eberhard E. Fetz**

# Artistic explorations of the brain

# *Eberhard E. Fetz\**

*Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

*\*Correspondence:*

*Eberhard E. Fetz, Department of Physiology and Biophysics, University of Washington, Seattle, WA 98195-7290, USA. e-mail: fetz@uw.edu*

The symbiotic relationships between art and the brain begin with the obvious fact that brain mechanisms underlie the creation and appreciation of art. Conversely, many spectacular images of neural structures have remarkable aesthetic appeal. But beyond its fascinating forms, the many functions performed by brain mechanisms provide a profound subject for aesthetic exploration. Complex interactions in the tangled neural networks in our brain miraculously generate coherent behavior and cognition. Neuroscientists tackle these phenomena with specialized methodologies that limit the scope of exposition and are comprehensible to an initiated minority. Artists can perform an end run around these limitations by representing the brain's remarkable functions in a manner that can communicate to a wide and receptive audience. This paper explores the ways that brain mechanisms can provide a largely untapped subject for artistic exploration.

**Keywords: art, brain, neural, collage, aesthetic**

# **INTRODUCTION**

All cognitive processes are ultimately performed by neurons in the brain, providing ample opportunities to elucidate the neural mechanisms that underlie artistic behavior and the appreciation of art. Several neuroscientists have explored the ways that graphic artists exploit the properties of the visual system to create resonant works (Livingstone, 1988, 2000, 2008; Zeki and Lamb, 1994; Zeki, 1999; Cavanagh, 2005; Liu and Miller, 2008). For example, the dominance of portraiture as an art form is clearly based on effective stimulation of the specialized face areas of the human brain (Zeki, 1999). The brain regions involved in processing artistic input have been extensively documented through modern imaging techniques (Vartanian and Goel, 2004; Di Dio et al., 2007; Kim and Blake, 2007; Fairhall and Ishai, 2008; Bosnar-Puretic et al., 2009; Chiu, 2009; Kowatari et al., 2009) and electrophysiological recordings (Bhattacharya and Petsche, 2002). Ramachandran has proposed several perceptual principles and described the neural mechanisms that may account for the aesthetic appeal of enduring art (Ramachandran and Hirstein, 1999; Ramachandran, 2011). Art historians have also expounded on the ways that an understanding of brain mechanisms can inform the appreciation of art (Clausberg, 1999; Cantz, 2000). The neural mechanisms underlying aesthetic experience have also been explored and elucidated in several publications (Rentschler et al., 1988; Ramachandran and Hirstein, 1999; Zeki, 1999; Vartanian and Goel, 2004; Nadal et al., 2008).

Interestingly, images of the brain and the spectacular forms of neural structures have their own aesthetic appeal. Larink has just produced a comprehensive review of how the brain has been depicted historically in various contexts that serve social and other functions (Larink, 2011). The wide range of images of neural morphology revealed by successively more sophisticated techniques, from early staining methods to contemporary multicolor reconstructions, is lavishly illustrated in "Portraits of the Mind" (Schoonover, 2010). When appreciated by the right hemisphere rather than the left, the spectacular color images of intermingled neurons rival the aesthetic appeal of Richter's abstract paintings. Schoonover's book also illustrates the complex patterns produced by multichannel neural activity. A collection of artistic pieces inspired by the brain's anatomy and functions formed an eclectic exhibit in Rotterdam called "Neuro-artonomy," described in a book by the same name (Voogd, 1998). The book also includes many imaginative essays by neuroscientists and artists concerning each other's endeavors.

Beyond these reciprocal bridges between art and neuroscience there is a largely unexplored area of brain function as itself a subject for artistic representation. The neural networks in our brains effortlessly perform common miracles of perceiving the world, controlling volitional movements and performing higher functions like speech and thought. These cognitive functions are all produced by complex patterns of neural activity, but how mental events emerge from material mechanisms remains an enduring mystery. The remarkable relation between mind and brain has stimulated philosophers and scientists through the ages and holds a fascination for the layman. Neuroscientists have made considerable progress in elucidating the neural mechanisms underlying cognitive function, but their findings are disseminated in professional publications that are inaccessible to the larger population lacking the requisite training.

Many deep issues have remained impervious to scientific analysis. How does consciousness emerge from brain activity? How are coherent cognitive functions produced by the myriad impulses coursing through seemingly chaotic neural networks? What are the mechanisms that generate emotions, control thinking, mediate satori, etc.? While science remains unable to answer these questions, I believe that art can make an end run around the conceptual impasse. Art excels at rendering mysteries that surpass rational exposition. The components of the above questions are all amenable to various forms of graphic representation and their relationships can be explored by appropriate associations of images. Obviously such an approach would not answer these questions in scientific terms, but it can still communicate important relationships intuitively. Moreover, artistic approaches can communicate these issues to a wider audience than the neuroscience community. Such works could simply incorporate neural images as a reminder that brain mechanisms underlie mental events and behavior. But to attain aesthetic resonance such works should go beyond the images of neural forms and explore representations of the ways that these complex structures give rise to cognitive activity.

#### **ARTISTIC REPRESENTATIONS OF THE BRAIN**

Despite the brain's key role in mediating human experience, we have relatively few examples of artistic representations of brain mechanisms and functions. A possible early example is Michelangelo's "Creation of Adam" in the Sistine Chapel, which is said to employ an outline of the brain in God's cloud, as a metaphor for the neural basis of creativity (Meshberger, 1990; Lakke, 1999). Others have seen evidence of additional neuroanatomic images in the Sistine frescoes (Suk and Tamargo, 2010).

Depictions of brain mechanisms become particularly compelling when displayed in public exhibitions of walk-in renderings of neural images, accompanied by recordings of neural activity, as in the "Mindscape" project (O'Shea and Sneltvedt, 2006). The Mindscape exhibit, first presented in a Brighton church in 2004, was a multimedia artwork that had "resonance in the science of the brain." The artist Sol Sneltvedt was inspired to represent the neural mechanisms underlying dynamic states of mind and collaborated with neuroscientist Michael O'Shea to create an immersive audiovisual installation. Large-scale projections of neural structures and multiple soundtracks of neural activity surrounded viewers with a vivid representation of brain activity. The projections fluctuated between different neural images that were synchronized with audio tracks of neural recordings. Representing both fast-scale electrical activity and slower chemical communications, the visualizations presented a rich experience of dynamic brain activity. Similarly, a current travelling exhibition called "BRAIN: The World inside Your Head" presents an elaborate multimedia exhibit designed to impress viewers with the workings of brain mechanisms that mediate mental experience (details of the exhibit and the location of its latest incarnation can be found via Google).

Excellent examples of artistic exploration of mind and brain are described in the paper by Geoffrey Koetsch in this Frontiers Special Topic issue (Koetsch, 2011). He illustrates the works of eight New England artists dealing with mental processes and alluding to neural mechanisms, presented in an exhibition called MINDmatters (see http://www.laconiagallery.org/exhibit18.html).

A prolific contemporary artist whose works have been consistently inspired by the relationships between mind and brain is Todd Siler. Siler has produced innumerable striking images and installations representing the interactions between brain, mind and the world in varying levels of abstraction (see http://www.toddsilerart.com). **Figure 1** illustrates two pieces from his extensive "Mind Icon" series, rendered on slabs shaped like frontal brain sections and painted with images representing mental and neural activity. His dramatic works and creative approaches are described in numerous publications (Siler, 1975, 1990, 1993), and in a comprehensive paper in this Special Topic issue (Siler, in press).

Another example inspired by brain function is the collage in **Figure 2**, which alludes to the types of neural interactions that mediate cognitive processing. The subject is the brain of Eric Heller, a physicist interested in wave mechanics, who has also explored aesthetic renderings of quantum processes (see http://www.ericjhellergallery.com). Heller's brain converts quantum events described by the Schrödinger equation into Matlab code that generates spectacular graphic patterns produced by quantum wave propagation and resonance. Such images lend themselves perfectly to representations of brain waves and reverberating neural activity, and are here combined with neural images. One of Heller's famous figures shows the cumulative pathways that electrons would produce when emerging from a central source and propagating over a potential field (Topinka et al., 2001); this image, shown on a cover of *Nature*, resembles a generic biological form, including the morphology of Golgi neurons (see prefrontal cortex in **Figure 2**). The collaged images in **Figure 2** allude to the resonant interactions between frontal and occipital areas of the brain when creating visual art.

A third example (**Figure 3**) provides a metaphor of the conscious self emerging from the complex tangle of neural networks. The human figure may be recognized from its original incarnation in the Flammarion engraving, depicting an explorer breaking through the confines of the physical world to discover celestial realms beyond (http://en.wikipedia.org/wiki/Flammarion\_engraving). This piece is intended to be rendered with the white portion being a mirror surface, reflecting the real world.

# **FUTURE OPPORTUNITIES**

The collages in **Figures 1**–**3** illustrate the sort of representations through associated images that are possible with pictorial presentation. More profound effects could be achieved with multimedia techniques. The dynamic operations of the brain can be effectively represented by videos of wave propagation through neural networks or the limitless contents of cognitive activity. Projecting these videos onto 3D structural renderings of brain networks could provide effective representations of functions emerging from forms. The works could even be made interactive by incorporating real-time videos of the observer, and could be designed to stimulate the viewer to recognize his own instantaneous brain activity.

**FIGURE 2 | "Resonant Transformations," a digital collage by the author.** The geometric images were created by Eric Heller, using simulations of quantum dynamics and incorporated here to allude to dynamic neural interactions in the brain.

**FIGURE 3 | "Awakening," by the author.** The white portion of the image is intended to be rendered as a mirror surface reflecting reality. Alternatively, the image can be rendered on clear glass, with the white portion transparent.

A key advantage of art over science in representing brain function is its wider palette of discourse, allowing it to evoke associations that transcend objective thought. The mind-brain problem has remained refractory to rational resolution, but its underlying mechanisms may be approached by artistic representation and appreciated by aesthetic revelation. This presents a rich opportunity for artistically inclined neuroscientists, who understand the operations of brain mechanisms and can render insightful representations. The process of creating visualizations of brain function could even give neuroscientists right-hemisphere insights that could inform their left-hemisphere investigations. This opportunity should also interest conceptually inclined artists, who have the requisite creative talents and are inspired to explore this largely untapped aesthetic realm. Their challenge would be to draw on the rich trove of current scientific information and breathe aesthetic life into the deeper issues. Finally, as demonstrated by the Mindscape project, there are ample opportunities for productive collaborations between artists and neuroscientists to create new works inspired by the resonant relations between mind, brain and reality.

# **REFERENCES**


*Front. Hum. Neurosci.* 5, 110. doi: 10.3389/fnhum.2011.00110


# **ACKNOWLEDGMENTS**

I thank the Wissenschaftskolleg zu Berlin for providing the opportunity to create the digital collages in **Figures 2** and **3**, Eric Heller for providing some of the images in **Figure 2**, and Todd Siler for permission to reproduce the images in **Figure 1**.


branched flow in a two-dimensional electron gas. *Nature* 410, 183–186.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 April 2011; accepted: 20 January 2012; published online: 07 February 2012.*

*Citation: Fetz EE (2012) Artistic explorations of the brain. Front. Hum. Neurosci. 6:9. doi: 10.3389/fnhum.2012.00009*

*Copyright © 2012 Fetz. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **HOW A CEREBRAL HEMORRHAGE ALTERED MY ART**

**Katherine Sherwood** *(artist perspective)*

# How a cerebral hemorrhage altered my art

# *Katherine Sherwood \**

*Art Practice and Disability Studies, University of California Berkeley, Berkeley, CA, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

*Vered Aviv, The Jerusalem Academy of Music and Dance, Israel*

#### *\*Correspondence:*

*Katherine Sherwood, Art Practice and Disability Studies, University of California Berkeley, 345 Kroeber Hall, Berkeley, CA 94720, USA. e-mail: sherwood@berkeley.edu*

"How a Cerebral Hemorrhage Altered My Art" examines how a massive stroke affected my art practice. The paralysis that ensued forced me to switch hands and become a lefthanded painter. It was postulated by several neuroscientists that the "interpreter" in my brain was severely damaged during my CVA. This has had a profoundly liberating effect on my work. Whereas my pre-stroke period had the tendency to be over-intellectualized and forced, my post-stroke art is less self-conscious, more urgent and expressive. The primary subject matter of both periods is the brain. In my practice as an artist, my stroke is a challenge and an opportunity rather than a loss.

**Keywords: paintings, brain, CVA, neuro-anatomical illustrations, artist, cerebral angiogram**

Out of the blue, in 1997, I had a massive stroke at age 44. Brains had been on my mind for years before that. At the beginning of the 1990s I had started to incorporate brain imagery into my paintings. I was fascinated by the theories of the holographic paradigm promoted by Michael Talbot in *The Universe as a Hologram*. This new way of perceiving reality made perfect sense to me although it was labeled a "fringe science." I did several paintings using MRIs I found deep in the biosciences library at UC Berkeley. Small abstract holograms were embedded in the surface of the paintings, such as *Fun Puddle* (**Figure 1**). These paintings failed at expressing thoughts about the holographic paradigm but were visually successful and led me on my way.

I continued to use brain representations through the mid 1990s but my interest was shared with telescopic imagery. Similar to the microscopic views of the body, they were both realities that one could not perceive with the naked eye. I was obsessed with CIA satellite imagery, especially photos taken in Russia of nuclear test sites, as in the painting *Aldrich Ames* (**Figure 2**). My interest in these images coincided with my sense that I was a spy going through the tenure process in my work as a professor of art practice. In June of 1996, I achieved tenure and set out to have a relaxing post-tenure year.

The next May I experienced a cerebral hemorrhage affecting the parietal lobe of the dominant hemisphere. I lost my ability to walk, talk, read, and think as my right side became paralyzed within the course of 2 min. It happened during a graduate student's critique. Within those minutes my life was completely changed. I do not recall saying this but one of my colleagues reported that the last thing I said was "Oh no, not again." I was referring to the death of my father at age 33 from an aneurysm.

This was when my life caught up to my art.

In rehab I was repeatedly asked if I wanted to paint as art therapy. I haughtily answered no, that I would just wait until I got the use of my right hand back again. Thankfully, I did not hold onto this attitude for long, because my right hand has never regained function. Had I held onto it, I would not have become a more fluid and urgent left-handed painter. It turns out that my left hand was, in my case, the better painting hand, and that painting in my studio was the most effective occupational therapy there could have been for me.

Six months later, after my brain had absorbed my spilled blood, I had a cerebral angiogram. Relieved that it was over and the possible second stroke had not occurred, I sat up on the gurney and looked at the computer screen in the corner of the room. The images of the arterial system of my brain stunned me and reminded me of the Southern Song Dynasty Chinese landscape paintings that I had deeply admired. I immediately said without thinking, "I need those images." The room broke out in laughter, which I still do not understand. I repeated, "No, I am an artist and I really need those images." A few minutes later the radiologist entered the room and handed me the full set. He simultaneously told me I would not need brain surgery. I was immensely thankful that I did not have to wait 2 weeks to hear the desired results and grateful to my doctor for taking my request seriously. I knew in an instant that I would use those images in my paintings and clearly saw a blueprint of my artistic future within them. In them was an assurance that I would have a painter's life again.

I returned to my studio with a newfound vigor. Of course I had to make adjustments. Using my left hand was rather like an athletic challenge and it helped me to discover that I was slightly ambidextrous. I was aware of how lucky I was to be a painter and able to continue my vocation by merely changing hands. For example, it was fortuitous that I was not a sculptor, a musician, a dentist, or a brain surgeon.

I compensated for my lesser amount of fine motor skills in my left hand by working on much larger canvases. This allowed for larger brushes and an escape from meticulous detail. I tried to detoxify my process as much as possible as a preventive measure by abandoning using solely oil paints and their hazardous mediums. I switched to a mixed-media method where I combined acrylic and latex paint using oil at the very end of my process. I began to work on a horizontal surface instead of on the wall because of the weight of the canvases, my inability to move them, and to keep them from dripping. I converted my handmade platform bed that I could not get up on any longer into my painting table. I used a rolling chair to go around and around those large surfaces often working on one painting for a year unlike my pre-stroke pace. This was because of the greater size of the canvases as well as my newfound interest in building up the grounds of the painting before I began to apply paint. It takes longer to do anything now, which even in our speeded up world is not necessarily a bad thing.

I soon became proficient at visualizing what a painting would look like once it was hung vertically. Because I found mixing paint with one hand too difficult, I soon began to rely on pre-mixed paint. Hence my palette changed and I ventured into using "ugly colors" with sweet pastel ones. Most importantly, there was a new ease in my process. I did not feel I had to intellectualize away my every move, and an extreme amount of struggle faded away. A freer, more enjoyable state of painting existed, far different from that of my previous work. I was also more detached from the paintings, often feeling that they went through me, which in the end left me wondering: did I make these paintings?

**FIGURE 2 | Aldrich Ames, 1995, 84″ × 72″, Mixed media on canvas.**

**FIGURE 3 | Facility of Speech, 1999, 108″ × 84″, Mixed media on canvas.**

The paintings and prints done in the post-stroke period have combined brain images with the visual interpretations of the medieval manuscript, *The Lemegeton*. I had begun using the sigils in my art in 1992, but I did so because I admired their esthetic attributes, their unsymmetrical stance, and their immaculate calligraphic quality rather than out of a belief in their efficacy. This text contains seals attributed to King Solomon, the wise leader found in the Bible, Talmud, and Koran. His wealth and fame were said to be the result of magic. These seals allegedly represent the spirits he harnessed to achieve his wisdom, fame, wealth, and spiritual power. These spirits were used to answer questions and provide assistance. They were conduits for desire. The signets used in my post-stroke art purport to aid the supplicant in matters of healing. I began to reassess their effectiveness.

**FIGURE 4 | Pump, Drug, Computer, 2006, 84″ × 108″, Mixed media on canvas.**

**FIGURE 6 | Cajal's Revenge, 2007, 64″ × 50″, Mixed media on canvas.**

*Facility of Speech* (**Figure 3**) is a prime example of my use of the seals. It was made when I was anxious about returning to teaching. For almost 2 years, thanks to UC Berkeley, I had worked diligently on rehabilitation but usually in a one-to-one situation. The rest of my time was spent alone in my studio. The thought of leading an entire class was terrifying. I was still very labile. I was afraid that students would not understand it when I broke down crying and through my tears pleaded that I was in fact exceedingly happy. So I used the seal Bune which gives you facility of speech and a way with words. In the painting it is in the light green paint scrawled over my brains in loopy yet graceful lines. *Facility of Speech* was exhibited in the 2000 Whitney Museum Biennial.

One of the classes I taught at Berkeley was about the history of materials and techniques of painting. I took it very seriously before my stroke, believing I was there so that students would learn the chemistry of painting, such as how to avoid cracks. I would think to myself when I detected cracks, "he obviously does not know how to paint." After the stroke, I slowly accrued paint on my canvas, going over the contours of the seal again and again. The light colors of my paintings, when built up, began to crack. This time I felt differently toward cracks and made them more pronounced by filling them with a darker oil paint. There was a bonus in being comfortable with their conceptual meaning. The cracks soon became pre-planned as a compositional device and as a metaphorical placeholder.

**FIGURE 9 | Mansur Healer of the Yelling Clinic, 2010, 90″ × 36″, Mixed media on canvas.**

The final post-stroke element of my work is something one cannot see or perceive. It is a spiritual device that I employ as my own form of ancestor worship and as the focus of intention for the painting. Before I assign each painting a seal, I choose a close person who has passed both to be in charge of the efficacy of the seal and to be the one to whom it is dedicated. This is my method for effective grieving, something I have had to do a lot of in my life.

Most of the press I have gotten since my CVA is based in the popular media versus art journals. I make an excellent "overcoming narrative," as it is known in the field of Disability Studies. It refers to the media's dealing with disability only when a disabled person beats all the obstacles so as to appear as "normal." Other articles, especially Peter Waldman's of the Wall Street Journal, proposed that my new success came from changes in my brain, particularly in the disruption of "the interpreter." My artist friends vehemently disagreed with this assessment, preferring to believe it had something to do with the 20 years of painting I had done before my cerebral hemorrhage and my ample time to paint while I was recovering. I leave it up to mystery, a category that drives my doctors crazy.

The paintings garnered a Guggenheim fellowship in 2005– 2006. I proposed incorporating brain imagery of western neuroanatomy from the sixteenth century to the twenty-first century. I became fascinated by the traditional role of the artist to pictorially represent what the anatomist discovers. In today's medical imaging technology, the role of the artist is eliminated. The digital process ostensibly avoids intervention, the human hand, and the craftsmanship of printmakers. What happens when the artist comes at the end of this process instead of in the middle, when the emphasis is on interpretation rather than observation or imitation?

In this body of work I shifted my attention to the nervous system. An example is the painting *Pump, Drug, Computer* (**Figure 4**). Enlarged digital copies on rice paper of Vesalius' nervous system are tiled onto the canvas looking at each other. The seal Foras is employed, which supposedly makes one happy, wealthy, and wise. It also represents my two fathers. The title of the painting refers to the fact that I had just had a baclofen pump that has a computer within it implanted within my stomach. I laughed with my digital media colleagues that they may be the most adept at using a computer but I am the only one among us that has a computer inside of me. In *Vesalius's Pump* (**Figure 5**) I combine images of his brains with my carotid artery while evoking the seal Sallos, which brings love to all that ask.

The power and the helpfulness of limitations are often overlooked in life. Limitations spur you on as they challenge you and provide for creative inventions. As an artist they help you define yourself and focus on what does work. Many famous artists have had disabilities that have not restricted them artistically, such as Michelangelo's Asperger's syndrome; Goya's deafness; Degas, Monet, and Matisse's visual impairments; Toulouse-Lautrec's dwarfism; and Frida Kahlo's spina bifida, polio, and trolley accident. We also do not need anyone's pity for our conditions or to be solely defined by our medical condition. I firmly know my life has been considerably enhanced and made richer after acquiring my disability.

Santiago Ramón y Cajal's limitation was that he really wanted to become an artist, but his anatomist father prevented that profession, preferring his son to follow his medical footsteps. In the painting *Cajal's Revenge* (**Figure 6**), a blown-up version of Cajal's famous rendition of the Purkinje cell is placed above a painted seal. Being one of several nineteenth to twentieth century anatomists who drew his own illustrations, Cajal won the Nobel Prize in 1906 with Camillo Golgi. Cajal drew his own stained neurons using Golgi's techniques that were improved upon by Cajal. I have used his illustrations in many paintings of this series, concluding with a 72″ × 72″ work entitled *One in a Hundred Billion* (**Figure 7**), where a single neuron is painted in silver paint and fills the large, square canvas.

This body of work was shown at the National Academy of Sciences in Washington D.C. in 2007 and in Gallery Paule Anglim in San Francisco in late 2008. After my exhibit in 2008 I experienced an artist's block, knowing internally it was time to change directions. This was painful because before I had always had my art to turn to in order to face life's difficulties. When this became absent, I felt bereft. After nine long months, I had a vision upon waking up one morning of a diptych with a long skirt attached. Who knows where that came from, but I figured this could be a way out. I started to construct figures made up of the brains I had been using in my previous work and began making long skirts for each of them. I substituted the brains for eyeballs, faces, and breasts. For example, in *Neuron Nurse* (**Figure 8**) I painted Ramón y Cajal's neurons to form the face, arms, and chest. In *Mansur's Healer* (**Figure 9**) I used four fMRIs from the Helen Wills Institute of Neuroscience at UC Berkeley, where I am the artist-in-residence, to create the head and an illustration from The Tashrih Mansuri (Anatomy of Mansur) to form the body. This series will include 10 paintings that will be shown in spring 2012 and is known as the *Healers of the Yelling Clinic*.

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 13 April 2011; accepted: 29 February 2012; published online: 03 April 2012.*

*Citation: Sherwood K (2012) How a cerebral hemorrhage altered my art. Front. Hum. Neurosci. 6:55. doi: 10.3389/fnhum.2012.00055*

*Copyright © 2012 Sherwood. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **HUMAN CORTICAL ACTIVITY EVOKED BY THE ASSIGNMENT OF AUTHENTICITY WHEN VIEWING WORKS OF ART**

**Mengfei Huang, Holly Bridge, Martin J. Kemp and Andrew J. Parker**

# Human cortical activity evoked by the assignment of authenticity when viewing works of art

# *Mengfei Huang<sup>1†</sup>, Holly Bridge<sup>2†</sup>, Martin J. Kemp<sup>3</sup> and Andrew J. Parker<sup>1</sup>\**

*<sup>1</sup> Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK*

*<sup>2</sup> Department of Clinical Neurology, FMRIB Centre, John Radcliffe Hospital, University of Oxford, Oxford, UK*

*<sup>3</sup> Trinity College, University of Oxford, Oxford, UK*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Nancy Zucker, Duke University Medical Center, USA*

*Marian Berryhill, University of Nevada, USA*

#### *\*Correspondence:*

*Andrew J. Parker, Department of Physiology, Anatomy, and Genetics, University of Oxford, Sherrington Building, Parks Road, Oxford OX1 3PT, UK.*

*e-mail: andrew.parker@dpag.ox.ac.uk*

*†Mengfei Huang and Holly Bridge are joint first authors.*

The expertise of others is a major social influence on our everyday decisions and actions. Many viewers of art, whether expert or naïve, are convinced that the full esthetic appreciation of an artwork depends upon the assurance that the work is genuine rather than fake. Rembrandt portraits provide an interesting image set for testing this idea, as there is a large number of them and recent scholarship has determined that quite a few fakes and copies exist. Use of this image set allowed us to separate the brain's response to images of genuine and fake pictures from the brain's response to external advice about the authenticity of the paintings. Using functional magnetic resonance imaging, we found that viewing of artworks assigned as "copy," rather than "authentic," evoked stronger responses in frontopolar cortex (FPC) and right precuneus, regardless of whether the portrait was actually genuine. Advice about authenticity had no direct effect on the cortical visual areas responsive to the paintings, but there was a significant psycho-physiological interaction between the FPC and the lateral occipital area, which suggests that these visual areas may be modulated by FPC. We propose that the activation of brain networks rather than a single cortical area in this paradigm supports the art scholars' view that esthetic judgments are multi-faceted and multi-dimensional in nature.

**Keywords: fMRI, visual perception, Rembrandt, social neuroscience, psychophysiological interaction**

# **INTRODUCTION**

Viewing art is popular, pleasurable, and for the most part an unsolved puzzle when it comes to the neurological mechanisms underpinning this experience. Much of the pioneering work on the neurology of art tended to focus on what has been called "neuro-esthetics" and has aspired to tell us why we find something beautiful. This broad-brush approach tells us little about the complex mechanisms that shape an individual viewer's response to a specific work of art within powerfully determined contexts. In particular, it is incapable of handling the intricate interactions of form and content within the framework of strong viewer expectations. If neuroscience is to speak effectively to art historians, it requires new questions and different methods. In this paper we are testing one such approach, as an early step in realigning the interaction of art history and neuroscience. The topic we have selected is that of how assertions of authenticity shape what we see.

Determination of the authenticity of artworks is of great importance in the art world: authenticity brings a scholarly value in shaping art historical understanding, has direct and potentially huge consequence for monetary value, and, most relevantly here, is presumed to have an impact on the individual viewer's experience. Declaring an artwork to be a forgery completely changes the reception of a picture by the viewer; suddenly, "we can see its every flaw" (Wynne, 2006).

Information about the authenticity of an artwork may therefore set a context for the perception of the art, with the capacity to modulate responses in the visual brain. The modulation of visual responses by contextual information is the focus of a great deal of current research. Selective attention (Kastner et al., 1999; Kastner and Ungerleider, 2000), reward (Shuler and Bear, 2006; Krawczyk et al., 2007), and working memory (Zaksas and Pasternak, 2006; Offen et al., 2009) are all examples of changes of context that set an appropriate bias on lower level responses within the brain. The viewing of artwork, particularly the pleasure or fascination it induces (Kawabata and Zeki, 2004, 2008), is also a multidimensional experience, extending beyond the purely visual. The way in which non-visual information about authenticity may alter the viewer's experience has been acknowledged in the context of art history and scholarship (Bossart, 1961; Goodman, 1976). Similar influences of non-sensory information occur in other sensory systems and in other culturally significant contexts (McClure et al., 2004).

When attempting to bridge the divide between neuroscience and studies of art, an initial step has been to summarize the multi-faceted experience of viewing art with a single rating of an artwork's esthetic impact (Kawabata and Zeki, 2004). Although such a generalization could form a useful starting point, scholars in the art world have found little of use in such an approach. By contrast, here we focus on a particular aspect of the viewer's experience, which art scholars generally agree is of significance, namely whether a work of art has been assigned as authentic or derivative. For less expert viewers, advice from experts is highly influential. Thus, there may be a strong effect of social influence on perception in this setting (Asch, 1956; Mojzisch and Krug, 2008).

For this reason, we designed our study to concentrate upon a highly regarded artist (Rembrandt) and on a particular band of expertise about that artist among the participants. We sought participants who were familiar with the name of Rembrandt but were not expert or trained viewers of art. Studies of expert or trained viewers are of course potentially of great interest, but we concluded that the nature of expertise may be highly particular. For example, there are experts who devote themselves to the study of fakes. We reasoned that such experts might find the viewing of fakes more rewarding than the viewing of authentic works. Another example is that experts in modern art might find Rembrandt relatively unexciting and therefore such persons might not care much whether they are looking at a fake or authentic picture. On the other hand, a true Rembrandt expert would be able effortlessly to spot the discrepancies between our assignments of authenticity and the underlying reality as to whether the images are genuine or fake. All these comments emphasize that expertise, especially in its elaborated forms, is individual and particular. For this reason, we concentrated upon the experience of the average viewer in an art gallery, who will look at the artworks in the context of advice from experts (displayed on the gallery wall or in the exhibition guide) about the authenticity of the works.

Using functional magnetic resonance imaging (fMRI), we examined the brain's response to *assignment* of authenticity during the viewing of "Rembrandt" portraits. The assignment of authenticity did not always reflect the true status of the work: in reality, sometimes the viewed artwork was truly authentic and sometimes it was a copy. Rembrandt portraits are particularly suitable for our study because there is a large number of examples, both real and false, whose authenticity has been systematically determined, although still leaving some room for dispute. Hence we were able to determine the brain's response to contextual information generated by the assignment of authenticity, separately from the visual impact that is generated by a genuine or fake image.

The pattern of brain responses we identify here indicates that the major effects of assignment of authenticity are not directly visual, nor even unitary; rather, changes are observed in the interaction between multiple brain regions that all make relevant contributions to the viewer's experience.

# **EXPERIMENTAL METHODS**

#### **PARTICIPANTS**

Fourteen human participants (8M/6F, age 20–27, two left-handed) were recruited based on a screening questionnaire (see Appendix). The questionnaire ensured that all participants were amateur viewers (visit art museums <5 times/year; no extensive art training) and screened out individuals who dislike Rembrandt to avoid potentially confounding variables. Most importantly, it confirmed the participants' familiarity with Rembrandt, with all participants ranking him among the top 25 artists of all time, or higher. Establishing such esteem ensures that the difference between AUTHENTIC and COPY is salient.

All methods were conducted in accordance with NHS Ethics Reference 06/Q1604/86 and all participants provided written, informed consent. Each participant was provided with the following instructions before commencing the scanning session.

"In this experiment you will see a sequence of 50 Rembrandt paintings. Before each image appears, an audio prompt will announce whether the upcoming painting is 'authentic' or a 'copy' (Please see background for further information on copies). A blank screen will appear for a few seconds after each image to allow you to relax your gaze. A fixation cross will signal for you to focus again and an audio prompt will arrive shortly for the next image. Interspersed within the sequence will be three scrambled images. You will be told when to expect a scrambled image by the audio prompt 'neither'."

The following background on Rembrandt and the definition of copies of artwork was provided for all participants to read.

*"Rembrandt van Rijn, 1606–1669*

Rembrandt is recognized as one of the greatest of all painters and etchers. Working in Amsterdam, he experienced great success in the first part of his career, while his latter years were dogged by financial problems (largely self-inflicted) and social difficulties. He was particularly famed for his portraits, which seem to present his sitters' personalities in a profound manner. No artist ever created such a range of self-portraits, which seem to comprise a kind of painted autobiography.

His early style was relatively detailed and naturalistic, but his way of handling paint became increasingly free. In his later works, in particular, thick, vigorous brush strokes, and large swathes of color evoke forms and textures rather than describing them in detail. The figures often emerge from dark or neutral backgrounds.

# *What is a copy?*

Rembrandt taught a number of pupils who were adept at producing work in his various styles. He was much copied and imitated by painters other than pupils, both in his own lifetime and later. As prices for his works escalated, he also became the subject of forgery. What was actually Rembrandt and what was not by him became very confused. The Rembrandt Research Project in Holland was set up to sort out the confusion. In this experiment, works of art labeled as "COPY" refer to pieces that were produced by pupils, followers, or forgers."

The main idea behind the provision of this information was to reduce the chance that participants would bring incorrect and confounding notions to this study. For example, there is often a good deal of confusion in lay usage about the word "fake," which is often presumed to mean some active attempt at forgery. Clarification on points such as these is likely to provide the various participants and sub-groups of the study with a consistent contextual setting prior to the main experiment.

At the end of each person's brain scan, a second questionnaire (see Appendix) was conducted to learn about the person's response to the Rembrandt images. Some of the responses to these questions were used to assist the interpretation of the outcome of the scanning experiments. This questionnaire also asked about whether the participants had fallen asleep during all or part of the scanning: none reported that they had.

# **SELECTION AND PRESENTATION OF VISUAL IMAGES**

Rembrandt's portrait works were prolific and much copied, creating a large set of both genuine and derivative works (Van Sonnenburg, 1995). We obtained a set of 50 high-resolution, color, digital reproductions from the University of Amsterdam Rembrandt catalog (Seinstra, 2010), carefully selected for a balance of male and female portraits, in addition to similar numbers of each pose (frontal, three-quarter pose to the right, and three-quarter pose to the left). All images were re-sized to fit the 1024 × 768 pixel projector screen placed 1210 mm from the participant, which resulted in images that were 600 pixels in height (∼20˚ of visual angle) and varied between 450 and 559 pixels in width. Within this set of 50 images, half (25) have been genuinely attributed to Rembrandt himself, while the other half are considered to be in the style of Rembrandt by someone else. Purely for convenience, we refer to the second of these groups as FAKE, even though not all of these works were created as forgeries with intent to deceive.
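As a rough check on the stimulus geometry described above, the visual angle subtended by an image can be computed from its physical size and the viewing distance. The sketch below assumes the standard relation θ = 2·atan(size / (2·distance)). The physical pixel pitch of the projector screen is not given in the text, so the code infers it from the reported figures (600 pixels, about 20°, at 1210 mm) and treats it as an assumption; the function names are illustrative only.

```python
import math

VIEWING_DISTANCE_MM = 1210.0   # screen distance reported in the text
IMAGE_HEIGHT_PX = 600          # image height reported in the text
REPORTED_HEIGHT_DEG = 20.0     # approximate visual angle reported in the text


def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Visual angle (degrees) subtended by an object centered on the line of sight."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))


def size_for_angle_mm(angle_deg: float, distance_mm: float) -> float:
    """Inverse of visual_angle_deg: physical size that subtends a given angle."""
    return 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)


# Assumption: the pixel pitch is inferred from the reported numbers, not measured.
image_height_mm = size_for_angle_mm(REPORTED_HEIGHT_DEG, VIEWING_DISTANCE_MM)
pixel_pitch_mm = image_height_mm / IMAGE_HEIGHT_PX

# Horizontal extent for the narrowest and widest images (450 and 559 pixels).
width_range_deg = [
    visual_angle_deg(px * pixel_pitch_mm, VIEWING_DISTANCE_MM) for px in (450, 559)
]
print(f"inferred pixel pitch: {pixel_pitch_mm:.3f} mm")
print(f"image widths: {width_range_deg[0]:.1f} to {width_range_deg[1]:.1f} degrees")
```

Because the pitch is back-derived from a value reported only approximately (∼20˚), the resulting widths should be read as estimates rather than measured stimulus sizes.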

Three portraits were scrambled, by converting each image into the Fourier domain, scrambling the phase of each frequency component, and then transforming back into image space (Prins, 2007). The scrambled portraits were interspersed at the beginning, middle, and end of the image sequence to provide a baseline for the cortical response to unstructured visual stimuli. Scrambled portraits were cued with the statement: "This is neither."
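The scrambling procedure described above can be sketched in a few lines. This is a minimal illustration using NumPy, not necessarily the routine cited in the text: it assumes a single-channel (grayscale) image, whereas the study used color reproductions, and it draws the random phases from the Fourier transform of a noise image so that the result stays real-valued. The function name `phase_scramble` and the stand-in image are hypothetical.

```python
import numpy as np


def phase_scramble(image: np.ndarray, seed: int = 0) -> np.ndarray:
    """Randomize the Fourier phase of a 2-D image while keeping its amplitude spectrum."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.fft2(image)
    # The phase of the transform of a real noise image is conjugate-symmetric,
    # so adding it keeps the scrambled spectrum that of a real-valued image.
    # Noise drawn from [0, 1) has a positive sum, so its DC phase is zero
    # and the mean intensity of the image is left unchanged.
    noise_phase = np.angle(np.fft.fft2(rng.random(image.shape)))
    scrambled = np.abs(spectrum) * np.exp(1j * (np.angle(spectrum) + noise_phase))
    return np.real(np.fft.ifft2(scrambled))


# Hypothetical stand-in for one grayscale portrait.
portrait = np.random.default_rng(1).random((64, 48))
scrambled = phase_scramble(portrait)
```

The scrambled image keeps the spatial-frequency content and mean intensity of the original but loses its recognizable structure, which is what makes it usable as a baseline for unstructured visual stimulation.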

Participants were not given specific viewing instructions and were told to view each image as they pleased. Each participant viewed a randomly ordered sequence of 53 images, 25 of which were cued as AUTHENTIC, 25 cued as COPY, and three scrambled images cued as NEITHER. The participant heard a statement – "This is authentic" or "This is a copy" – immediately prior to seeing an image of a Rembrandt or Rembrandt-like portrait (**Figure 1**). Audio recordings were used in place of visual text, to maintain identical visual stimulation between the two test conditions.

**FIGURE 1 | (A)** Sample of genuine Rembrandt portrait (REAL); **(B)** derivative portrait in the style of Rembrandt (FAKE); **(C)** sequence of auditory cues and image presentation; **(D)** brain activations in occipital and temporal cortex generated by presentation of portraits after subtraction of activations generated by scrambled images of portraits; data averaged across 14 participants, red regions show significant BOLD activations during period of image presentation (*Z* > 2.3, *p* < 0.05, corrected for multiple comparisons).

A 15-s viewing time was chosen to balance the interest of maximizing the number of trials against the provision of adequate time for a participant to examine a painting as he/she would normally do, outside of an experimental setting. Average viewing time for a work of art is reported to be 27 s with a median of 17 s (Smith and Smith, 2001). With a 15-s viewing/15 s rest paradigm, participants confirmed that they had adequate time to view each painting and take a break between trials.

Participants were divided into two groups. Group 1 (4M, 3F) viewed each painting under the opposite expectation to Group 2 (4M, 3F). For example, the same painting would be cued as AUTHENTIC to Group 1, and as COPY to Group 2. The sequence of paintings was kept constant but presented in reverse order for half of the participants, to control for any effects of lapsing attention, as participants grew more tired toward the end of the experiment.

Out of the 25 paintings cued as AUTHENTIC, 13 were true, authenticated Rembrandts (REAL), and 12 were not (FAKE). Out of the 25 paintings cued as COPY, 13 were truly copies (FAKE) and 12 were truly authentic (REAL). The relative numbers of REAL and FAKE works were reversed for the AUTHENTIC and COPY categories for Group 2.

From here on, the first term in upper case refers to the actual identity of the painting, and the second term refers to the experimental expectation given to each participant, such that a REAL-AUTHENTIC is both authentic in reality and cued as such; a FAKE-AUTHENTIC is a derivative work in reality, but is cued as AUTHENTIC. Omission of the first or second term signifies that the omitted variable is not specified, and both factors are included (e.g., AUTHENTIC includes both REAL-AUTHENTIC and FAKE-AUTHENTIC).
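The counterbalancing described above can be tabulated with a short sketch. The ordering of paintings in the lists is hypothetical; only the cell counts come from the text, and the counts stated for the AUTHENTIC and COPY categories are taken here to describe Group 1.

```python
from collections import Counter

# Actual status of the 50 portraits: 25 REAL and 25 FAKE (hypothetical ordering).
actual = ["REAL"] * 25 + ["FAKE"] * 25

# Group 1 cues: 13 REAL and 12 FAKE paintings cued AUTHENTIC, the rest cued COPY.
group1_cue = ["AUTHENTIC"] * 13 + ["COPY"] * 12 + ["AUTHENTIC"] * 12 + ["COPY"] * 13

# Group 2 hears the opposite cue for every painting.
flip = {"AUTHENTIC": "COPY", "COPY": "AUTHENTIC"}
group2_cue = [flip[cue] for cue in group1_cue]

# Cell labels follow the ACTUAL-CUED naming convention used in the text.
group1_cells = Counter(f"{a}-{c}" for a, c in zip(actual, group1_cue))
group2_cells = Counter(f"{a}-{c}" for a, c in zip(actual, group2_cue))
print(group1_cells)
print(group2_cells)
```

Across the two groups, every painting is seen equally often under each cue, so any AUTHENTIC versus COPY difference cannot be attributed to the particular images assigned to each cue.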

# **fMRI DATA COLLECTION AND ANALYSIS**

Scanning was performed on a Siemens Trio 3T scanner with a 12-channel head coil at the Oxford Centre for Clinical Magnetic Resonance Imaging (OCMR). Each functional scan consisted of 530 volumes collected in 1590 s (TR = 3 s, TE = 30 ms, voxel resolution = 2 mm × 2 mm × 2 mm). Whole brain volumes were acquired to enable analysis of sensory and cognitive regions. We also acquired for each subject a high-resolution whole head T1 anatomy scan (MPRAGE 1 mm × 1 mm × 1 mm voxels, 192 slices, TR = 15 ms, TE = 6.0 ms), optimized for gray- and white-matter separation. All analysis was carried out with the FMRIB Software Library (FSL; Smith et al., 2004; Woolrich et al., 2009).
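The scan length quoted above is consistent with the trial structure given earlier, as the following arithmetic sketch shows. It assumes that the audio cue and fixation cross fall within the 15-s rest period, which the text does not state explicitly.

```python
N_PORTRAITS = 50   # images cued AUTHENTIC or COPY
N_SCRAMBLED = 3    # images cued NEITHER
VIEW_S = 15        # viewing period per trial, in seconds
REST_S = 15        # rest period per trial, in seconds
TR_S = 3           # repetition time of the functional scan, in seconds

n_trials = N_PORTRAITS + N_SCRAMBLED
run_duration_s = n_trials * (VIEW_S + REST_S)
n_volumes = run_duration_s // TR_S
volumes_per_view = VIEW_S // TR_S

print(run_duration_s, n_volumes, volumes_per_view)  # 1590 530 5
```

Under this assumption, each 15-s viewing period is sampled by five functional volumes.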

Pre-statistical processing was applied as follows: skull and other non-brain voxels were removed using the brain extraction tool (BET; Jenkinson et al., 2002; Smith, 2002). Motion correction was applied with FMRIB's linear registration tool (MCFLIRT; Jenkinson et al., 2002). Data were spatially smoothed using a Gaussian kernel (full width at half maximum FWHM = 5 mm). Highpass temporal filtering removed low frequency noise and slow drift. Each voxel's time series was divided by its mean image intensity and converted to a percent signal modulation and the time series of voxels within each restricted visual area mask was averaged. Statistical analysis on voxel time series was carried out using FILM (FMRIB's improved linear model) with local autocorrelation correction (Woolrich et al., 2001). Low-resolution functional data, high-resolution T1-anatomy and standard space templates were co-registered using FLIRT. *Z*-statistics were thresholded for individual voxels at *Z* = 2.3, *p* = 0.01, with the cluster significance for multiple comparisons correction set to *p* = 0.05. Each participant's functional activity was registered to Montreal Neurological Institute (MNI) 152 standard space using FLIRT (Jenkinson and Smith, 2001). A general linear model (GLM) was applied to each participant's data using FMRIB's fMRI expert analysis tool (FEAT).
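Two of the quantities in this pipeline have simple closed forms. The sketch below is an illustration of the arithmetic only, not the FSL implementation: it converts the 5 mm FWHM of the smoothing kernel into the standard deviation of the equivalent Gaussian, and expresses a time series as percent deviation from its mean. The helper names and the toy intensities are hypothetical.

```python
import math
from statistics import fmean


def fwhm_to_sigma(fwhm_mm: float) -> float:
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def percent_signal(timeseries: list[float]) -> list[float]:
    """Express each sample as a percent deviation from the time-series mean."""
    mean = fmean(timeseries)
    return [100.0 * (value - mean) / mean for value in timeseries]


sigma_mm = fwhm_to_sigma(5.0)                  # 5 mm FWHM kernel from the text
toy_voxel = [1000.0, 1010.0, 990.0, 1000.0]    # hypothetical raw intensities
print(round(sigma_mm, 3), percent_signal(toy_voxel))
```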

A single group average analysis was carried out with FEAT, which conducts a *t*-test independently for each voxel, and converts the resulting *t*-statistic to a *Z*-score thresholded at 2.3 (*p* < 0.01). Gaussian Random Field Theory was used to correct for multiple comparisons; the *Z*-statistic threshold was set to *Z* = 2.3 with a cluster significance threshold of *p* = 0.05.
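The correspondence between the voxel-wise threshold *Z* = 2.3 and *p* ≈ 0.01 can be checked with the standard normal distribution. The sketch below assumes a one-tailed test and covers only the voxel-level threshold; the cluster-level correction based on Gaussian Random Field Theory is not reproduced here.

```python
from statistics import NormalDist

Z_THRESHOLD = 2.3

# One-tailed probability of exceeding the threshold under a standard normal.
p_one_tailed = 1.0 - NormalDist().cdf(Z_THRESHOLD)
print(f"Z > {Z_THRESHOLD} corresponds to p = {p_one_tailed:.4f} (one-tailed)")
```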

Regions showing significant activation in these contrasts were further characterized by calculating the mean %signal change over a given region of interest (ROI) for each participant using a FEATquery. Masks were created to define each ROI as a sphere centered upon the peak voxel in the group analysis. A 5-mm radius was chosen by convention for 2 mm voxel-size data. A *t*-test was applied to the group mean %signal change and group SD. Note that the GLM analysis and the %signal change both take into account the between-participant variance, but only GLM analysis also includes within-participant variance.
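The ROI construction and the group-level test described above can be sketched as follows. This is an illustrative approximation, not the FEATquery procedure: it assumes a voxel belongs to the sphere when its center lies within 5 mm of the center of the peak voxel, and it uses a one-sample *t* statistic against zero. The per-participant values are invented for the example and are not study data.

```python
import math
from itertools import product
from statistics import fmean, stdev

VOXEL_MM = 2.0
RADIUS_MM = 5.0


def sphere_offsets(radius_mm: float, voxel_mm: float) -> list[tuple[int, int, int]]:
    """Voxel offsets whose centers lie within radius_mm of the peak voxel center."""
    reach = int(radius_mm // voxel_mm)
    span = range(-reach, reach + 1)
    return [
        (i, j, k)
        for i, j, k in product(span, repeat=3)
        if voxel_mm * math.sqrt(i * i + j * j + k * k) <= radius_mm
    ]


def one_sample_t(values: list[float]) -> float:
    """t statistic for the null hypothesis that the group mean is zero."""
    return fmean(values) / (stdev(values) / math.sqrt(len(values)))


roi = sphere_offsets(RADIUS_MM, VOXEL_MM)
# Hypothetical per-participant mean % signal change within the ROI.
toy_group = [0.10, 0.18, 0.12, 0.20, 0.09, 0.15]
print(len(roi), round(one_sample_t(toy_group), 2))
```

Under the stated inclusion rule, the sphere contains 81 voxels of 2 mm; a different rule for voxels on the boundary would change that count slightly.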

Resulting areas of activation were identified anatomically with the aid of the Harvard-Oxford Cortical Atlas and by comparing MNI coordinates to those found in previous literature. For visualization, brain activations were superimposed on the pial surface of the average brain created in Freesurfer.

A psycho-physiological interaction (PPI) analysis (Friston et al., 1997) investigates functional connectivity by looking for a difference in the correlation of activity between two areas during one psychological condition compared with another condition. Due to the interest in top-down modulation, the FPC, which differentially responded to AUTHENTIC vs. COPY (**Figure 3**), was chosen as the seed ROI. A right FPC mask thresholded at *Z* = 2 was made from the group level analysis of COPY–AUTHENTIC. This standard space mask was then transformed back into the functional space of each participant using FLIRT (Jenkinson and Smith, 2001). Each participant's resulting mask was then used to identify the peak voxel that will serve as the functional ROI seed for that participant's PPI. Selecting each participant's seed individually accounts for anatomical heterogeneity across participants, and allows for reduced seed ROI size, thereby improving the signal.

Three explanatory variables were employed for the PPI: (1) the psychological regressor, corresponding to the COPY–AUTHENTIC condition, (2) the physiological regressor, the timecourse of the seed ROI, and (3) the PPI regressor, the interaction between the psychological and physiological regressors. The PPI for each subject was then consolidated into a single average group analysis, following the same specifications as previously described.
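The three explanatory variables listed above can be assembled as in the following simplified sketch. It assumes the psychological regressor is coded +1 for COPY and −1 for AUTHENTIC and that the seed time course is mean-centered before the product is taken; convolution with a hemodynamic response function, which a full analysis would include, is omitted. The labels and signal values are hypothetical.

```python
from statistics import fmean

# Hypothetical condition labels and seed-ROI signal, one value per time point.
conditions = ["COPY", "COPY", "AUTHENTIC", "AUTHENTIC", "COPY", "AUTHENTIC"]
seed_signal = [0.8, 1.1, 0.2, -0.1, 0.9, 0.0]

# (1) Psychological regressor: COPY minus AUTHENTIC contrast.
psych = [1.0 if c == "COPY" else -1.0 for c in conditions]

# (2) Physiological regressor: mean-centered time course of the seed ROI.
seed_mean = fmean(seed_signal)
phys = [s - seed_mean for s in seed_signal]

# (3) PPI regressor: element-wise product of the two regressors above.
ppi = [a * b for a, b in zip(psych, phys)]

# One row per time point, one column per explanatory variable.
design_matrix = list(zip(psych, phys, ppi))
```

A target voxel that loads on the third column, over and above the first two, is one whose coupling with the seed differs between the COPY and AUTHENTIC conditions.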

Laterality indices were calculated using the LI-toolbox (Wilke and Lidzba, 2007) run in SPM8. A weighted-bootstrapping method of LI calculation was employed using the frontal lobe standard LI-toolbox template (Wilke and Schmithorst, 2006) during COPY trials relative to AUTHENTIC. The LI formula used was LI = (L − R)/(L + R), therefore resulting in positive values for left and negative for right lateralization. Apart from the calculation of laterality index, all the main analyses reported in this paper were conducted with and without the inclusion of the left-handed subjects in the data analysis. No substantially different conclusions were reached.
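The LI formula quoted above is simple to compute directly. The sketch below shows only the basic ratio with invented activation measures; the weighted-bootstrapping procedure of the LI-toolbox is not reproduced.

```python
def laterality_index(left: float, right: float) -> float:
    """LI = (L - R) / (L + R): positive for left, negative for right lateralization."""
    total = left + right
    if total == 0:
        raise ValueError("LI is undefined when both hemispheres show zero activation")
    return (left - right) / total


# Hypothetical activation measures (for example, suprathreshold voxel counts).
print(laterality_index(left=40, right=120))   # -0.5, right-lateralized
print(laterality_index(left=90, right=30))    # 0.5, left-lateralized
```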

# **RESULTS**

#### **RESPONSE TO THE VISUAL CONTENT OF THE ARTWORK**

**Figure 1** presents examples of a genuine Rembrandt portrait (**Figure 1A**, REAL) and a derivative or fake (**Figure 1B**, FAKE), as well as the presentation sequence for the auditory cue about authenticity (either AUTHENTIC or COPY) and the visual image (**Figure 1C**). To check for an appropriate response to the artworks, the activation to either AUTHENTIC or COPY relative to scrambled portraits was measured. The activity, pooled across the 14 participants, yielded a characteristic, three-blob chain, reflecting activity in lateral occipital complex (LOC), occipital fusiform gyrus, and temporal occipital fusiform cortex, based on Harvard-Oxford Cortical Atlas (**Figure 1D**, RED regions, *Z* > 2.3, corrected for multiple comparisons). Both stimulus types elicited similar patterns of activity. The area on the fusiform gyrus selectively responds to images of faces (Kanwisher et al., 1997), and the LOC and temporal fusiform cortex have been found to be involved in object recognition (Grill-Spector et al., 1999; Ishai et al., 2000).

# **ASSIGNMENT AS COPY OR AUTHENTIC GIVES SPECIFIC ACTIVATIONS IN NON-VISUAL AREAS**

The occipito-temporal areas of the cerebral cortex are known to be visual and these areas provided responses specific to the structure of the portrait images. Therefore, we also examined these areas to test whether they were differentially activated in response to assignment of authenticity, but nothing significant emerged. However, other brain regions, outside these areas, did have differences in response to assignment of authenticity.

Interestingly, the more distinct differences were in favor of greater activations during the COPY condition (**Figure 2A**, RED regions, *Z* > 2.3, cluster-corrected for multiple comparisons). Across the entire frontal lobe, there was a bias toward greater activation to COPY assignment in the right hemisphere in right-handed subjects. This was evident in the frontopolar cortex [FPC; right hemisphere: Montreal Neurological Institute (MNI) coordinates (32, 58, 0) mm; % signal change 0.14; *p* < 0.001, Bonferroni corrected], the middle frontal gyrus [MNI coordinates (44, 18, 38) mm; % signal change 0.11; *p* < 0.01, Bonferroni corrected] and the posterior precuneus [MNI coordinates (4, −66, 36) mm; % signal change 0.15; *p* < 0.01, Bonferroni corrected].

The degree of bias toward right hemisphere activation in right-handed subjects varied between individuals. Furthermore, examination of the responses in FPC for the two left-handed individuals in the study showed the largest bias in favor of the left frontal lobe. A formal test using the Laterality Index (Wilke and Lidzba, 2007) confirmed that there was a significant laterality shift toward right hemisphere activation of the frontal lobe in right-handed participants (Bootstrap method; Wilke and Schmithorst, 2006, *p* < 0.05; two-tailed; **Figure 2B**). More data using this paradigm with more left-handed subjects would be needed to determine whether their frontopolar activations are lateralized in the left hemisphere.

There was also a modest increase in the activation of medial orbitofrontal cortex during the AUTHENTIC condition (**Figure 2A**; BLUE regions, *Z* > 2.3, uncorrected for multiple comparisons) [Right hemisphere: MNI coordinates (4, 36, −22) mm; % signal change 0.39, *p* > 0.05 NS. Left hemisphere: MNI coordinates (−12, 42, −16) mm; % signal change 0.09, *p* < 0.01, Bonferroni corrected]. Inclusion of activations from all participants (right and left-handed) slightly improves the statistical power and confirms that the orbitofrontal activations appear to be bilateral.

**FIGURE 2 | (A)** Activation (*Z* > 2.3, corrected) to the assignment of authenticity (AUTHENTIC vs. COPY). Upper: frontal, lateral, and medial views of right cortical hemispheres of right-handed participants, greater activation to COPY (red) in right frontopolar cortex (FPC; signal change 0.14%, *p* < 0.001, Bonferroni corrected); greater activation to COPY (red) in middle frontal gyrus (signal change 0.11%, *p* < 0.01, Bonferroni corrected); and greater activation to COPY (red) in right posterior precuneus (signal change 0.15%, *p* < 0.01, Bonferroni corrected). Lower: medial views of left and right cortex; greater activation to AUTHENTIC (blue) in medial orbitofrontal cortex: left, signal change 0.39%, *p* > 0.05; right, signal change 0.09%, *p* < 0.01, Bonferroni corrected; and greater activation to COPY (red) in right posterior precuneus. **(B)** Distribution of lateralization index (+1 right-sided, 0 balanced, −1 left-sided) for FPC activation for right-handed (BLUE) and left-handed (RED) participants.

#### **PSYCHO-PHYSIOLOGICAL INTERACTION WITH THE OCCIPITAL CORTEX**

Although there was no difference in the response of visual cortex to COPY and AUTHENTIC assignments, further analysis indicated a highly specific link between the signals in FPC and those in the visual cortex. PPI analysis aims to investigate changes in the interaction between two brain areas under different psychological conditions (Friston et al., 1997). Since FPC showed activation to COPY > AUTHENTIC, this region is a potential source of top-down modulation of visual responses.

**Figure 3A** plots the responses of a voxel in the occipital cortex as a function of the responses of the peak-responding voxel in right FPC, separately for the COPY and AUTHENTIC assignments, for a single participant. The COPY condition induces a stronger correlation between the signals in the two brain regions. These correlated signals are evident in the group PPI for right-handed participants (**Figure 3B**), which showed significantly higher correlation of activity between the FPC ROI seed and several visual areas, including the LOC bilaterally (*Z* = 2.3, *p* < 0.01; corrected for multiple comparisons). A PPI is also found in visual cortex in association with activation of right precuneus although the extent of activation is greater in the left hemisphere (**Figure 3B**).

The PPI is identified within the context of a regression model with main effects (activation of each cortical area independently) and an interaction term (co-activation of two cortical areas, assessed by the correlation between activated voxels in the two areas), in which a statistical test is applied to test whether the co-activation is changed by the psychological condition (in our case, the assignment of authenticity). As such, the analysis cannot identify the direction of causality between the two cortical areas. We examined the time-course of activations in FPC and occipital cortex for evidence that the activations in one region preceded another in time. There was no clear evidence for such an effect, although the unambiguous identification of such differences in timing is often difficult (Smith et al., 2011).

Overall then, for our results, a significant statistical outcome is consistent with a top-down signal from FPC to occipital regions or a redirection of outputs from occipital regions away from other cortical regions toward FPC, with no net change in the activation of the occipital regions. A third, and in some ways most likely, possibility is that there is an increase in the cortical signaling that is passing in both directions between the two cortical regions (FPC and occipital).

#### **DISTINCTION OF REAL AND FAKE ARTWORKS**

Finally, we examined whether there were any differences in the cortical responses to the REAL and FAKE Rembrandts, regardless of the assignment of authenticity. The only small difference we found was confined to the neighborhood of the calcarine sulcus, which we attribute to small mismatches of contrast or visual symmetry between the chosen image sets (see Appendix, **Figure A1**).

# **DISCUSSION**

Viewing of portrait art elicited the predicted activation in lateral visual cortical areas, corresponding to regions sensitive to faces and object recognition. However, these areas were not differentially activated by the cue of authenticity. Other areas were significantly activated by the assignment of authenticity, including the right FPC, right middle frontal gyrus, right precuneus, and orbitofrontal cortex.

The COPY assignment resulted in the stronger activations, in FPC and right posterior precuneus. To understand these outcomes, it is important to recall that, in response to the cue that the artwork was a COPY, many participants reported that they were actively engaged in trying to detect the flaws in the presented image to gain confirmatory evidence in favor of the assignment. Participants also reported that their working hypothesis about what distinguished genuine and derivative works shifted over time as they viewed more images.

Activation of FPC (Brodmann area 10) has been obtained previously in studies that require information to be held in working memory: what is similar and relevant to our current study is that these are tasks in which multiple goals and hypotheses are being evaluated at the same time (Koechlin and Hyafil, 2007). Given that art experts combine multiple sources of information to make judgments about authenticity, the activations of FPC observed in this study are consistent with the idea that our participants are actively building hypotheses about the visual content of the images to determine which are genuine and which are not. The right middle frontal gyrus is often activated in working memory tasks, particularly those in which participants are processing spatial information (Leung et al., 2002).

The precuneus has been associated with many higher cognitive functions, including consciousness, aspects of memory and the experience of agency (Cavanna and Trimble, 2006), but it has been argued recently that this diversity of results may partly reflect a failure to identify clearly the functional compartments of this cortical region (Margulies et al., 2009). The activation seen in this study is clearly lateralized to the right and within the posterior zone closely connected with visual cortical areas. Given that participants reported that they were actively engaged in hypothesis-seeking about visual images, our findings are consistent with the proposal that the posterior precuneus forms part of a network with other cortical areas that are more purely visual (Dejong et al., 1994) in their responsiveness.

Both the FPC and precuneus show a PPI with regions of the occipital lobe: the distribution of occipital cortical areas identified in the PPI (**Figure 3B**) is similar to the cortical activations generated by the paintings themselves (**Figure 1**). This result is particularly interesting as it suggests a greater functional interaction between an executive function in FPC and sensory signals in LOC, when images are cued as COPY. The only region activated by assignment of AUTHENTIC was the orbitofrontal cortex, which has been associated with reward and monetary gain (Gottfried et al., 2003; Gold and Shadlen, 2007; Padoa-Schioppa and Assad, 2008), presumably reflecting the increase in the perceived value of the artwork. Since this result is strongly expected based on current hypotheses about the function of orbitofrontal cortex, the presence of this significant activation gives important support to our experimental paradigm. Since the assignment of a portrait as AUTHENTIC enhances the perceived value in the eye of the viewer, we can conclude that the method of delivering advice to the participants in this study was effective and relevant.

The design of our study needed to balance the experience of free viewing typical during art appreciation and the detailed level of experimental control achieved in many neuroscientific studies. Viewing computer presentations of artwork whilst lying supine in the brain scanner is vastly different from viewing artwork in a gallery space, or even looking at reproductions in books in normal contexts. Equally, there was no specific task in relation to each image presentation and eye movements could roam freely across the image. These features allow some license in the cognitive strategies that individual participants might employ during this experiment. Nonetheless, the focus on artwork of a particular style in combination with a clear methodological question has yielded some striking results.

The brain areas, which we find are activated by assignment of authenticity, emphasize the cognitive element of viewing artwork. Authenticity is just one component of the viewer's experience during the appreciation of a work of art. Even so, manipulation of this individual element subtly modulates the interaction between multiple brain regions of the participants. It may be said that one of the tasks of a writer on art is to achieve such a modulation in the viewers' responses. Additionally, the production of new artwork is generally held to initiate a transaction between the artist and the prospective viewers, within a social and cultural context that is often fuzzy and soft-edged – and therefore very susceptible to direction. The brain interactions that we have identified form part of the way that humans respond in this social and cognitive setting.

# **AUTHOR CONTRIBUTIONS**

The study was designed by Mengfei Huang, Martin J. Kemp, and Andrew J. Parker. Mengfei Huang implemented the design, collected initial results and conducted analysis under the supervision of Holly Bridge and Andrew J. Parker. Mengfei Huang wrote the initial working draft of the paper. Holly Bridge acquired data from more participants and undertook further analysis. Holly Bridge and Andrew J. Parker completed writing of the paper and all authors contributed to design and presentation of figures.

#### **ACKNOWLEDGMENTS**

This work was supported by grants from the Wellcome Trust and James S. McDonnell Foundation to Andrew J. Parker, by a Royal Society University Research Fellowship to Holly Bridge and a Fulbright award to Mengfei Huang.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 April 2011; accepted: 24 October 2011; published online: 28 November 2011.*

*Citation: Huang M, Bridge H, Kemp MJ and Parker AJ (2011) Human cortical activity evoked by the assignment of authenticity when viewing works of art. Front. Hum. Neurosci. 5:134. doi: 10.3389/fnhum.2011.00134*

*Copyright © 2011 Huang, Bridge, Kemp and Parker. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **APPENDIX**

#### **BRAIN RESPONSES TO REAL AND FAKE REMBRANDT PORTRAITS**

The focus of the main study was the brain's response to assignment of authenticity rather than authenticity of the artworks depicted by the images. Nonetheless, the design of the experimental protocol makes it possible to use the same data set to determine whether the brain responds differently to images of REAL and FAKE Rembrandt portraits. The analysis methodology was identical to that in the main paper, except that in this case the contrast visualREAL > visualFAKE is of primary interest. The outcome is shown in **Figure A1**. This shows that all significant differences in REAL and FAKE portraits are confined to early visual areas in the close neighborhood of the calcarine sulcus (marked CS in Figure). The pattern of response is such that the REAL images generate stronger responses in left visual cortex and FAKE images generate stronger responses in right visual cortex. Almost certainly, this does not indicate a lateralized brain response to REAL and FAKE images. The most probable cause is a small mismatch in the distribution of small image features or contrast in the two sets of images, which our selection procedure had failed to eliminate. Although the images were carefully chosen to balance the numbers of left and right three-quarter views, it is nonetheless all too easy for some small bias of this kind to remain. This interpretation is consistent with the absence of activations anywhere else in the brain and most particularly, the absence of activations for visualREAL > visualFAKE that correspond in cortical location to the activations to visualAUTHENTIC > visualCOPY presented in the main paper. We also examined the data set for interactions between the AUTHENTIC > COPY and REAL > FAKE contrasts but nothing additional emerged.

#### **SCREENING QUESTIONNAIRE**

The following questionnaire was applied to all potential participants before recruiting them to ensure that all participants were amateur viewers (visit art museums 5 times/year; no extensive art training). The process also screened out individuals who dislike Rembrandt, to avoid potentially confounding variables. Most importantly, it confirmed their familiarity with Rembrandt's fame among artists, with all participants ranking him among the top 25 artists of all time, or higher.

	- a. 0 times/year
	- b. 1–4 times/year
	- c. 5–10 times/year
	- d. 11–15 times/year
	- e. 16–20 times/year
	- 1. Never seen any of his work
	- 2. Somewhat familiar
	- 3. Familiar

	- 1 = strongly dislike
	- 2 = somewhat dislike
	- 3 = neutral
	- 4 = somewhat like
	- 5 = strongly like
	- a. the top 5 most famous artists of all times
	- b. the top 10
	- c. the top 25

#### **POST-SCAN QUESTIONNAIRE**

The following questionnaire was applied to all participants immediately after their session in the MRI scanner. The verbal responses were used to check that participants had remained alert and active during the scan and to highlight subjective aspects of the participants' experiences.


# **THE RIDDLE OF STYLE CHANGES IN THE VISUAL ARTS AFTER INTERFERENCE WITH THE RIGHT BRAIN**

**Olaf Blanke and Isabella Pasqualini**


*Olaf Blanke<sup>1,2</sup>\* and Isabella Pasqualini<sup>1,2</sup>\**

*<sup>1</sup> Laboratory of Cognitive Neuroscience, Brain Mind Institute, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland*

*<sup>2</sup> Atelier de la conception de l'espace, Institute of Architecture and the City, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Stefan Pollmann, Otto von Guericke University, Germany; Lutz Jäncke, University of Zurich, Switzerland*

#### *\*Correspondence:*

*Olaf Blanke, Laboratory of Cognitive Neuroscience, Brain-Mind Institute, Ecole Polytechnique Fédérale de Lausanne, Station 19, CH-1015 Lausanne, Switzerland. e-mail: olaf.blanke@epfl.ch; Isabella Pasqualini, Atelier de la conception de l'espace, Institute of Architecture and the City, Ecole Polytechnique Fédérale de Lausanne, Station 16, CH-1015 Lausanne, Switzerland. e-mail: isabella.pasqualini@epfl.ch*

We here analyze the paintings and films of several visual artists who suffered from a well-defined neuropsychological deficit, visuo-spatial hemineglect, following vascular stroke to the right brain. In our analysis we focus in particular on the oeuvre of Lovis Corinth and Luchino Visconti, as both major artists continued to be highly productive over many years after their right brain damage. We analyze their post-stroke paintings and films, indicate several aspects that differ from their pre-stroke work (omissions, use of color, perseveration, deformation), and propose that, although the two artists come from different times, countries, genres, and styles, their post-stroke oeuvre reveals important similarities in style. We argue that these changes may be associated with visuo-spatial hemineglect and the right brain. We discuss future avenues of how the neuropsychological investigation of visual artists with and without neglect may allow us to investigate the relationship between brain and art.

**Keywords: visual arts, painting, film, neuropsychology, neurology, Lovis Corinth, Luchino Visconti**

# **INTRODUCTION**

What is visual art? What are paintings? What are films? Innumerable answers have been proposed to these questions. During the last century, despite a notable increase in such endeavors, general agreement on the adequacy of the questions posed or the answers provided has not been attained. Hence, in this text we suggest a different line of questions along with some preliminary answers that highlight the starting point for further investigations on the relationship between the visual arts and the brain.

In the present study we analyze the artworks of several visual artists, who suffered from a well-defined neuropsychological deficit –visuo-spatial hemineglect– following damage to the right brain. What can we learn about art using this approach? What is new in the present approach? And what could this analysis tell us about the relationship between visual arts and the brain? Several authors have applied principles from psychoanalysis (Kris, 1952), Gestalt psychology (Arnheim, 1954), as well as cognitive psychology and neuroscience (Rentschler, 1988; Zeki, 2000) to the visual arts. We here explore how the neuropsychological investigation of visual artists allows deeper understanding of the relationship between brain and art and argue that this approach has several advantages with respect to previous work on art and the brain (Blanke and Ortigue, 2011).

Over the last 150 years neuropsychological studies have led to the description of many important mechanisms of human brain functions such as language, visual and spatial perception, recognition, memory, and motor execution. Yet, these insights have not been applied systematically to the understanding of art, although some investigations on art were carried out by neurologists and neuropsychologists with an amateur interest in the arts such as Bonvicini (1926), Alajouanine (1948), Jung (1974), Gardner (1975), and more recently Vigouroux (1997), Zaidel (2006), and Blanke and Ortigue (2011). We agree with Zaidel (2006) that the detailed study of painters with brain damage and its effect on the painters' art is probably the richest and most direct source for the elucidation of the relation between brain and art (as compared to insights based on psychological approaches or neuroscientific approaches, at least at the moment). We also note that, despite the accumulation of several neuropsychological observations in painters with different neuropsychological symptoms due to stroke over the last 100 years, this has not sparked much interest in art history or criticism (but see Cela-Conde et al., 2011; Nadal and Pearce, 2011). We predict that this will change in the future, as it already has in the field of philosophy and the social sciences.

The cited neurologists and neuropsychologists have not only analyzed paintings but have also described the effects of brain damage on music and poetry. Whereas these analyses revealed important differences in art making following brain damage in different artistic genres (see the devastating effects of a left hemisphere stroke in Baudelaire and Debussy, for example; Alajouanine, 1948), they have not allowed a more detailed description of the effects of brain damage on the visual arts. Here we have followed and hope to have extended the approach initiated by art connoisseur, painter, and neurologist Richard Jung. He focused on one specific art genre, painting, and studied paintings and drawings only in painters suffering from a clearly defined type of disease and brain damage: stroke to the right hemisphere, associated with the neuropsychological symptom of visuo-spatial hemineglect (visuo-spatial HN; Jung, 1974, 1975; see also Gardner, 1975). Herein we aim to extend the Jungian approach by analyzing the neuropsychology and artwork of two major categories in the visual arts: painting (Lovis Corinth, Anton Räderscheidt, and Huguette Bouchardy-Rey) and film (Federico Fellini and Luchino Visconti). These artists suffered from stroke affecting the right brain associated with visuo-spatial HN. We are well aware that our approach is highly selective (only the visual arts, only some genres from the visual arts, only left-sided spatial HN, only right-hemisphere damage, only neuropsychological analysis), yet we believe that such a concentration is a necessary constraint, at least at the beginning, in order to develop approximate directions and empirical guidelines for this new discipline between neuropsychology and art theory.

# **VISUO-SPATIAL HEMINEGLECT**

Before detailing the effects of visuo-spatial HN in painting and film, we briefly introduce neurological, neuropsychological, and graphical signs of visuo-spatial HN. Hemineglect is a common neuropsychological condition following right posterior brain damage and is an attentional disorder characterized by disregard of sensory, imagined, or action information in the part of space to the left of the midline (Robertson and Halligan, 2001; Blanke and Lenggenhager, 2007). Thus, visuo-spatial HN is characterized by an attention deficit that mostly (but not only) concerns those parts of space that are contralateral with respect to brain damage. Typically HN is associated with right brain damage (leading to left-sided HN), particularly damage in right parietal and/or superior temporal cortex (Halligan and Marshall, 2001; **Figure 1A**), but also to premotor cortex (Verdon and Vuilleumier, 2010). HN is often associated with left-sided somatosensory and motor deficits that may affect arm and leg. In some cases of extensive and more posterior brain damage, HN may also be associated with left-sided hemianopia or left lower quadranopia. Language and memory deficits only rarely occur jointly with HN. Although HN is often associated with left-sided sensorimotor and visual deficits, the attentional deficit is classically independent of sensory or motor deficits. HN patients may thus behave as if the contralesional space did not exist, even if their brain mechanisms for left-sided perception and action are intact (for further discussion with respect to the visual arts see below). In addition, many HN patients are not aware of their left-sided attentional and sensorimotor deficits (i.e. anosognosia), although patients with visuo-spatial HN with and without anosognosia have an impaired ability to orient and react toward objects in contralateral space.

Hemineglect is a complex condition (Kerkhoff, 2001) that may affect different sensory modalities and cognitive functions to different degrees. In brief, neuropsychology distinguishes three major forms of HN (sensory, motor, and representational HN), and these different forms have been associated with damage to different brain regions (Verdon and Vuilleumier, 2010). A deficit in reacting to stimuli such as visual, auditory, or tactile cues is called sensory HN (Kerkhoff, 2001). Motor HN is characterized by lessened spontaneous movements (including eye movements) and exploration in the contralateral direction (Kerkhoff, 2001). HN may also be present when patients imagine spatial scenes, even in the apparent absence of sensory or motor HN. This form of HN is called representational neglect and was described initially when imagining public spaces such as the Piazza del Duomo in Milan (Bisiach and Luzzatti, 1978; Ortigue et al., 2006). Of further relevance for the present study is the differentiation between object- and space-centered HN. The latter designates a left-sided HN with respect to the horizontal (mid-sagittal) body axis of the patient (omission of items on the left side of a drawing that is positioned in front of the patient). Object-centered HN is characterized by omissions on the left side of a drawn or perceived object that can be positioned on the right or the left side of the paper (**Figure 1B**).

In order to diagnose visuo-spatial HN, neuropsychologists use several standardized paper and pencil tests that allow the quick and precise detection and quantification of perceptual and graphical elements of HN. We here briefly describe several such tests of relevance for understanding visual art works by artists with visuo-spatial HN. During clinical routine testing several other standardized tests, including computer-based tests, are used to reveal the intensity and the different aspects of HN (see Robertson and Halligan, 2001). In the clock test the patient is asked to draw a clock from memory or to fill in the clock's numbers into a preexisting circle. HN patients often draw only the right side of the clock, omit the digits on the left side (7–12) or draw all digits from 1 to 12 into the right half of the circle (**Figure 1C**). Often a spatial deformation of the drawn clock or perseverations (of clock elements), as well as an altered orientation of the digits, may be observed. Copying tests consist of the copying of a complex abstract geometric figure, a daisy, a clock, or the drawing of a small landscape with a house and trees (**Figure 1D**). HN will be perceived in the form of omissions on the left side of the sheet and of object-centered neglect (missing details or deformations on the left of each individual object). Further graphical signs such as loss of perspective, general simplification, changed ductus, repetitions, or perseverations have also been observed (Blanke and Ortigue, 2011). In the line bisection task the patient is asked to indicate the middle of several horizontal lines shown on a piece of paper. Patients who neglect the left side of the line systematically indicate the middle of the line too far on the right side (**Figure 1E**). This behavior can be observed particularly with long lines or with lines that are on the left, contralesional side of the sheet. In cancellation tests, such as the "Letter cancelation test" or the "Bell cancelation test," the patient is shown a sheet on which several letters or bells have been drawn. The patient is asked to find all specific symbols (i.e., the bells) on the whole sheet and to mark them. Patients with left-sided visuo-spatial HN typically omit targets on the left side of the sheet. Below we have focused on visual signs that can be seen in the artworks of selected visual artists with left-sided HN. Like previous authors, we have extended the use of clinical tools to the study of artworks and searched for HN signs in paintings, drawings, and films. How is HN characterized in these different art forms? Do painters and filmmakers continue to make art? Is their art changed? What do these changes look like? As we will see, quite a few studies about paintings and painters have already been carried out, but almost no work exists about the effects of HN on film.
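The quantification mentioned for the line bisection and cancellation tests can be illustrated with a minimal sketch. It is not the scoring procedure of any standardized test or of the studies cited here; the two indices, the class `BisectionTrial`, and the functions `bisection_deviation_percent` and `cancellation_asymmetry` are hypothetical names chosen for illustration, and the numbers in the usage lines are invented.

```python
from dataclasses import dataclass


@dataclass
class BisectionTrial:
    """One line-bisection trial; positions in mm from the left edge of the sheet."""
    line_start: float  # left end of the printed line
    line_end: float    # right end of the printed line
    mark: float        # where the patient marked the perceived middle


def bisection_deviation_percent(trial: BisectionTrial) -> float:
    """Signed deviation of the mark from the true midpoint,
    as a percentage of half the line length (positive = rightward)."""
    length = trial.line_end - trial.line_start
    if length <= 0:
        raise ValueError("line_end must lie to the right of line_start")
    midpoint = trial.line_start + length / 2.0
    return 100.0 * (trial.mark - midpoint) / (length / 2.0)


def cancellation_asymmetry(hits_left: int, targets_left: int,
                           hits_right: int, targets_right: int) -> float:
    """Detection rate on the right half of the sheet minus the rate on the
    left half (positive = more omissions on the left)."""
    if targets_left <= 0 or targets_right <= 0:
        raise ValueError("each half of the sheet needs at least one target")
    return hits_right / targets_right - hits_left / targets_left


if __name__ == "__main__":
    # Hypothetical patient: mark placed 30 mm right of the centre of a 200 mm line.
    trial = BisectionTrial(line_start=0.0, line_end=200.0, mark=130.0)
    print(bisection_deviation_percent(trial))  # 30.0
    # Hypothetical sheet: 5 of 20 left targets and 18 of 20 right targets marked.
    print(cancellation_asymmetry(5, 20, 18, 20))
```

Under these assumptions, a rightward bisection error and a positive cancellation asymmetry would both point in the direction expected for left-sided visuo-spatial HN; any cut-off for calling a score abnormal would have to come from the normative data of the specific test used.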

# **PAINTING**

Over the last century many painters have been described who suffered from visuo-spatial HN due to right hemispheric brain damage. Among them were Lovis Corinth and Otto Dix (Jung, 1974, 1975), Bruno Alder (Schnider et al., 1993), Huguette Bouchardy-Rey (Blanke et al., 2003), Pierre Ambrogiani (Vigouroux et al., 1990), Anton Räderscheidt (Jung, 1974), and many others (Bänzer and Hennerici, 2006; Blanke and Ortigue, 2011). We have here focused on the work of Corinth and further discuss a few works by Räderscheidt and Bouchardy-Rey. As we will see below the analysis of the work by these three artists who worked during different periods of the nineteenth, twentieth and twenty-first century will allow us to highlight some converging post-stroke style changes, despite prevalent differences before the stroke.

### **LOVIS CORINTH**

By 1901 Lovis Corinth (1858–1925) was one of the most eminent German painters. Art historical labels, however, are not easily applied to Corinth's works (Kuhn, 1925; Schröder, 1992; Blanke and Ortigue, 2011). He painted naturalistic portraits, slaughterhouse scenes as well as interiors, still lifes and landscapes that link him to Impressionism. At the same time he produced history paintings illustrating biblical and mythological scenes upon which his reputation in the last decade of the nineteenth century was based. Although he rejected Expressionism in principle, most of Corinth's later works place him among the Expressionists (Schröder, 1992). His work thus defies easy categorization. Two major periods have preoccupied art historians: Corinth's mature, "impressionistic" period or style (1900–1911) and his late "expressionistic" style (1912–1925) starting with his right hemispheric stroke in December 1911.

Art critic Alfred Kuhn defined Corinth's paintings carried out after 1911 as the "Altersstil [or late style] of the painter." He commented that "the preponderance of the plastic and corporeal starts to disappear progressively" and "an essentially plane-like painterly seeing was appearing" (**Figure 2A**). Schröder (1992) describes Corinth's work as paintings in which "the balance between horizontal and vertical seems disturbed," as "unstable" and "tilting" paintings (**Figure 2B**). This is opposed to Corinth's mature style, which was characterized by high levels of corporeality and the richly nuanced flesh of the human body in the figure and history paintings (**Figures 2C,D**) that made Corinth famous. Depth and spatial relations were seen as the "Bravourelements" of his art (Kuhn, 1925; Schröder, 1992). Kuhn (1925) explicitly mentions Corinth's disease as an important factor in this artistic change (as does Uhr, 1990), whereas others disapprove of disease-related medical accounts (Osten, 1955), all agreeing that Corinth's late style begins in early 1912.

In December 1911, Corinth suffered a stroke and was immediately hospitalized although there seem to be no medical records. Corinth suffered from left-sided motor deficits of arm and leg and probably left-sided lower visual field deficits. After hospital discharge, he was able to walk a few steps, but only when supported by his wife and a cane. Although the right-handed Corinth already drew on his hospital bed, for several months he was not able to hold either palette or brushes in his paretic left hand. He recovered progressively and in 1914 his son noted that his father was again able to swim and walk without help. Based on these symptoms and several drawings that Corinth carried out in 1912, Jung (1974) proposed that he suffered from left-sided visuo-spatial HN. He also noted left-sided omissions in Corinth's post-stroke self-portraits. Thus, the central portrait shown in **Figure 2E** depicts Corinth's wife Charlotte, to which Corinth added two self-portraits. Left-sided omissions can also be seen in Charlotte's face, her forehead and her hair. Her left shoulder was replaced by a small self-portrait. Although her left hand and arm are drawn, both show signs of spatial deformities and are less precisely drawn than the corresponding right body parts. It has also been observed that her left hemiface is less wide and drawn with less spatial detail and nuances (Blanke, 2006). Corinth's self-portrait on the right side also shows left-sided neglect suggesting the presence of object-centered graphic neglect. Thus, despite placing this self-portrait in his preserved right spatial field, Corinth omitted left facial features (eye, hair, left facial contour), suggesting the presence of an attentional-spatial deficit as opposed to sole perceptual-visual deficits such as hemianopia or quadranopia. The left self-portrait also shows left-sided graphic neglect (left eye and other left facial features are missing). In a later self-portrait (from 1912, **Figure 2F**) further left-sided omissions can be seen (Blanke, 2006).

# **HUGUETTE BOUCHARDY, ANTON RÄDERSCHEIDT, AND LOVIS CORINTH**

The association of left-sided omissions and a deviation of the entire drawing and painting toward the right has been described in six painters suffering from left-sided visuo-spatial HN (Blanke and Ortigue, 2011). Left-sided deformations and perseverations have also been observed. Characteristic deformations are shown in **Figure 3A** in a drawing by Bouchardy-Rey where we see that the roses on the left side of the bouquet are deformed or drawn twice (perseverated), whereas this is not the case for roses on the right side. Moreover, these classical elements of graphic HN have been postulated to give rise to a new style element in post-stroke paintings. Deformations, perseverations, and left-sided omissions and displacements lead to missing contours and thereby to an increase in flatness and a loss of spatiality in post-stroke drawings and paintings by painters with HN (Blanke and Ortigue, 2011). In addition, objects and people are less clearly separated or distinguished from the environment and among themselves, leading sometimes to a superposition of people and objects (sometimes due to perseverations; **Figure 3B**; see bottle on lower left). These elements may predominate on the left side of the painting, but can also be found over the entire canvas and have been described in works by Räderscheidt, Corinth, Alder, and Bouchardy (Blanke and Ortigue, 2011). As in the case of object-centered HN in Corinth, we stress the point that these latter findings underline the attentional (and not only perceptual) changes in these visual artists. These changes are thus independent of low-level visual deficits that may or may not exist in these artists, and rather relate to higher-level attentional changes related to perceptual, representational, and/or motor aspects of visuo-spatial HN. Future work in painters with neglect should carefully determine what type of neglect the painter suffers from and evaluate the respective effects on the art works. Comparative studies should also be carried out in non-artist patients with visuo-spatial HN, extending earlier work on drawing in patients (Piercy et al., 1960).

**Figure 3. (A)** Huguette Bouchardy-Rey: Bouquet de Doris Mart (2001). © Huguette Bouchardy-Rey. **(B)** Huguette Bouchardy-Rey: Rose (2001). © Huguette Bouchardy-Rey. **(C)** Anton Räderscheidt. Mann mit gelben Handschuhen (1918); **(D)** Anton Räderscheidt. Self-portrait (1968). © Anton Räderscheidt VG Bild-Kunst, Bonn, Reproduktion ausschliesslich mit Autorisierung der Copyright-Inhaber, © 2011, ProLitteris, Zurich.

Art critic Kuhn remarked in 1925 of Corinth's post-stroke works: "the contours disappear, the bodies are often as if pulled apart, deformed, their spatial relationships distorted, as if this would not be important anymore (as in Corinth's work before 1912)." We argue that these style changes are caused by the painter's altered mechanisms of attention, most likely related to the representation and the perception of "space" due to visuo-spatial HN. Whether these are the changes beholders of artworks find particularly remarkable may be another point of interest to pursue (Berlyne, 1954). Did these changes in the artist Corinth lead to the further development of his artistic "style", from his Impressionism to his Expressionism, that is so characteristic of his later works (see Blanke, 2006; Blanke and Ortigue, 2011)? Surely this may not be regarded as the only reason for his late expressionistic phase, but it may well be an important reason, which brought him to paint in a way that in principle he himself did not value highly.

Although Corinth's paintings do not reveal clear signs of left-sided visuo-spatial HN, Blanke (2006) has argued that Corinth also changed his body position in front of the mirror when painting self-portraits, thus deviating from his customary stance (and that of his much admired Dutch masters Frans Hals and Rembrandt). In his self-portraits before 1912 Corinth depicts his body as turned rightward, whereas after 1912 he depicts himself mostly as turned leftward. Blanke (2006) has argued that this change in stance was necessary in order to look at his mirror reflection within his preserved right visual and spatial field avoiding his left neglected visual field (contrary to his pre-stroke habits).

Painters with visuo-spatial HN have also changed their palette, using colors differently, more intensely, and more frequently than in their pre-stroke works. This is apparent in Corinth's "Walchensee" series where his color palette evokes those of his contemporary expressionists (such as Emil Nolde, Ludwig Kirchner, Franz Marc). Important changes in the use and perception of color have also been noted in Bouchardy-Rey (Blanke et al., 2003). Räderscheidt even describes in his diary that he experienced a "color explosion" ("Farbeinbruch") after his stroke, wondering how he could have drawn for most of his life without much color. In fact, Räderscheidt is well-known for paintings from his magical realism period (1920s) that are characterized by cold, hard, and metallic colors such as gray, black, green and blue (**Figure 3C**). This is different in post-stroke paintings where he also uses intense reds and orange (**Figure 3D**). What is most relevant for the present considerations is that, regardless of the original or mature style of the painter, country of origin, cultural background or epoch during which he or she was living, there exists an ensemble of style elements that can be found in all of the affected painters. To us this suggests a common origin: interference with visuo-spatial mechanisms in the right brain that are crucial for the perception, representation, and making of paintings and drawings.

# **FILM**

How does visuo-spatial HN affect filmmaking? Are there any filmmakers that have suffered visuo-spatial HN and continued to make films? The great film directors Federico Fellini and Luchino Visconti both suffered right hemispheric brain damage due to vascular stroke. Whereas Fellini was examined in detail in neurology, neuropsychology, and neuroradiology, equivalent data are not available for Visconti. Unfortunately, Fellini was not able to resume his cinematographic work, whereas Visconti made two major films after his right hemispheric stroke.

#### **FEDERICO FELLINI**

At the age of 73, in August 1993, the great cineast Federico Fellini (1920–1993) suffered right posterior brain damage associated with left visuo-spatial HN (Cantagallo and Della Sala, 1998). Fellini was also a great draftsman (De Santi, 1982) and showed several of the graphical HN signs that we described above for Corinth, Räderscheidt, and Bouchardy-Rey. Magnetic resonance imaging revealed a right temporo-parietal lesion compatible with vascular stroke. This was associated with a moderate left-sided sensorimotor deficit and left inferior quadrantanopia. Fellini's neuropsychological examination revealed normal language, face perception, and long- and short-term memory. The examination found left-sided visuo-spatial HN and severe visual extinction, of which Fellini was only partially aware (Cantagallo and Della Sala, 1998). HN-related omissions were found in cancelation and line bisection tasks (**Figure 4A**), in complex figures, and in writing and reading. These signs were only found in the early phase of hospitalization, with performance normalizing in all tasks within 2 months. In reference to the graphical changes related to HN symptoms described above, we depict here two small drawings by Fellini revealing the presence of left-sided visuo-spatial HN (**Figure 4B**). Whereas these sketches are unmistakably recognizable as Fellini's, there are many left-sided omissions and spatial deformations that were not present in his pre-stroke drawings (Cantagallo and Della Sala, 1998). Fellini was not able to resume his film work because he suffered a second vascular stroke, from which he died in October 1993.

#### **LUCHINO VISCONTI**

Luchino Visconti (1906–1976), the famous director of film, theater, and opera, suffered right brain damage at the age of 66 in August 1972. Like Fellini, he was a major figure of neorealism (Ferrara, 1963; Guillaume, 1966; Sterling, 1979). Visconti's work has been described as the translation of human subjectivity into a visual style (Nowell-Smith, 2003). Visconti was particularly famous for the decor of his film sets, which were painstakingly researched and reconstructed by a team of expert artists and craftsmen, the so-called "bottega viscontiana," under the closest supervision of Visconti for reasons of authenticity (Schifano, 2009). A student of Visconti, Michelangelo Antonioni, remarks that spatial organization was a key aspect of Visconti's cinematographic work. He filmed with immense accuracy in embedding the actor and making the actor adhere to the space, turning the immediately surrounding, peripersonal space into the actor's place of work (Lagny, 2002). The importance of spatiality in Visconti's cinematography is also commented on in Schifano's (2009) biography. One important technique for achieving this spatial essence was, next to the use of a theatrical scenography, his practice of locating objects and actors in space with respect to three cameras (**Figure 5A**). Another exceptional feature that contributes to the sense of spatiality and temporality in his films is the use of the cross-fade technique. In the cross-fade the filmed background is dissolved from one frame to the next by introducing a new background motif. This technique allowed Visconti to evoke a more realistic impression of elapsing time and a more realistic narrative flow, and gave his style a "characteristic lack of internal curtains," which the classical montage of scenes would otherwise have introduced (Schifano, 2009). Although montage has been defined as the foundation of film art by Arnheim (1957), not only in reference to silent film classics but also to the Italian Neorealist school, Visconti preferred to avoid the use of montage as a tool. The cross-fade can thus be seen as a particular element engendering his realism, in that he avoided narrative "ruptures" or complex temporal constructions. In the same way Visconti introduced large panoramic shots, uncut and of considerable length, implementing dramatic spatial sequences that describe the film location in relationship to a thematic spatial and temporal presence of the actors (Lagny, 2002). Comparable to a theater scene, the complete film scenes were filmed on site and in continuous shots within the limits of the original location (i.e., location filming) by the parallel use of one fixed and two mobile cameras following the actors step-by-step for the duration of the whole scene. These impressive panoramic shots were precisely oriented in space (Schifano, 2009; **Figure 5A**, **Figure 6A** left). In his early and pre-stroke oeuvre Visconti thus mastered the spatial and temporal adhesion of the filming angles, close-ups, and panoramic views to the visual content of what may be called the total sequence of a movie. In filming, this enhanced visual comprehension can be seen as the main endeavor to convey a symbolic spatial depth to the beholder (Arnheim, 1957).

In the midst of an intense filming period for "Ludwig," the right-handed Luchino Visconti suffered a right-hemisphere stroke. Schifano (2009) reports an initial sensation of weakness that was followed by involuntary and uncontrollable movements of Visconti's left leg that finally led to left-sided motor weakness and hemiplegia affecting the left arm and leg. He was immediately brought to a hospital in Rome where he remained for 2 weeks and was then transferred to Zürich University Hospital where he stayed for 2 months. We were able to find only a limited number of documents about Visconti's acute and chronic medical history and convalescence period. For this we have relied on reports and descriptions of his friends, family, and biographers and a few television appearances. Thus, Visconti suffered from severe left-sided arm, hand, and leg paralysis and likely also associated left-sided somatosensory deficits. Inspection of a filmed original interview (that was part of the BBC documentary "The Life and Times of Luchino Visconti," 2002) shows an interviewer sitting to the left of Visconti. Photographs taken from that period reveal the persistence of important motor impairments of the left upper and lower extremity. Our analysis revealed no major facial asymmetries and, concerning space exploration, no overt gaze or attention abnormalities toward the left side. These findings testify to Visconti's rapid recovery, despite the persistence of his left-sided motor deficits. It is reported that Visconti later recovered further and by 1975 was able to stand and walk with the help of a cane (Dieguez et al., 2007; Schifano, 2009). The relative sparing of the face suggests that Visconti's brain damage may not have been (or not only) at the subcortical level (which most often affects the entire hemi-body), but more likely included cortical structures in right frontal and/or parietal cortex. Given the lack of standard neurological, neuroradiological, and neuropsychological data, and given Visconti's normal responses to and exploration of the BBC interviewer on his left side, it is difficult to ascertain the diagnosis of left-sided visuo-spatial HN. Yet, as the large majority of patients with left-sided paralysis due to vascular right-hemisphere stroke (i.e., 82% in the study by Stone et al., 1993) suffer from left-sided HN, the presence of visuo-spatial HN in Visconti is more likely than its absence.

**FIGURE 5 | Luchino Visconti. (A)** Scene from "Rocco and his brothers" on top of the Dome in Milan. **(B)** Scene with sequence of close-ups in "Conversation Piece." © 2011, ProLitteris, Zurich.

We compared Visconti's films with those of other filmmakers by employing terminology introduced by the late Arnheim (1957) and also by comparing his work with drawings and paintings by painters suffering from visuo-spatial HN. Although Visconti quickly resumed the abruptly stopped film montage of "Ludwig," we do not know how much of the work Visconti carried out himself (the officially released version of the film is considered by many to have been under the influence of others). We have therefore focused our preliminary analysis on selected pre- and post-stroke films. Film critic Geoffrey Nowell-Smith comments that the "last two films ('Conversation Piece,' 'The Innocent') were curious and puzzled his admirers as much as his detractors" (Nowell-Smith, 2003). Lagny (2002) writes that in "Conversation Piece" "everything resides in the intense reflection about space, which in the film is constructed rather through the exchange of glances than through shifts of place" and that "everything happens in closed rooms, which remain difficult to situate in relation to each other." Other film critics have mentioned Visconti's shift in filming technique, from the characteristic cross-fade of the pre-stroke period to the montage of close-ups in the post-stroke films (**Figure 5B**). The montage of close-ups corresponds to a technical practice which, in Visconti's pre-stroke filmography, was adopted much less often and, if so, mostly in combination with panoramic scenes. It seems that the late Visconti imposed these uncharacteristic close-ups against the will of his surprised crew, with whom he had been working for decades [such as his long-time cameraman Giuseppe Rotunno, who proposed that Visconti's late and exacerbated use of the close-up was related to his stroke and paralysis (Schifano, 2009)]. We disagree with this proposal of a conscious adaptation by Visconti to his paralysis and argue instead, in the next paragraph, for a stroke-related change in Visconti's cognitive style due to visuo-spatial HN.

To summarize, in the two post-stroke films, "Conversation Piece" of 1974 and "The Innocent" of 1976, Visconti employs close-ups in a much more deliberate and independent way. Considering that in pre-stroke films these close-ups were rare and, if present, almost always embedded in longer and panoramic film sequences (with changing surroundings integrated through cross-fading), we suggest that this late, explicit, novel, and abundant use of the close-up (**Figure 6**) strongly impacted the spatial configuration, or cinematographic spatiality, in Visconti's last two films. Concerning close-ups Arnheim wrote that "a superabundance of close-ups very easily leads to (...) a tiresome sense of uncertainty and dislocation" and "a film artist will generally find himself obliged not to use close-ups alone but only in conjunction with long shots that will give the necessary information as to the situation in general" (Arnheim, 1957). This analysis suggests that the post-stroke affordance of the previously avoided close-up may be related to visuo-spatial HN, which is classically associated with uncertainty and dislocation (Blanke and Ortigue, 2011). Again, as remarked for Corinth's change from his Impressionism to his late Expressionism, these unusual or untypical elements may be those that the beholders of artworks will find particularly remarkable (Berlyne, 1954).

Schifano (2009) detects a loss of spatiality in "Conversation Piece" and interprets this as a style element of the late Visconti (**Figure 5**, **Figure 6A,B**). The artist himself describes the images of the interiors in his late film as "freely recomposed, in the total freedom of proportion and position" (Lagny, 2002). What we submit to our reader's opinion is that the visuo-spatial and attentional "deficits" associated with HN may have inspired and somewhat guided Visconti to adopt and develop a new way of art making, as was also the case in several painters (Blanke and Ortigue, 2011). Many films by Visconti have been described as autobiographical, but "Conversation Piece" in particular. Moreover, the scenes in both post-stroke films are often filmed with the characters facing the camera in a close-up, and not in a natural position in space, such as in "Senso," "The Leopard," "The Damned," or "Ludwig," where complex spatial positions of the cameras in the interior generate a spatial mosaic of first- and third-person perspectives within the rooms (**Figures 6B,C** right). There is also, as never before, a nested structure of flashbacks (the "inner curtains"), which the artist had so far avoided and disliked. Finally, Nowell-Smith (2003) noted that in the later films "an interest in the decorative in its own right" and the "potential of color film to render visual surfaces in different ways" had become a principal issue (**Figure 6E**). This can particularly be seen in the film "The Innocent" of 1976, Visconti's last cinematographic work.

# **DISCUSSION**

Based on the preliminary analysis of Visconti, we propose that the following four elements are characteristic of the post-stroke versus the pre-stroke oeuvre of Visconti. First, he shifted from a spatio-temporally "realistic" perspective, induced by filming the entire scene sequence in a realistic way within the real location's limits (location filming with few close-ups, mostly embedded into large filmic panoramas, and the use of cross-fades), to a constructed "space of glances" built from frequently deployed sequences of close-ups and more static perspectives. This was, second, associated with topographical and architectonic spatial disruptions, as opposed to his famous clarity of cinematographic space. Third, in the post-stroke films the actors much more often face the camera frontally, generating the impression of flatness of bodies, scenery, and room sequence, compared to the pre-stroke films, where perspectival changes as well as richness and clarity of space and figures are more prevalent and render a more complex spatial configuration. Finally, there is a heightened interest in color and surface rendering in the late Visconti.

These Viscontian post-stroke elements are also present in the post-stroke artworks of painters with visuo-spatial HN. Thus in Corinth, next to more directly related HN signs such as left-sided omissions and deformations, the post-stroke oeuvre is also characterized by a plane-like way of painting. These paintings have been described as less spatial and corporeal, and as containing spatially deformed elements (Kuhn, 1925). We described above how this differs dramatically from Corinth's pre-stroke works, for which he was famous. In fact, the beautiful depiction of depth and spatial relations between objects, people, and environment was seen as the key element of Corinth's art, as it was for Visconti. Thus, both artists, despite their difference in genre, epoch, and individual styles, moved away from their strong reliance on spatiality to a visual art where "the contours disappear, (...) the bodies are often as if pulled apart, deformed, their spatial relationships distorted, as if this would not be important anymore". Similarly, in his work on cinematography and time, Deleuze (2007) comments on the articulation of Visconti's main characters in his last film: "Everything becomes confused, to the point of indiscernibility of the two women in 'The Innocent'." Close-ups and frontality characterize the portrayed protagonists in the post-stroke works of Visconti and Corinth, enhancing the desired distortion and plane-likeness of the person in the painted or filmed environment. The changed and enhanced rich use of color and textures and its importance in enchanting the spatial relationships between person, object, and environment is a further factor that merits attention. We believe it is present in the post-stroke works of Visconti, Bouchardy, Räderscheidt, and Corinth in similar ways. The use of color (and the decorative) and its artistic employment in film was a great challenge that had to be faced by film artists at that time (Arnheim, 1957). Indeed, Nowell-Smith (2003) noted the different use of color and texture in Visconti's post-stroke films. Whereas such remarks are difficult to link directly to visuo-spatial HN, we note that Kuhn (1925) and Uhr (1990) had remarked comparable changes in the post-stroke paintings of Corinth. Räderscheidt also mentioned in his autobiography his own heightened interest in color in his post-stroke works (Blanke and Ortigue, 2011). These post-stroke similarities suggest that the study of visual artists with visuo-spatial HN may allow us to formulate new questions about color and the visual arts.

How much were the discussed artists aware of these changes? We believe with Dieguez et al. (2007) that Visconti and the other artists were well aware of these changes (as well as of their sensorimotor impairments). Yet, awareness does not exclude that HN will influence drawings and artworks, as shown for Fellini (Cantagallo and Della Sala, 1998) and Bouchardy-Rey (Blanke and Ortigue, 2011). Visconti and Corinth may have taken this to a new level in art by creating masterpieces of striking beauty for many years with their HN. More work is necessary concerning automaticity and awareness in art making, an issue that should fascinate scholars of consciousness alike. We think that comparative analysis of awareness in Fellini and in other visual artists will be one interesting avenue to pursue. For example, Fellini wrote in a drawing "Cos'è sinistra?" but omitted to draw left-sided picture elements in that same and many other drawings (Cantagallo and Della Sala, 1998). This illustrates how the role of consciousness and awareness in art making may be studied empirically, at least in the early phase after the onset of visuo-spatial HN. This is an interesting topic that may allow us to study some of the numerous ways of artistic freedom, creativity, and the role of consciousness and awareness in art making also in later phases and in art making in general.

The Brazilian filmmaker Glauber Rocha detected Visconti's tendency to frequently zoom into the picture (as well as other "late style" changes) already in Visconti's films from the 1960s. He thus argued for a more continuous evolution toward his late style (i.e., focus on the close-up), describing these changes as Visconti's "healthy (and desired) rupture with Italian pictorial tradition" (Schifano, 2009). This may of course be the case. Yet, the above described similarities with painters and filmmakers and much earlier film critical writing about Visconti's late style would argue against Rocha's claim. Also, we cannot fail to remark that similar disagreement characterizes art critics' opinions about the pre- and post-stroke style in Corinth as well as Räderscheidt (Blanke, 2006; Blanke and Ortigue, 2011). Both Visconti and Corinth "pose a challenge to (...) criticism," as Nowell-Smith (2003) writes about Visconti. What Schröder (1992) writes about Corinth seems to apply to both artists: "There were no problems to distinguish (...) the mature period (...). The helplessness begins when trying to describe Corinth's late works. Already where it belongs is debated. Is it expressionistic? Or did it remain to the end consequently impressionistic?" A way to avoid this difficulty, as our analysis suggests, is to link some of these changes to visuo-spatial HN. We want to be careful to avoid being misunderstood: as we have argued before (Blanke, 2006), artists transcend these neuropsychological difficulties and have gone on to create artworks, masterpieces of amazing complexity and beauty. It is also true that their post-stroke works remain unmistakably a Visconti or a Corinth. Yet we insist: the presence of visuo-spatial HN following right brain damage may have played an important (and empirically testable) role in the predominance of the respective developments, leading to a common late style in these different artists. As stated by Ernst Gombrich concerning the categorization of style, "it is impossible to lay down [artistic] rules...because one can never know in advance what effect the artist may wish to achieve" (Gombrich, 2006). We would like to add that the presence of visuo-spatial HN in some unfortunate artists such as Corinth, Bouchardy, Räderscheidt, Fellini, and Visconti may allow us to study at least some of these aspects with respect to what is known about the right hemisphere's role in perception and cognition.

We hope that these thoughts will be taken further and will also be applied to other visual arts such as sculpture and architecture (see Halligan and Marshall, 2001 for sculpture) and of course to more recent forms of the visual arts that are currently unexplored. In addition, this needs to be combined with laboratory studies with visual artists as subjects in standardized experiments (see also Blanke et al., 2009), as well as studies testing the role of the left versus right hemisphere in perceiving, judging, and making drawings and paintings. One promising line of research is work by Dahlia Zaidel and others (Levy, 1976; Regard and Landis, 1989; Zaidel and Kusher, 1989; Zaidel, 2006). These approaches may already be difficult to achieve for the making of drawings and paintings, but will likely be even more demanding for filmmaking. In-depth analysis of the artworks of our selected painters and filmmakers holds, in our opinion, also some answers to what visual arts are and what visual artworks are. Importantly, it may help build a bridge between art criticism and empirical science (such as neuropsychology and neuroscience) on our way toward more frequent trespassing.

# **REFERENCES**


in *Lovis Corinth*, ed. K. A. Schröder (Munich: Prestel), 8–35.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 April 2011; paper pending published: 14 August 2011; accepted: 13 November 2011; published online: 02 January 2012.*

*Citation: Blanke O and Pasqualini I (2012) The riddle of style changes in the visual arts after interference with the right brain. Front. Hum. Neurosci. 5:154. doi: 10.3389/fnhum.2011.00154*

*Copyright © 2012 Blanke and Pasqualini. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **ARTS AND SCIENCES**

**Ofer Lellouche** *(artist perspective)*

# *Ofer Lellouche\**

*Painter and Sculptor, Tel Aviv, Israel \*Correspondence: oferlellouche@hotmail.com*

*Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

*Reviewed by: Robert Pepperell, UWIC, UK*

The idea of bringing together the arts and sciences is not new. In almost every age, these two disciplines have had a difficult and fragile relationship. Where does this mutual fascination come from? What are the limits of this dialog? How can we avoid misunderstanding, and manage this encounter in a productive way?

As an artist, I am often asked "Why do you need this?" Actually, my answer is that I do not know. I do not need it to paint, nor to improve my art, nor even to find new directions to explore. Five years ago, I participated in a seminar that brought together 10 leading brain researchers and 10 leading artists. For three days, artists and scientists spoke about their work. It seemed to me like two parallel monologs that could never meet; scientists spoke about science and artists spoke about their art.

As a painter, I was fascinated to discover during the seminar that what we artists always knew, in an intuitive way, had a theoretical basis. I could better explain, for instance, why there is a huge difference between painting from nature and painting from photography.

But, naturally, this symposium did not influence my way of painting; for me, as for many other artists, the moment we create must be completely intuitive. It slightly resembles the moment the goalkeeper of a football team reacts to a penalty shot: a mixture of extreme concentration and emptiness of mind. But in order to arrive at this second where everything is possible, he has to endure a great deal of training. In a way, the seminar was for me part of this training.

The language of art is a highly sophisticated one, and it is obvious why it is a privileged subject for the sciences of the brain. Nevertheless, the sciences, or any other rigorous communicative language, cannot explain art. The first sentence of Zeki (1999) in his book *Inner Vision* is, "This is not so much a book about art, it is more a book about brain." This impossibility is due not to a temporary lack of information but to the very different structures of the languages of art and science.

What do we expect in the contemplation of a piece of art? For us, moderns, it certainly does not have a magical function (like Egyptian Art, for instance, a function which has a real influence on the real world). When art lovers are asked to better define why they need art, they speak of pleasure. But what kind of pleasure? Is it the same kind of pleasure that one derives from a good meal? Should painting be studied by scientists as gastronomy is?

This comparison hurts the artists (although, alas, art critics and restaurant critics share the same page of the newspapers!). When people are asked to be more precise about the word "pleasure," they speak of "emotion," they evoke the feeling of "diving" into the piece, of "forgetting themselves," of identifying with the subject; they speak of a "dream awakened." All of these are feelings that you could not have while sitting in front of a plate of seafood. A piece of art makes us dream.

Of course, a piece of art also delivers an esthetic pleasure in the "gastronomic" sense of the word, but this is secondary. In what way does a piece of art make us dream? Why can we not stop looking at the Madonna by Raphael, while the (almost) same painting by a different artist leaves us indifferent? Why does the Montagne Sainte-Victoire painted by Cézanne make us dream, while a photograph of it just looks like a mountain? Art is not a representation of nature in a beautiful way; it is a different kind of signification. There is a deep difference between the language of art and the language of everyday.

*The language-of-everyday* is based upon the assumption that you can understand what is said. The more accurate it is, the more communicative it becomes (for instance, in writing these lines, I make a great effort to be *as clear as possible*). Sometimes, perfectly understanding this language is a question of life and death; if you do not understand the signification of a red light, for example, you might have a car accident. The language of law and the language of the sciences should be as precise as possible. The most minimalist expression of this language is the pictogram. When you see a sign representing very schematically a man and a woman (often just a circle representing the head, a triangle for a skirt and two rectangles for the trousers), you understand that this signifies the restrooms.

But just suppose this sign is found by future archeologists who have no understanding of its *meaning*; they will understand it through associations. They might hang this piece in a museum of art next to Adam and Eve by Dürer, a Fang wooden sculpture of a couple, an Egyptian painting of Nout (the deity of sky) and Geb (the deity of earth), Joseph and Miriam by Rembrandt, and many other pieces of "art."

There is, I think, a great difference between the language of art and the pictogram (or the-language-of-everyday). The language of the pictogram functions through the understanding of negation; the language of art functions through associations and ignores negation.

One pictogram is enough to describe the universe. Take, for example, the pictogram of a tree. You can split the universe into "tree" and "not a tree." It would be a very poor language but it would be, thanks to the negation, enough to describe the universe. But the painting of a tree means something else.

Many years ago, I painted my 3-year-old son and my 5-year-old daughter beneath a huge palm tree. When my son explained the painting, he said, "This is my sister. This is me. And this tree is my father." In a sense, he understood that the question is not "Is this a tree or not a tree?" but about the different associations the tree can evoke.

I would like to be more precise about this notion of association and give a few examples:


Art accompanies us for a long time after we have exited the museum. A few months ago, while immobilized for an acupuncture treatment, I did the following experiment: I tried to listen to the noises surrounding me as if they were a piece of music: the cars, the steps, the voices, the birds, the wind in the trees . . . (The day before, I had listened to a contemporary music concert that used the same noises.) I was listening as if it "had a sense," as if it was a piece of music written by some composer. After only a few minutes, I was so exhausted that I could not go on. Being a painter, this is what I am doing almost all of the time with my sense of vision, and in a natural way, without difficulty. My many years of practice allow me to "see" the world as if it were a painting.

4. The list of the associations art can provide is as long as all fields of human activities: music, poetry, philosophy, politics ....

The language of art functions in a completely different way than *the-language-of-everyday*. Moreover, the language of everyday is much too "poor" to describe the language of art. Cantor demonstrates why a group (a, b, c, ...) is always "*smaller*" than a group of the associations of those elements (a, ab, abc, ac, ...). It might be possible to use this demonstration to show why our ordinary language is too poor to describe a piece of art (which functions through associations). To me, it seems intuitively true.

engraving, **(down left)** Self portrait, etching with dry point and engraving.

Poets and lawyers speak two completely different languages with different rules, even though they use the same raw materials of vocabulary. This is why it is nonsense to try to translate a piece of poetry into its own language, and why it is necessary to translate it infinitely into others, while a legal document needs to be translated only once. Furthermore, attempts to explain art using the language of everyday freeze the streams of infinite associations.

The notion of "sense" is completely different when it refers to a piece of art. The question "what does it mean?" (which includes in ordinary language the possibility of saying "what it does not mean") has no meaning when it deals with art. One of my paintings was described by two prominent and sensitive critics as "a hymn to life" and as "a march toward death." I did not feel any contradiction between those two statements. Hamlet could be a young blond virgin actor, or a fat, unshaven, bisexual 45-year-old man. This text makes us dream. This is why there is a need to come back to it again and again. . . Art is a paradox, in the sense that you are not able to say if it is right or wrong.

When I am asked that question, "*What does it mean*?" I always answer "everything." A real, a deep, a great piece of art contains, through an infinite number of associations, the whole world.

One of the most common misunderstandings about art is to think that it is composed of "sense" (a pictogram) decorated with "beauty." In this way of seeing, the painting of Cézanne would be the painting of a mountain, but represented in a "beautiful" way, like a beautifully "*designed*" mountain. There are computer programs that allow us to transform a banal photograph into something that looks like a painting. Too often, the painter is considered almost as if he or she were a very sophisticated program of this kind, and this is wrong. To me, a piece of art is not "sense" + "beauty," or just "beauty." It is another kind of sense.

Art makes us dream, a pictogram does not. A painting of a tree is not a pictogram of a tree with the addition of "beauty." Art, as well as dreams, functions by associations and ignores negation. Like in a dream, the objects or the forms, because of the very special structure of the work of art, undergo transformations, metamorphoses that do not hurt our logic.

This is also the reason why there is progress in science and not in art. The physics of Newton becomes a particular detail in the theory of Einstein. On the contrary, Cézanne, by creating new associations with the work of Poussin, makes it richer. (Moreover, the work of Poussin enriches the work of Cézanne. Time functions in both directions.) A new piece of art does not render an earlier piece of art irrelevant or old-fashioned; rather, it enriches it, as it does all other existing pieces. While listening to a quartet of Bartók, you can hear how it dialogs with Beethoven, with Bach, or with Hungarian folk music. In a way, we could say that linear history has, in this context, no meaning. Therefore, whereas the ideal of science is to develop a language that is more and more concise, the language of art is becoming wider and wider. The language of the sciences is convergent; that of art is divergent.

This makes any description of art very limited. If you were one day in the presence of two painters speaking of art, you would probably be surprised at how clumsy the dialog can be. They use pantomime, or very vague expressions like "it works" or "it does not work," or concepts borrowed from other disciplines, like "it is too sweet!" or "too heavy," or taken from music, like "rhythm," "tonality," "accord," "major or minor." (It also works the other way: I attended a violin master class with Shlomo Mintz in which he used concepts taken from painting, like "line," "colorful," "black and white," "blurry"...!)

It is understandable why art, which requires a highly sophisticated activity of the brain, interests scientists: not as an effort to understand art better, but for the needs of scientific research. It is less understandable why artists need the sciences.

It could be that artists are just interested in all fields of human behavior. They "use" scientific theory as a source of inspiration and then throw it away. At one time they are interested in the theory of the decomposition of light, and they create Impressionism; at other times in the theory of relativity, and they create Futurist paintings, etc. They do this without any linear coherence with the theories they refer to. It is just a toy they can play with, and then leave.

It could also be that science and art are both linked to some "deep philosophy" of the time. It is fascinating to see, for instance, that the poem "Un coup de dés jamais n'abolira le hasard" by Stéphane Mallarmé was published at the same time as Poincaré's studies on chance, while there were no known connections between the two men. For some obscure reason, some idea circulates at a certain time and challenges both scientists and artists.

As I wrote above, the seminar on Art and Brain was, and still is, a part of my "training" as an artist. It allowed me to understand better the way in which vision functions, but the main lesson of this encounter was the idea of mapping. The idea of dividing the brain activity into zones that have different functions and work as a net of inter-references has some similarity with the activities of the young generation of artists, who mix in the same exhibition video art, heterogeneous drawings (figurative, abstract, pictograms), installations, performances, etc. It seems to me that this kind of activity also parallels the way we use the internet and, in some strange and remote way, the new conception of the world as a net of multi-cultural, multi-national, multi-religious identities. This idea of a dynamic "map" is, today, in the *spirit of the time*, and it seems that the Hegelian conception of the history of art is being transformed into a more dynamic conception of the "geography" of art. Instead of historical exhibitions, there are more and more museums showing African sculptures, Egyptian art, and oil paintings of the seventeenth century side by side with contemporary art. This might be the *spirit of the time* both for science and for art.

# **REFERENCES**

Zeki, S. (1999). *Inner Vision: An Exploration of Art and The Brain*. Oxford University Press.

*Received: 20 April 2011; accepted: 08 January 2013; published online: 31 January 2013.*

*Citation: Lellouche O (2013) Arts and sciences. Front. Hum. Neurosci. 7:8. doi: 10.3389/fnhum.2013.00008 Copyright © 2013 Lellouche. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# **THE ORGANIZATION OF SHAPE AND COLOR IN VISION AND ART**

**Baingio Pinna**

**REVIEW ARTICLE** published: 04 October 2011 doi: 10.3389/fnhum.2011.00104


# *Baingio Pinna\**

*Department of Architecture, Design and Planning, University of Sassari at Alghero, Alghero, Italy*

#### *Edited by:*

*Luis M. Martinez, Instituto de Neurociencias de Alicante, Spain*

#### *Reviewed by:*

*Jan Johan Koenderink, Katholieke Universiteit Leuven, Belgium Birgitta Dresp-Langley, Centre National de la Recherche Scientifique, France*

#### *\*Correspondence:*

*Baingio Pinna, Department of Architecture, Design and Planning, University of Sassari at Alghero, Palazzo del Pou Salit, Piazza Duomo 6, 07041 Alghero, Sassari, Italy. e-mail: baingio@uniss.it*

The aim of this work is to study the phenomenal organization of shape and color in vision and art in terms of the microgenesis of object perception and creation. The idea of "microgenesis" is that object perception and creation take time to develop. Our hypothesis is that the roles of shape and color are extracted in sequential order and that these roles are used in the same order by artists to paint objects. Boundary contours are coded before color contours. The microgenesis of object formation was demonstrated (i) by introducing new conditions derived from the watercolor illusion, where the juxtaposed contours are displaced horizontally or vertically, and based on variations of Matisse's Woman, (ii) by studying descriptions and replications of visual objects in adults and children of different ages, and (iii) by analyzing the linguistic sequence and organization in a free naming task of the attributes related to shape and color. The results supported the idea of the microgenesis of object perception, namely the temporal order in the formation of the roles of the object properties (shape before color). Some general principles were extracted from the experimental results. They can be a starting point to explore a new domain focused on the microgenesis of shape and color within the more general problem of object organization, where integrated and multidisciplinary studies based on art and vision science can be very useful.

**Keywords: shape perception, color perception, perceptual organization, watercolor illusion, vision and art**

# **INTRODUCTION**

### **THE PROBLEM OF FIGURE–GROUND SEGREGATION**

Rubin in 1921 started the studies of figure–ground segregation, which is one of the basic problems of vision science, by asking what appears as a figure and what as a background. Through phenomenological experiments, he discovered some general figure–ground principles: surroundedness, size, orientation, contrast, symmetry, convexity, and parallelism. More importantly for our purposes, he suggested the following main properties, which belong to the figure but not to the background. (i) The figure assumes the shape traced by the contour, implying that the contour belongs unilaterally to the figure (border ownership, see Nakayama and Shimojo, 1990; Spillmann and Ehrenstein, 2004; Pinna, 2010a), not to the background. (ii) Its color/brightness is perceived full like a surface and denser than the same physical color/brightness on the background that appears instead transparent and empty. (iii) The figure appears closer to the observer than the background. These properties can be related to the three main object attributes: shape, color, and depth (Rubin, 1915, 1921).

In **Figure 1A**, the two inner regions of the square, separated by a wavy contour, can be perceived alternately as a figure or as a background. If the left side region is seen as a figure, the right one is perceived as a background. Consequently, the wavy contour appears as the boundary of the figure, while the background is perceived without a boundary and, thus, it amodally completes behind the figural region: as an empty space it fills the square frame. The figure–ground segregation induces in the two regions a brightness differentiation. The figure shows a clear surface color/brightness property (*Erscheinungsweise*, Katz, 1911, 1930): the chromatic paste appears solid, impenetrable, and epiphanous as a surface. On the contrary, the background is not seen colored but empty, penetrable and diaphanous as a void (Katz, 1911, 1930). Finally, the left side region appears to protrude in front of the background and sometimes also seems to pop out from the square frame. The background spreads and creates a penetrable depth in distance within the frame. All these properties synergistically contribute to the perception of the square shape like a frame or a window.

In **Figure 1A**, the figure–ground differentiation can be attributed to the concave–convex alternation along the contour. Even if this alternation can play a role on the basis of the convexity principle studied by Rubin, the same complementary results, previously described, are obtained by replacing the wiggly contour with a straight line, as shown in **Figure 1B**. Furthermore, similar property differentiations emerge also when the surrounding frame is removed (**Figure 1C**).

These outcomes suggest that a single line/contour behaves like a watershed differentiating the visual field in complementary (opposite) attributes related to shape, color, and depth perception. A single contour can be considered as a source of phenomenal asymmetry on both its sides with respect to the visual distribution of figure–ground attributes. If this were not true, then it would be impossible to perceive a wiggly irregular circular *surface* in **Figure 2A** or two women in Matisse's drawings illustrated in **Figures 2B,C**, but the expected results would have been respectively a wiggly closed *contour* in **Figure 2A** and a complex set of contours running in different directions and intersecting each other, without showing any specific surface and object organization in **Figures 2B,C**. The phenomenal organization of contours in figure–ground attributes can be considered as the visual evolution of a contour in a surface. The results of **Figures 1** and **2** show indeed shapes, surfaces, and objects. These outcomes cannot be explained by invoking the role of past experience. In fact, as shown in **Figures 1** and **2A**, the same properties emerge without involving any kind of past experience.

Among the figure–ground attributes, the least strong within the previous figures is the chromatic/brightness differentiation between the figure and the background. In fact, why should a contour create a chromatic/brightness differentiation? What is the relation between a contour and a chromatic/brightness difference?

In **Figure 3A**, the brightness effect is enhanced. Under these conditions, the figure–ground segregation created by the wiggly contour within the black square also induces a clear brightness variation in the two complementary regions. Because the black square is *per se* a figure segregated from the white background surrounding it, the presence of the wavy contour and of the T-junctions on its boundaries favors its appearance as a window. In spite of this new organization, the previous figure–ground differentiation of attributes persists. The two regions can be partially, alternately, and reversibly seen as a figure or a background, even if it is easier to perceive the smallest region on the left side of the wavy contour as a figure, according to the size principle suggested by Rubin.

The chromatic/brightness differentiation and the figure–ground segregation also persist but appear less and less strong when the T-junctions are ineffective, i.e., when the wavy contour is placed more and more inside the black square (**Figures 3B,C**).

The figure–ground differentiation of attributes is increasingly perceived in the three conditions illustrated in **Figure 4**, where the wiggly surface appears transparent, mostly in **Figures 4B,C**, and the overlapped regions of the black squares, seen behind the transparent layer, are perceived with chromatic/brightness properties very different from the non-overlapped ones.

The figure–ground segregation problem, posed by the previous figures, is related to the way the complementary properties and, more particularly, those related to shape and color are reciprocally linked and organized. The main questions we asked in this work are the following: Where is information about shape and color mostly located? Are shape and color independent? Does their creation take time to develop? Are they organized in sequential order or in parallel? Is there any phenomenal logic in their visual organization? How are they bound? How do shape and color contribute to determine and define a visual object and what is the difference between them? How are shape and color used by visual artists to create objects and scenes? Is the way artists use shape and color related to the way we perceive them?

# **GENERAL METHODS**

#### **SUBJECTS**

Different groups of 10 undergraduate students of linguistics, literature, architecture, and design, and of children from 6 to 16 years old (only for the results reported in When the Color Becomes Boundary), 10 for each year of age, participated in the experiments and, for each experiment, in the two methods adopted and described in the Section "Procedure." Subjects had some basic knowledge of Gestalt psychology and visual illusions, but they were naive both to the phenomena studied and to the purpose of the experiments. The undergraduates were both male and female, and all had normal or corrected-to-normal vision.

#### **STIMULI**

The stimuli were the figures shown in the Introduction and in the Section "Results." The overall sizes of the visual stimuli were ∼4°. The luminance of the white background was 122.3 cd/m². Black shapes had a luminance value of 2.6 cd/m². The figures were shown on a computer screen with ambient illumination from an Osram Daylight fluorescent light (250 lux, 5600 K). Stimuli were displayed on a 33-cm color CRT monitor (Sony GDM-F520, 1600 × 1200 pixels, refresh rate 100 Hz), driven by a MacBook Pro computer with an NVIDIA GeForce 8600M GT. Viewing was binocular in the frontoparallel plane at a distance of 50 cm from the monitor.
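To make these display parameters more concrete, the following minimal Python sketch converts the reported visual angle into a physical extent on the screen and expresses the black/white luminance values as contrast. The function names are illustrative, and the choice of the Michelson and Weber definitions is an assumption: the text does not state which contrast measure, if any, was used.

```python
import math


def stimulus_extent_cm(visual_angle_deg: float, viewing_distance_cm: float) -> float:
    """Physical extent that subtends a given visual angle at a given distance."""
    half_angle = math.radians(visual_angle_deg) / 2.0
    return 2.0 * viewing_distance_cm * math.tan(half_angle)


def michelson_contrast(l_max: float, l_min: float) -> float:
    """Michelson contrast (assumed definition, not stated in the text)."""
    return (l_max - l_min) / (l_max + l_min)


def weber_contrast(l_target: float, l_background: float) -> float:
    """Weber contrast of a target against its background (assumed definition)."""
    return (l_target - l_background) / l_background


# Values reported in the Stimuli section.
WHITE_BACKGROUND = 122.3  # cd/m^2
BLACK_SHAPE = 2.6         # cd/m^2

size_cm = stimulus_extent_cm(4.0, 50.0)
michelson = michelson_contrast(WHITE_BACKGROUND, BLACK_SHAPE)
weber = weber_contrast(BLACK_SHAPE, WHITE_BACKGROUND)

print(f"extent on screen: {size_cm:.2f} cm")
print(f"Michelson contrast: {michelson:.3f}")
print(f"Weber contrast of black on white: {weber:.3f}")
```

Under these assumptions, a stimulus of about 4° viewed from 50 cm spans roughly 3.5 cm on the monitor, and the black shapes are close to the maximum contrast attainable against the white background.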

#### **PROCEDURE**

To study the phenomena presented here, two methods were used: a more qualitative one (phenomenological task), similar to those used by Gestalt psychologists, and a more quantitative one (scaling task).

#### *Phenomenological task*

The task of the subjects was to report spontaneously what they perceived for each stimulus by giving, as far as possible, an exhaustive description of the main visual property perceived. The descriptions reported in the next sections were provided by no fewer than 8 out of 10 subjects and are included within the main text to aid the reader in following the argument. The descriptions were later judged by four graduate students of linguistics, naive as to the hypotheses, to provide a fair representation of the ones given by the observers.

During the experiment, subjects were allowed to make free comparisons, confrontations, and afterthoughts; to see in different ways; to vary the illumination, distance, etc.; and to match the stimulus with every other one. All the variations and possible comparisons occurring during the free exploration were noted down by the experimenter. This was necessary to define the best conditions for the occurrence of the emerging phenomena. For details on these tasks and procedures, see Pinna (2010b).

#### *Scaling task*

The phenomenological free-report method was complemented by a more quantitative one, based on magnitude estimation. New groups of 10 subjects were instructed to rate (in percent) the relative strength or salience of the descriptions of the specific attributes obtained in the phenomenological task: "please rate whether this statement is an accurate reflection of your perception of the stimulus, on a scale from 100 (perfect agreement) to 0 (complete disagreement)." Throughout the text, we report only descriptions whose mean rating in the magnitude estimation was greater than 88.
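The two reporting criteria described above (a description given by at least 8 of 10 observers, and a mean agreement rating above 88) can be sketched as a simple filter. This is only an illustration of the stated thresholds: the descriptions, counts, and ratings below are hypothetical and are not data from the experiments.

```python
from statistics import mean

MIN_REPORTERS = 8      # description given by no fewer than 8 of 10 subjects
RATING_THRESHOLD = 88  # mean agreement rating (0-100) must exceed this value


def is_reported(n_reporters: int, ratings: list[int]) -> bool:
    """True if a description meets both criteria stated in the Procedure."""
    return n_reporters >= MIN_REPORTERS and mean(ratings) > RATING_THRESHOLD


# Hypothetical data: description -> (number of subjects giving it, ratings from
# a separate group of 10 subjects).
descriptions = {
    "an orange woman": (9, [95, 90, 100, 92, 88, 97, 91, 94, 89, 96]),
    "a black woman": (1, [5, 0, 10, 0, 0, 15, 5, 0, 0, 10]),
    "a blurred woman": (8, [85, 90, 80, 88, 86, 92, 84, 87, 89, 83]),
}

kept = [text for text, (n, ratings) in descriptions.items() if is_reported(n, ratings)]
print(kept)
```

In this made-up example only the first description passes: the second fails the agreement criterion, and the third is given by enough observers but has a mean rating below the threshold.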

The task of the children is reported in Section "When the Color Becomes Boundary." All subjects were tested individually. During the experiment, observation time was unlimited. Reports for visual stimuli occurred spontaneously and fast.

# **RESULTS: SHAPE AND COLOR ORGANIZATION**

#### **BOUNDARY CONTOURS AND COLOR CONTOURS IN THE WATERCOLOR ILLUSION**

The previous figures show a possible answer to the first question: where is information about shape and color mostly located? They suggest in fact that the information about shape and color is placed along the contours. Given that a single contour is perceived like the boundary of an object and appears to impart a surface color/brightness attribute, it follows that the information about shape and color is placed on the contour. This is clearly demonstrated by the watercolor illusion (Pinna, 1987, 2005, 2008; Pinna et al., 2001, 2003; Wollschläger et al., 2002; Spillmann et al., 2004; Devinck et al., 2005; Pinna and Grossberg, 2005; von der Heydt and Pierson, 2006; Werner et al., 2007). The illusion (see **Figure 5**), induced by two juxtaposed contours of different colors and luminance contrast, strongly increases the unilateral belongingness of the boundaries and the surface coloration through a long range color spreading (see Pinna, 2010a). These results are related to the fact that the watercolor illusion fulfills the phenomenal asymmetry on both sides of a contour, previously described. More particularly, by creating a physical asymmetry and opposite gradient attributes, the watercolor illusion increases the complementary figure–ground properties that are spontaneously induced on both sides of a contour, where the asymmetry is not physical but apparent. This asymmetry induces also a volumetric effect similar to the Chiaroscuro technique aimed to create a bold contrast between light and shadow. Leonardo da Vinci improved the chiaroscuro through the Sfumato (from Italian "toned down" or "evaporated like smoke") obtained with a fine shading that produces imperceptible transitions from light to dark areas, without lines or borders, between colors and tones and aimed at obtaining soft lighting effects (see Da Vinci, 1452–1519; for a deeper discussion on the watercolor illusion and the Chiaroscuro, see Pinna, 2010c).

In the watercolor illusion, high luminance contrast between adjacent contours shows the strongest figure–ground and coloration effect; however, the color spreading is visible at equiluminance (Pinna et al., 2001; Devinck et al., 2005; Pinna and Reeves, 2006). Under these conditions, the border ownership is weakened and the figure–ground segregation appears reversible. This result suggests that figure–ground and depth segregation are independent of the coloration effect (Pinna, 2005; Pinna and Reeves, 2006; von der Heydt and Pierson, 2006). Pinna (2005) and Pinna and Reeves (2006) introduced the "asymmetric luminance contrast principle" (Pinna, 2005), stating that, all else being equal, given an asymmetric luminance contrast on both sides of a contour, the region whose luminance gradient is less abrupt is perceived as a figure, in comparison with the complementary, more abrupt region, which is perceived as a background. These results are summarized by the following spontaneous description of **Figure 5A**: a wiggly orange object with a sinusoidal overall shape. An orange object is perceived also when the contours are not closed, as shown in **Figure 5B**. Finally, the wiggly orange object appears transparent or like a hole (see Pinna and Tanca, 2008) in **Figure 5C**.

These results can shed light on how shape and color are related, i.e., on the phenomenal logic of their organization. To clarify this point, it is necessary to analyze the previous description more deeply. On the one hand, by saying "a wiggly orange object" the color of the boundary contour of the object is not mentioned. Even if it is clearly perceived, the purple color does not appear as a color of the object but like the boundary belonging unilaterally to the wiggly orange object. In other terms, the purple contour does not define the color but the boundary and, therefore, the shape. On the other hand, the adjacent orange contour defines the color of the object. It does not appear like the boundary contour of the object but like its color. It follows that the two contours play different roles.

These phenomenal properties of the watercolor illusion suggested the following general rule: The juxtaposed contour with the highest luminance contrast in relation to the surrounding regions tends to appear as the outermost boundary of the figure (Pinna and Reeves, 2006). This is the "boundary contour."
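As a rough illustration of this rule, the sketch below ranks juxtaposed contours by their luminance contrast against the surrounding region and assigns the boundary role to the one with the highest contrast. It is a toy formalization under stated assumptions, not the authors' model: contrast is taken here as absolute Weber contrast, the function names are invented, and the contour luminances are hypothetical (only the background value comes from the Stimuli section).

```python
def luminance_contrast(luminance: float, background: float) -> float:
    """Absolute Weber contrast of a contour against its surround (assumed measure)."""
    return abs(luminance - background) / background


def assign_roles(contours: dict[str, float], background: float) -> tuple[str, list[str]]:
    """Return (boundary contour, color contours) for a set of juxtaposed contours.

    `contours` maps a contour name to its luminance in cd/m^2. The contour with
    the highest contrast takes the boundary role; the others are color contours.
    """
    ranked = sorted(
        contours,
        key=lambda name: luminance_contrast(contours[name], background),
        reverse=True,
    )
    return ranked[0], ranked[1:]


BACKGROUND = 122.3  # white background luminance reported in the Stimuli section

# Hypothetical luminances for a purple and an orange contour.
contours = {"purple": 20.0, "orange": 70.0}

boundary, colors = assign_roles(contours, BACKGROUND)
print(f"boundary contour: {boundary}; color contour(s): {colors}")
```

With these illustrative values the darker purple contour is selected as the boundary and the orange contour is left to define the color, which matches the description "a wiggly orange object."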

#### **MODAL AND AMODAL COLORATION**

The previous observations are supported by Picasso's colored drawing illustrated in **Figure 6** and showing a yellow cock. Under these conditions, the two sets of curved contours, black and yellow, belonging to the body of the cock, clearly assume different roles: the black contours appear like boundaries, while the yellow ones define the color. To better perceive this result, it can be useful to judge the phenomenal plausibility of the description opposite to the previous one, i.e., a black cock. Straightaway, the black cock appears as an inconsistent result, quite impossible or totally absurd. This outcome suggests that the differentiation of roles between contours also occurs when they are not adjacent like in the watercolor illusion.

These results suggest clear phenomenal differences between **Figure 6** and the watercolor illusion of **Figure 5**. While the coloration of **Figure 5** is actually and modally perceived, the one inside the body of the cock is perceived amodally (Michotte, 1951; Michotte et al., 1964; Kanizsa, 1985, 1991). More particularly, when a yellow cock is reported, the yellow coloration is vividly perceived as completing within the cock body, although it is not actually (modally) seen filling the whole shape as in the watercolor illusion. In other words, in spite of the few contours that do not fill the entire area, their coloration is perceived with an amodal sense of unity and homogeneity within the area traced and surrounded by the black contours. On the contrary, the coloration of the watercolor illusion is modally perceived as filling the entire area traced by the boundary contour. Modal and amodal coloration were previously studied by Pinna (2008a,b). The difference in the phenomenal quality of the coloration suggests again that the information needed to define shape and color is placed on the boundaries. From the boundaries, these properties fill in the whole shape (Pinna and Reeves, 2006; Pinna, 2008a).

Other examples of amodal coloration can be observed in children's paintings, where sometimes the color does not fill the whole shape completely and perfectly, but again the amodal coloration emerges spontaneously. A similar amodal completion of color is also easily perceived in Monet's self-portrait illustrated in **Figure 7A**, where all the elements are incomplete and sketchy. Nevertheless, the lower part of Monet's body appears amodally colored a homogeneous black. Another way to obtain the amodal coloration is shown in Picasso's painting in **Figure 7B**. Under these conditions the color overflows the boundaries of the figures, but again it appears not as something else but as the color belonging to the figure.

In the last two examples the contours and the color differ in terms of the area occupied by one or the other component, i.e., the chromatic area is much larger than that of the contours, whereas in **Figures 5** and **6** both contours are of the same width. Furthermore, **Figures 7A,B** use the incompleteness or the overflow of color in relation to the external boundaries to show the amodal coloration, while the watercolor illusion and Picasso's cock use only contours, respectively juxtaposed or separated and tangled up. The separation of the two contours of **Figure 6** switches the coloration effect from modal to amodal.

#### **WHEN THE CONTOUR BECOMES COLOR**

In **Figure 6**, it is not clear how the differentiation of roles occurs: Is the overlapping of contours responsible for it, or the fact that the yellow contours are mostly included in the black ones, or something else? What defines the differentiation of roles? To better understand the distinction between boundary contour and color contour, we used the next conditions, where the connection between Picasso's Cock and the watercolor illusion can be more clearly understood.

It has been shown that in the case of the watercolor illusion the juxtaposition of contours generates the figure–ground and the coloration properties perceived modally. The presence of a gap between the two contours weakens the two main properties of the illusion more and more (Pinna et al., 2001). The modal effects become amodal. This amodal regeneration of effects is likely related to the necessary and inevitable process of boundary and color/brightness induction perceived whenever a contour is given, as we suggested for **Figures 1**–**4**.

In **Figure 8A**, Matisse's Woman has been modified with two sets of black contours, slightly but clearly shifted in the horizontal direction. The two shifted contours elicit an effect of blur or duplicity of the boundaries of the woman. Both contours are perceived as boundaries. More particularly, no contour prevails over the other as a boundary contour and none prevails as a color contour. Therefore, the shape appears split into two or strongly blurred. Moreover, the color/brightness differentiation is very weak or totally absent.

These results change if an orange contour replaces one of the two blacks. In **Figure 8B**, an orange woman is perceived. Similarly to **Figure 6**, none of the subjects perceived the figure as a black woman. Under these conditions, the blurred effect is very weak or totally absent. It appears clear that the black contour is predominantly perceived as the boundary of the woman, while the orange contour appears as the color of the whole figure. This is the reason why we don't perceive a black woman. The spreading of color is under these conditions perceived filling the entire object amodally and independently from the fact that the orange contour is inside or outside the figure surrounded and shaped by the black contour. This is one of the many conditions (other conditions were described in **Figure 7**) of amodal coloration that we call "amodal wholeness of color." On the other hand, the "amodal wholeness of shape" is represented by Monet's self-portrait illustrated in **Figure 7A**, where the rough contours of the legs and hands fill the wholeness by completing amodally their sketchy shape.

By replacing the black contour with a purple one (**Figure 8C**), the phenomenal description is the same, i.e., an orange woman. The orange contour assumes again the role of color and the purple one the role of boundary. Once again, this differentiation of roles explains the fact that we perceive an orange woman and not a purple or a purple–orange woman. The vertical displacement of the two chromatic contours shows the same differentiation of roles (**Figure 8D**).

#### **PHENOMENAL LOGIC OF ASSIGNMENT OF THE ROLES OF BOUNDARY AND COLOR CONTOURS**

#### *Assignment of the roles under equiluminance*

When the two sets of contours are near equiluminance (**Figure 9A**), their roles as boundary or color contours appear unstable and easily reversible. What appears difficult to assign, in the first place, is the role of the boundary contour. Phenomenally, the differentiation of roles is addressed first and foremost to the boundary and then to the color. The boundary contour fixes the shape that has to be filled modally (in the case of the watercolor illusion) or amodally by the color contour. Pinna and Reeves (2006) suggested a similar sequential microgenesis in the formation of the watercolor illusion. This unstable alternation of role assignment gives the figure a global effect that differs from the one described for **Figure 8A**, where both sets of contours are black. In that case, there is no competition. Both contours are perceived as boundaries and neither of them is perceived as a color contour. The effect of blur and duplicity of the boundaries is related to the primary assignment of the boundary contour. In **Figure 9A**, the blurred effect is absent, but the competition of roles is plainly perceived. Since it becomes difficult to assign the role of boundary, the color contour cannot be assigned. Following the instability and reversibility of the boundary contour, the color contour role is also unstable and reversible.

It is sufficient to increase the luminance contrast of one of the two contours to favor the assignment of the boundary and color roles (**Figure 9B**). Phenomenally, the dark red contour pops up instantly as the boundary of the woman and immediately after the green contour becomes its color. This sequence of assignments was spontaneously reported by the observers.

If the luminance contrast is responsible for the boundary/color role assignment, then an unstable and reversible equilibrium between the two contours can be obtained by reversing their reciprocal contrast and by keeping their luminance at the same difference in relation to the background, as illustrated in **Figure 9C**. The phenomenal results are similar to those of **Figure 9A**: the unstable and reversible alternation of the boundary role and the suspended assignment of the role of the color contour. As a consequence, by reducing the luminance contrast of the background, the differentiation of roles emerges spontaneously (see **Figure 9D**): the boundary role is assigned to the contour with the highest contrast and the color role to the contour with the lowest contrast. Finally, by reversing the contrast of **Figure 8C**, as shown in **Figure 9E**, opposite roles are now perceived in the purple and orange contours: the orange contour appears as the boundary and the purple becomes the color contour.
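The logic of these manipulations can be sketched as a small decision function: when the two contours have nearly the same contrast against the background, the assignment is treated as reversible; otherwise the contour with the higher contrast takes the boundary role. This is a hedged illustration only. The contrast measure (absolute Weber contrast), the `tolerance` parameter, and all luminance values are assumptions introduced here and do not come from the experiments.

```python
def contrast(luminance: float, background: float) -> float:
    """Absolute Weber contrast against the background (assumed measure)."""
    return abs(luminance - background) / background


def boundary_role(lum_a: float, lum_b: float, background: float,
                  tolerance: float = 0.05) -> str:
    """Return 'a' or 'b' for the contour taking the boundary role, or 'reversible'.

    `tolerance` is a hypothetical margin below which the two contrasts are
    treated as equivalent, so that neither contour wins the boundary role.
    """
    c_a = contrast(lum_a, background)
    c_b = contrast(lum_b, background)
    if abs(c_a - c_b) <= tolerance:
        return "reversible"
    return "a" if c_a > c_b else "b"


# Hypothetical luminances (cd/m^2): contour a is darker and contour b is lighter
# than a mid-gray background by the same amount, loosely analogous to Figure 9C.
balanced = boundary_role(40.0, 80.0, background=60.0)

# Changing the background luminance breaks the balance, loosely analogous to
# Figure 9D: contour b now has the higher contrast.
unbalanced = boundary_role(40.0, 80.0, background=45.0)

print(balanced, unbalanced)
```

The point of the sketch is that the same pair of contours can yield either a reversible or a stable assignment depending only on the background, which is the manipulation described for **Figures 9C** and **9D**.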

#### *Assignment of the roles with multiple contours*

Assignments of boundary and color roles occur also when, instead of two, there are three displaced sets of contours. In **Figure 10A**, the three black contours show even stronger results than those described for **Figure 8A**: besides the blur and the triplication, a sense of repetition of the same shape is also perceived. Among the contours there is no competition to take on the role of boundary, but all three appear as such at the same time. By changing the color and the luminance of the two external contours (**Figure 10B**), the boundary and color roles are assigned immediately: the contour in the center pops up as the boundary of the woman, while the other two appear as a two-tone coloration.

By adding a further displaced contour, the amodal coloration now appears three-tone (**Figure 10C**) or as having the same color graded in brightness (**Figure 10D**). If any two of the sets of contours are black, the blur and duplication of the boundary contours of the woman emerge again (not illustrated).

#### *Assignment of the roles with polychromy*

From the instances illustrated in **Figure 10**, the polychromatic properties of an object emerge clearly: more than one contour can assume the role of color contour. While multiple colors can occur within the same object, multiple boundaries cannot (see **Figures 8A** and **10A**). This suggests the multiplicity of colors and the uniqueness of the boundary. What happens if the polychromy occurs along the same contour? In **Figure 11A**, a black contour and a displaced polychromatic one are perceived like a multicolored woman. The polychromatic contour with the lowest luminance contrast is perceived as the color contour. In **Figure 11B**, the chromatically homogeneous contour, with a luminance contrast lower than the one of **Figure 11A** is clearly perceived as the boundary of the woman, while the polychromatic contour assumes the color (multicolored) role.

In **Figure 11C**, a condition opposite to the one of **Figure 11B** is illustrated. Now, the contour with the lowest luminance contrast is the homogeneous one. The results show that the assignment of roles is less effective than in the previous condition. Both contours compete to appear as boundaries, and the strength of the boundary assignment to the contour with the highest luminance contrast is very low or absent compared with the one of **Figure 11B**. The global configuration in fact appears blurred and with a duplication of the boundary contours. A control is illustrated in **Figure 11D**, where both contours are polychromatic but with different luminance contrast. The effect of the luminance contrast in assigning the boundary role is now stronger than the one of **Figure 11C**, where polychromy and luminance contrast work against each other. However, the effect is weaker than the one illustrated in **Figure 11B**, where the chromatic homogeneity and luminance contrast are synergistic.

These results demonstrate that homogeneity (uniqueness) is a factor that plays a clear role in defining the boundary contour. More generally, polychromy can be part of the amodal coloration effect, rather than of the boundary formation and assignment, which requires an overall chromatic homogeneity and therefore a oneness and uniqueness. In fact, given that the boundaries tend to surround and induce a figure–ground separation with their unilateral belongingness, to obtain the best effect they need to be homogeneous and unique.

#### *Assignment of the roles with thickness variations*

The thickness of a contour also plays a role in assigning the boundary attribute to it. In **Figure 12A**, the thickness of the red contours of **Figure 9A** is increased. Under these conditions, the instability and reversibility of the boundary assignment, previously described for **Figure 9A**, is reduced or totally absent: the red contours assume quite easily the role of boundary. The opposite occurs when the thickness of the green contours is increased: the boundary/color role between the two contours is now exchanged (**Figure 12B**). Thickness and luminance contrast can compete, as shown in **Figure 12C**. Nevertheless, when the thickness exceeds a certain threshold (likely when it is no longer a contour but appears like a surface), it is perceived like a coloration attribute (**Figures 12D,E**).

In **Figure 13**, three drawings/paintings by Matisse (**Figures 13A,B**) and by Rouault (**Figure 13C**) are illustrated. They demonstrate that by increasing the thickness of the boundary contours, they appear less and less as such and assume other visual meanings, such as stripes (**Figure 13B**) or shades and shadings (**Figure 13C**).

# *Chromatic and achromatic attributes in boundary and color role assignments*

Several readers may have noticed that, all else being equal, the assignment of the role of boundary to achromatic and chromatic contours is not necessarily symmetrical: it can be more easily attributed to the achromatic ones, while the role of color is more easily attributed to the chromatic contour. To test this hypothesis, we can go back to **Figure 8B** and compare it with **Figure 14A**, where the relationship between chromatic/achromatic and high/low contrast is reversed on the same white background. Phenomenally, the luminance contrast appears as the most important factor in defining the boundary role. In spite of this plain result, a weakness of this kind of stimulus is that by increasing or decreasing the luminance contrast of the orange contour as in **Figure 8B**, the chromatic contour appears more and more achromatic (low saturation). Therefore, to demonstrate the asymmetrical role of chromatic and achromatic contours, we took another approach based on the spontaneous description of the simplest conditions.

**FIGURE 13 | Drawings/paintings by Matisse (A,B) and by Rouault (C).**

In **Figure 14B**, the subjects reported simply "a woman," but in **Figures 14C,D**, they stated "a light red or a green woman." These naïve descriptions suggest that, in the limiting case of one contour only, the chromatic contour tends to be perceived both as a boundary and a color contour, while the achromatic one is perceived only as a boundary contour. This hypothesis is supported by **Figure 14E**, where black and light red components alternate along the same contour. Under these conditions, the subjects reported seeing a light red woman. The tendency of the achromatic contour to take on the boundary role more effortlessly is supported even more by the results of **Figure 14F**, where, both contours being chromatic, "a blue and light red woman" is more easily perceived.

# **WHY DO WE SAY "A RED SQUARE" AND NOT "A SQUARE-SHAPED RED" OR "A SQUARE RED"?**

Further help in understanding the complex relationship between shape/boundary and color comes again from spontaneous descriptions. Malevich's "Red Square" (**Figure 15A**) is a clear demonstration of the distinction between shape/boundary and color. This simple description corroborates previous results according to which shape and boundaries are extracted before color: the noun is the square and not the color. The color describes and qualifies the main and first component, the noun, i.e., the square shape. To show this more clearly, we can try again to judge the phenomenal plausibility of the inverse description: a square-shaped red or a square red. Even if both descriptions (red square and square-shaped red) are logically equivalent, phenomenally they are totally different. The first is congruous and "real," whereas the square-shaped red is incongruous and perceived as odd or totally impossible.

We suggest that the order of the two object attributes when they are linguistically described is related to their perceptual organization. In other words, the formation of nouns and adjectives in the case of shape and color strongly depends on the microgenesis of their formation. The same argument can be used for Matisse's "blue woman" illustrated in **Figure 15B**. Under these conditions, it is much more difficult to formulate the inverse description (similarly to the square-shaped red), thus corroborating the previous argument about the organization of shape and color in sequential order.

# *When the color becomes boundary*

A further and stronger demonstration of the previous argument emerges by asking 6–7-year-old children to paint a red square and a square-shaped red. Children up to 9 years old consider the square-shaped red as normal and unexceptional. Only later does it appear odd. Nevertheless, even if the two descriptions are logically equivalent, they are depicted in different ways as shown in **Figures 16A–C**, where typical instances painted by the young subjects are illustrated. When asked to paint a red square, young children chose a black pastel, drew a square shape, then took a red pastel, and colored the square shape in red. In the case of the square-shaped red, they chose a red pastel, drew the square shape, and then with the same red (i) filled the inner area of the square or (ii) filled the square with a more delicate pressure of the pastel, creating a brighter red. These results are predominant up to 9 years of age.

These results demonstrate that the color attribute needs a boundary. Briefly, the boundary precedes the color. Moreover, shape and color are used like juxtaposed attributes of an object. The shape is something else, independent from the color property. Only after 9 years of age do children integrate the two attributes of the red shape in a single object, represented in **Figure 16D**, which is geometrically identical to the one of **Figure 16B** but phenomenally different, as we described previously. Besides, they consider the square-shaped red as odd. This is a further demonstration of the previous statement: boundary precedes color. A final observation concerns the clear tendency of the achromatic contours to assume the role of boundary, as demonstrated by the boundaries outlined with the black pastel.

It is worth noting that, from a logical point of view, the task "draw a red square" can also be interpreted as a red outline of a square (see **Figure 16D**). The same possible result can be expected for the square-shaped red. Very few subjects drew only the red outline. This is consistent with all the previous results and with the visual organization of shape and color: a red outline alone does not take into account the presence of two attributes within the linguistic description, shape and color. Therefore, if shape and color are independent and the shape precedes the color, then they cannot be joined and integrated in a red outline, at least not in the stage of the juxtaposition of properties. This result is very rare also during the stage of integration, i.e., after 9 years of age. This can be interpreted in terms of the phenomenal qualities of the color, which tends to fill the entire shape both modally, like in the watercolor illusion, and amodally, like in the variations of Matisse's Woman.

Although this kind of result is unusual, the main answer to the task "describe what you see in **Figure 16E**" is a red square and never a white square or a white square with a red outline. Similar results are obtained from the description of **Figure 16A**. None of the subjects reported seeing a red square with black boundaries, but only a red square. These results validate once again the visual shape and color organization previously described.

Finally, the results of this section demonstrate that boundary contours are not invisible even when they are invisible, in the sense that they pop out from a homogeneous colored shape, like Malevich's Red Square, both through the naive description (but also through what is not said) and the spontaneous drawings of young children. Therefore, they are not amodally invisible even when they are modally invisible. These implications can be extended to how artists paint their objects. They used in fact to start by outlining the boundary contours with a black pastel and then filling the traced shape with chromatic paste. This is normally the way children draw and paint. The fact that boundaries are visible also when they are invisible is what emerges in Altamira's and Lascaux's cave paintings (**Figures 17A,B**) or in Matisse's Pink Nude (**Figure 17C**). Examples of this kind are present in most of the history of art, and this is because of the visual organization of shape/boundary and color that we studied in this work.

# **DISCUSSION AND CONCLUSION**

In the previous sections we demonstrated new examples useful to understand the general problem of figure–ground segregation and, more particularly, the unilateral belongingness of the boundaries to the figure and the color/brightness induction within its surface. We showed that to split two regions into figure and background and, as a consequence, into complementary attributes (border ownership, depth segregation, and surface color), a single contour is sufficient, just as many artists do when they create their work starting from a contour or using only contours. In other terms, a contour contains information about shape, depth, and color.

In this work we focused our attention on shape and color by starting from the watercolor illusion with its juxtaposed contours that play the roles of boundary and color. Then, we introduced the notions of *modal* and *amodal* completion of color and suggested new conditions where the contours are not adjacent, like in the watercolor illusion, but displaced both horizontally and vertically and where they are perceived as boundary or color contours, not modally but amodally. By varying the main conditions of the displaced contours (luminance contrast, color, number, thickness, and chromatic homogeneity), we studied how these factors influenced the assignment of the roles of boundary and color to the displaced contours. Through these experiments our purpose was to answer the following basic questions: Where is information about shape and color mostly located? Are shape and color independent? Are they organized in sequential order or in parallel? Is there any phenomenal logic in their visual organization? How are they bound? How are shape and color used by visual artists to create objects and scenes? Is the way artists use shape and color related to the way we perceive them?

On the basis of the results described in the previous sections, the following answers emerged. (i) The information about shape and color is located along the contours. (ii) The contour with the highest luminance contrast is perceived as the boundary contour of the object. (iii) The contour with the lowest luminance contrast is perceived as the color contour of the object. (iv) If one contour takes on a role (e.g., boundary contour), then the other adjacent (watercolor illusion) or displaced (variations of Matisse's Woman) contour assumes a different role (e.g., color). (v) The boundary contour is confined or restricted to a tiny contour surrounding the perceived object to which it belongs, while the role of the color contour is not confined to its geometrical end but it spreads into and fills the entire object. (vi) The coloration can be either modal, like in the case of the watercolor illusion, or amodal (see also Pinna, 2008), like in the case of Picasso's Cock and in Matisse's variations. (vii) The two roles of boundary and color contours are extracted one after the other, in sequential order, i.e., one role can be defined only if and only after the other is defined. (viii) The color contour spreads and fills the entire object within its boundaries, therefore again the role of the boundary contour, representing the end of the spreading, is logically extracted before the color of the object. (ix) This suggests the idea of "microgenesis" according to which the object perception and creation takes time to develop; according to this hypothesis the roles of shape and color are extracted in sequential order and in the same order they are also used by artists to paint objects. (x) Each object tends to show and to be perceived as having a single boundary contour with the highest luminance contrast, even when it is not really present. (xi) While the boundary contour tends to be one, the color contour can be multiple: i.e., one boundary, many colors. (xii) Achromatic contours tend to assume the role of boundary. (xiii) Linguistic descriptions and object representations are related (by some kind of morphism that needs to be studied more deeply) to the visual organization of shape and color, i.e., semantic and syntactic organization like the order of adjectives and the tendency of some attributes to become adjectives and not nouns or *vice versa* (see the linguistic and phenomenal plausibility of "a red square" against the phenomenal oddness of "a square-shaped red"). Finally (xiv), because shape and color take time to be integrated, during the development of how the visual system leads the reproduction of perceptual and linguistic objects, the integration of visual attributes (stage of integration) can manifest an intermediate stage, before the full integration, where both attributes are simply juxtaposed (stage of juxtaposition).

Studies on visual processing of contour and shape, based on illusory contours (Shapley and Gordon, 1985; Dresp et al., 1990; Dresp, 1992) and on contrast/assimilation phenomena (Shapley and Reid, 1985), suggested that independent brain mechanisms are involved. More recent studies demonstrated that neurons in V2 respond differently to the same contrast border, on the basis of the side of the figure to which the border belongs (Zhou et al., 2000; Friedman et al., 2003; von der Heydt et al., 2003). This can be considered as a neural correlate of the unilateral belongingness of the boundaries. More particularly, figure–ground segregation is likely processed in areas V1 and V2 (Zhou et al., 2000; Friedman et al., 2003; von der Heydt et al., 2003), in inferotemporal cortex (Baylis and Driver, 2001) and the human lateral occipital complex (Kourtzi and Kanwisher, 2001). Furthermore, Zhou et al. (2000) reported that approximately half of the neurons in the early cortical areas are selective in coding the polarity of color contrast. The same correlate can be assumed to explain the figure–ground effect of the watercolor illusion (von der Heydt and Pierson, 2006). The specialized phenomenal roles of the juxtaposed and displaced contours in boundary and color, and the phenomenal logic of their organization, can shed light on how neurons become more and more specialized by firing to only one attribute and how they are then integrated, after a level of juxtaposition of attributes.

The main general principles, here suggested through novel conditions taken from vision and art, can be starting points to explore a new domain focused on the microgenesis of shape and color within the more general problem of object organization. In conclusion, integrated and multidisciplinary studies based on art and vision science are desirable because they can strongly contribute to the full understanding (even in terms of neural circuitry) of the basic and common problem of perceptual organization of shape and color.

### **ACKNOWLEDGMENTS**

Supported by Finanziamento della Regione Autonoma della Sardegna, ai sensi della L.R. 7 agosto 2007, n. 7, Fondo d'Ateneo (ex 60%) and Alexander von Humboldt Foundation.

# **REFERENCES**


and figure-ground effects in the watercolor illusion. *Spat. Vis.* 19, 323–340.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 May 2011; accepted: 07 September 2011; published online: 04 October 2011.*

*Citation: Pinna B (2011) The organization of shape and color in vision and art. Front. Hum. Neurosci. 5:104. doi: 10.3389/fnhum.2011.00104*

*Copyright © 2011 Pinna. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **SCULPTING THE BRAIN**

**Pablo Garcia-Lopez** *(artist perspective)*

# Sculpting the brain

# *Pablo Garcia-Lopez\**

*Rinehart School of Sculpture (MFA), Maryland Institute College of Art (MICA), Baltimore, MD, USA*

### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Todd L. Siler, Psi-Phi Communications, LLC (dba Think Like a Genius, LLC), USA*

#### *\*Correspondence:*

*Pablo Garcia-Lopez, Rinehart School of Sculpture (MFA), Maryland Institute College of Art (MICA), 1300 W Mount Royal Avenue Baltimore, MD 21217, USA. e-mail: caravaca1@gmail.com*

*Neuroculture,* conceived as the reciprocal interaction between neuroscience and different areas of human knowledge is influencing our lives under the prism of the latest neuroscientific discoveries. Simultaneously, neuroculture can create new models of thinking that can significantly impact neuroscientists' daily practice. Especially interesting is the interaction that takes place between neuroscience and the arts. This interaction takes place at different, infinite levels and contexts. I contextualize my work inside this neurocultural framework. Through my artwork, I try to give a more natural vision of the human brain, which could help to develop a more humanistic culture.

**Keywords: art, neuroscience, neuroculture, sculpture, metaphors, Cajal, mechanism, butterflies**

# **INTRODUCTION**

The development of neuroscience in the last century and in recent years has been very influential in many fields of knowledge such as economics, politics, law, philosophy, public relations, art, etc. Reciprocally, different disciplines of knowledge have also influenced the development of neuroscience and science as a whole, pointing out the important keystones of the neuroscientific agendas and also influencing scientific research from different sociocultural perspectives. This interwoven mix of different areas of knowledge has been essential to the rise of a *neuroculture* (Frazzetto and Anker, 2009) that is influencing our life under the prism of the latest neuroscientific discoveries. Furthermore, this *neuroculture* can have a significant impact on neuroscientists in their daily practice, creating new ways of thinking that will influence their research. Especially interesting is the interaction that takes place between neuroscience and the arts. This relationship takes place at diverse, infinite levels and contexts. Many artists have used discoveries, data, illustrations, paradigms or scientific methodologies guided by different goals, motifs, global and personal narratives, and mediums. Through their holistic artworks, they not only echo the latest scientific discoveries, but many times their works go beyond the nature and meaning of these discoveries, enriching them with their personal narratives, ambiguity or critical opinions of some aspects of neuroscientific research and opening a neurocultural dialog to a wider audience. Besides adding complexity, these artworks also add plasticity, subjectivity and intra-individual differences to the neuroscientific models. Personal experiences, many times hidden by the normative scientific method and other times highlighted by the subjectivity of artists, are essential to add reality, complexity, and plasticity to the neuroscientific models. That is why I would like to explain my position as an artist at the junction between neuroscience and art.

# **BRIDGING THE GAP BETWEEN NEUROSCIENCE AND ART THROUGH METAPHORS**

# **CAJAL NATURALIST METAPHORS**

My work as an artist is directly inspired by my experience as a neuroscientist. I completed my PhD in conjunction with the Museum Cajal, working with the original slides and scientific drawings of Santiago Ramon y Cajal (1852–1934). Besides being completely astonished by the historical and current neuroscientific concepts and the esthetics of his histological slides, drawings (Garcia-Lopez et al., 2010), articles, and books, I was impressed by the great abundance of metaphors that he employed in his scientific writings. Possibly even more impressive is the naturalistic and organic essence of Cajal's metaphors. Many of these metaphors could be considered rhetorical ornaments, although they also function as explanatory and even as heuristic tools for proposing his models and theories about brain functioning. As Lakoff and Johnson pointed out in their seminal book *Metaphors We Live By* (Lakoff and Johnson, 1980), metaphors are not just rhetorical figures of speech, but ways of thinking. Describing Cajal's approach to the brain, Sherrington wrote:

"A trait very noticeable in him was that in describing what the microscope showed he spoke habitually as though it were a living scene . . . The intense anthropomorphism of his descriptions of what the preparations showed was at first startling to accept. He treated the microscopic scene as though it were alive and were inhabited by beings which felt and did and hoped and tried even as we do. It was personification of natural forces as unlimited as that of Goethe's Faust, Part 2. A nerve-cell by its emergent fiber "groped to find another"! We must, if we would enter adequately into Cajal's thought in this field, suppose his entrance, through his microscope, into a world populated by tiny beings actuated by motives and strivings and satisfactions not very remotely different from our own. He would envisage the sperm-cells as activated by a sort of passionate urge in their rivalry for penetration into the ovum-cell. Listening to him I asked myself how far this capacity
for anthropomorphizing might not contribute to his success as an investigator."

(Canon, 1949; Freire and García-López, 2008)

The use of metaphors by scientists has been studied by many scholars to see the cultural and personal influences that model the scientific practice (Hyman, 1962; Young, 1985; Sontag, 1990; Todes, 1997, 2009; Otis, 2002). They are useful as heuristic tools, but they can also become dangerous traps and obstacles, which can lead to initial progress but later stagnation in science (Kuhn, 1962).

For Cajal, the neurons were the *"butterflies of the soul"* (Cajal, 1901), in his personal interpretation of the psyche's myth. He often named different morphological structures using naturalistic terms: star-cells of the cerebellum, claw endings of the granule cells, etc., and named different cells and cellular endings with plant names such as mossy fibers, climbing fibers, rosacea endings, and nest endings (Cajal, 1899–1904). He also related the development of neurons to plants when he successfully applied the ontogenic method to study the nervous system:

"Since the full grown forest turns out to be impenetrable and indefinable why not revert to the study of the young wood, in the nursery stage."

(Cajal, 1901)

Cajal related plants to neurons not only by their morphology and development, but also because of their physiology, advancing his theories about the plasticity of the nervous system:

"As opposed to the reticular theory, the theory of the free arborization of the cellular processes that are capable of developing seems not only the most likely, but also the most encouraging. A continuous pre-established net—like a lattice of telegraphic wires in which no new stations or new lines can be created—somehow rigid, immutable, incapable of being modified, goes against the concept that all we hold of the organ of thought that within certain limits, is malleable and capable of being perfected by means of well-directed mental gymnastics, above all during its period of development. If we did not fear making excessive comparisons, we would defend our idea by saying that the cerebral cortex is similar to a garden filled with innumerable trees, the pyramidal cells, which can multiply their branches thanks to intelligent cultivation, sending their roots deeper and producing more exquisite flowers and fruits every day."

(Cajal, 1894; see also DeFelipe, 2006)

Cajal's organic metaphors may reflect many of his personal life experiences. Being born in a village (Petilla de Aragon, Navarra in Spain), being a naturalist and being an artist were part of adolescent experiences that would later emerge in his scientific life inside the lab. Cajal's metaphors were also a product of his time. They reflect cultural, social, and personal narratives of the age in which they were created. Nowadays, these organic metaphors could be old-fashioned<sup>1</sup> or even dead neuroscientific metaphors, if we compare them to many of the current mechanistic terms employed by neuroscientists to refer to the brain (computational analogies, circuits, wires, cables, switching, firing, etc.) (**Figure 1**).

**FIGURE 1 | Der Mensch als Industriepalast (Man as Industrial Palace) (1926).** From Fritz Kahn (1888–1968). Chromolithograph. National Library of Medicine, Stuttgart.

# **ORGANIC METAPHORS VS. MECHANISTIC METAPHORS**

I do not want to give the impression that organic metaphors are more truthful, useful, or beautiful than the mechanistic ones. Mechanistic metaphors seem more objective than organic ones, but I believe comparing the brain to a computer has the same heuristic value as comparing the brain to a cauliflower. Depending on where you put the focus of your analysis, you will highlight or hide some important characteristics about the brain. Both systems of metaphors give us opposite, but complementary intellectual models, and both have their own esthetic beauty. For instance, it is interesting to note that the telegraph-nervous system model rejected by Cajal to explain the plasticity of the cerebral cortex was useful for Hodgkin and Huxley (1952) in their Nobel Prize-winning studies of nerve action potential generation and propagation. They used the differential equation that describes coaxial cable transmission (the spatiotemporal "Telegrapher's equation," which had been developed to model signal propagation for the design of the transatlantic undersea cable) (Daugman, 2001). Reciprocally, the use of brain–computer analogies has been very useful for the development of new technologies and important scientific fields like cybernetics and artificial intelligence (AI) research. These new technologies have also enhanced the development of neuroscience.

<sup>1</sup>Cajal terminology is indeed still in use as well as many other organic terms such as "dendritic tree" or "synaptic pruning", but the current trend is to employ more mechanistic terminology.

Although, from a pragmatic point of view, the mechanistic metaphors can be more useful for scientists to continue their research about the brain, I find them negative as neurocultural products because they help to create a mechanical, deterministic, and reductionist vision of the human being. They hide some essential characteristics about the brain (natural origin, plasticity, self-organization, self-consciousness, emotional behavior, etc.). The vision of the nervous system that neuroculture creates is essential to envisioning ourselves and developing our life projects. From an educational perspective, I find more value in turning to another famous art-related metaphor of Cajal (1901) that envisions us as self-builders of our projects:

"Every man if he so desires becomes sculptor of his own brain."<sup>2</sup>

Interestingly, in a sort of unconscious echo of this metaphor, the conceptual artist Jonathon Keats put his brain, as well as its original thoughts, up for sale. He registered a copyright of his brain as a sculpture created by him through the act of thinking. According to an interview with the BBC, he wanted to attain temporary immortality, on the grounds that the copyright act would give him intellectual rights on his mind for a period of 70 years after his death. He reasoned that, if he licensed out those rights, he would fulfill the "Cogito ergo sum" *("I think, therefore I am")*, paradoxically surviving himself by seven decades. He then facilitated the sale by producing an exhibition and catalog at the San Francisco Modernism Gallery. The artwork consists of MRI images of his brain activity as he thought about art, beauty, love, and death (see also Frazzetto and Anker, 2009).

# **THE SUCCESS OF MECHANICAL METAPHORS**

Using mechanistic models is not a new procedure, and it is inscribed in a long philosophical and scientific tradition (Descartes, 1664; La Mettrie, 1748) that has usually equated the brain-mind-nervous system to the latest technological innovation in every generation: the catapult by the Greeks (Searle, 1984), the telegraph (Du Bois-Reymond's idea, presented in a public lecture held in 1851; reviewed in Otis, 2001), the jacquard loom (Sherrington, 1942), the telephone switchboard, the computer (Von Neumann, 1958).

Philosophical mechanism has been essential in rejecting the "élan vital" of vitalist philosophy. Once the vital sparks, energies, and spirits had been eliminated, mechanistic science became the new religion, with its "objective" metaphors. Some disciplines, such as cybernetics, AI research, and radical behaviorism, have especially promoted mechanistic terms during the last century. It is still impressive how Skinner (1971), in the first chapter, "A Technology of Behavior," of his book *Beyond Freedom and Dignity*, tries to escape from the anthropomorphic metaphors of psychoanalysis and begins to deploy his battery of mechanistic metaphors. In an era of mechanical objectivity, radical behaviorists found the best place to eliminate any kind of subjectivity from the human mind. Simply put, free will was considered an illusion. It was not until the visualization of the brain in action with new imaging techniques, and the parallel development of cognitive neuroscience, that the inner cognitive processes of the mind/brain again became objective.

The behaviorist approaches were easily accepted and permeated many levels of society and educational systems. They resonated so strongly with human culture because we had already been transformed into machines. The technology of behavior has been in use in every society since ancient times: from the classical system of punishment and reward in education, religion, etc., to the more subtle strategies used today in social engineering. It facilitated the phenomenon of socialization and education, despite also being at the root of many antihumanistic positions that promoted the use of man as a medium or machine. During the process of socialization, we were programmed to become cultural machines. The great achievement of radical behaviorists, mechanistic biologists, and some cybernetic approaches was to make us believe that even our nature was only mechanical. Through the abuse of mechanistic terms and analogies to refer to our body, brain, and physiological processes, we were transformed into cultural cyborgs.

Besides this conceptual and partial transformation of man into machine, we have also witnessed deeper changes in scientific practice, from Cajal's laboratory, where he usually worked alone, to laboratories that were envisioned as authentic factories<sup>3</sup>. Nowadays, science is one of the most important economic activities. Because of its economic importance, the great competition, the race to arrive first at the new discovery, and many other reasons, many labs have become factories of science production. Depending on many factors, such as the educational system, the country, and the team's principal investigator, these factory lab models more or less reject the development of scientific creativity and originality in order to form robotic scientists with a high degree of specialization who produce very ambitious science projects that require a lot of hard, mechanical daily work but offer very little creative reward, especially for young scientists.

<sup>2</sup>Metaphors provide ambiguous models of thinking. I interpret this sentence in the following way: it does not mean that you can make whatever you want with your brain (there are physical and material limitations to building a sculpture). It also does not reject the notion that education is essential to modulate your brain. But once you have that material, which you have not consciously chosen, it is your own responsibility as a human to become a self-creator, through creativity and originality. This practice will allow us to become more human and not programmed machines.

<sup>3</sup>In Cajal's time, there were also other labs managed as factories, like Pavlov's lab (Todes, 1997).

But of course, the transformations of society and culture of the last centuries did not only take place in science education systems and scientific labs. The art studio was also transformed. Many machines and technologies came into wider use by artists, though art has always been linked to technology. Some art studios were transformed into art factories as soon as art became such an important socioeconomic industry. During the last century, the number of assistants in art studios has increased, thereby transforming many studios into companies. The terminology of mechanical objectivity also affected the language of artists. I was surprised by how some artists refer to themselves as *object makers* in order to highlight their craftsmanship. Of course, a painting or a sculpture is an object, but is it only that? Naming them only as objects removes any kind of *spiritual* value from the work; it focuses only on the objective properties of the object. But what about the other characteristics of the artwork, such as the effort of the artist, the intention, the narrative, its symbolic meaning, etc.?

# **THE METAMORPHOSIS OF THE MACHINE INTO A BUTTERFLY**

I am also a mechanical product of this mechanistic culture and society. A mechanical product enhanced by science. I had the intuition that I had transformed myself into a machine while I was completing my bachelor's degree in molecular biology. Before finishing, I realized I did not want to become a scientist. Because of the excessive theoretical approach, dogmatism, memorization of data, and lack of experimentation inside the lab, I did not develop my scientific creativity and originality<sup>4</sup>. I have always considered myself a very creative person and, furthermore, a person who needs to be creative to be happy. Although I did not develop my creativity as a scientist, as compensation for this excessive mechanization, my artistic creativity was enhanced. I had always made art at home, but it was not until this progressive mechanization that I started to feel the imperious necessity of creating art. This creativity and altered sensibility also pointed to the necessity of expressing myself. Only very creative scientists can express themselves through their science, as we have seen in the case of Cajal.

It was during a visit to the Venice Biennale (2003), during the last year of my bachelor's degree, that I realized I wanted to mix science and art. There was an installation by the Israeli artist Michal Rovner, entitled 'Against Order? Against Disorder?', at the Israel Pavilion that synthesized the main ideas of molecular and evolutionary genetics and social engineering in a very pleasant and instantaneous way. All the chromosomes, molecular cascades, cellular cultures, population genetics, eugenics, etc., resonated in a single image. The main ideas I had arrived at by memorizing a great quantity of biochemical cascades, signaling pathways, etc., were already synthesized in a single image through a superposition of different visual metaphors. Furthermore, these ideas were amplified in very different and ambiguous ways that multiplied the number of meanings and interpretations. To work at the interface between science and art, I decided to complete my PhD in Neuroscience. I wanted to learn more about science to have a better approach to the interaction between science and art. I was lucky to find the Museum Cajal, and besides obtaining my PhD, I obtained the perfect link between my personal narrative and my global one.

<sup>4</sup>There are educational systems that do not promote scientific creativity and originality. Creativity cannot be taught, but it can be guided and stimulated. Some of these systems, with the excuse of being objective, reject creativity, imagination, and originality, and consider them attributes of humanism. This is the first step toward becoming a mechanical scientist.

Because of my Cajalian influence, I have been working with organic or naturalistic metaphors with a special goal in mind: I would like to enhance the public vision of the brain as a natural organ rather than as a mechanical and cybernetic one. It is a romantic yet lost battle to renaturalize the public perception of the brain through my artwork, but it is still worthwhile. As an artist, I started to work with some of the Cajalian metaphors, such as "the neurons as butterflies of the soul" (**Figure 2**)<sup>5</sup> or the "cortical garden" (see section 2.1) (**Figure 3**). That is also one of the reasons I usually work with silk (the product of the cocoons of the *neurons-butterflies*) (**Figure 4**), a very fragile yet resistant and plastic material related to the butterfly's metamorphosis or neuronal plasticity. Interestingly, silk has recently been used as a scaffold for neuronal grafts, regeneration, and remyelination in the peripheral nervous system (Allmeling et al., 2008; Radtke et al., 2011).

These naturalistic metaphors are the starting point of my artwork. It is through the use of these concepts, intuitions, personal experiences, materials, mediums, and different methods (very much influenced by scientific experimentation) that I try to make my artwork. In the case of the sculpture "Silk explosion or how to destroy 10<sup>6</sup> cocoons that will never become butterflies" (**Figure 4**), besides using silk, I also used a technical approach reminiscent of my microscopic observations. In this sculpture, light plays an essential role in catching the attention of the audience. The light is filtered through the shrinkfast transparent plastic scaffold and the silk, producing textures similar to those of histological stainings such as the Golgi method (**Figure 5**).

<sup>5</sup>"Like the entomologist in search of colorful butterflies, my attention has chased in the gardens of the grey matter cells with delicate and elegant shapes, the mysterious butterflies of the soul, whose beating of wings may one day reveal to us the secrets of the mind."

**FIGURE 3 | The cortical garden (2009).** Photo video installation: digital print on velvet paper and video (Dimension variable).

**FIGURE 4 | Silk explosion or how to destroy 10<sup>6</sup> cocoons that will never become butterflies (2011).** Silk, MDF, electrical conduit, CFL lightbulbs, plexiglass, shrink fast plastic. 6 × 6 × 3 m. Middendorf Gallery (Baltimore).

The global naturalistic narrative I have explained overlaps with my personal narrative, which becomes the *oil or petrol* of my *artistic machinery*. Of course, art is a process of research and experimentation in which I often become detached from my initial narrative, only to arrive at a new place that becomes the real artwork. It is a sort of paradigm shift, in scientific terms. This is important not only for my art practice but also for my life philosophy.

"When the perception you have from yourself does not fit with who you really are, and you develop a mechanical behavior, is the moment to look for a new experience to alter your state."<sup>6</sup>

Step by step, I have abandoned this first theoretical and metaphorical approach for a more experimental one. At the present time, I do not think too much in logical linear terms while I am working. I try to be free of prejudices and theoretical premises, and follow my intuition guided by the experimentation in the studio. At the end of the day, I have an empiric personality, as beautifully sung by Battiato, and life is an experiment. My practice of sculpture is not only an exterior project, but an inherent work in progress. As Cajal once said and Jonathan Keats updated, I am shaping and reshaping another important sculpture that I hope will never cease.

<sup>6</sup>Free translation of the song Personalita Empirica by Franco Battiato, with lyrics by Manlio Sgalambro.

**FIGURE 5 | Details of the "Silk explosion or how to destroy 10<sup>6</sup> cocoons that will never become butterflies" (2011).** Silk, MDF, electrical conduit, CFL lightbulbs, plexiglass, shrink fast plastic. 6 × 6 × 3 m. Middendorf Gallery (Baltimore).

The brain is wider than the sky, For, put them side by side, The one the other will contain With ease, and you beside.

The brain is deeper than the sea For hold them, blue to blue The one the other will absorb, As sponges, buckets do.

The brain is just the weight of God, For, lift them, pound for pound And they will differ, if they do As syllable from sound.

Emily Dickinson (ca. 1860, published in 1921)

**ACKNOWLEDGMENTS**

I want to deeply thank John Peacock, Daniel Todes, Virginia Garcia-Marin, Robert Merrill, Jennifer Coster and Yasmeen Afzal for their critical and helpful comments on the manuscript.

# **REFERENCES**

Lakoff, G., and Johnson, M. (1980). *Metaphors We Live By*. Chicago, IL: University of Chicago Press.

Todes, D. (1997). Pavlov's physiology factory. *Isis* 88, 205–246.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 May 2011; accepted: 18 January 2012; published online: 09 February 2012.*

*Citation: Garcia-Lopez P (2012) Sculpting the brain. Front. Hum. Neurosci. 6:5. doi: 10.3389/fnhum.2012.00005*

*Copyright © 2012 Garcia-Lopez. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **ARTISTS AND THE MIND IN THE 21ST CENTURY**

# **Geoffrey Koetsch** *(artist perspective)*

**REVIEW ARTICLE** published: 02 November 2011 doi: 10.3389/fnhum.2011.00110

# Artists and the mind in the 21st century

# *Geoffrey Koetsch\**

*Fine Arts Department, The Art Institute of Boston at Lesley University, Boston, MA, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Alain Dagher, Montreal Neurological Institute and Hospital, Canada Pablo Garcia-Lopez, Maryland Institute College of Art, USA*

#### *\*Correspondence:*

*Geoffrey Koetsch, Fine Arts Department, The Art Institute of Boston at Lesley University, 700 Beacon Street, Boston, MA 02115, USA.*

*e-mail: gkoetsch@gmail.com*

In 2008, Lesley University Professors Geoffrey Koetsch and Ellen Schön conducted an informal survey of New England artists to ascertain the degree to which recent work in neuroscience had impacted the visual arts. The two curators mounted an exhibition (MIND*matters* May-June, 2008) at the Laconia Gallery in Boston in which they showcased the work of artists who had chosen mental processes as their primary subject. These artists were reacting to the new vision of the mind revealed by science; their inquiry was subjective, sensory, and existential, not empirical. They approached consciousness from several vantage points. Some of the artists had had personal experience with pathologies of the brain such as dementia or cancer and were puzzling out the phenomenon consuming the mind of a loved one. They looked to neuroscience for clarity and understanding. Some artists were personally involved with new techniques of cognitive psychotherapy. Others were inspired by the sheer physical beauty of the brain as revealed by new imaging technologies. Two of the artists explored the links between meditation, mindfulness practice and neuroscience. Issues such as the "boundary" and "binding" problems were approached, as well as the challenge of creating visual metaphors for neural processes. One artist visualized the increasing transparency of the body as researchers introduce more and more invasive technologies.

#### **Keywords: art, brain, mind, neuroscience, interdisciplinary**

This study draws on the work of eight New England artists to show significant change in the way contemporary artists visualize the mind and to demonstrate that the change is due to the intellectual atmosphere created by the cognitive science revolution of recent decades. The artists are not working from a scientific agenda. Artists work intuitively with metaphor and react on an intuitive level to internal and external existential phenomena. But the influence of science is pervasive in contemporary life. What the study shows is that a new vision of the mind is replacing the ones that dominated the art of the 20th century.

In the first half of the century, classical psychoanalysis dominated the artist's view of the mind. The links between the founder of Surrealism, Andre Breton, and the theories of Freud are well documented (Nadeau, 1967). In the latter half of the 20th century, artists focused attention on such things as transpersonal experience, chemically induced visions, and the mind/body/spirit connection (Grey, 1998). These tendencies persist today, but now they are blended with imagery inspired by recent neuroscience.

In this article I use the term "cognitive science" to signify the interdisciplinary field that attempts to integrate a broad range of mind-centered research that includes neuroscience, evolutionary biology, cognitive psychology, computation theory, and medical technology. The artists in our study approached the mind from a variety of perspectives. Constance Jacobson<sup>1</sup> finds interest in brain structure; Audrey Goldstein's<sup>2</sup> focus is on patterns of activity. Heidi Whitman<sup>3</sup> creates "alternative" brain maps that visualize the *contents* of mental activity, simultaneously with its *location* in space. Both Ellen Schön<sup>4</sup> and Constance Jacobson address issues of brain pathologies. Geoffrey Koetsch<sup>5</sup> illustrates the new "porosity" of the body as the brain's protective shell is penetrated by increasingly sophisticated research instruments. Denise Dumas<sup>6</sup> confronts the mystery of how the brain maintains the concept of a self isolated in space, the so-called "binding" and "boundary" problems.

# **STRUCTURE**

Several of the artists in our survey focus on neural structure, the brain as an organic electrochemical system. In their writing and public statements they frequently use terms such as "neural networks," "nodes," and "brain slices."

Constance Jacobson has a long-term interest in microbiology as a source of imagery. She translates structures revealed by micro photography into poetic, personal statements. In her *Tome Series* (**Figure 1**) axial slices of the brain inspire watercolors that evoke the brain's network of neurons that connect at trillions of points to form what scientists refer to as the "neuron forest." According to Jacobson, the drawings are not concerned with strict biological verisimilitude, but do reference cellular morphologies and communities.

The branching structure is one of the fundamental structural forms in nature, common to trees, neurons, and lightning. Sculptor Geoffrey Koetsch studied the branching structure of mangrove trees during canoe trips through the swamps of Sanibel Island. In a series of analytical drawings, he systematized the mangrove root system, then built an aluminum model and integrated it with a human figure to create *Node* (**Figures 2** and **3**), a work that symbolically links the macro- and micro-biological worlds, root systems, lightning, and brain cells.

<sup>1</sup>www.constancejacobson.com

<sup>2</sup>www.audreygoldstein.com

<sup>3</sup>www.heidiwhitman.com

<sup>4</sup>www.ellenschon.com

<sup>5</sup>www.koetsch.com

<sup>6</sup>www.dumasstudio.com

**FIGURE 2 | Node.**

Interdisciplinary artist Audrey Goldstein directs attention to *patterns of brain activity* rather than to the physical structure of the brain. To use the familiar computer analogy, the physical structure of the brain is comparable to hardware and mental patterns of activity (such as language) to software. The physical brain is *architecture*, but computation, the manipulation of data, is brain *work*, as specialized networks of neurons are dedicated to specific tasks.

Goldstein's piece entitled *Point to Point* (**Figure 4**) uses sculpture and video to visualize mental patterns derived from social activity. As people move through the world meeting friends and engaging in events, they create unique patterns of activity. These patterns are written in the neural network. Starting with a series of drawings Goldstein charted her social network: people were points and the relationships between them were connecting lines. Goldstein then derived a three-dimensional model from these drawings that served as a spatial metaphor for the encoded neuronal activity patterns etched in the brain. Next she attached this "brain activity model" to a backpack and carried it through her daily rounds, the portable brain guiding her movements. If the metaphor were to be extended, any new activities she undertook would have to be added to the model. Her "walks" were videotaped to add the time element to the project and provide the link between thought and action.

Mary Kaye's sculpture *The Spirit Builds the Body for Itself (to Goethe)* (**Figure 5**) is an abstract model of another kind of mental pattern: the process of logical thinking. It is a visual "thought" about the creation of the universe, a dualistic vision in which a transparent cone penetrates a wire web. Kaye borrows from the language of Constructivism to make a three-dimensional linear diagram of the thinking process, holding in tension both circular and dualistic thought patterns. The sphere of wires could be taken as a symbol either of the birth of the physical universe or of circular mental conundrums such as the philosophical debate over the precedence of essence or existence (as indicated in the title of the piece, Kaye sides with Goethe and comes down on the side of spirit).

Kaye was a student of philosophy at Harvard; consequently, her work reflects a foundation of deep skepticism. She doubts the mind's capacity to know anything of reality, especially when sensory input has been encoded in language. Kaye refers often to the division of functions between the left brain (language, logic) and the right brain (emotion, space perception) and emphatically shows right brain bias. She believes that any verbal assertions we make about reality are suspect in that they tell us only about our own habits of mind and nothing about the nature of reality itself, which remains a mystery. "This is why I'm a visual artist," says Kaye. "It seems a much more efficient way of thinking than all these words."

**FIGURE 4 | Point to Point.**

Buckminster Fuller once said, "Everything you've learned as 'obvious' becomes less and less obvious as you begin to study the universe. For example, there are no solids in the universe. There's not even the suggestion of a solid. There are no absolute continuums. There are no surfaces. There are no straight lines." (Quoted in Pinker, 1997, p. 332). Mary Kaye's skepticism extends even to the reality of what she has created. According to Kaye, "Pinning down the essential physical piece is impossible – changing light changes it dramatically and essentially, unless you believe that its foundation level reality is the metal. But if you do think that, WHY do you think it? Your answer will reveal your basic assumptions about what is and what isn't real...is the essential sculptural material emptiness, which the metal allows us to see, or light, which is neither physical nor not physical according to particle physicists, or is it the metal? The god Hephaestus would vote for the metal but remember he was a clunky god..."

**FIGURE 5 | The Spirit Builds the Body for Itself.**

To put Kaye's remarks into a broader context, I offer two views from the sciences. Cognitive psychologist George Miller wrote, "The crowning intellectual achievement of the brain is the real world...All fundamental aspects of the real world of our experience are adaptive interpretations of the really real world of physics." (Quoted in Pinker, 1997, p. 332). And what of the spirit? For many scientists, spirit (or soul) is just a particular sort of brain activity. Steven Pinker summed it up: "The supposedly immaterial soul, we now know, can be bisected with a knife, altered by chemicals, started or stopped by electricity, and extinguished by a sharp blow or by insufficient oxygen." (Pinker, 1997).

# **BRAIN MAPPING**

A number of contemporary artists have taken an interest in brain mapping, the neuro-imaging technology that enables scientists to pinpoint areas of the brain that process specific functions. In Ellen Schön's *Helmet Series* (**Figure 6**), for example, two of the works (*Crater, Porosity*) pinpoint with color the exact location of her husband's brain tumor. For Schön, this mapping is a source of anxiety.

**FIGURE 6 | Helmet Series.**

She writes: "Brain surgeons can now pinpoint their surgery – what to cut, what to leave intact. In the physical tangle of neurons and synapses, where does the soul, the essence of individual personality, reside?" Constance Jacobson, in the *Tome Series,* puts dark patches on her brain slices to locate the protein plaques and tangles of dead cells that starve and kill the neurons in the brain of an Alzheimer's patient.

The painter Heidi Whitman has made a signature style of brain mapping. The drawings and paintings in her *Brain Terrain* series (**Figure 7**) comprise, in the artist's words, "alternative brain maps that chart mental activity in metaphoric, specific, and sometimes narrative ways." She refers to her paintings as "wrong maps" since they are not derived from neuro-imaging technology but are an entirely invented terrain. Whitman's work gives us a more comprehensive view of the mind than is possible with brain scans because the artworks visualize the *contents* of the mind as well as their location in the brain.

In Whitman's paintings we see the simultaneity of the mind's work. One day, when she was at work in the studio, she was bombarded by radio news about the Iraq war; military tanks appeared in her painting and nestled in beside raindrops, trees, dollar bills, continents, and galaxies. She shows us how experience is translated into thought and how memories are layered. "Cartography and abstraction are two languages used in my work. World events, anatomy, architecture, and nature play parts in these metaphors." The brain is passive in these works: there is no hierarchy, no spotlight of attention, no red flags. Mental elements float through, hovering over an abyss of mental space that is alive with arching waves of energy.

### **NEURAL BUDDHISTS**

In a New York Times op-ed piece titled "Neural Buddhists," David Brooks commented on how a scientific revolution can change public culture (Brooks, 2008). He said that just as the work of Darwin and Einstein transformed culture, "so the revolution in neuroscience is having an effect on how people see the world." He noted a change in science away from hard-core materialism and the view of the brain as a cold machine. "Instead, meaning, belief, and consciousness seem to emerge mysteriously from idiosyncratic networks of neural firings."

**FIGURE 7 | Brain Terrain.**

Empirical science seems to be strengthening arguments for the existence of human universals. People all over the world have similar deep instincts for fairness, empathy, and attachment. Some evolutionary biologists see emotions as genetic imperatives and claim that emotions and beliefs are indispensable to functional utility. Brooks sees new respect from scientists for elevated spiritual states and says, "This new wave of research will...lead into what you might call neural Buddhism." He foresees a new challenge to many organized religions and concludes, "the real challenge is going to come from people who feel the existence of the sacred, but who think that particular religions are just cultural artifacts built on top of universal traits. It's going to come from scientists whose beliefs overlap a bit with Buddhism. In unexpected ways science and mysticism are joining hands and reinforcing each other." Brooks cites studies by Andrew Newberg of the University of Pennsylvania that show that transcendent experience can be identified and measured in the brain as a decrease in activity in the parietal lobe, which orients us in space (Brooks, 2008).

Theravada Buddhism mixes well with modern neuroscience. This branch of Buddhism is empirical and does not engage in metaphysical speculation; there is no external power, no God in judgment. The historical Buddha was anti-authoritarian in matters of belief and taught spiritual self-reliance; he said we are not sinful by nature but ignorant; we should seek knowledge, not faith; there are universal moral laws but we must see them for ourselves; being is an aggregate of sensations and perceptions rising from matter to produce "mental formations" such as the self and the ego. For Buddhists everything is energy in motion and change is the only constant; nothing remains the same for two consecutive moments. "Every moment you are born, decay, and die," the Buddha said. Mind is not opposed to matter; it is an organ like the eye or the ear that can be controlled and developed; there is no soul, no ghost in the machine (Rahula, 1974).

For some artists, interest in the mind begins with Buddhism. Audrey Goldstein practices Tibetan Buddhism. Sculptor Geoffrey Koetsch practices Yoga and Zen. In the late 1980s Koetsch introduced the postures of Yoga into his work. They became a leitmotif, an archetype of mental and physical discipline, the union of body and mind. Meditation is a technique for the empirical observation of the mind.

In his reading of the literature of neuroscience, Koetsch found concordance with Buddhist literature and his own direct experience: the mind is the activity of the brain, not a fixed entity but a dynamic process of relationships. The mind constructs an adaptive pattern called "self" or "ego" that is oriented in physical space and that must be put aside or suspended in order to reach deeper insights.

Koetsch's work entitled *Node* explores consciousness without eliminating the gross physical body. He takes the holistic position that the brain must be approached in connection with the body: the body is the brain's interface with the environment that supplies its contents. Cognitive scientist Steven Pinker writes: "...of course the world does have surfaces and chairs and bodies, knots and patterns and vortices of matter and energy that obey their own laws and ripple through the sector of space-time in which we spend our days." (Pinker, 1997). Koetsch represents the body as the node at the intersection of the macro- and micro-biological worlds, space and time. The figure in *Node,* a transparent hollow shell, is a twofold symbol pointing both to the Zen void, "emptiness," and to the increasing porosity of the body, whose boundaries have been exposed by science as illusory and invaded by medical technology with prosthetic devices, scanners, fiber-optic cameras, and EEGs.

A "mental environment" envelops the static sculptural components of *Node.* Digital displays on the floor and projections on the walls provide an element of time, suggest electrical energy, and simulate a barrage of neural impulses coming from multiple sources. Continuously changing images on the monitors show various categories of neural input: sensory stimuli, unconscious impulses, and memories. For example, erotic desire is represented by clips from the film *Un Chien Andalou* by the Surrealists Luis Bunuel and Salvador Dali, and sensory stimuli by images of 17th century allegorical paintings of the five senses. Projected on the wall are films of lightning and a stroboscopically illuminated abstract model, created by collaborator Rob Saulnier, which suggests neural pathways, junctions, and foci of attention.

# **IDENTITY**

The central mystery of the mind is how consciousness arises from matter. The philosopher Colin McGinn, in his book *The Mysterious Flame,* imagines a conversation between two extraterrestrials, one of whom has just returned from a mission to earth and is trying to explain humans to a colleague:

"They're meat all the way through." "No brain?" "Oh, there is a brain all right. It's just that the brain is made out of meat." "So...what does the thinking?" "You're not understanding, are you. The brain does the thinking." "Thinking meat? You're asking me to believe in thinking meat?" "Yes. Thinking meat! Conscious meat! Loving meat, dreaming meat. The meat is the whole deal! Are you getting the picture?" (McGinn, 1999).

In her sculpture *Three is Company* (**Figures 8** and **9**), Denise Dumas gives us both meat and thought. In a profound exploration of consciousness, Dumas puts three bio-morphic sculptural elements (labeled "Me," "You," and "It") in a "house" resembling a laboratory apparatus. Two-way mirrors superimpose and double expose the elements to suggest a mental activity: the meat constructs an ego (me) and an "other" (you, it), which, by means of the two-way mirrors, simultaneously see each other and see themselves reflected in the other. *Three is Company* shows how mind, arising from matter, constructs self, a fragile product that, in Dumas' words, "changes depending on where or with who we are."

According to Dumas, the work shows how identity is not fixed but supple, capable of redefinition and reinvention, especially in the face of radical displacement of environment, language, and culture. In neuroscience this is called the "boundary problem," the daily challenge we face to maintain a stable and coherent sense of identity. Referring to the malleability of self, Steven Pinker states, "Minds are probably easier to revamp than bodies because software is easier to modify than hardware." (Pinker, 1997).

Dumas' work is not a *celebration* of identity, whether cultural, national, or gender. The three pieces of meat labeled "me," "you," and "it" are nearly identical specimens of biological standard equipment. The identity that arises from this equipment is composed of a unique collection of memories and desires conditioned by embryological and biographical history. The self is not a fixed entity but a dynamic process of relationships.

**FIGURE 8 | Three is Company.**

# **PATHOLOGIES**

For two of the artists in this survey, interest in the brain was stimulated by contact with disease. In 2006, Ellen Schön's husband was diagnosed as having brain cancer, a mixed glioma, in his frontal lobe. Her contact with neuroscience was through the various diagnostic techniques and surgical procedures employed by her husband's team of neurosurgeon, neurologist, and oncologist. Schön's frank, complex, and deeply personal ceramic series titled *Skullcap/Helmet* spans autobiography, brain science, spirituality, and esthetics. As autobiography it is the personal history of her experience with her husband's cancer: the painful symptoms, the anxiety of diagnosis, surgery, and post-operative stress, all accompanied by a deep spiritual search and emotional upheaval. The word "helmet" in the title refers to her husband's business as a bicycle manufacturer's representative. Schön's experience with brain imaging technology is reflected in her knowledge of brain anatomy and her concern with the precise location of the tumor.

The skull is the brain's helmet, protecting it from outside impact. But it is useless against inside attack and becomes a barrier to healers. The two helmets subtitled *Crossing the Corpus Callosum* and *Zipped Up* represent the violence of opening the skull (the vessel of the soul) and its restoration to wholeness. The spiritual dimension of the piece is underscored by the helmet called *Labyrinth,* and by numerous references made by the artist to the use of the skullcap as a liturgical object in Vajrayana Buddhism. The most benign ritual use of the skullcap is as a begging bowl or food bowl, a monk's constant reminder of death and impermanence.

**FIGURE 10 | The Storm Within Us.**

As was mentioned in the beginning of this essay, Constance Jacobson began work on her *Tome Series* in response to a family history of Alzheimer's disease. In Jacobson's drawings the scattered patches of dark matter in her brain slices represent the abnormal clusters of sticky protein that build up between the nerve cells in the brain of an Alzheimer's patient, blocking the supply of nutrients to the brain cells, which eventually die. The cortex shrivels up, resulting in progressive loss of memory and sense of identity (Alzheimer's Association, 2008). Since there is a genetic basis for Alzheimer's, this is a cause of anxiety for Jacobson: the fear of loss of self as memories fade. In her *Grey Matter Series* she explores "fading memories, thoughts reappearing and trying to connect to others." She overlays lotus leaves on the brain imagery, imposing visual simplicity and metaphorically healing the unruly and complicated mind.

# **KALEIDOSCOPE**

Filmmaker Karl Nussbaum<sup>7</sup> views the mind from an omniscient point of view. In his film titled "The Storm Within Us," a third eye hovers restlessly over the mindscape, simultaneously looking at it from within and without (**Figure 10**). We are shown a mind practicing non-attachment, the technique in meditation of stilling the mind, assuming a third-person perspective, and watching thoughts and perceptions pass through uncensored, observing all of them but not attaching to any. This practice is the opposite of what the mind usually does, which is to zero in on whichever of the thousands of memories, thoughts, and perceptions clamors loudest and then select an appropriate response.

<sup>7</sup>www.karlnussbaum.com

Nussbaum works out of neuro-linguistic programming (NLP), an offshoot of cognitive psychology. It is a therapy not for curing mental disorders, but for enhancing human potential. An alternative to Freudian analysis, its focus is on recognizing patterns of communication, behavior, and relationships, analyzing the pattern of one's own behavior, and then learning to remodel it in order to better attain desired outcomes (Grinder and Bandler, 1983). *Reframing,* the title of one of Nussbaum's films, is an allusion to a specific therapeutic technique used in NLP in which an element of communication (video in this case) is presented so as to transform an individual's framing (perception of the meaning) of events, statements, or images. NLP counselor Joseph O'Connor explains that by changing the way an event is perceived, responses and behaviors will also change: "reframing with language allows you to see the world in a different way and this changes the meaning." (O'Connor, 2001).

In an artist's statement, Nussbaum talks about reframing in respect to his work. Reframing, he says, "means seeing actions and people from different viewpoints, in one continuous movement without editing." This video cycle explores ideas about transformation, hypnosis, water (as related to both religion and art forms), the brain, the unconscious, the rational vs. the emotional mind, neural pathways, evolutionary biology, and lunar and water cycles (as metaphors for reincarnation and emotional evolution).

Nussbaum imbues his work with a therapeutic effect. He writes of his desire to cast a spell on an audience, "to mesmerize them, speaking directly to their unconscious...In the end, [my goal is] to make connections between images, feelings, and ideas and ultimately between people."

# **HISTORICAL NOTES**

Looking back on the 20th century, one could argue that the Expressionists were concerned with the behavioral manifestations of consciousness, the Surrealists with making visible its contents, and the artists of the 1970s and 1980s with enhancing the power of mind (via psychotropic visions, paranormal experience, and spiritual disciplines). In this study I found lingering traces of this latter category layered in with the symbols of the new mind science.

# **THE SURREALISTS**

In the Surrealist Manifesto, Andre Breton recounts that he was deeply absorbed in the work of Freud and that he had practiced Freudian psychotherapy on soldiers during World War I (Breton, 1972). The goal of Surrealism, he said, was to create a revolution in the minds of men in which dreams and reality would fuse in a kind of absolute reality, *surreality.* Although he was often ambivalent about the world view of science and critical of the psychiatric profession as a whole, he was a quasi-empiricist. He collected accounts of dreams and conducted experiments in automatism at his *Bureau of Surrealist Research,* which was, Maurice Nadeau tells us, open to all who had something to say, to confess, or to create. They collected the material and dutifully published it in *La Revolution Surrealiste,* the movement's organ (Nadeau, 1967). Breton studied the behavior of a psychotic woman and published his observations in the book *Nadja* (Breton, 1960).

The Surrealists subscribed to Freud's "pneumatic" model of the mind: psychic pressure builds up in the unconscious and bursts forth onto the surface or is diverted to other channels such as the dream, where it emerges in symbolic disguise to be unmasked by the therapist or disgorged by the artist. Salvador Dali proclaimed his paranoiac-critical method, the ability to be simultaneously (or alternatively) in the dream state, psychotic state, or state of normal consciousness (Dali, 1942, 2004). Throughout the 20th century it was accepted wisdom that the artist has a special gift for moving between the conscious and unconscious mind.

# **VISIONARIES**

In the 1960s and 1970s, interest in the mind focused on visions and mystical experiences often generated by drugs. Advocates such as Aldous Huxley extolled the virtues of mescaline and other mind-altering substances, claiming that they would open the minds of ordinary people to transcendent powers that previously were the exclusive domain of the arhat, the mystic, and...the artist! "What the rest of us see only under the influence of mescaline," wrote Huxley, "the artist is congenitally equipped to see all the time. His perception is not limited to what is biologically or socially useful." Huxley prophesied in 1954 that mescaline would "transform most visualizers into visionaries." (Huxley, 1954).

The painters Alex and Allyson Grey exemplify the visionary tendency of the late 20th century, as the following passage from *The Mission of Art* bears out:

"In 1976 my wife, Allyson and I had an experience that changed our lives and our art. We sacramentally ingested a large dose of LSD and lay down. Eventually a heightened state of consciousness emerged in which I was no longer aware of the physical reality of my body in any conventional sense. I felt and saw my interconnectedness with all beings and things in a vast and brilliant Universal Mind Lattice. Every being and thing in the universe was a toroidal fountain and drain of self-illuminating love energy, a cellular node or jewel in a network that linked omni-directionally without end. All duality of self and other was overcome in this infinite dimension of spiritual light...this was the state beyond birth and death, beyond time, our true nature, which seemed more real than any physical body...This experience of the infinite net of spirit transformed our lives and gave us a subject that became the focus of our art and our mission." (Grey, 1998).

Although the work of the artists in this study shows a marked departure from the historical precedents just cited, some links to the past should be noted.

Gestural methods and chance play a role in the work of the painters Whitman and Jacobson, albeit under firm control. In her *Tome Series*, Jacobson works watercolor wet in wet and explains that the flow of the paint is analogous to the way ideas are in a fluid state before they settle into patterns and networks. In Whitman's mind maps, the highly controlled drips and dabs scattered here and there are conscious referents to the gestural drips of the action painters of the 1950s that are lodged in the artist's memory.

Dream content, the bedrock of Surrealist art, appears in the work of Whitman and Nussbaum, where it shares space with other mind stuff but is not given center stage. The main difference here is that the dreams are neither interpreted nor presented to advance the Freudian agenda of repressed desire. Whitman states laconically, "Dreams jumble reality." Nussbaum's approach is similar to Whitman's: the omniscient "I" notes the existence of the dream, the dream passes through and dissolves; it's just one more layer of activity in the seemingly infinite mind.

In an odd reversal of historical precedent, Mary Kaye has taken the formal vocabulary of Constructivism and given it a twist. Her work with industrial materials is neither a metaphor for the building of a socialist worker's state nor a symbol of solidarity with the proletariat, but instead stands for individualist intellectual work, the modeling of mind deliberating on existential questions.

Geoffrey Koetsch's work is the most invested in the past. He has re-introduced the Western figurative tradition into his art. His studies of mangrove root systems go back to the mid-century art and science movement that looked for structural "type forms" underlying nature, art, engineering, and science. His psychology has ties to Jung: universal archetypes and archaic remnants. In Koetsch's mind, Jungian psychology still has legs in the 21st century as science uncovers more evidence of innate human universals.

# **CONCLUSION**

The conclusion to this essay was suggested by one of the reviewers of the manuscript (unidentified as of this writing). He wrote: "...the actual influence of current neuroscience (in the work of the artists covered in this article) is fairly limited. It appears to be more of a spark or trigger to art rather than a guiding principle as was the case for psychoanalysis and Andre Breton." Breton was a student of psychoanalysis and actually used the investigative tools of the discipline to produce what he considered "data" rather than art. The other Surrealists followed suit, employing such tools as free association and dream analysis. These tools were simple and accessible: talking, writing, drawing. This suggests that in order for neuroscience to serve as a guiding principle for an artist today, he or she would have to incorporate its research methods and technology into the creative process in order to *reveal* new dimensions of the mind as opposed to *reacting* to the revelations of scientists.

# **REFERENCES**

Rizzoli/Philadelphia Museum of Art.

Grey, A. (1998). *The Mission of Art*.

Howard). New York: Colliers Books.

Princenthal, N. (2008a). *Eyes Wide Shut*. New York, NY: Art in America.

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 May 2011; paper pending published: 18 August 2011; accepted: 14 September 2011; published online: 02 November 2011.*

*Citation: Koetsch G (2011) Artists and the mind in the 21st century. Front. Hum. Neurosci. 5:110. doi: 10.3389/fnhum.2011.00110*

*Copyright © 2011 Koetsch. This is an open-access article subject to a nonexclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **WHAT LINE DRAWINGS REVEAL ABOUT THE VISUAL BRAIN**

**Bilge Sayim\* and Patrick Cavanagh**

*Laboratoire Psychologie de la Perception, Centre Attention Vision, Université Paris Descartes, Paris, France*

#### *Edited by:*

*Luis M. Martinez, Universidade de A Coruña, Spain*

#### *Reviewed by:*

*Micah M. Murray, Université de Lausanne, Switzerland; Luis M. Martinez, Universidade de A Coruña, Spain*

#### *\*Correspondence:*

*Bilge Sayim, Laboratoire Psychologie de la Perception, Université Paris Descartes, 45 rue des Saints-Pères, 75006 Paris, France. e-mail: bilge.sayim@parisdescartes.fr*

Scenes in the real world carry large amounts of information about color, texture, shading, illumination, and occlusion, giving rise to our perception of a rich and detailed environment. In contrast, line drawings have only a sparse subset of scene contours. Nevertheless, they also trigger vivid three-dimensional impressions despite having no equivalent in the natural world. Here, we ask why line drawings work. We see that they exploit the underlying neural codes of vision and they also show that artists' intuitions go well beyond the understanding of vision found in current neurosciences and computer vision.

**Keywords: visual perception, art, picture perception, painting, computer vision**

Line drawings have fascinated artists and scientists in various fields for many centuries, with the first line drawings dating back more than 30,000 years (**Figure 1A**). The ease and immediacy of recognizing scenes and objects in simple line drawings suggests that, for the visual system, line drawings have deep similarities to other more detailed visual representations as well as to the real scenes they depict. For example, line drawings of visual scenes are recognized as quickly and accurately as photographs (e.g., Biederman and Ju, 1988). Sometimes, line drawings convey a stunningly vivid impression of depth and three-dimensional shape, even when not much more than the outlines of an object are drawn.

Line drawings are so common in our everyday life that we seldom ask why they work. Once we ask that question, however, we realize that line drawings really are exceptional. In particular, in the real world, there are no lines around objects (with rare exceptions; **Figure 1B**). During the eons over which biological vision systems evolved, there has been no experience that could have adapted our visual systems to understand line drawings. Instead, objects are usually segmented from the background by lightness, texture, or color differences. So why does the visual system understand line drawings?

We could imagine that line drawings are a convention of modern art that we have come to recognize through learning as we have the alphabet in which this paper is written (e.g., Gombrich, 1969; Goodman, 1976). This account has been controversial (Kennedy, 1974, 1975; Deregowski, 1989; see also Gibson, 1971, 1979) and there is strong evidence against it. For example, it has been shown that infants (Yonas and Arterberry, 1994; see also Hochberg and Brooks, 1962), stone-age tribe members (Kennedy and Ross, 1975), and even chimpanzees (Itakura, 1994; Tanaka, 2007) are able to recognize line drawings. We even see line representation used by insects in bio-mimicry (**Figure 1C**). These findings rule out any strong version of culture-based acquisition for understanding line drawings, although clearly there are many culturally based conventions used in line drawings (**Figure 1D**).

If cultural knowledge is not the key for understanding line drawings and line drawings are too recent an arrival to have triggered any special adaptation, what then is the mechanism that allows us to make sense of these drawings? The likely explanation is that lines trigger a neural response that has evolved to deal with natural scenes. This fortuitous co-activation lets lines stand in for solid edges. Once artists discovered this (Kennedy, 1975), they quickly adopted this format as an economical and powerful method for representing scenes and objects.

How does this "co-activation" work? The physiological investigation of the neural response to contours began with Hubel and Wiesel's (1962, 1968) transformational discovery that neurons in the primary visual cortex are tuned to the orientation of contours, responding to edges and not to uniform areas. The part of cortex that analyzes visual information accounts for 30% or more of the cortex in primates and is located at the posterior pole of the brain. The visual cortex is further subdivided into several subregions that process the incoming images along parallel and serial streams. The first divisions of the visual cortex are labeled V1 through V4 and, in all of these, we see the orientation-tuned neurons that can respond to edges. In areas V1 and V2, the orientation-tuned detectors can be specific to the attribute defining the contour (color, contrast polarity, or texture, etc.) but, starting in area V2 (Gegenfurtner et al., 1996), through V5 (Albright, 1992), and on to object recognition areas like IT (Sáry et al., 1995), many become indifferent to the attribute that defines the contour.

These orientation-tuned units evolved to efficiently detect the contours in the natural world (Olshausen and Field, 1996). Edges in the world are typically marked by a discrete change in surface attributes (lighter on one side than the other, for example), yet these units respond as well to lines (lighter in the middle and dark on both sides, for example) and even to illusory contours that are suggested by context but not physically present (von der Heydt et al., 1984; Lee and Nguyen, 2001; see also Seghier and Vuilleumier, 2006). In other words, the receptive field structure that efficiently recovers edges also works well for lines even though it was not designed to do so.
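The claim that an edge-detecting receptive field also responds to lines can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes a textbook quadrature pair of Gabor filters (one even-symmetric, one odd-symmetric) as a stand-in for a vertically tuned unit, and the names `gabor_pair` and `energy`, the 15-pixel patch size, and the filter parameters are hypothetical choices made only for illustration.

```python
import math

SIZE = 15          # patch and kernel width in pixels (illustrative assumption)
MID = SIZE // 2    # index of the central row/column


def gabor_pair(wavelength=8.0, sigma=3.0, theta=0.0):
    """Return (even, odd) Gabor kernels; theta = 0 prefers vertical contours."""
    even, odd = [], []
    for y in range(-MID, MID + 1):
        row_even, row_odd = [], []
        for x in range(-MID, MID + 1):
            # Coordinate measured across the preferred contour orientation.
            u = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            phase = 2.0 * math.pi * u / wavelength
            row_even.append(envelope * math.cos(phase))
            row_odd.append(envelope * math.sin(phase))
        even.append(row_even)
        odd.append(row_odd)
    # Subtract the mean so the even kernel ignores uniform luminance.
    mean = sum(sum(row) for row in even) / (SIZE * SIZE)
    even = [[value - mean for value in row] for row in even]
    return even, odd


def dot(patch, kernel):
    """Pixel-wise product of an image patch and a kernel, summed."""
    return sum(p * k for p_row, k_row in zip(patch, kernel)
               for p, k in zip(p_row, k_row))


def energy(patch, kernels):
    """Phase-insensitive response: combines even and odd filter outputs."""
    even, odd = kernels
    return math.hypot(dot(patch, even), dot(patch, odd))


# Four test patches with luminance between 0 (dark) and 1 (light).
uniform = [[0.5] * SIZE for _ in range(SIZE)]
v_edge = [[0.0 if x < MID else 1.0 for x in range(SIZE)] for _ in range(SIZE)]
v_line = [[1.0 if x == MID else 0.5 for x in range(SIZE)] for _ in range(SIZE)]
h_line = [[1.0 if y == MID else 0.5 for _ in range(SIZE)] for y in range(SIZE)]

unit = gabor_pair(theta=0.0)  # a unit tuned to vertical contours
responses = {
    "uniform": energy(uniform, unit),
    "vertical edge": energy(v_edge, unit),
    "vertical line": energy(v_line, unit),
    "horizontal line": energy(h_line, unit),
}

for name, value in responses.items():
    print(f"{name:16s} {value:8.3f}")
```

Under these assumptions, the vertically tuned unit stays silent for the uniform patch, responds to both the vertical edge and the vertical line, and responds far more weakly to the horizontal line.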

Consider then the cortical pattern of response to an object like a cube (**Figure 2**). The contour-selective neurons with oriented receptive fields fire only along the contour and not within the uniform areas of the object. If we were to look at the visual cortex with voltage-sensitive dyes (Cohen et al., 1968; Tasaki et al., 1968; Blasdel and Salama, 1986), the pattern of activity for the oriented units would resemble a sketch of the object (Marr, 1982), distorted by the cortical anatomy (Tootell et al., 1982). A set of lines that match the cube's edges would trigger responses in the same pattern, indicating that, on a neural level, line representations are equivalent to the originals they depict. This notion is supported by a number of recent imaging studies that showed that the activation in response to line drawings was similar to that for other representations (e.g., Ishai et al., 2000; Walther et al., 2011).

**FIGURE 1 | (A)** Early line drawing representation of a rhinoceros at Chauvet, France, ca 30,000 BCE. **(B)** Outlines are infrequent effects of backlighting in natural scenes but even in this case, internal contours are never visible as lines. **(C)** Bio-mimicry used by the Fulgorida of South and Central America where lines trace the contour between the simulated lips and mouth. **(D)** Lines typically stand for discontinuities in surface depth or orientation but some artists rely on cultural conventions to make them stand for motion or energy (Keith Haring).

**FIGURE 2 | Contour-selective neurons with oriented receptive fields fire along the contours of the cubes.** The analysis of both types of cubes yields the internal sketch-like representation of the cube, which would, in actuality, be distorted by the cortical magnification.

Is that all there is to it? No, in fact, there is quite a lot remaining to explain, as can be seen in any line version of a natural scene. These are typically uninterpretable and the simple image in **Figure 3** shows why. Many of the contours in a scene arise from accidental illumination edges at the borders of shadows and shading. These contours, when represented as lines, take on a reality that they should not have. Each line in a standard line drawing stands for depth or slant discontinuities between surfaces: these are "object contours." When the borders of shadows are included in a line drawing, these contours also get promoted to the status of object contour – but for locations where there were none. As a result, the whole image is corrupted, deviating from the structure of the original objects. **Figure 4** shows this even more dramatically because its original is a representation of a face that can only be recognized if the shadows are correctly processed. Rendered only as contours, the light and dark polarity required to interpret shadows is no longer available and the pattern becomes a meaningless set of lines. So, while contours are, of course, of prominent importance for visual perception (e.g., Koenderink, 1984), displaying *all* the contours of a scene in a line drawing will regularly fail to convey the essential parts of an image.

contours from the original do not capture the underlying scene very well. **(C)** A depth sketch revealing the two actual objects in the scene.

**FIGURE 4 | (A)** Two-tone image of Kennedy (1997) where the dark areas could be dark pigment or dark shadow. The face can be recovered only if the shadows are correctly identified. **(B)** A line version of the same contours is unrecognizable. The polarity information required for finding shadows is lost. The shadow boundaries can no longer be discounted and are taken as object contours that form meaningless shapes unrelated to the original.

There is therefore a critical step between the extraction of edges and their assignment as "object contours." The visual system understands how to determine which contours are the critical ones and beyond a certain level in the hierarchy of visual cortex, only those contours should remain. Shadow borders and other accidental contours must be removed in order to keep only the object contours. We do not know where or how this happens in the visual system. No imaging or physiological study has yet shown the absence of response to a shadow border at some level of cortex, and yet this must happen. Like the visual system, artists also understand which contours are the important ones. We see only the characteristic object contours in their line drawings – never shadow borders, no matter how prominent they are in the scene they are drawing. Artists appear to have access to a body of knowledge – what makes a characteristic contour – that scientists only dimly understand at present. Future studies of artists' intuitive understanding of critical contours may lead to important insights for image understanding.

larger distances to be perceived (from Kennedy, 1974).

**FIGURE 6 | Two very different line drawings triggering a familiar prototype.** Both sketches are perceived as faces.

Following the initial critical choice of lines to include, what are the elements that are central to the information conveyed by line drawings? Particularly informative are those parts of an image where contours touch or intersect (Clowes, 1971; Huffman, 1971; Albert and Hoffman, 1995), and many authors have shown how these various junctions form a set of constraints that are often sufficient to specify the original object (cf. Barrow and Tenenbaum, 1981; Malik, 1987). For example, a T-junction is formed when one object interrupts the contours of another object behind it (the contours meet in a junction as in the letter T); a Y-junction is seen at the front corner of a cube; an X-junction is formed when the contours of a transparent material cross those of a background surface. Contours also may end on their own when smooth surfaces self-occlude, such as the top of a torus or donut (Koenderink, 1984). These local junction cues are clearly used by the visual system to make sense of line drawings, and we can see their power when they are in conflict (**Figure 5**): the impossibility of the global shape does not suppress the local interpretations they trigger, a loophole exploited to great effect by artists like Escher and Reutersvärd and scientists like Penrose. Interestingly, the way junctions are used in drawings has not changed very much over the recorded history of art (Biederman and Kim, 2008; see the T-junctions where the rhinoceros's legs meet its body in **Figure 1A**), suggesting again that they are informative aspects of the world and not creations of our culture.
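The T-, Y-, and X-junctions described above can be told apart from local geometry alone. The following is a minimal sketch, assuming each junction is given as the list of directions (in degrees) in which contours leave the junction point. The function name `classify_junction`, the 10-degree tolerance, and the catch-all "other" label are illustrative assumptions, and this is not the line-labeling procedure of Clowes (1971) or Huffman (1971).

```python
def _opposite(a, b, tol):
    """True if two arm directions (degrees) form one straight contour."""
    return abs((a - b) % 360.0 - 180.0) <= tol


def classify_junction(arms, tol=10.0):
    """Label a junction as 'T', 'Y', 'X', or 'other' from its arm directions.

    arms: directions in degrees of the contours leaving the junction.
    tol:  angular tolerance in degrees for calling two arms collinear.
    """
    arms = sorted(a % 360.0 for a in arms)
    n = len(arms)
    # Pairs of arms that continue straight through the junction.
    straight = [(i, j) for i in range(n) for j in range(i + 1, n)
                if _opposite(arms[i], arms[j], tol)]

    if n == 3:
        if straight:
            # One contour passes through; a second contour stops at it.
            return "T"
        gaps = [(arms[(i + 1) % n] - arms[i]) % 360.0 for i in range(n)]
        if max(gaps) < 180.0:
            # Three arms spread all the way around the point.
            return "Y"
        return "other"  # all arms on one side: not a type named in the text

    if n == 4 and len(straight) == 2:
        used = {index for pair in straight for index in pair}
        if len(used) == 4:
            # Two straight contours crossing each other.
            return "X"

    return "other"


print(classify_junction([0, 180, 270]))      # occluding edge with a stem
print(classify_junction([90, 210, 330]))     # front corner of a cube
print(classify_junction([0, 60, 180, 240]))  # two contours crossing
```

A geometric rule of this kind only labels the junction. Deciding what the junction implies about depth and occlusion, and reconciling conflicting junctions as in impossible figures, requires the constraint reasoning discussed in the cited work.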

However, while junctions are certainly informative, they are not necessary for recognizing line drawings. The recognition of many sketches reveals an important contribution of memory. When a set of contours matches a familiar prototype, memory serves to fill in the missing details (**Figure 6**). These and many other line drawings show how artists are able to depict such various features as depth, folds, occlusion, texture, brightness and even odor, mental energy, or motion by choosing the right lines, revealing that artists (implicitly or explicitly) understand the code of the visual system. Scientists have yet to fully understand what artists have successfully been practicing for thousands of years and for some questions of the neural codes of vision, we may find that artists are a more immediate and better source of information than our most advanced scientific studies, whether of behavior, single cell recordings, or brain imaging.

# **ACKNOWLEDGMENTS**

This work was supported by a Chaire d'Excellence Grant from the ANR.

# **REFERENCES**

of visual recognition. *Cogn. Psychol.* 20, 38–64.


(Edinburgh: Edinburgh University Press), 295–323.


*Visual Information*. San Francisco: W. H. Freeman and Company.


neuron responses. *Science* 224, 1260–1262.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 June 2011; accepted: 30 September 2011; published online: 28 October 2011.*

*Citation: Sayim B and Cavanagh P (2011) What line drawings reveal about the visual brain. Front. Hum. Neurosci. 5:118. doi: 10.3389/fnhum.2011.00118*

*Copyright © 2011 Sayim and Cavanagh. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **DO ARTISTS SEE THEIR RETINAS?**

**Florian Perdreau\* and Patrick Cavanagh**

*Laboratoire Psychologie de la Perception, Centre Attention Vision, CNRS UMR 8158, Université Paris Descartes, Paris, France*

#### *Edited by:*

*Luis M. Martinez, Universidad Miguel Hernández*

#### *Reviewed by:*

*Luis M. Martinez, Universidad Miguel Hernández; Martin Banks, University of California Berkeley, USA; Peter Thompson, University of York, UK*

#### *\*Correspondence:*

*Florian Perdreau, Laboratoire Psychologie de la Perception, Centre Attention Vision, CNRS UMR 8158, Université Paris Descartes, Paris, France. e-mail: florian.perdreau@parisdescartes.fr*

Our perception starts with the image that falls on our retina and on this retinal image, distant objects are small and shadowed surfaces are dark. But this is not what we see. Visual constancies correct for distance so that, for example, a person approaching us does not appear to become a larger person. Interestingly, an artist, when rendering a scene realistically, must undo all these corrections, making distant objects again small. To determine whether years of art training and practice have conferred any specialized visual expertise, we compared the perceptual abilities of artists to those of non-artists in three tasks. In the first two tasks, we asked them to adjust either the size or the brightness of a target to match it to a standard that was presented on a perspective grid or within a cast shadow. We instructed them to ignore the context, judging size, for example, by imagining the separation between their fingers if they were to pick up the test object from the display screen. In the third task, we tested the speed with which artists access visual representations. Subjects searched for an L-shape in contact with a circle; the target was an L-shape, but because of visual completion, it appeared to be a square occluded behind a circle, camouflaging the L-shape that is explicit on the retinal image. Surprisingly, artists were as affected by context as non-artists in all three tests. Moreover, artists took, on average, significantly more time to make their judgments, implying that they were doing their best to demonstrate the special skills that we, and they, believed they had acquired. Our data therefore support the proposal from Gombrich that artists do not have special perceptual expertise to undo the effects of constancies. Instead, once the context is present in their drawing, they need only compare the drawing to the scene to match the effect of constancies in both.

**Keywords: art, vision, visual constancy, visual search, scene perception**

# **INTRODUCTION**

Visual perception is our main access to the outside, "*distal*", world which we experience consciously at the end of a long chain of processes. The image projected on our retina is the *proximal* stimulus, the original data on which these processes operate. If we saw the world as it is represented on the retina, objects would change size as they moved toward or away from us, change color as they moved into different lights, be cut into pieces as they moved behind other objects, and jump to and fro every time we moved our eyes. But instead of perceiving this ever-changing world, we have a coherent, invariant visual representation of objects: we experience visual constancy, that is, our conscious percept is to a large extent in accordance with the *distal* object's properties whatever the *proximal* stimulus projected on our retina.

However, visual artists when rendering an object or a scene on a canvas return to a representation that is closer to the proximal image, depicting distant objects as smaller and nearby objects as larger. Clearly, compared to non-artists, artists are able to depict scenes and objects much more accurately. What is the basis of their expertise? One aspect is of course motor skill but the other of interest to us is the ability to see the proximal pattern of light and dark – to ignore the corrections, the visual constancies, underlying our everyday perception. The artist can pick the right dark pigment for depicting an object in a shadow, a pigment much darker than our subjective impression of the object; can make the distant object the correct size even though it is experienced as not very small. A number of studies have addressed these issues (Cohen and Bennett, 1997; Kozbelt, 2001; Cohen, 2005; Mitchell et al., 2005; Kozbelt and Seeley, 2007; Cohen and Jones, 2008; Matthews and Adams, 2008) showing indeed that drawing accuracy is correlated to perceptual performances: subjects who made more accurate drawings also showed less effect of context and visual constancies. According to Kozbelt (2001), artists are "experts in visual cognition." The present study addresses whether the expertise of visual artists lies in their ability to access their proximal representation better than non-artists. Have years of experience changed their visual processing and their ability to access early levels of representation? Such plasticity in visual processing as a result of visual experience is seen in many contexts (Hubel and Wiesel, 1970; Goldstone, 1998; Ostrovsky et al., 2006; Green and Bavelier, 2008).

The idea that artists have direct access to early representations has been strongly criticized by the art historian Gombrich (1987). Gombrich agreed with Ruskin (1912) that artists do use special techniques to depict the proximal stimulus but he felt that their training could not lead them to get an "innocent eye": the "innocent eye is a myth" (Gombrich, 1987, p. 251). Instead, "making comes before matching" (Gombrich, 1987, p. 99), and artists have to deal with their biased perception by drawing sketches according to it, and then make corrections in order to match it with the objective model they wish to represent. In this view, image-making is a hypothesis-testing process, a continuous back and forth between production and correction. This "copyist" approach is an alternative explanation for the representational skills of artists. That is, artists may experience the same visual constancies as non-specialists but learn to make corrections in the context of the drawing itself as it progresses. Specifically, once sufficient context is present in the drawing, they only have to match the sizes and colors they see in their artwork to the perceived sizes and colors they see in the scene being depicted; the similarity of context in both will impose the same constancies.

To examine whether artists have developed visual expertise or copyist expertise, we tested three different constancies: size, lightness, and shape, all of which must be undone or bypassed for figurative artists to create an accurate copy of a scene. Two of the experiments use matching-to-standard tasks while the third is a visual search task. In all of these tasks, we use context (perspective grids, shadows, and occlusion) to trigger the application of visual constancies (Day, 1972; Todorovic, 2002, 2010), and see whether the artists are less influenced by the context than non-artists. If artists are indeed able to access, or recover their initial retinal image (closer to the proximal stimulus), they would be less affected by context than non-artists. However, this finding would not tell us whether the greater accuracy was due to perception that was uncorrected by visual constancies (Ruskin, 1912) or to skill in undoing the corrections (Gombrich, 1987). The critical factor to distinguish these two possibilities is speed: access to the uncorrected proximal image ought to allow for rapid response whereas the reversal of the corrections should require extra time. To test the speed of access, we use a visual search task for partial shapes in occluded or unoccluded presentation (He and Nakayama, 1992; Rensink and Enns, 1998). If artists are able to access the initial uncorrected image then their processing rates for the occluded versions will be more rapid than those of non-artists.

In these experiments, context is introduced in order to trigger the corrections of visual constancies and we assume that, without any instruction, both artists and non-artists would probably experience these context effects to the same degree. However, the subjects were not asked to judge the perceived size, or lightness, or shape, they were asked to ignore the context, to bypass constancy, and report the "real" size or luminance, or shape of the test. This is a critical point in the procedure: subjects are asked explicitly to report what corresponds to their retinal image. Can artists do this better than non-artists?

#### **EXPERIMENT: SIZE CONSTANCY**

Size constancy refers to the accurate perception of an object's size despite the fact that a distant object will have a smaller size on our retina than a near object. In order to provide such a "veridical perception" (Todorovic, 2002), the visual system needs to infer the object's size by correcting its size on the retina (in visual angle) for the perceived distance (**Figure 1**). Because size constancy is related to distance perception, it must be directly dependent on the various cues to depth (Leibowitz and Harvey, 1967; Day, 1972). For example, the influence of monocular cues (perspective grids) on size constancy has been shown in several experiments (Stuart et al., 1993; Aks and Enns, 1996; Bennett and Warren, 2002).

**FIGURE 1 | In the left panel, the man in the background appears to be about the same height as the woman in the foreground.** This perception corresponds to visual constancy. However, in the right panel, the man's image is moved so that he appears to be adjacent to the woman, and he now appears much smaller than he does on the left. This is the correction for distance that underlies size constancy and we, non-artists, are unable to ignore it even though we know that the two images of the man have identical size on the picture plane (measure them to check). Can artists register that the two images of the man have identical size in the picture plane?

Nevertheless, our perception is not limited strictly to the corrected distal image; for example, Rock (1983) suggested that we are aware of both the retinal size and the actual size of the object, even if we generally do not pay attention to retinal size. However, even when asked to judge an object's retinal size (say, compare a distant building to our thumb held out beside it), there are residual effects of the actual size in the world (Carlson, 1960, 1962). This suggests that artists may be able to access the uncorrected retinal size of objects, ignoring to some extent the real world sizes of the objects; perhaps, they may do this more effectively than non-artists.

In this first experiment, perceived depth was induced by linear perspective cues of a receding hallway in the context condition. Here, size constancy should make the test stimulus look larger in the hallway than when it is seen against the flat grid (**Figure 2**), and we assume that, without any instruction, both artists and non-artists would probably experience this effect to the same degree. However, the subjects were asked to adjust the size to match the physical size of a standard (presented on a blank field below) as if they were using their fingers to measure the size directly on the screen. In other words, subjects were encouraged to ignore the context and report the "real" size of the test.

#### **MATERIALS AND METHODS**

#### *Subjects*

For the three experiments, the subjects were divided into three groups: art students, professional artists, and non-artists. The first were recruited from a high-ranked major art school [*n* = 9, six females and three males, age = 22 ± 1.7]. Professional artists were recruited from galleries, workshops, and international artists' associations [*n* = 14, nine females and five males, age = 39 ± 12.9]. Non-artist subjects were recruited from the internal network of Cognitive Science (RISC), a database of voluntary subjects, except for two subjects from our laboratory [*n* = 14, nine females and five males, age = 23 ± 2.8]. The non-artists reported having no particular drawing skills or specific training in visual arts. All subjects had normal or corrected-to-normal vision and those from outside our laboratory were paid 10 € for their participation. They were informed about the purpose of the experiment and were naïve about our hypotheses. They all gave their informed consent before taking part in the experiment.

# *Materials*

All the experiments took place in a dark room and used the same materials. The subject's head was always held by a chinrest so that his or her eyes were approximately 52 cm from the center of the screen. The stimuli were displayed on a 22-inch CRT screen (LaCie, Electron 22 blue IV), with a resolution of 1024 × 768 pixels and a refresh frequency of 100 Hz. The monitor's luminance was linearized with a gamma correction. The experiments were programmed with MATLAB Psychtoolbox (version 3.0.8), and were run on an Apple computer.

The screen was divided in two equal vertical halves (21˚ × 16˚). In the top half ("*standard*"), two possible texture gradients could be displayed: a simple 16 × 16 black line-drawn grid simulating a vertical wall, or a black line-drawn perspective grid representing a hallway with a central perspective (with a unique vanishing point in the center). The targets were two green cylinders, one in each half (see **Figure 2**). The cylinders were drawn with Adobe Photoshop CS4, and their color saturation was set at 10% in order to avoid any distracting salience. All the visual elements (texture gradients and cylinders) were presented against a white background.

# *Procedure*

Participants were told, "Adjust the size of the cylinder, at the bottom of the screen, so that it matches the size of the standard cylinder at the top. Make your adjustment as if you were using your fingers to measure the size directly on the screen." They pressed the right arrow on the keyboard to increase the lower cylinder's size, or the left arrow to decrease it, and then pressed the space button to register the setting. There was no time pressure but the time they took to make their setting was recorded.

The standard cylinder displayed in the top half of the screen could be presented either on a simple grid or on a texture gradient representing a hallway. The former corresponded to the *normal* condition, while the latter corresponded to the *context* condition. The two conditions were presented equally often with the order randomized across trials. The standard cylinder could have six possible heights (1.5˚, 1.6˚, 1.7˚, 1.8˚, 1.9˚, and 2˚ of visual angle), which were randomized across trials, and the test cylinder could begin randomly either 50% smaller or bigger than the standard.
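To give a sense of the physical sizes involved, the visual angles above can be converted to on-screen extents at the approximately 52 cm viewing distance reported in the Materials. The following is a minimal sketch under that assumption; the function name `angle_to_size_cm` is hypothetical and is not part of the original experimental code.

```python
import math

# Viewing distance reported in the Materials section (approximate).
VIEWING_DISTANCE_CM = 52.0


def angle_to_size_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent (cm) that subtends `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# The six standard heights used in the size task, in degrees of visual angle.
standard_heights_deg = [1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
for angle in standard_heights_deg:
    print(f"{angle:.1f} deg -> {angle_to_size_cm(angle):.2f} cm on screen")
```

At this distance the six standards differ by well under a millimetre per step, which illustrates how fine the size adjustments were.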

Each participant started the experiment with a block of 10 practice trials. The conditions in the test block were the texture gradient (normal/context) and the possible heights of the standard cylinder. There were 5 trials per condition for a total of 60 trials in the test block (5 × 6 × 2).
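The 5 × 6 × 2 design described above can be sketched as a randomized trial list. This is an illustrative reconstruction, not the original Psychtoolbox code: the function name, the dictionary keys, and the choice to draw the starting size independently on every trial are all assumptions.

```python
import itertools
import random

STANDARD_HEIGHTS_DEG = [1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
BACKGROUNDS = ["normal", "context"]  # flat grid vs. perspective hallway


def build_size_trials(n_repeats=5, seed=None):
    """Return a shuffled list of trials for the size-matching task."""
    rng = random.Random(seed)
    trials = []
    for height, background in itertools.product(STANDARD_HEIGHTS_DEG, BACKGROUNDS):
        for _ in range(n_repeats):
            # Test cylinder starts 50% smaller or 50% bigger than the standard
            # (assumed here to be chosen independently on each trial).
            start_factor = rng.choice([0.5, 1.5])
            trials.append({
                "standard_height_deg": height,
                "background": background,
                "test_start_height_deg": height * start_factor,
            })
    rng.shuffle(trials)  # randomize order across trials
    return trials


trials = build_size_trials(seed=0)
print(len(trials))  # 60
```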

# **RESULTS**

Subjects' settings increased proportionally with the standard size and we summarized each subject's settings by their means across the six standard sizes. We then computed a ratio between the context mean response and the normal mean response for each subject (group mean ratios are plotted in **Figure 3**). These ratios are a measure of the context effect on the subject's judgment. Ratios close to 1 mean that there was no effect of the context, while ratios significantly greater than 1 would suggest such an effect, that is, that subjects have overestimated the standard size when presented in the hallway context.

We ran a one-way ANOVA on those ratios with Groups (non-artists, art students, professional artists) as factor. This test showed no significant difference in the effect of context vs. normal conditions across groups [*F*(2,34) = 0.37, *p* = 0.69]. Nevertheless, all ratios were significantly greater than 1 [*t*(36) = 6.36, *p* < 0.001]. The average ratio was 1.08, where a ratio of 1 would indicate no effect of context. There was therefore no evidence in our results suggesting that artists are better than non-artists at ignoring context in accessing stimulus size. One of our other questions was whether artists' performance would vary with experience. To address that point, we analyzed the correlation between the context effect, expressed as the ratio described above, and subjects' years of art experience. We set non-artists' experience to 0, since they were not supposed to have followed any art training, and used the self-reported years of art training as the other variable in the correlation. The correlation was not significant (Pearson's *r* = 0.08, ns).
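The ratio measure and the test against 1 can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names are hypothetical and the example ratios are made-up values, not the data of this study. It covers only the one-sample *t* statistic, not the ANOVA or the correlation.

```python
import math
import statistics


def context_ratio(context_settings, normal_settings):
    """Ratio of a subject's mean setting in the context vs. normal condition."""
    return statistics.mean(context_settings) / statistics.mean(normal_settings)


def one_sample_t(values, mu=1.0):
    """One-sample t statistic and degrees of freedom for H0: mean == mu."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical per-subject ratios, for illustration only.
example_ratios = [1.05, 1.10, 1.02, 1.12, 1.08, 1.11]
t_value, df = one_sample_t(example_ratios, mu=1.0)
print(f"t({df}) = {t_value:.2f}")
```

A ratio reliably above 1 across subjects yields a large positive *t*, which is the pattern reported for all three groups.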

Finally, we analyzed the response time for each subject to evaluate the effort the subjects put into making their settings in each condition. A longer time would suggest more effort. We found a significant main effect of Groups [*F*(2,219) = 22.59, *p* < 0.001, η<sup>2</sup> = 0.17], as well as a main effect of the condition [*F*(1,219) = 5.89, *p* < 0.016, η<sup>2</sup> = 0.03], but no interaction between Condition and Groups [*F*(2,219) = 0.357]. A *post hoc* analysis showed that, surprisingly, art students, like professional artists, spent *more* time on each trial, 15.37 and 15.95 s, respectively, almost twice as much as non-artists, 8.60 s (both *p* < 0.001). There was no difference between art students and professional artists. This result is the opposite of our expectation that artists would find this task easier.

In summary, size perception was influenced by visual context for all subjects, showing an increase in the estimated size of the standard by an average of 8% in the context condition compared to the normal condition. We also found no correlation between the degree of the context's influence and the subject's experience, suggesting that experience and training do not play a crucial role in artists' performance. In sum, we find no evidence of an advantage for artists in ignoring context when judging object size.

The instructions were of critical importance in this task: if we had asked subjects to match the apparent size, we would expect that size constancy would apply equally to all, independently of their art training. But instead, we were encouraging subjects to ignore the context and evaluate the size of the standard and comparison as if they were measuring them on the screen with their fingers. Our adjustment procedure also allowed subjects time to engage various strategies; this is of particular interest to us as it should bring into play explicit strategies that artists have learned in drawing class as well as the implicit ones acquired through long practice.

Despite these aspects of the experiment that should have favored the artists if they did have special perceptual expertise, we found that the artists were as bound to the context effects as non-artists. Moreover, response time analysis showed that both art students and professional artists spent much more time on each trial than non-artists. We had expected artists to take less time, given their expertise. This opposite result suggests that the artists felt some pressure, as experts in visual perception, to perform well on these tasks, to engage the strategies that they had been taught to correct size perception and to overcome context effects. But despite the instructions to ignore context and despite the longer time the artists spent on the task, they showed the same extent of constancy as non-artists.

# **EXPERIMENT: LIGHTNESS CONSTANCY**

We perceive objects via the light they reflect back to our retina. The received light is determined by two components: the object's surface reflectance and the illumination falling on it. The reflectance corresponds to the proportion of the incident light that is reflected at different wavelengths of the spectrum and fully depends on the surface material. It is a property of the object and remains constant whatever the intensity or wavelength distribution of the illumination falling on the object. The amount of light arriving at the retina (the proximal property) is the product of the object's reflectance (its "color," the distal property) and the illumination. Here we will focus on the achromatic property of the object's surface: whether it is light or dark, and in the case of the achromatic test patches we use, white, gray, or black. We will use "lightness" as the perceived reflectance (white vs. black surface) and "brightness" or luminance as the perceived luminance (the product of illumination and reflectance). According to those definitions, lightness constancy designates the invariance of the surface's perceived reflectance despite changes in illumination (Gilchrist, 1988; Moore and Brown, 2001).

To recover the surface reflectance of an object, most authors assume a process that can discount the illumination falling on it. To do so, the visual system must estimate the illumination. A number of proposals have been made for this process (Gilchrist, 1988, 2006; Adelson, 1993, 2000; Arend and Spehar, 1993a,b; Agostini and Galmonte, 2002). Although lightness constancy has often been explained in terms of low-level mechanisms (simultaneous contrast effect caused by lateral inhibition in retina's ganglion cells), it now appears that in some cases, a high-level computation of spatial relationships of surfaces and light is required. For example, a cast shadow on a surface can be recognized by the visual system because it is darker, its borders are unrelated to object borders, the surrounding texture continues into the shadow area with a reduction of luminance but not contrast, and it appears to have no volume of its own (Cavanagh and Leclerc, 1989). Thus the visual system would attribute change of luminance within the shadow limits to a change in illumination, not reflectance (Gilchrist, 1988).

However, a painter can only vary the reflectance of the paint used to depict the object and so this one pigment must correspond to the luminance coming from the real object where the luminance is the product of the object's reflectance and the illumination falling on it. Can normal observers make these luminance judgments with any accuracy (brightness), that is, how well could they pick a paint to match it? For instance, when a cast shadow falls on a test surface it leads the observer to perceive the object's surface as lighter (**Figure 4**). Can artists ignore the perceived reflectance and "see" the actual luminance any better than normal observers?

To examine this we introduce a cast shadow into a simple scene (**Figure 5**) where lightness constancy should make the test stimulus look lighter, more white, when the shadow falls on it even though its luminance remains the same. We assume that, without any instruction, both artists and non-artists would probably experience this effect to the same degree. However, the subjects were not asked to judge the perceived surface lightness (light or dark) but to judge the amount of light as if the shadow were not present or they could look at the gray patch through a tube. In other words, subjects were encouraged to ignore the context, to bypass lightness constancy and report the "real" luminance of the test.

# **MATERIALS AND METHODS**

# *Stimuli*

For this experiment, the screen was divided in two vertical halves having the same height and width (21˚ × 16˚). Each half-screen displayed an identical board textured with a wood surface, on which a cylinder-shaped piece of wood was lying; both were made with Adobe Photoshop CS4. The wood surface's average luminance was 9.60 ± 0.12 cd/m<sup>2</sup> (mean and SD), while the white background's luminance was 68.4 cd/m<sup>2</sup>.

On the top board that served as standard, a cast shadow was rendered to correspond to the effect of a light source on the right. Within the shadow, the wood surface's luminance was 3.04 ± 0.10 cd/m<sup>2</sup> and then rose gradually to the adjacent value to simulate a shadow penumbra. The shadow could have two possible locations, either covering the ellipse position or not. The target stimuli were two ellipses (2˚ × 1.5˚) colored with middle gray and were presented with the same luminance whether or not they fell in the shadow region.

#### *Procedure*

The subjects were asked to adjust the luminance of the test ellipse, so that it corresponded to the actual luminance of the standard ellipse. More particularly, subjects were told "adjust the luminance of the test ellipse, at the bottom of the screen, so that it matches the luminance of the standard ellipse at the top. Focus on the standard ellipse's inside, as if there was no cast shadow, and ignore the context of the scene." They pressed the right arrow to increase the luminance and the left arrow to decrease it. Once the subject was satisfied with the adjustment he or she pressed the space key to register the choice. The subject had all the time he or she wanted to give a response.

The standard ellipse could have six possible luminance levels randomized across the trials (14, 16.5, 19, 21.8, 24.6, 27.6; values given in cd/m<sup>2</sup>), while the test ellipse's initial luminance (before the subject's adjustment) was randomly set either 25% lower or 25% higher than the standard luminance. On half of the trials, the standard ellipse was outside the shadow, and on the other half, it was inside. The ellipse's position was randomized across the trials.

Each subject started the task with a block of 10 practice trials to ensure that he or she had understood instructions. Conditions that composed the test block were the six possible luminance levels of the standard ellipse and the two positions of the shadow. There were 5 trials per condition, and so 60 trials in the test block (5 × 6 × 2).

#### **RESULTS**

As in the first experiment's analyses, we averaged the mean response for each subject over the stimulus conditions and computed a ratio between the mean in the context and normal conditions (**Figure 6**). A one-way ANOVA was run on the individual ratios with Groups (non-artists, art students, professional artists) as factor. There was no significant main effect of group [*F*(2,34) = 1.65, *p* = 0.21]. Nevertheless, all ratios were significantly greater than 1 [*t*(36) = 14.48, *p* < 0.001]. The average ratio was 1.35, where a ratio of 1 would indicate no effect of context and a ratio of 3.16 would indicate complete lightness constancy. As in the first experiment, we asked whether the context's effect on perceptual performance, quantified by ratios, varies with experience. The correlation between ratios, expressing the context effect, and individual self-reported years of art training was not significant (*r* = 0.02, ns).
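The text does not state how the value of 3.16 for complete lightness constancy was obtained. One reading, offered here only as an assumption, is that it is the ratio of the wood surface's mean luminance outside the shadow (9.60 cd/m<sup>2</sup>) to its mean luminance inside the shadow (3.04 cd/m<sup>2</sup>), that is, the factor by which the simulated illumination drops. The sketch below works through that arithmetic; the function names are hypothetical.

```python
# Mean wood-surface luminances reported in the Stimuli section (cd/m^2).
LUM_WOOD_LIT = 9.60
LUM_WOOD_SHADOW = 3.04


def full_constancy_ratio(lit=LUM_WOOD_LIT, shadow=LUM_WOOD_SHADOW):
    """Assumed illumination ratio between lit and shadowed regions."""
    return lit / shadow


def predicted_match(standard_luminance, ratio):
    """Test luminance expected for a given context/normal ratio."""
    return standard_luminance * ratio


ratio_full = full_constancy_ratio()
print(f"full-constancy ratio: {ratio_full:.2f}")

# Predicted settings for a 19 cd/m^2 standard lying inside the shadow.
for label, ratio in [("no context effect", 1.0),
                     ("reported average", 1.35),
                     ("full constancy (assumed)", ratio_full)]:
    print(f"{label}: {predicted_match(19.0, ratio):.1f} cd/m^2")
```

On this reading, the reported average ratio of 1.35 sits much closer to a pure luminance match than to complete constancy, while still differing reliably from 1.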

Finally, we analyzed the subjects' response time to evaluate the effort the subject made to perform the task. We found a significant main effect of Groups [*F*(2,219) = 18.91, *p* < 0.001, η<sup>2</sup> = 0.15], and a significant main effect of condition [*F*(1,219) = 25.53, *p* < 0.001, η<sup>2</sup> = 0.10]. There was no interaction between the two factors. *Post hoc* comparisons showed that non-artists spent less time (9.67 s) than art students (11.44 s, *p* < 0.003), and the professional artists were slower still (14.31 s, *p* < 0.03).

Our present findings revealed that non-artists, art students, and professional artists were all strongly affected in their brightness (luminance) judgment when a cast shadow was overlapping the position of the standard ellipse. Subjects perceived the standard about 30% brighter than it was, and thus showed a strong effect of lightness constancy (all ratios were significantly greater than 1) despite being asked to ignore the shadow context.

As was the case for the size task, art students and professional artists again took significantly longer than non-artists to make their setting on each trial, suggesting that they put more effort into doing the task. Nevertheless, this extra effort and their substantial expertise did not allow them to overcome lightness constancy. Finally, and consistent with our first experiment's results, we found no correlation between the effect of context and the subject's art experience.

# **EXPERIMENT: AMODAL COMPLETION**

Amodal completion, another instance of perceptual constancy (e.g., Rock, 1983), is a phenomenal completion of an object's shape even though some of its parts are occluded by another, intermediate object (e.g., Kanizsa, 1979). Despite the lack of information concerning the occluded parts of the far object, our perception of this object seems to remain complete so that, even if the object is separated into two visible parts by the occluder, we know that the different parts belong to the same object (Kanizsa, 1985). These completion phenomena have been explained in terms of Gestalt configuration laws, such as collinearity (good continuation, e.g., Kellman and Shipley, 1991), similarity, and so forth. Such laws are largely implemented by low-level mechanisms (e.g., edge detection, line orientation, and size discrimination in V1; problem-solving of "border-ownership" in V2 complex cells, e.g., Bruno et al., 1997; Rensink and Enns, 1998; Tse, 1999; Wolfe and Horowitz, 2004).

The processing of visual shape proceeds principally from an analysis of the parts (*mosaic* stage) to that of the whole (*completion* stage) where independence from vantage point and completion of missing details emerge. Surprisingly, our conscious access to the object does not seem to follow the same sequence, but rather the reverse (Hochstein and Ahissar, 2002). Several visual search studies have demonstrated that the individual parts of an object are accessed after the percept of the whole object, even when the whole object is not presented (it is partially hidden, He and Nakayama, 1992; Rensink and Enns, 1998; Wolfe and Horowitz, 2004). For example, He and Nakayama (1992) reported that searching for an L-shape is more difficult when it appears touching an adjacent square. In this case, subjects seem to see not an L-shape but a square completed behind the occluder, thus camouflaging the L (**Figure 7**). Similar results have been found by Rensink and Enns (1998), where searching for a notched square touching a circle led to greater reaction times and to search slopes that were steeper than when it was isolated.

If object-level descriptions are the first representations available to conscious perception (Tse, 1999; Lee and Vecera, 2005), any task that requires access to an object's parts requires that the object be "unbundled," a step that requires extra time (Hochstein and Ahissar, 2002). Can visual artists better ignore the completed form of the object's representation and then access the "mosaic" image that would be present on our retina? In our task, subjects were instructed to locate the notched square (**Figure 8**) so, if the notch contacted the adjacent circle, it would normally be completed and appear as a partially hidden square, camouflaging the notched square shape. If artists have any special expertise in accessing early representations, prior to the completion step, they should find these targets faster than non-artists.

# **MATERIALS AND METHODS**

# *Stimuli*

We designed a visual search task based on Rensink and Enns' (1998) experiment using amodal completion. The target was a notched square generated by subtracting a circle shape overlapping a square (see **Figure 8**) and could be either red or green. For both colors we decreased the saturation by 90% so that neither seemed more salient than the other while remaining discriminable. The distractors were circles with a missing quarter sector (generated by subtracting a square shape overlapping a circle), which could also be green or red with the same saturation as the target. An additional item could accompany the target as well as the distractors. This added item was a green/red circle for the target, whereas it was a green/red square for the distractors. Depending on the condition, those paired items had a specific spatial relationship, either adjacent (mosaic condition) or touching (occlusion condition). The overall size spanned by the pair was 1.5˚ in the *mosaic* condition (notched square separated from the accompanying circle), and 1.13˚ in the *occlusion* condition (notched square touching the circle).

**FIGURE 8 | Shape task, stimuli, and conditions.** The target was a notched square that could be either green or red. In the "normal" condition, the target was free, while distractors were Pacman-like circles with a square as companion. In the "context" condition, the target was joined to an "occluding" circle, whereas distractors were squares overlapping a circle. In both conditions, target and distractors had the same overall size, and there were six isolated circles as supplementary distractors to prevent subjects from searching for a circle overlapping a square.

All the items were displayed in a 12˚ × 8˚ visual array centered on the screen. Their positions were randomly distributed within an invisible 6 × 4 grid. The set size was randomly chosen among 2, 8, or 12 items, and all the displayed elements were jittered by ±0.5˚ to avoid item collinearity that could help the subject find the target. To avoid alternative cues to the target pair, we added six isolated circles that were either green or red. The circles' size was approximately 0.77˚.
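The placement rule described above (random cells of an invisible 6 × 4 grid inside a 12˚ × 8˚ array, with ±0.5˚ jitter) can be sketched as follows. This is an illustrative reconstruction rather than the original code: it assumes that each item or item pair occupies one grid cell, that items are centered in their cell before jitter, and that jitter is drawn uniformly and independently in x and y. The function name is hypothetical.

```python
import random

ARRAY_W_DEG, ARRAY_H_DEG = 12.0, 8.0  # size of the search array
N_COLS, N_ROWS = 6, 4                 # invisible placement grid
JITTER_DEG = 0.5                      # maximum jitter in each direction


def sample_positions(n_items, rng=None):
    """Return (x, y) positions in degrees relative to the array center."""
    rng = rng or random.Random()
    cell_w = ARRAY_W_DEG / N_COLS
    cell_h = ARRAY_H_DEG / N_ROWS
    cells = [(col, row) for col in range(N_COLS) for row in range(N_ROWS)]
    chosen = rng.sample(cells, n_items)  # distinct cells, no overlap
    positions = []
    for col, row in chosen:
        # Cell center, expressed relative to the center of the array.
        x = (col + 0.5) * cell_w - ARRAY_W_DEG / 2.0
        y = (row + 0.5) * cell_h - ARRAY_H_DEG / 2.0
        # Uniform jitter breaks up collinearity between neighbors.
        x += rng.uniform(-JITTER_DEG, JITTER_DEG)
        y += rng.uniform(-JITTER_DEG, JITTER_DEG)
        positions.append((x, y))
    return positions


# Largest display: 12 search items plus the six isolated circles.
positions = sample_positions(12 + 6, random.Random(0))
print(len(positions))  # 18
```

With 2˚ cells and ±0.5˚ jitter, jittered items stay at least 0.5˚ inside their own cell borders, so neighboring items cannot land on the same point.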

# *Procedure*

The subject had to find a specific target presented among a set of distractors. A target was present on all trials, but could have one of two colors: red or green. The subject had to report the color of the target by pressing the "Z" key on the keyboard if the target was red, or the "N" key if it was green. Subjects were asked to use their two hands, one per key.

The target could have two different orientations: either upright or upside-down. In the *mosaic* condition, the target was isolated, that is, not bound to another item, whereas in the *occlusion* condition, the target was attached to a circle so that it appeared as a square occluded by a circle. In the former condition, the distractors were a Pacman-like shape accompanied by a square, while in the latter condition they were a circle occluded by a square. Distractors were designed so that their shapes in the *mosaic* condition corresponded to those in the *occlusion* condition.

The task was divided into a practice block and a test block. The practice block consisted of 30 trials to ensure that the subjects had understood the instructions and that they were able to discriminate the colors (green/red). The test block was designed as follows: at the beginning of each trial a black fixation cross was displayed at the center of the stimulus array for 1000 ms and the subject had to look at it. After its disappearance, the items were displayed for a maximum of 12 s, the time interval within which the subject had to respond. If the subject took too much time to respond, the message "too long" appeared and the experiment moved to the next trial.

The subjects had to respond as quickly as possible but keep the error rate below 10%. Each time they made an error, feedback including the current error rate was shown (computed on the basis of the total number of the errors they made over the total number of trials). Their reaction times were the dependent variable we measured.

The conditions were the spatial relationship between the target and its companion item (mosaic/occlusion), the number of items (2, 8, or 12), the target's color (green/red), and the target's orientation (upright/upside-down). There were 15 trials per condition, and hence 360 trials per subject (2 × 3 × 2 × 2 × 15). Those 360 trials were divided into two equal parts of 180 trials, and the subject was offered a short break between them.
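
The trial count follows directly from crossing the four factors. The short Python sketch below only enumerates the design cells to confirm the arithmetic; the variable names are illustrative, and it does not attempt to reproduce the trial ordering, which is not described in the text.

```python
from itertools import product

# Factor levels as described in the text.
relations = ("mosaic", "occlusion")
set_sizes = (2, 8, 12)
colors = ("green", "red")
orientations = ("upright", "upside-down")
REPEATS_PER_CELL = 15


def build_trials():
    """Return one dict per trial for the full 2 x 3 x 2 x 2 design."""
    cells = product(relations, set_sizes, colors, orientations)
    return [
        {"relation": r, "set_size": n, "color": c, "orientation": o}
        for (r, n, c, o) in cells
        for _ in range(REPEATS_PER_CELL)
    ]


trials = build_trials()
n_cells = len(relations) * len(set_sizes) * len(colors) * len(orientations)
half = len(trials) // 2  # trials per half of the session
print(n_cells, len(trials), half)
```

With 24 design cells and 15 repetitions each, the list contains 360 trials, which split into two halves of 180.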

#### **RESULTS**

We first analyzed the reaction times of the subjects as a function of the number of displayed items (target and distractors) and then computed linear regression slopes (**Figure 9**). The linear regression slopes show that subjects' reaction times increased linearly with the number of items (*R*<sup>2</sup> = 0.67 ± 0.01) and that the slopes in both the context (occlusion) and no-context (mosaic) conditions were steep, with an average of 178 ms/item for the context case and 89 ms/item for the no-context case. This difference between conditions was significant [*F*(1,34) = 173.47, *p* < 0.001, η<sup>2</sup> = 0.84], showing a strong effect of context; however, there was no effect of Groups [*F*(2,34) = 1.74, *p* = 0.19] and no interaction between Groups and Conditions [*F*(2,34) = 1.31]. A similar pattern of results held for the intercepts of these linear regressions. Because of the absence of group effects and interactions in the regression analysis, we could proceed to an analysis of the mean response times, as in the two previous experiments, calculating for each group the ratio between the mean response time in the context condition and that in the normal condition (**Figure 10**). We ran a one-way ANOVA on the individual ratios with Groups (non-artists, art students, professional artists) as the factor. This analysis revealed no main effect of Groups [*F*(2,34) = 0.30, *p* = 0.74]. This result is consistent with the absence of interaction between Groups and Conditions in the slope and intercept analyses. It suggests once again that artists (students and professionals) were not better than non-artists at accessing the raw image data of the target's L-shape. Nevertheless, all ratios were again significantly greater than 1 [*t*(36) = 17.88, *p* < 0.001], indicating a strong effect of context. The average ratio was 1.50, where a ratio of 1 would indicate no effect of context.
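
As a rough illustration of the two measures used here, the sketch below fits a least-squares line to mean reaction times as a function of set size and computes the context/normal ratio of mean response times. The reaction times are made-up values chosen for the example, not data from the study, and the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


set_sizes = [2, 8, 12]
# Hypothetical mean RTs (ms) for one subject; NOT values from the study.
rt_mosaic = [900.0, 1440.0, 1800.0]      # no-context condition
rt_occlusion = [1100.0, 2180.0, 2900.0]  # context condition

slope_m, intercept_m = fit_line(set_sizes, rt_mosaic)
slope_o, intercept_o = fit_line(set_sizes, rt_occlusion)

# Ratio of mean RT in the context condition to mean RT in the normal condition.
ratio = (sum(rt_occlusion) / len(rt_occlusion)) / (sum(rt_mosaic) / len(rt_mosaic))
print(slope_m, slope_o, round(ratio, 2))
```

A ratio above 1 means that search was slower in the context condition, which is how the ratios reported above should be read.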

To address the question of whether artists' ability to overcome the effect of context can be explained by their years of art training, we analyzed the correlation between the individual ratios and the individual self-reported experience. As in the first two experiments, we found no correlation between ratios and subjects' experience (*r* = −0.13, ns).

Visual search tasks allow us to quantify approximately the time that attention spends on every visual object (e.g., Treisman and Gormican, 1988; Wolfe, 1998; Wolfe and Horowitz, 2004; Nakayama and Martini, 2011). Previous articles have shown that accessing the visible part of an occluded object takes more time than when the partial shape is isolated (He and Nakayama, 1992; Rensink and Enns, 1998). Consistent with these earlier results, we also find that visual search for a notched square was slower when the notch was contacting a circle than when it was isolated, indicating a strong effect of context even though subjects were instructed to ignore it and look for L-shapes. Finally, as in the first two experiments, ratios between the mean response times in the context and normal conditions did not correlate with subjects' experience.

**FIGURE 9 | Reaction times as a function of number of items in display with group regression slopes and intercepts for normal and context conditions.** While non-artists and art students did not show differences between their slopes, professional artists were numerically slower in both conditions, but this difference was not significant. No main effect of groups was found for either the slopes or the intercepts.

#### **GENERAL DISCUSSION**

Visual constancies, such as those of size, lightness, or shape, are known to depend on both low-level, automatic mechanisms and high-level, attentive processing. Our conscious perception emerges with appropriate corrections for the context in the scene (Hochstein and Ahissar, 2002; Ahissar and Hochstein, 2004). This makes sense since we need to recognize objects for what they are, bypassing the particular details of how they arrived on our retina. Although this top-first strategy may be useful for our action in everyday life, visual artists have different goals. They must capture exactly those low-level details that broadly match what lands on our retina. Our present study asked whether visual artists like painters and draftsmen can really access this proximal representation or if they are as much affected by visual context and visual constancies as non-artists, *even when asked explicitly to ignore context*. One could expect that the intensive training of artists might modify the functional organization of the visual brain to allow artists faster access to the early visual information that they need to reproduce in their artwork.

Indeed, several previous studies have reported that visual artists outperformed non-artists in many visual tasks: mental imagery (e.g., Calabrese and Marucci, 2006), object recognition, visual search for embedded shape and Gestalt completion (Kozbelt, 2001). Other studies have shown that artists were also less influenced by shape constancy in a drawing task, as well as in a perceptual task (Mitchell et al., 2005; Cohen and Jones, 2008). Both Mitchell et al. (2005) and Cohen and Jones (2008) have related reduced effects of shape constancy to drawing accuracy. All those findings would suggest that, because artists are more accurate in depicting objects, they should be less influenced by their conceptual knowledge, and perhaps they would rely more on their present raw, early level representation than on their past knowledge.

However, the results of our three experiments, two matching-to-standard tasks and one visual search task, showed that art students and professionals do not differ from non-artists in their ability to ignore perceptual context. Indeed, in all three tasks, all the groups' ratios were significantly greater than 1, showing a significant effect of visual context on their settings. In the first two cases, we found that judgments for size were shifted by an average of 8 and 35% from veridical by the context (perspective and cast shadow). In the third, the amodal completion context slowed visual search by 50%.

These results argue against theories that suggested that artists' drawing accuracy is solely due to perceptual expertise. Moreover, all three experiments showed similar, significant effects of context for all groups even though the subjects were instructed to ignore context. There is no evidence here for plasticity in the visual systems of artists. It is possible that we might find some significant differences between artists and non-artists if we had more than the 23 artist and 14 non-artist subjects we tested here, or if we changed our tasks and insisted even more strongly that the subjects ignore the context and report what was on the screen. However, even so, there would not be much joy for those who would want to see artists with access to early representations. Our data did show significant, large effects of context, and the best the artists did at reducing this was a non-significant decrease in context effect of about 10% compared to the non-artists' ratio in the second experiment (lightness). This is far from the 100% reduction that would be required to be able to paint based on "seeing the proximal image."

Although there is little evidence of any visual system plasticity from all those years of training, there is evidence that their training did affect their performance in the matching tasks but in a different way: they took a very long time to make their settings compared to non-artists. The tasks were not easier for them as we would expect if they had special perceptual expertise. It suggests instead that artists may have found the tasks a personal challenge to their self-image as artists and so they spent more time, perhaps trying to apply specific strategies that they had learned to deal with depicting size and lightness. But to no avail. The visual search task also showed no advantage for artists, again giving no support to the possibility of a direct, more rapid access to a low-level visual representation.

According to the Gombrich (1987) model of schemata, artists act as copyists, starting with a rough approximation to the scene they are painting. They then compare the depiction with the original and make corrections so that they look the same. This interpretation of the skills of artists does not require them to "see" their retina, the proximal stimulus. Yes, they may make initial errors in selecting a paint, having chosen a value that is more in line with what they "see," affected as it is by visual constancies. They can quickly correct it once it is in play on the canvas and subject to the same constancies from the context surrounding it on the canvas, just as the original object is surrounded by its context in the world.

Nevertheless, our experiments examined only perceptual factors. The visual arts are not only visual but also motor, as they involve the drawing task itself. Isolating the perceptual factor allowed us to argue against perceptual expertise as a contributing factor to the difference in drawing skills between artists and non-artists. However, the expertise of visual artists may only emerge in tasks that call on artists to actually produce works of art. Further research should assess the role of visual factors in tasks where artists produce artworks.

# **ACKNOWLEDGMENTS**

This research was supported by a Chaire d'Excellence grant to Patrick Cavanagh.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 June 2011; accepted: 12 December 2011; published online: 30 December 2011.*

*Citation: Perdreau F and Cavanagh P (2011) Do artists see their retinas? Front. Hum. Neurosci. 5:171. doi: 10.3389/fnhum.2011.00171*

*Copyright © 2011 Perdreau and Cavanagh. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **THE BRAIN ON ART: INTENSE AESTHETIC EXPERIENCE ACTIVATES THE DEFAULT MODE NETWORK**

**Edward A. Vessel, G. Gabrielle Starr and Nava Rubin**


#### *Edward A. Vessel <sup>1</sup> \*, G. Gabrielle Starr <sup>2</sup> and Nava Rubin <sup>3</sup>*

*<sup>1</sup> Center for Brain Imaging, New York University, New York, NY, USA*

*<sup>2</sup> Department of English, New York University, New York, NY, USA*

*<sup>3</sup> Center for Neural Science, New York University, New York, NY, USA*

#### *Edited by:*

*Alvaro Pascual-Leone, Beth Israel Deaconess Medical Center/Harvard Medical School, USA*

#### *Reviewed by:*

*Alvaro Pascual-Leone, Beth Israel Deaconess Medical Center/Harvard Medical School, USA*

*Luis M. Martinez, Universidade de A Coruña, Spain*

*Mark A. Halko, Beth Israel Deaconess Medical Center, USA*

#### *\*Correspondence:*

*Edward A. Vessel, Center for Brain Imaging, New York University, 4 Washington Pl., Rm. 156, New York, NY 10003, USA. e-mail: ed.vessel@nyu.edu*

Aesthetic responses to visual art comprise multiple types of experiences, from sensation and perception to emotion and self-reflection. Moreover, aesthetic experience is highly individual, with observers varying significantly in their responses to the same artwork. Combining fMRI and behavioral analysis of individual differences in aesthetic response, we identify two distinct patterns of neural activity exhibited by different sub-networks. Activity increased linearly with observers' ratings (4-level scale) in sensory (occipito-temporal) regions. Activity in the striatum (STR) also varied linearly with ratings, with below-baseline activations for low-rated artworks. In contrast, a network of frontal regions showed a step-like increase only for the most moving artworks ("4" ratings) and non-differential activity for all others. This included several regions belonging to the "default mode network" (DMN) previously associated with self-referential mentation. Our results suggest that aesthetic experience involves the integration of sensory and emotional reactions in a manner linked with their personal relevance.

**Keywords: aesthetics, preference, fMRI, visual art, default mode network**

# **INTRODUCTION**

Human beings in every culture seek out a variety of experiences which are classified as "aesthetic"—activities linked to the perception of external objects, but not to any apparent functional use these objects might have. Looking at paintings, listening to music, or reading poems—these are hedonic experiences in which humans consistently choose to engage. And although the relevant objects in and of themselves have no immediate or direct value for survival or for the satisfaction of basic needs (food, shelter, reproduction), they nevertheless accrue great value within human culture. What are the neural underpinnings of aesthetically moving experience?

Although the foundation of aesthetic inquiry as a formal scholarly discipline is relatively recent (the philosopher Alexander Baumgarten introduced the modern use of the term in 1739), musings about the nature of "beauty" date back at least as far as Plato (Plato, 1989) and Confucius, and evidence exists of well-developed artistic traditions in most of the world's ancient cultures (e.g., China, India, Egypt, Mesopotamia, Persia). But it is only recently that it has become possible to investigate the physiological bases of aesthetic experience. Recent neuroimaging studies have identified several brain regions whose activation correlates with a variety of aesthetic experiences, namely locations in the anterior medial prefrontal cortex (aMPFC) and the caudate/striatum, with several additional regions detected in some studies but not others (Blood and Zatorre, 2001; Cela-Conde et al., 2004; Kawabata and Zeki, 2004; Vartanian and Goel, 2004; Jacobsen et al., 2006; Di Dio and Gallese, 2009; Kirk et al., 2009; Ishizu and Zeki, 2011; Lacey et al., 2011; Salimpoor et al., 2011). These findings form the initial basis for the field of neuroaesthetics, but key questions remain. In this study we examined more closely issues surrounding the intensity and diversity of aesthetic responses.

A major theme in philosophical inquiry into aesthetic experience is a tension between universality and subjectivity. On one hand, many authors have argued that aesthetic evaluations rely on universal principles. On the other, philosophical inquiry also emphasized the importance of understanding aesthetic responses as strongly subjective. These two views are not, in principle, mutually exclusive: subjective judgment may lead to aesthetic evaluations that are so consistent across individuals as to be termed universal. Indeed, the notion of universal aesthetics relies on the observation of wide agreement among people about the aesthetic value of certain objects or classes of objects (e.g., flowers; Scarry, 1999). Yet aesthetic judgments are not only subjective but also highly susceptible to cultural norms, education, and exposure. Thus, while there may be certain items that command consensus in their evaluations, for the majority of artifacts judgments can vary widely.

This variation in aesthetic judgments can be used to isolate the neural dimensions of aesthetic *responses* as opposed to reactions to particular features of a given work of art (e.g., Kawabata and Zeki, 2004; Salimpoor et al., 2011). To date, most studies have used stimuli that generated wide agreement. Putative subjective aspects of an experience were potentially confounded with differences in the stimuli themselves. Another fundamental problem is that using stimuli on whose aesthetic value people tend to agree necessarily gives more weight to common internal factors—be they driven by culture or by evolution—and leaves little room for truly individual aspects of subjective aesthetic experience to emerge. We solved this by using stimuli for which people expressed strongly individual preferences. These large individual differences enable us to use the diversity of visual artwork to parse out the different components of aesthetic experience.

To allow for these individual preferences to emerge, an important guiding principle in the choice of our stimulus set was that it should span a variety of styles and periods (see **Figure 1**). One way in which diverse stimuli may lead to individual differences is that they invoke a variety of emotions—an aesthetic response includes evaluations that can vary in valence and degree of arousal, from "preference" and "pleasure" to "beauty," "sadness," "awe," or "sublimity" (Frijda and Sundararajan, 2007; Zentner et al., 2008). Therefore, our instructions to observers explicitly acknowledged that strongly moving aesthetic experiences may come in a variety of forms, not merely beauty and preference. With this paradigm, we find large individual differences in which of the artworks observers find aesthetically moving: on average, each image that was highly recommended by one observer was given a low recommendation by another. Therefore, any BOLD effects found in a contrast of high vs. low recommendation reflect differences in aesthetic reaction, not stimulus features.

Differences in subjective experience may arise not only from differences in the emotions that a given artwork evokes but also from how different individuals weigh these emotions. To examine this, observers also responded to a nine-item questionnaire addressing evaluative and emotional components of their aesthetic experience for each artwork.

We find that brain regions differentially activated by artworks given high and low aesthetic recommendations can be classified into two distinct sets by virtue of the pattern of their response. BOLD activation varied linearly with observers' ratings in several sensory (occipito-temporal) regions. Activity in the striatum (STR) and pontine reticular formation (PRF) also varied linearly with ratings but straddled their resting baseline, exhibiting below-baseline activations for low-rated artworks. In contrast, a separate network of frontal and subcortical regions showed a step-like increase only for the most moving artworks ("4" ratings) and non-differential activity for all others. This included several regions belonging to the "default mode network" (DMN) previously associated with self-referential mentation, such as the aMPFC. Within these networks, we observed sensitivity to positive and negative emotional aspects of aesthetic experience, and evidence for individual differences correlated with personal differences in aesthetic evaluation.

# **MATERIALS AND METHODS**

# **OBSERVERS**

Sixteen observers were recruited at New York University (11 male; 13 right-handed; 27.6 ± 7.7 years) and paid for their participation. All had normal or corrected to normal vision. Informed consent was obtained from all participants, in accordance with the New York University Committee on Activities Involving Human Subjects.

database (http://www.oclc.org/camio). See List of artworks for image credits and the full list of artworks used in the experiment.

# **STIMULI**

One hundred and nine images were selected from the Catalog of Art Museum Images Online database (CAMIO: http://www.oclc.org/camio; **Figure 1** and List of Artworks). CAMIO contains more than 90,000 images of textiles, paintings, architecture, and sculpture from museum collections around the world. The works of art came from a variety of cultural traditions (American, European, Indian, and Japanese) and from a variety of historical periods (from the 15th century to the recent past). Images were representational and abstract, and could be roughly classified as either female figure(s) (33), male figure(s) (23), a mixed group (20), still life (11), landscape (14), or abstract painting (8). These classifications did not show significant effects on responses.

Commonly reproduced images were not used, in order to minimize recognition. Most observers recognized no images, and no observer recognized more than a very few (3–5) stimulus images as reported by survey responses.

Images were scaled such that the largest dimension did not exceed 20° of visual angle, and the area did not exceed 75% of a 20° box. Stimulus presentation and response collection were controlled using a Macintosh G4 running Matlab 6.5 and the Psychophysics Toolbox (Brainard, 1997).

# **PROCEDURE**

Observers were told they would be viewing a set of artworks while lying in the scanner. They were to use a scale of 1–4 by pressing a button on a hand-held response box to answer the question "how strongly does this painting move you?" according to the following instructions:

Imagine that the images you see are of paintings that may be acquired by a museum of fine art. The curator needs to know which paintings are the most aesthetically pleasing based on how strongly you as an individual respond to them. Your job is to give your gut-level response, based on how much you find the painting beautiful, compelling, or powerful. Note: The paintings may cover the entire range from "beautiful" to "strange" or even "ugly." Respond on the basis of how much this image "moves" you. What is most important is for you to indicate what works you find powerful, pleasing, or profound.

Each observer viewed all 109 artworks; the order was counterbalanced across observers to control for possible serial order effects. Observers were instructed prior to entering the magnet and given practice trials using artworks not in the stimulus set.

# *Nine-item evaluative questionnaire*

After the fMRI session, observers were given a short break, and were then taken to a behavioral lab where they sat in front of a computer screen to complete a nine-item questionnaire. They were shown the same set of paintings in the same order as in the scanner. Each painting was shown for 6 s. Observers were asked to rate the intensity with which each artwork evoked the following evaluative/emotional responses: joy, pleasure, sadness, confusion, awe, fear, disgust, beauty, and the sublime. Responses to this nine-item questionnaire were given using mouse clicks on a visual seven-point scale for each item. These items were presented in random order on each trial. Observers could respond to the nine items in any order, but could not change ratings.

Observers ranged from those with novice-level experience of art and art history to several having completed some undergraduate study in the history of art (evaluated using a survey at the time of the experiment). Before entering the scanner, observers were also administered the Positive and Negative Affect Schedule (PANAS; Watson et al., 1988). PANAS is a highly stable and internally consistent metric for dispositional affect (mood), used to determine how frequently an observer experiences positive and negative affect in a defined time period. Observers in this study were asked to answer questions with regard to the immediately preceding few days.

#### **fMRI SCANNING PROCEDURES**

fMRI scans were carried out at New York University's Center for Brain Imaging, using a 3-T Siemens Allegra scanner and a Nova Medical Head coil (NM011 head transmit coil). Artworks were projected onto a screen in the bore of the magnet and viewed through a mirror mounted on the head coil.

The 109 artworks were divided into four sets (different subsets per observer, depending on order) and shown over the course of four functional scans using a slow event-related design. During these functional scans, the blood oxygen level dependent (BOLD) signal was measured from the entire brain using thirty-six 3 mm slices aligned approximately parallel to the AC-PC plane (in-plane resolution 3 × 3 mm, TR = 2 s, TE = 30 ms, FA = 80°). Each trial began with a 1 s blank period, then a blinking fixation point for 1 s, followed by the artwork for 6 s and a blank screen for 4 s, during which the observer pressed a key corresponding to their recommendation. An additional 0, 2, or 4 s blank interval was inserted pseudorandomly between trials to jitter trial timing, with an average trial length of 13.14 s.
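
The trial timing can be checked with simple arithmetic. The sketch below sums the fixed trial components and derives the mean inter-trial jitter implied by the reported average trial length; the constant names are illustrative, and the actual jitter sequence is not given in the text.

```python
# Fixed trial components in seconds, as described in the text.
FIXED_COMPONENTS = {"blank": 1.0, "fixation": 1.0, "artwork": 6.0, "response": 4.0}
JITTER_OPTIONS = (0.0, 2.0, 4.0)
REPORTED_MEAN_TRIAL = 13.14
TR = 2.0

fixed_duration = sum(FIXED_COMPONENTS.values())
implied_mean_jitter = REPORTED_MEAN_TRIAL - fixed_duration
volumes_per_fixed_part = fixed_duration / TR
equal_use_mean_jitter = sum(JITTER_OPTIONS) / len(JITTER_OPTIONS)

print(fixed_duration, round(implied_mean_jitter, 2), volumes_per_fixed_part)
```

The fixed part of each trial lasts 12 s (six volumes at TR = 2 s), so the reported average implies a mean jitter of about 1.14 s. Equal use of the three jitter values would give 2 s, which suggests, as far as this arithmetic goes, that shorter intervals were used more often.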

Observers were also run in a localizer scan containing blocks of objects, scrambled objects, faces, and places. This 320 s scan consisted of four 18 s blocks of each stimulus type, during which the observer performed a "1-back" task (where observers monitor for exact repeats of an image). Each block contained 16 stimulus images plus two repeats, each presented for 800 ms with a 200 ms inter-stimulus-interval. The full-color images were placed on top of phase-scrambled versions of the same stimuli filling a 500 × 500 pixel square to control for differences in size across stimulus categories.
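
The localizer timing described above is internally consistent, as the following sketch shows. It only recomputes durations from the stated numbers; how the remaining scan time was used (for example, as blank periods) is not stated in the text and is left open here.

```python
STIM_TYPES = ("objects", "scrambled objects", "faces", "places")
BLOCKS_PER_TYPE = 4
IMAGES_PER_BLOCK = 16
REPEATS_PER_BLOCK = 2
STIM_MS = 800
ISI_MS = 200
SCAN_S = 320

# Each presentation occupies stimulus time plus the inter-stimulus interval.
block_s = (IMAGES_PER_BLOCK + REPEATS_PER_BLOCK) * (STIM_MS + ISI_MS) // 1000
stimulation_s = len(STIM_TYPES) * BLOCKS_PER_TYPE * block_s
remaining_s = SCAN_S - stimulation_s  # time not accounted for by stimulus blocks

print(block_s, stimulation_s, remaining_s)
```

Eighteen presentations of 1 s each give the stated 18 s block, and the sixteen blocks account for 288 s of the 320 s scan.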

A high-resolution (1 mm<sup>3</sup>) anatomical volume (MPRAGE sequence) was obtained after the functional scans for registration and spatial normalization.

# **BEHAVIORAL DATA ANALYSIS**

For the observers' recommendations collected during the scanning session, a measure of agreement across individuals was computed by taking the set of 109 recommendations for every pair of observers and computing the Pearson correlation coefficient. Images with any missing recommendation values were excluded from the correlations in a pairwise manner. One observer gave no "4" recommendations, and was, therefore, excluded from subsequent analyses relying on the contrast of "4" vs. "1" responses. Similarly, a measure of across-observer agreement was computed for each item of the nine-item questionnaire collected after the scanning session. For each item, the Pearson correlation coefficient was computed for each pair of observers.
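
A minimal sketch of the agreement measure follows, assuming that missing recommendations are coded as `None` and dropped pairwise. The toy ratings and the function names are illustrative only, and the sketch does not guard against an observer who gives the same rating to every image.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson r over positions where both ratings are present (not None)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


def pairwise_agreement(ratings):
    """Map each pair of observers to the correlation of their ratings."""
    return {
        (a, b): pearson(ratings[a], ratings[b])
        for a, b in combinations(sorted(ratings), 2)
    }


# Toy ratings for three hypothetical observers and five images.
ratings = {
    "obs1": [1, 2, 3, 4, 2],
    "obs2": [4, 3, 2, 1, None],  # one missing recommendation
    "obs3": [1, 2, 3, 4, 4],
}
agreement = pairwise_agreement(ratings)

# With 16 observers there are 16 * 15 / 2 = 120 distinct pairs.
n_pairs_16 = len(list(combinations(range(16), 2)))
print(agreement[("obs1", "obs2")], n_pairs_16)
```

The 120 pairs produced by 16 observers are what yield the 119 degrees of freedom in the *t*-test reported in the Results.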

# *Factor analysis of evaluative questionnaire*

The responses on the nine-item questionnaire produced by each observer to each artwork (16 × 109 = 1744 trials total) were then converted to *z*-scores within observers and concatenated into a single large matrix of scores. Principal components extraction was used to identify factors with eigenvalues greater than one. Two emotional/evaluative factors survived and were rotated using the "direct oblimin" method, which does not require that the factors be orthogonal. Scores on these two factors were computed for each of the 1744 trials using regression (see **Figure 7**).
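
The extraction step can be sketched with NumPy under a few assumptions: the responses below are random stand-ins for the real questionnaire data, each item is *z*-scored within each observer across that observer's images (the text does not say whether scoring was per item), and the oblimin rotation and regression-based factor scores are not implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
N_OBS, N_IMG, N_ITEMS = 16, 109, 9

# Synthetic seven-point responses standing in for the real questionnaire data.
raw = rng.integers(1, 8, size=(N_OBS, N_IMG, N_ITEMS)).astype(float)

# z-score each item within each observer (across that observer's 109 images).
mean = raw.mean(axis=1, keepdims=True)
std = raw.std(axis=1, ddof=1, keepdims=True)
z = (raw - mean) / std

# Concatenate observers into one (1744 x 9) matrix of scores.
scores = z.reshape(N_OBS * N_IMG, N_ITEMS)

# Principal components of the item correlation matrix; keep eigenvalues > 1.
corr = np.corrcoef(scores, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
n_retained = int((eigvals > 1.0).sum())
print(scores.shape, n_retained)
```

Because the eigenvalues of a 9 × 9 correlation matrix sum to 9, the eigenvalue-greater-than-one rule keeps only components that explain more variance than a single standardized item.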

### **fMRI DATA ANALYSIS**

The scans were pre-processed using the FMRIB Software Library (FSL; Oxford, UK) to correct for slice-timing and motion, and were high-pass filtered at 0.0125 Hz. Subsequent analyses were performed using BrainVoyager QX (Brain Innovation, Maastricht, Netherlands). After alignment to observer-specific high-resolution anatomical images, the scans were normalized to Talairach space (Talairach and Tournoux, 1988), blurred with an 8 mm Gaussian kernel, and converted to *z*-scores.

# *4-vs.-1 whole brain analysis*

To identify regions sensitive to observer recommendation, a whole-brain random effects group-level general linear model (GLM) analysis was computed with the responses of each observer on each of the four possible recommendation levels coded as separate regressors (as a 6 s "on" period for each image convolved with a standard two-gamma hemodynamic response function, HRF). A contrast of the "4" regressors vs. the "1" regressors was computed and the resulting statistical map was corrected for multiple comparisons at a false discovery rate (FDR) of *q* < 0.05 (Benjamini and Hochberg, 1995; Genovese et al., 2002) and a cluster threshold of five 3 mm<sup>3</sup> voxels. This contrast will be referred to as the 4-vs.-1 whole-brain analysis (see Appendix **Table A1** and **Figures 3**–**5**).
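
For readers unfamiliar with FDR control, the sketch below implements the step-up procedure of Benjamini and Hochberg (1995) on a small list of *p*-values. It is a generic illustration with made-up *p*-values; it does not reproduce the voxelwise implementation or the cluster threshold of the software used in the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking the tests that survive FDR control at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Every test ranked at or below max_k is declared significant.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            keep[idx] = True
    return keep


# Made-up p-values for eight hypothetical tests.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
significant = benjamini_hochberg(p_vals, q=0.05)
print(significant)
```

In this example only the two smallest *p*-values survive, even though five of the eight fall below an uncorrected threshold of 0.05.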

# *ROI analysis*

In order to compare BOLD activation for all four recommendation levels across these regions, the group-level clusters from the 4-vs.-1 analysis were used to draw regions-of-interest (ROIs) from which we extracted timeseries for each observer. Using the average (over voxels in the ROI) of non-blurred, *z*-scored timeseries for each scan, individual observer parameter estimates for each of the four recommendation levels were obtained using a GLM with a standard two-gamma HRF convolved with a 6 s "on" period for each image (see **Figures 3**–**5**). Standard errors were computed across observers.
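
The parameter estimation described above can be sketched as follows. The HRF shape parameters, the onset times, and the synthetic time series are assumptions made for the example (the text says only that a standard two-gamma HRF was used), so this is an illustration of the general approach, not the analysis pipeline itself.

```python
import math
import numpy as np

TR = 2.0  # seconds, as in the scanning protocol


def two_gamma_hrf(t, shape_peak=6.0, shape_under=16.0, ratio=1.0 / 6.0):
    """Difference of two gamma densities; shape parameters are assumed."""
    def gamma_pdf(x, shape):
        return np.where(x > 0, x ** (shape - 1) * np.exp(-x) / math.gamma(shape), 0.0)
    return gamma_pdf(t, shape_peak) - ratio * gamma_pdf(t, shape_under)


def make_regressor(onsets_s, n_vols, duration_s=6.0):
    """Boxcar for each 6 s image period, convolved with the HRF."""
    boxcar = np.zeros(n_vols)
    for onset in onsets_s:
        start = int(round(onset / TR))
        boxcar[start:start + int(round(duration_s / TR))] = 1.0
    hrf = two_gamma_hrf(np.arange(0, 32, TR))
    return np.convolve(boxcar, hrf)[:n_vols]


n_vols = 120
# Hypothetical image onsets (s) grouped by recommendation level.
onsets_by_level = {1: [10, 62, 114], 2: [24, 88], 3: [36, 140, 192], 4: [50, 166]}

# One regressor per recommendation level plus a constant term.
X = np.column_stack(
    [make_regressor(onsets_by_level[k], n_vols) for k in (1, 2, 3, 4)]
    + [np.ones(n_vols)]
)
true_betas = np.array([0.1, 0.2, 0.3, 1.0, 0.0])
y = X @ true_betas  # noise-free synthetic ROI time series
betas, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(betas, 3))
```

With noise-free synthetic data the least-squares fit recovers the generating weights, which makes the sketch easy to check before applying the same steps to real ROI time series.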

# *4-vs.-321 whole brain analysis*

To further isolate processes particular to aesthetic response, we computed a second whole-brain contrast relying on the same whole-brain GLM as above, but with a new contrast of only the "4" recommendations vs. the average of all the other recommendation levels, balanced to add to zero [e.g., a linear contrast of (−1 −1 −1 3) for the 1, 2, 3, and 4 regressors]. The same statistical threshold was used to correct for multiple comparisons: an FDR of *q* < 0.05 and a cluster threshold of five 3 mm<sup>3</sup> voxels. This contrast will be referred to as the 4-vs.-321 whole-brain analysis (see **Figure 6**). Note that this contrast may lead to the discovery of new activations not found in the original 4-vs.-1 analysis. Given the widely extended and interconnected nature of the resulting whole-brain map, we do not report the full set of activation coordinates; most of the peak activations were coincident with regions reported for the 4-vs.-1 contrast. Group-level ROIs were isolated for four prominent activations not found in the 4-vs.-1 contrast: the anterior medial prefrontal cortex (aMPFC), the left hippocampus (HC), left substantia nigra (SN), and the left posterior cingulate cortex (PCC). It was not possible to draw an isolated ROI for the aMPFC from this contrast given the large swath of activation; we therefore drew a more restricted ROI for the aMPFC based on the 4-vs.-1 whole-brain contrast, but with a statistical threshold of *p* < 0.001.
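
To see why the two contrasts can pick out different regions, the sketch below applies both sets of weights to two idealized response profiles. The profiles are hypothetical numbers chosen so that both have the same 4-vs.-1 difference; they are not estimates from the study.

```python
# Contrast weights for the regressors of recommendation levels 1, 2, 3, and 4.
CONTRAST_4_VS_1 = (-1, 0, 0, 1)
CONTRAST_4_VS_321 = (-1, -1, -1, 3)


def contrast_value(betas, weights):
    """Weighted sum of the per-level parameter estimates."""
    return sum(b * w for b, w in zip(betas, weights))


# Idealized (hypothetical) response profiles across levels 1-4.
linear_profile = (0.0, 1.0, 2.0, 3.0)  # activity grows steadily with rating
step_profile = (0.0, 0.0, 0.0, 3.0)    # activity rises only for "4" ratings

results = {
    name: (contrast_value(p, CONTRAST_4_VS_1), contrast_value(p, CONTRAST_4_VS_321))
    for name, p in (("linear", linear_profile), ("step", step_profile))
}
print(sum(CONTRAST_4_VS_321), results)
```

Both profiles give the same value on the 4-vs.-1 contrast, but the step-like profile gives the larger value on the 4-vs.-321 contrast, so the second contrast is weighted toward regions that respond selectively to the most moving artworks.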

# *ROI analysis of evaluative factors*

The trial-by-trial scores for the two factors extracted from the principal components factor analysis of the nine-item evaluative questionnaire were used to create BOLD predictors by convolving with a standard two-gamma HRF with a length of 1 TR (2 s) and a delay of 1 TR relative to image onset. This middle TR was chosen as a compromise given our uncertainty about when, during a 6 s viewing, an observer was able to integrate enough information across successive fixations of an artwork to generate an affective response. The resulting timecourses were combined with an "Image On" predictor and orthonormalized using the Gram-Schmidt process before being entered into a GLM predicting BOLD activation in each of the ROIs identified in the whole-brain analysis (see **Figure 7**).
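
A compact sketch of Gram-Schmidt orthonormalization is given below, using the numerically more stable "modified" variant. The three short vectors are toy predictors, and the ordering ("Image On" first, then the two factors) is an assumption, since the text does not state the order in which predictors were orthogonalized.

```python
import numpy as np


def gram_schmidt(columns):
    """Orthonormalize vectors in the given order (modified Gram-Schmidt)."""
    basis = []
    for v in columns:
        w = np.asarray(v, dtype=float).copy()
        for q in basis:
            w -= (q @ w) * q  # remove the component along each earlier vector
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("predictors are linearly dependent")
        basis.append(w / norm)
    return np.column_stack(basis)


# Toy predictors over six time points (hypothetical values).
image_on = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
factor_1 = [0.5, -1.0, 2.0, 0.0, 0.0, 0.0]
factor_2 = [1.0, 0.0, -1.0, 0.0, 0.0, 0.0]

Q = gram_schmidt([image_on, factor_1, factor_2])
print(np.round(Q.T @ Q, 6))
```

Because each later predictor has the earlier ones projected out, the order matters: whatever variance a factor shares with "Image On" is assigned to the predictor that comes first.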

### *Individual differences analysis of evaluative questionnaire*

We performed an analysis of individual differences in responses to the nine-item evaluative questionnaire and their relationship to BOLD activation. Each observer's recommendations and their subsequent responses on the nine items were converted to *z*-scores, and then concatenated into a single large matrix (16 observers × 109 images = 1744 rows). We performed a stepwise regression analysis in SPSS (IBM, Somers, NY) of observers' recommendations against their responses to the nine items to eliminate redundant terms or terms which had no significant predictive power for recommendations. Individual standardized beta weights were then computed for how well each of the items surviving this procedure predicted recommendations, entered in order from most-to-least predictive at the group level (see Appendix **Table A2**). The resulting beta weights, which can be conceptualized as reflecting the weight an observer places on a particular emotion/evaluation when making recommendations, were used to predict the size (across observers) of the 4-vs.-1 BOLD effect in the set of ROIs identified in the whole-brain recommendation-based analysis. This yielded an overall *R*<sup>2</sup> for each ROI and beta weights for each of the items with associated confidence intervals. A significant effect in this analysis would indicate that variability *across* observers in the size of the BOLD effect in an ROI is related to variability in how much individual observers weigh a particular emotion/evaluation when making recommendations (see **Figure 8**).
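The standardization and stacking step described above can be sketched as below. The ratings are generated by a made-up deterministic rule and are not the study's data. Only one column is shown rather than the recommendation plus nine items, and the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form was used.

```python
import math

N_OBSERVERS = 16
N_IMAGES = 109


def zscore(values):
    """Standardize to mean 0 and SD 1 (sample SD, n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(var)
    if sd == 0:
        raise ValueError("cannot z-score a constant series")
    return [(x - mean) / sd for x in values]


def fake_ratings(observer):
    """Made-up 1-4 ratings, for illustration only."""
    return [1 + (observer * 7 + image * 3) % 4 for image in range(N_IMAGES)]


# z-score within each observer, then stack all observers into one table.
rows = []
for observer in range(N_OBSERVERS):
    z = zscore(fake_ratings(observer))
    for image, value in enumerate(z):
        rows.append((observer, image, value))
```

Standardizing within each observer before concatenation removes differences in how observers use the rating scale. The pooled regression then reflects relationships among items rather than differences in observers' overall means or spreads.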

# **RESULTS**

There was very low agreement in recommendations across observers, as assessed by computing the correlations between observers' recommendations taken in pairs (**Figure 2**). The average agreement (0.13 ± 0.17) indicates quite low agreement for visual art compared to other kinds of stimuli (e.g., Vessel and Rubin, 2010). (The mean of this distribution is significantly different from zero by a *t*-test, *t*[119] = 8.72, *p* < 10<sup>−13</sup>, but Cronbach's alpha, a measure of inter-rater reliability, confirms the very low agreement, α = 0.709; Cronbach, 1951). This finding has an important methodological consequence: on average, each image highly recommended by one observer was given a low recommendation by another. Therefore, any BOLD effects found in a contrast of high vs. low recommendation reflect differences in aesthetic reaction, not features of the images.
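The numbers in this paragraph can be related to one another with a short calculation. The sketch below counts the observer pairs and evaluates the standardized form of Cronbach's alpha from the mean pairwise correlation, treating the 16 observers as the "items". Using the standardized form is an assumption; the authors may have computed alpha from raw variances, which would explain a small difference from the reported 0.709.

```python
from math import comb


def standardized_alpha(k, mean_r):
    """Standardized Cronbach's alpha from k raters and mean inter-rater r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


n_observers = 16
mean_pairwise_r = 0.13   # average agreement reported in the text

# Number of distinct observer pairs, and the matching one-sample t-test df.
n_pairs = comb(n_observers, 2)
df = n_pairs - 1

alpha = standardized_alpha(n_observers, mean_pairwise_r)
print(f"pairs = {n_pairs}, df = {df}, standardized alpha = {alpha:.3f}")
```

Sixteen observers yield 120 pairs, which matches the 119 degrees of freedom reported for the *t*-test. The standardized alpha comes out at roughly 0.705, close to the reported value.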

A whole-brain group contrast of trials in which an observer gave an image the highest recommendation ("4") vs. trials in which the image was given the lowest recommendation ("1") revealed a set of posterior, anterior, and subcortical brain regions that were correlated with observers' aesthetic recommendations (Appendix **Table A1**; see "Materials and Methods, 4-vs.-1 Whole brain analysis"). Below, we describe further the responses of these regions, grouped by the nature of the response. The groupings were based on an analysis beyond that which produced **Table A1** (4-vs.-1)—specifically, the pattern of responses across all four recommendation levels (see below). To examine those patterns, individual regions of interest (ROIs) were created based on the 4-vs.-1 whole-brain contrast, and the average timecourses were analyzed to estimate the response to each of the four response levels (see "Materials and Methods, ROI analysis").

In posterior (occipito-temporal) ROIs, there was a linear relationship between recommendation level and BOLD response (**Figure 3**; left inferior temporal sulcus, ITS: −49, −61, −2; left parahippocampal cortex, PHC: −31, −32, −15; right superior temporal gyrus, STG: 52, −10, 7). In left ITS and left PHC, BOLD response increased in an approximately linear fashion above resting baseline for increasing recommendations. Similarly, BOLD signal in right STG decreased in an approximately linear fashion below resting baseline for decreasing aesthetic reactions.

In two subcortical regions, the left striatum (STR) and the pontine reticular formation (PRF), there was also a linear relationship between recommendation and BOLD activation. But in contrast to occipito-temporal ROIs, BOLD response levels straddled the resting baseline (**Figure 4**; STR: −12, 10, 6; PRF: 0, −28, −17). Thus, highly-rated images led to activation greater than baseline and low-rated images led to decreases from the resting baseline.

In contrast with the linear relation between recommendation and BOLD response observed in the occipito-temporal and subcortical regions above, frontal ROIs identified in the 4-vs.-1 contrast (Appendix **Table A1**) revealed a markedly different pattern of responses. In the left inferior frontal gyrus, pars triangularis (IFGt), left lateral orbitofrontal cortex (LOFC), and left superior frontal gyrus (SFG) there was a non-linear, "step-like" pattern relating aesthetic recommendation and BOLD response (**Figure 5**). Activation in left IFGt (−50, 32, 12) and left LOFC (−35, 24, −4) was near baseline for artworks given a 1, 2, or 3 recommendation, but was strikingly higher for artworks given a 4, the highest recommendation (**Figure 5**; right-middle panels). The left SFG (−5, 19, 62) also showed this non-linear, step-like pattern, though shifted downward such that artworks rated 1, 2, or 3 were significantly below baseline and only artworks rated 4 were at baseline (**Figure 5**, top panel). Similarly, activation in the left mediodorsal thalamus (mdThal: −6, −18, 12), which is heavily bidirectionally connected to the prefrontal cortex (Tobias, 1975; Tanaka, 1976; Behrens et al., 2003), showed a non-linear pattern of BOLD response with little differentiation for artworks given recommendations of 1, 2, or 3, but a much higher response for artworks given a 4 (**Figure 5**, bottom right).

**FIGURE 3 | Posterior occipito-temporal regions of cortex show linear deflections from baseline with increasing recommendation.** The whole-brain images illustrate the t-statistic for the 4-vs.-1 contrast. Panels on the right illustrate the average beta weight (as a *z*-score) for each recommendation level, averaged across 15 observers (*l*ITS = left inferotemporal sulcus; *l*PHC = left parahippocampal cortex; *r*STG = right superior temporal gyrus). Error bars are standard errors of the mean across observers.

# **HIGHLY MOVING IMAGES ENGAGE THE DEFAULT-MODE NETWORK AND RECRUIT ADDITIONAL NEURAL SYSTEMS**

The strikingly higher response of frontal regions for artworks rated as the most aesthetically pleasing over all other artworks lends initial support to the hypothesis that a "4" recommendation was fundamentally different from a 1, 2, or 3, and that these trials were not just revealing "more" activation in a general network sub-serving preferences, but that they reflected the engagement of an additional process. To test this hypothesis further, we calculated a second whole-brain contrast between just the trials resulting in a rating of 4 and the average of *all* other trials (ratings of 1, 2, or 3; see "Materials and Methods, 4-vs.-321 Whole brain analysis"). This new analysis gave us more power to detect regions showing a difference for trials rated as 4 but that may not have been detected in the 4-vs.-1 contrast.

This 4-vs.-321 contrast revealed a large swath of activation on the medial surface of the left hemisphere, extending from the anterior medial prefrontal cortex (aMPFC: −6 38 4) to the SFG activation seen in the 4-vs.-1 contrast (**Figure 6** top left). The aMPFC is known to be a core region of the DMN (Shulman et al., 1997; Mazoyer et al., 2001; Raichle et al., 2001), and, as expected, inspection of the response to all four recommendation levels in this region shows a *decrease* in activation below baseline for presentation of most images (those rated a 1, 2, or 3). In contrast, those artworks rated as the most aesthetically moving (recommendation of 4) lead to BOLD activation at aMPFC's resting baseline (**Figure 6** top right). In other words, activation in the aMPFC for highly moving artworks is not suppressed, as it is for most artworks and most other types of external stimuli. The left posterior cingulate cortex (PCC: −9 −49 18), another core region of the DMN, showed a similar, though less striking, pattern of activation (**Figure 6**, middle right).

In addition to the aMPFC and PCC, the 4-vs.-321 contrast also revealed several subcortical regions showing significantly higher activation for only the highest rated artworks. The left substantia nigra (SN: −8, −12, −6) and the left hippocampus (HC: −30 −21 −10; **Figure 6** bottom panel) were not differentially activated by trials rated as 1, 2, or 3, but did show significantly greater activation for trials that resulted in recommendations of 4.

It is important to note that the differential response across the four recommendation levels cannot simply reflect response selection, as observers are selecting a response on every trial. It is also unlikely that the BOLD effects reflect an implicit mapping of a "4" response to a "yes" response, and not to aesthetic experience *per se*. If this were the case, one might expect to see faster response times on those trials. However, when we analyzed observers' mean response times for trials of each recommendation level separately, we saw no such effect [one-way ANOVA with subjects as a random effect; *F*(3, 56) = 0.44, *p* = 0.73].

# **SEPARABLE BOLD RESPONSES TO POSITIVE AND NEGATIVE ASPECTS OF AESTHETIC EVALUATION**

Aesthetic experiences can invoke a wide variety of evaluative and emotional responses. Following the fMRI session, observers saw each artwork a second time and rated the degree to which it brought about a specific response on a nine-item questionnaire of evaluative terms (see "Materials and Methods, Nine-item evaluative questionnaire"): pleasure, fear, disgust, sadness, confusion, awe, joy, sublime, and beauty.

Evaluative reactions to individual paintings were not consistent across individual observers (average across-observer correlations of 0.13, 0.49, 0.29, 0.38, 0.32, 0.30, 0.16, 0.17, and 0.17 for each term respectively; standard deviations ranging from 0.10 to 0.20). The range of agreement on these items illustrates that the variability in recommendations across observers was partly driven by different feelings being evoked by each painting (e.g., low agreement for ratings of pleasure), and partly by different people placing different weights on those feelings (such as fear).

This variability at two stages—both in the mapping between artworks and feelings they evoke, and in mapping between evoked feelings and aesthetic recommendation—precludes any meaningful direct relationship (at the *group* level) between ratings of these nine items and activation in the set of brain regions revealed by the 4-vs.-1 and 4-vs.-321 whole-brain group analyses. One approach to understanding these subjective evaluative responses is to test whether there exists a reduced set of latent factors that are common across observers and can explain a significant proportion of the variance in responses.

A principal components factor analysis identified two group-level factors that together accounted for 59% of the variance in observers' ratings on the evaluative questionnaire (**Figure 7A**). Factor 1 had an eigenvalue of 3.045, accounting for 33.8% of variance; Factor 2 had an eigenvalue of 2.269, accounting for 25.2% of variance (see "Materials and Methods, Factor analysis of evaluative questionnaire"). Factor 1 loaded very highly on pleasure, beauty, and other positive questionnaire items, while Factor 2 loaded very highly on fear, disgust, and sadness (**Figure 7B**). Scores on these factors were computed for each observer looking at each image and used to re-analyze the BOLD timeseries from the previously identified set of ROIs (see "Materials and Methods, ROI analysis of evaluative factors").
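The reported percentages follow directly from the eigenvalues, assuming the analysis was run on the correlation matrix of the nine items so that total variance equals the number of items. That assumption is standard for principal components factor analysis but is not stated explicitly here. The short check below uses only the eigenvalues given in the text; the variable names are illustrative.

```python
N_ITEMS = 9  # items on the evaluative questionnaire

# Eigenvalues reported in the text.
eigenvalues = {"Factor 1": 3.045, "Factor 2": 2.269}


def variance_explained(eigenvalue, n_items):
    """Share of total variance for a component of a correlation matrix."""
    return eigenvalue / n_items


shares = {name: variance_explained(value, N_ITEMS)
          for name, value in eigenvalues.items()}
total = sum(shares.values())

for name, share in shares.items():
    print(f"{name}: {share * 100:.1f}% of variance")
print(f"Together: {total * 100:.1f}% of variance")
```

Under that assumption, the two shares reproduce the reported 33.8% and 25.2%, and their sum rounds to the reported 59%.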

BOLD signal in the SN was sensitive to the "positive" evaluative factor (**Figure 7C**). Positive scores on Factor 1 were associated with higher BOLD signal in left SN [1-tailed *t*(15) = 2.15, *p* = 0.024]. Left STR and left SFG also showed a trend toward sensitivity to Factor 1 [1-tailed *t*(15) = 1.13, *p* = 0.14 and 1-tailed *t*(15) = 1.16, *p* = 0.13, respectively].

The STR was also sensitive to the "negative" factor, as was the left IFGt (**Figure 7C**). Positive scores on Factor 2 were associated with higher BOLD signal in left STR [1-tailed *t*(15) = 2.68, *p* = 0.0086] and left IFGt [1-tailed *t*(15) = 2.16, *p* = 0.024]. Additionally, the left aMPFC was weakly sensitive to Factor 2, approaching significance [1-tailed *t*(15) = 1.66, *p* = 0.059]. None of the posterior occipito-temporal regions were sensitive to either factor.

#### **BOLD EFFECTS IN THE PRF AND LEFT ITS REFLECT INDIVIDUAL WEIGHTS ON EVALUATIVE RESPONSES**

Evaluative responses across observers were highly individual (see above). Individuals may rely on different evaluative and emotional responses when making their aesthetic recommendations. A regression analysis on each individual's set of responses was used to determine what weights would need to be assigned to each of these items in order to predict each observer's recommendation for each artwork (see "Materials and Methods, Individual differences analysis of evaluative questionnaire"). Three of the items could be removed without significantly affecting the predictability of the set: joy, confusion, and the sublime.


Across observers, different subsets of the remaining evaluative terms were effective in predicting individual recommendations (Appendix **Table A2**). For example, some observers tended to recommend images that they reported as awe inspiring, while other observers did not show a significant relationship between awe and recommendation, but did show a relationship between images that evoked fear and their recommendations of those images.

These individual profiles of evaluative weightings were correlated with the magnitude of observed 4-vs.-1 BOLD effects in two ROIs, the PRF and left ITS (**Figure 8**). Individualized weights on the remaining six evaluative terms were able to account for a large proportion of across-observer variability in the PRF and left ITS 4-vs.-1 BOLD effect sizes (*R*<sup>2</sup> = 0.70 and 0.62, respectively).

Observers who tended to recommend images they found to be awe-inspiring showed a larger effect of recommendation in the PRF, a part of the reticular activating system [**Figure 8A**; beta = 1.22 ± 0.98, *t*(8) = 2.88, *p* = 0.021]. No other evaluative term reached significance in the PRF.

In the left ITS, observers' weights for pleasure were significantly related to the BOLD effect [**Figure 8B**; beta = 1.72 ± 1.34, *t*(8) = 2.96, *p* = 0.018]. This relationship suggests that left ITS may at least partially mediate the relationship between rated pleasure for an artwork and aesthetic recommendation. No other evaluative term reached significance in the left ITS.

# **CONTROL ANALYSIS**


Regions that respond to specific stimulus types (faces or places) showed no effect of recommendation [One-Way ANOVA, left FFA *F*(3, 44) = 0.21, *p* = 0.89; right FFA *F*(3, 44) = 0.08, *p* = 0.97; left CoS *F*(3, 52) = 0.20, *p* = 0.89; right CoS *F*(3, 52) = 0.20, *p* = 0.90]. These regions were identified using an independent localizer scan. We were able to identify a face-responsive region in the posterior fusiform gyrus (FFA) in 12 of the observers (Puce et al., 1995; Kanwisher et al., 1997; McCarthy et al., 1997) and a place-responsive region in the collateral sulcus (CoS) in 14 of the observers (Epstein and Kanwisher, 1998; Epstein et al., 1999). This finding rules out the possibility that the linear effects of recommendation observed in PHC, STG, or ITS depend on stimulus differences.
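The degrees of freedom reported for these control ANOVAs can be reproduced with a simple one-way layout of four recommendation levels and one mean response per observer per level. The sketch below is a plain between-groups ANOVA on made-up numbers; it ignores the repeated-measures structure of the real data, and the helper names `one_way_anova` and `fake_level_means` are hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def fake_level_means(n_observers, n_levels=4):
    """Made-up per-observer mean responses for each recommendation level."""
    return [[((i * 5 + level * 3) % 7) / 10 for i in range(n_observers)]
            for level in range(n_levels)]


# FFA was localized in 12 observers, CoS in 14 (from the text).
_, b, w = one_way_anova(fake_level_means(12))
ffa_df = (b, w)
_, b, w = one_way_anova(fake_level_means(14))
cos_df = (b, w)
print("FFA df:", ffa_df, "CoS df:", cos_df)
```

With four levels, 12 observers give 48 observations and 44 within-group degrees of freedom, and 14 observers give 56 observations and 52, matching the *F*(3, 44) and *F*(3, 52) reported above.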

# **DISCUSSION**

Aesthetic judgments for paintings are highly individual, in that the paintings experienced as moving differ widely across people. The neural systems supporting aesthetic reactions, however, are largely conserved from person to person, with the most moving artworks leading to a selective activation of central nodes of the DMN (namely, the aMPFC, but also the PCC and HC) thought to support personally relevant mentation (see below). The most moving artworks also activate a number of other frontal and subcortical regions, including several which reflect the evaluative and emotional dimensions of aesthetic experiences. A separate network of posterior and subcortical regions shows a graded (linear) response signature to all artworks in proportion to an observer's aesthetic judgment. Finally, two regions (PRF and left ITS) show differences in activation level *across* individuals that are correlated with whether the individual finds certain aspects of a painting (e.g., awe) appealing.

### **ENGAGEMENT OF THE DEFAULT MODE NETWORK DURING THE MOST AESTHETICALLY MOVING EXPERIENCES**

The aMPFC shows decreases in activation from its resting baseline for all images *except* those rated as most aesthetically moving. Previous studies have reported that activation in this region is positively correlated with aesthetic evaluation (Kawabata and Zeki, 2004; Vartanian and Goel, 2004; Jacobsen et al., 2006; Di Dio and Gallese, 2009; Ishizu and Zeki, 2011). However, none of these studies have clearly shown the relationship of aesthetically driven activations to this region's resting baseline.

The DMN is a network of brain areas associated with inward contemplation and self-assessment (Gusnard and Raichle, 2001; Raichle et al., 2001; Kelley et al., 2002; Wicker et al., 2003; D'Argembeau et al., 2005, 2009; Andrews-Hanna et al., 2010). As with other areas in the DMN (such as the PCC, where we also see differential activity for only the most aesthetically pleasing images), aMPFC typically shows below-baseline activity in response to external stimulation, and this was indeed what we found in observers' responses to many of the art stimuli to which they were exposed. However, for those few stimuli that each observer judged as creating a strong aesthetic experience, the suppression of aMPFC was alleviated, a pattern typically seen when observers perform tasks related to self-reflection or during periods of self-monitoring. Such activation in the aMPFC at or above its resting baseline in response to an *external* stimulus is rare.

Importantly, our results show that only the *most* aesthetically moving artworks lead to differential, and widespread, activation in the aMPFC, contrary to the claim (Kawabata and Zeki, 2004; Ishizu and Zeki, 2011) that activation in this region is related to beauty in a linear fashion. This difference may be a consequence of the lower number of response levels used in their studies (three vs. four), the inclusion of paintings deemed "ugly" by their observers, the fact that the paintings were not being seen for the first time, or differences in instructions.

Several studies of self-reflective processes have shown that aMPFC does not deactivate during tasks in which observers assign to themselves personally relevant traits of varying valence (e.g., happiness, honesty, cruelty, etc., Kelley et al., 2002; D'Argembeau et al., 2005; Amodio and Frith, 2006; Moran et al., 2006). Trait studies may reflect a set of processes whereby observers don't simply think about themselves, but, more specifically, *match* traits with self-inspection, as a part of broader social cognition. In a similar manner, release from deactivation during aesthetic experience may reflect observers' matching self-inspection with their perception of an object.

Strong emotions that are salient to observers also attenuate the depression of aMPFC activation associated with task performance (Simpson et al., 2001a,b), while emotion processing that is *not* personally relevant (e.g., viewing pictures of unknown persons in empathy-producing situations) has no effect on decreased activation of aMPFC during task performance (Geday and Gjedde, 2009). Highly moving aesthetic experiences appear to represent an analogous situation in which an external stimulus brings about a strong emotional response.

During such intense aesthetic experiences, the aMPFC may function as a gateway into the DMN, signaling personal relevance and allowing for a heightened integration of external (sensory/semantic) sensations related to an art object and internal (evaluative/emotional) states. How such integration is neurally instantiated and how it is related to reward circuits (e.g., whether it is caused by or creates activity in reward-related brain areas) are important questions for further research.

# **UNIQUE RESPONSE SIGNATURES FOR SENSORY AND EVALUATIVE NETWORKS**

This is the first report of unique response signatures separating cortical activations to artwork into a posterior occipito-temporal network and an anterior frontal network. In addition to the frontal activation in aMPFC, the SFG, IFGt, and LOFC also show a step-like response, the latter two regions increasing *above* baseline for only the most moving images. Within this set of frontal regions, the factor analysis of evaluative responses further distinguishes the ROIs from one another: the LOFC shows no sensitivity to either Factor 1 or Factor 2, while IFGt is sensitive to Factor 2, and both SFG and aMPFC show weak sensitivity to Factors 1 and 2, respectively. Subcortically, activations in the SN, mediodorsal thalamus, and hippocampus also show a step-like pattern of response, suggesting that these regions interact with the frontal network.

This network of frontal regions, which we refer to as an "evaluative" network, likely supports an analysis of emotional response and personal relevance. We suggest that the step-like pattern is a signature of an aesthetic response, where the most moving images produce a clearly differentiable pattern of signal, going beyond mere liking, to something more intense and personally profound. Additional support for this interpretation comes from a recent study in which observers were instructed to view artworks in terms of semantic or visual detail ("pragmatically"), as opposed to in terms of color, composition, shapes, mood, and evoked emotion ("aesthetically"). That study found an activation in left lateral prefrontal cortex (−44, 37, 7; BA 10), corresponding to what we term left IFGt, which was selectively engaged in the "aesthetic" condition (Cupchik et al., 2009).

The second signature we observe, a linear response to observer recommendation, is found in more posterior cortical regions (PHC, ITS, and STG). In all of these areas, BOLD signal responds to the onset of any image and linearly tracks observers' aesthetic reactions. Several previous reports have also found activations in occipito-temporal areas for preference judgments of a variety of stimuli, including artwork, abstract geometric shapes, scenes, and faces (e.g., Vartanian and Goel, 2004; Jacobsen et al., 2006; Kim et al., 2007; Yue et al., 2007).

These activations likely reflect a stimulus-bound sensory and semantic analysis of preference that is relatively automatic. Supporting this interpretation is the finding that observers whose recommendations were well predicted by ratings of image-induced "pleasure" tended to show a larger BOLD effect in the ITS (suggesting that observers differ in the degree to which they value a sensory/semantic analysis performed by posterior areas versus emotional evocativeness when reacting to aesthetic experiences). It is important to note that the linear effect of aesthetic recommendation that we observed in these areas is not due to systematic differences in the type of stimuli preferred by the observers, as neither the CoS nor FFA, defined using an independent localizer task (for places and faces, respectively), showed any effect of recommendation.

Subcortical regions STR and PRF, which also show a linear relationship to observers' recommendations, increased *above* baseline for recommended images and decreased *below* baseline for non-recommended images. Given the involvement of a column of areas in the midbrain with arousal functions (Kinomura et al., 1996; Steriade, 1996), these activations may reflect "reward" valence in STR and arousal level in PRF, two often theorized axes of emotional responsivity (Lang et al., 1990; Low et al., 2008). Although we did not explicitly measure physiological arousal, the fact that the BOLD effect size in PRF was larger for observers who tended to recommend images they found awe-inspiring suggests a potential association between aesthetic awe and arousal.

### **INTEGRATION IN THE STRIATUM**

Not only is STR activity linearly related to aesthetic recommendation, it is also sensitive to both emotional/evaluative factors. This suggests that STR may integrate perceptual, evaluative, and reward components of aesthetic response for the purpose of outcome selection (the choice of recommendation level). This pattern, along with the detection of a related response pattern in the mdThal, is in accord with the established existence of cortico-striato-pallido-thalamic loops (Alexander et al., 1986; Steriade and Llinás, 1988; Alexander and Crutcher, 1990; Middleton and Strick, 2002; Kelly and Strick, 2004). Further research will be needed to elucidate the temporal dynamics of the flow of information between these regions in aesthetic responses.

The location of the observed striatal activation straddles the anatomical division between dorsal and ventral STR, and is similar to that reported by Vartanian and Goel (2004), though other studies of preference have reported more ventral effects (Kim et al., 2007; Lacey et al., 2011). Intriguingly, we did find significantly greater activation in the *right* ventral STR for the most highly recommended images (4-vs.-321 contrast, results not shown). The literature on reward posits that the dorsal STR represents the "actor" function of learning and implements habits or decisions (Maia, 2009), as well as the expectation of reward and punishment (Delgado et al., 2000, 2003), whereas the ventral STR (along with the amygdala, VTA, and OFC), carries out "critic" functions of representing actual reward and reward-prediction error (Schultz et al., 1992; Schoenbaum et al., 1998; Hikosaka and Watanabe, 2000; Schultz, 2000; Tremblay and Schultz, 2000; Setlow et al., 2003; Paton et al., 2006; Wan and Peoples, 2006; Simmons et al., 2007). While the locus of our activation in STR does not clearly fall in either the ventral or dorsal STR, the fact that STR responds regardless of emotional valence is in agreement with findings in monetary reward (Delgado et al., 2000, 2003).

Findings in regard to aesthetic reward have suggested a schism between desired and achieved reward that maps onto dorsal and ventral STR, respectively. Based on a PET study of pleasurable resolution of musical expectation, Salimpoor et al. (2011) have suggested that the caudate ("dorsal" STR) responds primarily to expecting a desired reward ("wanting"), while the nucleus accumbens (ventral STR) is active while experiencing the peak emotional response ("liking") associated with the resolution of a musical theme, line, or phrase. Unlike the novel, static images used in our study, their musical stimuli are temporally extended experiences, enabling listeners to predict the resolution of a musical phrase (and subsequent pleasure) based on familiarity with musical structure or particular songs. This may partially explain the difference in the locus of striatal effects following the hypothesized moment of aesthetic reward, given the known involvement of basal ganglia structures in a variety of temporally sequenced behaviors (Harrington et al., 1998). However, our task and results argue against a strict interpretation of striatal activation as reflecting anticipatory "wanting" of a predicted reward, as there was no possibility of differential anticipatory responses for any of our images.

#### **AREAS FOR FURTHER RESEARCH**

Our experiment is the first to find activation in the SN in visual aesthetic response, though it has been reported for music (Suzuki et al., 2008). Activation in the left SN for the most highly rated images raises the possibility that the efferent dopaminergic connections from the SN to the STR offer a mechanism by which hedonic responses to the most highly moving images might be modulated. This might be tested in further research.

In this set of observers, recommendation-related BOLD response appears primarily as increases in activation in the left hemisphere. However, it is unclear at this time whether this represents a real difference in the lateralization of aesthetic processes or merely reflects variation in the sensitivity of observing these effects at the whole-brain level.

Finally, it remains to be seen to what degree these systems are perturbed by depression or other mood disorders. Intriguingly, we found that the size of the BOLD effect in PHC, reflecting semantic/sensory processing, was larger for observers reporting positive mood (*r* = 0.68, *p* < 0.004 using *r* to *z* transform), suggesting that mood may act as a gateway to getting pleasure from sensory/aesthetic experiences.

### **CONCLUSIONS**

The nature of aesthetic experience presents an apparent paradox. Observers have strong aesthetic reactions to very different sets of images, and are moved by particular images for very different reasons. Yet the ability to be aesthetically moved appears to be universal. The emerging picture of brain networks underlying aesthetic experience presents a potential solution to this paradox. Aesthetic experience involves the integration of neurally separable sensory and emotional reactions in a manner linked with their personal relevance. Such experiences are universal in that the brain areas activated by aesthetically moving experiences are largely conserved across individuals. However, this network includes central nodes of the DMN that mediate the intensely subjective and personal nature of aesthetic experiences, along with regions reflecting the wide variety of emotional states (both positive and negative) that can be experienced as aesthetically moving.

The linking of intense aesthetic experience and personal relevance may have implications for artists and educators alike. Further research could explore whether increasing the personal relevance of aesthetic experiences increases their intensity and the resulting associations.

# **ACKNOWLEDGMENTS**

Justin Little, Lizzie Oldfather, Alexander Denker, Brynn Herrschaft, Steven R. Quartz, Damian Stanley, Souheil Inati, and Pablo Velasco. This project was supported by an ADVANCE Research Challenge Grant funded by the NSF ADVANCE-PAID award # HRD-0820202 and by the Andrew W. Mellon Foundation (as a New Directions Fellowship).

#### **REFERENCES**


and lateral prefrontal neurons of the monkey varying with different rewards. *Cereb. Cortex* 10, 263–271.


intralaminar nuclei. *Science* 271, 512–515.


rewards," in *Linking Affect to Action: Critical Contributions of The Orbitofrontal Cortex*, ed G. Schoenbaum, J. A., Gottfried, E. A., Murray and S. J., Ramus (New York, NY: New York Academy of Sciences), 674–694.


aesthetic preference for paintings. *Neuroreport* 15, 893–897.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 October 2011; accepted: 12 March 2012; published online: 20 April 2012.*

*Citation: Vessel EA, Starr GG and Rubin N (2012) The brain on art: intense aesthetic experience activates the default mode network. Front. Hum. Neurosci. 6:66. doi: 10.3389/fnhum.2012.00066*

*Copyright © 2012 Vessel, Starr and Rubin. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **LIST OF ARTWORKS**

Grateful acknowledgement is given for permission to reproduce the following artworks depicted in **Figure 1**:


The full list of artworks used in this study is as follows:


*Louise de Keroualle*, c. 1671. Peter Lely (Peter van der Faes Lilley, 1618–1680, Eng.). Oil on canvas.


*The Wreck*, c. 1854. Eugene-Louis-Gabriel Isabey (1804–1886, Fr.). Oil on canvas.

*Thirty-Six Immortal Poets*, c. 1740–1750. Kagei Tatebayashi (1504–1589, Jap.). Two-fold screen, ink and color on paper.

*Three Dancers*, c. 1940. William H. Johnson (1901–1970, USA). Oil on burlap.

*Tomatoes, Fruit, and Flowers*, c. 1860. Unknown (American). Oil on canvas.

*Triumphant Child*, c. 1946. Walter Quirt (1902–1968, USA). Oil on canvas.

*Trompe-l'Oeil Still Life with a Flower Garland and a Curtain*, c. 1658. Adriaen van der Spelt (1630–1673, Dutch). Oil on panel.

*Turning Point of Thirst*, c. 1934. Victor Brauner (1903–1966, Romanian). Oil on canvas.

*Unidentified Raga*, c. 1775. Unknown (Indian). Watercolor and gold on paper.

*Venus*, c. 1518. Lucas (the Elder) Cranach (1472–1553, Ger.). Oil on linden.

*Vision of the Sage Markandeya*, c. 1775–1800. Unknown (Himachal Pradesh, India). Watercolor and gold on paper.

*Watson and the Shark*, c. 1782. John Singleton Copley (1738–1815, USA). Oil on canvas.

*Woman and Flowers (Opus LIX)*, c. 1868. Sir Lawrence Alma-Tadema (1836–1912, Dutch). Oil on panel.

*Yama, King of Hell*, c. 1800. Unknown (Tibet). Watercolor on cotton.

*Young Woman with a Fan*, c. 1754–1756. Pietro Rotari (1707–1762, It.). Oil on canvas.

*Young Woman with a Turban*, c. 1780. Jacques Louis David (1748–1825, Fr.). Oil on canvas.

*Young Women Jumping Rope*, c. 1942–1944. Rufino Tamayo (1899–1991, Mex.). Oil on canvas.

# **APPENDIX**

**Table A1 | Mean Talairach coordinates for all activations found in the 4-vs-1 whole brain contrast.**


*BA, Brodmann's area; SD, spatial standard deviation; Vol, volume; Avg t, average t statistic; Ctx, cortex; Sulc, Sulcus; Gyr, Gyrus; Inf, Inferior; p, pars.*


**Table A2 | Results of the individual differences regression in which each observer's recommendations were predicted from the reduced set of six emotional terms.**

*The reported F statistic has (6, 102) degrees of freedom. resVar = residual variance in recommendation not accounted for by the regression.*

# **NEURO-IMPRESSIONS: INTERPRETING THE NATURE OF HUMAN CREATIVITY**

**Todd Lael Siler**

*(artist perspective)*

# Neuro-impressions: interpreting the nature of human creativity

# *Todd Lael Siler\**

*ArtScience® Publications, Denver, CO, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel; Luis M. Martinez, Universidade da Coruña, Spain*

#### *\*Correspondence:*

*Todd Lael Siler, ArtScience® Publications, PO Box 372117, Denver, CO 80237, USA. e-mail: toddsiler@alum.mit.edu; www.toddsilerart.com*

Understanding the creative process is essential for realizing human potential. Over the past four decades, the author has explored this subject through his brain-inspired drawings, paintings, symbolic sculptures, and experimental art installations that present myriad impressions of human creativity. These impressionistic artworks interpret rather than illustrate the complexities of the creative process. They draw insights from empirical studies that correlate how human beings create, learn, remember, innovate, and communicate. In addition to offering fresh aesthetic experiences, this metaphorical art raises fundamental questions concerning the deep connections between the brain and its creations. The author describes his artworks as embodiments of everyday observations about the neuropsychology of creativity, and its all-purpose applications for stimulating and accelerating innovation.

**Keywords: ArtScience, creativity, discovery, invention, innovation**

# **INTRODUCTION**

I make art about the brain, and learn about the brain through art. This remains my lifelong passion and challenge: discovering how the human brain constantly learns about itself by studying its countless creations. That's the central theme of my artwork, which considers how the brain is connected to all of its creations in every way imaginable, and how brain mechanisms form and shape our lives and future.

The eclectic aesthetics of my artworks reflect the broadest definition of Art, which encompasses **A**ll **r**epresentations of **t**hought. From my perspective, Art embraces all expressions and manifestations of creativity, embodying the collective work of human nervous systems and everything our minds make (McCulloch, 1988; Siler, 1993).

We tend to experience things by how we define them. When we encounter a work of art whose subject matter is neuroscience, we expect to see copious images of recognizable brain matter. It rarely occurs to us that all the creations of the mind we encounter daily (from houses to cities) bear little resemblance to the brain. And yet, these things reflect the handiwork of the human brain. How exactly, no one knows for certain. But it's one of the most exciting promises and prospects of neuroscience: to know how (Pinker, 2009), and to learn how to think boldly beyond the categories of our compartmentalized knowledge, sparking important innovations.

# **IMPRESSIONISM MEETS NEURAL ART**

As the title of this article implies, my art is mostly impressionistic, meaning that it shares certain qualities of ambiguity and abstraction visible in Modern and Post-modern Art. It also shares a common wellspring of inspirations that connect my artistic interpretations of nature with Claude Monet's "Les Nymphéas" (Water-Lilies); these visceral murals fill two oval-shaped, womblike rooms at the Musée de l'Orangerie in Paris (Tucker, 1998). The paintings grew in Monet's imagination for thirty years, well after this practitioner of *en plein air* ("in the open air") painting had planted thousands of water lily bulbs, which evolved into the elegant pond he painted, studied and maintained like an outdoor lab.

Monet's impressionistic art appeals to my Neural Art, as it connects us to the world within and around us. Moreover, it inspired me to plant in my paintings all sorts of brain-related questions about how we perceive and understand the world; literally, I collaged my perceptions and concepts on my canvases, and cultivated these conceptual plantings over many years. Some seeds grew into the colorful depictions of human neurons shown here. They are meant to evoke images of a "garden of the mind" that is different from, yet related to, the garden Monet painted at Giverny.

The fundamental questions I have picked to explore are not the garden variety type, even though they are quite universal. For example, how is the human brain connected to nature, and how is nature connected to everything the brain creates? Specifically, how do the details of nature *detail* the nature of the brain?

Interpreting these open-ended questions has yielded a cornucopia of art forms that document my impressions of the creative process (Siler, 1995). These artworks share an aesthetic kinship with the work of other contemporary visual artists who also make tangible *the intangible* aspects of creativity (Bailly, 1982; Kriesche, 1985; Arakawa and Gins, 1991). In effect, they are manifestations of "metacognition," a term used in the fields of education and cognitive neuroscience to describe the process of "knowing about knowing" and the practice of questioning our "cognitions about cognition" (Metcalfe and Shimamura, 1994).

# **THE FINE ART OF THOUGHT**

The brain-inspired artworks of mine highlighted here (**Figures 1**–**3**) evolved from my graduate studies at MIT's Center for Advanced Visual Studies in 1979. At CAVS, I had the opportunity to freely explore a wide range of interrelated fundamental questions that focused on some deep connections between nuclear physics and neurophysiology. I expressed these connections through metaphors, physical analogies and visual suppositions (Siler, 1981), following a path of creative inquiry cut by artists and scientists of the Italian Renaissance, most notably the quintessential ArtScientist, Leonardo da Vinci. As the thousands of pages of his Codices show, Leonardo spent his lifetime searching nature's connections, many of which focused on understanding the functional architecture of the brain (MacCurdy, 1938). Leonardo's search helped spur the collaborative efforts today in human neuroscience, in which teams of researchers systematically correlate the cause-and-effects of neural events aided by non-invasive medical imaging tools.

**FIGURE 1 | "Thought-Assemblies," 1979–1982.** Installation view at Musée d'Art Moderne de la Ville de Paris, France, 1982. Mixed media on synthetic paper and canvas, 9 × 127 ft. "Thought-Assemblies" details a process of incubating ideas and percolating on the possibilities of their actualization. This symbolic artwork draws on my formal studies of neuropsychology, which helped inform my visualizations of the creative process.

The neurophysiologist Eberhard Fetz eloquently summed up these ongoing efforts to grasp the great unknowns of the brain that challenge our collective ingenuity, writing: "there is a largely unexplored area of brain function as itself a subject for artistic representation. The neural networks in our brains effortlessly perform common miracles of perceiving the world, controlling volitional movements and performing higher functions like speech and thought. These cognitive functions are all produced by complex patterns of neural activity, but how mental events emerge from material mechanisms remains an enduring mystery" (Fetz, 2012).

Ultimately, human development hinges on "understanding neurons ... these aesthetic elementary microchips of the brain" (Segev, 2011), and understanding that our collective future rests on how wisely and ingeniously we apply our neural knowledge. In this way, a world of inquisitive minds seeks insights into the symbolic languages of neurons, in the same visionary way that Pythagoras understood this reality: "mathematics is the nature of language," just as symbolism is the language of nature. Naturally, we are all symbol-making creatures.

# **VISUALIZING THE NATURE OF HUMAN CREATIVITY**

As a generalist studying the human brain every day, I use the fine arts (**Figures 1**–**3**) as instruments for hypothesizing and investigating the actions of neural systems that form, shape and influence every facet of our lives (Siler, 1988). The paintings interpret how our thoughts, feelings, actions and behaviors may be traced to various neural mechanisms with the understanding that "correlation is not causation."

In responding to my open-ended questions, I created one sprawling visual knowledge map that looks as long and complicated as a linear high-energy accelerator! This impressionistic artwork, titled "Thought-Assemblies" (**Figures 1** and **2**), interprets the interconnected process of creative and critical thinking. It poses these interrelated basic questions for everyone to ponder: How do the mechanisms of thought (nerve cell-assemblies and interactions) influence the contents of thought? Are action potentials, which relay information over long distances, and synaptic potentials, which integrate information over short distances, the signaling devices that change the meanings of our thoughts, feelings, and actions (Siler, 1987)? Are "thought-assemblies" (related patterns of mental activity, or an association of ideas) the creations of "cell-assemblies"? What is a thought? A thing, or a product of some thing? A process, or something intangible? (James, 1890; Eccles, 1970).

"Thought-Assemblies" served as the visual component of my MIT dissertation, *Architectonics of Thought: A Symbolic Model of Neuropsychological Processes* in Interdisciplinary Studies in Psychology and Art. The artwork presents an alternative perspective on the neuropsychology of creativity—one that connects all acts of creating, discovering, inventing, innovating, collaborative learning, and problem solving (Siler, 1986). Moreover, it intimates how nature may be one interconnected creative process with countless manifestations.

The conceptual framework for this artwork builds on the theory of cell-assemblies proposed by the 20th-century Canadian behavioral psychologist Donald O. Hebb, which describes how neurons connect with one another to form groups of neuronal connections that fire together in various acts of learning (Doidge, 2007). Hebb also noted that "thought must be known as theoretically as a chemist knows the atom" (Hebb, 1949).
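The idea that neurons which fire together come to form a group can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions: it uses the common textbook simplification in which the connection between two model neurons grows in proportion to the product of their activities. The function name `hebbian_update`, the three-neuron pattern, and the learning rate are hypothetical choices made for this example and are not taken from Hebb's own formulation or from the artwork.

```python
def hebbian_update(weights, activity, learning_rate=0.1):
    """Return a new weight matrix after one simplified Hebbian step.

    Assumed rule (textbook simplification): the change in the connection
    from neuron j to neuron i is learning_rate * activity[i] * activity[j].
    Self-connections are left untouched.
    """
    n = len(activity)
    updated = [row[:] for row in weights]  # copy so the input is not mutated
    for i in range(n):
        for j in range(n):
            if i != j:
                updated[i][j] += learning_rate * activity[i] * activity[j]
    return updated


# Hypothetical example: three model neurons; neurons 0 and 1 are repeatedly
# active together, while neuron 2 stays silent.
weights = [[0.0] * 3 for _ in range(3)]
pattern = [1.0, 1.0, 0.0]
for _ in range(5):
    weights = hebbian_update(weights, pattern)

# The 0-1 connection has grown; connections involving neuron 2 have not.
print(weights[0][1], weights[0][2])
```

In this toy setting the two co-active neurons end up strongly linked, a minimal picture of how a "cell-assembly" might emerge from repeated joint activity.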

The overall pattern of "Thought-Assemblies" resembles a giant EEG recording, suggesting that the mental states (e.g., varying degrees of alertness and levels of consciousness) are closely correlated with the shape of the EEG (its frequency and amplitude). These states of mind are represented in the virtual mental representations that I have collaged and mounted on a sensual synthetic paper. The mosaic of mental imagery documents ephemeral flashes of creative thinking as I envision them occurring within the real and virtual worlds of the mind. The apparent linearity of this artwork belies the non-linear, stochastic process of creativity (Siler, 1985).

Physically, this *thoughtform* is the size of a 12-story building turned on its side (see **Figure 1**). The "windows" of this virtual building are comprised of 515 pictures of mental representations organized in a seemingly orderly way. The artwork was meant to envelop its viewers, making them part of the art. In this way, I could show what's on my mind and they could read my thoughts, absorbing the concepts and contemplating the hypotheses. Some of my installation drawings envision this artwork stretching for miles. Other drawings show it shrunk to the tiny scale of a Very Large Scale Integration (VLSI) computer chip that could fit on your pinky's fingertip. Even that tiny scale may be too large, especially when viewed on the nanoscale (10<sup>−9</sup> m), where "size does matter"; in particular, it matters to our understanding of the hierarchy of influences at work in everything that is composed from the bottom up: "from the atom to clusters of atoms to nanomaterials to materials (Mendeleev, 1901); all exhibit different behaviors that are not just relevant to their different physical dimensions" (Ozin et al., 2009). That includes the human nervous system and all other forms of organic material.

**FIGURE 2 | "The Organizing Principle for** *Thought-Assemblies***" (1979–1982).** Ink on paper, 10.5 × 8.5 inches.

"Thought-Assemblies" can be configured on curved or wavy walls, as shown in **Figure 2**. That particular wall is derived from the arc of the cingulate gyrus, which is part of the limbic system. This region marks the "heart" of the brain (thalamus), where non-specific thalamic projections (Nauta and Whitlock, 1954) link higher and lower brain functions that directly influence our thoughts-feelings-and-actions (Chorover and Chorover, 1982). Within this region, I hypothesize, intuitions, insights, eurekas, and other emotionally-charged feelings occur, signaling the state of flow (Csikszentmihalyi, 1996) and inducing the simple pleasures of memorable aesthetic experiences that are processed by higher order cerebral systems (Siler, 1986; Cowley and Underwood, 1998, June 15; Damasio, 2000; Hesselink, 2011). Overall, it searches the neuropsychology of the brain that inspired its design and composed its contents, which encompass everything from poems on nature to studies of neuronal architecture.

**FIGURE 3 | "The Brain Theater of Mental Imagery" (1983).** Mixed media on spunbonded synthetic canvas, 12 × 100 ft., with mounted paintings and white light hologram. Installation view: Boston Center for the Arts, 1990.

# **GLIMPSING A FUTURE SHAPED BY UNDERSTANDING CREATIVITY**

Creativity remains an inexhaustible subject that is relevant to all aspects of human development, interactivity, and culture (Koestler, 1964; Root-Bernstein, 1985; Sternberg, 1988; Siler, 1997; Epstein, 1999). This subject is linked to and riddled by many of nature's deepest mysteries, among them: complexity, connectivity, and chaos. Understanding these phenomena and their relationship is the wonderful challenge of transdisciplinary thinkers, or ArtScientists, who sense that piecing together the great puzzle of creativity entails integrating all human knowledge (Root-Bernstein et al., 2011).


"The Brain Theater of Mental Imagery" (**Figure 3**) offers one unique environment for seeking and seeing some of the most puzzling connections that link neural mechanisms. In approaching this work with an open mind and liberated imagination, you are likely to glean how it unites all the elements of its creation, just as the human brain does (Siler, 1990).

Standing a few feet from this painting, you notice these neural-like networks or ganglia in the gray matter. These vigorously textured reliefs, created by layers of paint, reveal the unique printing and painting process that generated this giant, continuous monotype. It was created by an imaging invention of mine, which MIT patented with me in the early 1980s. One intriguing detail about this artwork and invention is the fact that it was inspired by some Golgi-stained neurons that Dr. Walle Nauta showed me along with his exquisite neuroanatomical drawings that are every bit as elegant as Santiago Ramón y Cajal's wondrous renderings (Cajal, 1899). These works show a similar "creative aesthetic" that unites the complementary sensibilities of the arts and sciences (Bronowski, 1956; Curtin, 1982; Root-Bernstein, 1996).

The artworks I have touched on here convey one overarching impression of our artistic-scientific-mathematical portraits of the human brain: all fall short of fully describing our collaborative minds' potentially limitless capabilities (Siler, 2011). And that's a good thing, as it suggests there are *surmountable* opportunities for developing useful scientific generalizations of brain dynamics applied to the advancement of humankind—rather than "insurmountable opportunities," to echo the cautionary words of venture capitalists who must invest in these developments that invariably shape our future.

My art aims to challenge our concepts of limits (Medawar, 1984) by engaging and expanding our sense of wonderment (Weisskopf, 1979). "Wisdom begins with wonder," Socrates said. And wonder propels and critiques our scientific pursuits of the truth (Morrison and Morrison, 1984), while heightening our awareness of our creative potential.

# **ACKNOWLEDGMENTS**

I thank Idan Segev, Eberhard Fetz, and Stephan Chorover for their guidance on this paper. Also, countless thanks to Ron Feldman and Ronald Feldman Fine Arts for championing my artwork over the past 30 years.




**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 May 2012; accepted: 26 September 2012; published online: 12 October 2012.*

*Citation: Siler TL (2012) Neuro-impressions: interpreting the nature of human creativity. Front. Hum. Neurosci. 6:282. doi: 10.3389/fnhum.2012.00282*

*Copyright © 2012 Siler. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# **WHAT DOES THE BRAIN TELL US ABOUT ABSTRACT ART?**

**Vered Aviv** *(artist perspective)*

# What does the brain tell us about abstract art?

# **Vered Aviv\***

Faculty of Dance, The Jerusalem Academy of Music and Dance, Jerusalem, Israel

#### **Edited by:**

Javier DeFelipe, Cajal Institute, Spain

#### **Reviewed by:**

Bryan A. Strange, Technical University Madrid, Spain; Camilo J. Cela-Conde, Universidad de las Islas Baleares, Spain

### **\*Correspondence:**

Vered Aviv, Faculty of Dance, The Jerusalem Academy of Music and Dance, Jerusalem 91904, Israel. e-mail: veredaviv@gmail.com

In this essay I focus on the question of why we are attracted to abstract art (perhaps more accurately, non-representational or object-free art). After elaborating on the processing of visual art in general and abstract art in particular, I discuss recent data from neuroscience and behavioral studies related to abstract art. I conclude with several speculations concerning our apparent attraction to this particular type of art. In particular, I claim that abstract art frees our brain from the dominance of reality, enabling it to flow within its inner states, create new emotional and cognitive associations, and activate brain-states that are otherwise harder to access. This process is apparently rewarding as it enables the exploration of yet undiscovered inner territories of the viewer's brain.

**Keywords: abstract art, neuroesthetics, neural correlates of art, artistic preference, art and associations**

# **ART AND REALITY**

Over the course of human evolution, the phenomenon of art appeared some 30,000 years ago and humans became increasingly occupied with creating and appreciating works of art (Humphery, 1999; Solso, 1999). Art works are sensed and perceived via the same neuronal machinery and anatomical routes that were primarily developed for interacting with, and comprehending, "reality". These mechanisms evolved in order for us to acquire and analyze sensory information from the world around us and, consequently, to successfully and adaptively behave in an ever-changing environment (see the "Perception Action loop" theory in Tishby and Polani, 2011).

The visual system, which is the vehicle that processes visual art, is aimed at filtering, organizing and imposing (functional) order on the enormous amount of data streaming into it. Interestingly, at early stages of visual processing, the visual scene is deconstructed into its elementary components such as spots of light, lines, edges, simple forms, colors, movement, etc. At later (higher) stages, the system reconstructs these components into complicated forms and objects: a moving car, a face with blinking eyes, a pirouette of a dancer (Zeki, 1992; Hubel, 1998). Being an efficient learning machine, our brain uses bidirectional ("top down" and "bottom up") processing schemes and algorithms for visual scene analysis. Namely, we first build (predict) a tentative model, an optional representation, of the visual world, and this model is then verified and updated with increased accuracy against the "evidence" presented by the sensory stimulus (Hochstein and Ahissar, 2002; Bar, 2007; Tishby and Polani, 2011). These ongoing bidirectional processes enable us to make quick and effective generalizations and decisions about the world.

In contrast to the processing of daily objects, art is free from the functional restrictions imposed on the visual system during our daily life. Art is very often engaged in finding new ways to organize and represent objects and scenery. Artists are liberated to represent and to decompose depicted objects in various non-functional (non-"realistic") ways. Examples are works by artists of the Cubist (e.g., Georges Braque and Pablo Picasso) or Surrealist (e.g., Salvador Dalí and Joan Miró) movements. Artworks could also be only partially faithful representations of our daily visual experience, such as the monochromatic blue figures of Pablo Picasso or the blue horses of Franz Marc, and they can be "free" from obeying the laws of physics (e.g., the flying figures of Marc Chagall or the impossible objects of M.C. Escher). Apparently we categorize some inputs as artworks and others as non-art. We make this distinction based on contextual, cultural and perceptual parameters. Interestingly, a major distinction between perceiving an object as a piece of art or as part of the daily visual (non-art) experience relies on the presence of artistic style (such as the brush work of the painter) and not only on the content of the scene (Augustin et al., 2008; Cupchik et al., 2009; and see also Cavanagh and Perdreau, 2011; Di Dio et al., 2011).

The above notion brings to mind the unique character of abstract art, which, unlike representational art and other forms of art mentioned above, does not depict objects or entities familiar to our visual system from daily life experience. Still, like all visual information, abstract art is perceived via the same system that was developed primarily in order to functionally represent real-world objects. This places abstract art in a unique position within visual processing, far from the natural ("survival") role of that system. It is therefore intriguing to try and understand why we are attracted to abstract art (as demonstrated by the huge success of museum exhibitions of abstract artwork, such as those of Jackson Pollock). This must mean that abstract art, which is a rather new human invention, offers something attractive to the viewer's brain. So I would like to ask: what does abstract art offer to the viewer's mind?

It should be noted that this article focuses on the two ends of a continuum between representational art and abstract art, and therefore does not address the in-between category of paintings, i.e., semi-representational or semi-abstract works.

# **NEURAL AND BEHAVIORAL CORRELATES OF ART/ABSTRACT ART**

A fundamental assumption of modern brain research is that each action in mental/cognitive/emotional realms is correlated with a corresponding specific brain activity pattern. Each activity represents and generates the resultant experience. It is therefore worth seeking the neural correlates of the abstract art experience and attempting to extract the principles underlying the neural processing of this form of art.

In an fMRI study, Kawabata and Zeki (2004) demonstrated that different categories of painting (landscape, portrait and still life) evoked activity at localized and category-specific brain regions. In contrast, abstract art did not activate a unique localized brain region. Rather, brain activity related to abstract art appeared in brain regions activated by all other categories as well. Thus, when the fMRI signal generated by abstract art was subtracted from the signals generated by representational art of the various types (landscape, portraits, still life), zero activity was observed.

This is surprising as one might assume that there would be neural correlates (i.e., specific brain activity) for the specific cognitive category recognition of abstract art. On the other hand, because abstract art does not consist of clear well-characterized objects, but rather is composed of basic visual elements such as lines, spots, color patches and simple forms such as triangles, one might expect the activity corresponding to these basic elements to also appear in other categories of brain activity. In this case, we should not expect a unique brain activity related to abstract art as indeed was found by Kawabata and Zeki (2004) as well as by Vartanian and Goel (2004). To put it differently, it seems that we know that we view abstract art by realizing that what we view does not belong to any other specific category of art. Namely, we recognize abstract art by exclusion.

In addition to fMRI studies, abstract art was also studied by behavioral and by direct current electroencephalography (DC-EEG) methods. Combining behavioral and low-resolution electromagnetic tomography analysis, Lengger et al. (2007) demonstrated that observers preferred abstract and representational paintings in an equal manner. Yet the abstract stimuli evoked more positive emotions. Representational artworks were classified as more interesting, were understood better and induced more associations (as reported subjectively by the observers). Information about the painting (such as the title of the painting, the artist's name, the technique used) increased understanding of each style (representational as well as abstract art), but it did not change other parameters of evaluations (i.e., preference, associations, emotions). Comparing brain activity in response to representational and abstract paintings revealed significantly higher activation for representational art works in several brain regions, predominantly in the left frontal lobe and bilaterally in the temporal, frontal and parietal lobes, limbic system, insula and other areas as well. Increased brain activity in response to representational art was mostly attributed to the process of object recognition, and the activation of memory and associations systems. Introducing stylistic information seemed to reduce cortical activation, for both representational and abstract art. The authors concluded that information on artworks seems to facilitate the neural processing of the stimuli.

The idea that knowledge and experience facilitate the processing of the visual stimuli was also evident in the work of Solso (2000). Solso monitored brain activity of a portrait-artist (via fMRI) while he drew faces, and compared the artist's brain activity with that of a non-artist who was drawing the same faces. Brain activity of the artist revealed less activity in face processing areas (posterior parietal) than that of the non-artist. This lower level of activation of the artist's face recognition area indicates that he may be more efficient in the processing of facial features than the novice.

From the above experiments one may conclude that abstract art, stylistic knowledge and experience all seem to reduce cortical brain activity as compared to the relevant controls (representational art, no stylistic information, and the novice, respectively). These results indicate that the analysis of abstract art evokes less focal brain activation.

The study by Vartanian and Goel (2004) presents some evidence that a reduction in subjective aesthetic preference is correlated with decreased activity in certain brain areas involved with reward systems, whereas greater aesthetic preference evoked larger activity in other brain areas, involved with emotional valence and attention. They found that, in general, representational paintings were preferred over abstract paintings. Correlating brain activity (via fMRI) with aesthetic preference, the researchers demonstrated that activation in the right caudate nucleus decreased with decreasing preference, while activation in bilateral occipital gyri, left cingulate sulcus and bilateral fusiform gyri all increased in response to increasing preference. These results imply that, because abstract art is less preferred by the observer, there is less reward, less emotional valence and reduced attention, all of which results in reduced brain activity.

It has been claimed that during the processing of art works, two different aspects take place: the processing of pictorial content and the processing of the artistic style (Cupchik et al., 1992; Augustin et al., 2008). In an event-related potential (ERP) study, Augustin et al. (2011) found that processing of style starts later and develops more slowly than the processing of content (50 ms vs. 10 ms, respectively). They attribute this time difference in processing of the artwork to the fact that classification of content is extremely over-learned by humans as part of daily object classification and recognition, whereas style analysis is a visual task that many have hardly ever experienced. They suggest (after Leder et al., 2004) that stylistic information might be processed as an abstract entity, which requires some high level processing, rather than a combination of low level embedding of features. This work also supports the notion that style specific information and art experience would facilitate and influence the perception of abstract art (more than of representational art). If this is indeed the case then abstract art, which exposes us mostly to the style of work and hardly to a significant content of it (as no particular objects are depicted), is being processed mostly via the brain's routes of style analysis; routes that are less familiar to, and less used by, most people. In other words, abstract art introduces us to an unfamiliar (or less familiar) situation.

It should be noted that many of the brain imaging studies on art rely on "reverse inference"; that is, the activation of a particular brain area is used as an indication that this area is engaged in a particular cognitive process. Whereas activity of a particular brain area during a specific cognitive process implies the involvement of that area in that cognitive function, the reverse proposition needs wider support, via high selectivity of the response of that particular brain area, or an increase in the prior probability of the particular cognitive process (Poldrack, 2006).
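The role of selectivity and prior probability in reverse inference can be illustrated with Bayes' rule, the framing used by Poldrack (2006). The sketch below is purely illustrative: the function name and all probabilities are hypothetical values chosen for the example, not data from any study.

```python
def posterior_process_given_activation(p_act_given_proc, p_act_given_other, prior_proc):
    """Bayes' rule: probability that a cognitive process is engaged,
    given that a brain area is active."""
    evidence = (p_act_given_proc * prior_proc
                + p_act_given_other * (1.0 - prior_proc))
    return p_act_given_proc * prior_proc / evidence

# Hypothetical case 1: the area also responds often when the process is
# absent (low selectivity), so activation is weak evidence.
weak = posterior_process_given_activation(0.8, 0.6, 0.5)

# Hypothetical case 2: the area rarely responds otherwise (high selectivity).
strong = posterior_process_given_activation(0.8, 0.1, 0.5)

# Hypothetical case 3: low selectivity, but a higher prior probability
# that the task engages the process.
high_prior = posterior_process_given_activation(0.8, 0.6, 0.8)

print(round(weak, 3), round(strong, 3), round(high_prior, 3))
```

With these made-up numbers, the posterior rises either when the area responds more selectively or when the prior probability of the process is higher, which are the two forms of support mentioned above.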

Another feature that might be enhanced while looking at abstract art is how global the pattern of observation becomes when concrete recognizable objects are missing from the pictorial scene. Such a lack of objects enables a more uniform, global gaze. For example, Taylor et al. (2011) tracked the eye movements of viewers appreciating Jackson Pollock's paintings, showing that the viewers' eyes tend to scan the surface of the whole canvas rather uniformly. This finding is in clear contrast to the by now classical eye tracking studies of representational art, in which the eye tends to gaze mostly at salient features in the painting (e.g., eyes, nose, trees, signature, etc.) and to almost completely neglect the rest (the majority) of the painting's surface (see for example Locher et al., 2007; Hari and Kujala, 2009). The work of Taylor et al. (2011) supports the notion that, while analyzing abstract art, the visual/perceptual system is engaged less in a focal, converging gaze and more in a homogeneous gaze. Again, this is a less familiar situation in our daily experience (see related work by Zangemeister et al., 1995). Another study found that in representational art the eyes fixate longer on the figurative details than in abstract paintings, probably due to the lack of figurative elements in the latter. This holds for both experts and laypersons (Pihko et al., 2011).

# **SPECULATIONS REGARDING OUR ATTRACTION TO ABSTRACT ART**

Pictorial art analysis can be regarded as composed of three main processes: (i) the brain's effort to analyze the pictorial content and style; (ii) the flood of associations evoked by it; and (iii) the emotional response it generates (Bhattacharya and Petsche, 2002; also see Freedberg and Gallese, 2007). Of course, being man-made for no immediate practical use, art in general enables the viewer to exercise a certain detachment from "reality" which, so it seems, provides certain rewards to the art-lover.

But abstract art offers a unique opportunity, evoked by a visual stimulus that is not object-related and is therefore remote from our daily visual experience. This frees us, to a large extent, from (automatically) activating object-related systems in the brain whose task is to "seek" familiar (memory-based) compositions. Such "survival" mechanisms (e.g., "binding" and "figure-ground separation") are not activated by abstract art, thus enabling us to form new "object-free" associations that may arise from more rudimentary visual features such as lines, colors and simple shapes. This conclusion is supported both by the lack of specific brain region(s) devoted exclusively to the processing of abstract art (Kawabata and Zeki, 2004) and by the eye tracking experiments (Taylor et al., 2011) demonstrating that in abstract art, the eye (brain) is "free" to scan the whole surface of the painting rather than "fall" mostly on well-recognized salient features, as is the case when processing representational art. Abstract art may therefore encourage our brain to respond in a less restrictive and stereotypical manner, exploring new associations, activating alternative paths for emotions, and forming new, possibly creative, links in our brain. It also enables us to access early visual processes (dealing with simple features like dots, lines and simple objects) that are otherwise harder to access when a whole "gestalt" image is analyzed, as is the case with representational art.

If the above hypothesis is correct, then one would expect larger variability in the brain's response to abstract art than to representational art, both between people and within the same viewer at different times. Indeed, such variability was found by behavioral studies. Reflecting inner state rather than obeying the dominance of visual objects, the response to abstract art is expected to depend more on one's particular inner state at a very specific moment than does the response to representational art (which more automatically activates the "survival"-related brain system). At some times, a particular abstract artwork might evoke stronger associations and emotional responses than at other times, when the inner state of the viewer is less approachable, less amenable to processing abstract art. A related prediction is that abstract art would activate more of the default system in the brain, associated with inner-oriented processing. This prediction is in line with the findings of Cela-Conde et al. (2013), which demonstrate the involvement of the default mode network during the later phase of aesthetic appreciation. Relevant to the current paper is the claim made in that article regarding the complex relations between inner thoughts and the processing of external events (for more on the role and involvement of the default system in art appreciation see also Vessel et al., 2012; Mantini and Vanduffel, 2013).

In contrast, representational art would activate the extrinsic system more powerfully, as this system is associated with processing information arriving from the external environment (Golland et al., 2008).

To conclude, abstract art is a very recent (100 years old or so) invention of the human brain. Its success in attracting the brains of so many of us suggests that it has an important cognitive/emotional role. Supported by recent experimental studies, I claim that abstract art frees our brain from the dominance of reality, enabling the brain to flow within its inner states, create new emotional and cognitive associations and activate brain states that are otherwise harder to access. This process is apparently rewarding as it enables the exploration of yet undiscovered inner territories of the viewer's brain.

# **REFERENCES**


Cavanagh, P., and Perdreau, F. (2011). Do artists see their retinas? *Front. Hum. Neurosci.* 5:171. doi: 10.3389/fnhum.2011.00171


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 November 2013; accepted: 03 February 2014; published online: 28 February 2014.*

*Citation: Aviv V (2014) What does the brain tell us about abstract art? Front. Hum. Neurosci. 8:85. doi: 10.3389/fnhum.2014.00085*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2014 Aviv. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# *The Musical Brain*


Jessica Phillips-Silver and Peter E. Keller

# **DISTINCT INTER-JOINT COORDINATION DURING FAST ALTERNATE KEYSTROKES IN PIANISTS WITH SUPERIOR SKILL**

**Shinichi Furuya, Tatsushi Goda, Haruhiro Katayose, Hiroyoshi Miwa and Noriko Nagata**

#### *Shinichi Furuya<sup>1,2</sup>\*, Tatsushi Goda<sup>1</sup>, Haruhiro Katayose<sup>1</sup>, Hiroyoshi Miwa<sup>1</sup> and Noriko Nagata<sup>1</sup>*

*<sup>1</sup> School of Science and Technology, Kwansei Gakuin University, Hyogo, Japan*

*<sup>2</sup> Research Fellow, Japan Society for the Promotion of Science, Tokyo, Japan*

#### *Edited by:*

*Robert J. Zatorre, McGill University, Canada*

#### *Reviewed by:*

*Peter Keller, Max Planck Institute for Human Cognitive and Brain Sciences, Germany*

*Virginia Penhune, Concordia University, Canada*

*Caroline Palmer, McGill University, Canada*

#### *\*Correspondence:*

*Shinichi Furuya, School of Science and Technology, Kwansei Gakuin University, 2-1, Gakuen, Sanda, Hyogo 669-1337, Japan.*

*e-mail: auditory.motor@gmail.com*

Musical performance requires motor skills to coordinate the movements of multiple joints in the hand and arm over a wide range of tempi. However, it is unclear whether the coordination of movement across joints would differ for musicians with different skill levels and how inter-joint coordination would vary in relation to music tempo. The present study addresses these issues by examining the kinematics and muscular activity of the hand and arm movements of professional and amateur pianists who strike two keys alternately with the thumb and little finger at various tempi. The professionals produced a smaller flexion velocity at the thumb and little finger and greater elbow pronation and supination velocity than did the amateurs. The experts also showed smaller extension angles at the metacarpo-phalangeal joint of the index and middle fingers, which were not being used to strike the keys. Furthermore, muscular activity in the extrinsic finger muscles was smaller for the experts than for the amateurs. These findings indicate that pianists with superior skill reduce the finger muscle load during keystrokes by taking advantage of differences in proximal joint motion and hand postural configuration. With an increase in tempo, the experts showed larger and smaller increases in elbow velocity and finger muscle co-activation, respectively, compared to the amateurs, highlighting skill level-dependent differences in movement strategies for tempo adjustment. Finally, when striking as fast as possible, individual differences in the striking tempo among players were explained by their elbow velocities but not by their digit velocities. These findings suggest that pianists who are capable of faster keystrokes benefit more from proximal joint motion than do pianists who are not capable of faster keystrokes. The distinct movement strategy for tempo adjustment in pianists with superior skill would therefore ensure a wider range of musical expression.

**Keywords: multi-joint movements, postural control, stiffness control, long-term training, musicians, music, hand-arm coordination, musical expression**

# **Introduction**

Artistic musical performance involves variation of multifaceted aspects of music. Precise manipulation of musical components, a skill necessary for evoking intended emotional responses in the audience, requires dexterous control of hand movements and postures that is coordinated with arm movements. Studying musical performance therefore provides a good opportunity to probe how the nervous system skillfully orchestrates the redundant degrees of freedom (DOFs) of the motor system to achieve artistic musical expression. Previous studies have extensively investigated repetitive hand movements during musical performance or more simplified tasks (Parlitz et al., 1998; Aoki et al., 2005; Goebl and Palmer, 2008; Fujii et al., 2009a,b; Loehr and Palmer, 2009; Palmer et al., 2009; Furuya and Soechting, 2010). Some of these studies have delineated differences in the characteristics of force exerted by digits (Parlitz et al., 1998; Aoki et al., 2005) and in the activities of extrinsic finger muscles (Fujii et al., 2009a,b) between musicians and non-musicians. For example, Parlitz et al. (1998) measured the force applied to the keys while professional and amateur piano players alternately struck two keys. They found that the professionals showed shorter durations of force application to the keys after the keys reached their bottom position than did the amateurs, indicating a reduction in unnecessary exertion of force. Fujii et al. (2009a) found that highly skilled drummers exhibited less tonic and more reciprocal activity at the finger muscles during finger tapping compared with non-drummer controls. These findings commonly suggest a lower energetic cost of movement in musicians with superior skill. However, neither the movement organization reflecting superior skill nor its relation to movement rate (tempo) has been addressed. Identifying them would provide insight into the neural control of artistic musical performance, because manipulation of music tempo, which elicits emotional and autonomic responses in individuals listening to music (Dalla Bella et al., 2001; Khalfa et al., 2008), is essential for a wide variety of musical expressions.

Bernstein (1967) postulated that improvements in motor skills would be associated with the use of a greater number of DOFs to economize movement production. For example, involvement of an additional joint in movement production can generate inter-segmental dynamics that propel motions at the adjacent joints in place of muscular force (Dounskaia, 2010). In agreement with this postulation, recent studies have demonstrated that skilled pianists reduced muscular work while striking a key by utilizing a greater number of DOFs compared with novice piano players (Furuya and Kinoshita, 2007, 2008a,b; Furuya et al., 2009). These results showed that compared with novices, professional pianists decelerated their shoulder extension to a greater degree during arm downswing to generate larger inter-segmental dynamics that accelerate motion at the elbow and wrist joints, thereby reducing the distal muscular load. The pianists also rotated their shoulder in flexion to a greater extent than the novices while depressing keys, thereby reducing the finger muscular load needed to compensate for the key reaction force. These findings indicate that individuals with superior piano skills economize the work performed by distal muscles during a piano keystroke by taking advantage of proximal joint motion. One limitation of these studies was that they focused on a discrete keystroke motion. Due to distinct differences in the control mechanisms underlying discrete and cyclic movements (Schaal et al., 2004; Hogan and Sternad, 2007; Huys et al., 2008), it is worthwhile to probe whether the skill level-dependent difference that has been observed for a discrete piano keystroke is also present for cyclic keystrokes.

Another key issue in musical performance is how to change the organization of movement to modulate tempo. A small number of studies have investigated the effect of tempo on upper-limb movements while pianists were striking two keys simultaneously and repetitively (Kay et al., 2003; Furuya et al., 2011). The results of these studies have shown that as the tempo increased, professional pianists changed joint velocity in a non-uniform manner; for example, they increased their shoulder and wrist velocity and decreased their elbow velocity (Furuya et al., 2011). This differential effect of tempo variation on movements across joints has also been observed in studies investigating cyclic arm movements during a hand-drawing task (Meulenbroek et al., 1993; Pfann et al., 2002). For example, a study of a circle-drawing task found that as movement tempo increases, a majority of subjects use more elbow motion than shoulder motion (Pfann et al., 2002). However, it is unclear whether variation of inter-joint coordination in relation to tempo could differ for individuals with different skill levels.

The present study aimed to address the effect of tempo on the organization of hand and forearm movements and finger muscular activity while both professional and amateur piano players struck two piano keys alternately with the thumb and little finger. This motor task, referred to as "tremolo," was chosen because the movement is performed cyclically in three-dimensional space, requiring a larger number of DOFs to be controlled for both movement production and postural maintenance compared with the planar movements that have been primarily studied previously. We hypothesized that inter-joint coordination of movements during repetitive keystrokes would vary in relation to both tempo and skill level. Specifically, we postulated that pianists with superior skill would utilize proximal joint motion to a greater extent to strike keys, keep non-striking fingers less extended, and lessen finger muscular load during keystrokes and that these skill level-dependent differences would become more pronounced at faster tempi.

The present study also probed into the movement features associated with individual differences in the fastest striking tempo for piano players. Extremely fast hand movements represent the motor skills of *virtuoso* musicians, and their underlying neural mechanisms, including gray matter volume of motor cortex (Amunts et al., 1997; Münte et al., 2002) and cerebellum (Hutchinson et al., 2003) and white matter volume (Bengtsson et al., 2005), have been studied extensively. Behavioral studies using finger tapping have also illustrated superior hand motor functions of musicians (Jäncke et al., 1997; Aoki et al., 2005). Nevertheless, an understanding of the movement characteristics that enable extremely fast cyclic limb motion during musical performance remains elusive.

# **Materials and methods**

#### **Participants**

Five active expert pianists ("professional"; two males and three females, mean age ± SD = 24.3 ± 3.2 years, all right-handed) with more than 20 years of formal classical piano training and five recreational pianists who had no history of professional music education and practiced for less than 3 h per week ("amateur"; three males and two females, mean age ± SD = 22.6 ± 1.1 years, all right-handed) participated in the present study. All of the professional pianists had won awards at domestic and/or international classical piano competitions. The group mean of the hand span was 203.6 ± 15.1 and 195.2 ± 15.1 mm for the professionals and amateurs, respectively (*t*-test: *p* = 0.55). The group mean of the body mass was 54.4 ± 13.6 and 55.8 ± 9.8 kg for the professionals and amateurs, respectively (*t*-test: *p* = 0.88). The lack of a significant group difference in either variable suggests that anthropometry did not account for any observed differences in kinematics and muscular activities between the two groups. In accordance with the Declaration of Helsinki, the experimental procedure was clearly explained to the participants, who submitted written informed consent. The study was approved by the local ethics committee at Kwansei Gakuin University.

#### **Experimental apparatus and key-striking task**

The experimental apparatus used was a digital piano with a touch response action (P-250 YAMAHA Co.), a motion-capturing system consisting of 13 high-speed cameras (eight Eagle and five Hawk Eye, Mac3D system, Motion Analysis Co.), and a two-channel electromyography (EMG) system (Harada Electronics Industry Ltd.; **Figure 1A**). To collect positional data on anatomical landmarks, spherical reflective markers (5 mm in diameter for the hand and key and 9 mm in diameter for the wrist and elbow) were attached to two separate keys and on all joint centers of the right hand and arm. The experimental task was to perform repetitive tremolo keystrokes with the right hand, alternating keystrokes of the 52nd key (E) by the thumb and the 60th key (C) by the little finger (**Figure 1B**). The keys were 118 mm apart. This motor task (tremolo) is widely used in a variety of musical pieces written by composers such as Beethoven, Chopin, Schumann, Liszt, and so on. The findings derived from the present study should thus provide information that would be useful for pianists and piano teachers. Before starting the keystroke task, a striking tempo was provided to the participant by a metronome for 5 s. Then, the participant was cued to start striking the keys. Each trial consisted of keystrokes for 10 s. During each trial, the metronome continued to provide the tempo. The left arm and hand were kept relaxed on the side of the body while the trunk and right upper arm were placed in an upright position with minimal movement.

Five target striking tempi of 70, 90, 110, 130, and 260 bpm were chosen in this study, which correspond to inter-keystroke intervals (IKI) of 857, 667, 545, 462, and 231 ms, respectively, for one key (i.e., between two successive strokes of the E key). Hence, the expected IKI between successive strokes of the two different keys (e.g., from an E stroke to the following C stroke) was half of these values, ranging from 428.5 to 115.5 ms across the five tempi. Each participant was also asked to perform the designated task as fast as possible. We did not give participants any instruction regarding the manner of depressing the keys ("touch"). The target loudness for the tone was set to approximately 65 MIDI velocity during the task and was monitored by an experimenter during each trial. The loudness was set to a constant level because our previous study showed a significant interaction effect of loudness and tempo on the upper-limb movements during repetitive piano keystrokes (Furuya et al., 2011).
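The correspondence between tempo and IKI stated above follows from one metronome beat per stroke of the same key. A minimal sketch of the arithmetic is given below; the function name is hypothetical, and halving the rounded values simply mirrors the figures reported in the text.

```python
def iki_same_key_ms(tempo_bpm):
    """Interval between two successive strokes of the same key, in ms,
    assuming one stroke of that key per metronome beat."""
    return 60000.0 / tempo_bpm

tempi = [70, 90, 110, 130, 260]

# Rounded to the nearest millisecond, as reported in the text.
same_key = [round(iki_same_key_ms(t)) for t in tempi]

# Interval between strokes of the two different keys (E to C): half of the above.
between_keys = [ms / 2 for ms in same_key]

print(same_key)      # [857, 667, 545, 462, 231]
print(between_keys)  # [428.5, 333.5, 272.5, 231.0, 115.5]
```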

# **Data acquisition procedures**

Twenty-five spherical reflective markers were mounted on the keys and the body to identify anatomical landmarks for the purpose of digitization. These markers were placed on the skin over the fingertips and three joint centers of all five digits, the distal end of the radius, the proximal and distal ends of the ulna, and the two piano keys. The motion of the reflective markers was recorded at 120 Hz using 13 high-speed cameras surrounding the piano (**Figure 1A**). The camera locations were carefully arranged so that the position data of all of the markers could be recorded continuously while the participant performed the target task. The spatial resolution of the camera setup was 1 mm. The 3D time-position data of each marker were obtained by the direct linear transformation method. All procedures were established in a previous study, which was conducted for the purpose of creating CG animations based on motion-capture data recorded during piano playing (Kugimoto et al., 2009). The data were digitally smoothed at a low-pass cutoff frequency of 15 Hz using a second-order Butterworth digital filter. Subsequently, the following joint angles were computed: elbow pronation/supination rotation about an axis passing through the proximal and distal ends of the ulna; thumb internal/external rotation about an axis passing through the trapeziometacarpal joint of the thumb and the index metacarpo-phalangeal (MCP) joint; MCP and proximal-phalangeal (PIP) flexion/extension rotation at the index, middle, and ring fingers; and MCP flexion/extension rotation at the little finger (**Figures 1C–E**). These computations were based on procedures proposed previously (Feltner and Taylor, 1997; Hirashima et al., 2007). We did not use the angles at the IP and MCP joints of the thumb or at the PIP joint of the little finger for subsequent analysis because our pretest of two professionals and two amateurs found that the rotational motion at these joints was indiscernible in the present task.
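The smoothing step can be sketched as follows. This is a minimal illustration, not the software used in the study: it designs a second-order Butterworth low-pass filter with the bilinear transform and applies it in a single forward pass. Whether the original analysis used a single pass or a zero-phase (forward and backward) pass is not stated, so the single pass is an assumption, and all function names are hypothetical.

```python
import cmath
import math

def butter2_lowpass(cutoff_hz, fs_hz):
    """Coefficients (b, a) of a second-order Butterworth low-pass filter,
    designed with the bilinear transform (cutoff frequency pre-warped)."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    q = 1.0 / math.sqrt(2.0)  # quality factor of a Butterworth section
    norm = 1.0 / (1.0 + k / q + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (1.0, 2.0 * (k * k - 1.0) * norm, (1.0 - k / q + k * k) * norm)
    return b, a

def apply_filter(b, a, x):
    """Single forward pass of the difference equation, zero initial state."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def gain(b, a, freq_hz, fs_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv * z_inv
    den = a[0] + a[1] * z_inv + a[2] * z_inv * z_inv
    return abs(num / den)

# Values from the text: 15 Hz cutoff, 120 Hz sampling rate.
b, a = butter2_lowpass(15.0, 120.0)
step_response = apply_filter(b, a, [1.0] * 200)  # synthetic input, not real marker data
```

By construction the gain is 1 at 0 Hz and falls to 1/√2 at the cutoff, so slow joint motion passes essentially unchanged while higher-frequency marker jitter is attenuated.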

The electrical activity of the right extensor digitorum communis (EDC) and flexor digitorum superficialis (FDS) muscles was recorded with the surface EMG system. In our pretest, we also attempted to record the activity of the forearm muscles responsible for pronation and supination (supinator and pronator teres muscles), but our surface EMG system failed to do so reliably due to substantial cross-talk from adjacent muscles. Pairs of disposable Ag/AgCl surface electrodes were placed at the estimated center of each target muscle with a 20-mm center-to-center distance. The electrode position was carefully determined to minimize cross-talk from adjacent muscles. At each electrode position, the skin was shaved, abraded, and cleaned using isopropyl alcohol to reduce source impedance. The EMG signals were amplified 5000-fold and sampled at 960 Hz using an A/D converter interfaced with a personal computer. The signals were then digitally high-pass filtered (with a cutoff frequency of 20 Hz) to remove movement artifacts and then root-mean-squared. To normalize these EMG data for each muscle in each participant, EMG data during maximum voluntary contraction (MVC) were obtained for each muscle by asking the participant to perform maximum flexion or extension with an isometric force against a stationary object for a 5-s period. Each participant was verbally encouraged to achieve maximal force at a designated joint angle. During an MVC trial for the EDC and FDS muscles, the finger and wrist joints were kept at 180°. A percentage of the MVC value was then calculated using the mean value of the middle 3-s period of the MVC data. To confirm that each participant exerted maximum force, the MVC trial was performed twice for each muscle, and the mean MVC value of these two trials was computed. Due to a lack of any apparent difference between the two MVC values (*t*-test: *p* = 0.69), we simply computed the mean rather than the highest value.
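The normalization described above can be sketched in a few lines. The example assumes the signals have already been high-pass filtered; the RMS window length is not given in the text, so the `window` argument is a free parameter here, and all function names and sample values are hypothetical.

```python
import math

def moving_rms(signal, window):
    """Root-mean-square over a sliding window of `window` samples
    (a shorter window is used for the first few samples)."""
    squares = [s * s for s in signal]
    out = []
    sq_sum = 0.0
    for i, sq in enumerate(squares):
        sq_sum += sq
        if i >= window:
            sq_sum -= squares[i - window]
        n = min(i + 1, window)
        out.append(math.sqrt(max(sq_sum, 0.0) / n))
    return out

def mvc_reference(mvc_rms, fs_hz, total_s=5.0, middle_s=3.0):
    """Mean of the middle `middle_s` seconds of a `total_s`-second MVC record."""
    start = int(round((total_s - middle_s) / 2.0 * fs_hz))
    stop = start + int(round(middle_s * fs_hz))
    segment = mvc_rms[start:stop]
    return sum(segment) / len(segment)

def percent_mvc(task_rms, reference):
    """Express task EMG as a percentage of the MVC reference value."""
    return [100.0 * v / reference for v in task_rms]

fs = 960  # sampling rate from the text, in Hz

# Two synthetic 5-s MVC records (arbitrary units), standing in for real data.
trial_1 = [0.5] * fs + [2.0] * (3 * fs) + [0.5] * fs
trial_2 = [0.5] * fs + [2.2] * (3 * fs) + [0.5] * fs

# Mean of the two MVC trials, as in the text.
reference = (mvc_reference(trial_1, fs) + mvc_reference(trial_2, fs)) / 2.0
normalized = percent_mvc([0.21, 1.05], reference)
```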

#### **Data analysis**

The onset of descending movement for both the E and the C keys ("finger-key contact moment") was determined when the computed vertical velocity of the key exceeded 5% of its peak velocity. During each time window between two successive strokes of the E key (i.e., one IKI), we computed a set of kinematic and EMG variables, which were then averaged across all IKI during each trial. The following kinematic variables were computed: (1) peak values of both the thumb internal rotation velocity and the MCP flexion velocity of the little finger, (2) peak values of the supination and pronation angular velocity at the elbow joint, and (3) mean angle of the MCP and PIP joints at the index, middle, and ring fingers (i.e., non-striking fingers) during an IKI. To eliminate the background noise in the EMG variables, we initially subtracted the mean value of the muscular activity recorded while the hand and forearm were kept relaxed on the table. We then computed the mean values of the activities of the EDC and FDS muscles. In addition, we used the following equation to compute the co-activation index (CI) based on previous studies (Kellis et al., 2003; Furuya and Kinoshita, 2008b; Furuya et al., 2011) as an estimate of joint stiffness:

$$\text{CI} = \frac{\int_{t_1}^{t_2} \text{EMG}_{\text{agon}}(t)\,dt + \int_{t_2}^{t_3} \text{EMG}_{\text{ant}}(t)\,dt}{\Delta T}$$

where the period from *t*<sub>1</sub> to *t*<sub>2</sub> denotes the time when the agonist EMG activity is less than the antagonist EMG activity, and vice versa for the period from *t*<sub>2</sub> to *t*<sub>3</sub>. ∆*T* is the IKI.
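The onset criterion and the co-activation index can be sketched as follows. Two assumptions are made that the text does not state: the peak velocity is taken over the whole supplied record, and the CI is generalized to any number of crossovers by integrating, sample by sample, whichever muscle is less active (this reduces to the equation above when there is a single crossover at *t*<sub>2</sub>). Function names and sample values are hypothetical.

```python
def onset_indices(key_velocity, fraction=0.05):
    """Sample indices at which the key's descending speed rises above
    `fraction` of its peak value (upward threshold crossings).
    Descending speed is assumed to be positive."""
    threshold = fraction * max(key_velocity)
    onsets = []
    for i in range(1, len(key_velocity)):
        if key_velocity[i - 1] <= threshold < key_velocity[i]:
            onsets.append(i)
    return onsets

def coactivation_index(emg_agonist, emg_antagonist, fs_hz):
    """CI over one IKI: time integral of the less active muscle at each
    instant (trapezoidal rule), divided by the duration of the interval."""
    dt = 1.0 / fs_hz
    lower = [min(a, b) for a, b in zip(emg_agonist, emg_antagonist)]
    area = sum((lower[i] + lower[i + 1]) * 0.5 * dt
               for i in range(len(lower) - 1))
    duration = (len(lower) - 1) * dt
    return area / duration

# Synthetic key velocity with two strokes (arbitrary units).
velocity = [0.0, 0.0, 0.2, 1.0, 0.3, 0.0, 0.0, 0.1, 0.8, 0.0]
onsets = onset_indices(velocity)

# Synthetic EMG (% MVC) over one interval sampled at 4 Hz for readability.
agonist = [4.0, 4.0, 4.0, 4.0, 4.0]
antagonist = [2.0, 2.0, 4.0, 6.0, 6.0]
ci = coactivation_index(agonist, antagonist, fs_hz=4.0)
```

Because the integral is divided by the interval duration, the CI has the same units as the normalized EMG, which makes values comparable across tempi with different IKIs.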

# **Statistics**

Using group (professional/amateur) and tempo (five predetermined tempi) as independent variables, a two-way analysis of variance (ANOVA) with repeated measurements was performed for each of the dependent variables. Newman–Keuls *post hoc* tests were performed where appropriate to correct for multiple comparisons. Statistical significance was set at *p* < 0.05.

# **Results**

# **Key-striking velocity and inter-keystroke interval**

**Table 1** shows the mean MIDI velocity and IKI of the E and C keys across participants in the two groups at different tempi. Two-way repeated measures ANOVA demonstrated no difference in MIDI velocity across tempi or between the two groups, confirming tone volume consistency. The IKI decreased systematically as tempo increased for both groups, and in all cases followed the IKI designated by the metronome (857, 667, 545, 462, and 231 ms for 70, 90, 110, 130, and 260 bpm, respectively). These findings confirmed that both professional and amateur pianists successfully performed the designated task.

# **Profile of joint angular velocity**

**Figure 2** illustrates the profiles of the joint angular velocity of the elbow, little finger, and thumb across different tempi (70, 130, and 260 bpm) for one representative professional and one representative amateur pianist. The results show that the angular velocity for elbow supination and pronation, little MCP flexion, and thumb internal rotation reached their peaks prior to the keypress moment for both players. In addition, the timing of peak velocity occurred earlier for both players as tempo increased. The professional pianist showed greater peak angular velocity for the elbow and smaller peak velocity for the thumb and little finger for all three tempi compared to the amateur pianist.


*70, 90, 110, 130, and 260 bpm correspond to 857, 667, 545, 462, and 231 ms of IKI of each key, respectively.*

*Each number in parentheses indicates the SD across participants.*

*The units of Vel and IKI are the MIDI velocity unit and milliseconds, respectively.*

*\*\*p* < *0.01.*

# **Peak angular velocity at the elbow, little finger, and thumb**

**Figure 3** shows the mean peak angular velocities for elbow supination and pronation (**Figures 3A,C**), little MCP flexion (**Figure 3B**), and thumb internal rotation (**Figure 3D**) across participants in the two groups at different tempi. A two-way ANOVA with repeated measures found a group effect on all of these variables (**Table 2**), confirming greater elbow velocity for the professionals compared with the amateurs and greater velocities of the thumb and little finger for the amateurs compared to the professionals. For the peak elbow supination and pronation velocity, both the effect of group × tempo interaction and the main effect of tempo were confirmed. The interaction effect indicated a greater increase in elbow longitudinal rotational velocity with increasing tempo for the professionals compared with the amateurs.

#### **Time at peak velocity relative to finger-key contact moment**

**Table 3** shows the mean time at peak angular velocities for elbow supination and pronation (**Figures 3A,C**), little MCP flexion (**Figure 3B**), and thumb internal rotation (**Figure 3D**) relative to the moment of the corresponding finger-key contact across participants in the two groups at different tempi. The velocity reached its peak at approximately 50–70 ms following the moment of finger-key contact at a slow tempo. However, as the tempo increased, the peak occurred earlier in both groups. This finding indicates that the duration of accelerating key descent with the fingertip became shorter as the tempo increased. Two-way repeated measures ANOVA demonstrated a main effect of tempo on all of these variables (**Table 2**). Neither a group × tempo interaction effect nor a main effect of group was found.

To determine whether an earlier occurrence of peak joint angular velocity at a faster tempo resulted in an earlier occurrence of the peak linear descending velocity of the key, the group means of the time of peak descending velocity of the keys struck by the thumb and little finger, relative to the moment of the corresponding finger-key contact, were further computed at different tempi. For the key struck by the thumb, the time at peak velocity at 70, 90, 110, 130, and 260 bpm was 79.3 ± 7.3, 83.9 ± 5.7, 75.0 ± 3.8, 72.0 ± 7.6, and 57.2 ± 5.9 ms, respectively, for the professionals and 83.7 ± 1.9, 78.6 ± 0.9, 75.2 ± 0.4, 70.9 ± 2.1, and 56.7 ± 5.6 ms, respectively, for the amateurs. Both groups displayed a systematic decrease in the time of peak key velocity with increasing tempo. Similarly, for the key struck by the little finger, the time at peak velocity at 70, 90, 110, 130, and 260 bpm was 76.6 ± 8.5, 82.1 ± 8.2, 75.7 ± 2.0, 72.9 ± 6.3, and 54.9 ± 4.4 ms, respectively, for the professionals and 79.3 ± 4.2, 77.2 ± 2.2, 75.5 ± 1.9, 72.2 ± 2.1, and 51.5 ± 6.5 ms, respectively, for the amateurs. For both variables, the tempo effect was significant (*p* < 0.01), confirming the earlier occurrence of peak key-descending velocity at faster tempi. This indicated that the fingertip accelerated to depress the key for a shorter duration with an increase in tempo.

# **Joint angles of the non-striking fingers**

**Figure 4** shows the mean joint angle at the MCP and PIP joints of the non-striking fingers (index, middle, and ring fingers) during keystrokes across participants in the two groups at different tempi. ANOVA confirmed a group effect only on the MCP joint at the index and middle fingers (**Table 2**), indicating a smaller extension angle at the index and middle MCP joints for the professionals compared with the amateurs. There was no effect of tempo at any joint except for the middle MCP joint, where only amateurs had a slight increase in the angle with increasing tempo.

# **Muscular activity**

**Figure 5A** illustrates muscular activity profiles at the EDC (*positive*) and FDS (*negative*) muscles across different tempi (70, 130, and 260 bpm) for one representative professional and one representative amateur pianist. Overall, these players showed increases in the activities of both muscles prior to the moment of finger-key contact, exhibiting their co-activation. In addition, the activity increased as the tempo increased. The professional player showed a smaller peak of activity for both muscles compared with the amateur for all tempi. The professional also exhibited smaller tonic activity, particularly at the EDC muscles, than the amateur.

**Figures 5B–D** show the average values of the mean activity at the FDS and EDC muscles and the CI values for these muscles during keystrokes across participants in the two groups at different tempi. There was a significant interaction effect of group and tempo on both the mean muscular activity of these two muscles and the CI value (**Table 2**). The interaction effect indicates a greater increase with tempo for the amateurs compared to the professionals. The effect of both group and tempo was also confirmed for all of these variables. The group effect indicates a greater muscular activity for the amateurs than for the professionals, whereas the tempo effect indicates an increase in muscular activity with tempo.

# **Intra-trial correlation between relative timing error and movement characteristics**

**Table 3 | Group means of the time at peak joint angular velocities at five striking tempi.**

*Time zero indicates the moment of the corresponding finger-key contact (i.e., little finger keypress for the peak velocity of little finger flexion and elbow supination, and thumb keypress for the peak velocity of thumb internal rotation and elbow pronation). Each number in parentheses indicates 1 SD across participants.*

To assess whether the deceleration of joint rotation due to the finger-key contact was correlated with the timing accuracy of the subsequent keystroke, a correlation between the peak angular deceleration (how much joint rotations at the elbow and digits were decelerated at the moment of collision) and the relative timing error of the subsequent IKI was computed within a trial. The relative timing error was computed as follows: (IKIexp − IKIobs)/IKIexp, where IKIexp and IKIobs indicate the expected and observed IKI, respectively (Goebl and Palmer, 2008). The peak deceleration of joint rotation was determined within the range from −30 to 20 ms (time zero indicates the moment of finger-key contact). Data from all five tempi were combined for the correlation analysis. Some keystrokes, particularly at slower tempi, showed no apparent peak of joint deceleration and were excluded from the analysis.
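The relative timing error and the within-trial correlation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' analysis code: the function names, the expected IKI of 230 ms, and all data values are hypothetical, and a plain Pearson coefficient is assumed because the text does not name the correlation method.

```python
import math


def relative_timing_error(iki_expected_ms, iki_observed_ms):
    """Relative timing error as defined in the text: (IKIexp - IKIobs) / IKIexp."""
    return (iki_expected_ms - iki_observed_ms) / iki_expected_ms


def pearson_r(xs, ys):
    """Pearson correlation coefficient (an assumption; the text does not name the method)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical single trial: peak angular deceleration at each keystroke
# (arbitrary units) and the observed IKI that followed it (ms).
peak_deceleration = [120.0, 95.0, 150.0, 110.0, 135.0]
next_iki_observed_ms = [228.0, 221.0, 233.0, 226.0, 229.0]
expected_iki_ms = 230.0  # hypothetical target interval for this tempo

errors = [relative_timing_error(expected_iki_ms, iki) for iki in next_iki_observed_ms]
r = pearson_r(peak_deceleration, errors)
print(f"within-trial r = {r:.2f}")
```

Keystrokes without a clear deceleration peak would simply be left out of both lists before the correlation is computed, mirroring the exclusion rule stated in the text.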

During the thumb keystroke, a significant negative correlation was found for three of the professionals (*r* = −0.26 ± 0.37; mean ± SD across five players) and one of the amateurs (*r* = 0.12 ± 0.38) for the elbow pronation, and for one of the professionals (*r* = 0.24 ± 0.37) and two of the amateurs (*r* = −0.12 ± 0.37) for the thumb internal rotation. During the little finger keystroke, a negative correlation was evident for two of the professionals (*r* = −0.03 ± 0.36) and two of the amateurs (*r* = −0.17 ± 0.24) for the elbow supination, and for two of the professionals (*r* = −0.06 ± 0.37) and two of the amateurs (*r* = −0.11 ± 0.21) for the little finger MCP flexion. These findings indicate that some of the pianists performed more accurately timed keystrokes following the occurrence of stronger joint deceleration elicited by the finger-key contact, which suggests that proprioceptive feedback from muscle spindles ensures temporal accuracy of successive keystrokes. This view is in agreement with the report by Goebl and Palmer (2008).

# **Correlation between fastest tempo and movement characteristics**

To determine the movement features that account for individual differences in the fastest key-striking tempo across players, we initially computed the IKI during the fastest keystrokes.

The results for the professionals and amateurs were 156.6 ± 8.6 and 220.9 ± 13.5 ms, respectively, for the thumb keystrokes and 162.6 ± 8.6 and 220.8 ± 12.8 ms, respectively, for the little finger keystrokes (mean ± SD across players). A *t*-test confirmed a significant group difference in each of the two variables (*p* < 0.01), indicating a faster keystroke rate for pianists with superior skill.

Using the dataset of the fastest tempo for all 10 participants, we performed a correlation analysis between the IKI (i.e., the mean value between the keystrokes of the thumb and little finger) and several fundamental movement variables. These variables included (1) the peak velocity for elbow longitudinal rotation, which is the average of the peak velocities between elbow pronation and supination, (2) the peak velocity for digit flexion, which is the average of the peak velocities between the thumb internal rotation and little MCP flexion, (3) the time at the peak descending velocity of the key relative to the moment of the keystroke, which is the mean between the two keys struck by the thumb and the little finger, and (4) the CI value. To minimize multiple statistical tests, we did not use the mean EMG activity and time at peak angular velocity of the elbow, thumb, and little fingers for the analysis because these variables should be largely related to the evaluated variables (i.e., the CI value and time at peak key descending velocity).

**Table 4** summarizes the results of the correlation analysis. Based on a Bonferroni correction for multiple tests, we set the significance threshold at *p* = 0.0125. We found significant correlations between the IKI at the fastest tempo of the 10 players and both the peak elbow velocity and the time of peak key velocity. The negative coefficient for the elbow velocity and the positive coefficient for the time at peak key velocity indicate a greater elbow velocity and an earlier occurrence of peak key velocity, respectively, for players who strike faster.
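The corrected threshold quoted above follows from dividing a family-wise significance level by the number of tests. The sketch below assumes a family-wise level of 0.05 (not stated explicitly in the text, but consistent with 0.0125 for the four movement variables); the function names and the example *p*-values are hypothetical.

```python
def bonferroni_threshold(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


def is_significant(p_value, threshold):
    """A test counts as significant only if its p-value is below the corrected threshold."""
    return p_value < threshold


# Assumed family-wise level of 0.05 spread over the four movement variables.
threshold = bonferroni_threshold(0.05, 4)

# Hypothetical p-values, for illustration only.
for name, p in [("variable A", 0.004), ("variable B", 0.030)]:
    print(name, "significant" if is_significant(p, threshold) else "not significant")
```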

Having used only 10 datasets in the analysis, we acknowledge that outliers have the potential to substantially influence the results. To further increase our confidence in the reliability of the results, we performed (1) a robust regression analysis (Hampel et al., 2005) and (2) a bootstrap procedure (Efron, 1979). The robust regression analysis evaluates whether the gain differs significantly from zero. The *p*-values for the peak elbow longitudinal rotational velocity, peak digit flexion velocity, time at the peak key velocity, and CI were 0.012, 0.053, 0.002, and 0.077, respectively. This verified the robustness of the results derived from the correlation analysis against outliers. The bootstrap procedure was also performed to assess whether the correlation coefficient differed from zero. **Table 4** (right) shows the 95% percentile confidence limits of the correlation coefficient derived from 1000 bootstrap samples and the *p*-value indicating whether the coefficient is significantly larger or smaller than zero. Both the upper and lower confidence limits were negative for the peak elbow velocity, and positive for the time at the peak key velocity. The correlation coefficients for the peak elbow velocity and the time at the peak key velocity were significantly smaller and larger than zero, respectively.
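A percentile bootstrap of a correlation coefficient, of the kind described above, can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' procedure: players are resampled with replacement as (x, y) pairs, Pearson's *r* is used, the confidence limits are taken by a simple nearest-rank rule, and the ten data points, function names, and seed are all hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(xs, ys, n_boot=1000, level=0.95, seed=0):
    """Percentile confidence limits for r from n_boot paired resamples."""
    rng = random.Random(seed)
    n = len(xs)
    rs = []
    while len(rs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        # Skip degenerate resamples in which r is undefined (no variance).
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue
        rs.append(pearson_r(bx, by))
    rs.sort()
    lower = rs[math.floor((1 - level) / 2 * (n_boot - 1))]
    upper = rs[math.ceil((1 + level) / 2 * (n_boot - 1))]
    return lower, upper


# Hypothetical data for 10 players: peak elbow velocity (deg/s) and
# IKI at the fastest tempo (ms). Not taken from the study.
elbow_velocity = [310.0, 335.0, 360.0, 390.0, 420.0, 450.0, 470.0, 500.0, 530.0, 560.0]
fastest_iki_ms = [228.0, 222.0, 219.0, 212.0, 205.0, 181.0, 172.0, 165.0, 158.0, 150.0]

lower, upper = bootstrap_ci(elbow_velocity, fastest_iki_ms)
print(f"95% CI for r: [{lower:.2f}, {upper:.2f}]")
```

If both limits fall on the same side of zero, as reported for the peak elbow velocity and the time at the peak key velocity, the interval excludes a zero correlation.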

To assess whether particular kinematic variables are associated with the temporal precision of keystrokes during playing at the fastest tempo, we also performed a correlation analysis between the movement variables and the coefficient of variation (CV: SD/mean) of the IKI within a trial. None of the movement variables showed a significant correlation, which was also confirmed by the robust regression analysis (*p* > 0.05).
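As a small illustration of the CV measure defined above (SD/mean of the IKI within a trial), the sketch below uses the sample standard deviation; whether the study used the sample or the population SD is not stated, so that choice, the function name, and the IKI values are assumptions.

```python
import statistics


def coefficient_of_variation(ikis_ms):
    """CV of inter-keystroke intervals: SD / mean (sample SD assumed)."""
    return statistics.stdev(ikis_ms) / statistics.mean(ikis_ms)


# Hypothetical IKIs (ms) from one trial at a player's fastest tempo.
trial_ikis_ms = [158.0, 162.0, 155.0, 160.0, 157.0, 163.0]
cv = coefficient_of_variation(trial_ikis_ms)
print(f"CV = {cv:.3f}")
```

A smaller CV means more even timing, independent of how fast the keystrokes are.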

#### **Discussion**

# **Inter-joint coordination of hand and forearm movements**

We found that professional pianists produced smaller flexion velocity at the thumb and little finger and greater elbow pronation and supination velocity during alternating keystrokes compared with amateurs. This finding indicated that pianists with superior skill used proximal limb motion to a greater extent to strike keys. In agreement with this finding, we had previously found that professional pianists took greater advantage of shoulder joint rotation than did novice players during a keystroke (Furuya and Kinoshita, 2008a,b). This movement allowed them to utilize the inter-segmental dynamics to move the elbow and wrist joints. However, the inter-segmental dynamics that propelled the finger movements were negligible. Therefore, it seems unlikely that, in this study, the pronounced rotation at the proximal joint in the professionals played a role in generating the inter-segmental dynamics that effectively drive finger motion. This view is strengthened by the concurrence of peak velocities at the elbow and digits (**Figure 4**), which contrasts with previous findings of the occurrence of peak joint velocity in the order from proximal to distal when utilizing inter-segmental dynamics (Dounskaia et al., 1998; Buchanan, 2004; Furuya and Kinoshita, 2007). We speculate that elbow rotation directly contributes to the production of fingertip descending velocity to strike a key. A similar motor skill was also reported for a ball-throwing motion, where skilled throwers took greater advantage of the shoulder's internal rotation to accelerate the hand motion just prior to releasing the ball compared to unskilled individuals (Hore et al., 2005; Gray et al., 2006).

There are at least two benefits of taking advantage of the proximal joint motion. First, the distance between the joint center and the endpoint of the linked system is longer for more proximal joints. Therefore, rotation around a more proximal joint results in a larger translational velocity at the limb endpoint, which gives proximal joint rotation a mechanical advantage. Second, proximal muscles have greater physiological cross-sectional areas than distal muscles. Because the tolerance to muscular fatigue increases in proportion to the cross-sectional area (Herzog, 2000), utilization of proximal joint motion could ensure fast and accurate movements for longer periods of time. Indeed, we found smaller loads on extrinsic finger muscles for the professionals compared to the amateurs. Presumably, extensive piano training involving extraordinarily repetitive strokes could allow pianists to acquire inter-joint coordination that has mechanical and physiological advantages. This perspective is in agreement with Bernstein's hypothesis regarding the use of a greater number of DOFs to economize movement production with improvements in motor skills (Bernstein, 1967). Furthermore, observations in favor of Bernstein's hypothesis in both discrete (Furuya and Kinoshita, 2008a) and cyclic motor behaviors, in spite of their distinct differences in neural control mechanisms (Schaal et al., 2004), suggest that using a greater number of DOFs during skill improvement might be a common principle governing skill acquisition in multi-joint movements.
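The first benefit rests on the rigid-body relation between angular and linear speed, v = ω · r: for the same angular velocity, a longer distance from the rotation axis to the endpoint yields a larger endpoint speed. The sketch below only illustrates that relation; the function name, the angular velocity, and both lever-arm lengths are hypothetical values chosen for the example, not measurements from the study.

```python
import math


def endpoint_speed(angular_velocity_deg_s, lever_arm_m):
    """Linear speed (m/s) of a point at distance lever_arm_m from the rotation axis."""
    return math.radians(angular_velocity_deg_s) * lever_arm_m


# Same hypothetical angular velocity applied at two different joints.
omega_deg_s = 300.0
distal_arm_m = 0.08    # hypothetical: finger joint to fingertip
proximal_arm_m = 0.20  # hypothetical: more proximal rotation axis to fingertip

v_distal = endpoint_speed(omega_deg_s, distal_arm_m)
v_proximal = endpoint_speed(omega_deg_s, proximal_arm_m)
print(f"distal: {v_distal:.2f} m/s, proximal: {v_proximal:.2f} m/s")
```

For rotation about the long axis of the forearm, the relevant lever arm is the perpendicular distance from that axis to the fingertip rather than the segment length.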

# **Organization of hand posture**

Another key finding of the present study was that the fingers not used for striking keys showed smaller extension angles in the professionals than in the amateurs. The muscular activities of the finger extensors that are responsible for lifting the fingers were also smaller for the professionals than for the amateurs. These results indicate that the professionals reduced the muscular load for keeping the non-striking fingers lifted during the course of the keystrokes. Because the extension angle of the non-striking fingers did not systematically increase with tempo, this postural muscular contraction is unlikely to serve to compensate for an unwanted spillover effect of the striking motion on the non-striking fingers due to anatomical and neural connections across digits (Hager-Ross and Schieber, 2000; Lang and Schieber, 2004; Aoki et al., 2005; van Duinen et al., 2009; Yu et al., 2010). Instead, this finger elevation plays a role in simply avoiding the sounding of unwanted tones.

The hand is a highly redundant motor system with a large number of DOFs. Previous studies have demonstrated that the organization of hand posture during a manual grasping task is subject to task constraints, such as the locations at which the fingers are to be placed on the object (Lukos et al., 2007), the geometric shape of the object (Santello et al., 1998), and the movement speed during reaching toward the object (Rand et al., 2006). However, whether hand posture would differ depending on the skill of the individual has remained unclear. In the present motor task, although the only explicit constraint imposed on non-striking fingers was to lift the hand to avoid touching unwanted keys, there was a distinct difference in hand posture between the professional and amateur pianists. This finding implies that the nervous system reorganizes hand posture during keystrokes with an improvement in skill. There seem to be at least two plausible benefits of this reorganization of hand posture. First, a hand posture with a smaller load on the finger muscles would facilitate endurance against peripheral muscular fatigue, which is a concern for piano playing (Penn et al., 1999). Second, smaller force exertion at the finger muscles could reduce muscular stiffness, which decreases mechanical constraints limiting independent control of finger movements (Leijnse, 1997; Lang and Schieber, 2004). We therefore inferred that pianists with superior skill took account of maximizing endurance against muscular fatigue and/or independent control of digits when organizing the hand posture during repetitive keystrokes.

# **Movement strategy for striking keys over a wide range of tempi**

As the tempo increased, the rotational velocity of the elbow, thumb, and little finger reached its peak earlier. This finding indicated that the fingertip accelerated to depress the key for a shorter duration with an increase in tempo. Nevertheless, the key's descending velocity did not decrease at a faster tempo. This finding can be explained in terms of increases in both elbow velocity and muscular co-contraction with increasing tempo. These increases should allow for a greater transfer of momentum from the limb to the key while the fingertip is colliding with the key. It is therefore likely that both the professional and the amateur players took greater advantage of momentum transfer as the tempo increased to compensate for the failure to fully accelerate key depression. Indeed, pianists produced a fingertip velocity that was larger than the key-descending velocity only during a keystroke with a touch that was able to utilize momentum transfer (Furuya et al., 2010).

Intriguingly, we also found an interaction effect of group and tempo on the peak elbow velocity and on the finger muscular co-activation level. The professionals showed a greater tempo-related increase in elbow velocity than did the amateurs, whereas the opposite held for co-activation. This finding suggests that, when striking faster, the professionals increased the descending velocity at the fingertip to a greater extent and the stiffness at the fingertip to a lesser extent than the amateurs, which highlights a skill level-dependent difference in the movement strategy to adjust tempo. Furthermore, when striking at the fastest tempo, a distinct inter-pianist difference in the IKI was clearly explained by both the elbow longitudinal rotational velocity and the time at peak fingertip descending velocity. This finding supported our view that a failure to fully accelerate the key depression with the fingertip when striking faster was compensated for by elbow motion, which reflects a player's expertise. Taken together, the distinct movement strategy for tempo adjustment in pianists with superior skill may play a role in the execution of extremely fast keystrokes. One implication of this finding for music pedagogy would be to produce motion at proximal joints rather than distal joints to strike keys faster, which might be counterintuitive for less skilled players.

Previous neuroimaging studies have demonstrated that pianists with superior skill had a greater volume of gray matter in the motor cortex (Amunts et al., 1997) and cerebellum (Hutchinson et al., 2003). Enlargement of these motor-related brain regions, which can result from long-term training from childhood (Zatorre et al., 2007; Penhune, 2011), has been mostly explained in terms of superior hand motor function, such as faster speed of finger tapping (Amunts et al., 1997). However, both the primary motor cortex (Vargas-Irwin et al., 2010) and the cerebellum (Thach, 1998; Timmann et al., 2008) play roles in coordinating multiple DOFs. Our findings therefore raise the possibility that a greater volume of motor-related regions may enable pianists to utilize more DOFs, particularly at the proximal body portion, to perform virtuosic motor behaviors. This supposition is compatible with a theory proposing that as motor skill develops, DOFs that were initially redundant become abundant so as to enhance movement performance (Yang and Scholz, 2005; Latash, 2008). Such a change may be associated with an enlargement of motor-related brain regions with an improvement of motor skill for piano playing.

The present findings may provide insights into motor control of piano touch. Previous studies demonstrated that touch in piano keystroke was defined by distinct mechanical interaction between the fingertip and key (Goebl et al., 2005; Kinoshita et al., 2007; Goebl and Palmer, 2008). Our study also suggested different finger-key contact dynamics between the professionals and amateurs, particularly at faster tempi, because the pianists with superior skill showed a smaller increase in the finger muscular co-activation, and thus stiffness, with tempo. However, the smaller increase in the co-activation was likely attributable to a greater increase in elbow velocity with tempo. The utilization of the elbow motion by the professionals thus allowed for fingertip-key contact with low stiffness over a wide range of tempi. This implies that the proximal joint motion is a key determinant of piano touch, which supports our recent finding (Furuya et al., 2010).

# **Specialized motor skill responsible for artistic musical performance**

Manipulating elements of music (e.g., rhythm, timbre, loudness, harmony, and tempo) elicits emotional and autonomic responses in listeners (Dalla Bella et al., 2001; Gomez and Danuser, 2007; Bernardi et al., 2009; Hailstone et al., 2009). Artistic musical performance thus requires motor skill to manipulate elements of music. The present study focused on motor skill to change tempo, a key variable affecting listeners' emotion and autonomic activity (Dalla Bella et al., 2001; Khalfa et al., 2008). We identified movement characteristics responsible for tempo adjustment during alternate piano keystrokes, which, however, differed depending on players' expertise. Remarkably, the differences appeared to enable pianists with superior skill to perform faster keystrokes, which allowed for a wider range of musical expression. In addition, the less pronounced increase in finger muscular activity with tempo in the professional pianists than in the amateurs could help individuated finger movements and/or prevention of muscular fatigue, both of which are necessary for precise tone production. Accordingly, the specialized motor skill to manipulate tempo in the professional pianists would ensure artistic musical performance with superior expressiveness and precision.

# **Conclusion**

The present study determined certain motor skills responsible for artistic musical performance. To maintain fine motor performance for a prolonged duration of repetitive keystrokes, the professional pianists had faster elbow rotation, slower rotation at the digits used for keystrokes, smaller extension angle at the fingers unused for keystrokes, and smaller co-activation of finger muscles compared with the amateur pianists. The professionals also increased the elbow velocity to a greater extent and the co-activation to a smaller extent than the amateurs with an increase in tempo. Furthermore, when striking at the individual's fastest tempo, pianists capable of faster keystrokes showed greater elbow velocity and earlier occurrence of the peak key velocity, which highlights the importance of the specialized motor skill for a wider range of tempo manipulation. A future study is needed to assess whether the findings can be generalized to melodic sequences involving motions of a greater number of digits and less pronounced arm motion, and whether the findings of individual differences can be generalized to a larger number of pianists.

# **Acknowledgments**

We thank Drs. Masaya Hirashima (Tokyo University) and Takashi Takuma (Osaka Institute of Technology) for their helpful and constructive suggestions on the calculation of joint angles based on the 3D position data of markers. We also thank Mr. Hrishikesh Rao (University of Minnesota) for proofreading the manuscript. We are grateful to the three anonymous reviewers who provided constructive suggestions to improve the manuscript.

# **References**

segments in striking the keys by expert pianists. *Neurosci. Lett.* 421, 264–269.


comparisons of digits, hands, and movement frequencies. *J. Neurosci.*  20, 8542–8550.


discrete and continuous movements. *PLoS Comput. Biol.* 4, e1000061. doi: 10.1371/journal.pcbi.1000061


local primary motor cortex populations. *J. Neurosci.* 30, 9659–9669.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 March 2011; accepted: 12 May 2011; published online: 27 May 2011.*

*Citation: Furuya S, Goda T, Katayose H, Miwa H and Nagata N (2011) Distinct inter-joint coordination during fast alternate keystrokes in pianists with superior skill. Front. Hum. Neurosci. 5:50. doi: 10.3389/fnhum.2011.00050*

*Copyright © 2011 Furuya, Goda, Katayose, Miwa and Nagata. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **MUSIC AND THE BRAIN, LITERALLY**

**Joseph LeDoux**

Opinion Article published: 01 June 2011 doi: 10.3389/fnhum.2011.00049

*Joseph LeDoux*<sup>1,2</sup>\*

*<sup>1</sup> Center for Neural Science and Department of Psychology, New York University, New York, NY, USA*

*<sup>2</sup> Emotional Brain Institute, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, USA*

*\*Correspondence: ledoux@cns.nyu.edu*

I spend my days working on the brain, specifically on how it makes mind and behavior possible. Most everyone believes that mind and behavior depend on the brain. But just how it all works is still mysterious. We have learned a lot, but we have got a long way to go.

My particular area of research is about the relation between emotion and memory. For the past 25 years or so I have been using a simple behavioral paradigm, Pavlovian fear conditioning, to try to understand how the rat brain (and by inference, the human brain) learns and stores memories about threatening situations. In Pavlovian fear conditioning, a rat is exposed to a sound that is followed by a mild shock. The sound comes to elicit fear responses, such as freezing behavior, autonomic nervous system responses, and hormonal responses. By tracing the circuits forward from the auditory system to the response control systems, we implicated specific regions of the amygdala in the learning and storage of these memories. This kind of memory is different from what we usually mean by the term memory. It is not memory that you consciously recall and reflect on. It is implicit or unconscious memory. In the early 1990s, I introduced the concept of emotional memory (LeDoux, 1992a,b,c), which built upon the distinction between declarative/explicit memory and non-declarative/procedural/implicit memory (Squire, 1986). Thus, *emotional memory* (the implicit or unconscious kind of memory created by fear conditioning) was distinguished from *cognitive memory about emotion* (the kind of conscious memory one might have about a situation in which emotional memories are learned and stored). The former depends on the amygdala and the latter on the hippocampus and other components of the medial temporal lobe memory system (LeDoux, 1996, 2002, 2007a,b).

Work by my lab and others has made a lot of progress in pinpointing the neural system centered on the amygdala that is involved in the learning and storage of emotional fear memory, as well as the synaptic, cellular, and molecular mechanisms involved (for reviews see: LeDoux, 1996, 2000, 2002, 2007a,b; Walker et al., 2003; Maren and Quirk, 2004; Rodrigues et al., 2004; Fanselow and Poulos, 2005; Maren, 2005; Phelps and LeDoux, 2005; Sah et al., 2008; Pape and Pare, 2010; Tully and Bolshakov, 2010).

Research is difficult these days. Neuroscientists not only have to try and crack the hardest code unknown to man, we also have to convince our colleagues sitting on funding panels that our particular idea about how to make a little progress is what they should bless with an influx from the shrinking pot of government research money. So it is good to have a way to relax.

At some point in every day I have a guitar in my hands, unless I am traveling and without my sonic security blanket. Sometimes I just strum some chords. Other times I work on a rhythm pattern, or I attempt to fine-tune some vocals so that they are in harmonic alignment with the chords (or at least close). Other times I am rehearsing or playing with my friends and band mates in The Amygdaloids (www.amygdaloids.com).

In the early fall of 2006 I was invited to give a lecture on my work on emotions and the brain to the Secret Science Club. This was and still is a thriving science-for-the-public program held in a bar in Brooklyn. It draws on the amazing reservoir of scientific researchers in the New York City area, having a speaker once a month. The organizers said they would find some entertainment to follow my talk. I said I would bring the entertainment.

For a couple of years, Tyler Volk and I had been playing guitar together, alternating between each other's apartments. Tyler is a Professor in Biology, with a specialty in environmental science, but with a serious fascination with mind and brain (I consider him a closet cognitive neuroscientist). At one point Daniela Schiller, a postdoc in cognitive neuroscience at NYU, joined us on drums at a party. When we got the Brooklyn gig, Daniela invited her research assistant, Nina Curley, to play bass. We marched off to the Secret Science Club as The Amygdaloids.

Since we were a neuroscience band, we decided to focus our set on songs about mind and brain and mental disorders. We played some covers (Manic Depression, 19th Nervous Breakdown) and a couple of original tunes I wrote (Mind Body Problem and All in a Nut, the latter being about the amygdala, which is named after the Greek word for almond). A local newspaper described our show with the label, "Heavy Mental."

By the summer of 2007 we had put together enough original tunes to put out a CD, which we of course named "Heavy Mental." Included were songs like A Trace (a love song about the synaptic basis of memory), Inside of Me (a lyrical portrayal of Descartes' idea that one's mind can only be truly known internally), When the Night Is Dark (about fear), Mind Body Problem (a love song about the struggle between passion and reason), and An Emotional Brain (also about the difficulty of controlling our emotions). The songs were recorded at Axis Studios in Manhattan (Jeff Peretz and Steve Rossiter, producers).

We have played in various juke joints and clubs around NY, and also in some pretty fancy venues – Madison Square Garden (for NYU's graduation) and the Kennedy Center in Washington DC (a real gig). The big venues are fun, but there's something special about dark, dank clubs like Kenny's Castaways, where decades of beer fumes exude from walls that once contained the sounds of Dylan, Springsteen, and other giants in their early days. Some of the other clubs we have played around New York include Arlene's Grocery, Don Hill's, Lakeside Lounge, and Otto's Shrunken Head.

By 2008 we had a whole new set of songs, and had hooked up with an independent record label, Knock Out Noise, which agreed to record our second CD. It was originally titled "Brainstorm," but was eventually formally released as "Theory of My Mind" (**Figure 1**). This title obviously is a play on the notion in psychology and philosophy known as theory of mind, the idea that social interactions between people involve in part the ability to put oneself in the shoes (mind) of another. Theory of My Mind featured Grammy-winning artist Rosanne Cash doing backup vocals on two songs, and Simon Baron-Cohen, a leading proponent of the idea that theory of mind is impaired in autism, on bass. "Theory of My Mind" had a lot of music on it, 13 songs in all. We explored questions about memory (Mist of a Memory; Glue), free will (How Free Is Your Will; Crime of Passion; the Automatic Mind), the relation of mind and matter (Mind Over Matter; Piece of My Mind), emotions (Fearing), mental time travel (Imaginate), neurophysiology (Refractory Time), mental illness (Brainstorm), and, of course, theory of mind (Theory of My Mind).

In addition to getting to record with Rosanne Cash, we have gotten to play with, or be on the same bill with, some amazing musicians, including Lenny Kaye (Patti Smith Group), Steve Wynn (Dream Syndicate, The Baseball Project), Gary Lucas (Captain Beefheart; Gods and Monsters), Dee Snider (Twisted Sister), The Kennedys, and Rufus Wainwright.

Tyler, Daniela, and I have been the core of the band from the start. Bass has been a moving target. When Nina left NY to pursue other interests, I brought in Gerald McCollam, an old friend who had run the teaching lab in the Center for Neural Science several years earlier. But then his freelance programming work took him back to Louisiana, where we both grew up. Recently, Amanda Thorpe took over bass. Amanda has an angelic voice, and an independent career in the New York music scene. She fits the band well having studied neuropsychology at University College London.

I get asked quite a lot about the relation of music and the brain. But that's not really my area of research. I try to connect music and the brain lyrically rather than through scientific activity. I do not have any formal musical training, and do not really know what the right questions to ask are. I suppose I could come up with some studies of emotion and music, but have not felt that urge. Some really talented scientists are involved in this field, such as Robert Zatorre, Dan Levitin, and Mark Tramo. They are doing a fabulous job of uncovering all sorts of fascinating things about how the psychology of music relates to the biology of the brain. I am just happy to write songs and play music.

**Figure 1 | The jacket cover of "Theory of My Mind" (released by Knock Out Noise, 2010).**

Music is not just fun. It is also a great communication device. I have used books to communicate science to the public in the past. My two main books, The Emotional Brain (1996) and Synaptic Self (2002), have reached a large audience around the world. Now I am taking a different tack. I write songs with little nuggets of information about mind and brain and mental disorders as a way to stimulate interest in brain and mind, hoping the listener might then hunger to know more. Maybe they will go to the web, or even pick up a book on the brain. It is an experiment, the outcome of which is still unknown. Perhaps it will work. In the meantime, even if it does not, The Amygdaloids are having a good time.

# **Supplementary Material**

The Audio files and Video files for this article can be found online at http://www.frontiersin.org/human\_neuroscience/10.3389/fnhum.2011.00049/full and http://www.youtube.com/watch?v=AMI3hbgRj6o

# **References**


Squire, L. R. (1986). Mechanisms of memory. *Science*  232, 1612–1619.


*Received: 28 April 2011; accepted: 09 May 2011; published online: 01 June 2011.*

*Citation: LeDoux J (2011) Music and the brain, literally. Front. Hum. Neurosci. 5:49. doi: 10.3389/fnhum.2011.00049*

*Copyright © 2011 LeDoux. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **MUSIC AND THE AUDITORY BRAIN: WHERE IS THE CONNECTION?**

**Israel Nelken**


# *Israel Nelken\**

*Department of Neurobiology, The Interdisciplinary Center for Neural Computation and The Edmond and Lily Safra Center for Brain Research, Hebrew University, Jerusalem, Israel*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Carles Escera, University of Barcelona, Spain; Robert J. Zatorre, McGill University, Canada*

#### *\*Correspondence:*

*Israel Nelken, Department of Neurobiology, Hebrew University, Edmond J. Safra Campus, Givat Ram, Jerusalem 91904, Israel. e-mail: israel@cc.huji.ac.il*

Sound processing by the auditory system is understood in unprecedented detail, even compared with sensory coding in the visual system. Nevertheless, we do not yet understand the way in which some of the simplest perceptual properties of sounds are coded in neuronal activity. This poses serious difficulties for linking neuronal responses in the auditory system and music processing, since music operates on abstract representations of sounds. Paradoxically, although perceptual representations of sounds most probably occur high in the auditory system or even beyond it, neuronal responses are strongly affected by the temporal organization of sound streams even in subcortical stations. Thus, to the extent that music is organized sound, it is the organization, rather than the sound, which is represented first in the auditory brain.

**Keywords: music, auditory, brain, neurons**

# **MUSIC AND THE AUDITORY SYSTEM**

When I started studying the auditory system, I claimed that I wanted to understand why monkeys prefer listening to the Rolling Stones rather than to Mozart. The monkeys I referred to were, obviously, not *Macaca mulatta*, but rather a subspecies of *Homo sapiens* that I disliked at the time. However, only today, many years later, can I perceive the real conceit of the younger me: the assumption that by studying the auditory system I would be able to understand the reactions of humans to the fine distinctions that separate Rock music from classical music. The "fine" should be taken seriously: a corollary of the argument I want to make in this perspective is that in terms of our current understanding of auditory processing, there is not much difference between the two.

Having already said "music," "auditory processing," and "understanding," I have to define the scope of my argument. I will not try to define music beyond the trivial remark that while music has to do with sounds, not all sound is music. For example, I do not consider the "music of nature," sounds in the natural environment, to be music, in the same sense that a magnificent sunset above the hills west of Jerusalem is not art. So what I am interested in has to do with the fact that someone took many sounds and organized them in some way – music includes not only sound, but also organization, both in sound space and in time.

Contrary to this vague definition of music, when I say "auditory processing," I have something very definite in mind – I mean the biological processes (my bias being toward the electrical ones) that occur between the vibration of the tympanic membrane at one end and the spiking activity of neurons in the auditory brain at the other end. I will almost completely ignore here evidence from fMRI, which at best can give some hints as to the location of active neurons; and most evidence from EEG and MEG, which, while measuring electricity, reflect only distantly the actual spiking activity of neurons. Thus, my view of auditory processing in this perspective is unabashedly neuron-centric: by "understanding" I mean the reduction of (some of) the phenomenology of music into neural mechanisms, spikes, synaptic currents, ion channels and all.

Finally, I have to define the parts of the brain I am considering as auditory. This is a surprisingly hard question. While the auditory nerve, the multiple brainstem auditory centers with their intricate analysis of auditory neural signals culminating in the hugely complex inferior colliculus, the medial geniculate body, and the primary auditory cortex are all clearly parts of the auditory brain, there are many other brain areas where there are auditory responses but which are not considered as auditory. These include, for example, the amygdala (Quirk et al., 1997), the superior colliculus (Middlebrooks and Knudsen, 1984), the hippocampus (Edeline et al., 1988), and the cerebellum (Huang and Burkard, 1986), to name just a few subcortical centers; and many cortical areas that lie beyond the classical auditory cortex (e.g., Cohen et al., 2004). For the purpose of this perspective, I will concentrate on the "core" auditory regions, those parts of the brain that would be considered as "the" auditory system in a textbook of the nervous system – the subcortical ascending auditory pathways, primary auditory cortex, and the surrounding fields.

I should immediately admit the limitations of this strongly reductionist approach. First, I am limiting myself to (mostly) data from animal studies. At the early stages of processing that I am considering, mammalian brains are reasonably similar to each other so that this is probably not a serious constraint. Second, a phenomenon as complex as music cannot be reduced to the responses of single neurons, but would require studying simultaneously the responses of many neurons distributed throughout the brain. Even an account as reductionist as the one I am considering here would require taking into account such brain-wide networks; however, my argument will be based on evidence from single-neuron responses only. Third, the brain areas I am considering are rarely those specifically activated by music in human imaging studies (e.g., Janata, 2005). As I will argue below, processing of relevance for music is performed in these areas, in spite of the generally negative evidence from human imaging studies.

With these cautionary notes out of the way, here is the main argument of this perspective: with our current understanding of the auditory system, we stand in the paradoxical situation in which we do not understand "sound," while we have a strong handle on "organization." Thus, the low-level representations of sounds on which music is based are badly understood, and may in fact occur only "higher up" in the brain, outside the auditory brain as defined here. On the other hand, high-level aspects of music, such as sound organization in time, are strongly reflected within this same auditory brain.

# **SOUNDS AND THE AUDITORY BRAIN**

I will pursue the first part of my argument in two ways. I will first argue that we do not quite understand where and how the low-level properties of sounds, such as pitch and timbre, are represented in the auditory system. I will then argue that this is really a side effect of an even larger gap in our understanding – the fact that we do not understand the relationships between the pressure waves that cause the tympanic membrane to vibrate, and the introspective percept we call sound, which is very far removed from both the physical vibration that initiated it and from the representation of these vibrations in the auditory system (at least as defined here).

Let us consider sound processing in the auditory system and its relationships to a fundamental property of sound that is used in music – pitch. Pitch is without doubt one of the most important properties of sounds with which humans do music. The major physical correlate of pitch is periodicity (not frequency!), but this is not an absolute identification – there are periodic sounds that do not elicit pitch at their periodicity, and non-periodic sounds that do elicit pitch (Schnupp et al., 2011, Chapter 3). Most importantly, pitch represents an abstraction: many different sounds have the same pitch (e.g., a violin, cello, trumpet, flute, and a piano all playing the same note, see https://mustelid.physiol.ox.ac.uk/drupal/?q=topics/same-melody-different-timbre).

This abstract quality of pitch has consequences for our understanding of the coding of pitch in the auditory brain. To start with, it is often argued that since auditory nerve fibers follow the periodicity of sounds evoking pitch, pitch is coded in the auditory nerve. I believe that this is seriously wrong.

The heart of the matter is the fact that periodicity may depend on spectral content in a wide frequency band, while auditory nerve fibers are narrowly tuned; in general a single auditory nerve fiber simply does not "hear" enough of the sound in order to respond to the right periodicity. Thus, a neuron whose best frequency is 200 Hz will respond roughly similarly to a sound with a periodicity of 100 Hz containing a prominent second harmonic and to a sound with a periodicity of 200 Hz with a prominent fundamental, and may not respond at all to a sound with a pitch of 200 Hz which misses its first few harmonics. In other words, the response of an auditory nerve fiber tuned to 200 Hz is neither sufficient nor necessary for a sound to be perceived as having a pitch of 200 Hz.
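The distinction between periodicity and the energy seen by a single narrowly tuned channel can be illustrated numerically. The following is a minimal sketch, not a model of the auditory nerve: the sampling rate, the harmonic numbers, the single-frequency "channel" (`energy_at`), and the autocorrelation-based `periodicity` estimate are all assumptions chosen for illustration.

```python
import math

FS = 8000  # sampling rate in Hz (an arbitrary choice for this sketch)

def harmonic_complex(f0, harmonics, n=400):
    """Sum of unit-amplitude cosines at the listed harmonic numbers of f0."""
    return [sum(math.cos(2 * math.pi * f0 * h * t / FS) for h in harmonics)
            for t in range(n)]

def energy_at(signal, freq):
    """Normalized energy at one frequency: a crude stand-in for a narrowly tuned channel."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * t / FS) for t, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * t / FS) for t, x in enumerate(signal))
    return (re * re + im * im) / (n * n)

def periodicity(signal, fmin=50, fmax=500):
    """Repetition rate in Hz: the shortest lag whose autocorrelation is near-maximal."""
    lags = range(FS // fmax, FS // fmin + 1)
    window = len(signal) - lags[-1]  # same number of samples for every lag
    acf = {lag: sum(signal[t] * signal[t + lag] for t in range(window)) for lag in lags}
    peak = max(acf.values())
    best = min(lag for lag, val in acf.items() if val >= 0.999 * peak)
    return FS / best

# A 200 Hz periodicity with no energy at 200 Hz (components at 600, 800, 1000 Hz).
missing_fundamental = harmonic_complex(200, [3, 4, 5])
# A 100 Hz periodicity with a prominent component at 200 Hz (100, 200, 300 Hz).
low_complex = harmonic_complex(100, [1, 2, 3])

for name, sig in [("missing fundamental", missing_fundamental), ("100 Hz complex", low_complex)]:
    print(name, "periodicity:", periodicity(sig), "Hz; energy at 200 Hz:", round(energy_at(sig, 200), 3))
```

In this sketch the channel centered on 200 Hz is silent for the sound whose periodicity is 200 Hz and active for the sound whose periodicity is 100 Hz, which is the sense in which a single narrowly tuned element is neither necessary nor sufficient for the pitch.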

This fact is well known, but is usually handled by claiming that it is the activity in the whole array of auditory nerve fibers that represents the pitch of a sound. This claim is in a way true – by observing the array of auditory nerve fibers, it should certainly be possible to determine the pitch of a sound. After all, when we listen to sounds, we extract pitch from the auditory nerve activity pattern all the time. However, this claim also misses the point, in two ways.

First, such a claim does not solve the problem of the coding of pitch. Somewhere in the brain, some structure has to take the array of activity of the auditory nerve fibers, and use it to extract the invariant representation of pitch (or so we intuit), so claims about "population coding" just shove the problem of pitch coding away without solving it. There is no extra explanatory power in the claim that the auditory nerve fibers represent pitch relative to the claim that the pressure vibrations in the air represent pitch.

Second, and possibly even more importantly, the array of auditory nerve fibers represents not only pitch, but also all other perceptual properties of sounds. The same fibers whose responses contain information about the pitch also carry information about timbre and loudness. In fact, inasmuch as we can talk about representations in the auditory nerve, the array of auditory nerve fibers represents very clearly one thing – the physical vibrations of the tympanic membrane. It does not even represent the abstract quantity called periodicity, not to mention the perceptual quality called pitch.

What about stations higher up in the auditory pathway? There is substantial and important work on the coding of pitch in the brainstem. As in the case of auditory nerve fibers, brainstem neurons follow the periodicity of the acoustic stimulus, but the dominant sound representations all the way up to the inferior colliculus share with the auditory nerve fibers the narrow width of tuning of each individual element and the high sensitivity to many (if not all) properties of sounds. Thus, while periodic sounds evoke strong periodic activity in the brainstem (e.g., Winter et al., 2001), there is no convincing evidence that the brainstem (even the inferior colliculus) has an explicit representation of pitch (reviewed in Schnupp et al., 2011, Chapter 3). In fact, the most convincing suggestions for the structure(s) that perform this abstraction, from sounds to their pitch, are far up in the auditory hierarchy, certainly above primary auditory cortex, both in humans (Patterson et al., 2002; Hall and Plack, 2009) and in non-human primates (Bendor and Wang, 2005). This is, in a way, a rather surprising finding. Sounds go all the way from the periphery to primary auditory cortex and above without an explicit assignment of pitch. And without a pitch representation, it is hard to see how music can be represented.

I believe, however, that the gap between music and the current understanding of the auditory system is much wider than this upside-down result. In my discussion of pitch coding, I ignored a crucial facet of real-world sounds: contrary to most auditory experiments (including many of mine), we usually hear more than one "sound" at each moment in time. For example, while typing this manuscript, I hear the low rumble of the power supplies of the many computers in my lab, a merle singing outside the window of my office, and the tick tack of the keys I hit while typing. My auditory nerve fibers carry information about the mixture, not about any particular component of it. There is an important corollary here – at the level of the auditory nerve, many pitches may be present concurrently. Inasmuch as this is music, these different pitches have at least some individual existence. However, it is hard to think of ways of identifying the different pitches without at the same time also separating out the different bits of sounds that sum up to produce the mixture at the ear (Schnupp et al., 2011, Chapter 6).

This argument puts in the foreground the need to understand the transformation that occurs in the brain between the physical stimuli and the "objects of perception," those things that carry the perceptual properties we attribute to sounds such as pitch, timbre, spatial location, and so on. Music is done to a significant degree with these "objects of perception" – the individual tones composing a chord, melody as separate from its accompaniment, and so on and so forth. In the last 10 years or so, electrophysiologists studying the auditory brain came to appreciate the great importance of a loose collection of processing tasks called auditory scene analysis (ASA) whose goal is to form these objects of perception (Bregman, 1990). In fact, I consider ASA, in a wide enough sense, as the major processing task of the auditory system. Thus, understanding how neurons do ASA is a necessary step toward understanding music in the brain.

So, how much do we understand ASA in neuronal terms? Consider one important advance in understanding the implementation of ASA in the brain: the recent spate of work on streaming. In a typical streaming experiment, two sounds are presented alternately to the subject. If the difference between the two sounds (e.g., frequency separation between two pure tones) is large enough, and/or if they are played fast enough, the sequence of sounds breaks down perceptually into two "things," each containing one of the two sounds (hear the illustration at https://mustelid.physiol.ox.ac.uk/drupal/?q=topics/streaming-galloping-rhythm-paradigm). Bregman (1990) named the two "things" streams. The groundbreaking work of Fishman et al. (2001) led to a neural account for the breakdown process of the single sequence of pure tones into two streams: they showed that under conditions in which a single sequence is heard, neurons in auditory cortex of macaques tend to respond to both tones, while when a breakdown occurs they tend to respond to either one tone or the other. Using these ideas, Micheyl et al. (2005) showed that the dynamics of the breakdown process in human listeners can be accounted for by the dynamics of neural responses in auditory cortex of macaques. Recently, Elhilali et al. (2009) remarked that there should also be an important role for the temporal incoherence of the neuronal responses to the two tones in the two-stream condition, adding yet another component to the neural model of streaming.
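The frequency-separation part of this account can be caricatured in a few lines. The sketch below is a toy illustration only, not the analysis of any of the cited studies: the Gaussian tuning curve, its bandwidth, and the function names are hypothetical, and the effect of presentation rate is not modeled at all.

```python
import math

def alternating_sequence(f_a, semitones, n_tones, rate_hz):
    """Onset times (s) and frequencies (Hz) of an A-B-A-B... pure-tone sequence."""
    f_b = f_a * 2 ** (semitones / 12)
    return [(i / rate_hz, f_a if i % 2 == 0 else f_b) for i in range(n_tones)]

def tuned_response(freq, best_freq, bandwidth_semitones=2.0):
    """Hypothetical Gaussian tuning on a log-frequency axis, equal to 1.0 at best_freq."""
    distance = 12 * math.log2(freq / best_freq)
    return math.exp(-0.5 * (distance / bandwidth_semitones) ** 2)

# A model neuron tuned to tone A, probed with small and large A-B separations.
for separation in (1, 12):
    sequence = alternating_sequence(1000, separation, n_tones=4, rate_hz=10)
    responses = [round(tuned_response(freq, best_freq=1000), 3) for _, freq in sequence]
    print(separation, "semitones:", responses)
```

With a small separation the model neuron responds to both tones; with a large separation it responds to tone A only. As the text goes on to stress, such a neuron still responds to individual sounds and does not by itself represent a stream.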

While these are significant advances in the process of linking the perceptual phenomenon of streaming with neural responses, it is important to realize that these studies did not find a neural representation of streams. The neurons studied by Fishman, Micheyl, Elhilali and their colleagues just responded to the individual sounds in the sequence. Instead, these studies demonstrate properties of neural responses that may be used by a hypothetical (but at this point possibly mythical) next layer to create streams. Thus, important as they are, these studies do not solve the issue of the representation of streams in the auditory system.

To the best of my knowledge, there is only one non-trivial example of the end-product of ASA in neural hardware: the specific responses of neurons in cat auditory cortex to the background components of natural sounds (Bar-Yosef and Nelken, 2007; Nelken and Bar-Yosef, 2008). In these experiments, short segments of natural recordings of bird songs have been used. These segments were digitally processed to remove the bird songs, preserving only the background rustling. Many neurons responded to the natural sound with similar responses to those they emitted when presented with the background alone, but responded differently when presented with the clean bird song. These neurons respond to one bit of the sound independently of the presence of other bits of sounds, which may be substantially louder inside their frequency response area. Unfortunately, the neural mechanisms leading to such responses have not been worked out.

# **ORGANIZATION AND THE AUDITORY BRAIN**

Here comes what is, for me, the most surprising twist in the plot. There is in fact a significant amount of processing in the auditory brain which I find highly relevant for music. However, it does not have much to do with the "sound" of music, but rather with the "organization" of music.

The phenomena I want to emphasize here occur at time scales of seconds to minutes. Responses of neurons to sounds turn out to depend on the recent history of sound presentations. Early clues to these effects have been known for many years. For example, Condon and Weinberger (1991) showed a strong depression in the response to a frequency presented repeatedly as tone pips for a few minutes, but this depression did not generalize to other, nearby tone frequencies, and therefore did not represent a "fatigue" of the neuronal responses.

It was, however, the introduction of the oddball paradigm into single-neuron studies by Ulanovsky et al. (2003) that really energized the study of context sensitivity in the auditory system. The oddball paradigm has been extensively used in human studies (Naatanen et al., 2007) to study the important component of the auditory event-related potentials called mismatch negativity (MMN). Ulanovsky et al. (2003) adapted this paradigm to single-neuron studies. In a typical oddball experiment, two tones are presented in a sequence, one common and one rare. In a different sequence, the two tones are again presented but with their roles reversed. The typical result of such experiments is that the response to the same tone may be substantially larger when rare than when common. This effect, named "stimulus-specific adaptation" (SSA, Ulanovsky et al., 2003) when considered in the context of single-neuron responses, has now been studied by a large number of groups and shown to be present in auditory cortex of anesthetized cats (Ulanovsky et al., 2003); awake rats (Von Der Behrens et al., 2009); the inferior colliculus of rats (Malmierca et al., 2009); and the medial geniculate body of rats and mice (Anderson et al., 2009; Antunes et al., 2010).

Stimulus-specific adaptation is relevant to music because it shows that the responses of neurons to the same sound depend on the organization of its recent past. In the case of oddball sequences, this is a rather simple dependency – the less common the sound, the larger the response it evokes. However, recent work in my laboratory (Taaseh et al., 2011) compared the responses evoked by the same tones in a number of different sequences, showing for example that the responses to a rare tone played with a common sound are shaped by different mechanisms than the responses to a tone that is rare, but played together with many different sounds, all of which are rare. Similarly, the responses to the two tones in an oddball sequence depend on whether the sequence is regular, with fixed intervals between presentations of the rare tone, or whether the sequence is random, with a fixed probability of presentation of the rare tone. Thus, the neuronal responses in auditory cortex do not depend only on the probability of the tones, but also on fine details of their order (Nelken et al., 2010). Furthermore, SSA is not limited to pure tones – frozen tokens of white noise evoke SSA when played in an oddball configuration (Nelken et al., 2010). Thus, SSA in primary auditory cortex of anesthetized rats seems to engage mechanisms that are sensitive to the detailed history of the sound sequence, and not only to the rarity of the rare pure frequency tone.
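The difference between the random and the regular oddball sequences described above can be made concrete with a short sketch. The function names, trial counts, and probabilities below are illustrative assumptions; the contrast index in `ssa_index` is a normalized difference of the kind commonly used to quantify SSA, and is not taken from the text.

```python
import random

def random_oddball(n_trials, p_rare, seed=0):
    """Each trial is independently the rare tone with probability p_rare."""
    rng = random.Random(seed)
    return ["rare" if rng.random() < p_rare else "common" for _ in range(n_trials)]

def regular_oddball(n_trials, period):
    """The rare tone occupies every `period`-th position; all other trials are common."""
    return ["rare" if (i + 1) % period == 0 else "common" for i in range(n_trials)]

def rare_intervals(sequence):
    """Number of trials between successive presentations of the rare tone."""
    positions = [i for i, tone in enumerate(sequence) if tone == "rare"]
    return [b - a for a, b in zip(positions, positions[1:])]

def ssa_index(response_when_rare, response_when_common):
    """Normalized contrast between responses to the same tone in its two roles."""
    return (response_when_rare - response_when_common) / (
        response_when_rare + response_when_common)

regular = regular_oddball(200, period=10)   # rare tone on exactly 10% of trials
shuffled = random_oddball(200, p_rare=0.1)  # rare tone on about 10% of trials
print("regular intervals:", set(rare_intervals(regular)))
print("random intervals: ", sorted(set(rare_intervals(shuffled))))
```

Both sequences present the rare tone with roughly the same probability, but only the regular one has fixed intervals between rare tones, so a response difference between them would have to reflect sensitivity to order rather than to probability alone.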

# **SO WHAT?**

The auditory brain shows little evidence of sound representations in terms of their perceptual qualities, and, even worse, does not even seem to represent sounds at all, at least in the usual everyday sense of the word "sound." Instead, the auditory brain seems to represent, quite well, the physical vibrations of the tympanic membrane. At the level of the auditory cortex there are some hints of representations that either emphasize features of sounds that can be used later to create "objects of perception" or "streams" (I am vague on purpose), or even already separate sound mixtures into their components. It is in this sense that classical and Rock music are not that different from each other. Most low-level descriptors of the two would not be too far from each other – overall spectral range (with a possible advantage to Rock music at very low frequencies), typical rates of spectral and temporal modulations, and all of these other properties that modulate the responses of auditory neurons in the early parts of the auditory brain.

While we struggle with the nature of sound representations in the auditory brain, it is singularly easy to observe the signature of sound organization on the neural responses, starting as early as the inferior colliculus. Thus, organization is reflected in the neural responses of the auditory brain more strongly, and at earlier stages, than sounds (in the sense of "objects of perception"). This is nonintuitive (at least to me). Taken to the extreme, this state of affairs may mean that in the "organized sound" that music may be, we may have an easier time accounting for the "organized" than for the "sound" within the confines of the auditory brain. Thus, it may well be that when brains process music, organization comes first, and sound only follows.

#### **ACKNOWLEDGMENTS**

This work was supported in part by grants from the Israeli Science Foundation (ISF) and the German-Israeli Foundation (GIF).

# **REFERENCES**

Middlebrooks, J. C., and Knudsen, E. I. (1984). A neural code for auditory space in the cat's superior colliculus. *J. Neurosci.* 4, 2621–2634.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 May 2011; accepted: 10 September 2011; published online: 27 September 2011.*

*Citation: Nelken I (2011) Music and the auditory brain: where is the connection? Front. Hum. Neurosci. 5:106. doi: 10.3389/fnhum.2011.00106*

*Copyright © 2011 Nelken. This is an open-access article subject to a nonexclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **AUDITORY SCENE ANALYSIS: THE SWEET MUSIC OF AMBIGUITY**

# **Daniel Pressnitzer, Clara Suied and Shihab Shamma**


# *Daniel Pressnitzer<sup>1,2</sup>\*, Clara Suied<sup>1,2,3</sup> and Shihab A. Shamma<sup>1,2,4</sup>*

*<sup>1</sup> Centre National de la Recherche Scientifique and Université Paris Descartes, UMR 8158, Paris, France*

*<sup>2</sup> Département D'études Cognitives, Ecole Normale Supérieure, Paris, France*

*<sup>3</sup> Fondation Pierre-Gilles de Gennes pour la Recherche, Paris, France*

*<sup>4</sup> Electrical and Computer Engineering, University of Maryland, College Park, MD, USA*

#### *Edited by:*

*Robert J. Zatorre, McGill University, Canada*

#### *Reviewed by:*

*Joel Snyder, University of Nevada Las Vegas, USA; Pierre Divenyi, Veterans Affairs Northern California Health Care System, USA*

#### *\*Correspondence:*

*Daniel Pressnitzer, Département D'études Cognitives, Ecole Normale Supérieure, 29 rue d'Ulm, 75230 Paris Cedex 05, France. e-mail: daniel.pressnitzer@ens.fr*

In this review paper aimed at the non-specialist, we explore the use that neuroscientists and musicians have made of perceptual illusions based on ambiguity. The pivotal issue is auditory scene analysis (ASA), or what enables us to make sense of complex acoustic mixtures in order to follow, for instance, a single melody in the midst of an orchestra. In general, ASA uncovers the most likely physical causes that account for the waveform collected at the ears. However, the acoustical problem is ill-posed and it must be solved from noisy sensory input. Recently, the neural mechanisms implicated in the transformation of ambiguous sensory information into coherent auditory scenes have been investigated using so-called bistability illusions (where an unchanging ambiguous stimulus evokes a succession of distinct percepts in the mind of the listener). After reviewing some of those studies, we turn to music, which arguably provides some of the most complex acoustic scenes that a human listener will ever encounter. Interestingly, musicians will not always aim at making each physical source intelligible, but rather express one or more melodic lines with a small or large number of instruments. By means of a few musical illustrations and by using a computational model inspired by neuro-physiological principles, we suggest that this relies on a detailed (if perhaps implicit) knowledge of the rules of ASA and of its inherent ambiguity. We then put forward the opinion that some degree of perceptual ambiguity may participate in our appreciation of music.

**Keywords: auditory perception, perceptual organization, bistability, auditory illusions, music**

# **INTRODUCTION**

This paper aims at highlighting some cross-connections that, we argue, may exist between auditory neuroscience, perceptual illusions, and music. More precisely, we address the issue of auditory scene analysis (ASA). ASA refers to the ability of human listeners to parse complex acoustic scenes into coherent objects, such as a single talker in the middle of a noisy babble, or, in music, a single melody in the midst of a large orchestra (Bregman, 1990). It has long been and still is one of the hot topics of auditory neuroscience, with its share of important advances and ongoing controversies (e.g., Shamma and Micheyl, 2010 for a review). ASA has also been studied in a musical context, with the hypothesis that many of the established rules of polyphonic writing in the Western tradition may be underpinned by perceptual principles (Huron, 2001). The aim here is not to repeat those arguments, but rather to provide a brief review, aimed at the non-specialist and biased toward perceptual illusions: we argue that illusions seem to be both a powerful investigation tool for neuroscientists and an important expressive device for musicians.

We will first briefly discuss the potential of illusions to reveal fundamental principles of perception in general. We will then describe the problem that ASA has to solve and what we know of the neural processes involved. For our purposes, we will emphasize recent studies, both behavioral and neuro-physiological, that have made use of so-called bistability illusions based on ambiguous stimuli. Finally, through a few musical illustrations and a computational model, we will suggest that perceptual ambiguity has been part of the composer's repertoire for quite some time. We will then conclude by speculating on the potential role of ambiguity in the appreciation of music.

# **ILLUSIONS AS A SIGNATURE OF PERCEPTUAL INFERENCE**

Illusions are a vivid way to remind us about some basic but essential facts about perception. Take for instance the change-blindness illusion from vision (e.g., O'Regan et al., 1999)<sup>1</sup>. In a change-blindness paradigm, major parts of an image or a film can be modified, in full "sight" of the observer, but these changes will go unnoticed if they are not attended to. This has been taken as strong confirmation that visual awareness is not the result of the passive and obligatory registration of sensory information impinging on the retina, but rather, it is by essence an active exploration of the visual scene (O'Regan and Noe, 2001).

Another useful example is the classic Ponzo illusion (e.g., Murray et al., 2006), illustrated in **Figure 1**. Here, the blue objects in the figure all have the same physical length, but one is usually perceived as taller than the other – even when the observer is fully aware that she/he is being "tricked." But are we really being tricked? Arguably, quite the opposite: the illusion reveals that we are able to make sense of complex sensory information on the basis of ecological plausibility. We are not really interested in the size of the image projected on the retina by the two objects, what is termed the proximal information. Rather, we would like to know what the size of each object is likely to be in the external world, what is termed the distal information. The perspective lines suggest that the second object is located further away than the first one. Somehow, the visual system is able to recognize that fact. As a result, the same proximal length on the retina is "seen" as two different distal sizes. The (useful) distal size is what enters awareness.

<sup>1</sup>A particularly dramatic illustration of the illusion can be found here: http://youtu.be/voAntzB7EwE

Note that, as introspection suggests, this does not seem to be the result of laborious and voluntary reasoning about the laws of optics on the part of the observer: brain imaging showed that even the early stages of visual processing (primary visual cortex, V1) were modulated by the size illusion (Murray et al., 2006). Furthermore, the salience of visual illusions seems to increase with development: the mature visual system is more susceptible to them, perhaps because it "knows" more about the laws of optics thanks to experience (Kovacs, 2000; Doherty et al., 2010). More susceptibility to illusion means more accuracy in interpreting the visual scene.

How is this possible? Illusions have been studied since the beginning of experimental psychology, so any definitive answer would prove incomplete and controversial. The only point we wish to make here is that illusions seem to highlight the ongoing interaction between sensory input, which is noisy and inconclusive by nature, and some knowledge about the world that is embodied in perceptual systems (Barlow, 1997; Gregory, 1997, 2005). The precise way this is achieved is still a matter of debate. In vision, the Bayesian framework, which explicitly takes into account prior knowledge about the structure of the world, has been shown to account for several behavioral and physiological findings (e.g., Kersten and Yuille, 2003; Kersten et al., 2004 for reviews). Interestingly, in this framework, some perceptual illusions actually appear as optimal percepts given some simple prior rules governing physical objects (Weiss et al., 2002). Note that alternative schemes exist, where the observer does not try to make inferences about the state of the world (Purves et al., 2011 for a recent review). Here, perception keeps track of previous experiences in order to disambiguate

future experiences by learning for instance the statistics of natural images. In all frameworks, we would suggest that illusions are an adaptation of perceptual systems to the regularities of the external world.

Illusions thus serve to illustrate a very simple but important idea. Perception is an active construct, more akin to a moment-by-moment gambling process than to a rolling camera or open microphone. In more elegant terms, Helmholtz (1866/1925) famously suggested that perceptual awareness was built from a succession of "unconscious inferences," aiming at producing the hypotheses about the state of the world that are most beneficial for guiding behavior. In the following, we will entertain the view that, as ASA also has to deal with ambiguous sensory data, it can be approached as a problem of perceptual inference.

# **AUDITORY SCENE ANALYSIS: THE PROBLEM AND HOW THE AUDITORY SYSTEM MAY SOLVE IT**

# **THE ACOUSTIC PROBLEM**

Among the various current opinions on ASA, there seems to be at least one that reaches a consensus among investigators: realistic auditory scenes can be rather complicated. At the core, sound is a one-dimensional physical phenomenon. An acoustic pressure wave impinges on one of our eardrums and it can only do one of two things: it can push the tympanic membrane a little bit, or it can pull it a little bit. Moreover, there is no occlusion between acoustic waves originating from different sources: as waves propagate through the air, they sum linearly at each point. As a consequence, at any moment in time, the little push or pull on the eardrum may be caused by one sound source out in the world, but it may also be caused by two sound sources, or by many sound sources, the number of which is unknown (**Figure 2**). This is what is known as an ill-posed problem in mathematics. There are too many unknowns (in fact, an unknown number of unknowns) for too few observations. The problem cannot be solved without further assumptions.
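A toy illustration of why linear summation makes the problem ill-posed: in the sketch below, two different hypothetical sets of sources produce exactly the same pressure values at the ear. The pure-tone sources, their frequencies and amplitudes are assumptions chosen for simplicity, not a model of any real scene.

```python
import math

def tone(freq_hz, amp, t):
    """Pressure contribution of one hypothetical pure-tone source at time t (s)."""
    return amp * math.sin(2.0 * math.pi * freq_hz * t)

def scene_one_source(t):
    # Interpretation A: a single source at 440 Hz with amplitude 1.0.
    return tone(440.0, 1.0, t)

def scene_two_sources(t):
    # Interpretation B: two separate sources at 440 Hz that sum linearly.
    return tone(440.0, 0.3, t) + tone(440.0, 0.7, t)

# Compare the two scenes at 80 sample times (8 kHz sampling, assumed).
times = [n / 8000.0 for n in range(80)]
max_diff = max(abs(scene_one_source(t) - scene_two_sources(t)) for t in times)
```

Here `max_diff` is zero up to floating-point rounding, so the pressure values alone cannot tell the two scenes apart; only additional assumptions about how sources usually behave can favor one interpretation.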

Each author has, at one point or another, tried to convey the intricacy of auditory scenes by a metaphor. Helmholtz (1877) evokes the interior of a nineteenth century ball-room, complete with "a number of musical instruments in action, speaking men and women, rustling garments, gliding feet, clinking glasses, and so on." He goes on to describe the resulting soundfield as a "tumbled entanglement of the most different kinds of motion, complicated beyond conception." His choice of metaphor may not have been totally neutral. The complexity of natural acoustic scenes is clearly large in general, but that of *musical* acoustic scenes can be positively daunting. Consider for instance the picture of **Figure 3A**, a photograph taken before the première of Gustav Mahler's eighth symphony. This work has also been dubbed the "Symphony of a Thousand," an obvious reference to the size of the orchestra and choir. An illustration of the resulting acoustic waveform (Sound Example S1 in Supplementary Material) is presented in **Figure 3B**. It seems impossible to infer, from there, how many sources were present and what they were doing.

But is the auditory system really expected to make sense of each and every one of the acoustic sources? This is not the case, fortunately. Mahler has a thousand potential acoustic sources at the tip of his finger when writing his score, but he will in fact use various

devices to create only a tractable number of concurrent auditory objects (this tractable number may be rather low for the listener, Brochard et al., 1999). He does that by means of what could rightly be termed auditory illusions (we know there are many sources, we hear only one melody). This is a first hint of the intricate connections between ASA, illusions, and music, to which we will come back later.

# **THE SENSORY PROBLEM**

Auditory processing begins with the transduction of the one-dimensional physical motion caused by acoustic waves into patterns of nervous activity in the auditory system. Because of the biophysics of the cochlea, the sensory receptor for hearing, this acoustic information is first broken into frequency sub-bands. The detailed mechanics of this transformation are beyond the scope of this paper, but for a review, see Pickles (2008). This so-called tonotopic organization is broadly preserved along the auditory pathway up to at least the primary auditory cortex.

When applied to a piece of music, tonotopic organization produces a representation similar to the illustration of **Figure 3C**. Tonotopy seems to help reveal patterns that were not obvious from the sound-wave. However, it also produces challenges of its own: now, the energy produced by a single sound source will be spread over several frequency channels and, consequently, will recruit distinct sets of sensory neurons. The problem that ASA has to solve can thus be rephrased as follows: given the flow of sensory information from the cochlea, which resembles a time–frequency analysis, the listener must determine the likely combination of physical sources in the world. Unfortunately, this is still an ill-posed problem. An exact solution being impossible, perceptual gambling must begin.

#### **CUES TO ASA**

A vast amount of psychophysical data has been accumulated on the topic of ASA (sometimes also referred to as the cocktail party

**FIGURE 3 | The problem faced by auditory scene analysis. (A)** Music creates acoustic scenes with a large number of potential sound sources, as illustrated by this picture taken before the American premiere of Mahler's eighth symphony – dubbed the "Symphony of a Thousand." *Image source: Wikipedia*. **(B)** The acoustic waveform of the first few seconds of Mahler's eighth symphony (see also Sound Example S1 in Supplementary Material). At any moment in time, the information available to the auditory system is the pressure value at the ear. This value may be reflecting vibrations from an unknown number of physical objects. The challenge of auditory scene analysis is to infer the most likely distal causes that account for the proximal pressure values. **(C)** Cochleogram of the waveform in **(B)**. The picture was obtained by passing the acoustic waveform through a model simulating the early stages of auditory processing (Shamma, 1985). The acoustic information is now spread over a two-dimensional time–frequency plane, as would be the case in the tonotopic channels of, e.g., the auditory nerve. The challenge is now at least twofold: to group all the activity belonging to one source and only to that source, even though it may be spread over many tonotopic channels; and to bind over time the activity produced by a given source. In spite of the problem being ill-posed, humans and other animals are remarkably good at solving it and we are able to follow, e.g., a single speaker in a noisy environment. However, in the case of music, scene analysis usually fails: we cannot hear out each and every singer of the choir, even though they are distinct sound sources. This is precisely one of the points of the paper: how composers steer the inherent ambiguity of auditory scene analysis to achieve "illusory" musical effects.

problem, Cherry, 1953). A classic book also exists, which gave its modern name to the field (Bregman, 1990). More recent reviews are available (Carlyon, 2004; Snyder and Alain, 2007; Shamma and Micheyl, 2010). Here we will not go into any details, but rather sketch two possible accounts of ASA while emphasizing the role of inference processes in both of them.

Bregman (1990) suggested that ASA may be broken into two sub-problems. The first one is local in time and is termed vertical organization. Vertical refers to the frequency axis of **Figure 3C**: at any given moment in time, ASA needs to parse the distribution of energy over frequency channels into its plausible distal cause(s). The issue is twofold: acoustic sources are in general complex, so they produce activation over several auditory channels. It is thus important to be able to recognize these channels as belonging to one source. Also, pressure waves originating from different acoustic sources add up with each other, so it may be useful to be able to parcel out the contribution of each source to any given patch of activity in the time–frequency plane.

The general principle of ASA for Bregman seems to be one of perceptual inference based on a heuristic ensemble of cues, each expressing a little knowledge about the way the acoustic world usually behaves. For instance, a cue to vertical organization is harmonicity. Harmonicity refers to the fact that any periodic sound can be represented by a stack of pure tones, and that these tones will exhibit a harmonic relationship: their frequencies will all be integer multiples of a fundamental frequency, *f*<sub>0</sub>, corresponding to the inverse of the repetition period. Harmonicity is a strong cue: it would be a remarkable coincidence if several independent distal sources satisfied the harmonic relation just by chance. On the contrary, a harmonic relation is the obligatory correlate of any periodic sound. Accordingly, when we hear harmonic series, perceptual awareness is overwhelmingly that of a single source and not of a disparate collection of pure tones. However, there are many instances of natural sources which do not produce fully periodic sounds and hence exact harmonic series (piano strings for instance). So, the harmonicity cue must include some tolerance (Moore et al., 1986). There are many other cues to vertical organization, each having a strong or weak effect on the perceptual outcome: location (Darwin, 2008), onset synchrony (Hukin and Darwin, 1995), spectral regularity (Roberts and Bailey, 1996), to cite a few. Importantly, just as is the case for harmonicity, none of the cues is infallible and all are potentially corrupted by noise.
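The harmonicity cue with tolerance can be sketched as a simple test of whether a set of component frequencies lies close to integer multiples of a candidate fundamental. The function `is_harmonic` and its 3% relative tolerance are illustrative assumptions: the value is arbitrary and is not taken from Moore et al. (1986), and the sketch assumes the candidate fundamental is already known.

```python
def is_harmonic(components_hz, f0_hz, tolerance=0.03):
    """Return True if every component lies within a relative `tolerance`
    of some integer multiple of the candidate fundamental `f0_hz`."""
    for freq in components_hz:
        # Nearest harmonic number (at least 1) for this component.
        n = max(1, round(freq / f0_hz))
        target = n * f0_hz
        if abs(freq - target) / target > tolerance:
            return False
    return True

exact = [200.0, 400.0, 600.0, 800.0]      # perfectly periodic sound
stretched = [200.0, 401.0, 603.0, 806.0]  # slightly inharmonic, piano-like
mistuned = [200.0, 400.0, 648.0, 800.0]   # third component mistuned by 8%

results = [is_harmonic(c, 200.0) for c in (exact, stretched, mistuned)]
```

With these assumed values, the exact and slightly stretched series are accepted as a single harmonic source, whereas the series with one strongly mistuned component is rejected, which is the kind of graded tolerance the cue needs.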

The other sub-problem of ASA is termed, predictably, horizontal organization. It refers to the horizontal time axis of **Figure 3C**. Acoustic sources tend to extend over time, and sound sources do not necessarily produce energy in a continuous fashion. It seems nevertheless advantageous to consider a series of footsteps as a single source, and not to reset the perceptual organization of the scene for each step. For horizontal organization, a putative distal source is also called a "stream." Musical melodies are a prime example of streams.

The cues to horizontal organization, again, seem to follow the plausibility principle. Because of the physics of sound production, an acoustic source will tend to be slowly varying over time. It is unlikely that two consecutive sounds produced by the same source, such as a single talker, will display in rapid succession wide differences in pitch, timbre, or location. Streams will thus favor the grouping together of sounds that are perceptually similar, and segregate sounds which are perceptually dissimilar. Any similarity cue seems to be able to subserve streaming (Moore and Gockel, 2002). Just like for vertical organization, the precise degree of dissimilarity that can be tolerated within a single stream seems to be highly variable, but more on that in Section "Visual Bistability."

Recently, what seems to be a radically different account of ASA has been proposed (Elhilali et al., 2009; Shamma et al., 2011). It suggests that there is one simple and general principle that could govern the formation of auditory streams. The general idea is that sound is analyzed through a multitude of parallel neural channels, each expressing various attributes of sound (periodicity, spatial location, temporal and spectral modulations, etc.). The problem of ASA is then to bind a sub-set of those channels together, with the aim that all channels dominated by a given acoustic source will be bound together and, if possible, not bound with channels dominated by other sources. The suggested principle is *temporal coherence* between channels (as measured by correlation over relatively long time windows). Coherent channels are grouped as a single stream, whereas low coherence indicates more than one stream.
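The temporal coherence principle can be illustrated with a deliberately simplified sketch: each channel is reduced to an envelope sampled over time, coherence is measured as the Pearson correlation between envelopes, and channels are grouped greedily. The threshold of 0.8, the greedy grouping rule and the toy envelopes are assumptions made for illustration; this is not the model of Elhilali et al. (2009) or Shamma et al. (2011).

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def group_channels(envelopes, threshold=0.8):
    """Greedy grouping: a channel joins the first group whose reference
    channel it correlates with at or above `threshold`; otherwise it
    starts a new group (a new putative stream)."""
    groups = []
    for idx, env in enumerate(envelopes):
        for group in groups:
            if pearson(envelopes[group[0]], env) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups

# Toy scene: source A is active on even time steps, source B on odd ones.
source_a = [1.0 if t % 2 == 0 else 0.0 for t in range(20)]
source_b = [1.0 if t % 2 == 1 else 0.0 for t in range(20)]
envelopes = [source_a, [0.5 * v for v in source_a],
             source_b, [2.0 * v for v in source_b]]
groups = group_channels(envelopes)
```

In this toy case, channels driven by the same source vary together and are bound, while the two alternating sources are anti-correlated and end up in separate groups, in the spirit of "low coherence indicates more than one stream."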

In spite of many differences between these two frameworks, we would argue that there is a core connection between them, when one considers the need for perceptual inference for ASA. This is explicit in Bregman's case, as organization cues are based on the usual behavior of sound sources (even though the neural implementation of each heuristic is not always specified). In contrast, the coherence model does not seem to be easily construed as an inference model: it does not explicitly store knowledge about the external world, nor does it include a "decision" stage. However, coherence is a direct and simple way to embody neurally a plausibility principle. Indeed, a single source will tend to induce coherent changes in all channels, irrespective of the nature of the channel. Moreover, these changes will be incoherent with those of other sources. Thus, more often than not, binding coherent channels will lead to isolating single acoustic sources. Note that coherence will never be a perfect trick, either: as soon as there is noise or more than one source in a given channel, choices need to be made among the likely candidates for binding.

This brief account of current issues in ASA is obviously oversimplified. In particular, we have not mentioned the crucial importance of learning and familiarity on the ability to extract information from a scene (e.g., Bregman, 1990; Bey and McAdams, 2002; Agus et al., 2010; McDermott et al., 2011), the role of attention (Thompson et al., 2011), or the strong multi-modal influences on the formation of perceptual objects (e.g., Suied et al., 2009). There are also many open issues that remain to be solved. However, we would argue that the general picture that emerges is that ASA truly behaves as if it were an inference process relying on a variety of sensory cues. These cues are evaluated from the proximal acoustic wave and concomitant neural activity, but they are also weighted with respect to their physical plausibility by means of a form of embodied knowledge (not necessarily explicit and not necessarily operating in a top-down manner) of some of the laws of the acoustics of sound sources.

# **BISTABLE ILLUSIONS AS TOOL TO PROBE THE PHENOMENOLOGY OF SCENE ANALYSIS**

### **AMBIGUITY AND ASA**

We have mentioned that a wide variety of cues can influence ASA. These cues must somehow be pooled to produce a single perceptual outcome that is able to guide behavior. Often, most cues point toward a highly plausible hypothesis in terms of the

number and nature of the distal sources. However, in many cases, the cues can also provide incomplete or even conflicting evidence. For instance, approaching footsteps can be obscured by background noise, but a single decision must be reached as to whether to act or ignore. In fact, because of the very nature of ASA as an ill-posed problem, it can be argued that, fundamentally, the system cannot be fully certain of the distal information so there is *always* some degree of ambiguity to be resolved.

### **VISUAL BISTABILITY**

This is where perceptual illusions based on ambiguity enter the picture. The examples of **Figure 4** illustrate what is termed bistable perception in vision: an unchanging stimulus presented for a certain amount of time evokes spontaneous perceptual alternations in the mind of the observer. As the physical description of the stimulus does not match its subjective experience, bistable perception can rightly be termed an illusion. A variety of bistable illusions have been described by visual scientists. Reversible figures such as **Figure 4** are bistable (Long and Toppino, 2004). Binocular rivalry, where incompatible images are presented to each of the two eyes, also produces alternations between images (Helmholtz, 1866/1925; Alais and Blake, 2005). Finally, there are bistable motion stimuli such as moving plaids (Hupé and Rubin, 2003)<sup>2</sup>.

These illusions are seemingly very diverse, but they all have two important things in common. First, they present the visual system with a profoundly ambiguous situation. The information that reaches the retina for **Figure 4** may well have been caused by two faces looking at each other, or, alternatively, by a vase. Second, it seems that confronted with such an insoluble dilemma, the perceptual system's response is to explore in turn the different possible interpretations (and not to consider an "average" interpretation, as two faces and a vase whose contours match exactly is a highly unlikely situation). This is not an obvious outcome: it may well have been possible to imagine that the two alternative interpretations would have been simultaneously available to awareness, but apparently this is not the case. A moment-by-moment decision seems unavoidable.

Recent investigations of visual scene analysis have made extensive use of bistability illusions (for reviews, Leopold and Logothetis, 1999; Long and Toppino, 2004; Sterzer et al., 2009). This enduring interest is perhaps because bistability highlights fundamental processes involved in perceptual organization. As we argued for ASA, sensory scenes contain by necessity some degree of ambiguity. The problem of "inverse optics," just as "inverse acoustics," is ill-posed. Our perceptual systems constantly operate in this inference regime, but we are generally not aware of it because, fortunately, one highly plausible interpretation usually trumps all the others. That this interpretation mostly corresponds to reality is an impressive sign of the sophistication of perception, and not of the simplicity of the problem (as attempts at artificial vision and audition remind us). With this in mind, bistability can be seen as a clever trick by neuroscientists to study the general inference processes that are always at work in perceptual organization.

As an aside, it is interesting to consider the kind of neural models that have been proposed for visual bistability. Whereas some studies implicate higher brain regions such as pre-frontal cortex, which are most naturally associated with decision and inference (Sterzer and Kleinschmidt, 2007), there are also formalisms based on low-level competition between incompatible percepts (Lankheet, 2006) or non-linear neural dynamics (Kelso, in press) that achieve the same phenomenology. This highlights the fact that the "decision processes" we refer to here may take many different forms when implemented with neurons, some of them bottom-up and largely automatic.

#### **AUDITORY BISTABILITY**

The history of auditory bistability is much more modest than that of visual bistability, but recent studies suggest that it may also provide a useful experimental probe for ASA. A surprisingly simple paradigm already reveals the existence of auditory bistability: in its various forms, the "streaming paradigm" uses only two pure tones of different frequencies, arranged in repeating patterns (**Figure 5**). Depending on the frequency and time difference between the tones, listeners report either grouping all

<sup>2</sup>Demonstrations for the plaid stimulus are available at: http://audition.ens.fr/sup/.

tones together in a single melody (a single stream) or splitting the sequence in two concurrent melodies (two streams). Early on, it was noticed that perception of one or two streams could change across stimulus presentations for the same listener and even within a single presentation (van Noorden, 1975; Bregman, 1978). It had thus been remarked that streaming presents a "striking parallel" with apparent motion, a visual stimulus that is bistable (Bregman, 1990, p. 21). However, the changes in percept were usually assumed to be under the volitional control of the listener (van Noorden, 1975) and were not until recently subjected to the range of experimental and theoretical tools applied to visual bistability.
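The stimulus of the streaming paradigm is simple enough to be specified in a few lines. The sketch below builds the tone frequencies of a repeating high–low–high pattern such as the one in **Figure 5**, using the equal-tempered relation in which a step of *n* semitones multiplies frequency by 2^(n/12). The 440 Hz low tone and the silent gap after each triplet (coded as `None`) are assumptions made for illustration.

```python
def semitones_up(freq_hz, n_semitones):
    """Frequency obtained by raising `freq_hz` by `n_semitones`
    on an equal-tempered scale."""
    return freq_hz * 2.0 ** (n_semitones / 12.0)

def hlh_sequence(low_hz, delta_semitones, n_triplets):
    """Tone frequencies for a repeating high-low-high pattern.
    `None` marks an assumed silent gap after each triplet."""
    high_hz = semitones_up(low_hz, delta_semitones)
    sequence = []
    for _ in range(n_triplets):
        sequence.extend([high_hz, low_hz, high_hz, None])
    return sequence

small_step = hlh_sequence(440.0, 2, 3)   # two semitones apart
large_step = hlh_sequence(440.0, 11, 3)  # eleven semitones apart
```

Only the frequency difference changes between the two sequences; according to the text, the small difference tends to be heard as one galloping melody and the large one as two concurrent melodies.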

In fact, auditory streaming is a perfectly fine instance of bistability, as shown by a formal comparison between the perception of ambiguous stimuli in audition and vision (Pressnitzer and Hupé, 2006). In this study, the auditory stimulus was a streaming sequence (van Noorden, 1975; **Figure 5**), and the visual one consisted of bistable moving plaids (Hupé and Rubin, 2003; **Figure 4**). A common point between the two is that they can be perceptually grouped as a single object (a stream or a plaid), or split in two different objects (two streams or two sets of lines). Also, the "correct" organization is ambiguous. The dynamics of percept alternations in auditory streaming were found to display all of the characteristics that define visual bistability (e.g., Leopold and Logothetis, 1999). Percepts were mutually exclusive, that is, subjects reported successively one or two streams but rarely an intermediate percept between the two. The percept durations were random and followed a log-normal statistical distribution. Finally, the effect of volition was limited and highly similar between modalities: when instructed to try and maintain one perceptual interpretation in mind, observers were unable to lengthen the average duration of the target interpretation; rather, they were only able to shorten the duration of the unwanted interpretation. Other authors have independently strengthened the case for auditory streaming as a bistable phenomenon (Denham and Winkler, 2006; Kashino et al., 2007). Interestingly, in those studies, bistability for streaming seemed to be the rule rather than the exception as it could be observed over a surprisingly broad range of stimulus parameters.
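To give a feel for what log-normally distributed percept durations look like, the following sketch draws random durations from a log-normal distribution using the Python standard library. The parameters `mu` and `sigma` (mean and standard deviation of the logarithm of the duration) are hypothetical values chosen for illustration; they are not estimates from Pressnitzer and Hupé (2006) or any other study cited here.

```python
import math
import random
import statistics

def simulate_percept_durations(n, mu=1.5, sigma=0.6, seed=0):
    """Draw `n` percept durations (arbitrary time units) from a
    log-normal distribution with hypothetical parameters."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

durations = simulate_percept_durations(10000)

# For a log-normal variable, the logarithm of the duration is normally
# distributed, so its sample mean should be close to `mu`.
log_durations = [math.log(d) for d in durations]
mean_log = statistics.mean(log_durations)
median_duration = statistics.median(durations)
```

A typical signature of such a distribution is its asymmetry: most simulated percepts are short, with occasional much longer ones, so the mean duration exceeds the median.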

An apparently unrelated example of auditory bistability can be found in a paradigm termed verbal transformations (Warren and Gregory, 1958). Listeners were presented with a rapid sequence of

**FIGURE 5 | The auditory streaming paradigm. (A)** Two different notes (two tones at different frequencies) are presented in a repeated high–low–high (HLH) pattern. Here, the frequency difference between the notes is relatively small (two semitones), so when played at a reasonable tempo, the perception is usually of a single melody with a galloping rhythm (HLH–HLH–HLH...). **(B)** Same as A, but with a larger frequency difference (11 semitones). In this case, many listeners experience the perception of two concurrent melodies (H–H–H–... and –L–L–L...) that can be attended individually, but not simultaneously. The sound demonstrations are also available at http://audition.ens.fr/sup/

repeated verbal material, typically syllables or words (e.g., "life life life"). After prolonged exposure, most listeners reported subjective alternations between the original material and some transformed speech forms (e.g., switches between "life life life" and "fly fly fly"). Warren and Gregory (1958) argued that verbal transformations were the auditory analog of reversible figures in vision. Recently, Sato et al. (2007) and Kondo and Kashino (2007) confirmed that the dynamics of switches between speech forms were similar to other examples of bistable stimuli.

These examples suggest that, despite large acoustic differences, many of the stimuli used to study ASA may share a common point. When in the right regime, the decision processes involved in ASA are revealed by bistable perceptual switches, which display strikingly similar characteristics across all stimuli.

# **BISTABLE ILLUSIONS AS TOOL TO PROBE THE NEURAL BASES OF ASA**

A major interest of the bistability paradigm for neurophysiologists is that it dissociates the subjective report of the observer from the external stimulus. If a neural correlate of the changes in perceptual reports can be found, then it cannot be confounded by some passive propagation of the stimulus statistics (as those are unchanging). Rather, the correlate must be related to a brain network involved in creating the percept that reached awareness (Tong et al., 2006).

Auditory bistability is being used to investigate the neural correlates of ASA, and in particular the neural correlates of streaming. Overall, results indicate that neural correlates of bistable percepts during streaming may be found at many levels of the auditory pathways. Gutschalk et al. (2005) for instance played a long-duration streaming sequence and asked their listeners to report continuously on their perception of one or two streams. Magneto-encephalography (MEG) revealed that the event-related fields evoked by the tones in the sequence differed if the subjective perception was that of one or two streams, for the same physical stimulus. Localization of the source of the fields suggested that the effect originated from secondary auditory cortex. Cusack (2005) used the same auditory bistability paradigm with an fMRI technique. He observed differential activity for one or two streams in the intra-parietal sulcus, a locus beyond the main auditory pathways associated among other things with cross-modal processing. Using a similar paradigm but focusing on the moment of the perceptual switches, Kondo and Kashino (2009) found switch-related activations in primary auditory cortex, but also in an earlier processing stage, the auditory thalamus. Schadwinkel and Gutschalk (2011) investigated streaming based on subjective location differences and, together with cortical activation, found a correlate of bistable switches in the auditory midbrain (inferior colliculus). Single-unit recordings for streaming based on frequency, in the bistable regime, suggest correlates in primary auditory cortex (Micheyl et al., 2005) but also as early as the cochlear nucleus, the first synapse after the auditory nerve (Pressnitzer et al., 2008). Because of the nature of the technique, these last two studies fall short of co-registration of bistable percept with neural activity, but they do show that the average temporal dynamics of changes in percepts due to bistability is found at the earliest stages of the auditory hierarchy.

When auditory bistability is based on verbal transformation, yet other types of correlates have been found, this time involving fronto-parietal networks implicated in speech (Kondo and Kashino, 2007; Basirat et al., 2008). A recent study has directly compared the two types of auditory bistability in the same subjects (Kashino and Kondo, in press). It confirmed that, even though the auditory bistability networks overlap to some extent, notably for thalamic and primary cortical areas, they also differ to reflect the nature of the competition (speech forms versus tone frequency and rhythm).

This brief overview shows that a bewildering array of neural correlates has been claimed for auditory bistability, encompassing many levels of the auditory pathways. This absence of a single locus is reminiscent of the current view of visual bistability (Tong et al., 2006; Sterzer et al., 2009). It could be that technical details account for the differences between studies. However, it could also be that the bistability processes for ASA are indeed applicable to many levels of processing and hence well-suited to a distributed neural implementation: ASA is such an essential function for hearing that its basic mechanisms seem to be pervasive throughout the auditory system. In any case, it seems that bistable illusions are now firmly part of the experimental assay for the investigation of ASA.

# **SOME MUSICAL ILLUSTRATIONS**

After this brief review of ASA and the use of ambiguity illusions by neuroscientists, it is now time to turn back to music. Before delving into specific examples, a few general points should be made. We have already suggested that musical auditory scenes have the potential to be the most complicated acoustic mixtures encountered by human listeners, because of the sheer number of different acoustical sources involved. We have then mentioned a few of the cues that are considered reliable for ASA, based on physical plausibility. For vertical organization for instance, tones that are synchronous and that satisfy a harmonic relation are highly unlikely to come from different sources, as the likelihood of such a chance combination is really small. But not so in music: in fact, if the composer so decides, and provided the performers are skilled enough, it is entirely possible to have a collection of different sources playing in synchrony and following harmonic ratios (a choir, for instance). For horizontal organization, it is highly unlikely that a single source widely and rapidly changes its pitch and timbre<sup>3</sup>. But not so in music: musical instruments covering a broad pitch range (e.g., the piano) or even the voice (think human beat-box) can be used to such effect.

What happens then to the subjective experience of the listener? The reasonable assumption is that the general rules of ASA described above still apply, but that the outcome of perceptual inference may or may not reflect the physical description of the scene. The musician can attempt to facilitate the emergence of one or more distinct melodic lines from an otherwise complicated acoustic mixture, or on the contrary to promote the illusory fusion of many sources into one perceptual object. This could be described as steering the inherent ambiguity of ASA toward one of several possible perceptual interpretations. Interestingly, when enjoying music, the listener may be especially willing to entertain different solutions to the ASA problem as there is no obviously harmful potential consequence to making a mistake (as opposed to failing to detect footsteps in the savannah).

Let us now survey what may feel like a haphazard collection of musical examples, borrowed from different genres and historical periods. The eclecticism is intentional, and it is in fact only limited by space constraints and by the music collection of the different authors. It is our hypothesis that similar examples would be found in many musical traditions, including of course non-Western ones.

# **THE ART OF VOICE-LEADING**

A lot of music around the world, starting from what is arguably the most valuable kind of all, lullabies, involves a single acoustic source. However, perhaps because of the social value of music (McDermott and Hauser, 2005; Fitch, 2006), there are also many examples across cultures of musical ensembles involving more than one source. Musicians may then wish to avoid the acoustic muddle that would result from a random superposition of sound sources and try to simultaneously express several distinct melodic lines. This is what is termed polyphony. In Western music, it has been conceptualized through numerous treatises, providing various kinds of advice on how to best achieve "voice-leading."

A particularly fascinating example of voice-leading is to be found in the Musical Offering from J. S. Bach (**Figure 6**). The circumstances of the composing of this piece are worth repeating. The title refers to a single melodic line that King Frederick II of Prussia presented to Bach, perhaps as a challenge to his composing skills. The theme can be seen and heard at the beginning of the example of **Figure 6** and Sound Example S2 in Supplementary Material. From this royal "offering," Bach was reportedly able to improvise on the spot a polyphonic canon involving several voices. Later on, he returned a score containing several variations on the theme, including the *tour de force* that is the "Ricercar, a 6." In parts of this later canon, six different melodic lines are present.

To help the listener distinguish between the voices, the writing takes advantage of many of the rules of ASA (Huron, 2001). For instance, synchronous harmonic intervals are carefully avoided to prevent fusion between voices, while the pitch steps within a voice are relatively small to promote stream formation. These are two of the most potent cues to vertical and horizontal grouping, as we have seen above. Many more subtle rules of ASA also seem to be enforced in the music of Bach, as discussed in Huron (2001).

In addition, and perhaps revealingly, some of the canons of the Musical Offering are known as "ambiguous canons," bearing the religious inscription "Quaerendo invenietis" (Seek and you shall find). We may speculate on another meaning of this inscription. Indeed, as we have seen, ASA is by nature ambiguous, and especially so for such complex architectures as those imagined by Bach. The listener is thus left to his own devices to solve the perceptual riddles contained in the music. In the twentieth century, Anton Webern paid tribute to this masterpiece of controlled ambiguity by orchestrating it (Sound Example S3 in Supplementary Material). By assigning different timbres to parts of the canon, he suggests to the listener his own reading of the intricate polyphony, as there were no indications of instrumentation on the original score. In his own words: "My orchestration aims at uncovering the relations between motifs. [...] Is it not about awakening what is still asleep, hidden, in this abstract presentation that Bach gave and which, because of that, did not exist yet for many people, or at least was completely unintelligible?" (Letter to Hermann Scherchen, own translation).

<sup>3</sup>Timbre is notoriously difficult to define. Here, it is meant as "the timbre of a given sound source," including the co-variations in spectral shape and temporal envelope that accompany changes in pitch for most musical instruments.

monodic melody, with a single voice. After a few bars, however, additional melodic lines can be seen to appear. Voice-leading becomes increasingly intricate as the music progresses (see also Sound Examples S3 and S4 in Supplementary Material).

#### **THE ART OF FUSION**

With more than one instrument, it is also possible to aim at the opposite effect: blending all the different sources into a coherent ensemble where they eventually become indistinguishable. Early church music (plain-chant) for instance aimed at fusing all singers into a single melodic line. Later on, fusion between different instruments became the realm of orchestration. Any work written for a symphonic orchestra provides examples of complex sonorities achieved by the perceptual fusion of a mixture of acoustic instruments. String quartets are subtler examples: a talented quartet may seemingly oscillate between perfect osmosis between the parts and clearly distinct melodic lines. The illustration we chose in Sound Example S4 in Supplementary Material is taken from the work of Gil Evans, who took the art of "arranging" the instruments of a jazz big-band into a rich orchestral palette to a particularly elegant level. In this example, the compound timbre of the orchestra converses with the soloist in a natural fashion. However, it would be perfectly impossible for the listener to describe the exact combination of instruments that is making up the orchestral "chimera" (Bregman, 1990).

#### **ILLUSIONS AS MUSICAL DEVICES**

Using the rules of ASA to promote fusion across instruments or, on the contrary, to create distinct voices may be described as implicitly relying on auditory illusions (not all instruments may be heard, and, conversely, not all melodies are produced by a single physical instrument). There are also composers who have made explicit use of illusions as a structural principle for their writing (Risset, 1996; Féron, 2006). Composers known as proponents of "spectral music" built a whole method from the ASA paradox of breaking down the spectral content of natural sounds, which are usually perceived as single sources, to then write complex chords heard as orchestral timbres, thus fusing instruments that are usually heard individually (Grisey and Fineberg, 2000; Pressnitzer and McAdams, 2000).

But the work of György Ligeti in particular bears the mark of perceptual illusions as musical devices in their own right. Take for instance the two pieces illustrated in Sound Examples S5 and S6 in Supplementary Material. In the case of "Lontano," many instruments are fused into a slowly evolving texture and it is extremely difficult to isolate any one of them. In Ligeti's own words, "Polyphony is written but one hears harmony". The same orchestral configuration is used for the "San Francisco Polyphony," but here, the various instruments are individually heard, with indeed a feeling of a rich polyphony. Through these two pieces, most of the rules of ASA are used to create dramatically different perceptual outcomes with the same orchestra. This use of auditory illusions was a fully planned and deliberate musical esthetic, as stated by Ligeti himself (Sabbe, 1979): "Yes, it is true, I often work with acoustical illusions, very analogous to optical illusions, false perspectives, etc. We are not very familiar with acoustical illusions. But they are very analogous and one can make very interesting things in this domain."

#### **BISTABILITY IN MUSIC**

As it turns out, the bistability illusions that neuroscientists have only recently started to use appear almost *verbatim* in some musical pieces. The bistability of the streaming paradigm is the basis of what has been termed pseudo-polyphony or implied polyphony. Implied polyphony consists in leading several concurrent voices with a single instrument. Among the numerous possible examples, here again it is easiest to refer to the work of Bach (Davis, 2006). In his "Suites for solo Cello," the music played by the largely monodic instrument incorporates interleaved musical lines. The segregation of successive notes into two or more concurrent melodies is achieved by relying on the usual cues to streaming and, most notably as there is a single timbre, on the frequency and time intervals between notes (**Figure 7**, Sound Example S7 in Supplementary Material). Here it is interesting to note that the example of **Figure 7** incorporates a broad range of frequency intervals, starting from one that should promote grouping and ending on one that should promote streaming. In the middle is the ambiguous range where the listener may explore various organizations.

Finally, the verbal transformation paradigm bears a strong resemblance to the work of minimalist composers such as Steve Reich or Philip Glass, where a few musical elements are often recycled and re-used until the perception of the listener subtly changes. The link with verbal transformation is particularly obvious for the piece "It's gonna rain" by Steve Reich, which consists largely of a tape loop of this single utterance. The perceptual effect is much richer than suggested by this factual description, because of the multistable alternations between speech forms that emerge during listening. Verbal transformations have also found their way into popular music, as illustrated by Sound Example S8 in Supplementary Material where Carl Craig uses the device to create a tense and shifting atmosphere preparing the appearance of an unambiguous beat.

# **A COHERENCE MODEL OF ASA APPLIED TO MUSICAL ANALYSIS**

To close the loop between the neuroscience of ASA and music, we now apply a recent computational model of ASA to the analysis of some of the sound examples discussed above. The model is that of coherence (see Subsection "Cues to ASA"). Its implementation details are available elsewhere (Elhilali et al., 2009). Briefly, the computational architecture is inspired by the known neurophysiology of the auditory system. It postulates that the auditory system first decomposes the sound into different frequency bands, as occurs in the cochlea. Subsequently, these bands are used to construct parallel channels estimating elementary spectro-temporal attributes, which are combinations of temporal and spectral modulations (Chi et al., 2005). Finally, a pair-wise correlation of all attributes is performed to obtain what is termed the coherence matrix.

The coherence matrices for Sound Examples S5 and S6 in Supplementary Material, Ligeti's pieces of Section "Illusions as Musical Devices," are illustrated in **Figure 8** and in the Movies S1 and S2 in Supplementary Material. Each cell in the matrix represents a pair-wise correlation, with warm colors indicating non-zero coefficients. The diagonal is the correlation of a channel with itself, thus it indicates the overall energy of the sound. The off-diagonal elements are what matters for ASA: as explained earlier, a single source tends to have all its attributes temporally modulated in unison. Consequently, when these attribute channels are pair-wise correlated, the resulting patterns in the coherence matrix appear highly regular and sparse. When many incoherent attributes are driving the channels, as would be the case for many independent sources, the coherence matrix has weak and diffuse off-diagonal activation.
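The pair-wise correlation step can be illustrated with a toy sketch. This is not the implementation of Elhilali et al. (2009): the cochlear and spectro-temporal stages are skipped, and the "channels" are synthetic envelopes invented for the illustration. The function names (`coherence_matrix`, `mean_offdiag`), the 100 Hz envelope rate and the 2 Hz common modulator are all assumptions. The sketch only contrasts channels that are modulated in unison with channels that are modulated independently; it does not attempt to reproduce the sparse structure seen in **Figure 8**.

```python
import numpy as np

def coherence_matrix(envelopes):
    """Pair-wise (zero-lag) correlation of channel envelopes; rows are channels."""
    return np.corrcoef(envelopes)

def mean_offdiag(matrix):
    """Average absolute value of the off-diagonal coefficients."""
    n = matrix.shape[0]
    mask = ~np.eye(n, dtype=bool)
    return float(np.abs(matrix[mask]).mean())

rng = np.random.default_rng(0)
n_channels, n_samples = 8, 2000
t = np.arange(n_samples) / 100.0  # hypothetical 100 Hz envelope rate, 20 s long

# "Fused" toy scene: every channel follows the same slow modulator,
# with a channel-specific gain and a little independent noise.
common = 1.0 + np.sin(2 * np.pi * 2.0 * t)
fused = common * rng.uniform(0.5, 1.5, size=(n_channels, 1))
fused = fused + 0.05 * rng.standard_normal((n_channels, n_samples))

# "Polyphonic" toy scene: each channel has its own independent envelope.
independent = rng.standard_normal((n_channels, n_samples))

c_fused = coherence_matrix(fused)
c_indep = coherence_matrix(independent)

print("mean off-diagonal, unison modulation:     ", mean_offdiag(c_fused))
print("mean off-diagonal, independent modulation:", mean_offdiag(c_indep))
```

In this toy setting the off-diagonal coefficients are close to one when the channels share a modulator and close to zero when they do not, which is the contrast the coherence matrix is meant to expose.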

The coherence matrices predicted by the model are strikingly different for Lontano (**Figure 8A**) and the San Francisco Polyphony (**Figure 8B**). For Lontano, an exceptional degree of coherence is observed and the matrix is sparse and ordered. This is because numerous instruments play the same temporal rhythm and slow melodic progression, although with different pitches and timbres. For the aptly named San Francisco Polyphony, it is the exact opposite, as many instruments play independent melodies with independent rhythms. This parallels the different impressions conveyed by these two pieces, with the first sounding like a single rich harmony and the latter like many voices that never coalesce.

**FIGURE 8 | An illustration of the coherence model of ASA applied to music.** The coherence matrices (see text, A Coherence Model of ASA Applied to Musical Analysis) are displayed for excerpts of two pieces by György Ligeti, Lontano [**(A)**, Sound Example S5 in Supplementary Material] and the San Francisco Polyphony [**(B)**, Sound Example S6 in Supplementary Material]. Coherence has been averaged for the first 20 s of each piece, and the displays are organized by frequency channels. As the qualitative difference between panels suggests, the perceptual organization experienced in the two cases is radically different (even though the composition of the orchestra stays basically the same).

The potential usefulness of these displays stems from their ability to illustrate how complex sound scenes would tend to be perceived by listeners (Shamma et al., 2011). Specifically, when the coherence patterns are highly ordered, it implies that attending to any one attribute points to all others and binds them together (mathematically analogous to the feature reduction attained by principal component analysis of the matrix). By contrast, when many channels are uncorrelated, attending to one attribute does not link it to any other, and hence all voices remain independent. Thus, the range of possible perceptual organizations is somewhat constrained by the form of the coherence matrix.
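The analogy with principal component analysis can be made concrete on two idealized matrices. The sketch below is only an illustration of that analogy under stated assumptions: the helper `leading_component_share` is hypothetical, the two 6 × 6 matrices are limiting cases rather than model outputs, and the resulting number is not offered as an estimate of how many sources a listener would hear.

```python
import numpy as np

def leading_component_share(coherence):
    """Fraction of the total variance carried by the largest eigenvalue."""
    eigenvalues = np.linalg.eigvalsh(coherence)  # ascending order for symmetric input
    return float(eigenvalues[-1] / eigenvalues.sum())

n = 6
fully_coherent = np.ones((n, n))  # every channel perfectly correlated with every other
fully_independent = np.eye(n)     # no correlation between distinct channels

share_coherent = leading_component_share(fully_coherent)
share_independent = leading_component_share(fully_independent)

print("share of first component, coherent case:   ", share_coherent)
print("share of first component, independent case:", share_independent)
```

When all channels are mutually correlated, a single component accounts for essentially all of the variance, so selecting one attribute effectively selects them all. When the channels are uncorrelated, the variance is spread evenly and no component binds the channels together.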

It is probably too early to suggest a quantitative use of such models in musicological analysis. In particular, it is not easy in the general case to reduce the matrix to a single measure that estimates the number of perceived sound sources. This is perhaps related to the fact that, because of ASA ambiguity and bistability, there is usually no single perceptual answer. However, the qualitative analysis presented here seems sufficient to show that Ligeti used to great effect one very potent principle of ASA, coherence, in order to achieve the perceptual "illusions" he was so keenly interested in.

# **CONCLUDING REMARKS: THE ESTHETIC VALUE OF AMBIGUITY?**

From our brief tour of ASA, it seems clear why neuroscientists would be interested in perceptual illusions based on ambiguity: they seem to highlight the ongoing inference processes at work during perceptual organization and thus may serve as useful probes to investigate the architecture of the system. But why have so many musicians, from different styles, apparently chosen to play with ambiguity as an integral part of their composing devices?

The short answer is of course that we do not know for sure. It is our hypothesis here that a lot of the music available gravitates around a sweet spot including some degree of perceptual ambiguity (with counter-examples of blindingly obvious organizations, of course, as is always the case with artistic endeavors, so the hypothesis is to be taken in the statistical sense). Future research, perhaps using computational models such as the one we have outlined, will have to substantiate the claim. In the meantime, a few anecdotal observations are consistent with a role of ambiguity in the appreciation of music. First, perhaps more than with any other art form, it seems that we are incredibly keen to listen over and over again to the exact same piece of music<sup>4</sup>. Why is this so? Repeated listening comes with an enhanced ability to uncover musical streams that may have been missed the first time around. Memories (or schemas, in Bregman's terms) are sure to form through repeated exposure (Agus et al., 2010). They can then help to pull out streams from complex scenes, so repeating the same piece over and over allows one to explore its inherent ambiguity. Second, music is also one industry where customers are happy to spend good money to purchase the same work several times, but from different performers. Concert-goers do the same, too. In French, performers are called "interprètes." It reminds us that, from the score, several readings are possible. It is likely that musical interpretation often plays with ASA: highlighting the clarity of the voices, or, on the contrary, seeking fusion between parts.

It could finally be further speculated that there is something deep in the fact that we seem to look for ambiguous auditory scenes when creating and listening to music. Zeki (2004), discussing ambiguity and visual art, pointed out that there is fundamentally no ambiguity without perception. Physical information is just there; ambiguity only arises when a perceiver is trying to "make sense" of this information. Music is an especially challenging stimulus to make sense of, as most of it is abstract without any clear reference to an external object. By embedding several latent perceptual organizations into complex acoustical scenes, music may well be able to challenge the listener with a rich set of possibilities that can be freely entertained, with no other potential consequence than being surprised, rejoiced, or moved.

# **REFERENCES**

of speech forms. *Neuroimage* 42, 404–413.

#### **ACKNOWLEDGMENTS**

We would like to thank Robert Zatorre for editing the manuscript, as well as the two reviewers for numerous insightful comments. This work was supported by the Agence Nationale de la Recherche (ANR-BLAN-08 Multistap, Daniel Pressnitzer); the Fondation Pierre Gilles de Gennes pour la Recherche (Clara Suied); the Chaire d'Excellence Blaise Pascal and the program Recherche à Paris (Shihab A. Shamma).

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://www.frontiersin.org/human\_neuroscience/10.3389/fnhum.2011.00158/abstract

All sound examples are at http://audition.ens.fr/dp/Frontiers\_Sound\_Examples/

**Sound Example S1 |** Gustav Mahler, Symphony no 8 in E flat major, 1907. *Veni, creator spiritus.* Chicago Symphony Orchestra, direction: Sir Georg Solti, 1972. Decca Legends.

**Sound Example S2 |** Johann Sebastian Bach, The Musical Offering, 1747. *Ricercar, a 6.* Musica Antiqua Köln, direction: Reinhard Goebel, 1979. Archiv Production.

**Sound Example S3 |** Johann Sebastian Bach/Anton Webern, The Musical Offering, 1935. Orchestration of the fugue no 2 from the Musical Offering. London Symphony Orchestra, direction: Pierre Boulez, 1991. Sony Classical.

**Sound Example S4 |** Miles Davis/Gil Evans, George Gershwin's Porgy and Bess, 1958. *Summertime.* Columbia.

**Sound Example S5 |** György Ligeti, Lontano für großes Orchester, 1967. Sinfonie-Orchester des Südwestfunks, Baden-Baden, direction: Ernest Bour, 1969. Wergo.

**Sound Example S6 |** György Ligeti, San Francisco Polyphony für Orchester, 1973/74. Sinfonie-Orchester des Schwedischen Rundfunks, direction: Elgar Howarth, 1977. Wergo.

**Sound Example S7 |** Johann Sebastian Bach, Suites for solo Cello in C major, ca. 1720. *Suite no 3 in G major.* Janos Starker, 1965. Mercury Living Presence.

**Sound Example S8 |** Carl Craig, More songs about food and revolutionary art, 1997. *Dominas.* Planete SSR.


<sup>4</sup>We would like to thank I. Nelken for pointing out this fact quite vividly, by suggesting to ask ourselves how many times we have seen our favorite movie compared to how many times we have listened to our favorite piece of music.


(2007). "The dynamics of auditory streaming: Psychophysics, neuroimaging, and modeling," in *Hearing – From Sensory Processing to Perception*, eds B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Uppenkamp, and J. Verhey (Berlin: Springer), 275–283.

Kelso, J. A. S. (in press). "Multistability and metastability: understanding dynamic coordination in the brain," in *Multistability in Perception: Binding Sensory Modalities*, eds J. L. Schwartz, N. Grimault, J. M. Hupé, B. C. J. Moore, and D. Pressnitzer.


stream segregation. *Acta Acustica United with Acustica* 88, 320–332.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 June 2011; accepted: 16 November 2011; published online: 14 December 2011.*

*Citation: Pressnitzer D, Suied C and Shamma SA (2011) Auditory scene analysis: the sweet music of ambiguity. Front. Hum. Neurosci. 5:158. doi: 10.3389/fnhum.2011.00158*

*Copyright © 2011 Pressnitzer, Suied and Shamma. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits noncommercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **SEARCHING FOR ROOTS OF ENTRAINMENT AND JOINT ACTION IN EARLY MUSICAL INTERACTIONS**

*Jessica Phillips-Silver<sup>1</sup>\* and Peter E. Keller<sup>2,3</sup>\**

*<sup>1</sup> International Laboratory for Brain, Music and Sound, Montreal, Quebec, Canada*

*<sup>2</sup> Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany*

*<sup>3</sup> MARCS Institute, University of Western Sydney, Australia*

#### *Edited by:*

*Robert J. Zatorre, McGill University, Canada*

#### *Reviewed by:*

*Robert J. Zatorre, McGill University, Canada; Edward W. Large, Florida Atlantic University, USA; Erin Hannon, University of Nevada Las Vegas, USA*

#### *\*Correspondence:*

*Jessica Phillips-Silver, International Laboratory for Brain, Music and Sound, Pavillon 1430 Boulevard Mont-Royal, H2V 4P3, QC, Montréal, Canada. e-mail: jessica.phillips-silver@umontreal.ca; Peter E. Keller, Max Planck Research Group "Music Cognition & Action", Max Planck Institute for Human Cognitive and Brain Sciences, 04103 Leipzig, Germany. e-mail: keller@cbs.mpg.de*

When people play music and dance together, they engage in forms of musical joint action that are often characterized by a shared sense of rhythmic timing and affective state (i.e., temporal and affective entrainment). In order to understand the origins of musical joint action, we propose a model in which entrainment is linked to dual mechanisms (motor resonance and action simulation), which in turn support musical behavior (imitation and complementary joint action). To illustrate this model, we consider two generic forms of joint musical behavior: chorusing and turn-taking. We explore how these common behaviors can be founded on entrainment capacities established early in human development, specifically during musical interactions between infants and their caregivers. If the roots of entrainment are found in early musical interactions which are practiced from childhood into adulthood, then we propose that the rehearsal of advanced musical ensemble skills can be considered to be a refined, mimetic form of temporal and affective entrainment whose evolution begins in infancy.

**Keywords: music, joint action, entrainment, ensemble skills, development, dance**

**INTRODUCTION**

What joint human behavior reveals greater coordination in time and affect – or entrainment – than music? When individuals join together in almost any musical behavior, ranging from a simple tune shared between a young child and caregiver, to a rhythmically complex performance of a Cuban Jazz band (**Figure 1**), their joint action is characterized by entrainment. The term entrainment (from the Middle French *entrainer*, v., to draw, drag) is used across diverse domains to refer generally to spatiotemporal coordination between two or more individuals, often in response to a rhythmic signal (e.g., Phillips-Silver et al., 2010). Used in musical contexts (including dance), entrainment typically refers to the rhythmic synchronization as a product of individuals' interaction (Clayton et al., 2005) – that is, rhythmic coordination with simple (e.g., 1:1 in-phase or anti-phase) or more complex (e.g., 2:3 or 3:4 polyrhythmic) phase relations.

One component of entrainment is thus temporal. Temporal entrainment to music can be observed at hierarchical levels of metrical periodicity in the body (Leman and Naveda, 2010; Naveda and Leman, 2010; Toiviainen et al., 2010) and brain (Large and Jones, 1999; Large and Palmer, 2002; London, 2004; Jones, 2009; Nozaradan et al., 2011). People often experience temporal musical entrainment in the automatic, even uncontrollable head-bobbing or toe-tapping that occurs when they listen to musicians play (or the covert activation of motor areas of the brain in the absence of overt movement), while the height of temporal entrainment might be seen in the complex rhythmic timing and exchange between partners or ensemble members in music or dance (Keller et al., 2007; Keller, 2008; Goebl and Palmer, 2009) as portrayed by the band in **Figure 1**.

A second component of entrainment derives from the mutual sharing of an affective state between individuals. Authors have even spoken of a kind of affective "synchrony," although we prefer the term entrainment in order to distinguish the temporal and affective components while encapsulating the essence of both. Therefore, for the purpose of this paper we will refer to both temporal and affective entrainment. Affective entrainment involves the formation of interpersonal bonds (see Feldman, 2007) and is related to the pleasure in moving the body to music and being in time with others (known in the vernacular as "groove"; Pressing, 2002; Madison, 2006; Janata et al., 2012). Janata et al. (2012) have put forth that temporal and affective components of entrainment in music might be inextricably linked, as, for example, the quality of sensorimotor synchronization (temporal) can predict the degree of experienced groove (affective), and affiliation (Hove and Risen, 2009). In searching for the biological roots of temporal and affective entrainment, early musical interactions between infants and their caregivers might provide the first experiences in entrainment (e.g., Malloch and Trevarthen, 2009) and thus provide a foundation for the development of skills in musical joint action (**Figure 1**).

**FIGURE 1 | Illustrations of temporal and affective entrainment.** The skills of musical ensembles such as a Cuban Jazz band embody temporal entrainment and musical joint action. Affective entrainment can be seen in early forms of joint action between an infant and caretaker, and might provide a foundation for the development of musical skills.

In joint action in general, multiple individuals coordinate their movements in space and time, with a goal to communicate, or to effect a change in the environment (see Sebanz et al., 2006). Here we take on the term "coordination" commonly used in this literature because movements between co-actors might be coupled in either a synchronized or complementary fashion, in time, or in space. Studies of the spatiotemporal aspects of interpersonal joint action reveal that coordination in joint action can be emergent or planned (Knoblich et al., 2011). Emergent coordination occurs spontaneously due to automatic processes that are grounded in perception and action couplings in the brain (Knoblich et al., 2011). In a non-musical example, when two conversational partners are engaged, they show a coordination of body sway that is unintentional and does not require planned action (Shockley et al., 2003). Unintentional coordination of body sway can also occur between duetting musicians (Keller and Appel, 2010), an example that will be discussed in more detail later. The emergent process can also be described in terms of concepts from dynamical systems theory (e.g., coupled oscillators; see Schmidt and Richardson, 2008; Oullier and Kelso, 2009; Riley et al., 2011). In contrast, planned coordination requires shared representations of the intended outcome of the joint action and a plan for each individual's own role in the joint action. This can be seen in a variety of activities in which multiple individuals engage, including spoken conversation, or carrying a heavy object together (e.g., Clark, 1996, 2005; Knoblich and Sebanz, 2008; Bekkering et al., 2009; Sebanz and Knoblich, 2009). In music, both emergent (automatic head-bobbing) and planned (playing and dancing along) coordination reflect entrainment and can exert simultaneous influences on the individuals' actions.
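The coupled-oscillator description of emergent coordination can be sketched with a deliberately minimal model. The example assumes two Kuramoto-type phase oscillators with symmetric sinusoidal coupling, which is a generic textbook choice and not necessarily the model used in the works cited above. For such a pair, the phase difference φ obeys dφ/dt = Δω − 2K sin φ, where Δω is the mismatch in preferred tempo and K the coupling strength. The function name, the 0.1 Hz mismatch and the integration settings are all hypothetical.

```python
import math

def simulate_phase_difference(delta_omega, coupling, dt=0.001, duration=60.0, phi0=1.0):
    """Euler integration of d(phi)/dt = delta_omega - 2 * coupling * sin(phi)."""
    n_steps = round(duration / dt)
    phi = phi0
    for _ in range(n_steps):
        phi += dt * (delta_omega - 2.0 * coupling * math.sin(phi))
    return phi

# Hypothetical mismatch of 0.1 Hz between the two partners' preferred tempi.
delta_omega = 2.0 * math.pi * 0.1

# With coupling, the phase difference settles at a constant value (entrainment).
locked = simulate_phase_difference(delta_omega, coupling=1.0)

# Without coupling, the phase difference keeps growing (the partners drift apart).
uncoupled = simulate_phase_difference(delta_omega, coupling=0.0)

print("final phase difference with coupling:   ", locked)
print("final phase difference without coupling:", uncoupled)
```

In this sketch a stable phase relation appears whenever the mismatch is small relative to the coupling (|Δω| ≤ 2K), without either oscillator "planning" the outcome, which is the sense in which such coordination is called emergent.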

How do we experience emergent and planned coordination in musical interactions where entrainment involves demands on temporal synchronization and complex rhythmic timing? To understand how these forms of musical joint action arise requires an excursion into several different research fields. The same terminology is occasionally used to different ends in these fields, and part of our task consists of an attempt to reconcile this with the aim of merging these diverse fields on common ground (see **Box 1** for a glossary of our terminology and definitions). We begin in the next section by identifying two primary generic modes of musical joint action, chorusing and turn-taking, and their underlying entrainment mechanisms. Then in the section entitled "Ensemble Skills in Musical Joint Action" we consider how these mechanisms are crystallized in specific cognitive and motor skills that support musical ensemble performance. We then discuss in the section entitled "The Beginnings of Musical Joint Action" some of the potential ontogenetic roots for the cognitive and motor systems required in musical joint action. In the final section, we speculate on a possible mimetic relationship between developmental trajectories in social coordination skills and musical ensemble skills.

# **COORDINATION MODES IN MUSICAL JOINT ACTION: CHORUSING AND TURN-TAKING**

Chorusing and turn-taking are canonical modes of joint action that exemplify emergent and planned coordination. Chorusing occurs when communicative signals (sounds or movements) produced by separate individuals make simultaneous and roughly equal contributions to the joint action as a whole (as in synchronous chorusing observed in several species; Merker, 2000). For example, in multi-part music, monophonic (unison) and homophonic (chordal) textures can be considered to be instances of chorusing. In dance, the "chorus line" and the "corps de ballet" provide definitive examples of chorusing. Turn-taking involves the ordering of communicative signals produced by separate individuals in such a way that there is little temporal overlap (i.e., serial organization) or, when there is overlap, information in the signal produced by only one individual has priority at any given time (hierarchical organization). Musical turn-taking occurs in antiphonal "call and response" textures (as in Afro-American blues and gospel music; Williams-Jones, 1975; Waterman, 1999) and, more generally, when instruments or dancers exchange roles, taking turns at leading and accompanying, such as happens in a Baroque fugue.

The distinction between chorusing and turn-taking is based on the temporal relations between actions of interacting individuals. Musical joint actions, however, have behaviorally relevant spatial, as well as temporal, components. Spatiotemporal movements shape dance forms and define the embodiment of metrical hierarchies (Leman, 2007). In a musical melody, the pitch intervals and contours define trajectories through mental representations of pitch space that can correspond cross-modally to physical space (Rusconi et al., 2006; Lidji et al., 2007; Keller and Koch, 2008; Eitan and Timmers, 2010). To account for such spatiotemporal relations, chorusing and turn-taking can be classified with respect to two further general classes of action: imitation and complementary joint action (**Figure 2**).

**Box 1 | A glossary of terms as they pertain to the concepts of musical joint action in this paper.** Terms are listed in the order they appear in text.

**Entrainment:** The spatiotemporal coordination between two or more individuals, often in response to a rhythmic signal (e.g., Phillips-Silver et al., 2010). Examples include playing music or dancing together, in which temporal entrainment is particularly important, and infant–caregiver interactions, in which affective entrainment is often the primary goal (see **Figure 1**). Temporal and affective components of entrainment often coincide in musical joint action, and might be inextricably linked (Janata et al., 2012).

**Joint action:** Social interaction wherein multiple individuals coordinate their behaviors in space and time to communicate or to effect a change in the environment (see Sebanz et al., 2006). Two types of joint action, emergent coordination and planned coordination, are described below.

**Emergent coordination:** A type of joint action that occurs by spontaneous, automatic processes that are grounded in links between perception and action.

**Planned coordination:** A type of joint action that, in addition to basic entrainment, requires shared representations of the intended outcome of the joint action, such as in musical ensemble playing.

**Chorusing:** The production by separate individuals of simultaneous sounds or movements that signal joint action, such as in a chorus of voices, the beat of a drum circle, or a dance chorus line.

**Turn-taking:** The production by separate individuals of alternating or dovetailing sounds or movements that signal joint action, such as in "call and response" or musical fugue.

**Imitation:** An intentional or unintentional spatiotemporal mirroring of the actions (or their effects) of one individual by another. Examples occur in a choral unison or a group's entrainment to a musical pulse.

**Complementary joint action:** An action achieved by partners with a common goal, and a systematic but non-identical spatiotemporal relation between the partners' actions. An example occurs in hierarchical turn-taking as heard in the trading of licks in a Jazz band.

**Motor resonance:** The automatic, bottom-up, exogenously driven activation of movement-related brain areas (e.g., premotor and sensorimotor cortex). Examples include an infant's non-conscious mimicry of the posture or affective expression of an adult or another infant.

**Action simulation:** The controlled, top-down activation of sensory and movement-related brain areas, that does not necessarily rely upon exogenous stimulation. Examples include the mental imagery of a sequence of musical sounds or dance movements.

**Mimesis:** An aesthetic representation or recreation of a physical reality. We propose an example of mimesis in the process of musical practice and expertise, which recalls and builds upon the stage of learning affective and temporal entrainment during infancy (see **Figure 1**).

Imitation entails one individual mirroring the spatiotemporal features of another's movements or their effects (see Brass and Heyes, 2005), often in a goal-directed manner (Sebanz et al., 2006). Imitation may occur intentionally, such as when a musician or dancer incorporates a pre-existing sequence in a variation on a theme, or unintentionally, as when two duetting players inadvertently copy each other's body sway or the effects of that sway on expressive parameters of the sound that they are producing (Keller and Appel, 2010). Imitation may occur at varying time lags, ranging from short intervals that give the impression of simultaneity (as in unison chorusing, or moving bodies or instruments to a common musical pulse) to longer intervals at which the model behavior and the copy do not overlap (serial turn-taking). Complementary joint action, which requires more sophisticated partnership skills, occurs when spatiotemporal features of one individual's behavior are different from, but systematically related to, those of another individual (see Bekkering et al., 2009). This is exemplified in non-unison chorusing and hierarchical turn-taking.

We assume that entrainment supports imitation and complementary joint action through two brain mechanisms that capitalize on links between perception and action. Specifically, imitation is mediated by motor resonance while complementary joint action requires internal simulation processes that may involve mental imagery. Motor resonance occurs when brain regions that control movement – including motor, premotor, and sensorimotor cortices – are activated automatically by the observation of another's movements (Rizzolatti and Craighero, 2004). This process may lead to non-conscious mimicry, where individuals unwittingly adopt the facial expressions, mannerisms, or postures of interaction partners (Chartrand and Bargh, 1999). Motor resonance may contribute to the perception of musical pulse (van Noorden and Moelants, 1999; Grahn and Brett, 2007; Chen et al., 2008; Large, 2008) and, in both music and dance, may promote readiness for temporally coordinated, planned action.

Motor resonance evoked during the perception of music and dance does not only function in the service of action, but may also modulate aesthetic responses (Calvo-Merino et al., 2008; Kornysheva et al., 2010; Cross and Ticini, 2011). Motor representations of practiced actions result in stronger motor resonance in adults as well as in infants (Calvo-Merino et al., 2006; van Elk et al., 2008), which suggests that motor resonance is tuned by experience and, when combined with perceptual resonance (Schütz-Bosbach and Prinz, 2007a), can contribute to action simulation.

**FIGURE 2 | Diagram illustrating dual routes by which entrainment supports mechanisms and behaviors in musical joint action.** One route leads to emergent coordination (e.g., automatic imitation and chorusing) via motor and/or perceptual resonance. The other route leads to planned coordination (e.g., complementary joint actions involving turn-taking) through action simulation. The vertical arrows in the figure represent relations between mechanisms and behaviors within each route, while horizontal arrows represent potential areas of overlap between the routes (e.g., imitative behavior may arise through resonance alone or through a combination of resonance and simulation). The dotted arrows imply that the behaviors and mechanisms at each level are not mutually exclusive, and may exist in hybrid forms (e.g., non-unison chorusing and hierarchical turn-taking).

Action simulation, as conceived here, is richer than motor resonance to the extent that it involves the activation of sensory, in addition to movement-related, brain regions (cf. Schütz-Bosbach and Prinz, 2007a). Specifically, action simulation occurs when sensorimotor neural processes that resemble those associated with executing an action are engaged in the absence of overt movement (see Gallese et al., 2004; Wilson and Knoblich, 2005; Decety and Grèzes, 2006). Such covert activity may be triggered by observing – or by merely imagining – an action or its effects, for example, body movements in the case of dance (Cross et al., 2006), or tones in the case of music (Keller, 2008). This highlights what we believe to be a fundamental distinction between motor/perceptual resonance and action simulation: While resonance is driven exogenously and automatically (i.e., preattentively) by the perception of external events, simulation may be generated endogenously in the absence of external stimuli (cf. Grush, 2004) and may be modulated by attention.

Action simulation is a mark of expertise, as it is mediated by experience-based associations between sensory and motor processes (Haueisen and Knösche, 2001; Baumann et al., 2005; Bangert et al., 2006; Zatorre et al., 2007). Auditory–motor simulation is, therefore, especially strong in musicians (Lahav et al., 2007) and visuo-motor simulation is particularly potent in dancers (Calvo-Merino et al., 2005, 2006; Cross et al., 2006). It has been proposed that action simulation facilitates the understanding of others' intentions and affective states, as well as playing a role in predicting an observed action's immediate outcome and future course (e.g., Wilson and Knoblich, 2005; Decety and Grèzes, 2006; Schütz-Bosbach and Prinz, 2007b; Sebanz and Knoblich, 2009). In musical joint action, predictions based on simulation can facilitate interpersonal coordination in the context of musical textures characterized by chorusing and turn-taking (e.g., Keller et al., 2007; Keller and Appel, 2010).

Instances of unison chorusing – for example, singing a tune such as Auld Lang Syne, or moving one's body in matched fashion with other individuals as in a group folk dance – can be considered to be forms of imitation to the extent that co-actors adopt leader or follower roles. Spatiotemporal interpersonal coordination at this level can especially reveal the contribution of motor resonance triggered exogenously by perceptual input, in addition to the simulation and action-planning required to execute the musical unison. A complex example that illustrates the contribution of multiple cognitive–motor processes is the Afro-Brazilian Congado – a musical ritual of religious significance that entails the simultaneous performance of separate but proximal communal groups of singers, percussionists, and dancers. A rigorous analysis of Congado performances (Lucas et al., 2011) has shown that inter-group entrainment is influenced by proximity and visual contact, similarity of tempo, and intention (i.e., whether proximal groups resist entrainment), factors which reveal contributions of emergent entrainment and imitation, as well as simulation and action-planning in planned coordination. Finally, during turn-taking, the serial or hierarchical ordering of signals produced by different individuals (e.g., when "trading licks" in jazz or exchanging melody and accompaniment roles in chamber music) constitutes complementary joint action. Effective coordination during turn-taking therefore requires the participating individuals to predict the spatiotemporal trajectories of each other's actions via simulation.

The cognitive and motor demands of chorusing and turn-taking vary as a function of the degree to which these modes of musical joint action require imitation or complementary joint action. The scheme in **Figure 2** captures this through the use of arrows to delineate potential links between the two levels of behavior (chorusing/turn-taking and imitative/complementary) and entrainment-based mechanisms (motor/perceptual resonance and action simulation). Note that musical joint action may adopt hybrid forms that house elements of both imitative and complementary joint action (e.g., when one improvising musician imitates another's rhythm while altering the pitches). In the next section we consider in more detail the cognitive and motor processes that serve advanced forms of musical joint action, such as skilled ensemble performance, through links to basic mechanisms of entrainment, motor resonance, and action simulation.

# **ENSEMBLE SKILLS IN MUSICAL JOINT ACTION**

Musicians coordinate their actions with precision and flexibility. To engage in such planned coordination, whether it entails turn-taking or chorusing, ensemble musicians in many musical traditions invest considerable time into rehearsing together in order to form shared representations of the ideal sound (see Davidson and King, 2004; Ginsborg et al., 2006; Davidson, 2009). According to Vesper et al. (2010), the "minimal architecture" supporting generic joint action includes shared task representations and processes related to action monitoring, prediction, and behavioral modifications that simplify coordination. The latter "coordination smoothers" involve modulations of an individual's own behavior that render the action more predictable and make it easier for others to coordinate with it (Vesper et al., 2010). In music performance, exaggerated movements associated with breathing, body sway, and ancillary performance gestures such as head nods (Goebl and Palmer, 2009; Keller and Appel, 2010; Keller, forthcoming) may serve as smoothers. Once shared goals are established, these musicians rely upon three core cognitive–motor ensemble skills that are based on entrainment, and allow each individual to coordinate with the actions of co-performers in real time (see Keller, 2008).

The most fundamental ensemble skill is adaptive timing, or adjusting the timing of one's movements in order to maintain synchrony in the face of unintended temporal perturbations (e.g., by other players) and intended tempo changes in the music. Adaptive timing is mediated by temporal error correction mechanisms that enable internal timekeepers (i.e., neural oscillations at timescales relevant to musical meter) to remain coupled, or entrained in stable phase relations with external signals (Repp, 2005; Large, 2008; Repp and Keller, 2008). Temporal error correction processes that operate automatically support emergent coordination, while others that require attention may be invoked in contexts requiring planned coordination, such as when an ensemble musician intentionally adapts to the expressively motivated tempo changes of a co-performer (Keller, 2008). Research on adaptive timing has yielded evidence for assimilation in interpersonal action timing during dyadic sensorimotor synchronization. This has been observed in joint finger-tapping tasks that require basic in-phase coordination akin to chorusing (Konvalinka et al., 2010), as well as in tasks that require individuals to produce movements in alternation, which involves turn-taking. In these studies, cross-correlation analyses revealed interpersonal dependencies in the timing of co-actors' finger taps that were suggestive of mutual assimilation. Temporal assimilation may be a form of imitation that facilitates ensemble cohesion by making multiple individuals sound collectively as one.
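
The logic of mutual temporal error correction can be made concrete with a toy simulation. The sketch below is illustrative only and is not the model or analysis used in any study cited here: it assumes a simple linear phase-correction rule in which each of two tappers shifts the next tap by a fixed fraction of the asynchrony just observed, and it adds a plain lagged Pearson correlation of the kind that a cross-correlation analysis of two timing series relies on. The function names, the correction gains (`alpha_a`, `alpha_b`), and the numerical values are hypothetical.

```python
"""Toy model of mutual adaptive timing between two tappers (illustrative only)."""


def simulate_mutual_correction(alpha_a, alpha_b, period_ms,
                               initial_asynchrony_ms, n_taps):
    """Return the asynchrony (tap A minus tap B, in ms) at each of n_taps taps.

    Assumed rule: each tapper schedules the next tap one period after the
    current one, shifted by a fraction (alpha) of the observed asynchrony.
    """
    tap_a, tap_b = float(initial_asynchrony_ms), 0.0
    asynchronies = []
    for _ in range(n_taps):
        asynchrony = tap_a - tap_b
        asynchronies.append(asynchrony)
        # Positive asynchrony means A tapped after B: A shortens its next
        # interval and B lengthens its own, so the two move toward each other.
        tap_a, tap_b = (
            tap_a + period_ms - alpha_a * asynchrony,
            tap_b + period_ms + alpha_b * asynchrony,
        )
    return asynchronies


def lagged_correlation(x, y, lag):
    """Pearson correlation between x[i] and y[i + lag]."""
    if lag >= 0:
        xs, ys = x[: len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[: len(y) + lag]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5
```

Under these assumptions, with both gains set to 0.25 and an initial asynchrony of 40 ms, the asynchrony halves on each tap (40, 20, 10 ms), because the two corrections add. A high correlation at a lag of one interval, in either direction, is the kind of pattern that would suggest one partner's timing follows the other's.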

A more cognitively advanced skill is prioritized integrative attending, that is, simultaneous attention to one's own actions (high priority) and those of others (lower priority) while monitoring the integrated ensemble sound (Keller, 1999, 2001). Prioritized integrative attending thus relies both on an individual's ability to divide attention between different sound sources, and on the group's joint attention skills (to the extent that multiple individuals must attend to the aggregate structure that results from their coordinated actions). This variety of joint attention is particularly important in music characterized by complex turn-taking. For example, to perform the interlocking rhythms of Central African music, according to Nketia (1962), each player must have a general awareness of the resulting aggregate pattern, as well as the "knack" for coming in at the right moment. It has been proposed that the dynamic allocation of attentional resources that is required for prioritized integrative attending relies on entrainment (Keller, 2008).

Specifically, the coupling of internal timekeepers to periodicities associated with the music's metric structure may provide a temporal schema that allows attention to be modulated in a manner that is optimal for monitoring multiple levels of the musical texture concurrently (Keller, 1999; London, 2004; Jones, 2009). Evidence that such metric schemas facilitate prioritized integrative attending has been found in studies in which musicians are required to memorize one instrumental part, as well as the aggregate structure, of multi-part rhythm patterns (Keller and Burnham, 2005). For example, in one experiment listeners were required simultaneously to memorize a target (high priority) part and the overall aggregate structure (resulting from the combination of parts) of short percussion duets. The key finding was that recognition memory for both aspects of each duet was influenced by how well the target part and the aggregate structure could be accommodated within the same metric framework.

The third ensemble skill is anticipatory imagery, which involves the use of mental imagery in planning the production of one's own sounds and predicting upcoming sounds of co-performers. Auditory or motor imagery may thus assist in anticipating others' actions with a view to optimizing ensemble cohesion (Keller, 2008). In support of this hypothesis, it has been shown that individual differences in synchrony between pianists in duos are positively correlated with performance on a task designed to measure the vividness of anticipatory auditory imagery (Keller and Appel, 2010). With regard to the mechanisms underlying this relationship, it has been claimed that anticipatory imagery relies on entrainment-based sensorimotor coupling to drive internal models that trigger mental images during action simulation (Keller, 2008). These internal models, which allow sensorimotor transformations between bodily states and events in the immediate environment to be represented in the brain, are harnessed during action simulation to generate predictions about an action's future course (Wolpert et al., 1998, 2003). The accuracy and stability of sensorimotor synchronization in the context of real and virtual instances of interpersonal coordination have been shown to depend on temporal prediction abilities (Wöllner and Cañal-Bruland, 2010; Pecenka and Keller, 2011), which are, in turn, positively correlated with temporal imagery abilities (Pecenka and Keller, 2009a,b).
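
To illustrate why temporal prediction can aid synchronization with a changing tempo, the following sketch contrasts two hypothetical strategies for estimating a co-performer's next inter-onset interval: tracking, which simply repeats the most recent interval, and prediction, which extrapolates the most recent change. This is a deliberately minimal illustration under our own assumptions; it is not the measure of prediction used in the studies cited above, and the function names are invented for this example.

```python
def track_next_interval(intervals):
    """Tracking: expect the next interval to equal the most recent one."""
    return intervals[-1]


def predict_next_interval(intervals):
    """Prediction: extrapolate the most recent change in interval."""
    return intervals[-1] + (intervals[-1] - intervals[-2])


def mean_abs_error(strategy, intervals):
    """Mean absolute error (ms) of a strategy over a sequence of intervals.

    The strategy sees the intervals heard so far and estimates the next one;
    the first two intervals serve only as history.
    """
    errors = [
        abs(strategy(intervals[:i]) - intervals[i])
        for i in range(2, len(intervals))
    ]
    return sum(errors) / len(errors)
```

For a steady slowing of tempo in which each interval is 20 ms longer than the last (e.g., 500, 520, 540, 560, 580 ms), the tracking strategy is always 20 ms short, whereas linear extrapolation incurs no error. Real tempo changes are of course less regular, so the advantage of extrapolation in this sketch should be read as an upper bound.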

Thus a suite of cognitive and motor skills, operating in concert, supports successful joint action in musical contexts via entrainment-based mechanisms that enable diverse forms of chorusing and turn-taking behavior. This is particularly evident in the performances of expert ensembles, such as the Cuban Jazz band depicted in **Figure 1**. Some of these ensemble skills, in addition to coordination smoothers and shared task representations, can be founded on principles of joint action learned very early in life (**Figure 1**). We now turn to the early foundations of musical joint action.

# **THE BEGINNINGS OF MUSICAL JOINT ACTION**

Joint action as a general form of coordinated social action begins with the earliest relationship. Affective entrainment in social contexts seems to be a predisposition of the infant and mother (Trehub, 2003), as in the "intuitive parenting" that helps to regulate the infant's emotional state and guide learning in his pre-verbal environment (Papoušek, 1996b). One of the highlighted aspects of the mother–infant interaction is its musical nature. Musical qualities of typical infant-directed speech include slow and regular tempo, repetition, exaggerated prosody and the accompaniment of movement and gesture. These features may serve as coordination smoothers, as well as functioning to engage the infant's attention (Fernald, 1991), to enable emotional communication (Trainor et al., 2000; Trainor, 2008), and to promote language acquisition (Papoušek, 1996a; Kuhl, 2004).

The infant is responsive to such expressive musical characteristics in the mother's (or other caretaker's) displays, and also participates in "communicative musicality" (Malloch and Trevarthen, 2009; the infant's behavior has also been referred to as "protomusical," e.g., Cross, 2003), for example, in predicting the timing of the mother's expressions and producing responses that are to some degree temporally coordinated (e.g., Crown et al., 2002). Jaffe et al. (2001) define temporal coordination as a form of interpersonal contingency, and they describe interactional (nonperiodic) rhythms which allow an infant and parent to each predict the timing pattern of the other's behavior. This ability to predict timing is necessary in order to eliminate the time lag of a sensorimotor reaction (Merker et al., 2009), and is considered to be essential to bonding within the dyad (Jaffe et al., 2001). In a common game played by adults to engage infants, violation of expectation in timing is used to make the infants laugh, also supporting infants' ability for predictive timing (Stern, 2007). Infants' production of musical actions, as in sung tones or rhythmic movements, begins to emerge during the first year (perhaps especially with encouragement) and can be considered in the context of the communication of the pre-verbal infant (Trehub, in press). Yet before full musical actions emerge, the infants' responsiveness to musicality and the timing of their responses may build upon a repertoire of non-verbal vocal and gestural expressions (Eckerdal and Merker, 2009).

Infants thus engage in a kind of "sympathetic conversation" with their mothers, the timing of which enables them to anticipate and relish the mothers' expressive displays, and causes them distress if timing is mismatched (see Trevarthen and Aitken, 2001). Temporally coordinated actions can be simultaneous, dovetailing, or alternating (Feldman, 2007), though these interactions are not typically strictly periodic, and are stochastic and bidirectional in organization (Cohn and Tronick, 1988). When mother–infant interactions take the form of chorusing, they have been referred to as "coaction" (Dissanayake, 2000). In this context, the behaviors that infants display with their caregivers include social gaze, facial expressions, and vocal behaviors (by 3 months of age), gesture, and shared attention to objects (after 6 months of age; Feldman, 2007). The roughly unison or matched forms of interaction between infants and their caregivers might be attributed to motor resonance (e.g., Meltzoff and Decety, 2003), and reflect emergent coordination behavior. The function of this emergent coordination is to establish affective entrainment, cooperation, and bonding between the infant and her caregiver (Feldman, 2007; Feldman et al., 2007).

Turn-taking in infancy has been studied primarily in the context of observing spoken conversation, and indicates the importance of the development of social cognition and attention. Infants shift gaze or attention as they observe videos of adults engaged in conversational turn-taking (von Hofsten et al., 2005), and gaze shifts become increasingly predictive with age (Bakker et al., 2011). By 4–6 months infants are sensitive to cues of social cognition (such as selective attention to face-to-face interactions) in turn-taking (Augusti et al., 2010), which coincides with the development of their ability to use infant-directed speech cues to choose their preferred social partners (Schachner and Hannon, 2011). Before semantics and syntax are shared between conversational partners, cues to turn-taking are manifest in culturally dependent vocal prosody, eye gaze, and body movements (Wilson and Wilson, 2005) but with a universal target of timing in turn-taking (Stivers et al., 2009). A further cognitive skill in interpretation of conversational turn-taking is anticipating transitions in conversation (i.e., when the next speaker will begin to speak), which does not develop until around 3 years of age and might be influenced by language development (von Hofsten et al., 2009). This anticipatory timing skill might also correspond to improvements in planned coordination of actions (**Figure 2**).

Spatiotemporal imitation appears in the repertoire of young infants – for example in facial (see Meltzoff and Moore, 1997) and manual gestures (Bekkering et al., 2000), as well as affective mirroring, beginning with the first social smiles at just 6 weeks of age (Rochat, 2007) – which are all examples of emergent coordination (**Figure 2**). The timing of imitation is limited by cognitive–motor maturity but is taught in part by caregivers to young children, often via imitation games aimed at encouraging joint action (Gergely et al., 2002; Sebanz et al., 2006; Papoušek, 2007). In early stages imitation is automatic and relies primarily on motor resonance (Paulus et al., 2011), even more strongly in practiced behaviors such as crawling than in novel behaviors (i.e., walking; van Elk et al., 2008). While many actions of young infants may rely on resonance, according to von Hofsten (2004), even from birth such actions are not mere reflexes but can be motivated, informed, and goal-directed. Evidence for this claim includes infants' interest in tracking and imitating the purpose and the outcomes of observed actions (e.g., von Hofsten and Siddiqui, 1993; Gergely et al., 2002; Gergely and Csibra, 2003). The goal-directed nature of infants' actions, revealing planning, prediction, and motor representations, could enable the progression (with muscular and especially cognitive development) from resonant imitation to more complex forms of joint and complementary joint action.

In the earliest joint actions that are performed by infants with adults, the adult typically helps the infant to achieve her goal, in which case a precise motor representation is not required (Vesper et al., 2010). By around 1 year of age, several cognitive changes facilitate joint action. First, joint attention has emerged (see Rochat, 2007), in which individuals knowingly attend to the same object or event. This skill coincides with monitoring of relative shared attention between the infant and his social partner (Rochat, 2007). The 1-year-old understands intentions, which has been argued (Tomasello et al., 2005) to be the basis for understanding beliefs (i.e., theory of mind) – emerging around 15 months (Onishi and Baillargeon, 2005) and continuing to develop with experience with language and shifting of perspective (Tomasello et al., 2005). At 1 year action-planning is evident, as infants show goal-directed eye movements that reflect motor representations (Falck-Ytter et al., 2006). These motor representations are thought to be supported by the mirror neuron system: a brain network that is recruited similarly during action perception and action execution (Falck-Ytter et al., 2006), and is thought to be involved in the interpretation of music and dance in adults (Stevens et al., 2001; Zatorre et al., 2007; Overy and Molnar-Szakacs, 2009).

Beyond the first year emerge the abilities to interpret goal-directed actions in a rational manner (e.g., Gergely et al., 2002), suppress imitative motor representations, and eventually perform complementary joint actions (Sebanz et al., 2006). Motivation for these changes stems in part from "shared intentionality," or the sharing of psychological states in order to reach mutual collaboration (Tomasello and Carpenter, 2007). Planned coordination calls upon the more advanced cognitive–motor skills of precise (and shared) sensory and motor representations, action monitoring, and behavioral modification (cf. Keller, 2008). Presumably, concurrent with the maturation of the above-mentioned cognitive and motor skills (hence less reliance on scaffolding by adults) as well as language development (e.g., von Hofsten et al., 2009), the practice of musical activities during childhood may continue to engage attention and foster coordination skills. For example, prioritized integrative attending, when musical ensemble performers monitor the aggregate sounds (which children's choirs can do to an extent), presumably builds upon the earlier abilities for joint attention and dividing attention between actors, once a shared goal representation is also established. Between the ages of 2.5 and 3 years, the skills of coordination (timing) in joint action show substantial improvement even if individual performance on a task improves only marginally (Meyer et al., 2010). Complementary roles in joint action appear to be mastered from the age of 3 years (similar to linguistic turn-taking; see von Hofsten et al., 2009), as action-planning and control become refined (Meyer et al., 2010). Improvisation in social contexts may have a role in the building of planned coordination capacities upon emergent coordination capacities. For example, turn-taking has been observed in children's vocal play, even in improvised and complementary forms (Dissanayake, 2000), and such vocal play is expressed in virtual social contexts, as in imaginary dialogs and play-acting (see Rochat, 2007).

To achieve temporal entrainment in music, it may be necessary to practice the skill of sensorimotor synchronization – that is, the ability to move in time with perceived external events (Repp, 2005). For example, the clapping games, nursery rhymes, songs, and group dances of school-age children begin to demonstrate the kind of coordination that is required to keep time as a group (e.g., Provasi and Bobin-Begue, 2003), with an external periodic pulse (McAuley et al., 2006), and using multiple levels of metrical hierarchy (Drake et al., 2000). Such music and games, which represent collective social entrainment (Phillips-Silver et al., 2010), may also foster the development of ensemble playing, and play an important role in the improvement of automatic and deliberate adaptive timing skills, attention, and auditory–motor imagery (Keller, 2008).

The ability for precise synchronization seems to mature gradually, probably building on early perceptual abilities for processing the musical beat (Hannon and Trehub, 2005; Phillips-Silver and Trainor, 2005; Winkler et al., 2009). In studies of synchronization of body movement with music or with a musical partner (in children between the ages of 5 months and 5 years), the children's motions – or the sounds produced by them – are not tightly synchronized (phase-locked) to the musical beat (Eerola et al., 2006; Kirschner and Tomasello, 2009; Merker et al., 2009; Zentner and Eerola, 2010). This suggests that sensorimotor synchronization in music is not typically developed until sometime later in childhood or near adolescence (Merker et al., 2009; although cases of exceptional children's musical performances can suggest otherwise, e.g., Merker et al., 2009; Sowinski et al., 2009).<sup>1</sup>
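
What it means for movements to be phase-locked to a beat can be expressed with a standard circular statistic, the mean resultant length. The sketch below is offered as an illustration under our own assumptions and is not necessarily the measure used in the studies cited above: each movement onset is converted to a phase within the beat cycle, and the length of the average unit vector indicates how consistently the onsets fall at the same point in the cycle. The function names and the `beat_offset` parameter are hypothetical.

```python
import cmath
import math


def relative_phases(tap_times, beat_period, beat_offset=0.0):
    """Phase (radians, 0 to 2*pi) of each tap within the beat cycle."""
    return [
        2 * math.pi * (((t - beat_offset) % beat_period) / beat_period)
        for t in tap_times
    ]


def mean_resultant_length(phases):
    """Length of the mean unit vector of the phases.

    Values near 1 indicate taps at a consistent phase of the beat
    (phase-locked); values near 0 indicate no consistent phase.
    """
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)
```

For instance, taps that always fall 50 ms after a 500 ms beat give a value of essentially 1, even though they are consistently late, whereas taps spread evenly around the beat cycle give a value of essentially 0. The measure therefore captures the consistency of timing relative to the beat rather than its accuracy.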

Practice of the affective component of entrainment in joint action is natural, as infants and young children show a predisposition to "groove" to music in social contexts – that is, they are compelled to move to the music and derive pleasure from it (Janata et al., 2012). Infants and toddlers display a variety of dance gestures in response to music (Eerola et al., 2006), and they produce spontaneous dance motions more to music than to speech (Zentner and Eerola, 2010). The social component is clear in that infants' and toddlers' dancing is associated with positive affect (Zentner and Eerola, 2010), and young children's musical drumming is enhanced in a social context (Kirschner and Tomasello, 2009). The development of action simulation in childhood could further facilitate the understanding of others' affective states (Decety and Grèzes, 2006) and complementary joint action as in musical exchange (Kirschner and Tomasello, 2009), especially when the roles are truly complementary.

# **DOES ENSEMBLE PRACTICE SHOW MIMESIS OF EARLY COORDINATION?**

We have proposed that temporal and affective forms of entrainment together support musical joint action by enabling chorusing, turn-taking, and hybrid modes of interpersonal coordination via dual routes. On a low road, emergent coordination arises through motor and perceptual resonance that underlies imitative behavior, while on a high road, planned coordination is achieved with the assistance of covert internal simulations that facilitate complementary joint action. We then described the relevance of these mechanisms and behaviors to ensemble skills that allow experienced individuals to coordinate their actions during group music making and dance, as well as to the consolidation of social coordination skills in human development. In the present section, we speculate that the development of coordinated social action in infancy and early childhood, and the practice of advanced musical ensemble skills at later stages, are mimetic processes. Specifically, the process of acquiring and refining skills pertaining to temporal adaptation, attention, and anticipation in early human development is repeated in the acquisition and refinement of skills that are required for precise and flexible interpersonal coordination in aesthetic displays of music and dance (**Figure 1**). This may be viewed as a form of mimesis where advanced interpersonal coordination skills serving artistic purposes (aesthetic communication through music and dance) are grounded in early coordination skills that serve basic functions (mother–infant bonding, cognitive–motor development, and social and cultural learning, cf. Cross, 2001). Indeed, it has been proposed that both the infant–caregiver relationship and the arts including music and dance reveal a propensity in humans to respond to temporally dynamic social stimuli with emotional affiliation and temporal cooperation (Dissanayake, 2000; Miall and Dissanayake, 2003).

<sup>1</sup>In other atypical cases, synchronization ability may not develop adequately to support normal dance behavior (Phillips-Silver et al., 2011). This is referred to as beat deafness, and future research will shed light on the perceptual and action bases of the disorder.

We have suggested that musical joint action builds upon early manifestations of affective and temporal entrainment. The affective component of entrainment – which is central to interpersonal synchrony in music – is grounded in emotional resonance and affective mirroring in early infancy (Rochat, 2007). This component is arguably the first and foremost in ontogeny of coordinated action and musical communication. For example, imitative forms of chorusing and turn-taking are mastered more readily than complementary varieties of musical joint action. Imitation may also provide the easiest route to the socio-emotional benefits of joint musical experience. Indeed, synchronous, matched motor activity has been found to foster interpersonal affiliation (McNeill, 1995; Hove and Risen, 2009), cooperation and pro-social behavior (Wiltermuth and Heath, 2009; Kirschner and Tomasello, 2010), and even altruism (Valdesolo and DeSteno, 2011).

The temporal component of entrainment emerges in various forms during childhood: through joint attention activities in infancy, the improvement of adaptive timing, and finally anticipatory processes in school-aged children. Formal music training plays a role in developing such capacities for attention and executive functioning (see Hannon and Trainor, 2007), as well as refined auditory–motor interactions (see Zatorre et al., 2007). Hannon and Trainor (2007) have suggested that joint task goals, action monitoring, and synchronization can contribute to the benefits of musical training on cognitive and motor processing abilities in musical joint action. As increasing capacities for imitation and complementary joint action are developed, the cognitive and motor demands of chorusing and turn-taking can be met in musical joint action.

Musicians and dancers often strive to attain the height of temporal and affective entrainment, and so we look to the roots of those behaviors to understand the process by which they are embodied. From the earliest musical exchanges between the infant and mother, temporal and affective entrainment serve more than the primary bond: they lay the groundwork for the refinement of skills in joint action and musicianship. In infants and children, as well as in musical experts via a mimetic process, maturation and musical experience result in entrainment that allows for precision and flexibility in timing, a sense of participation and emotional communion, and a musical aesthetic that reveals the complexity and richness of human interaction.

# **ACKNOWLEDGMENTS**

We thank the reviewers for their helpful comments, and Yoel Diaz for permission to use the ensemble photo in **Figure 1**.

### **REFERENCES**


causes. *Contem. Music Rev.* 22, 79–89.


in synchronization. *Conscious. Cogn.* 16, 102–111.


coordinated rhythmic movement. *Music Percept.* 28, 3–14.


bodies and minds moving together. *Trends Cogn. Sci. (Regul. Ed.)* 10, 70–76.


and social interaction. *Philos. Trans. R. Soc. Lond. B Biol. Sci.* 358, 593–602.

Wolpert, D. M., Miall, R. C., and Kawato, M. (1998). Internal models in the cerebellum. *Trends Cogn. Sci. (Regul. Ed.)* 2, 338–347.

Zatorre, R. J., Chen, J. L., and Penhune, V. B. (2007). When the brain plays music: auditory-motor interactions in music perception and production. *Nat. Rev. Neurosci.* 8, 547–558.

Zentner, M., and Eerola, T. (2010). Rhythmic engagement with music in infancy. *Proc. Natl. Acad. Sci. U.S.A.* 107, 5768–5773.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 July 2011; accepted: 07 February 2012; published online: 28 February 2012.*

*Citation: Phillips-Silver J and Keller PE (2012) Searching for roots of entrainment and joint action in early musical interactions. Front. Hum. Neurosci. 6:26. doi: 10.3389/fnhum.2012.00026*

*Copyright © 2012 Phillips-Silver and Keller. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# *Dance in the Brain*

*The Impact of Aesthetic Evaluation and Physical Ability on Dance Perception*

Emily S. Cross, Louise Kirsch, Luca F. Ticini and Simone Schütz-Bosbach

*Practice of Contemporary Dance Promotes Stochastic Postural Control in Aging*

Lena Ferrufino, Blandine Bril, Gilles Dietrich, Tetsushi Nonaka and Olivier A. Coubard

# **THE IMPACT OF AESTHETIC EVALUATION AND PHYSICAL ABILITY ON DANCE PERCEPTION**

*Emily S. Cross<sup>1,2,3</sup>\*, Louise Kirsch<sup>1</sup>, Luca F. Ticini<sup>1,4</sup> and Simone Schütz-Bosbach<sup>1</sup>*

*<sup>1</sup> Junior Research Group "Body and Self," Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany*

*<sup>2</sup> Behavioural Science Institute, Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Netherlands*

*<sup>3</sup> School of Psychology, Bangor University, Wales, UK*

*<sup>4</sup> Italian Society of Neuroaesthetics "Semir Zeki," Trieste, Italy*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Marcel Brass, Ghent University, Belgium Tamer Demiralp, Istanbul University, Turkey*

#### *\*Correspondence:*

*Emily S. Cross, School of Psychology, Adeilad Brigantia, Bangor University, Bangor, Wales LL57 2AS, UK. e-mail: e.cross@psych.ru.nl*

The field of neuroaesthetics attracts attention from neuroscientists and artists interested in the neural underpinnings of esthetic experience. Though less studied than the neuroaesthetics of visual art, dance neuroaesthetics is a particularly rich subfield to explore, as it is informed not only by research on the neurobiology of aesthetics, but also by an extensive literature on how action experience shapes perception. Moreover, it is ideally suited to explore the embodied simulation account of esthetic experience, which posits that activation within sensorimotor areas of the brain, known as the action observation network (AON), is a critical element of the esthetic response. In the present study, we address how observers' esthetic evaluation of dance is related to their perceived physical ability to reproduce the movements they watch. Participants underwent functional magnetic resonance imaging while evaluating how much they liked and how well they thought they could physically replicate a range of dance movements performed by professional ballet dancers. We used parametric analyses to evaluate brain regions that tracked with degree of liking and perceived physical ability. The findings reveal strongest activation of occipitotemporal and parietal portions of the AON when participants view movements they rate as both esthetically pleasing *and* difficult to reproduce. As such, these findings begin to illuminate how the embodied simulation account of esthetic experience might apply to watching dance, and provide preliminary evidence as to why some people find enjoyment in an evening at the ballet.

#### **Keywords: dance, neuroaesthetics, parietal, visual, fMRI, AON, ballet**

# **INTRODUCTION**

In recent years, the nascent field of neuroaesthetics has gained momentum as scientists interested in the neural processes underlying an esthetic experience, such as a beautiful painting, piece of music, or dance performance, have begun to elucidate the links between sensory input and the observers' affective evaluation (Zeki, 1999; Blood and Zatorre, 2001; Cela-Conde et al., 2004; Kawabata and Zeki, 2004). Most neuroaesthetics research to date has focused on brain engagement when participants evaluate paintings or music (for reviews, see Di Dio and Gallese, 2009; Chatterjee, 2011). One theory emerging from the neuroaesthetics research on visual art is that an important factor in shaping an observer's esthetic experience is the simulation of actions, emotions, and corporeal sensations visible or implied in an artwork (Freedberg and Gallese, 2007). Freedberg and Gallese (2007) suggest that embodied resonance of art in an observer can be driven by the content of the work (such as empathic pain experienced when viewing the mangled bodies in Goya's *Que hay que hacer mas*) or by the visible traces of the artists' creation (such as evidence for vigorous handling of the artistic medium, like that which might be experienced when viewing a Jackson Pollock painting). While an embodied simulation account of esthetic experience provides a useful context for considering an observer's esthetic experience of art, the authors acknowledge that "a question arises about the degree to which empathic responses to actions in real life differ from responses to actions that are represented in paintings and sculpture" (p. 202). In the present study, we address this question by studying an artistic medium where the actions required to create the artwork *are* the artwork. Specifically, we investigate the relationship between esthetic experience, physical ability, and activation of sensorimotor brain regions when watching dance.

Compared with the abundance of studies focused on music and visual art, the neuroaesthetics of watching dance has received relatively limited research attention (Calvo-Merino et al., 2008, 2010; Hagendoorn, 2010; Cross and Ticini, 2011). Dance neuroaesthetics is a particularly rich topic to investigate, as it is informed not only by research on the neural substrates of esthetic experience, but also by an extensive literature on how the experience of action shapes action perception (e.g., Decety and Grezes, 1999; Buccino et al., 2001; Casile and Giese, 2006; Aglioti et al., 2008), including a number of studies specifically looking at dance perception among dance experts (Calvo-Merino et al., 2005, 2006; Cross et al., 2006) and novices (Cross et al., 2009a,b).

By now, numerous studies have demonstrated overlap between action perception and performance in the human motor system. Supporting evidence is provided by experiments measuring corticospinal excitability with motor evoked potentials (MEPs; e.g., Fadiga et al., 1995) and changes in blood oxygenation level dependent (BOLD) responses in motor areas of the brain with functional magnetic resonance imaging (fMRI; e.g., Grafton et al., 1996; Grèzes and Decety, 2001; Caspers et al., 2010; Molenberghs et al., in press). Of particular interest in these studies are brain regions that respond when watching others move, collectively known as the action observation network (AON; Grèzes and Decety, 2001; Cross et al., 2009b; Gazzola and Keysers, 2009). This network, comprising premotor, parietal, and occipitotemporal cortices, is believed to help us make sense of others' bodies in motion, in order to help us decode the goals and intentions underlying their movements (Gallese et al., 2004; Rizzolatti and Sinigaglia, 2010).

A noteworthy approach for investigating how the AON subserves action perception is to measure how an observer's prior physical or visual experience influences his or her perception of others' actions. Scientists from a growing number of laboratories are turning to expert and novice dancers to help address such questions (Calvo-Merino et al., 2005, 2006; Cross et al., 2006, 2009b; Bläsing et al., 2010). One consistent finding this research has revealed is that when dancers observe a type or style of movement that they are physically adept at performing, greater activity is recorded within parietal and premotor portions of the AON (e.g., Calvo-Merino et al., 2005, 2006; Cross et al., 2006, 2009a,b). Moreover, it has also been demonstrated that the amplitude of the response within parietal and premotor portions of the AON, as measured by fMRI, increases parametrically the better an observer is able to perform the observed dance sequence (Cross et al., 2006).

Such research has opened a gateway to understanding how specific neural changes are associated with an individual's ability to perform highly complex and coordinated actions. However, findings in this vein stop short of explaining how and why dance observers often derive intense pleasure from watching dance (Cross and Ticini, 2011). Is it because we embody the forms and movements articulated by the dancers within our own motor system, consistent with the embodied simulation account of esthetic experience (Freedberg and Gallese, 2007), or does enjoyment stem from a more purely visual experience? To our knowledge, only one published study (Calvo-Merino et al., 2008) has explored how participants' subjective evaluations of dynamic displays of dance correlate with activity within sensorimotor brain regions that compose the AON. In this study, the authors asked dance-naïve participants to carefully observe a number of videos featuring different dance movements while undergoing fMRI (Calvo-Merino et al., 2008). Approximately 1 year later, participants watched the dance videos again, and this time their task was to rate each video using a five-point Likert scale on the five key esthetic dimensions identified by Berlyne (1974): like–dislike, simple–complex, dull–interesting, tense–relaxed, and weak–powerful. The authors averaged participants' responses and focused on how the consensus ratings for each dance stimulus related to brain responses. They found that when participants watched dance movements they rated as highly likable, increased activity emerged within right premotor cortex, as well as bilateral early visual regions. The authors concluded that the premotor portion of the AON might thus be important in assigning an automatic and implicit esthetic evaluation to dance.

This previous study offers an intriguing first glimpse of the neural substrates that might underlie the esthetic experience of watching dance. However, it also leaves many enticing questions open for further exploration. For example, since Calvo-Merino et al. (2008) explicitly chose to focus on the brain responses corresponding to a group's consensus esthetic evaluation of each stimulus, it remains unknown how individual ratings of a dance's esthetic value might be related to AON activity. We know from prior work that parietal and premotor portions of the AON are sensitive to individuals' physical experience with movements (e.g., Calvo-Merino et al., 2005; Cross et al., 2006), and that responses within visual and premotor regions correlate with how much a group likes watching certain movements (Calvo-Merino et al., 2008), but how these two factors interact remains unknown. In the present study, we aim to address this interaction between physical ability and esthetic evaluation.

We selected participants with little experience performing or watching dance and asked them to observe videos depicting movements performed by expert ballet dancers. Following each video, participants rated either how much they liked watching the movement, how well they could physically reproduce each movement, or responded to a factual question concerning the content of the video (such as whether the dancer jumped or not). Because Calvo-Merino et al. (2008) found BOLD response correlations only with participants' like–dislike ratings (and not with the other four esthetic dimensions identified by Berlyne, 1974), we focus on only the like–dislike esthetic dimension in this study.

We analyzed the imaging data using participants' individual liking and physical ability ratings as parametric modulators via three main contrasts. The first evaluated regions modulated by how much participants liked a movement. If individual ratings are largely consistent with the group-averaged ratings used by Calvo-Merino et al. (2008), then we should find increased activation of right premotor and early visual cortices when participants watched movements they liked. The second contrast replicates Cross et al. (2006), who measured regions parametrically modulated by participants' perceived ability to perform each movement. If such ratings made by expert dancers generalize to ratings made by non-dancers, then we might expect left parietal and premotor cortices to show increased activity as participants rate actions as increasingly easy to replicate. The third contrast evaluates the interaction between liking and perceived ability, while a related behavioral analysis enables us to measure whether a relationship emerges between subjective ratings of these two modulators. Findings should further our understanding of the embodied simulation account of esthetic experience as it may apply to dance.

# **MATERIALS AND METHODS**

# **PARTICIPANTS**

Twenty-two physically and neurologically healthy young adults were recruited from the fMRI Database of the Max Planck Institute for Human Cognitive and Brain Sciences (Leipzig, Germany). All were monetarily compensated for their involvement, and gave written informed consent. The local ethics committee approved all components of this study. The 22 participants (9 females) ranged in age from 21 to 33 years (mean = 24.8 years, SD = 2.9 years). All participants were strongly right handed as measured by the Edinburgh Handedness Inventory (Oldfield, 1971).

Moreover, all participants were recruited as naïve observers with limited or no dance experience, qualified by completion of a questionnaire following the experimental manipulation to evaluate past experience in performing and watching dance. No participant had formal training in ballet or modern dance (though some participants took one semester of ballroom dance training in school, as is required in some regions in Germany). When asked to evaluate their ability as a dancer on a 1-to-5 scale (1 = awful; 2 = bad; 3 = intermediate; 4 = good; 5 = very good), participants scored themselves with a mean rating of 2.7 (SD = 1.12). As a measure of experience with dance observation, participants reported attending a mean of 1.02 professional dance performances (or theatre/opera performances that had some dance element) each year (SD = 1.06).

# **STIMULI AND DESIGN**

Stimuli featured a male or female dancer performing a dance movement. The dancers, both members of the Leipziger Ballett, performed a range of movements varying in complexity, speed, difficulty, and size, and drew on movement from both classical and contemporary dance vocabularies. From the footage captured of both dancers, 64 different dance video stimuli were constructed, each 3 s in length. To establish a stimulus-specific baseline, two additional 3 s videos were used, created from footage of each dancer standing still in a neutral posture in the same studio setting.

# **MOTION ENERGY QUANTIFICATION**

Because each dance sequence differed in terms of the size, speed, and spatial range of the movements, we took an additional step to attempt to control for such differences in the imaging data. In order to do this, we quantified the motion energy in each video clip using a custom Matlab algorithm, based on motion recognition work in computer science by Bobick (1997). Such quantification of motion energy has been applied successfully before to stimuli used in neuroimaging studies of action observation (Schippers et al., 2010; Cross et al., in press-a). With our particular algorithm, we converted each movie to gray-scale, and then calculated a difference image for pairs of consecutive frames in each movie. The difference image was thresholded so that any pixel with more than 10 units luminance change was classified as "moving." The average numbers of moving pixels per frame and per movie were summed to give a motion energy score for that movie.
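The frame-differencing procedure described above can be sketched in a few lines. This is an illustrative Python version, not the authors' Matlab code: the function names, the luminance weights used for gray-scale conversion, and the choice to average the moving-pixel counts over consecutive frame pairs are all assumptions made for the sketch. Only the threshold of 10 luminance units comes from the text.

```python
def to_gray(frame):
    """Convert an RGB frame (rows of (r, g, b) tuples, 0-255) to luminance.

    The Rec. 601 weights are an assumption; the source only says 'gray-scale'.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def moving_pixels(prev_gray, next_gray, threshold=10):
    """Count pixels whose absolute luminance change exceeds the threshold."""
    count = 0
    for row_a, row_b in zip(prev_gray, next_gray):
        for a, b in zip(row_a, row_b):
            if abs(b - a) > threshold:
                count += 1
    return count


def motion_energy(frames, threshold=10):
    """Mean number of 'moving' pixels per consecutive frame pair.

    Averaging over frame pairs is one reading of the description,
    chosen here for illustration.
    """
    grays = [to_gray(f) for f in frames]
    counts = [moving_pixels(a, b, threshold)
              for a, b in zip(grays, grays[1:])]
    return sum(counts) / len(counts) if counts else 0.0


# Toy 2 x 2 "movie": one pixel switches from black to white, then stays.
BLACK, WHITE = (0, 0, 0), (255, 255, 255)
frame0 = [[BLACK, BLACK], [BLACK, BLACK]]
frame1 = [[WHITE, BLACK], [BLACK, BLACK]]
frame2 = [[WHITE, BLACK], [BLACK, BLACK]]
toy_score = motion_energy([frame0, frame1, frame2])
```

A per-movie score of this kind can then enter the imaging model as a nuisance covariate, so that differences between liked and disliked movements are less likely to reflect the sheer amount of movement on screen.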

# **fMRI TASK**

During functional neuroimaging, all videos were presented via Psychophysics Toolbox 3 running under Matlab 7.2. The videos were presented in full color with a resolution of 480 × 270 pixels using a back projection system, which incorporated an LCD projector that projected onto a screen placed behind the magnet. The screen was reflected on a mirror installed above participants' eyes. Participants completed one functional run 34 min in duration, comprising 128 experimental trials (2 presentations of each of the 64 dance videos) organized randomly. Each experimental trial video was followed by one of the two main questions of interest (how much did you like it?/how well could you reproduce it?); participants' task was to watch each video closely and answer the question following the video. Importantly, trials were arranged to collect one liking and one reproducibility rating for each stimulus, so participants never answered the same question about a particular video twice. In order to reduce task predictability and to encourage the maintenance of focus throughout the experiment, eight additional trials were randomly interspersed among the experimental trials, after each of which participants were asked an unpredictable yes–no question about the video content, addressing various features of the stimulus movement (e.g., did the dancer jump?; did the dancer turn?; did the dancer's hands touch the ground?). Also interspersed randomly across the 128 experimental trials were 16 repetitions (8 trials with each of the 2 dancers) of the 3-s videos of the dancers standing still in a neutral position. The intertrial intervals were pseudologarithmically distributed between 4 and 8 s. A schematic depiction of the task is illustrated in **Figure 1**.

# **fMRI DATA ACQUISITION**

All data were collected at the Max Planck Institute for Human Cognitive and Brain Sciences (Leipzig, Germany). Functional images were acquired on a Bruker 3-T Medspec 20/100 whole-body MR scanning system, equipped with a standard birdcage head coil. Functional images were acquired continuously with a single shot gradient echo-planar imaging (EPI) sequence with the following parameters: echo time TE = 30 ms, flip angle 90°, repetition time TR = 2,000 ms, acquisition bandwidth 100 kHz. Twenty-four axial slices allowing for full-brain coverage were acquired in ascending order (pixel matrix = 64 × 64, FOV = 24 cm, resulting in an in-plane resolution of 3.75 mm × 3.75 mm, slice thickness = 4 mm, interslice gap = 1 mm). Slices were oriented parallel to the bicommissural plane (AC–PC line). The first two volumes of each functional run were discarded to allow for longitudinal magnetization to approach equilibrium, and then an additional 1015 volumes of axial images were collected.

#### **FIGURE 1 | Schematic depiction of the fMRI task.** Participants' task was to watch each video closely and respond to the question as accurately as possible.
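As a quick consistency check on the acquisition parameters reported above, the in-plane resolution, run duration, and slab coverage can be recomputed from the stated values. The variable names below are illustrative, and counting the two discarded volumes toward the run length is an assumption.

```python
# Values taken from the acquisition description in the text.
TR_S = 2.0                # repetition time, seconds
N_VOLUMES = 1015          # volumes retained for analysis
N_DISCARDED = 2           # initial volumes discarded
FOV_MM = 240.0            # field of view, 24 cm
MATRIX = 64               # in-plane pixel matrix (64 x 64)
N_SLICES = 24
SLICE_THICKNESS_MM = 4.0
SLICE_GAP_MM = 1.0

# In-plane voxel size: field of view divided by matrix size.
in_plane_mm = FOV_MM / MATRIX

# Run duration in minutes, assuming discarded volumes are part of the run.
run_minutes = (N_VOLUMES + N_DISCARDED) * TR_S / 60

# Extent covered along the slice axis: slices plus the gaps between them.
coverage_mm = N_SLICES * SLICE_THICKNESS_MM + (N_SLICES - 1) * SLICE_GAP_MM

print(in_plane_mm, round(run_minutes, 1), coverage_mm)
```

The recomputed 3.75 mm in-plane resolution and a run of roughly 34 min agree with the figures given in the task and acquisition descriptions.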

Geometric distortions were characterized by a B0 field-map scan [consisting of a gradient echo readout (32 echoes, interecho time 0.64 ms) with a standard 2D phase encoding]. The B0 field was obtained by a linear fit to the unwarped phases of all odd echoes. Prior to the functional run, 24 two-dimensional anatomical images (256 × 256 pixel matrix, T1-weighted MDEFT sequence) were obtained for normalization purposes. In addition, for each subject a sagittal T1-weighted high-resolution anatomical scan was recorded in a separate session on a different scanner (3-T Siemens Trio, 160 slices, 1 mm thickness). The anatomical images were used to align the functional data slices with a 3D stereotaxic coordinate reference system.

# **fMRI DATA ANALYSIS**

Data were realigned, unwarped, corrected for slice timing, normalized to individual participants' T1-segmented anatomical scans with a resolution of 3 mm × 3 mm × 3 mm, and spatially smoothed (8 mm) using SPM8 software. A design matrix was fitted for each participant, with each 3 s dance movie trial modeled by a boxcar with the duration of the video convolved with the standard hemodynamic response function. Three additional parametric modulators were included for the main dance video trials: participants' individual ratings of how much they liked each dance sequence, participants' individual ratings of how well they thought they could reproduce each dance sequence, and a regressor expressing the mean motion energy of each video, which compensates for major differences in contrasts of interest due to varying amounts of movement between stimuli (Cross et al., in press-a). Additional regressors in the model included the "still body baseline" (comprising the 16 still body videos), the "test questions" (comprising the eight trials where participants were asked a yes–no question about the previously viewed video), and the "question and response phase" (encompassing the time when participants were asked each question and made a keypress response).
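To make the modeling step concrete, the sketch below builds a main-effect regressor (a 3 s boxcar per trial convolved with a hemodynamic response function) and a parametrically modulated regressor from trial-wise ratings. It is a simplified illustration and not SPM8's implementation: the double-gamma HRF parameters, the 0.1 s construction grid, the mean-centering of the modulator, and all function names are assumptions, and the onsets and ratings are made-up toy values.

```python
import math

TR = 2.0   # repetition time in seconds (from the acquisition description)
DT = 0.1   # hypothetical fine time grid used to build regressors


def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=1.0 / 6.0):
    """A common double-gamma HRF approximation (assumed, not SPM8's exact one)."""
    if t <= 0:
        return 0.0
    pos = t ** (peak - 1) * math.exp(-t) / math.gamma(peak)
    neg = t ** (undershoot - 1) * math.exp(-t) / math.gamma(undershoot)
    return pos - ratio * neg


def mean_center(values):
    """Subtract the mean so the modulator is decoupled from the main effect."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def build_regressor(onsets, duration, weights, n_scans):
    """Weighted boxcars convolved with the HRF, sampled once per scan."""
    n_fine = int(round(n_scans * TR / DT))
    boxcar = [0.0] * n_fine
    for onset, w in zip(onsets, weights):
        start = int(round(onset / DT))
        stop = min(n_fine, int(round((onset + duration) / DT)))
        for i in range(start, stop):
            boxcar[i] += w
    hrf = [double_gamma_hrf(k * DT) for k in range(int(32 / DT))]
    conv = [0.0] * n_fine
    for i, s in enumerate(boxcar):
        if s == 0.0:
            continue
        for k, h in enumerate(hrf):
            if i + k >= n_fine:
                break
            conv[i + k] += s * h * DT
    step = int(round(TR / DT))
    return [conv[j * step] for j in range(n_scans)]


# Toy example: three 3 s video trials with made-up liking ratings.
onsets = [10.0, 30.0, 50.0]
liking = [1, 3, 5]
main_effect = build_regressor(onsets, 3.0, [1.0] * 3, n_scans=40)
liking_mod = build_regressor(onsets, 3.0, mean_center(liking), n_scans=40)
```

In this toy case the modulated regressor dips below zero after the least-liked trial and rises above zero after the most-liked one, which is what lets a parametric contrast pick out voxels whose response scales with the rating rather than with video presentation as such.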

Imaging analyses were designed to achieve four objectives. The first group-level analysis evaluated which brain regions were more active when observing a dancer's body in motion compared to viewing a dancer's body standing still. Such a contrast enables the localization of brain regions responsive to dance *per se*, and not extraneous features of the display that are not of interest for this study (e.g., the dancers' identity, the layout of the dance studio, etc.). Regions that emerged from this contrast, illustrated in **Figure 2**, were used to create a task-specific mask for all subsequent analyses reported in the paper, at the *p* < 0.001, *k* = 10 voxel level. The second analysis identified brain regions responsive to esthetic appraisal of dance movements. To accomplish this, we evaluated both directions of the parametric regressor for "liking," to differentiate between brain regions showing an increased response with increased liking and those showing an increased response with decreased liking. The third analysis followed the identical approach for the parametric modulator for "perceived physical ability." The fourth analysis evaluated the interaction between "liking" and "perceived physical ability." Two directions of the interaction were evaluated, highlighting in one direction regions that responded more when participants liked a movement but perceived it as difficult to reproduce, and in the other direction brain regions that were more active when participants watched movements they did not like but perceived as easy to reproduce. All contrasts were evaluated at *p*<sub>unc.</sub> < 0.001 (uncorrected for multiple comparisons), and *k* = 10 voxels. For the main parametric contrasts, we focus on those results that reached a cluster-level significance of *p*<sub>cor.</sub> < 0.05 (FDR-corrected for multiple comparisons)<sup>1</sup>. For anatomical localizations, all functional data were referenced to cytoarchitectonic maps using the SPM Anatomy Toolbox v1.7 (Eickhoff et al., 2005, 2006, 2007). For visualization purposes, the *t*-image of the AON mask is displayed on partially inflated cortical surfaces using the PALS data set and Caret visualization tools (**Figure 2**; http://brainmap.wustl.edu/caret). All other analyses are illustrated on an averaged high-resolution anatomical image of the study population (**Figures 3** and **4**).

#### **FIGURE 2 | Dance observation > static body baseline.** This contrast was made to determine, in an unbiased, subject- and task-specific manner, which regions were to be included in the mask of the AON.

# **RESULTS**

The first imaging analysis, evaluated as all dance > still bodies, revealed broad activation in a network comprising areas classically associated with action observation (e.g., Grèzes and Decety, 2001; Cross et al., 2009b; Caspers et al., 2010; Grosbras et al., in press), including bilateral parietal, premotor, supplementary motor, and occipitotemporal cortices. A full listing of regions can be found in **Table 1**. This contrast, illustrated in **Figure 2**, was used as a mask for all analyses described below.

<sup>1</sup>For completeness and transparency, the tables list all regions significant at the uncorrected threshold of *p* < 0.001.


#### **FIGURE 4 | Interaction between "liking" and "physical ability" parameters.** The parietal and visual brain regions illustrated here are cluster-corrected activations that are active when participants watch dance movements that they rate as being highly enjoyable to watch, but very difficult to reproduce.

# **AON REGIONS MODULATED BY LIKING**

The positive direction of this parametric contrast revealed bilateral activation within visual brain regions implicated in the processing of complex motion patterns (namely, area V5/MT+), and human bodies (ITG/MTG), as well as a large cluster within the right inferior parietal lobule (IPL; **Figure 3A**; **Table 2A**). The inverse direction of this contrast, which interrogated regions showing an increased BOLD response the less participants liked a movement, did not reveal any suprathreshold activations.

# **AON REGIONS MODULATED BY PERCEIVED PERFORMANCE ABILITY**

In direct contrast to the results reported previously with expert dancers (Cross et al., 2006), no suprathreshold activations emerged from the positive direction of the analysis that evaluated brain regions that increase in response the better a participant thinks he or she can perform an observed movement, either at the corrected or uncorrected level. The inverse contrast, which evaluated brain regions that became increasingly active the *less* participants thought they could perform the observed movement, resulted in no activations reaching cluster-corrected significance, though several uncorrected clusters emerged within bilateral middle occipital gyri (**Table 2B**). For comparison of the visual regions activated by liking and perceived difficulty to reproduce an observed movement, **Figure 3B** illustrates the overlap of both parametric contrasts. As **Figure 3B** shows, similar portions of the middle temporal gyri are engaged both by movements that participants enjoy watching and by those they believe are difficult to reproduce. This strongly suggests that these two factors are not independent, an issue to which we return in greater detail below. Even when the effects of liking and perceived physical ability were evaluated at the whole brain level (i.e., not masked by the dance > body contrast), no additional regions emerged.

# **INTERACTION BETWEEN LIKING AND PHYSICAL ABILITY**

The final analysis examined the interaction between liking and perceived ability when watching dance. The behavioral data indicate that liking and physical ability ratings were not entirely independent; specifically, participants tended to like movements more when they rated them as difficult to perform. Pearson correlation coefficients calculated on an individual subject level demonstrate that the correlation between liking and perceived physical ability ranged from *r* = 0.021 to *r* = −0.615, with an average *r* = −0.27 (SD = 0.21). The presence of an interaction between these variables in the behavioral data enables us to investigate brain regions showing an increased BOLD signal when watching movements that are increasingly enjoyable to watch and increasingly difficult to execute. This analysis revealed activity within bilateral occipitotemporal cortices and the right IPL (**Figure 4**; **Table 2C**). It is of note that broader AON activation emerges in the uncorrected results (**Table 2C**), including left parietal and right premotor cortices. The inverse interaction, examining brain regions responding to movements participants dislike but can perform, revealed no suprathreshold activations at corrected or uncorrected levels.

*Locations in MNI coordinates and labels of peaks of relative activation from contrast comparing observation of dance to a still body baseline. Results were calculated at p<sub>uncorrected</sub> < 0.001, k = 10 voxels. Up to three local maxima are listed when a cluster has multiple peaks more than 8 mm apart. Entries in bold denote activations significant at the FDR cluster-corrected level of p < 0.05. Abbreviations for brain regions: BA, Brodmann's area; R, right; L, left; MTG, middle temporal gyrus; PMd, dorsal premotor cortex; SPL, superior parietal lobule.*
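The subject-level correlations between liking and perceived ability ratings reported in this section can be illustrated with a short sketch. The ratings below are invented toy values, the function names are hypothetical, and averaging the raw *r* values across participants (rather than, for example, Fisher z-transformed values) is an assumption, since the text does not specify the procedure.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length rating lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def summarize(per_subject_ratings):
    """Per-subject r values plus their mean and sample SD across subjects."""
    rs = [pearson_r(liking, ability) for liking, ability in per_subject_ratings]
    mean_r = sum(rs) / len(rs)
    sd_r = math.sqrt(sum((r - mean_r) ** 2 for r in rs) / (len(rs) - 1))
    return rs, mean_r, sd_r


# Toy data: (liking ratings, perceived-ability ratings) for two made-up subjects.
toy = [
    ([1, 2, 3, 4], [4, 3, 2, 1]),   # likes what seems hard: r = -1
    ([1, 2, 3, 4], [1, 2, 3, 4]),   # likes what seems easy: r = +1
]
rs, mean_r, sd_r = summarize(toy)
```

A negative coefficient for a participant means that the movements rated as harder to reproduce were the ones that participant liked more, which is the direction of the group average reported here.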

# **DISCUSSION**

The present study represents the first attempt to investigate the relationship between esthetic appreciation and observers' physical ability when watching dance. Dance-naïve participants watched a series of videos featuring expert dancers and were asked to make explicit judgments about each video, including how much they liked the movements and how well they believed they could execute them. We report two novel findings that have the potential to inform our understanding of how we perceive the art of dance. First, our behavioral data indicate that participants tended to like movements more when they perceived them as difficult to physically perform. Second, we report that the interaction between liking and physical ability is represented within occipitotemporal and parietal regions of the AON. We consider now how these findings inform our understanding of the embodied simulation account of esthetic experience, as well as the relevance of the present data to prior work on expertise and aesthetics. We conclude with consideration of possible future directions for dance neuroaesthetics.

# **LIKING WHAT WE CANNOT DO**

In the present study, participants reported greater liking for dance movements that they perceived as difficult to perform themselves. Anecdotally, this finding resonates with the fact that spectators routinely pay high prices to watch the outstanding physical mastery of acrobats in Cirque du Soleil, slam-dunking basketball players in an NBA game, or the exacting precision of the Bolshoi *corps de ballet*. If every audience member could reproduce the movements made by the acrobats, athletes or dancers, then such events would no longer be spectacular. One possible account of this relationship could be that the seemingly effortless nature with which highly physically skilled individuals perform difficult and spectacular movements leads to increased liking precisely *because* the spectator knows she is witnessing a physical feat well beyond her own abilities.

A stronger preference for movements that appear easy for the dancer but difficult for the observer to perform could possibly inform a perceptual fluency account of why we rate certain stimuli as more likable than others (Berlyne, 1974). A number of studies demonstrate that people tend to like stimuli that are easy to understand (e.g., Jacoby and Dallas, 1981; Whittlesea, 1993). Researchers have also demonstrated that we like objects that we have watched others interact with smoothly and efficiently more than objects that were interacted with awkwardly (Hayes et al., 2008), thus demonstrating a link between liking and perceived action fluidity. In the present study, we add another element to the relationship between liking and action perception: namely, that observers also tend to rate actions that are beyond their physical abilities as more likeable.

**Table 2 | Parametric effects of and interaction between liking and physical ability.**

*Locations in MNI coordinates and labels of peaks of relative activation for regions parametrically modulated by increased liking of stimuli (a), decreased physical ability to reproduce the actions observed in the stimuli (b), and the interaction between a and b (c). Results were calculated at p*<sub>uncorrected</sub> < *0.001, k* = *10 voxels. Up to three local maxima are listed when a cluster has multiple peaks more than 8 mm apart. Entries in bold denote activations significant at the FDR cluster-corrected level of p* < *0.05. Only regions that reached cluster-corrected significance are illustrated in the figures in the main text. Abbreviations for brain regions: BA, Brodmann's area; R, right; L, left; V5/MT*+*, visuotopic area MT; ITG, inferior temporal gyrus; MTG, middle temporal gyrus; MOG, middle occipital gyrus; IPL, inferior parietal lobule; (a)IPS, (anterior) intraparietal sulcus; V3, third visual complex; SPL, superior parietal lobule.*

At this stage, of course, it is unclear how reliable the relationship is between liking and lack of physical ability. One possible way to further evaluate this relationship would be to implement a training paradigm where participants first observe and rate a range of complex movements as novices, and then train over several days or weeks to attain physical mastery of the movements before observing and rating the same movements again. Such an approach might enable much more precise quantification of how the relationship between liking and physical ability is manifest behaviorally.

# **NEURAL CORRELATES OF OBSERVING DIFFICULT AND LIKEABLE ACTIONS**

Turning our focus to the imaging data, the most illuminating contrast is the interaction between liking and perceived reproducibility. This interaction analysis revealed brain regions that showed a stronger response the more participants liked watching a movement and the less well they thought they could reproduce the same movement. The three main clusters to emerge from this contrast were found in bilateral occipitotemporal cortices and the right IPL. In line with the theory proposed by Freedberg and Gallese (2007), one possible way to interpret the IPL finding is that activation in this region is related to increased "embodied simulation" of movements that we like watching. IPL has been previously implicated in embodied simulation processes by a number of studies (e.g., Keysers et al., 2004; Ebisch et al., 2008), and its association with action perception and performance is further reinforced by the identification of so-called "mirror neurons" in the homologous cortical region of non-human primates (Rizzolatti et al., 2001, 2006; Fogassi and Luppino, 2005). Moreover, recent neuroimaging work with humans provides evidence that neurons within human IPL code action perception and execution in a similar manner (Chong et al., 2008; Oosterhof et al., 2010; for a review, see Rizzolatti and Sinigaglia, 2010).

Thus, it could be that when we generally like watching an action that we cannot physically perform, this part of the cortical motor system "works harder" to try and embody it. Put another way, activity within this portion of sensorimotor cortex may be reflecting an attempt to incorporate physically difficult but visually enjoyable actions into the observer's motor system (for more in-depth discussion of this possibility, see Cross et al., in press-a). Alternatively, this relationship could work in the inverse manner, such that increased activation of IPL when watching physically difficult movements leads to increased liking. Although future experimentation is required to confirm or refute the notion that IPL plays a causal role in embodiment and esthetic evaluations when watching dance (and the direction of this relationship), the evidence we present here adds tentative support to Freedberg and Gallese's (2007) proposal that using one's own body to simulate what is seen in art is related to one's esthetic experience of that art.

Our finding of bilateral occipitotemporal activation when participants watch actions they like but cannot perform is informed by a recent study on the role of the extrastriate body area (EBA)<sup>2</sup> in esthetic evaluation (Calvo-Merino et al., 2010). Using transcranial magnetic stimulation (TMS), Calvo-Merino et al. (2010) demonstrated that TMS to premotor cortex enhances participants' performance on an esthetic sensitivity task, while TMS to EBA led to decreased esthetic sensitivity. The authors interpret their finding in terms of a dual-route model of body processing (Urgesi et al., 2007a), wherein representations of body parts (mediated by EBA: see Taylor et al., 2007; Cross et al., 2010) and global whole-body configurations (mediated by the premotor cortex: see Urgesi et al., 2007b) are evaluated in a complementary manner and integrated to arrive at a decision about the esthetic quality of a stimulus.

The purported involvement of EBA in assigning an esthetic value to bodies is perhaps even more intriguing in light of this region's implication in representing not only observed bodies, but also the observer's body (David et al., 2007). As David et al. (2007) discuss, one possible process EBA may contribute to is a comparison between one's own body and an observed body. Data from perceiving contortionists (Cross et al., 2010), robotic actions (Cross et al., in press-a), gymnasts (Cross et al., in press-b), and now ballet dancers (present study, parametric effect of physical ability; **Figure 3B**; **Table 2B**) are consistent with the notion that the more *unlike* the observer's body/motor repertoire an observed body/movement is, the greater the response within EBA. The novel contribution from the present study, then, is that such occipitotemporal activity when observing others' bodies might be associated with several, possibly related, processes, including coding the degree of deviation between the observed and the observer's body/physical abilities, the degree of liking, and the interaction between these two factors. At this stage, future work is needed to establish whether any causal relationships exist between these processes.

# **RELATION OF PRESENT FINDINGS TO PREVIOUS LITERATURE**

Unlike our previous work on action observation and the observer's perceived performance ability (Cross et al., 2006, 2009b), in the present study we found no relationship between AON activity and increasing perceived performance ability. We believe this is most likely due to the fact that participants in the present study had no physical experience with the movements they observed. Prior evidence supports the notion that a lack of physical experience specific to the skills required for performing an observed action leads to only weak AON activity during observation of that action (as was seen in dance novices who observed expert ballet or capoeira movements; Calvo-Merino et al., 2005). We suggest that it would be useful for future work to include a larger range of dance movements or simple actions (such as jumping jacks) when studying the relationship between liking and doing, in order to identify how near to an observer's prior motor experience an observed action needs to be in order to demonstrate increased AON activity for increased perceived performance ability.

In relation to prior research on dance neuroaesthetics (Calvo-Merino et al., 2008), our findings provide a counterpoint on the role of the AON in esthetic evaluation. While Calvo-Merino et al. (2008) showed participants' *group* esthetic ratings to be correlated with activity within primary visual cortices and the premotor cortex, when we looked at *individual* esthetic ratings, we found stronger activation within bilateral occipitotemporal cortices and right IPL. These differences are likely attributable (at least to some degree) to differences in task and analysis strategy. It is also worth noting that Calvo-Merino et al. (2008) found that participants rated movements with a higher level of visual motion as more likeable. In the present study, when we assessed the relationship between group-averaged liking ratings and visual motion (motion energy), we also found a positive linear relationship between these variables, computed as a goodness-of-fit statistical correlation (*R*<sup>2</sup> = 0.376, *p* = 0.002). However, unlike Calvo-Merino et al. (2008), we explicitly modeled out differences in visual motion between stimuli, and therefore these differences alone cannot account for visual activations reported in the present study. Nonetheless, on a behavioral level, a positive correlation between visual motion and liking ratings suggests that this relationship could be a productive direction for future investigation.

<sup>2</sup>Extrastriate body area, located within the occipitotemporal region of the AON, is a cortical region specialized for perception of human bodies (Downing et al., 2001; Peelen and Downing, 2007). The portion of EBA stimulated by Calvo-Merino et al. (2010) is likely subsumed in the bilateral occipitotemporal clusters reported in the interaction between liking and reproducibility in the present study, in that stimulation foci for EBA in Calvo-Merino et al. (2010) are 5.39 mm from the maximum of the right middle temporal cluster and 10.19 mm from the maximum of the left middle temporal cluster found in the present study. Nonetheless, we also advise caution in the interpretation of any of our occipitotemporal activations as "extrastriate body area," because we did not functionally localize these regions (see Downing et al., 2001; Peelen and Downing, 2007 for discussion of EBA localization). It should also be noted that these clusters span much more of occipitotemporal cortex than just EBA, as anatomical localizations reveal that other (sub)peaks within these clusters fall within motion-responsive extrastriate area V5/MT+ (see **Table 2**).

Another feature of the present findings worth considering is the broader pattern of activity that emerged in the interaction between liking and perceived ability (**Table 2C**). When using the same statistical threshold as Calvo-Merino et al. (2008; *p*<sub>uncorrected</sub> < 0.001), more widespread activation of the AON is seen, including right premotor cortex. The fact that right premotor cortex was involved in esthetic processing in the present study lends additional support to the notion that the premotor portion of the AON is involved in processing the global features of bodies in action, and that this information is also used when assigning an esthetic value to such bodies (Urgesi et al., 2007a; Calvo-Merino et al., 2010).

#### **IMPLICATIONS AND FUTURE DIRECTIONS**

Taken together, the present findings provide a useful point of departure for further investigation into the relationship between an observer's physical experience and esthetic evaluation of dance. We suggest that future work in this area has the potential not only to inform scientists about how the brain perceives and appreciates art, but also to benefit the dance community (Hagendoorn, 2004, 2010; Cross and Ticini, 2011). One intriguing possibility would be for choreographers to experiment with dimensions of movement difficulty or complexity and esthetic quality, to determine what features of very simple movements might also result in high esthetic evaluation by observers. Along these lines, if future work establishes a more causal relationship between AON activity levels and esthetic enjoyment, then brain imaging can help to determine whether movements perceived as more difficult reliably result in greater activation of the AON, or whether much simpler movements performed with a particular movement quality can also lead to strong AON activation in the observer, as well as high liking ratings. We also recommend more in-depth investigation into the constituent roles played by different AON regions (namely premotor, parietal, and occipitotemporal cortices) in esthetic evaluation of dance. As we have discussed previously (Cross and Ticini, 2011), many other avenues for investigating how we perceive and evaluate the performing arts await exploration. The findings from the present study highlight the complexity of quantifying esthetic experience of the performing arts at brain and behavioral levels, as esthetic experience can be influenced by any number of other factors, including the observer's physical ability. Investigating other factors that influence esthetic experience, and how they might interact, offers rich opportunities for future studies.

### **ACKNOWLEDGMENTS**

The authors would like to thank Julia Lechinger for assistance with data collection, Richard Ramsey for helpful comments on an earlier draft of the manuscript, Lauren R. Alpert for help with manuscript preparation, and the Leipziger Ballett for assistance with stimulus generation.

#### **REFERENCES**

Roca, M., Rossello, J., and Quesney, F. (2004). Activation of the prefrontal cortex in the human visual aesthetic perception. *Proc. Natl. Acad. Sci. U.S.A.* 101, 6321–6325.


Grafton, S. T. (2009b). Sensitivity of the action observation network to physical and observational learning. *Cereb. Cortex* 19, 315–326.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 May 2011; paper pending published: 15 June 2011; accepted: 03 September 2011; published online: 21 September 2011.*

*Citation: Cross ES, Kirsch L, Ticini LF and Schütz-Bosbach S (2011) The impact of aesthetic evaluation and physical ability on dance perception. Front. Hum. Neurosci. 5:102. doi: 10.3389/fnhum.2011.00102*

*Copyright © 2011 Cross, Kirsch, Ticini and Schütz-Bosbach. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **PRACTICE OF CONTEMPORARY DANCE PROMOTES STOCHASTIC POSTURAL CONTROL IN AGING**

**Lena Ferrufino, Blandine Bril, Gilles Dietrich, Tetsushi Nonaka and Olivier A. Coubard**


# *Lena Ferrufino<sup>1,2</sup>\*, Blandine Bril<sup>1</sup>, Gilles Dietrich<sup>1,3</sup>, Tetsushi Nonaka<sup>1,4</sup> and Olivier A. Coubard<sup>2</sup>\**

*<sup>1</sup> Groupe de Recherche Apprentissage et Contexte, Ecole des Hautes Etudes en Sciences Sociales, Paris, France*

*<sup>2</sup> The Neuropsychological Laboratory, CNS-Fed, Paris, France*

*<sup>3</sup> Université Paris Descartes, Paris, France*

*<sup>4</sup> Research Institute of Health and Welfare, Kibi International University, Okayama, Japan*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Lutz Jäncke, University of Zurich, Switzerland; Laurence Bernard Demanze, UMR 6149 CNRS, France*

#### *\*Correspondence:*

*Lena Ferrufino, Groupe de Recherche Apprentissage et Contexte, Ecole des Hautes Etudes en Sciences Sociales, 190 avenue de France, 75013 Paris, France. e-mail: lenaferrufino@gmail.com; Olivier A. Coubard, The Neuropsychological Laboratory, CNS-Fed, 39 rue Meaux, 75019 Paris, France. e-mail: olivier.coubard@cns-fed.com*

As society ages and the frequency of falls increases, counteracting gait and posture decline is a challenging issue for countries of the developed world. Previous studies have shown that exercise and hazard management help to improve balance and/or decrease the risk of falling in normal aging. Motor activity based on motor-skill learning, particularly dance, can also benefit balance and decrease falls with age. Recent studies have suggested that older dancers have better balance, posture, or gait than non-dancers. Additionally, clinical or laboratory measures have shown improvements in some aspects of balance after dance interventions in elderly trainees. This study examined the impact of contemporary dance (CD) and of fall prevention (FP) programs on postural control of older adults. Posturography of quiet upright stance was performed in 41 participants aged 59–86 years before and after 4.4 months of training in either CD or FP once a week. Though classical statistic scores failed to show any effect, dynamic analyses of the center-of-pressure displacements revealed significant changes after training. Specifically, practice of CD enhanced the critical time interval in diffusion analysis, and reduced recurrence and mathematical stability in recurrence quantification analysis, whereas practice of FP induced or tended to induce the reverse patterns. Such effects were obtained only in the eyes open condition. We suggest that CD training based on motor improvisation favored stochastic posture, inducing plasticity in motor control, while FP training based on more stereotyped behaviors did not.

**Keywords: aging, postural control, motor activity, contemporary dance, flexibility, plasticity**

# **INTRODUCTION**

Balance is a complex function achieved by (i) multi-sensory integration of visual, vestibular, and somesthetic afferences, (ii) central motor control, and (iii) context-specific response generation (Nashner, 1976). In aging, changes occur in visual (Lord, 2006), vestibular (Kristinsdottir et al., 2001), proprioceptive and exteroceptive inputs (Famula et al., 2008), central processing (Horak, 2006), and muscular effectors (Schultz, 1995). Reduced sensory cue congruency, increased visual dependency and motor tone cause decreased balance and unsteady gait (Judge, 2003). Balance and gait disorders are the second risk factor for falls in aging, with dramatic consequences for morbidity and mortality (Rubenstein, 2006). Developing strategies to prevent falls is thus a major public health issue for preserving successful aging (Judge, 2003).

Several programs have been proposed to improve balance and reduce falls in aging: exercise, environmental inspection, and hazard management (Day et al., 2002; Rubenstein, 2006). Home- or center-based exercise interventions have focused on the training of lower limb strength and of balance (Robertson et al., 2001), walking and stair climbing with weights, joint reinforcement, functional balance (King et al., 2002), or flexed posture (Benedetti et al., 2008), and their efficacy has been assessed using clinical (Berg et al., 1995; Rossiter-Fornoff et al., 1995) or laboratory (Benedetti et al., 2008) measures. Tai Chi and dance have also been suggested to be promising programs to develop balance and prevent falls in older adults (American Geriatrics Society, British Geriatrics Society, and American Academy of Orthopaedic Surgeons Panel on Falls Prevention, 2001; Judge, 2003).

Cross-sectional studies have shown that older social dancers (i.e., ballroom, in line, or pairs) have better balance, a more stable walking pattern (Verghese, 2006), faster leg reaction time, and better postural stability (Zhang et al., 2008) than non-dancers. In an intervention study, Shigematsu et al. (2002) showed that 20 women aged 72–87 years who completed 36 sessions of dance-based aerobic exercise had better balance in single-leg stance and functional reach, and better locomotion performance in walking around two cones, compared to untrained controls. After only six sessions of a Laban-based movement program, Hamburg and Clair (2003) observed that 36 adults aged 63–86 years improved their balance in timed up-and-go and standing toe/heel lifts, as well as their gait velocity and cadence. Alpert et al. (2009) reported a progressive balance improvement in the Sensory Organization Test in 13 women aged 52–88 years performing jazz dance for 15 weeks. Hui et al. (2009) compared 52 adults aged 68 years on average who were trained in low-impact aerobic dance (cross and Cha-cha steps) to 42 untrained controls. After 24 sessions, dancers had improved their dynamic balance in timed up-and-go, but not their static balance as assessed by the Physical Performance Battery.

The purpose of this study was to examine the effects of two motor activities intended to improve balance, contemporary dance (CD) and fall prevention (FP), on postural control of older adults. To achieve this goal, we used a force platform and measured the center-of-pressure (CoP) displacements in upright stance, with eyes open and eyes closed, before and after 4.4 months of training. In addition to classical statistic scores (length, areas, mean and variance of velocity, Romberg quotient, fractal dimension), we examined the dynamics of CoP displacements using stabilogram diffusion analysis (SDA; Collins and De Luca, 1993) and recurrence quantification analysis (RQA; Riley et al., 1999). In SDA, Collins and De Luca (1993) have shown that the trajectories of the mean square displacement plotted as a function of time interval are different from those expected for a Brownian motion. Indeed, the stabilogram diffusion plot changes slope after a critical point, thus exhibiting short- and long-term regions which, from a physiological viewpoint, approximate open- and closed-loop control mechanisms, respectively. RQA is a non-linear method that yields, among other measures, the degree of autocorrelation measured by recurrence (%REC), determinism vs. randomness (%DET), and mathematical stability measured by the maximal diagonal line length (MAXL). RQA is an empirical method rather than a theoretical approach. Thus RQA measures are not considered absolute but must be interpreted relative to the levels of a manipulated variable (e.g., before and after training; Riley et al., 1999).

Regarding the influence of the sensory context (i.e., vision), SDA has shown two patterns. Visual input has caused either a decrease or an increase in stochastic activity, which has been interpreted in both cases as serving to decrease the stiffness of the musculoskeletal system (Collins and De Luca, 1995). In RQA, the deterministic structure of CoP displacements increases with the eyes closed as compared to eyes open (Riley et al., 1999). With regard to age, critical mean square displacement and critical time interval increase with age in SDA (Collins et al., 1995). While no systematic study has been done on the effect of aging on RQA measures (Riley et al., 1999), %REC, %DET, and MAXL are known to increase with decreasing behavioral flexibility (Riley et al., 1999; Webber and Zbilut, 2005). Consistent with this, patients with Parkinson's disease show higher values of %REC, %DET, and MAXL than healthy controls (Schmit et al., 2006), while ballet dancers have lower RQA values than track athletes (Schmit et al., 2005).

In this study, we expected the CD program to favor stochastic posture in SDA and postural flexibility in RQA to a greater extent than the FP program. As both training programs were mostly performed with eyes open, such effects were expected to be observed in the eyes open condition. To justify this hypothesis, we suggest that normal aging is accompanied by inflexibility as a result of decreasing motor and cognitive control with age. Indeed, attentional control shows an earlier and larger decline than other functions in older adults (Bherer et al., 2004), consistent with an early decline of prefrontal areas of the brain (Rajah and D'Esposito, 2005; Raz and Rodrigue, 2006). Correspondingly, motor control has been shown to decline with age, with detrimental consequences for gait and posture (de Bruin and Schmidt, 2010; Theill et al., 2011). In a recent study, Coubard et al. (2011) showed that the practice of CD improved switching attention (i.e., cognitive flexibility), which was not the case for FP or Tai Chi Chuan. As CD focused on motor improvisation while FP and Tai Chi Chuan taught motor routines, it was suggested that CD may have worked as training for change, thus inducing plasticity in flexible attention (Coubard et al., 2011). Although the causal relationship between motor and cognitive dimensions remains under debate (Coubard, 2011), we suggest that such a motor correlate may be observed for posture. In other words, we expected CD to improve motor flexibility (that is, to reduce motor stiffness) as compared to FP, which should take the form of higher stochastic activity in SDA and reduced determinism in RQA.

# **MATERIALS AND METHODS**

# **PARTICIPANTS**

Forty-one French natives participated in the study, which was approved by the local ethics committee (Ecole des Hautes Etudes en Sciences Sociales, Paris). They were right-handed, had normal or corrected-to-normal vision, no known neurological disorders, and were unaware of the goal of the experiment. **Table 1** details the participants' sex, age, body mass index (BMI, defined as the weight divided by the squared height in kg m<sup>−2</sup>), years of education, and their score in the Mini-Mental State Examination (MMSE) for cognitive status (Folstein et al., 1975).

Sixteen participants aged 64–83 years made up the CD group, and 25 participants aged 59–86 years made up the FP group. The two groups were matched in sex (χ<sup>2</sup> < 1), age (*t* < 1), BMI (*t*<sub>39</sub> = 1.84, *P* > 0.05), education (*t*<sub>39</sub> = 1.38, *P* > 0.05), and their score in the MMSE (*t* < 1).

The two groups were matched in past physical activity as appreciated by the reported years of practice, except for gymnastics. One participant had done 1 year of FP in the past, eight participants had done 4.0 ± 5.4 years of aqua gymnastics (no between-group difference, Mann–Whitney test, *Z* < 1), and 19 of them had done 4.7 ± 4.5 years of gymnastics, with higher practice for the FP group (Mann–Whitney, *Z* = −2.68, *P* < 0.01).

# **APPARATUS**

Static posture was examined using a Techno-Concept platform (Céreste, France), which consisted of two dynamometric clogs, one for each foot, embedded in a board so that the angle made by the feet was 30˚. The displacements of the CoP were recorded for 51.2 s and digitized at 40 Hz using a 16-bit analog-to-digital converter.

**Table 1 | Number (gender) or mean** ± **SD (age, BMI, education, MMSE) for the groups of participants (CD, contemporary dance; FP, fall prevention).**


### **TRAINING PROGRAMS**

Participants were trained in CD or FP for 4.42 months on average. Participants did not have a choice in the training in which they were enrolled, which was set by the district where they lived in the Ile-de-France region. The training was conducted by a professional instructor and supervised by a senior teacher, each in their specialty (CD or FP). The frequency of the training was once a week, and each session lasted 1 h. Music could be used during up to 50% of the session duration. Participants did not take part in other motor programs during the intervention period.

# *Contemporary dance*

Contemporary dance focused on motor improvisation. (1) Opening, adapted to the needs of the group (e.g., variations of the action of walking, free dance to popular music). (2) Warm-up and preparation to dance. The body was awakened by passive and active joint movements and by muscular stretching movements promoting coordination, the link between breath and movement, and body positioning and alignment. (3) Improvisation. Based on a theme or constraint (word, action, idea, object, music, or location), it was organized in four steps: (i) individual exploration of the theme; (ii) exploration in pairs or larger groups, taking into account the others' presence; (iii) each group presented the work developed in (i) and (ii); (iv) participants improvised in a solo and developed a natural movement to express their own sensations. At steps (i) and (ii), the instructor suggested dance tools to favor exploration and use of each individual and of the exercise resources. (4) Closure. Cooling down by breath and massages. In each session, 5, 20, 30, and 5 min were dedicated to stages 1–4, respectively.

# *Fall prevention*

Fall prevention focused on balance and the development of lower limbs. A session was organized as follows. (1) Warm-up and stretching. (2) Development of visual, vestibular, kinaesthetic, and proprioceptive functions, through specific exercises, to optimize each function. (3) Workshops organized around objects: for example, participants stepped over obstacles, walked on foam rubber, on small bags of sand, on a rope, etc. The training emphasized motor skills ensuring postural stability, as well as accuracy and amplitude of movements. (4) Cooling down and stretching. The training was performed individually and in pairs. In each session, 10, 20, 20, and 10 min were dedicated to stages 1–4, respectively.

# **POSTURAL RECORDING**

Participants underwent postural recordings before and after the training intervention. In a quiet, normally lit room, they stood in an upright posture on the platform, barefoot, with their arms comfortably at their sides, and were instructed to breathe normally and stay relaxed. Participants underwent four conditions: in the eyes closed condition, they wore a mask over their eyes to ensure darkness; in the eyes open conditions, they fixated a black circular surface at eye level subtending 1.5˚ of visual angle, at a distance of 600, 150, or 40 cm (Coubard, 2011). When necessary, participants wore their usual spectacle correction.

# **POSTURAL MEASUREMENTS**

Raw data provided by the manufacturer software were preprocessed using custom scripts under Matlab 7.0 (The MathWorks, USA). The first and last of the 2048 samples were discarded because they exhibited artifacts due to the onset and offset of the recording, respectively, and only positions of the CoP in mediolateral (*x*) and anteroposterior (*y*) planes were kept for further analysis. An example of CoP displacements is illustrated in **Figure 1A**. We calculated statistic scores and performed dynamic analyses using custom scripts under R (www.r-project.org).
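As a rough illustration of this preprocessing step, the following Python sketch trims one sample from each end of a 2048-sample recording and keeps the two position columns. It is not the Matlab code used in the study; the function name `trim_cop`, the assumed column layout (columns 0 and 1 holding *x* and *y*), and the synthetic data are all hypothetical.

```python
import numpy as np

FS_HZ = 40.0       # sampling rate reported for the platform
N_SAMPLES = 2048   # samples per recording (51.2 s at 40 Hz)

def trim_cop(raw, n_edge=1):
    """Drop n_edge samples at each end and keep the x and y columns.

    raw: array of shape (n_samples, >= 2). Columns 0 and 1 are ASSUMED to
    hold mediolateral (x) and anteroposterior (y) CoP positions in mm.
    """
    raw = np.asarray(raw, dtype=float)
    return raw[n_edge:-n_edge, :2]

# Synthetic stand-in for a platform export (three arbitrary channels).
rng = np.random.default_rng(0)
raw = np.cumsum(rng.normal(size=(N_SAMPLES, 3)), axis=0)

cop = trim_cop(raw)                # trimmed x/y trajectory
duration_s = N_SAMPLES / FS_HZ     # recording duration in seconds
```

The trimmed array can then be passed to any of the analyses described in the following subsections.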

# *Statistic scores*

We measured the length (in millimeters, mm), the confidence ellipse area (in mm<sup>2</sup>) that includes 90% of the positions of the CoP, and the mean and variance of the velocity (in mm s<sup>−1</sup>) of the CoP displacements. We also calculated the convex hull area (in mm<sup>2</sup>) including 100% of the positions of the CoP (Andrew, 1979; see **Figure 1A**), the Romberg quotient as the confidence ellipse area with eyes closed divided by the average area across the eyes open conditions, and the fractal dimension ratio (Chiari et al., 2000).
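To make these definitions concrete, the Python sketch below shows one plausible way of computing most of these scores from an array of CoP positions. It is not the authors' R code. The function names are hypothetical, the 90% confidence ellipse assumes bivariate normally distributed positions (area = π · χ² · √det Σ, with χ² the 90% quantile for two degrees of freedom), and the convex hull uses a monotone-chain construction in the spirit of Andrew (1979). The fractal dimension ratio is omitted.

```python
import numpy as np

def path_length(cop):
    """Total length of the CoP trajectory (same unit as cop, e.g. mm)."""
    steps = np.diff(cop, axis=0)
    return float(np.sum(np.hypot(steps[:, 0], steps[:, 1])))

def velocity_stats(cop, fs_hz):
    """Mean and sample variance of the instantaneous CoP speed."""
    speed = np.hypot(*np.diff(cop, axis=0).T) * fs_hz
    return float(speed.mean()), float(speed.var(ddof=1))

def confidence_ellipse_area(cop, p=0.90):
    """Ellipse area covering a fraction p, ASSUMING bivariate normality."""
    cov = np.cov(cop.T)
    chi2_2dof = -2.0 * np.log(1.0 - p)   # chi-square quantile, 2 d.o.f.
    return float(np.pi * chi2_2dof * np.sqrt(np.linalg.det(cov)))

def convex_hull_area(cop):
    """Area of the convex hull (monotone chain + shoelace formula)."""
    pts = sorted(set(map(tuple, cop)))
    if len(pts) < 3:
        return 0.0

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(points):
        chain = []
        for pt in points:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], pt) <= 0:
                chain.pop()
            chain.append(pt)
        return chain[:-1]

    hull = half_hull(pts) + half_hull(pts[::-1])
    x, y = np.array(hull).T
    return float(0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1))))

def romberg_quotient(area_closed, areas_open):
    """Eyes-closed ellipse area divided by the mean eyes-open area."""
    return float(area_closed / np.mean(areas_open))

# Synthetic CoP trajectory (mm) standing in for a real recording.
rng = np.random.default_rng(0)
cop = np.cumsum(rng.normal(scale=0.1, size=(2046, 2)), axis=0)
scores = {
    "length_mm": path_length(cop),
    "ellipse_mm2": confidence_ellipse_area(cop),
    "hull_mm2": convex_hull_area(cop),
    "velocity": velocity_stats(cop, fs_hz=40.0),
}
```

Other conventions exist for the confidence ellipse (for example, based on an F distribution), so absolute areas from this sketch should not be compared directly with values reported in the study.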

# *Diffusion analysis*

We plotted the planar mean square displacement of the CoP <Δ*r*<sup>2</sup>> in mm<sup>2</sup> (where *r* is the sum of the displacements in the mediolateral and anteroposterior planes, and the brackets denote the average over time) as a function of the time interval Δ*t* in seconds (see **Figure 1B**). We calculated the diffusion coefficients *D*<sub>S</sub> and *D*<sub>L</sub> (in mm<sup>2</sup> s<sup>−1</sup>) for the short- and long-term regions, respectively (before and after the critical point), from the slopes of linear–linear plots of <Δ*r*<sup>2</sup>> vs. Δ*t* curves, and calculated the corresponding scaling exponents *H*<sub>S</sub> and *H*<sub>L</sub> from the log–log plots of such curves. We measured the coordinates of the critical point separating short- and long-term regions, defined as the critical time interval Δ*t*<sub>rc</sub> (in s) on the *x* axis and the critical mean square displacement <Δ*r*<sup>2</sup>><sub>c</sub> (in mm<sup>2</sup>) on the *y* axis (see **Figure 1B**).
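The computation behind a stabilogram-diffusion plot can be sketched as follows. This is a minimal Python illustration, not the authors' implementation. It assumes that the planar squared displacement is Δ*r*<sup>2</sup> = Δ*x*<sup>2</sup> + Δ*y*<sup>2</sup>, and that the conventions <Δ*r*<sup>2</sup>> = 2*D*Δ*t* and <Δ*r*<sup>2</sup>> ∝ Δ*t*<sup>2*H*</sup> of Collins and De Luca (1993) apply, so that *D* and *H* are each half of the fitted slope. All function names are hypothetical, and the choice of which lags belong to the short- or long-term region is left to the caller.

```python
from math import log


def mean_square_displacement(x, y, max_lag):
    """Return [<dr^2>(m) for m = 1..max_lag], averaged over all start times.

    x, y: CoP coordinates (mm); the lag m is expressed in samples.
    """
    n = len(x)
    msd = []
    for m in range(1, max_lag + 1):
        total = 0.0
        for i in range(n - m):
            dx = x[i + m] - x[i]
            dy = y[i + m] - y[i]
            total += dx * dx + dy * dy
        msd.append(total / (n - m))
    return msd


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    den = sum((a - mean_x) ** 2 for a in xs)
    return num / den


def diffusion_coefficient(lags_s, msd):
    """Half the slope of the linear-linear plot (assumed convention: <dr^2> = 2*D*dt)."""
    return ols_slope(lags_s, msd) / 2.0


def scaling_exponent(lags_s, msd):
    """Half the slope of the log-log plot (assumed convention: <dr^2> ~ dt^(2H))."""
    return ols_slope([log(t) for t in lags_s], [log(v) for v in msd]) / 2.0
```

Under these conventions, a scaling exponent of 0.5 corresponds to an uncorrelated random walk, whereas a trajectory moving at constant velocity gives a value of 1.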

# *Recurrence quantification analysis*

We examined the local recurrence of data points in the reconstructed phase space. The following input parameters were chosen according to Webber and Zbilut's (2005) recommendations: the time lag was set to 50 ms, corresponding to two samples; the embedding dimension was set to 12; the radius was restricted to 2–3% of the mean distance between data points, since 1% provided too few recurrent points and 4–5% too many; and the number of successive points defining a diagonal line segment was set to 3. We measured the percentage of recurrence (%REC), the percentage of determinism (%DET), and the maximal diagonal line length (MAXL). We calculated the three measures separately for the *x* and *y* planes of the CoP fluctuations, and for radii equal to 2 and 3%. An example of an RQA plot is shown in **Figure 1C**.
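The three measures can be illustrated with a minimal Python sketch of recurrence quantification on a one-dimensional series. It is not the authors' code. It assumes Euclidean distances between time-delay embedded vectors, a radius expressed as a percentage of the mean pairwise distance, and counts taken over the upper triangle of the recurrence matrix (main diagonal excluded). The function names are hypothetical, and MAXL is taken here as the longest diagonal line of at least `min_line` points. Because the cost is quadratic in the number of points, the sketch is meant for short demonstration series rather than full recordings.

```python
from math import sqrt


def embed(series, dim, lag):
    """Time-delay embedding: vector i is (s[i], s[i+lag], ..., s[i+(dim-1)*lag])."""
    n = len(series) - (dim - 1) * lag
    return [[series[i + k * lag] for k in range(dim)] for i in range(n)]


def rqa(series, dim, lag, radius_pct, min_line):
    """Return (%REC, %DET, MAXL) for one series."""
    vectors = embed(series, dim, lag)
    n = len(vectors)
    # Pairwise Euclidean distances, upper triangle only (i < j).
    dist = {}
    for i in range(n):
        for j in range(i + 1, n):
            dist[(i, j)] = sqrt(sum((a - b) ** 2
                                    for a, b in zip(vectors[i], vectors[j])))
    threshold = radius_pct / 100.0 * (sum(dist.values()) / len(dist))
    recurrent = {pair for pair, d in dist.items() if d <= threshold}

    # Collect the lengths of diagonal line segments (constant j - i).
    line_lengths = []
    for offset in range(1, n):
        run = 0
        for i in range(n - offset):
            if (i, i + offset) in recurrent:
                run += 1
            else:
                if run:
                    line_lengths.append(run)
                run = 0
        if run:
            line_lengths.append(run)

    long_lines = [length for length in line_lengths if length >= min_line]
    n_rec = len(recurrent)
    pct_rec = 100.0 * n_rec / len(dist)
    pct_det = 100.0 * sum(long_lines) / n_rec if n_rec else 0.0
    maxl = max(long_lines, default=0)
    return pct_rec, pct_det, maxl
```

With the parameters reported above, a call would look like `rqa(cop_x, dim=12, lag=2, radius_pct=2, min_line=3)`, where `cop_x` is a hypothetical list of mediolateral CoP positions.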

# **STATISTICAL ANALYSIS**

All measures were submitted to analyses of variance (ANOVAs) with Group (two levels: CD vs. FP) as between-participant factor, and Period (two levels: pre-test vs. post-test) and Eye (two levels: eyes closed, eyes open) as within-participant factors. The training duration was entered as a covariate in all statistical analyses since it was higher in the CD group compared to the FP group (5.06 ± 0.92 vs. 4.01 ± 0.85 months, respectively; *t*<sub>39</sub> = 3.73, *P* < 0.001). *Post hoc* tests were calculated using Fisher's least significant difference (LSD) method. Critical results were corroborated by calculating more conservative *post hoc* tests using the Newman–Keuls (NK) method. For critical results, we also calculated effect sizes using Cohen's measure defined as (*m*<sub>exp</sub> − *m*<sub>ctrl</sub>)/[(σ<sub>exp</sub> + σ<sub>ctrl</sub>)/2], where *m* and σ are respectively the mean and SD for the experimental (CD) and control (FP) groups. We used Statistica 7.0 (StatSoft, USA) for all analyses. Distributional information was given by standard errors (SE).
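The effect size defined above can be written as a short Python helper. This is only a sketch of the stated formula. The function names and the numerical values in the usage note are hypothetical and are not taken from the study. Because the tables report SE rather than SD, a conversion helper based on SD = SE × √*n* is included.

```python
from math import sqrt


def cohens_d(mean_exp, sd_exp, mean_ctrl, sd_ctrl):
    """Mean difference divided by the average of the two SDs, as defined in the text."""
    return (mean_exp - mean_ctrl) / ((sd_exp + sd_ctrl) / 2.0)


def sd_from_se(se, n):
    """Recover the standard deviation from the standard error of the mean."""
    return se * sqrt(n)
```

For instance, hypothetical group means of 1.5 and 1.0 with SDs of 0.2 and 0.3 give `cohens_d(1.5, 0.2, 1.0, 0.3)`, i.e., 0.5 divided by 0.25.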

# **RESULTS**

#### **STATISTIC SCORES**

Results are detailed in **Table 2**. Three-way ANOVAs with Group, Period, and Eye as factors showed neither main effect nor interaction for the length, the confidence ellipse and convex hull areas, the mean and variance velocities, and the Romberg quotient. For fractal dimension, only the Group × Eye interaction was statistically significant (*F*<sub>1,38</sub> = 6.02, *P* < 0.05), due to a lower mean value eyes open in the CD group (LSD, *P* < 0.01).

# **DIFFUSION ANALYSIS**

Results are provided in **Table 3**. Three-way ANOVAs with Group, Period, and Eye as factors showed neither main effect nor interaction for the diffusion coefficient *D*<sub>S</sub>, the scaling exponents *H*<sub>S</sub> and *H*<sub>L</sub>, and the critical mean square displacement <Δ*r*<sup>2</sup>><sub>c</sub>. For the diffusion coefficient *D*<sub>L</sub>, we found a main effect of Period (1.48 vs. 1.78 mm<sup>2</sup> s<sup>−1</sup> in pre- and post-test periods, respectively; *F*<sub>1,38</sub> = 4.57, *P* < 0.05).

For the critical time interval Δ*t*<sub>rc</sub>, we observed a main effect of Group (1.48 vs. 1.14 s in the CD and FP groups, respectively; *F*<sub>1,38</sub> = 6.84, *P* < 0.05). Critical for our hypothesis, the Group × Period × Eye interaction was significant (*F*<sub>1,38</sub> = 4.21, *P* < 0.05): in the eyes open condition, there was an increase between the pre- and post-test periods in the CD group (1.53 vs. 1.65 s; LSD, *P* = 0.030; NK, *P* = 0.045) vs. a tendency for a decrease in the FP group (1.16 vs. 1.02 s; LSD, *P* = 0.059). With respect to effect sizes, Cohen's *d* values in the eyes open condition were 2.93 and 4.75 in pre- and post-test periods, respectively (see **Figure 2A**).

#### **RECURRENCE QUANTIFICATION ANALYSIS**

**Table 4** shows detailed results. In the mediolateral plane, three-way ANOVAs with Group, Period, and Eye as factors showed a main effect of Group for %REC and a radius of 2% (*F*<sub>1,38</sub> = 9.90, *P* < 0.01), for %REC-3% (*F*<sub>1,38</sub> = 9.66, *P* < 0.01), for MAXL-2% (*F*<sub>1,38</sub> = 11.8, *P* < 0.01), and for MAXL-3% (*F*<sub>1,38</sub> = 6.95, *P* < 0.05) with higher mean values in the CD group, and a main effect of Period for %REC-2% (*F*<sub>1,38</sub> = 5.41, *P* < 0.05) with a higher value in the pre-test period. We observed a Group × Eye interaction for all measures except MAXL-2% due to systematically higher mean values eyes open in the CD group (LSD, *P* < 0.05), and a Period × Eye interaction for %REC-2% (*F*<sub>1,38</sub> = 5.65, *P* < 0.05), and %REC-3% (*F*<sub>1,38</sub> = 4.90, *P* < 0.05) with only a tendency for higher mean values eyes open in the post-test period (LSD, *P* > 0.05).

With regard to our hypothesis, we found a Group × Period interaction for %REC-2% (*F*<sub>1,38</sub> = 4.34, *P* < 0.05), which was due to a tendency for a decrease between pre- and post-test periods in the CD group (respectively 0.737 and 0.623; LSD, *P* > 0.05) whereas no change occurred in the FP group (0.343 and 0.339; LSD, *P* > 0.05). The Group × Period × Eye interaction was significant for all measures (*P* < 0.05), with a systematic pattern that is illustrated in **Figures 2B,C**. In the eyes open condition, the mean value decreased in the CD group for all measures (%REC-2%, LSD, *P* = 0.002, NK, *P* = 0.002; %REC-3%, LSD, *P* = 0.005, NK, *P* = 0.006; MAXL-2%, LSD, *P* = 0.012, NK, *P* = 0.006), whereas it increased in the FP group for all measures (LSD, *P* < 0.05 only for %DET-2%; see **Figures 2B,C**). To illustrate effect sizes, Cohen's *d* values for %REC-2% in the eyes open condition were 6.57 and 1.54 in pre- and post-test periods, respectively (see **Figure 2B**). For MAXL-2%, Cohen's *d* values eyes open were 7.27 and 2.17 in pre- and post-test periods (see **Figure 2C**). Eyes closed, the pattern was the reverse: an increase between the two periods in the CD group (LSD, *P* < 0.05 only for %DET-2%) vs. a decrease in the FP group (LSD, *P* < 0.05 only for MAXL-3%).

**Table 2 | Mean** ± **SE of statistic scores for the groups of participants (CD, contemporary dance; FP, fall prevention).**

*ANOVAs' F and P values are those of the third-order interaction (except for the Romberg quotient for which F and P values are those of the second-order interaction).*

**Table 3 | Mean** ± **SE of diffusion analysis for the groups of participants (CD, contemporary dance; FP, fall prevention).**

*ANOVAs' F and P values are those of the third-order interaction. Asterisk only indicates statistically significant difference (LSD, P* < *0.05) between pre- and post-test periods.*

diffusion analysis. **(B)** Mean percentage of recurrence (%REC) and **(C)** mean maximal diagonal line (MAXL) for mediolateral fluctuations (CPx) for a radius of 2% from the recurrence quantification analysis. Results are shown as a function of pre-test (Pre) and post-test (Post) periods, and of eyes closed (left panels) and eyes open (right panels) conditions, for the two training programs: contemporary dance (CD) in full lines, and fall prevention (FP) in dotted lines. Vertical bars are SE. Asterisks only indicate statistically significant difference (LSD, *P* < 0.05) between pre- and post-test periods within a group.

In the anteroposterior plane, three-way ANOVAs with Group, Period, and Eye as factors showed a main effect of Eye for %REC-2% (*F*<sub>1,38</sub> = 4.19, *P* < 0.05) and %REC-3% (*F*<sub>1,38</sub> = 4.16, *P* < 0.05) with higher mean values eyes open. As in the mediolateral plane, we observed a Group × Eye interaction for all measures as a result of higher mean values eyes open in the CD group (LSD, *P* < 0.05 except for %REC-3% and %DET-3%).

With respect to our expectations, a Group × Period interaction was found for %REC-2% (*F*<sub>1,38</sub> = 4.10, *P* < 0.05), as a result of a tendency for a decrease vs. an increase in mean values between the pre- and post-test periods in the CD vs. the FP groups, respectively (the difference between the two periods failed to reach significance in either group). The Group × Period interaction was also significant for MAXL-2% (*F*<sub>1,38</sub> = 5.50, *P* < 0.05), for which the decrease in the CD group was not significant, contrary to the increase in the FP group (LSD, *P* < 0.05). Finally, there was no third-order interaction for any measure.

#### **DISCUSSION**

The main findings of this study were that the practice of CD in older adults enhanced the critical time interval in SDA, and reduced recurrence and mathematical stability in RQA, as compared to the practice of FP, which tended to induce the reverse patterns. One limitation of this study was that the two groups did not perform equally in the pre-test period. The initially higher level in gymnastics in the FP group may have contributed to the between-group disparity, and further research is needed to corroborate our observations.

In aging, previous cross-sectional studies have suggested that social dance may benefit balance, postural stability, and walking pattern (Verghese, 2006; Zhang et al., 2008), and intervention studies have evidenced how aerobic (Shigematsu et al., 2002; Hui et al., 2009), Laban-based (Hamburg and Clair, 2003), or jazz (Alpert et al., 2009) dance improve balance using clinical (Shigematsu et al., 2002; Hamburg and Clair, 2003; Hui et al., 2009) or laboratory (Alpert et al., 2009) measures. Our study is the first one to examine the effects of CD on motor control using posturography and dynamic analyses of static posture before/after the intervention, which enabled us to provide further insight into the underlying mechanisms by which dance influences postural control (Judge, 2003).

For Collins and De Luca (1993), a human being in quiet standing is viewed not as an inverted pendulum but as a pinned polymer whose CoP displacements result from a blend of stochastic and deterministic processes. Over short-term intervals, the postural control system would utilize open-loop mechanisms dominated by randomness, before closed-loop mechanisms would be called into play with a prevalence of deterministic control. Here we showed that CD delayed the point at which the postural system switches from open- to closed-loop control, suggesting that CD enlarged the initial temporal window for stochastic motor processes. A previous study reported age-related increased critical mean square displacement and critical time interval resulting in a higher short-term interval slope, which the authors interpreted as enhanced postural stiffness with age (Collins et al., 1995). In our CD group, the post-test increase in critical time interval without any change in critical mean square displacement resulted in a lower short-term interval slope, which suggests reduced postural stiffness after CD training.

**Table 4 | Mean** ± **SE of recurrence quantification analysis for the groups of participants (CD, contemporary dance; FP, fall prevention).**

*ANOVAs' F and P values are those of the third-order interaction. Asterisks only indicate statistically significant difference (LSD, P* < *0.05) between pre- and post-test periods.*

Further insight was provided by the RQA approach (Riley et al., 1999), which allowed us to quantify the degree of recurrence, determinism, and mathematical stability of CoP displacements. CD reduced both recurrence and mathematical stability, suggesting that CoP displacements were less likely to repeat themselves over time and that their dynamics were more flexible, whereas FP yielded the opposite tendency. Such a pattern was mostly visible for mediolateral displacements of the CoP, which we explain from a biomechanical viewpoint by the fact that the base of support for upright stance is wider in the mediolateral plane than in the anteroposterior one.

Taken together, we propose that CD promoted stochastic postural control in older adults, by providing more time to random postural processes, decreasing repeatability and increasing flexibility of postural oscillations. In both SDA and RQA, the effects were observed with eyes open, which may be due to the fact that motor activities were mostly practiced with eyes open. Since we measured static posture, we suggest that CD influenced the flexibility of the central postural system *per se*, resulting in higher complexity in the mathematical sense, i.e., higher adaptability in the physiological sense. Such an effect may have been caused by improvisation, which favored creativity and constant adaptation to constraints in space, time, and interaction with other dancers, whereas FP, based on more stereotyped behaviors, tended to produce opposite effects.

By suggesting that CD improves postural flexibility in older adults, the present study complements a previous report (Coubard et al., 2011) showing that CD improves cognitive flexibility in aging. How motor and cognitive dimensions interact to yield parallel effects needs to be further investigated by measuring motor and attentional control in the same participants undergoing such training programs. In the meantime, we suggest a cortical-subcortical loop hypothesis to account for the improvement of postural flexibility. Motor control involves extensive areas of the central nervous system from the spinal cord to the cerebral cortex: globus pallidus, putamen, caudate nucleus, thalamus, substantia nigra, subthalamic nucleus, cerebellum, reticular formation, vestibular nuclei. At a higher level, the supplementary motor cortex, the frontal eye fields, and the dorsolateral prefrontal cortex play a supramotor role in, respectively, preparing (Jenkins et al., 2000), monitoring (Schall, 2004), and controlling (Rowe et al., 2000) the movement to be produced by the primary motor cortex.

It is likely that this motor network together with cortical–subcortical loops linking the cerebral cortex to basal ganglia (Alexander et al., 1986) may be involved in motor activities such as CD and FP. However, CD may require higher attentional demand than FP due to the practice of improvisation. In such a way, the prefrontal–subcortical interaction may have been recruited with higher frequency and intensity, resulting in enhanced motor flexibility. Taken together with the report by Coubard et al. (2011), this study suggests that CD induces changes at both postural and attentional levels, which may share common characteristics.

To conclude, the results of this study suggested that CD practice favors flexible postural control in older adults. Taken together with good acceptance, adherence, and moderate intensity associated with this practice, we recommend CD to develop motor plasticity not only in normal aging but also in pathological conditions with motor stiffness (Schmit et al., 2006).

# **ACKNOWLEDGMENTS**

Results were presented at the 6th International Conference on the Arts in Society (Berlin, Germany, 2011). The authors thank ADAL (Paris) for allowing the assessment of some participants.

#### **REFERENCES**

normal, dans la maladie d'Alzheimer et dans la démence frontotemporale. *Psychol. Neuropsychiatr. Vieil.* 2, 181–189.

disability. *Am. J. Prev. Med.* 25, 150–156.

delivered home exercise programme to prevent falls. 2: controlled trial in multiple centres. *BMJ* 322, 701–704.

sig, R. W. (2011). Simultaneously measuring gait and cognitive performance in cognitively healthy and cognitively impaired older adults: the Basel motor-cognition dual-task paradigm. *J. Am. Geriatr. Soc.* 59, 1012–1018.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 May 2011; accepted: 08 December 2011; published online: 29 December 2011.*

*Citation: Ferrufino L, Bril B, Dietrich G, Nonaka T and Coubard OA (2011) Practice of contemporary dance promotes stochastic postural control in aging. Front. Hum. Neurosci. 5:169. doi: 10.3389/fnhum.2011.00169*

*Copyright © 2011 Ferrufino, Bril, Dietrich, Nonaka and Coubard. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# *Multi-Modal Artistic Processing in the Brain*


Fortunato Battaglia, Sarah H. Lisanby and David Freedberg


# **THE CINEMA-COGNITION DIALOGUE: A MATCH MADE IN BRAIN**

**Yadin Dudai\***

*Department of Neurobiology, The Weizmann Institute of Science, Rehovot, Israel*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Oliver C. Schultheiss, Friedrich Alexander University, Germany Uri Hasson, Princeton University, USA*

#### *\*Correspondence:*

*Yadin Dudai, Department of Neurobiology, The Weizmann Institute of Science, Rehovot 76100, Israel. e-mail: yadin.dudai@weizmann.ac.il*

That human evolution amalgamates biological and cultural change is taken as a given, and that the interaction of brain, body, and culture is more reciprocal than initially thought becomes apparent as the science of evolution evolves (Jablonka and Lamb, 2005). The contribution of science and technology to this evolutionary process is probably the first to come to mind. The biology of *Homo sapiens* permits and promotes the development of technologies and artefacts that enable us to sense and reach physical niches previously inaccessible. This extends our biological capabilities, but is also expected to create selective pressures on these capabilities. The jury is still out on the pace at which critical biological changes take place in evolution. There is no question, however, that the kinetics of technological and cultural change is much faster, rendering the latter particularly important in the biography of the individual and the species alike. The capacity of art to enrich human capabilities is recurrently discussed by philosophers and critics (e.g., Aristotle/*Poetics*, Richards, 1925; Smith and Parks, 1951; Gibbs, 1994). Yet less attention is commonly allotted to the role of the arts in the aforementioned ongoing evolutionary tango. My position is that the art of cinema is particularly suited to explore the intriguing dialogue between art and the brain. Further, in the following set of brief notes, intended mainly to trigger further thinking on the subject, I posit that cinema provides an unparalleled and highly rewarding experimentation space for the mind of the individual consumer of that art. In parallel, it also provides a useful and promising device for investigating brain and cognition.

**Keywords: brain, cinema, dissociative states, emotional mental travel, mental time travel, working memory**

# **ON THE CINEMA-BRAIN RESONANCE**

Born just a little over a century ago, cinema capitalized on the rich history of the art of the theatre and on developments in the technology of photography, while harnessing the visual illusion of motion. Combined with the budding of globalization, this culminated in the fast development of cinema into a popular cognitive domain and social phenomenon, and ultimately into a rich universe of visual (and ultimately audiovisual) artistic and social experience (Cook, 1981; Salt, 1992; Thompson and Bordwell, 2003). But what is it that turned cinema into such a success? I propose that in addition to the ripe technological and social context that promoted cross-cultural dissemination, a major drive in the fast and triumphant evolution of cinema is that this form of art uniquely fits, exploits and expands the potential of basic and critical faculties of human brain and cognition. These are *Working Memory* (WM), *Mental Time Travel* (MTT), *Mental Emotional Travel* (MET), and a spectrum of transitions in consciousness manifested in *Dissociative States*. Furthermore, since cinema taps into the above faculties, it can also be exploited as a convenient scientific tool to investigate those faculties and their brain substrates.

# **ON INFORMATION SYSTEMS IN THE BRAIN**

Understanding how the human brain reads a movie and reacts to it can benefit from understanding how the brain acquires information about the world in general. Our brain has evolved multiple knowledge or memory systems (**Figure 1**). These are commonly classified along multiple axes (Dudai, 2002). One of these axes is time—whereas some information is stored for seconds or minutes only, other information is stored for weeks, months, years, even a life time. The first type of information is aptly termed "short-term memory" (STM), whereas the second is "long-term memory" (LTM). Another criterion for the taxonomy of memory systems, which is presently dominant in the science of memory literature, concerns the role of conscious awareness in retrieving the information. Hence LTM is considered as either "declarative" ("explicit") or "non-declarative" ("implicit") (Dudai, 2002). Declarative memory involves the conscious recollection of facts and events, as opposed to non-declarative, in which retrieval can materialize in the absence of conscious awareness. The declarative—non-declarative dichotomy is widespread in the literature not only because it is intuitively appealing but also because the brain honors it, i.e., different brain circuits subserve the two types of information.

# **ON WORKING MEMORY (WM)**

A dedicated information processing system that combines STM and LTM is "WM" (**Figure 2A**). WM is a limited capacity system embedded in distributed brain circuits that holds information under attentional control in temporary storage during the planning and execution of a task (Miller et al., 1960; Baddeley, 2007). It combines on-line information (i.e., percepts) with off-line information (i.e., LTM) to yield temporary task-oriented internal representations. Some of these representations may subsequently become consolidated into LTM, but often, it is disadvantageous to retain the task-related information in LTM because it may interfere with subsequent tasks. WM is hence a "mental hub" essential for mentation and behavior and indispensable for human cognition and intelligence. Rudimentary WM capabilities may exist in species lower on the phylogenetic scale, but it is considered to have reached its pinnacle in humans, and it takes years to mature in the individual of the species (Luciana and Nelson, 1998).

A particularly influential cognitive model of WM proposes three types of components (Baddeley, 2007) (**Figure 2A**). One is an attentional control system, termed the "central executive" (CE). Another type is content-dedicated workspaces that are depicted as subordinates of the CE. The model singles out two: a phonological loop, which deals with speech-based information and is assumed to comprise a phonological store and articulatory rehearsal mechanism, and a visuospatial sketchpad, which deals with visuospatial information. The two workspaces are assumed to process information related to the most salient domains of the human mind—vision, space, sound, and language. Additional "workspaces" may exist (Yeshurun et al., 2008). Finally, a third type of hypothetical component is the episodic buffer: mental space in which information from the content-dedicated workspaces and LTM is temporarily bound under the control of the CE to form coherent representations of events, on their potential route to LTM. The CE is postulated to interconnect with modulatory and reinforcing circuits, e.g., encoding emotion and hedonic valence, which control the allocation of attention and filter the transformation of WM representations to LTM.

It is noteworthy that generic attributes of film resonate optimally with the capabilities of WM, and that WM seems to be able to exploit efficiently information in movie stimuli. This resonance was postulated to greatly enhance the rapid successful integration of movies as an "extracorporeal" cognitive organ and a global social phenomenon (Dudai, 2008). Several points support this assumption (Dudai, 2008):


**FIGURE 2 | Film resonates with working memory.** A dominant model of WM considers multiple components (Baddeley, 2007). They are portrayed as a master system, the *central executive*, which executes attentional control over subordinate systems that are content-dedicated mental workspaces, the *phonological loop*, which deals with speech-based information, and the *visuospatial sketchpad*, which deals with visuospatial information. Another postulated component is the *episodic buffer*, in which information from the content-dedicated workspaces and LTM is temporarily bound under the control of the central executive, to form coherent representations of events, on the potential route to LTM. The mental state evoked by the relevance to survival (e.g., threat, mate, food) of the information flowing into each of the subordinate systems and bound in the episodic buffer could be considered as "emotion"; it is usually not explicitly included in models of WM and therefore not depicted in the scheme discussed here, yet is highly relevant to the appeal and effect of cinema (see MET in the text). **(A)** Defining attributes of narrative film resonate neatly with multiple components of WM, as well as with effective transformation of information from the episodic buffer into long-term memory. Three major attributes are contextual focusing of the central executive toward the stimulus, intense multi-modal co-activation of both the visuospatial sketchpad and the phonological loop, and compression of narrative highlights that facilitate the focusing of the CE as well as the pruning of information to be consolidated from the episodic buffer into LTM. "Author" usually represents multiple individuals though in some cases mainly the director, still never really in isolation. **(B)** Captivating movies can induce a dissociative state in which the movie stimulus dominates the operation of WM components to temporarily block simultaneous unrelated input. For further discussion including comparison to other art forms and other dissociative states, see text. (The frame in the inset is from Bresson's *Pickpocket*, 1959) (Adapted from Dudai, 2008).

function (Dudai, 2002). The multimodality of film hence enhances its perceptual and mnemonic effectiveness. The unique role of multi-sensory synergism in film has long been noted by major film directors (Eisenstein, 1998). Indeed, some silent films have outstanding affective impact and artistic qualities; nevertheless, activation of the brain's language workspace is likely to occur even in the absence of sound, by observing people talking and trying to decipher what they say. It is also of note that even silent film had snapshots of explicit verbal information, provided by intercalated text slides.


absence of explicit social interaction. This could markedly affect the CE, focusing attention and creating a special mind set, and could also activate social-intimacy and safety reward circuits. This enhanced attention in the semi-detached milieu could further activate the episodic buffer, while at the same time promote a transient, mild dissociative state (and see below). This added-value of contextual defamiliarization may account for the failure of Edison's Kinetoscope, in which spectators watched movies in isolation.

(h) Dissociative states of the aforementioned type can be assumed to involve transient loss of inhibitory control by frontal brain areas, i.e., disruption of CE function. This loss of control is potentially rewarding (as illustrated by the individuals and communities who enter trances of various sorts voluntarily; Kihlstrom, 1985; Robinson and Berridge, 2003). Once induced by spatiotemporal MTT in the unique contextual setting and mental set, the dissociative state in the spectator might be rewarding *per se*, promoting positive feedback that further promotes the enjoyable mental state.

# **ON MENTAL TIME TRAVEL**

Resonance with the capabilities of WM is, however, only one component in the productive dialogue between brain and cinema. Another is the ability of movies to extend, manipulate and promote individual experimentation with another pinnacle of human brain and cognition, namely, MTT (also termed chronesthesia). MTT refers to the ability to be aware of one's past and reenact it in mind, as well as to imagine potential future scenarios (Tulving, 1983, 2005; Suddendorf and Busby, 2005; Bar, 2011; Suddendorf et al., 2011). Some consider this mental faculty to be uniquely human, others posit that rudimentary forms exist in some other species as well (Tulving, 1983, 2005; Suddendorf and Busby, 2005; Bar, 2011; Suddendorf et al., 2011). MTT is the decisive fingerprint of bona-fide episodic memory. Its imagining component, i.e., the ability to mentally construct potential scenarios of future occurrences, has been suggested as a major drive in the evolution of episodic memory (Dudai and Carruthers, 2005; Schacter and Addis, 2011). It may also underlie the feeble veracity of episodic recollection: strict faithfulness to details might hamper useful imagination. It is noteworthy that episodic recollection and imagining share brain circuits (Hassabis and Maguire, 2011; Schacter and Addis, 2011).

Movies promote, entrain and enhance MTT. Their ability to simulate real-life, day-dreaming, and "dream-like" experiences by fusing multimodal perception with emotional and cognitive overtones, distanced from the acute spatiotemporal coordinates in which the spectator is present at that specific point in time, was long noted by movie theorists (Eisenstein, 1969; Morin, 2005). Indeed, movies have recently been introduced as effective stimuli in perceptual studies and as memoranda in memory studies that combine behavioral analysis and functional neuroimaging (Hasson et al., 2006, 2008a; Furman et al., 2007; Mendelsohn et al., 2008, 2009, 2010). What is less noted in the studies of brain and cognition is that the experience of becoming immersed in a movie also provides an intriguing mental experimentation space for exercising MTT in the observer, and as such, can provide internal reward in exposing the immersed observer to imaginary experiences otherwise unattainable. This rewarding value is shared with other forms of art; however, cinema, being a multi-modal art form, may provide a more universal, and for most individuals probably more accessible, opportunity to tap into this type of reward.

# **ON MENTAL EMOTIONAL TRAVEL**

Similarly to the promotion and entraining of MTT, and coupled to this ability, movie art can also be considered an effective manipulator of MET. "Emotion" is considered in the scientific literature in multiple connotations, the two dominant ones being emotions as a trigger of an automatic physiological response, mostly to danger and social cues, and emotions as the subjective feeling which accompanies the above and other states related to the relevance of ambience to the self (LeDoux, 1996). In the present context, it is the latter manifestation of emotion that counts. Given the proper movie, the observer can wander into and explore a spectrum of rich and deep emotional experiences and domains unexplored by most people in daily life, let alone within the condensed time capsule that the movie offers. Selected (admittedly idiosyncratic) examples range from neorealistic cornerstones (e.g., Ozu's *An Inn in Tokyo*, its artistic sequel, *Bicycle Thieves* by De Sica, or Rossellini's *The War Trilogy*) to the bleak and provocatively disturbing postmodernism of Haneke in *Caché* and other masterpieces. Exploration of the unlimited imaginary emotional spectrum further expands the mental reward space provided by cinema. Although the movies and the examples of the cinematic devices brought up in this article mostly refer to "auteur" (in European cinema a top director is considered the author of the movie) or arthouse movies, clearly, a movie need not be a high quality art piece to achieve the aforementioned effects. Any emotional drama, irrespective of its artistic quality and literary value, evokes MET. Indeed, both MTT and MET are generic attributes of the cinema.

# **ON CINEMATIC APPROACHES TO PROMOTING MTT AND MET**

A wide range of styles used by various film directors can effectively trigger and promote explorative MTT and MET. It is noteworthy that excessive audiovisual effects or mimicking real-life excessively to the point that defamiliarization, an important artistic device (e.g., Brecht, 1977), is minimized, are not necessarily helpful; making a movie too real was proposed to even hamper imagining and hence MTT (Dudai, 2008). In the present context, only a single particularly interesting style, which echoes a highly successful conceptual framework of modern scientific research and therefore might particularly be appreciated by scientists, will be briefly noted. This is reductionism, characterized by an attempt to identify cognitive, emotional, and motor universals and manipulate them in a minimalistic manner. This is a bottom-up approach guided by the goal of entwining emergent cognitive and emotional outcomes from their most basic building blocks. This seems to effectively prompt the observer to reconstruct situations, plots and emotions while maximizing mental effort, attention and self-involvement—not unlike those sometimes required for successful reenactment of remote self-episodes. Two major representatives of this approach come immediately to mind, each unique in his idiosyncratic implementation of the concept: the French auteur Robert Bresson (1901–1999) and the Japanese auteur Yasujirô Ozu (1903–1963).

Bresson (*A Man Escaped, Pickpocket*, *Au Hasard Balthazar, Mouchette*, and nine other masterpieces), a master of lean and crystallized cinematography, recurrently used what he called "models": non-professional actors trained in neutral line reading, automatic gestures, and emotional inexpressiveness (Quandt, 1998). He attempted to identify and use the most reducible behavioral elements, and strip these motor, cognitive, and emotional atoms from all superfluous context- and time-dependent heuristics. By doing so he wished to present the "pure" human action (and hence potential feelings underlying it) to the observer, and to decipher and reconstruct the scene and its underpinning bare human actions. Bresson's style is to focus on body parts (e.g., hands) rather than the whole body, pushing reductionism even further. "Models who have become automatic (everything weighed, measured, timed, repeated 10, 20 times) and are then dropped in the medium of the events of your film—their relations with the objects and persons around them will be right, because they will not be thought" (Bresson, 1975) ". . . It is with something clean and precise that you will force the attention of inattentive eyes and ears."

Ozu, in contrast, exercised reductionism and minimalism while relying on a small cast of professional actors, many of them playing recurrently in his films. The overall outcome in terms of inciting universal responses in the observer is, however, quite similar to that of Bresson, though reflecting a more humane and empathic and less austere and religious ambience than the latter. Ozu used an almost unbelievable number of takes for every scene, "correcting our every inflection, over and over . . . trying to reduce things to their most basic essence, free of all excess" (Arima, 2003). An idiosyncratic Ozu shooting style, which promotes attention and focuses the gaze, was the so-called "tatami shot," in which the camera (always static, no tracking shots) is placed at a low height, supposedly at the eye level of a person kneeling on a tatami mat. Ozu produced 53 movies, most of them a variant of a similar type of simple plot focusing on family life and generation gaps; *An Inn in Tokyo, The Only Son, Late Spring* and *Tokyo Story* are notable examples. In a way, Ozu (like many great artists) repeated his leitmotif, again and again, 50 times, each time trying to extract novel nuances using the same elementary building blocks (Bordwell, 1988). Bresson and Ozu, each in his unique reductive and minimalist manner, exposed the underlying unity of the human condition. Their rationale, a driving force for many artistic giants, was effectively expressed two centuries earlier: "Nothing can please many, and please long, but just representations of general nature" (Johnson, 1765).

#### **ON DISSOCIATIVE STATES**

All forms of art are capable of inducing some form or another of transient "dissociative states." These are disruptions in integrative functions of consciousness, memory, identity, or perception, and can be pathological. Transient, mild dissociation occurs, however, in normal individuals when they get immersed in some activity while suppressing attention to other external or internal stimuli (see also in this context "suspension of disbelief" in Bazin, 1967). When induced by art, dissociative states could be regarded as the enslavement of the CE of the consumer to that of the author (Dudai, 2008) (**Figure 2B**). The appreciation that the artist can come to control the audience's mind has of course been with us since classical times, probably dating back to cave art at the dawn of civilization (Lewis-Williams, 2002). Although while engaged in the creative act the artist may not necessarily be aware of the long-reaching effects on the other's mind, many are; a notable example in film is Eisenstein, who, faithful to the tradition of Soviet pragmatism and Pavlovian physiological psychology, attempts to condition the spectator with discrete sensory and semantic devices (Eisenstein, 1947). Tarkovsky formulates the mind-control objective boldly: ". . . a kind of revision takes place within the subjective awareness . . . this process is inherent in the relationship between writer and reader; it's like a Trojan horse, in whose belly the writer makes his way into his reader's soul" (Tarkovsky, 1986). Many film theorists noted the dissociative, or "lowered consciousness" state that can be induced by cinema (Kracauer, 1960), some attributing it to the aforementioned "dream-like" state (Clair, 1953) or to "day dreaming" experiences (Morin, 2005). The depth, persistence and quality of the dissociative state clearly depend on the reader, listener or spectator, on the specific work of art, and on the context, but to get an idea, the reader of this discussion might wish to imagine getting absorbed in a book, a quartet, or a film. This transient partial detachment from the outside world is a function of the state of the WM system at that specific point in time. Dissociative states can have a marked reward valence, as well exemplified by those taking drugs to obtain them, risking addiction (Robinson and Berridge, 2003). They hence provide another potential reward value that promotes the enjoyment of movies.

# **ON CINEMA AS A PARTICULAR MENTAL EXPERIMENTATION SPACE**

One could argue that the ability of cinema to resonate with WM and to promote, instigate and extend MTT, MET, and limited dissociative states, is shared by other forms of art as well. The role in promoting and enriching MTT and MET is encapsulated already in Aristotle's reference to the poet, whose function is: ". . . to describe, not the thing that has happened, but a kind of thing that might happen . . . " (*Poetics* 1451.1). And one could neatly replace "Shakespeare" with "film auteur" in Johnson's praise of the Bard, who ". . . approximates the remote, and familiarizes the wonderful; the event which he represents will not happen, but if it were possible, its effects would probably be such as he has assigned; and it may be said, that he has not only shewn human nature as it acts in real exigencies, but as it would be found in trials, to which it cannot be exposed . . . he who has mazed his imagination . . . may here be cured . . . by scenes which a hermit may estimate the transactions of the world, and a confessor predict in the progress of the passions" (Johnson, 1765).

However, in my view, none of the other forms of art combines all the attributes of film, although a good piece of art is capable of evoking MET, and probably to a lesser degree MTT, particularly in the trained mind, eye or ear, irrespective of the medium. Painting and sculpture are visual, but do not involve concrete visual motion, auditory stimuli, and dynamic physical time compression. Even ingenious narrative-telling painters such as Poussin only create limited spatiotemporal compression in the mind of the spectator, restricting the appreciation of the limited mental travel which is anchored in a physically static snapshot only to those who invest proper mental effort. Another example is time-travel elicited by literary fiction, in which MTT, if elicited, is often more fragmentary and has to be accumulated over the time span of reading the piece. Music *per se* is not visual (though may evoke visual imagery). Theatre (including opera and some forms of dance) is audiovisual and uses limited spatiotemporal compression, with more restricted potential MTT and fewer technological capabilities than film (e.g., absence of rapidly merged flashbacks, close ups, and panning, unless film is integrated into theatre, opera, dance, or other forms of the visual arts). Furthermore, having human beings in real time on stage may *a priori* limit defamiliarization. Hence although it is an error to try and rank art forms, as the types of emotional and cognitive enrichment and reward that they incite and provide differ by the art form, the art piece, and the participant or consumer, one could still generalize that film as a medium is that art form that integrates the most varied and advanced technologies for mimesis, while at the same time presenting on average the anonymous spectator with most opportunities and, most importantly, lowest threshold to extract idiosyncratic enjoyment. 
Still, of course, without investing much mental work on top of the movi[ng pictur]e, and without some experience, one cannot fully appreciate a Bresson, Ozu or a Tarkovsky, or the works of many others.

# **ON CINEMA AS A SCIENTIFIC EXPERIMENTATION SPACE**

While film equips us with an extracorporeal cognitive and emotional space that can enrich and expand MTT and MET, as well as potentially induce rewarding dissociative states, it also provides scientific research with a powerful tool to probe human brain and cognition. Movies can serve as stimuli and memoranda that can effectively mimic realistic situations (Hasson et al., 2006, 2008a,b; Furman et al., 2007; Mendelsohn et al., 2008, 2010). They permit reproducible presentation of ongoing episodes, and are particularly useful in experiments that involve functional brain imaging, such as fMRI (functional magnetic resonance imaging) (Hasson et al., 2006, 2008a; Mendelsohn et al., 2008, 2010). Indeed, the use of movies has already provided novel information on brain processes elicited by complex audiovisual stimuli (Hasson et al., 2006) and on the engagement of identifiable brain circuits in long-term episodic (Hasson et al., 2008a; Mendelsohn et al., 2008, 2010) and autobiographical (Mendelsohn et al., 2009) memory (see also the discussion of "neurocinematics" in Hasson et al., 2008b). Of particular interest is the finding that, unlike "traditional" experiments, which consistently unveil subsequent memory effects for still images or context-less verbal material in the mediotemporal lobe (MTL) and the inferior frontal gyrus (IFG), the use of narrative film as memoranda also implicates the superior temporal gyrus (STG), temporoparietal junction (TPJ), and the temporal poles in memory formation (Hasson et al., 2008a) (**Figure 3**). These regions have been consistently implicated in social cognition and perception; this suggests that in real-life, the modulation of social cognitive processes impacts episodic memory formation, a finding that tended to slip under the radar of brain imaging paradigms using non-realistic memoranda.

**FIGURE 3 | Film as a window for exploring brain and cognition.** As discussed in the text, movies enrich human cognitive experience, but also provide a window into how this experience is encoded in the experiencing brain, because they can be used as reproducible real-life-like stimuli in perceptual and memory experiments. In this example, Hasson et al. (2008a) used a narrative movie as the stimulus to be encoded in long-term episodic memory. The statistical maps of blood oxygen level dependent (BOLD) activity depict brain areas with significantly enhanced activity during movie events that were subsequently remembered compared to events that were not remembered. These areas include the right temporal pole (TP), bilateral anterior and posterior superior temporal gyrus (STG), bilateral anterior parahippocampal cortex (aPHG), bilateral posterior parahippocampal gyrus (pPHG), and bilateral temporoparietal junction (TPJ). These areas were implicated by other studies in social cognition. RH, LH, are right and left hemisphere, respectively. This suggests that in real-life, the modulation of social cognitive processes impacts episodic memory formation, a finding not commonly unveiled by using simple static and contextless stimuli in memory experiments. (Adapted with permission from Hasson et al., 2008a).

All in all, hence, movies can enrich human mental experience, yet can also provide a window into how this experience is encoded in the experiencing brain.

# **ACKNOWLEDGMENTS**

I am grateful to Rina Dudai and Uri Hasson for enriching discussions of cinema, and to Aya Ben-Yakov, Micah Edelson, and Alex Pine for helpful comments.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 24 May 2011; accepted: 10 August 2012; published online: 04 September 2012.*

*Citation: Dudai Y (2012) The cinema-cognition dialogue: a match made in brain. Front. Hum. Neurosci. 6:248. doi: 10.3389/fnhum.2012.00248*

*Copyright © 2012 Dudai. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# **CORTICOMOTOR EXCITABILITY DURING OBSERVATION AND IMAGINATION OF A WORK OF ART**


#### *Fortunato Battaglia<sup>1,2</sup>\*, Sarah H. Lisanby<sup>3</sup> and David Freedberg<sup>4</sup>*

*<sup>1</sup> New York College of Podiatric Medicine, New York, NY, USA*

*<sup>2</sup> Division of Brain Stimulation and Therapeutic Modulation, Department of Psychiatry, Columbia University/New York State Psychiatric Institute, New York, NY, USA*

*<sup>3</sup> Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA*

*<sup>4</sup> Department of Art History and Italian Academy for Advanced Studies, Columbia University, New York, NY, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Juliana Yordanova, Bulgarian Academy of Sciences, Bulgaria*

*Hidenao Fukuyama, Kyoto University, Japan*

#### *\*Correspondence:*

*Fortunato Battaglia, New York College of Podiatric Medicine, 53 East 124th Street, Room 509, New York, NY 10035, USA. e-mail: fbattaglia@nycpm.edu*

We examine the effects on the motor system of the artistic representation of an action, here exemplified by Michelangelo's *Expulsion from Paradise*. Using single- and paired-pulse transcranial magnetic stimulation we analyze corticomotor excitability during observation of an action in the painting, during imagery of the painting, and during observation of a photograph of the same pose. We also analyze the effects of observation of two further paintings, one showing the same muscles at rest, and the other in a more overtly emotional context. Both observation of the *Expulsion* and imagery of the painting increased cortical excitability. Neither the relaxed pose of Michelangelo's *Creation* nor the flexed posture in the highly emotional context of Bellini's *Dead Christ* increased cortical excitability. Observation of a photograph of the same extended pose did not increase cortical excitability either. Moreover, intracortical inhibition was reduced during imagery of the painting. Our results offer clear motor correlates of the relationship between the esthetic quality of a work and the perception of implied movement within it.

**Keywords: art, transcranial magnetic stimulation, mental imagery, mirror neurons, motor cortex**

# **Introduction**

Works of art arouse a variety of reactions in their beholders. Among the most frequently reported responses is the beholder's sense of imitating the actions of figures in paintings. Several philosophers of art, notably German empathy theorists such as Robert Vischer and Theodor Lipps and the French phenomenologist Maurice Merleau-Ponty, have suggested that viewers of paintings feel bodily engaged by the movements represented within them, but no empirical research has been done on this aspect of response (Vischer, 1872; Lipps, 1906; Merleau-Ponty, 1962).

A still photograph of an action conveys dynamic information. It has been suggested that observers extract dynamic information by extrapolating future position from the motion implied by the photograph (Allison et al., 2000). The processing of such information from static images with implied motion engages brain areas activated during observation of real actions and motor imagery, including cortical visual motion area (V5/MT+), extrastriate body area (EBA), superior temporal sulcus (STS; BA38), and motion-related areas (Kourtzi and Kanwisher, 2000; Proverbio et al., 2009). Given the skill of 15th and 16th century artists in representing movement, we hypothesized that observation of an action depicted by a painter such as Michelangelo would arouse the same corticomotor responses as the observation of the same action in reality.

Transcranial magnetic stimulation (TMS) can be used to investigate cortico-spinal excitability. The amplitude of the motor evoked potential (MEP), obtained with single-pulse TMS, reflects the firing of cortico-spinal neurons (Hallett, 2000). Furthermore, a TMS stimulus delivered during voluntary contraction induces a temporary pause of the target muscle contraction, the cortical silent period (CSP), due to activation of cortico-spinal inhibitory circuits (Cantello et al., 1992). In addition, when a test stimulus is preceded by a subthreshold conditioning pulse, the resulting MEP can be either inhibited (short-interval intracortical inhibition, SICI) or facilitated (intracortical facilitation, ICF; Kujirai et al., 1993). SICI and ICF have proven to be useful parameters for probing inhibitory and facilitatory circuits within primary motor cortex (Ziemann, 2004; Fadiga et al., 2005; Battaglia et al., 2006). The technique has also been useful in elucidating the effects of observation of still photographs on motor excitability (Urgesi et al., 2006). But no study had yet been made of corticomotor excitability during the observation of an action represented in a work of art.

In this paper we use TMS to investigate (1) whether the observation of an action in an artistic representation activates the corticomotor system; (2) whether this effect is attributable to arousal as a result of the emotional context of the actions shown in the work of art or not; (3) whether the mental rehearsal of observation of a painting induces the same degree of corticomotor activation; and (4) whether there is any difference between responses to the action in the work of art and a photograph of the same pose.

# **Materials and Methods**

# **Subjects**

We studied 10 right-handed normal volunteers (7 men, 3 women; age range 29–37 years; mean age, 33.3 ± 2.9 years). All participants gave written informed consent and all experiments conformed to the Declaration of Helsinki. The experimental protocol was approved by the local ethics committee (NYCPM).

**Abbreviations:** ECR, extensor carpi radialis; MEP, motor evoked potential; RMT, resting motor threshold; TMS, transcranial magnetic stimulation.

# **EMG recording**

Surface EMG was recorded with disposable adhesive disk electrodes placed in a tendon–belly arrangement over the right extensor carpi radialis (ECR) muscle. The signal was amplified, filtered (bandpass 2–5 kHz), digitized (Micro 1401, Cambridge Electronics Design, Cambridge, UK), and stored in a laboratory computer for off-line analysis. During the experiments EMG activity was continuously monitored by visual (oscilloscope) and auditory (speakers) feedback to ensure either complete relaxation at rest or a constant level of EMG activity during tonic contraction.

# **TMS measurements of Cortical Excitability**

We used single-pulse and paired-pulse TMS to induce MEPs and to examine cortical excitability. TMS was performed with a 7-cm figure-of-eight coil and a Magstim 200 stimulator (The Magstim Company, Dyfed, UK). The coil was placed at the optimal position for eliciting MEPs from the right ECR muscle ("hot spot"). The coil was held tangentially to the skull with the handle pointing backward and laterally at an angle of 45° to the sagittal plane. Thus, the electrical current induced in the brain was approximately perpendicular to the central sulcus. This orientation of the induced electrical field is thought to produce a predominantly trans-synaptic activation of the cortico-spinal neurons (Rothwell et al., 1999). During the experiments EMG activity was continuously monitored by either visual (oscilloscope) or auditory (speakers) feedback to ensure complete relaxation.

Resting motor threshold (RMT) was determined as the minimum stimulator intensity (to the nearest 1%) to produce an MEP of 50 μV in five of 10 trials.
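The threshold criterion can be illustrated with a short search over stimulator intensities. This is only a sketch under stated assumptions: `measure_mep` is a hypothetical callback standing in for one stimulation and recording, the criterion is read as "at least 50 μV in at least five of 10 trials", and the descending 1% scan is an illustrative procedure that the text does not describe.

```python
from typing import Callable, Optional, Sequence


def meets_criterion(amplitudes_uv: Sequence[float],
                    min_amp_uv: float = 50.0,
                    min_hits: int = 5) -> bool:
    """True if enough trials reach the amplitude criterion (assumed reading)."""
    hits = sum(1 for a in amplitudes_uv if a >= min_amp_uv)
    return hits >= min_hits


def resting_motor_threshold(measure_mep: Callable[[int], float],
                            start: int = 100,
                            stop: int = 1,
                            n_trials: int = 10) -> Optional[int]:
    """Lowest intensity (% of stimulator output, 1% steps) meeting the criterion.

    `measure_mep(intensity)` is a hypothetical function returning one
    peak-to-peak MEP amplitude in microvolts. The scan moves downward from
    `start` and stops at the first intensity that fails the criterion.
    """
    threshold = None
    for intensity in range(start, stop - 1, -1):
        amplitudes = [measure_mep(intensity) for _ in range(n_trials)]
        if meets_criterion(amplitudes):
            threshold = intensity
        else:
            break
    return threshold
```

With a simulated subject whose responses exceed 50 μV only at 54% output or above, `resting_motor_threshold(lambda i: 100.0 if i >= 54 else 10.0)` returns 54.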

To assess MEP amplitude, we used a stimulus intensity of 120% of the RMT. Mean peak-to-peak MEP amplitudes were determined by averaging 10 monophasic magnetic stimuli delivered to the motor hot spot of the ECR muscle.
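The amplitude measure can be written out in a few lines. This is a minimal sketch, assuming each trial is available as a list of EMG samples in mV restricted to the MEP window; the function names and the cap at 100% stimulator output are illustrative assumptions, not part of the described protocol.

```python
from statistics import mean


def suprathreshold_intensity(rmt_percent, factor=1.2, max_output=100):
    """Stimulus intensity as a multiple of RMT (120% by default).

    The cap at `max_output` is an assumption: a stimulator cannot exceed
    100% of its maximum output.
    """
    return min(max_output, round(rmt_percent * factor))


def peak_to_peak(trace_mv):
    """Peak-to-peak amplitude of one MEP trace (same unit as the samples)."""
    return max(trace_mv) - min(trace_mv)


def mean_mep_amplitude(traces_mv):
    """Average peak-to-peak amplitude over trials (10 per condition in the text)."""
    return mean(peak_to_peak(trace) for trace in traces_mv)
```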

Cortical silent period was recorded while the subjects were performing about 50% of maximal voluntary contraction (EMG activity was monitored with an audio–video feedback). The CSP was evoked with single-pulse TMS with a stimulus intensity set at 130% of RMT. The duration of 15 CSPs was measured from the end of the MEP until the restart of a constant EMG activity of at least 50% of the pre-stimulus level and was expressed in ms. For CSP measurement, EMG traces were rectified but not averaged.
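One way to operationalize this CSP measurement is sketched below. It assumes a rectified EMG trace as a list of samples, a known sample index for the end of the MEP, and a pre-stimulus reference level; interpreting "constant EMG activity" as a run of `window` consecutive samples above the criterion is our assumption, as are all names.

```python
def csp_duration_ms(rectified_emg, sample_rate_hz, mep_end_idx,
                    prestim_level, fraction=0.5, window=10):
    """Duration from the end of the MEP to the return of sustained EMG.

    EMG is taken to have resumed at the first position where `window`
    consecutive samples all reach `fraction * prestim_level` (assumed
    reading of "constant EMG activity"). Returns None if activity never
    resumes within the trace.
    """
    criterion = fraction * prestim_level
    last_start = len(rectified_emg) - window
    for start in range(mep_end_idx, last_start + 1):
        segment = rectified_emg[start:start + window]
        if all(sample >= criterion for sample in segment):
            return (start - mep_end_idx) * 1000.0 / sample_rate_hz
    return None
```

Applied to each of the 15 recorded traces, the resulting durations would then be averaged per condition.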

Short-interval intracortical inhibition and ICF were studied by means of the paired TMS paradigm described by Kujirai et al. (1993) with a subthreshold conditioning stimulation followed by a suprathreshold test stimulation. Inter-stimulus intervals (ISIs) of 2 ms (for SICI) and 10 ms (for ICF) were used. Each study consisted of 10 trials for each ISI, and the test stimuli alone were delivered in random order controlled by a laboratory computer (Signal software, Cambridge Electronics Design, Cambridge, UK). In all the paired-pulse studies, the test stimulus intensity was adjusted in order to evoke motor responses of a matched size (approximately 0.7 mV, peak-to-peak amplitude). TMS parameters were tested according to published guidelines for the use of TMS in clinical neurophysiology (Rossini et al., 1999; Rothwell et al., 1999).
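The Results report SICI and ICF as percentages, which is consistent with expressing the conditioned MEP relative to the MEP evoked by the test stimulus alone. The sketch below assumes that convention; the function names and the simple three-way labeling are illustrative only.

```python
from statistics import mean

# ISIs from the text, keyed by the measure each one probes.
ISI_MS = {"SICI": 2, "ICF": 10}


def percent_of_test(conditioned_mv, test_alone_mv):
    """Mean conditioned MEP as a percentage of the mean test-alone MEP."""
    return 100.0 * mean(conditioned_mv) / mean(test_alone_mv)


def classify(percent):
    """Label a paired-pulse effect: below 100% inhibition, above facilitation."""
    if percent < 100.0:
        return "inhibition"
    if percent > 100.0:
        return "facilitation"
    return "no change"
```

Under this convention, a lower SICI percentage corresponds to stronger inhibition, so a "reduction in the amount of SICI" would appear as a percentage moving toward 100.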

# **Experimental conditions**

Throughout the experiment subjects were seated comfortably in front of a computer screen (20 inches) placed approximately 50 cm in front of the participant. At the beginning of each block, subjects watched the appropriate video to receive instructions about the experimental task. Each video lasted 6 s and TMS stimuli were delivered after 3 s. Visual stimuli were administered on a Pentium IV computer, using Presentation software (Neurobehavioral Systems, Inc.) to control the presentation and timing of all stimuli. Each condition was presented 20 times, following a pre-determined random order. The order of experiments (1–4) was kept constant across participants.
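The timing described here can be laid out as a trial schedule. This is a sketch, not the Presentation script used in the study: it assumes videos are shown back to back with no inter-trial interval (the text does not state one), and the seed simply makes the random order reproducible, in the spirit of a pre-determined order.

```python
import random

VIDEO_S = 6.0      # duration of each video, from the text
TMS_DELAY_S = 3.0  # pulse delivered 3 s after video onset, from the text


def build_schedule(conditions, repetitions=20, seed=1):
    """Return a shuffled list of trials with video onset and TMS times.

    Assumes consecutive videos with no gap between them (not stated in
    the text). `seed` fixes the order so it can be reused across sessions.
    """
    order = [cond for cond in conditions for _ in range(repetitions)]
    random.Random(seed).shuffle(order)
    schedule = []
    for i, cond in enumerate(order):
        onset = i * VIDEO_S
        schedule.append({
            "condition": cond,
            "video_onset_s": onset,
            "tms_s": onset + TMS_DELAY_S,
        })
    return schedule
```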

# *Experiment 1*

Here we examined cortico-spinal excitability during *rest* and during *observation of a painting* (**Figure 1A**). The selected painting was Michelangelo's *Expulsion from Paradise* in the Sistine Chapel (1508–1512), with its trenchant depiction of the gesture which Adam makes with his extended right hand to keep the sword-bearing angel at bay. We chose this scene for two reasons. Firstly, because Michelangelo, with his habitual skill, clearly delineates the muscles of the forearm involved in the extension of the hand; this action is unequivocally represented and easily legible. Secondly, we did so because the distal muscles involved in this action have an extensive and well defined cortical representation as tested with TMS (Chen et al., 1998). Stimulation of these cortical areas thus elicits MEPs of reliable amplitude. In addition, intracortical mechanisms for inhibition and facilitation in these muscles have been well characterized (Chen et al., 1998). The experiment consisted of eliciting MEP responses while participants observed a rest video providing a signal to relax (REST on a blank screen) or a video displaying the selected painting (the *Expulsion from Paradise*).

# *Experiment 2*

We then studied cortico-spinal excitability during observation of hand position in three paintings: Michelangelo's *Expulsion from Paradise,* his *Creation of Adam*, and Giovanni Bellini's *Dead Christ with Angels* (ca. 1465; Ss. Giovanni e Paolo, Venice; **Figures 2A–C**). In the *Expulsion* the ECR muscle is activated; in the *Creation*, it is at rest; in Bellini's *Dead Christ*, it is shown at rest in an overtly and highly emotional context. Subjects were asked to look at the video REST or at a video of the paintings.

**Figure 1 | MEP amplitude during observation of** *Expulsion from Paradise***. (A)** Experimental paradigm used to assess corticomotor excitability during observation of a painting. Two digitized video sequences were presented. In one sequence (REST), the participants were instructed to relax. In a second video, subjects were instructed to observe Adam's gesture in Michelangelo's *Expulsion from Paradise.* Each video was presented for 6 s and transcranial magnetic stimuli were delivered after 3 s. Each condition was presented 10 times. **(B)** Painting observation increased motor evoked potentials (MEP) size (mean ± SE) \* = *p* < 0.05.

# *Experiment 3*

In this experiment, subjects were required to observe a REST video (signal to relax) paired with an IMAGERY video (instructing the subjects to mentally rehearse the observation of the painting; **Figure 3A**). Prior to TMS, all subjects underwent imagery training. This training protocol was performed under visual feedback of EMG recording of the right ECR muscle to ensure complete target muscle relaxation. The imagery of the painting was externally paced (green triangle on a computer monitor) at a rate of 1 every 10 s in blocks of 30 imagined sequences. At the end of each session, subjects were asked to rate the vividness of the imagined painting on an arbitrary scale ranging from 0 (no visual sensation) to 6 (perfectly clear sensation). The training was terminated when the subject reached a vividness score of four in the absence of any ECR muscle contraction.

# *Experiment 4*

Here we investigated corticomotor responses to observation of Adam's hand action in the painting of the *Expulsion* and a photograph of the same action (**Figure 4A**).

The subjects attended two paired video presentations: (1) the video REST paired with the observation of Adam's hand action; (2) the video REST paired with the observation of a poser.

(IMAGERY), participants were asked to imagine the painting. **(B)** Painting observation increased motor evoked potentials (MEP) size and **(C)** reduced the amount of short-interval intracortical inhibition. (Mean ± SE) \* = *p* < 0.01.

Analysis of variance (ANOVA) was used to assess differences between *paradigms* (observation of a painting, imagery of a painting, photograph). Upon detection of significant main effects, we performed *post hoc* analysis to assess differences between *conditions* (rest vs. active observation). The statistical analysis was performed using statistical software packages (SPSS software version 13.0 for Windows®, Chicago, IL, USA). The level of significance was set at *p* < 0.05 for all tests.
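For readers who want to see how an *F* statistic and its degrees of freedom arise, a one-way ANOVA can be computed by hand as sketched below. This is an illustration in plain Python, not the SPSS procedure used in the study, and it treats the observations in each condition as independent groups; a repeated-measures model would partition the variance differently. The *p*-value is omitted because it requires the *F* distribution, which the Python standard library does not provide.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per condition.
    """
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Variability of the observations around their own group mean.
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With two conditions of 10 observations each, this yields degrees of freedom of (1, 18), the same form as the *F*(1,18) values reported in the Results.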

# **Results**

#### **Experiment 1**

Regarding RMT, ANOVA did not disclose a significant difference between conditions [rest: 54.3 ± 5.1%, painting: 52.7 ± 6.3%; *F*(1,18) = 5.4; *p* = 0.1]. In contrast, observation of the painting increased MEP amplitude [*F*(1,18) = 15.2; *p* = 0.02; **Figure 1B**]. CSP duration did not differ between the two conditions (rest = 134.1 ± 11.3 ms, painting = 131.8 ± 10.5 ms; *p* > 0.05). Intracortical excitability, tested with the paired-pulse paradigm, revealed that observation of the painting did not induce changes in the amount of SICI (rest: 52 ± 9.1%, painting: 56.4 ± 12.5%; *p* = 0.1) or ICF (rest: 144.7 ± 10.03%, painting: 152.7 ± 11.6%; *p* = 0.3).

# **Experiment 2**

Comparison of neural response to TMS during observation of the three paintings revealed a main effect of *paradigms* [*F*(1,54) = 4.8, *p* = 0.01] without a main effect of *condition* [*F*(1,54) = 1.1, *p* = 0.2] and with a significant *paradigms × condition* interaction [*F*(1,54) = 4.4, *p* = 0.01]. *Post hoc t*-test showed that only observation of the *Expulsion from Paradise* increased MEP (*p* = 0.02) whereas the *Creation of Adam* and the *Dead Christ with Angels* did not have an effect on corticomotor excitability (*p* > 0.05; **Figure 2D**). Therefore, the increase in MEP size detected during observation of the *Expulsion from Paradise* was due to the artistic representation of an action with a minimal contribution of emotional arousal.

# **Experiment 3**

Mental rehearsal of the painting modulated corticomotor excitability. RMT [*F*(1,18) = 3.2; *p* = 0.4], CSP [*F*(1,18) = 7.1; *p* = 0.09], and ICF [*F*(1,18) = 4.4; *p* = 0.2] were not different between rest and imagery of the painting. By contrast, imagery of the painting increased MEP amplitude [*F*(1,18) = 25.1; *p* < 0.01] and decreased the amount of SICI [*F*(1,18) = 18.3; *p* < 0.05; **Figures 3B,C**].

# **Experiment 4**

The ANOVA showed a large effect of *paradigm* on MEP amplitude. As shown in **Figures 4B,C**, observation of Adam's arm (*Expulsion*) increased MEP size (134.7% compared to REST). In contrast, observation of a photograph of the same pose induced only a modest, non-significant increase in MEP amplitude [111.6% compared to REST; *F*(1,18) = 27, *p* < 0.001].

# **Discussion**

This is the first study to investigate the effects on motor output of observation of an action in a painting. It demonstrates that observation of an action in a painting increases cortico-spinal excitability. This effect is the same as the one we found for mental rehearsal of the painting. Observation of a photograph reproducing the same action did not increase MEP amplitude.

Since observation of the photograph did not significantly affect corticomotor excitability, we assume that this effect, in the case of the painting, must be a consequence of the artist's skill in giving the illusory impression of movement. Clearly Michelangelo successfully conveyed the kinesthetic aspects of the movement he depicted in such a way as to overcome the static nature of the image. The degree to which this impression may be due to coloristic, anatomical, and lighting skills remains to be further examined.

MEP amplitude reflects the trans-synaptic excitability of cortico-spinal neurons and spinal motor neurons, and thus provides information about the strength of cortico-spinal connections (Boroojerdi et al., 2001). Given that observation of a movement could also have minor effects on spinal excitability (Baldissera et al., 2001), it is possible that the increase in MEP size we detected during observation of the *Expulsion from Paradise* might be due to both cortical and spinal effects.

Recent neuroimaging studies (Kawabata and Zeki, 2004) have found that observation of emotionally charged images induces sensory–motor activation. It is unlikely, however, that the increase in MEP size during observation of the *Expulsion from Paradise* can be attributed to a general, unspecific increase in arousal (Baumgartner et al., 2007) since the observation of a more overtly emotional scene (Bellini's *Dead Christ with Angels*) did not increase MEP size.

While a number of authors have shown that MEP size increases during motor imagery (Fourkas et al., 2006; Stinear et al., 2006), we show, for the first time, that imagery of a pictorial work of art modulates cortical excitability. We speculate that since our subjects were required to imagine Adam's gesture in Michelangelo's fresco, our imagery task also involved the kind of kinesthetic information conveyed by the observation of the actual movement itself. In addition, we demonstrate that visual motor memories (imagery of the painting) exert a modulation of intracortical inhibitory circuits, as demonstrated by a reduction in the amount of SICI (Di Lazzaro et al., 2007). Our results are in agreement with previous studies (Abbruzzese et al., 1999; Stinear and Byblow, 2004) that have provided evidence that SICI can be modulated in a spatially and temporally specific way during imagined motor performance. Given that our study was performed on one hemisphere, we do not have information regarding the topographic specificity of motor responses to observation of a work of art. These issues need to be further investigated.

While earlier studies have suggested the involvement of a number of brain areas in esthetic judgment, such as the limbic system (Di Dio et al., 2007), orbitofrontal cortex, motor (Kawabata and Zeki, 2004), visual (Zeki and Lamb, 1994), and frontal areas (Cela-Conde et al., 2004; Jacobsen et al., 2006), our results make clear that esthetic factors have a modulatory effect on motor representations in primary motor cortex. It has been demonstrated (by using both TMS and event-related potentials) that static images with implied motion perception activate both motor and visual areas (Urgesi et al., 2006; Proverbio et al., 2009). Visual perception of static body parts engages the EBA (Downing et al., 2001), while ventral premotor cortex plays a critical role in the understanding of complete body postures (Urgesi et al., 2004). It is likely that the same networks are engaged during esthetic understanding (Freedberg and Gallese, 2007; Calvo-Merino et al., 2010). With regard to imagery of human body parts, it is likely that different multimodal body representations in the occipito-temporal cortex are engaged in a content-specific manner (Ishai et al., 2000; O'Craven and Kanwisher, 2000; Grossman and Blake, 2001; Costantini et al., 2011). Our results expand these findings by providing evidence of motor modulation induced by observation of an action in a work of art. It is likely that these responses are not restricted to representational art. Recent studies have shown a similar pattern of brain activation in the case of less densely descriptive and more abstract images (Kim and Blake, 2007; Osaka et al., 2010) and during processing of human motion at a conceptual level, such as during story comprehension (Deen and McCarthy, 2010).

On the other hand, we should also consider the possibility that the motor activation we detected might be due to the specific action portrayed in the painting. In primates, electrical stimulation of the polysensory zone in the precentral gyrus (roughly matching the dorsal part of area F4) induces a contra-lateral defensive posture consistent with the one portrayed by Michelangelo in the *Expulsion from Paradise* (Graziano et al., 2002a,b). Neurons in the polysensory zone have bimodal, visual–tactile modalities and represent the space immediately surrounding the body through touch and vision (Graziano et al., 2002b). Hence, in the case of *Expulsion from Paradise*, it is likely that the artistic nature of the action induces a stronger activation in neurons that respond to visual stimuli in both primary motor (Rizzolatti et al., 1981) and premotor cortex (Graziano et al., 2002a). The modulatory effects on motor representations consistent with defending the body against nearby threatening objects might underlie our TMS results during perception and imagination of the painting.

The present results add considerably to our knowledge of the motor networks engaged in responses to works of art, and enhance our understanding of the felt imitation not just of the actions of others but also of actions in pictorial works of art. The extent to which our findings apply to sculptures remains to be seen. So does the important question of the degree to which prior acquaintance with a work of art might affect MEP size during observation and imagery. For instance, the participants were exposed several times to the painting and were required to mentally rehearse the observation of the work of art. Given that stimulus novelty has been shown to have a significant effect on esthetic perception and judgment (Oliva and Torralba, 2007; Kirk, 2008; Kirk et al., 2009) it is possible that neural responses to the work of art were shaped by previous experience. Furthermore, future studies are needed to investigate the role of cultural, social, and psychological characteristics of the observer on art perception.

# **Acknowledgment**

The authors thank all the volunteers that participated in the study. The study was supported by a NYCPM research grant.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 April 2011; paper pending published: 08 June 2011; accepted: 26 July 2011; published online: 23 August 2011.*

*Citation: Battaglia F, Lisanby SH and Freedberg D (2011) Corticomotor excitability during observation and imagination of a work of art. Front. Hum. Neurosci. 5:79. doi: 10.3389/fnhum.2011.00079*

*Copyright © 2011 Battaglia, Lisanby and Freedberg. This is an open-access article subject to an exclusive license agreement between the authors and Frontiers Media SA, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **STRONGER MISDIRECTION IN CURVED THAN IN STRAIGHT MOTION**


#### **Jorge Otero-Millan<sup>1,2</sup>, Stephen L. Macknik<sup>1,3</sup>, Apollo Robbins<sup>4</sup>, Michael McCamy<sup>1</sup> and Susana Martinez-Conde<sup>1</sup>\***

<sup>1</sup> Neurobiology, Barrow Neurological Institute, Phoenix, AZ, USA

<sup>2</sup> Signal Theory and Communications, University of Vigo, Vigo, Spain

<sup>3</sup> Neurosurgery, Barrow Neurological Institute, Phoenix, AZ, USA

<sup>4</sup> Whizmob Inc., Las Vegas, NV, USA

#### **Edited by:**

Luis M. Martinez, Universidad Miguel Hernández de Elche, Spain

#### **Reviewed by:**

Luis M. Martinez, Universidad Miguel Hernández de Elche, Spain

Peter Thompson, University of York, UK

#### **\*Correspondence:**

Susana Martinez-Conde, Laboratory of Visual Neuroscience, Division of Neurobiology, Barrow Neurological Institute, 350 West Thomas Road, Phoenix, AZ 85013, USA. e-mail: smart@neuralcorrelate.com

# **INTRODUCTION**

Visual illusions developed by painters and sculptors have aided the understanding of important principles of visual perception. Likewise, cognitive illusions developed by magicians can reveal critical clues in cognitive processing (Kuhn et al., 2008b; Macknik et al., 2008). Centuries of informal but systematic research in magic theory have predated contemporary cognitive science concepts such as "change blindness" (Simons and Levin, 1998), "inattentional blindness" (Simons and Chabris, 1999), and "choice blindness" (Johansson et al., 2005). Magic remains a rich and largely untapped source of insight into perception and cognition (Barnhart, 2010).

One of the authors (Apollo Robbins, The Gentleman Thief) is a professional magician who specializes in sleight of hand and stage pickpocketing. Apollo Robbins noticed that he could draw a spectator's attention in distinctive ways by moving his hands along different trajectories, for instance while secretly stealing an object from a "mark," or victim. Specifically, Apollo Robbins moves his hands in a curved motion to engage the spectator's attention along the motion trajectory, whereas he uses linear motion to shift attention from the start to the endpoint of a vector. Both types of movements decrease the attentional focus at the onset position of the hand movement, but with curvilinear motion the shift toward the final position is more permanent.

The French Drop is a classic sleight of hand magic trick (**Figure 1**; Movies S1 and S2 in Supplementary Material), with the following sequence: (a) the magician shows a coin or another small object between the fingers and thumb of one hand (i.e., left hand); (b) the right hand approaches the left hand and appears to take the coin; (c) the right hand moves away from the left hand as if carrying the coin; and (d) the magician opens his right hand to reveal that the coin has disappeared. This simulated maneuver results in the perception that the coin has magically vanished from the right hand (whereas in reality, it was not removed from the left).

Illusions developed by magicians are a rich and largely untapped source of insight into perception and cognition. Here we show that curved motion, as employed by the magician in a classic sleight of hand trick, generates stronger misdirection than rectilinear motion, and that this difference can be explained by the differential engagement of the smooth pursuit and the saccadic oculomotor systems. This research exemplifies how the magician's intuitive understanding of the spectator's mindset can surpass that of the cognitive scientist in specific instances, and that observation-based behavioral insights developed by magicians are worthy of quantitative investigation in the neuroscience laboratory.

**Keywords: saccades, smooth pursuit, magic, illusion, sleight of hand, eye movements**

Step (c) of the French Drop can be performed using either curved or straight hand motion. Although the illusion is effective either way, Apollo Robbins predicted that straight motion should result in the spectator's gaze bouncing from the open right hand back to the closed left hand (which retained the hidden coin) immediately after the reveal, whereas curved motion would cause the spectator's gaze to remain focused on the final hand rather than returning to the original hand. If true, the use of curved hand motions in certain magic routines may help to disrupt the "reconstruction process", that is, the ability of the spectator to reconstruct the trick after the performance, or to determine the secret method and link it to the intended magical effect.

# **MATERIALS AND METHODS**

#### **SUBJECTS**

Seven subjects (four females, three males) with normal or corrected-to-normal vision participated in this study. All subjects were naïve and were paid \$15 for a single experimental session. Experiments were carried out under the guidelines of the Barrow Neurological Institute's Institutional Review Board (protocol 04BN039), and written informed consent was obtained from each participant.

# **EYE MOVEMENTS RECORDINGS AND ANALYSES**

Eye position was recorded non-invasively in both eyes with a fast video-based eye movement monitor (EyeLink 1000, SR Research) at 500 samples per second (instrument noise 0.01˚ rms).

We identified and removed blink periods as the portions of the EyeLink 1000 recorded data where the pupil information was missing. We added 200 ms before and after each period to further eliminate the initial and final parts of the blink, where the pupil is partially occluded. We moreover removed those portions of the data corresponding to very fast decreases and increases in pupil area (>20 units per sample) plus the 200 ms before and after. Such periods are probably due to partial blinks, where the pupil is never fully occluded (thus failing to be identified as a blink by EyeLink; Troncoso et al., 2008).
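The blink-removal rule can be illustrated with a minimal sketch. It assumes one pupil-area value per sample, with `None` marking samples where pupil information is missing; the function name, the `None` convention, and the choice to flag the sample at which a large step arrives are illustrative assumptions, not the authors' implementation.

```python
def discard_mask(pupil, rate_hz=500, pad_ms=200, max_step=20):
    """Return one boolean per sample; True means the sample is discarded.

    A sample is flagged when the pupil value is missing (None, an assumed
    convention) or when pupil area changes by more than `max_step` units
    relative to the previous sample. Every flagged sample is then padded
    by `pad_ms` on both sides.
    """
    pad = int(round(pad_ms * rate_hz / 1000))  # 200 ms at 500 Hz -> 100 samples
    n = len(pupil)
    flagged = []
    for i, value in enumerate(pupil):
        if value is None:
            flagged.append(i)
        elif i > 0 and pupil[i - 1] is not None and abs(value - pupil[i - 1]) > max_step:
            flagged.append(i)
    mask = [False] * n
    for i in flagged:
        for k in range(max(0, i - pad), min(n, i + pad + 1)):
            mask[k] = True
    return mask
```

Samples where the mask is `True` would be excluded from the later saccade and pursuit analyses.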

We identified saccades with an objective algorithm (Engbert and Kliegl, 2003; λ = 6). To reduce the amount of potential noise, we analyzed only binocular saccades (that is, saccades with a minimum overlap of one data sample in both eyes). Additionally, we imposed a minimum intersaccadic interval of 20 ms so that overshoot corrections were not categorized as saccades.
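The two post-detection filters described above (binocular overlap and a minimum intersaccadic interval) might look like the sketch below. It assumes that saccades have already been detected separately in each eye and are given as inclusive `(start, end)` sample-index pairs. The text does not say whether a too-close saccade was dropped or merged with its predecessor; dropping it is an assumption made here, and the function names are hypothetical.

```python
def binocular_saccades(left, right):
    """Keep left-eye saccades that overlap a right-eye saccade by >= 1 sample."""
    kept = []
    for l_start, l_end in left:
        if any(l_start <= r_end and r_start <= l_end for r_start, r_end in right):
            kept.append((l_start, l_end))
    return kept


def enforce_min_interval(saccades, rate_hz=500, min_gap_ms=20):
    """Drop saccades that start less than `min_gap_ms` after the previous kept one ends."""
    min_gap = min_gap_ms * rate_hz / 1000  # 20 ms at 500 Hz -> 10 samples
    kept = []
    for start, end in sorted(saccades):
        if kept and start - kept[-1][1] < min_gap:
            continue  # assumed to be an overshoot correction, not a new saccade
        kept.append((start, end))
    return kept
```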

To identify pursuit we found all intersaccadic intervals longer than 80 ms and calculated the mean eye movement velocity in each of those periods, discarding the first 30 ms (to avoid interference from the preceding saccade and its overshoot). Pursuit periods were defined as those with a mean eye movement speed higher than 4˚/s. Trials with pursuit needed to contain at least one pursuit period.
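As a rough illustration of this pursuit criterion, the sketch below takes gaze position in degrees plus a list of saccade intervals and returns the intersaccadic intervals that qualify as pursuit. It assumes "mean velocity" means path length divided by elapsed time, which is only one possible reading of the text; the function name and the interval convention (saccades inclusive, returned periods end-exclusive) are also assumptions.

```python
import math


def pursuit_periods(x_deg, y_deg, saccades, rate_hz=500,
                    min_interval_ms=80, skip_ms=30, min_speed=4.0):
    """Return (start, end) sample ranges, end exclusive, classified as pursuit."""
    n = len(x_deg)
    # Intersaccadic intervals: the gaps between saccades, plus both recording edges.
    intervals = []
    prev_end = 0
    for start, end in sorted(saccades):
        intervals.append((prev_end, start))
        prev_end = end + 1
    intervals.append((prev_end, n))

    min_len = min_interval_ms * rate_hz / 1000   # 80 ms -> 40 samples at 500 Hz
    skip = int(round(skip_ms * rate_hz / 1000))  # 30 ms -> 15 samples at 500 Hz
    periods = []
    for start, end in intervals:
        if end - start <= min_len:
            continue  # interval not longer than 80 ms
        first = start + skip  # discard the initial 30 ms
        path = sum(math.hypot(x_deg[i + 1] - x_deg[i], y_deg[i + 1] - y_deg[i])
                   for i in range(first, end - 1))
        duration_s = (end - 1 - first) / rate_hz
        if duration_s > 0 and path / duration_s > min_speed:
            periods.append((start, end))
    return periods
```

A trial would then count as a "trial with pursuit" if this function returns at least one period for it.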

We considered that subjects looked back to the original hand in any given trial if their gaze entered a 150 × 100 pixels box centered around the original hand for at least one data sample, after the reveal (i.e., the opening of the final hand).
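The look-back criterion reduces to a point-in-rectangle test on the gaze samples recorded after the reveal. The sketch below assumes gaze and hand position are expressed in screen pixels and that the box is 150 pixels wide by 100 pixels high (the text gives only "150 × 100"); the names are illustrative.

```python
def looked_back(gaze_xy, hand_center, reveal_index, box_w=150, box_h=100):
    """True if any post-reveal gaze sample falls inside the box around the hand."""
    cx, cy = hand_center
    for x, y in gaze_xy[reveal_index:]:
        if abs(x - cx) <= box_w / 2 and abs(y - cy) <= box_h / 2:
            return True  # a single sample inside the box is enough
    return False
```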

To obtain the colormaps in **Figure 2**, we added the amount of time that subjects allocated their gaze to every pixel on the screen. Colormaps were smoothed with a Gaussian filter with a SD of 8 pixels.
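A dwell-time map of this kind can be sketched in plain Python as follows. The example accumulates one sample period per gaze sample into a pixel grid and then applies a separable Gaussian filter. The kernel radius (three SDs by default) and the zero-padded treatment of screen edges are assumptions, since the text specifies only the SD.

```python
import math


def dwell_map(gaze_xy, width, height, rate_hz=500):
    """Accumulate viewing time (seconds) per pixel from gaze samples in pixels."""
    grid = [[0.0] * width for _ in range(height)]
    dt = 1.0 / rate_hz
    for x, y in gaze_xy:
        col, row = int(x), int(y)
        if 0 <= col < width and 0 <= row < height:
            grid[row][col] += dt
    return grid


def gaussian_kernel(sd, radius):
    """Normalized 1-D Gaussian weights over [-radius, radius]."""
    weights = [math.exp(-(k * k) / (2.0 * sd * sd)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Convolve one row or column with the kernel, treating off-grid values as 0."""
    radius = len(kernel) // 2
    n = len(values)
    out = [0.0] * n
    for i in range(n):
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                out[i] += w * values[j]
    return out


def smooth_map(grid, sd=8.0, radius=None):
    """Separable 2-D Gaussian smoothing: rows first, then columns."""
    if radius is None:
        radius = int(math.ceil(3 * sd))  # assumed cut-off at three SDs
    kernel = gaussian_kernel(sd, radius)
    rows = [smooth_1d(row, kernel) for row in grid]
    cols = [smooth_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

On a full-resolution screen this pure-Python version would be slow; an array library would be the practical choice, but the logic is the same.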

### **EXPERIMENTAL DESIGN**

Subjects rested their head on a chin/forehead-rest 57 cm from a video monitor (Barco Reference Calibrator V, 60-Hz refresh rate). Each experimental session included 4 blocks of 4 experimental conditions, for a total of 16 trials. The four experimental conditions were: curved motion without reveal, straight motion without reveal, curved motion with reveal and straight motion with reveal. In the conditions without reveal, the video clips stopped shortly before the magician opened his final hand. In each block, the first two trials corresponded to the two conditions without reveal, in random order, and the last two trials corresponded to the two conditions with reveal, also in random order. Subjects were asked to answer "where the coin was" after each of the trials without reveal, and "how did he (the magician) do it" after each of the trials with reveal.
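The block structure described above can be written down as a short trial-order generator. This is a sketch of the stated constraints only; the condition labels and the function name are illustrative, and the actual presentation software is not described in the text.

```python
import random

NO_REVEAL = ["curved, no reveal", "straight, no reveal"]
REVEAL = ["curved, reveal", "straight, reveal"]


def session_order(n_blocks=4, rng=None):
    """Return the trial sequence: per block, two no-reveal trials then two reveal trials."""
    rng = rng or random.Random()
    trials = []
    for _ in range(n_blocks):
        first_pair = NO_REVEAL[:]
        second_pair = REVEAL[:]
        rng.shuffle(first_pair)   # random order within the no-reveal pair
        rng.shuffle(second_pair)  # random order within the reveal pair
        trials.extend(first_pair + second_pair)
    return trials
```

Passing a seeded `random.Random` instance makes the sequence reproducible across sessions.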

Because an actual magician (i.e., rather than a cartoon or computer simulation) performed all maneuvers, motion features such as timing, duration, length, etc. could not be exactly equated across experimental conditions. Future research using computer simulations of the magician's hand movements should quantify the importance of the *type* of motion performed (i.e., rectilinear versus curvilinear) versus other motion parameters such as timing and duration.

**video clip segments: before the right hand pretends to grab the coin, during the movement of the right hand, and after the right hand stops.**

# **RESULTS**

We tracked the eye movements of naïve subjects as they viewed videos of Apollo Robbins executing the French Drop with linear versus curved motion (**Figure 1**; Movies S1 and S2 in Supplementary Material). As predicted by Apollo Robbins, subjects showed different eye movement patterns for the two types of motion. The spectators' gaze stayed on the right hand more often after the curved motion, whereas it jumped back to the left hand after the straight motion (**Figure 2**; Movies S3 and S4 in Supplementary Material). Thus magicians manipulate not only the audience's gaze position during a sleight, but also the subsequent gaze location once the sleight is complete.

We tested if these effects could be due to differential engagement of the smooth pursuit versus the saccadic oculomotor systems (Macknik et al., 2008). Straight hand motion could invoke a saccadic eye movement. If so, suppression of visual perception during the saccade could result in reduced attention to the motion trajectory, leaving the attentional focus on the initial and/or final hand locations. Conversely, curvilinear motion might draw the spectator's oculomotor system into a long pursuit of the magician's wandering hand; in that case the retinal fovea would track the hand's non-linear trajectory, helping to draw the attentional spotlight along with it. Our results show that the spectators' oculomotor behavior indeed differs between the two types of motion, with smooth pursuit being predominant in the curved motion condition, and saccades dominating in the straight motion condition. Further, spectators looked back less at the initial hand in trials containing smooth pursuit than in trials without smooth pursuit, irrespective of whether the hand moved in a straight or a curved path (**Figure 3**). None of these results were affected by training (i.e., the first and last trials offered comparable results; data not shown).

The subjects' verbal responses did not differ across conditions (data not shown), possibly because subjects were queried immediately after the vanish, while the last frame of the video clip in question remained visible, or because the trick was presented in isolation, rather than as part of a magic routine (an arrangement of tricks organized in logical fashion as part of a magic performance).

# **DISCUSSION**

Our results indicate that curvilinear motion is a more powerful source of misdirection than rectilinear motion, as used in a classical sleight of hand trick. Particularly, the use of curvilinear motion in the simulated maneuver at the core of the French Drop sleight prevented observers from looking back at the hand that actually retained the coin – after the magician revealed that the hand that had appeared to take the coin was empty (**Figure 2**). To our knowledge, this is the first observation by a non-scientist member of the magic community to have led to a previously unknown, neuroscientific discovery.

Our data moreover show that the differences in the observers' gaze position at the end of each trial are strongly dependent on the presence or absence of pursuit eye movements during the viewing of the sleight, with curvilinear trials typically generating pursuit eye movements more often than rectilinear trials (**Figure 3**). This suggests a differential engagement of the smooth pursuit and saccadic systems in the dynamic control of attentional focus.

Previous work has investigated the magicians' use of social misdirection cues, such as their own gaze direction, to manipulate the audience's eye position (Kuhn and Tatler, 2005; Kuhn and Land, 2006; Kuhn et al., 2008a; Kuhn and Findlay, 2010; Cui et al., 2011). Here we show for the first time that different types of hand motion, as used by magicians, can have differential effects on the oculomotor behavior of observers.

Curvilinear target motions may be more salient intrinsically than linear target motions (in addition to the two types of motion affecting differentially the oculomotor system) (Kristjánsson and Tse, 2001). In the spatial domain, the curves and the corners of object surfaces are perceptually more salient and generate stronger neural activity than straight edges, possibly owing to the fact that they are less redundant and predictable, and therefore more informative (Troncoso et al., 2005, 2007, 2009). The same redundancy reduction argument might apply to non-predictable object-motion trajectories, such as curvilinear versus straight motion. If this is the case, curvilinear motion trajectories should be more salient (and consequently engage stronger attention) than straight trajectories.

The capacity of curvilinear movement to misdirect the gaze and/or the attention of observers along a motion trajectory may have far-reaching implications outside of magic and pickpocketing, such as in the application of predator-evasion strategies in the natural world, in military tactics, in sports misdirection, and in marketing. Our results demonstrate that magic theory can provide new windows into the psychological and neural principles of perception and cognition.

# **CONCLUSION**

We show that curved motion, as used in a classical sleight of hand trick, is a more powerful source of magic misdirection than straight motion, and that the difference can be explained by the differential engagement of the smooth pursuit and the saccadic oculomotor systems – with curvilinear trials generating pursuit eye movements more often than rectilinear trials. These findings may have far-reaching implications beyond magic, such as in the application of predator-evasion strategies in the natural world, in military tactics, in sports misdirection, and in marketing. This research also demonstrates that magic theory can provide new windows into the psychological and neural principles of perception and cognition; thus behavioral insights developed by magicians are worthy of quantitative investigation in the laboratory.

# **ACKNOWLEDGMENTS**

We thank Andrew Danielson for technical assistance. This study was supported by the Barrow Neurological Foundation (Stephen L. Macknik and Susana Martinez-Conde) and the National Science Foundation (award 0852636 to Susana Martinez-Conde). Jorge Otero-Millan is a Fellow of the Pedro Barrié de la Maza Foundation.

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://www.frontiersin.org/human\_neuroscience/10.3389/fnhum.2011.00133/abstract

#### **REFERENCES**

*Front. Hum. Neurosci.* 5:103. doi:10.3389/fnhum.2011.00103

outcome in a simple decision task. *Science* 310, 116–119.

inattentional blindness reveals temporal relationship between eye movements and visual awareness. *Q. J. Exp. Psychol.* 63, 136–146.

Kuhn, G., and Tatler, B. W. (2005). Magic and fixation: now you don't see it, now you do. *Perception* 34, 1155–1161.

Vasarely's artworks. *Spat. Vis.* 22, 211–224.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 April 2011; accepted: 24 October 2011; published online: 21 November 2011.*

*Citation: Otero-Millan J, Macknik SL, Robbins A, McCamy M and Martinez-Conde S (2011) Stronger misdirection in curved than in straight motion. Front. Hum. Neurosci. 5:133. doi: 10.3389/fnhum.2011.00133*

*Copyright © 2011 Otero-Millan, Macknik, Robbins, McCamy and Martinez-Conde. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **SOCIAL MISDIRECTION FAILS TO ENHANCE A MAGIC ILLUSION**


#### *Jie Cui<sup>1</sup>, Jorge Otero-Millan<sup>1,2</sup>, Stephen L. Macknik<sup>1,3</sup>, Mac King<sup>1</sup> and Susana Martinez-Conde<sup>1</sup>\**

*<sup>1</sup> Division of Neurobiology, Barrow Neurological Institute, Phoenix, AZ, USA*

*<sup>2</sup> Signal Theory and Communications, University of Vigo, Vigo, Spain*

*<sup>3</sup> Division of Neurosurgery, Barrow Neurological Institute, Phoenix, AZ, USA*

#### *Edited by:*

*Luis M. Martinez, CSIC - Universidad Miguel Hernandez, Spain*

#### *Reviewed by:*

*Anthony S. Barnhart, Arizona State University, USA*

*Daniel J. Simons, University of Illinois, USA*

#### *\*Correspondence:*

*Susana Martinez-Conde, Laboratory of Visual Neuroscience, Division of Neurobiology, Barrow Neurological Institute, 350 West Thomas Road, Phoenix, AZ 85013, USA. e-mail: smart@neuralcorrelate.com*

Visual, multisensory and cognitive illusions in magic performances provide new windows into the psychological and neural principles of perception, attention, and cognition. We investigated a magic effect consisting of a coin "vanish" (i.e., the perceptual disappearance of a coin after a simulated toss from hand to hand). Previous research has shown that magicians can use joint attention cues such as their own gaze direction to strengthen the observers' perception of magic. Here we presented naïve observers with videos including real and simulated coin tosses to determine if joint attention might enhance the illusory perception of simulated coin tosses. The observers' eye positions were measured, and their perceptual responses simultaneously recorded via button press. To control for the magician's use of joint attention cues, we occluded his head in half of the trials. We found that subjects did not direct their gaze at the magician's face at the time of the coin toss, whether the face was visible or occluded, and that the presence of the magician's face did not enhance the illusion. Thus, our results show that joint attention is not necessary for the perception of this effect. We conclude that social misdirection is redundant and possibly detracting to this very robust sleight-of-hand illusion. We further determined that subjects required multiple trials to effectively distinguish real from simulated tosses; thus the illusion was resilient to repeated viewing.

**Keywords: joint attention, social misdirection, fixation, free-viewing, eye movements, sleight-of-hand, prestidigitation, motion perception**

# **INTRODUCTION**

Visual, multisensory, and cognitive illusions in magic performances provide new windows into the psychological and neural principles of perception, attention, and cognition (Macknik et al., 2008, 2010; Martinez-Conde and Macknik, 2008). Here we investigated a magic effect consisting of a coin "vanish" (i.e., the perceptual disappearance of a coin after a simulated toss from hand to hand).

A professional magician (Mac King, headliner, Harrah's Las Vegas) performed the coin vanish, as follows: (a) The magician tosses the coin vertically in his right hand; (b) The magician pretends to toss the coin from right to left hand, but surreptitiously holds the coin in his right hand, stopping it from flying; (c) The magician's left hand closes as if "catching" the supposedly flying coin; (d) The magician opens his left hand to show that the coin has disappeared (**Figure 2A**; Video S1 in Supplementary Material). Naïve observers typically perceive the coin flying from right to left hand, and are surprised to find the coin "magically" gone when the magician opens his left hand.

Magicians often perform this particular coin vanish while directing their own gaze to the presumed position of the coin at any given time, never looking at the spectators directly. However, previous research has shown that magicians can use joint attention cues such as their own gaze direction to strengthen the observers' perception of magic (e.g., in the Vanishing Ball illusion; Kuhn and Land, 2006).

We wondered if Mac King might similarly enhance the present illusion by raising his eyes to face the viewer at the time of the simulated coin toss. If observers responded to Mac King's social misdirection by returning his gaze at the time of the toss, they would necessarily view the toss with their peripheral vision. If so, their perception of the toss might be enhanced (i.e., they would see the simulated toss with lower spatial resolution, in a part of the visual field where neurons are especially sensitive to motion cues; Hubel, 1988).

We presented naïve observers with videos of Mac King performing real and simulated coin tosses, with the magician's head occluded in half of the trials to control for the magician's use of his own gaze as an element of misdirection. We further included fixation trials (in which subjects were forced to look at the magician's face) and free-viewing trials (in which subjects were allowed to explore the scene freely, as they would during a magic show) to study the effect of peripheral versus central viewing on the perception of the illusion. Finally, we presented the subjects with both real and simulated tosses multiple times, to determine the effect of repeated viewing on the perception of this illusion. The subjects' eye positions were simultaneously measured, and their perceptual responses recorded via button press (see Materials and Methods: General for details).

# **MATERIALS AND METHODS: GENERAL**

# **SUBJECTS**

All subjects were naïve, had normal or corrected-to-normal vision, and were paid \$15 for a single experimental session (∼60 min). Experiments were carried out under the guidelines of the Barrow Neurological Institute's Institutional Review Board (protocol 04BN039) and written informed consent was obtained from each participant. Nine subjects (3 females, 6 males) participated in Experiment 1. Six new subjects (2 females, 4 males) participated in Experiment 2. Eight new subjects (1 female, 7 males) participated in Experiment 3.

# **EYE MOVEMENTS RECORDINGS**

Subjects rested their head on a chin/forehead-rest 57 cm from a video monitor (Barco Reference Calibrator V, 60-Hz refresh rate).

Eye position was acquired non-invasively with a fast video-based eye movement monitor (EyeLink 1000, SR Research) at 500 samples per second (instrument noise 0.01˚ rms). We identified and removed blink periods as the portions of the EyeLink 1000 recorded data where the pupil information was missing. We added 200 ms before and after each EyeLink 1000 identified blink period to further eliminate the initial and final parts of the blink, where the pupil is only partially occluded. Eye positions during the blinks were calculated by linear interpolation from the eye positions at the beginning of the blink to the end of the blink (Bour et al., 2000).
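The interpolation step can be sketched for a single eye-position channel as follows. The example assumes a boolean blink mask that already includes the 200 ms padding; blinks touching the start or end of the recording are left unchanged because no anchor sample exists on one side, a choice the text does not address.

```python
def interpolate_blinks(values, is_blink):
    """Linearly interpolate across each blink, anchored on the neighbouring valid samples."""
    out = list(values)
    n = len(out)
    i = 0
    while i < n:
        if not is_blink[i]:
            i += 1
            continue
        start = i
        while i < n and is_blink[i]:
            i += 1
        end = i  # index of the first valid sample after the blink (or n)
        if start == 0 or end == n:
            continue  # blink touches an edge of the recording: leave as recorded
        left, right = out[start - 1], out[end]
        span = end - (start - 1)
        for k in range(start, end):
            out[k] = left + (right - left) * (k - (start - 1)) / span
    return out
```

The same function would be applied separately to the horizontal and vertical channels.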

# **VISUAL STIMULI**

All videos had the same frame size [26.7˚ (w) × 14.1˚ (h)] and were displayed centrally on the monitor screen [40˚ (w) × 30˚ (h)].

Video clips S1–S5 in Supplementary Material may be divided into four segments: (1) "Before Toss" refers to the performance from the start of the video to the initiation of "Toss." During this period, the magician either throws a coin up in the air, using his right hand (as in Videos S1–S3, S6, and S7 in Supplementary Material) or pretends to do so (Videos S4 and S5 in Supplementary Material); (2) "During Toss" is the period from toss initiation to the end of the toss (i.e., the beginning of "After Toss"). The "Toss" can be either a "Real Toss," i.e., the magician throws a coin from right to left hand (Videos S2 and S7 in Supplementary Material) or a "Fake Toss," i.e., the magician either pretends to throw the coin from right to left hand, but surreptitiously retains it in his right hand (Videos S1, S3, and S6 in Supplementary Material), or simply performs a tossing gesture (i.e., without a coin) from right to left hand (Videos S4 and S5 in Supplementary Material); (3) "After Toss" extends from the end of toss to the opening of the magician's left hand; and (4) "Reveal" from the opening of the left hand to the end of the video. That is, the "Reveal" stage starts at the time the magician first opens his left hand, and ends with the last frame of the performance.

Videos S6 and S7 in Supplementary Material are identical to Videos S1 and S2 in Supplementary Material, respectively, except that the "Reveal" segments are omitted.

# **DATA ANALYSES**

# *Perceptual reports analysis*

In Experiment 1 and Experiment 2, we calculated the percentage of coin toss reports (i.e., the percentage of times the subject saw the coin flying from right to left hand) in each block of 10 consecutive trials of the same condition (**Figures 1B** and **2B**), and on a trial-by-trial basis (**Figures 1C** and **2C**), for each subject. We then calculated the average and the SD across the subjects.
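As an illustration of the block-wise calculation, the sketch below computes the percentage of coin toss reports per block of 10 trials for one subject and condition, then the mean and SD across subjects. The function names are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def percent_reports_per_block(reports, block_size=10):
    """reports: booleans in presentation order (True = coin toss reported)."""
    blocks = [reports[i:i + block_size] for i in range(0, len(reports), block_size)]
    return [100.0 * sum(block) / len(block) for block in blocks]


def across_subjects(per_subject_blocks):
    """Return (mean, SD) per block from one list of block percentages per subject.

    Uses the sample SD, which needs at least two subjects.
    """
    columns = zip(*per_subject_blocks)
    return [(mean(col), stdev(col)) for col in columns]
```

The trial-by-trial curves follow from the same functions with `block_size=1`.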

# *Gaze dynamics analysis*

In Experiment 3, we considered that subjects looked at the face of the magician if their gaze entered a circular area with a 3˚ diameter, centered on the midpoint between the magician's eyes, for at least one data sample.
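The face criterion above amounts to a point-in-circle test applied to every gaze sample. A minimal sketch, assuming gaze and face coordinates are already expressed in degrees of visual angle and that the boundary of the circle counts as inside (both assumptions; the function name is hypothetical):

```python
import math


def looked_at_face(gaze_deg, face_center_deg, diameter_deg=3.0):
    """True if at least one gaze sample falls inside the circular face area.

    gaze_deg: iterable of (x, y) gaze positions in degrees.
    face_center_deg: (x, y) midpoint between the magician's eyes, in degrees.
    """
    radius = diameter_deg / 2.0
    cx, cy = face_center_deg
    return any(math.hypot(x - cx, y - cy) <= radius for x, y in gaze_deg)
```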

To obtain the colormaps in **Figure 8**, we added the amount of time that subjects allocated their gaze to every pixel of the screen from the beginning to the end of the toss. Colormaps were smoothed using a 40 × 40 pixel (∼1.2˚ × 1.2˚) Gaussian filter with a SD of 10 pixels (∼0.3˚).
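The two ingredients of the colormaps, a per-pixel dwell-time map and a 40 × 40 pixel Gaussian kernel with an SD of 10 pixels, might be built as in the sketch below. The function names and the screen dimensions passed in are hypothetical, the kernel is normalized to sum to 1 (an assumption), and the convolution of the map with the kernel is left to an array library.

```python
import math


def accumulate_dwell(samples, width, height, sample_dt=0.002):
    """Sum the time (in seconds) that gaze spent on each pixel.

    samples: iterable of (x, y) pixel coordinates, one per eye-tracker sample.
    sample_dt: duration of one sample; 0.002 s corresponds to 500 samples/s.
    """
    dwell = [[0.0] * width for _ in range(height)]
    for x, y in samples:
        col, row = math.floor(x), math.floor(y)
        if 0 <= col < width and 0 <= row < height:   # ignore off-screen samples
            dwell[row][col] += sample_dt
    return dwell


def gaussian_kernel(size=40, sd=10.0):
    """Truncated 2D Gaussian of `size` x `size` pixels, normalized to sum to 1."""
    centre = (size - 1) / 2.0    # an even-sized kernel is centred between pixels
    weights = [
        [math.exp(-((r - centre) ** 2 + (c - centre) ** 2) / (2.0 * sd * sd))
         for c in range(size)]
        for r in range(size)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]
```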

# *Signal detection analyses*

We carried out analyses based on signal detection theory (Macmillan and Creelman, 2005) to investigate detection sensitivity and response bias in the three experiments conducted. All three experiments may be identified as single-interval classification experiments (Macmillan and Creelman, 2005). In Experiments 1 and 2 subjects used two responses (pressing or not pressing the button) to sort two stimulus conditions (Magic Trick and Real Toss) into categories. In Experiment 3, the subjects classified five stimulus conditions (Real Toss, Magic Trick, No Coin Fake Toss, Final Coin Fake Toss and Two Coins Fake Toss) into two categories by pressing or not pressing the button. Throughout the experiments, only one stimulus condition was presented in each trial.

**versus Real Toss, with Reveal." (A)** Sequence of static images selected from the Magic Trick with Reveal (Upper Row) and Real Toss with Reveal (Lower Row) video clips. The "Reveal" image selected for both video clips corresponds to the last frame of the performance. **(B)** Percentage of coin toss reports, per trial block, per experimental condition. Error bars represent SEM across subjects (*N* = 8). **(C)** Percentage of coin toss reports, trial by trial, per experimental condition (*N* = 8 subjects).

In order to estimate the parameters of the signal detection model in a single-interval paradigm, we fitted the model for each subject with RscorePlus (Harvey, 2011). In the model, the output of the sensory process under each of the $m$ stimulus conditions has a normal density function with mean $\mu_j$ and SD $\sigma_j$, where $j = 0, \ldots, m - 1$, $\mu_0 = 0$, and $\sigma_0 = 1$. The model also assumes that subjects hold $n - 1$ decision criteria $X_c$ to classify the output of the sensory process into $n$ response categories. The detection sensitivity for discriminating between two stimulus conditions is the absolute difference between the means of the two distributions, $d' = |\mu_{j_1} - \mu_{j_2}|$, where $j_1, j_2 = 0, 1, \ldots, m - 1$ and $j_1 \neq j_2$. RscorePlus employs singular value decomposition (Press, 2002), combined with a variation of the Marquardt method for non-linear least-squares regression (Marquardt, 1963; Press, 2002), to find the maximum-likelihood fit of the multiple-distribution, variable-criterion signal detection model. (Specifically, it finds the means $\mu_j$ and SDs $\sigma_j$ of the remaining signal distributions, $j = 1, \ldots, m - 1$, and then the decision criteria $X_c$ relative to the first signal distribution.) In our case, the input data for RscorePlus was the percentage of trials with or without coin toss reports for each stimulus condition. The detection sensitivity determined if subjects were better at classifying in some stimulus conditions than in others. The decision criteria $X_c$ provided the response bias that indicated the subjects' tilt toward one response or the other (Macmillan and Creelman, 2005). Subsequently, we calculated the average and standard error of the mean (SEM) from the measures of all subjects. During model fitting, we arbitrarily set the mean of the distribution of Real Toss at zero and found the means of the remaining stimulus conditions relative to this position.
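For readers unfamiliar with these quantities, the sketch below computes sensitivity and criterion for the simplest special case: two stimulus conditions, two responses, and equal-variance normal distributions. It is not the RscorePlus procedure, which fits a multiple-distribution, variable-criterion model by maximum likelihood; the function name and the convention of placing the criterion midway between the two distributions are assumptions made for illustration.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()   # mean 0, SD 1


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Equal-variance sensitivity d' and criterion c from two response rates.

    hit_rate: proportion of coin toss reports when a coin was really tossed.
    false_alarm_rate: proportion of coin toss reports when the toss was simulated.
    Both rates must lie strictly between 0 and 1.
    """
    z_hit = _STANDARD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STANDARD_NORMAL.inv_cdf(false_alarm_rate)
    d_prime = abs(z_hit - z_fa)        # absolute difference between the two means
    criterion = -0.5 * (z_hit + z_fa)  # negative values indicate a bias toward reporting a toss
    return d_prime, criterion
```

A call such as `dprime_and_criterion(0.935, 0.524)` would use the group percentages reported for Experiment 1 (Real Toss and the later Magic Trick trials); because the published values were fitted per subject, such a group-level figure is illustrative only. Rates of exactly 0 or 1 make `inv_cdf` raise an error and would need a correction before use.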

# **EXPERIMENT 1 (MAGIC TRICK VERSUS REAL TOSS, WITHOUT REVEAL)**

# **EXPERIMENT DESIGN**

Subjects pressed a key to start each trial. A blank screen lasting for 2 s was followed by a short video clip of Mac King performing one of two maneuvers: Magic Trick or Real Toss. Both videos stopped before the magician revealed the inside of his left hand (**Figure 1A**; Videos S6 and S7 in Supplementary Material). Subjects were asked to press a button, as soon as possible, in the event that they saw a coin flying from the magician's right hand to his left hand.

Each condition (Magic Trick or Real Toss) was presented for 50 trials, amounting to 100 trials in a single session. Trials were pseudo-randomly interleaved.

# **RESULTS**

When Mac King actually tossed the coin from right to left hand (Real Toss), subjects indicated that they saw the coin toss 93.5% of the time, throughout the 50 trials presented (**Figure 1B**). When Mac King only pretended to toss the coin (Magic Trick), subjects reported at first a coin toss 72.6% of the time (averaged across the first 10 trials). This initial percentage dropped to 52.4% for the rest of the trials, reflecting the effects of the subjects' learning to discriminate between real and illusory coin tosses (the subjects' performance in the first 10 trials was significantly different from that in the remaining 40 trials; two-sample *t*-test, *p* < 0.01). Thus, subjects were able to distinguish between the two experimental conditions with maximal sensitivity after about 10 trials of each condition (that is, 20 trials combined). **Figure 1C** illustrates on a trial-by-trial basis the percentage of times that a coin toss was reported. Despite significant variance in the subjects' responses across time, the percentage of coin toss reports decreased steadily for the first few trials, stabilizing at around trial number 10.

# **EXPERIMENT 2 (MAGIC TRICK VERSUS REAL TOSS, WITH REVEAL)**

# **EXPERIMENT DESIGN**

Experiment 2 followed the design of Experiment 1, except that both video clips (Magic Trick and Real Toss) now included an additional "Reveal" stage, where the magician opened his left hand to reveal that it was empty (**Figure 2A**; Videos S1 and S2 in Supplementary Material).

# **RESULTS**

When Mac King actually tossed the coin from right to left hand (Real Toss with Reveal), subjects indicated that they saw the coin toss as often as in the equivalent condition (Real Toss without Reveal) from Experiment 1 (91.3% of the time, throughout the 50 trials presented; **Figure 2B**). When Mac King only pretended to toss the coin (Magic Trick with Reveal), subjects reported at first a coin toss around 62.9% of the time (averaged across the first 10 trials). This initial percentage was not significantly different from that obtained in the equivalent condition (Magic Trick without Reveal) from Experiment 1 (paired *t*-test, *p* > 0.05). The next 40 trials represented a distinct departure from the results from Experiment 1, however, as the coin toss reports dropped to 21.9% (a significant difference from the matching data in Experiment 1; paired *t*-test, *p* < 0.001). Thus, including a "Reveal" stage with each performance allowed the subjects to gain additional information after about 10 trials of each condition (that is, 20 trials combined), thereby improving their discrimination of a real coin toss versus a simulated maneuver. **Figure 2C** illustrates on a trial-by-trial basis the percentage of times that a coin toss was reported in Experiment 2.

Detection sensitivity and response bias were comparable in Experiment 1 (No Reveal) and Experiment 2 (Reveal) when we considered all 50 trials together, suggesting no effect of the "Reveal" stage (**Figure 3**, Left Column; see Materials and Methods: General for details on signal detection analyses). However, when we separated the first 10 trials from the last 40 trials, we found that the presence of a "Reveal" stage led to a significant increase in detection sensitivity ($d'$) in the last 40 trials (**Figure 3**, Right Column). This result suggests that the information obtained from the "Reveal" stage helped subjects to discriminate between real and simulated tosses, after a number of trials.

# **EXPERIMENT 3 (MULTIPLE TOSS CONDITIONS, WITH REVEAL)**

# **EXPERIMENT DESIGN**

Experiment 3 followed the design of Experiment 2, except that we presented 5 different video clips: Magic Trick, Real Toss, Two Coins Fake Toss, Final Coin Fake Toss, and No Coin Fake Toss (**Figure 4A**; Videos S3–S5 in Supplementary Material). The Magic Trick and Real Toss video clips were identical to those in Experiment 2. The three additional video clips portrayed the following maneuvers: In the Two Coins Fake Toss, Mac King pretended to toss a coin from right to left hand, but actually retained it in his right hand. Subsequently, he opened his left hand to reveal a second coin that had remained hidden (in the left hand) until the reveal. In the Final Coin Fake Toss, Mac King pretended to toss a nonexistent coin from right to left hand, and subsequently opened his left hand to reveal an actual coin (which had remained hidden in his left hand until that point). In the No Coin Fake Toss, Mac King pretended to toss a non-existent coin from right to left hand, and subsequently opened his left hand to reveal that it was empty.

Two main motivations of this experiment were to determine the potential effects of (a) the magician's use of social misdirection cues and (b) the observer's gaze position on the perception of this magic trick. Thus, each of the videos was presented with (a) the magician's face either visible or occluded, and (b) two different viewing conditions: free-viewing and fixation (**Figure 4B**), for a total of 20 conditions (5 types of video clip × 2 face conditions × 2 viewing conditions). We presented 5 trials for each condition, amounting to 100 trials, pseudo-randomly ordered, in a single session.
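The factorial design can be enumerated directly, as in the sketch below. The condition labels follow the text; the function name and the use of a plain shuffle are assumptions, since the constraints behind the pseudo-random ordering are not specified.

```python
import itertools
import random

VIDEO_TYPES = ["Magic Trick", "Real Toss", "Two Coins Fake Toss",
               "Final Coin Fake Toss", "No Coin Fake Toss"]
FACE_CONDITIONS = ["Face Visible", "Face Occluded"]
VIEWING_CONDITIONS = ["Free-viewing", "Fixation"]


def build_trial_list(trials_per_condition=5, seed=None):
    """Return a shuffled list of (video, face, viewing) tuples for one session."""
    conditions = list(itertools.product(VIDEO_TYPES, FACE_CONDITIONS, VIEWING_CONDITIONS))
    trials = conditions * trials_per_condition   # 20 conditions x 5 trials = 100 trials
    random.Random(seed).shuffle(trials)          # simple shuffle, assumed for illustration
    return trials
```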

In the Face Occluded condition, the face of the magician and surrounding area was blocked by an 11.9˚ (w) × 5.6˚ (h) black rectangle. In the Face Visible condition, no occlusion was used.

In the Fixation condition, subjects had to fixate a small red cross (0.75˚ wide) within a 2˚ × 2˚ fixation window, placed in the midpoint between the magician's eyes (in the first frame the magician oriented his gaze to the viewer), or on the corresponding location in space when the magician's face was occluded. This fixation window was invisible to the subjects. A trial was discarded whenever the subject's gaze left the fixation window for more than 500 ms (<500 ms gaze excursions were permitted to allow for blinks). Then the discarded trial was inserted randomly into the subsequent queue of trials and presented to the subject again.
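The fixation rule described above might be checked as in the following sketch, which flags a trial once gaze has stayed outside the 2˚ × 2˚ window for more than 500 ms of consecutive samples, and reinserts a discarded trial at a random position in the remaining queue. The function names are hypothetical; treating the window as a square centered on the cross, counting time in whole samples at 500 samples per second, and omitting any special handling of missing blink samples are all assumptions.

```python
import random


def violates_fixation(gaze_deg, center_deg, window_deg=2.0,
                      sample_rate_hz=500, max_excursion_ms=500):
    """True if gaze leaves the fixation window for more than `max_excursion_ms`.

    gaze_deg: sequence of (x, y) gaze positions in degrees, one per sample.
    center_deg: (x, y) position of the fixation cross in degrees.
    """
    half = window_deg / 2.0
    cx, cy = center_deg
    max_samples = max_excursion_ms * sample_rate_hz / 1000.0   # 250 samples
    run = 0                                                    # consecutive samples outside
    for x, y in gaze_deg:
        inside = abs(x - cx) <= half and abs(y - cy) <= half
        run = 0 if inside else run + 1
        if run > max_samples:
            return True
    return False


def requeue_trial(queue, trial, rng=random):
    """Insert a discarded trial at a random position in the remaining queue."""
    queue.insert(rng.randrange(len(queue) + 1), trial)
```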

In the Free-viewing condition, no fixation cross was presented and subjects were free to explore the visual scene at will. Before each trial, we presented an instructions screen indicating whether fixation or free-viewing would be required.

#### **RESULTS**

The strength of the illusion was remarkable: subjects perceived illusory coin tosses a large fraction of the time, including during those trials in which the magician never showed a coin in the initial hand (Final Coin Fake Toss and No Coin Fake Toss conditions; **Figure 5**).

#### *Effect of social misdirection*

In each type of performance (Magic Trick, Real Toss, Two Coins Fake Toss, Final Coin Fake Toss, and No Coin Fake Toss), the magician raised his gaze to face the viewer at the time of the (real or simulated) coin toss. Our reasoning was that, if Mac King looked directly at the observers, he might engage them to reciprocate his gaze, thereby forcing them to view the coin toss peripherally, rather than foveally. If so, the subjects might perceive the illusion more strongly, especially as peripheral receptive fields are known to be more responsive to movement than foveal receptive fields (Hubel, 1988). Surprisingly, subjects reported higher percentages of coin tosses (Face visible = 72.4% versus Face Occluded = 88.4%) when the magician's face was blocked (therefore nullifying the possibility of gaze misdirection) than when the magician's face was visible, across all types of performance (**Figure 5**).

The illusory effect was also stronger in absence of the magician's face when all five types of performance (Magic Trick, Real Toss, Two Coins Fake Toss, Final Coin Fake Toss, and No Coin Fake Toss) were grouped together, both under free-viewing and fixation conditions (**Figure 6A**).

We wondered if occluding the face of the magician could increase sensitivity or produce a criterion shift. We found a significant difference in response bias (but not in detection sensitivity) between the conditions of Face Visible and Face Occluded, indicating that subjects changed their criterion as a function of face visibility and were more likely to report a coin toss in the occluded face condition (**Figure 7**).

#### *Presence of an initial coin*

Experiments 1 and 2 showed that the illusion evoked by the simulated coin toss (Magic Trick condition) was very powerful and resilient to the effects of training (i.e., the subjects required about 10 observations of a simulated coin toss versus a real coin toss to differentiate one condition from the other optimally; **Figures 1B** and **2B**). We wondered whether the subjects might decide that a coin toss was real even before the (veridical or simulated) toss itself, based on the initial presence of a coin (as in the Magic Trick and Real Toss conditions, the coin toss is preceded by the magician throwing the coin up in the air and catching it in his right hand; Videos S1 and S2 in Supplementary Material). Thus, Experiment 3 introduced two new conditions in which the magician conducted simulated tosses from his empty right hand (Final Coin Fake Toss and No Coin Fake Toss). **Figures 6B,C** show that the presence of an initial coin in the magician's hand resulted in higher percentages of coin toss reports, as predicted.

We wondered if the increased coin toss reports associated with the presence of an initial coin were potentially due to the flight of an actual coin in the Real Toss condition. To rule out this possibility, we repeated the analyses from **Figures 6B,C** after excluding the data from the Real Toss condition. We found that the presence of an initial coin resulted in higher percentages of coin toss reports, even in the absence of a genuine flying coin (data not shown).

# *Effect of viewing condition*

We also wondered if the type of viewing condition (fixation versus free-viewing) might differentially affect the subjects' perception. The fixation condition required the subjects to look at a point in between the magician's eyes (or the equivalent position on the screen when the magician's face was blocked). Thus the subjects were forced to view the coin toss peripherally (and moreover were potentially susceptible to the magician's gaze misdirection when his face was visible). In contrast, the free-viewing condition allowed subjects to look anywhere on the screen. **Figure 6C** shows that free-viewing increased the percentages of coin toss reports only when an initial coin was present.

**FIGURE 6 | Average percentage of coin toss reports according to viewing condition, presence of an initial coin, and presence/absence of social misdirection. (A)** Percentages of button presses for free-viewing and fixation conditions, with the magician's face visible versus occluded. Error bars represent SEM across all five types of performance (*N* = 5 conditions). **(B)** Percentages of button presses for types of performance in which an initial coin was present versus absent, with the magician's face visible versus occluded. Error bars under Initial Coin condition represent SEM across six conditions, i.e., Face visible and Face occluded, for Magic Trick, Real Toss and Two-Coin Fake Toss. Error bars under No Initial Coin condition represent SEM across four conditions, i.e., Face visible and Face occluded, for Final Coin Fake Toss and No Coin Fake Toss. **(C)** Percentages of button presses according to the presence/absence of an initial coin, with free-viewing versus fixation conditions. Error bars under Initial Coin condition represent SEM across six conditions, i.e., Free-viewing and Fixation, for Magic Trick, Real Toss and Two-Coin Fake Toss. Error bars under No Initial Coin condition represent SEM across four conditions, i.e., Free-viewing and Fixation, for Final Coin Fake Toss and No Coin Fake Toss. \*Paired *t*-test, *p* < 0.05; \*\*Paired *t*-test, *p* < 0.001; +Two-sample *t*-test, *p* < 0.05; ++Two-sample *t*-test, *p* < 0.001.

Detection sensitivity and response bias were comparable for the free-viewing and fixation conditions (**Figure 7**).

# *Gaze dynamics*

We studied the subjects' gaze dynamics during the free-viewing of each video clip (**Figure 8**). In each type of performance, subjects tended to avoid the magician's face during the coin toss, focusing on the magician's hands instead. In most instances, subjects looked at the magician's face only after the time of the button press (usually waiting until the opening of the left hand).

**Figure 9** further analyzes the subjects' gaze location in two time windows, one 500 ms before and another 500 ms after the button press. Unsurprisingly, subjects looked at the magician's face region more often when the face was visible than when it was occluded. No significant difference was observed between the gaze allocation before and after the button press, however. Moreover, subjects looked at the magician's face around the time of the button press only rarely (probability < 0.20).

# **DISCUSSION**

# **THE EFFECT OF SOCIAL MISDIRECTION IN THE PERCEPTION OF MAGIC**

Joint attention is the mechanism by which an observer can share the experience of another by following his/her gaze direction and pointing gestures. Magicians rely on joint attention as a form of social misdirection, to direct the spectators' attention away from the method behind the magical effect, and toward the magical effect itself. If the magician wants the spectators' eyes (and/or attentional spotlight) focused on his face, he may look directly at his audience. If the magician instead wishes the spectators to shift their gaze (and/or attention) to a particular object, he himself may turn his head and eyes toward that object, and the heads and eyes (and/or attention) of the spectators will quickly follow suit (Macknik et al., 2008). Thus, the face of the magician provides effective social misdirection – via joint attention – in many magic illusions. Joint attention is critical for language acquisition and cognitive and social development (Scaife and Bruner, 1975; Tomasello and Farrar, 1986). But it also makes us susceptible to magic tricks that exploit our natural impulse to pay attention to the same places and objects attended by other people around us.

Previous research has found that magicians can effectively use their own direction of gaze to influence the gaze direction of observers (Kuhn and Land, 2006). Here we wondered if gaze misdirection could similarly improve the perception of a magic illusion not previously studied in the laboratory, which involves the simulated toss of a coin from hand to hand, and its subsequent perceptual vanish. In contrast to the previous studies (Kuhn and Land, 2006), we found that the magician's gaze misdirection did not intensify the subjects' perception of the illusion. Our data suggest that there is no simple "one size fits all" solution concerning the effects of social misdirection on the perception of magic, and that different magic illusions may be enhanced, unchanged, or lessened by social misdirection, in ways that remain to be explored.

The specific types of social misdirection studied may additionally explain some of the discrepancy between the current findings and Kuhn and Land's (2006). In Kuhn and Land's study, the magician's fake throw was conducted under two conditions of "social cueing": a pro-illusion condition in which the magician's eyes and head followed an imaginary ball moving upward, and an anti-illusion condition, in which the magician looked at the hand concealing the ball. In the current study, the magician either looked at the camera or his head was blocked. These two conditions are complementary to the ones used by Kuhn and Land, in that their magician looked either to the simulated position of the ball, or to the actual position of the ball, thus the magician's gaze was never neutral with respect to the ball. The inclusion of a "socially neutral" gaze condition could have involved the magician closing his eyes, or blocking the magician's eyes and head (as in the current study). Future research should ideally combine both experimental approaches, incorporating pro-illusion and anti-illusion conditions, but also conditions that eliminate or neutralize the potential effects of social misdirection.

fastest to slowest. The brown and orange lines illustrate the probability with which the subjects' gaze fell on the magician's face area; the brown and orange triangles indicate the mean time of button presses. The small circles on the lines indicating probability of gaze in the face area represent the selected time locations for error bar calculation. Error bars indicate one SD. The lower section of each panel represents the spatial distribution of the subjects' gaze for each experimental condition. The hotter the color, the higher the probability that the subjects' gaze was located in that area.

The proficiency of the magician performing this trick may be a factor. Magic theorist Ascanio (2005) stated that a magician should strive for such degree of dexterity that no misdirection should be necessary, and such effective misdirection as to negate the need for high dexterity. Mac King may perform this sleight with such skillfulness that the illusion is already optimized without the addition of social misdirection (Max Maven, personal communication). Cavina-Pratesi et al. (2011) found that the kinematics of a magician's simulated grasp are very close to those of an actual grasp, and that magicians' simulated actions do not show many of the typical kinematic biases usually seen with pantomimed actions by non-magicians. Here we found that naive observers required a large amount of exposure to real and simulated coin tosses to distinguish most effectively between the two (20 trials combined; **Figures 1** and **2**). Our video recordings of Mac King performing real and simulated tosses moreover indicate that Mac King's timing of a simulated toss matches the timing of a real toss with great accuracy (∼235 ms in the Magic Trick condition, ∼269 ms in the Real Toss condition). Thus, one might conclude that social misdirection is redundant for this particular magic trick, but only if performed by a master magician.

One counterargument is that subjects were able to overcome the illusion eventually, despite Mac King's mastery. Around the 10th trial, Mac King's sleight-of-hand technique was no longer sufficient to maintain the illusion, whether his face was covered or not. Had the facial cues been effective as misdirection, the illusion would have persisted for more repetitions in the Face Visible condition than in the Face Occluded condition. This was not the case. It follows that if a less skilled magician were to perform this trick with imperfect sleight-of-hand, facial cues may not enhance the illusion either. If so, one practical recommendation to magicians performing this sleight would be to execute the toss without lifting their eyes to the audience, but to keep their gaze trained on the supposed location of the coin at any given time, so as to maximize the audience's attention to the illusory coin (via joint attention) and thus enhance the feeling of magic when the coin "vanishes" from the conjuror's hand. Future research should determine the effectiveness of social misdirection for this and other magic illusions, as executed by magicians with varying degrees of ability.

The current study focused on gaze misdirection as a potentially powerful source of joint attention. Future work should address the comparative effectiveness of the various forms of joint attention/social misdirection (e.g., gaze/head direction, body orientation, verbal cues) on the perception of this and other magic illusions.

#### **DO NOT DO THE SAME TRICK 10 TIMES**

The magician's axiom "Never do the same trick twice" indicates that if a magician were to perform the same trick twice for the same audience, there would be an increased chance that the audience would identify the underlying method and figure out the trick (King, 2007; Macknik et al., 2008). Several previous studies have shown that magic tricks are more likely to fail when observers view them a second time (Kuhn and Tatler, 2005; Kuhn and Land, 2006; Tatler and Kuhn, 2007; Kuhn et al., 2008). Similarly, many inattentional blindness demonstrations are a one-time only kind of effect. For instance, observers are more likely to detect a gorilla among basketball players if they watch the video for a second time (Simons and Chabris, 1999; Simons, 2010). Our current results indicate that some magic tricks are very resistant to repeated viewing, requiring many more than two performances to lose their effectiveness entirely.

#### **ACKNOWLEDGMENTS**

We thank Andrew Danielson for technical assistance, and Max Maven for very helpful discussions. This study was supported by the Barrow Neurological Foundation (Stephen L. Macknik and Susana Martinez-Conde) and the National Science Foundation (award 0852636 to Susana Martinez-Conde). Jorge Otero-Millan is a Fellow of the Pedro Barrié de la Maza Foundation.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at http://www.frontiersin.org/Human\_Neuroscience/10.3389/fnhum.2011.00103/abstract

**Video S1 |** Magic trick.

**Video S2 |** Real toss.

**Video S3 |** Two-coin fake toss.

**Video S4 |** Final coin fake toss.

**Video S5 |** No coin fake toss.

**Video S6 |** Magic trick without reveal.

**Video S7 |** Real toss without reveal.

#### **REFERENCES**

Misdirection in magic: implications for the relationship between eye gaze and attention. *Vis. Cogn.* 16, 391–405.

estimation of nonlinear parameters. *J. Soc. Ind. Appl. Math.* 11, 431–441.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 April 2011; accepted: 05 September 2011; published online: 29 September 2011.*

*Citation: Cui J, Otero-Millan J, Macknik SL, King M and Martinez-Conde S (2011) Social misdirection fails to enhance a magic illusion. Front. Hum. Neurosci. 5:103. doi: 10.3389/fnhum.2011.00103*

*Copyright © 2011 Cui, Otero-Millan, Macknik, King and Martinez-Conde. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# **TRANSFORMATIVE ART: ART AS MEANS FOR LONG-TERM NEUROCOGNITIVE CHANGE**

# *Son Preminger\**

*School of Psychology, Interdisciplinary Center Herzliya, Herzliya, Israel*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

*Lutz Jäncke, University of Zurich, Switzerland*

*\*Correspondence:*

*Son Preminger, School of Psychology, Interdisciplinary Center Herzliya (IDC), Herzliya 46150, Israel. e-mail: sonpreminger@gmail.com*

Every artwork leads to a unique experience by the observer or participant, be it a sensory, emotional, cognitive, interactive, or spiritual experience. At the neurobiological level, such experiences are manifested as activation of the corresponding neural networks. Neuroscience has demonstrated that experience, in particular repeated experience, can cause a long-term change in the involved brain circuits (*experience-dependent plasticity*). This review will discuss the molding and transformative aspect of arts, examining how repeated and on-going experience of arts may alter cognitive, emotional, and behavioral patterns as well as their underlying neural circuits. The application of this approach to cognitive training and neuropsychological rehabilitation methods will be addressed as well. In addition, it will be suggested that this approach to art, as a long-term transformative medium, may lead to a novel viewpoint on art and a different approach to its creation. Artists can design artworks that aspire to form, in addition to a one-shot influencing experience, on-going experiences which gradually create a lasting change, possibly improving audiences' neuropsychological functions.

**Keywords: plasticity, neurocognitive, art, training, perceptual learning, rehabilitation, video games, improvisation**

# **ART AS A NEUROCOGNITIVE EXPERIENCE**

Art is a medium for inducing experiences. Artistic experiences can be a vehicle to convey meanings, a way to provide pleasure, or a means for self-expression and communication. Every artwork leads to a mental experience by the observer, participant, or experiencer. Experiencing an artwork commonly involves perceptual processes; for example, plastic arts engage low-level visual processes such as orientation and edge detection, as well as higher-level processes, such as object recognition and its segregation from background. Similarly to daily life experiences, an artistic experience would involve additional cognitive processes such as executive functions, memory, emotion, and other high-level cognitive processes. The engagement of executive functions such as working memory and attention underlies many artistic experiences, because being able to assemble the different pieces of the presented work, and to avoid distractions in order to be captivated by the artwork, is essential for experiencing it (Dudai, 2008). Intrinsic processes such as autobiographic memory, emotion, and theory of mind may be driven by perceptual elements and provide meaning and essence to the artwork.

The specific combination of cognitive functions engaged by an art piece depends on the art form, the particular piece and the observer's unique experience. For example, classical art forms such as plastic arts, music, and film rely mainly on perception to drive an artistic mental experience. On the other hand, interactive arts such as interactive installations or video games also involve motor functions and behavioral control as part of the induced experience. Importantly, people performing or practicing arts, such as musicians or actors, also engage such action-related mechanisms while they practice their arts.

# **THE EFFECT OF EXPERIENCE ON THE BRAIN**

# **EXPERIENCE AS NEURAL NETWORK ACTIVITY**

At the neurobiological level, mental experiences are manifested as activation of the corresponding neural networks in the visual and auditory cortices, attention and memory networks, emotional brain regions, frontal and other brain regions, and their combinations. The distribution of the neuronal activity patterns depends on the particular cognitive processes that form and drive the experience (Kandel et al., 2000). Whether experience can be captured merely by biological correlates, or whether there are components that cannot be described in physical terms and are only introspectively accessible (e.g., the sensation of seeing red; termed *Qualia*; Koch, 2004), has been intensively debated in philosophy. Here I will focus on the aspects of experience that are manifested and can be measured on a neurobiological or behavioral level.

# **EXPERIENCE-INDUCED BRAIN PLASTICITY**

A key characteristic of the brain is its capacity to change as a result of experience (see reviews in Buonomano and Merzenich, 1998; Kolb and Whishaw, 1998; Kandel, 2001; Pascual-Leone et al., 2005). Neural learning theories are commonly based on the notion termed experience-dependent plasticity, which states that activation of a neural network leads to a modification of the corresponding synaptic connections (Hebb, 1949). Multiple human and animal neuroscience studies have demonstrated that experiences can lead to brain changes ranging from cellular modifications to formation of new synaptic connections and reorganization of cortical networks (reviews in Buonomano and Merzenich, 1998; Kolb and Whishaw, 1998; Kandel, 2001; Kelly and Garavan, 2005; Pascual-Leone et al., 2005). In particular, studies suggest that experience-based neural plasticity processes underlie the change and adaptation of visual perception capabilities (Kourtzi and DiCarlo, 2006; Sagi, 2010). Various experimental procedures were designed to induce long-term changes of perception, for example improvement of low-level perceptual abilities (e.g., Karni and Sagi, 1991; Recanzone et al., 1993; reviews in Buonomano and Merzenich, 1998; Sagi, 2010), as well as higher perceptual abilities such as learning to recognize objects (Kourtzi and DiCarlo, 2006), and gradually transforming perceptual object recognition (Preminger et al., 2007, 2009b). Similar experience-dependent plasticity characteristics have been demonstrated in the motor domain (e.g., Newell and Rosenbloom, 1981; Karni et al., 1998; Bezzola et al., 2011) as well as in higher-order cognitive functions such as working memory (Mahncke et al., 2006; Chein and Morrison, 2010). Many studies have also demonstrated that behavioral changes that are induced by repetitive training are accompanied by long-term changes in brain activity and structure (e.g., Karni et al., 1998; Li et al., 2009; Bezzola et al., 2011; review in Kelly and Garavan, 2005).
The experience-based plasticity approach has also been applied successfully to induce neural changes and functional improvement in cognitive functions for people with various cognitive impairments. For example, training has been shown to improve visual perception in amblyopia (Polat et al., 2004), and motor function in stroke (Hlustík and Mayer, 2006).

Importantly, studies of perceptual and motor experience-based plasticity have identified specific conditions and constraints for perceptual and motor learning. These principles could serve as guidelines and a starting point when considering broader plasticity principles. Major factors include: time-scales of learning, specificity vs. difficulty, interference, order of learning, training intensity and duration, context, motivation and arousal, feedback, and variability. A full description of these principles is beyond the scope of this review; details can be found in Green and Bavelier (2008), Sagi (2010), and Preminger (2011). In addition, other insights from learning research may be applied. One important example is the role of deliberate practice in developing expertise: it has been suggested that the amount of deliberate practice is a major factor influencing the level of expertise in various domains (Ericsson et al., 1993).

It is important to note that the focus of this review is on mechanisms of change in cognitive function that are based on repeated or ongoing experiences, which gradually modify synaptic connections and consequently cognitive processing. This is a major mechanism by which many of our cognitive functions adjust and improve over time. Other mechanisms for inducing long-term changes exist, such as one-shot learning, which is often seen in episodic memory (Roediger et al., 2007), where one-time experiences may be remembered, and may induce long-term neural changes (Ludmer et al., 2011). When considering the lasting effect of their work, most artists probably have in mind such a long-term effect of a single impression. Here I put forward a different approach, proposing that art could be viewed as a medium that, by instigating repeated experiences, may induce long-term changes and serve as a means for modification, improvement, and rehabilitation of various cognitive functions.

# **THE LONG-TERM EFFECT OF ARTISTIC EXPERIENCE ON THE BRAIN**

There are many examples of potentially transformative aspects of experiencing art, such as ongoing exposure to plastic art and music influencing our perception and sense of beauty, or watching films or television affecting our social and behavioral patterns. Although multiple studies have explored brain activation while experiencing various art media, including plastic arts (Zeki, 1999; Ishai, 2011), music (Stewart, 2008), and films (Hasson et al., 2008, 2009), research on the long-term effect of repeated and ongoing exposure to artistic experiences is limited. Only a few studies have examined the lasting effect of perceiving art on the brain, as it is not easy to isolate the effect of such ongoing experiences and to follow them long-term. One recent example is a functional magnetic resonance imaging (fMRI) study of short training in recognizing objects in cubist paintings (Wiesmann and Ishai, 2010), which demonstrated that trained subjects were better than non-trained ones at the trained task and showed enhanced brain activity in the parahippocampal cortex.

Studies of the long-term effects of art often focus on the effects of practicing art—some by comparing artists to nonartists, and others by examining effects on purposely trained subjects. Specifically, the effect of practicing music on brain and cognition has been studied considerably, demonstrating structural and functional specializations (Stewart, 2008; Jancke, 2009). For example, a brain imaging study comparing brains of musicians and non-musicians demonstrated that musicians have more gray matter in primary auditory cortex (Schneider et al., 2005). Another study, which compared electroencephalography (EEG) event-related potentials during presentation of auditory stimuli between children with musical training and ones without, has demonstrated that musical training enhances the sensitivity of the auditory system (Meyer et al., 2011).

The practice of visual arts has also been examined. For example, a recent EEG study (Kottlow et al., 2011) found decreased upper alpha waves in artists vs. non-artists during various drawing tasks, suggesting that cognitive functioning, semantic memory, and object recognition are enhanced in artists. Another study, comparing eye-movements of visual artists to non-artists, showed that while viewing pictures artists spend more time scanning structural and abstract features, whereas non-artists focus more on human features and objects (Vogt and Magnussen, 2007).

Importantly, some of these studies point to the fact that practicing art may influence not only perception but also higher-order cognitive processes such as attention, memory, and executive functions. Of particular potential for such effects on high-level cognitive functions are also art domains which are centered on action by the artist, such as performing arts, or by the experiencer, such as interactive arts (e.g., interactive installations and theater, video games). In the following, I elaborate on the long-term effects of two such examples—video games and improvisation in theater and music.

# **VIDEO GAMES**

Video games are one domain in which the long-term effect on brain and cognition has been studied more extensively. This is probably due to the fact that video games are designed for repetitive experience and are used as such. Although the inclusion of video games as part of the arts is debated (Jenkins, 2000; Kroll, 2000), contemporary approaches appreciate the artistic quality of video games and their strong influence on modern culture (Smuts, 2005). In the context of our discussion, since video games clearly induce experience, are rewarding and engaging, and often communicate meaning, we view them as an artistic experience. Studies of the long-term effects of playing video games have demonstrated that action games improve performance in many sensory, perceptual, and attentional tasks that extend well beyond the trained conditions (Green and Bavelier, 2006a). Playing action video games was shown to improve visual reaction times (review in Dye et al., 2009), enhance visuomotor coordination (e.g., Drew and Waters, 1986), improve spatial visualization skills (e.g., Dorval and Pepin, 1986), enhance various aspects of attention (Green and Bavelier, 2003, 2006b; Feng et al., 2007), and improve probabilistic inference (Green et al., 2010). These effects were demonstrated by comparing gamers to non-gamers or by studying training effects on a purposely trained group vs. a control group. Recently it was demonstrated that at least some cognitive differences between gamers and non-gamers result either from pre-existing group differences or from very extensive game experience (Boot et al., 2008); thus, studying gamers to infer the neurocognitive effects of games should be done with caution.

Altogether, video game playing has been shown to result in a broad spectrum of generalized performance enhancements in perceptual abilities as well as high-level cognitive functions. It has been suggested that probabilistic inference may be a general learning mechanism underlying these wide-range improvements (Green et al., 2010). However, these cognitive enhancements could also be attributed to several other factors. First, video games involve a more holistic and variable experience than classical perceptual learning paradigms (Green and Bavelier, 2008; Sagi, 2010) and thus are more ecological. Second, experiencing video games involves behaving and acting in an intensive and dynamic environment which may lead to enhanced learning (Achtman et al., 2008). Finally, rewards in video games are very dominant and timely assigned, which could be another reason for improved learning (Achtman et al., 2008). These characteristics of video games are also in contrast to art domains where experience involves mainly perception, which incites thoughts and emotions but does not involve action, and often is less intense. The value of these various characteristics and the principles underlying the generalization power of different training and practice formats will be further discussed later.

# **IMPROVISATION IN THEATER AND MUSIC**

To improvise means to grasp the present circumstances and act accordingly and appropriately, not being constrained by habits, fears, or conformity, nor behaving randomly or by impulses (Nachmanovitch, 1990). Performance of such "non-default" behavior in complex and novel contexts characterizes the functions of the frontal lobes (Mesulam, 2002). Although we all improvise in our daily life when we act in complex or novel situations, improvisation is also strongly associated with improvised performance in the arts, specifically in music and theater.

Recently, several neuroimaging studies have examined brain activation during musical improvisation and demonstrated activation of prefrontal regions (Bengtsson et al., 2007; Limb and Braun, 2008; Berkowitz and Ansari, 2008). Furthermore, it has been noted that many of the exercises and practices performed by improvisation actors closely resemble classical neuropsychological assessments of prefrontal functions (Preminger, 2009a, 2011). Consequently, it was suggested that improvisation training can serve as training and rehabilitation for prefrontal functions (Preminger, 2009a, 2011), proposing that a major benefit of such training is its resemblance to real life, leading to a higher potential for transfer. Brain activity during some specific improvisation tasks and the long-term effect of their continued practice have been examined and demonstrated using fMRI and computerized paradigms that simulate improvisation exercises to allow for scientific investigation (Preminger et al., 2008, 2010a; Loya et al., 2010).

So far I focused on describing how a given art experience affects the brain. Given that brain and cognition have the capacity to be molded by artistic experiences, art can be created in a way that takes this knowledge into account and utilizes it to generate transformative experiences with particular artistic or rehabilitational goals in mind.

# **ART AS ENGINEERED EXPERIENCE**

Although the nature of aesthetic experience has been forcefully debated for centuries, there is general agreement in Western culture about the term aesthetic experience, and various suggestions for its components were proposed. For example, Ramachandran and Hirstein proposed the *Eight laws of artistic experience*: common principles for how visual artistic experiences excite human cognition and brain (e.g., grouping, symmetry, and peak shift principles; Ramachandran and Hirstein, 1999). Furthermore, many art theorists agree that overall there are many commonalities in the way different people (from the same culture) experience a given artwork. Some even claim that the ability to convey a consistent experience is a criterion for the quality of the artwork. Indeed, previous studies have demonstrated that the eye movements of different viewers looking at the same famous paintings have much in common (Yarbuz, 1976). Similar results were recently obtained with viewers of commercial movies (Hasson et al., 2004, 2008). Furthermore, neuroscientists have suggested that at least at an elementary level, what happens in different brains when viewing works of art is very similar (Zeki, 1998). Neuroimaging studies have demonstrated a common brain activity pattern when subjects viewed paintings (Ishai, 2011), or when subjects listened to musical pieces (Stewart, 2008). Furthermore, it was demonstrated that when viewing a movie the time course of brain activity in sensory and association cortices correlated across viewers (Hasson et al., 2004). Nevertheless, correlation across viewers was not seen in all regions, in particular not in frontal cortical regions. The subjective aspect of perception was also demonstrated by a recent study of conscious perception using Dali's ambiguous painting (Smith et al., 2006) which demonstrated that different image components drive different observers to reach the same perceptual awareness.

Clearly, the ability to control the cognitive processes depends on the art medium (e.g., plastic art vs. cinema vs. music). Furthermore, the inclination of artists to induce a common experience across all viewers varies. One clear example is the different approaches taken by various film directors and theorists: whereas the Russian montage editing approach promoted "engineering" new movie experiences by combining different perceptual components together, the other side of the spectrum can be represented by André Bazin, who advocated films which are "objective reality," leaving the viewer free to have his own subjective experience. Along these lines, it has been recently demonstrated that films by different directors vary in the consistency of brain activation across viewers: a structured movie such as *Bang! You're Dead* by Hitchcock elicited activity that was correlated across viewers in a wide range of brain regions (Hasson et al., 2008). In contrast, for an unstructured movie of a daily-life scene, very few regions demonstrated correlation across subjects (Hasson et al., 2008).
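The notion of activity that is "correlated across viewers" can be illustrated with a small numerical sketch. The code below is not the analysis pipeline of Hasson et al.; it is a minimal illustration that assumes each viewer contributes one response time course for a given brain region and summarizes consistency as the mean pairwise Pearson correlation. The function names and the toy time courses are hypothetical and were made up for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long time courses."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(time_courses):
    """Average the correlation over every pair of viewers."""
    pairs = list(combinations(time_courses, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Made-up responses of three viewers in one region (illustrative only).
# "Structured" film: the viewers' responses rise and fall together.
structured = [
    [0.1, 0.9, 0.4, 1.2, 0.3, 0.8],
    [0.2, 1.0, 0.5, 1.1, 0.2, 0.9],
    [0.0, 0.8, 0.3, 1.3, 0.4, 0.7],
]
# "Unstructured" footage: each viewer's response follows its own course.
unstructured = [
    [0.1, 0.9, 0.4, 1.2, 0.3, 0.8],
    [0.9, 0.2, 1.1, 0.3, 0.6, 0.5],
    [0.5, 0.6, 0.2, 0.4, 1.0, 0.1],
]

isc_structured = mean_pairwise_correlation(structured)
isc_unstructured = mean_pairwise_correlation(unstructured)
print(f"structured:   {isc_structured:.2f}")
print(f"unstructured: {isc_unstructured:.2f}")
```

With these toy numbers the structured condition yields a high mean correlation and the unstructured one does not, which is the qualitative pattern described above. Real analyses would additionally require preprocessing, alignment across brains, and statistical testing, none of which is sketched here.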

In summary, at least to some extent, artists can be viewed as experts in controlling and manipulating humans' perceptions as well as the emotional and cognitive experience that they induce.

# **ENGINEERING ARTISTIC EXPERIENCES FOR LONG-TERM CHANGE**

This capacity of the artist to engineer experiences, and thus brain activity, can be, and has been, taken one step further. Art can be used to induce long-term changes in cognition, emotion, and behavior. One past example of such an approach is again that of Soviet film directors (Dziga Vertov, Sergei Eisenstein), who promoted the use of cinema as a propaganda tool for emotional and psychological influence on their audience for ideological purposes. Likewise, advertising and marketing experts have been using similar principles over and over to achieve their commercial goals by changing behavioral, emotional, and thought patterns (termed NeuroMarketing; e.g., Rothschild et al., 1986; Rothschild and Hyun, 1990; McClure et al., 2004).

These approaches are aligned with the theory of experience-based plasticity in their reliance on repeated experience for modifying cognition. However, in contrast to these approaches, which pursue a goal that is external to the goals of the experiencer (e.g., buying some product, supporting a political movement), the approach presented here focuses on art as a method for influencing and optimizing cognitive and brain functions for the benefit of the observer or experiencer, similar to the approach taken by rehabilitation disciplines in designing cognitive training programs.

# **THE VALUE OF ART AS A TRANSFORMATIVE EXPERIENCE**

To understand how artistic experiences may contribute to designing transformative experiences, the first question that comes to mind is what is unique about artistic experience as opposed to other experiences such as daily-life experience or classical cognitive training experiences. The uniqueness of artistic and aesthetic experience is a broad and complex topic which has been addressed extensively by artists and philosophers. I will focus here mainly on the aspects that might be relevant to the transformative aspect of art. What qualities of artistic experience may cause it to be a better transformative experience than a daily-life experience? What value can it offer over classical cognitive training paradigms?

Some theorists describe aesthetic experience with emphasis on unity, holism, and emotional experience. "*In aesthetic experience . . .there is completeness and unity and necessarily emotion*." (Dewey, 1934). Likewise it was suggested (Beardsley, 1958) that all aesthetic experiences have in common the following features: "*focus, intensity, and unity, where unity is a matter of coherence and of completeness*." This view, which stresses the immersive and holistic nature of the experience of art, points to the advantage of art as a training tool due to its captivating and engaging nature, which presumably leads to focus, attention, and motivation, which are known to enhance learning (Green and Bavelier, 2008). A neurocognitive model was recently suggested to explain the immersive nature of art, particularly films, which proposed that under proper context and mind set, the viewer's central executive hands over control to the film information, resulting in a temporary rewarding dissociative state (Dudai, 2008). This model emphasizes the potential of some forms of art to detach their audience from the surrounding world. It is important to note that the information that forms such a holistic and immersive experience does not have to originate solely from the artwork; some information may also be complemented from intrinsic, self-related processes, as will be discussed below.

The distinctiveness of artistic experience was also extensively addressed by Zeki (Zeki, 1998), elaborating on views of various artists and theorists who emphasized the "gist" aspect of art: "*art shows us that there are also constant truths concerning forms*" (Mondrian). "*Artist . . . give(s) reality a more lasting interpretation*" (Matisse). "*The whole beauty and grandeur of Art consists . . . in being able to get above all singular forms, local customs, particularities of every kind . . . makes out an abstract idea of their forms . . .*" (John Constable) (quotes from Zeki, 1998). Along the same lines, Zeki presents the views of Hegel and Kant who deem art as being able to represent reality better than "*ephemera of sense data*," proposing that the brain, which has representation of concepts, can, when viewing art, easily extract them in a simpler manner than when viewing real-world perceptions. "*Art furnished us with the things themselves, but out of the inner life of the mind*" (Hegel, quote from Zeki, 1998). This point of view suggests two potential values for training: one concerning art as an abstract and generalized representation, and the other concerning the necessity that art induces to complete the experience from "the inner mind," thus using imagery and internal mentation.

The first aspect, regarding art as abstract and generalized representation, proposes that art extracts and presents the essence of things. Zeki further suggested that art and the brain share this capacity (Zeki, 1998), both seeking to find constancies in the world, and to represent the essence of objects and events given the dynamic and variable world around. This proposed characteristic of art again may imply that art can be a vehicle for improvement of representation, and allow more efficient access and modification of brain representations. This may suggest that learning through artistic experience can be more effective and more prone to generalization. Interestingly, this notion of a generalized representation, which is invariant of specific details and exemplars, resembles ideas behind the prevailing neural network models of associative memory (Hertz et al., 1991; Amit, 1995). In these models, memories are represented as stable states of network activity, called attractors, each attractor being the memory representation of all input stimuli that are drawn to it through network dynamics. Attractors are typically formed by Hebbian learning of overlapping patterns of synaptic efficacies (Hebb, 1949; Hertz et al., 1991; Amit, 1995), and thus the learned memory representations are formed by the specific history of experiences (Blumenfeld et al., 2006).
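The attractor idea can be made concrete with a toy Hopfield-style network, one common textbook instance of the associative memory models cited above. The sketch below is an illustration under simplifying assumptions (binary +1/-1 units, two stored patterns, asynchronous deterministic updates), not the specific model of any of the cited works; the function names and the patterns are hypothetical.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix with the Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, max_sweeps=20):
    """Update units one at a time until the state stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:  # a stable state (attractor) has been reached
            break
    return state


# Two illustrative stored "memories" (chosen to be orthogonal).
pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([pattern_a, pattern_b])

# A degraded input: pattern_a with its first unit flipped.
noisy_a = [-1, 1, 1, 1, -1, -1, -1, -1]
recovered = recall(weights, noisy_a)
print(recovered == pattern_a)  # True
```

Here the degraded input is drawn back to the stored pattern, so several slightly different inputs share one memory representation. The stored patterns themselves are fixed points of the dynamics, which is the sense in which the learned weights, shaped by the history of presented patterns, define the network's memories.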

The second aspect, the necessity to complete the experience from "the inner mind," addresses another possible advantage: the self-driven activation of some brain representations as opposed to purely externally driven activation. Indeed, artworks require the experiencer to "complete the experience" using imagination and other internally generated cognitive processes to fill in the gap and create a holistic experience. Imagery is known to be a useful tool for cognitive training in physiotherapy. For example, it was demonstrated that training for stroke patients which included motor imagery in addition to occupational therapy was more effective than training with occupational therapy alone (Page et al., 2001). In addition, filling in the gap by "inserting" the viewer's own subjective self-experience is possibly what allows for enhanced immersion and identification. Self-related thought processes, such as introspection, self-representation, autobiographic and prospective memory, and volition, are suggested to be mediated by a network of brain regions called the default-mode network (Gusnard and Raichle, 2001; Buckner et al., 2008; Preminger et al., 2010b). Such self-driven processes, combined with the externally driven ones, are possibly what make the experience comprehensive. Clearly, different forms of art vary in the gap they leave between the provided and self-generated information. This can be considered and used depending on the goals of the transformative artwork.

Another interesting aspect can be derived from the theory of performing arts (Schechner, 2002). It is suggested that the performing arts entail an aspect of "re-doing" on top of the "doing" itself. Re-doing means that any activity is a restored behavior, thus the activity comes in the context of previous instances, and these restored behaviors can be "*worked on . . . played with, made into something else . . . [even] transformed*" (Schechner, 2002). More broadly, the artist can be thought of as "re-doing" a real-life experience of the experiencer. He drives the experiencer to re-experience some "basic" stored experience, but in a manipulated way. Thus, the artist can choose to emphasize a specific aspect of the experience, decide how emotional the experience will be, in what context it will be placed, etc.

As opposed to scientists, who are specialists in breaking down the cognitive experience into pieces, artists specialize in activating multiple cognitive functions in concert, generating holistic experiences involving perception, executive functions, intrinsic processes, and sometimes even action. Whereas researchers of visual perception commonly use simple and isolated stimuli to study the visual cortex (Sagi, 2010), painters, film makers, and video game designers are experts in inducing experiences by complex visual stimuli, thus causing widespread brain activation, where different brain networks work in concert (in cinema, Hasson et al., 2004). Thus, possibly, the experience of art can be placed somewhere in between scientific paradigms and real-life experience. It is "engineered" and has the essence of the experience, but on the other hand it is also holistic and comprehensive.

# **ART AS LONG-TERM TRANSFORMATIVE EXPERIENCE**

Currently many artworks are seen by observers on a single-exposure basis, in a museum, gallery, or theater, and are created as such. However, some artworks do get repetitive exposure by their audiences, intentionally or unintentionally. A good example is art which is presented in the public space (e.g., artworks by Christo, Anish Kapoor), where people have the chance to experience it again and again when passing by. Another example is recorded music, where a particular piece of music may be heard again and again. Similarly, a painting which is hung in the corridor of an office building, or in a living room, is experienced repetitively every day. How does the experience change from exposure to exposure? Does it leave any accumulating long-term mark on observers' cognition and brain? Would the artworks be designed differently if the repetitive exposure effect were taken into account? For instance, when seeing the same painting in your office corridor day by day, you probably already ignore it. What if it were a video-art piece that slightly changes every day? Would you notice? What effect could it have on you? These questions are for artists and researchers to answer.

Neuroscience methods offer a good starting point for developing systematic measures of the impact of long-term exposure to art on specific cognitive capabilities of the viewer, and of the unique contribution of the artistic aspects to these effects. Specifically, studies of experience-based plasticity utilize various cognitive tasks and neuropsychological assessments (Lezak, 1995) to evaluate changes in various cognitive functions (e.g., Mahncke et al., 2006; Chein and Morrison, 2010; Preminger et al., 2010a). Similar tools can be used to study the effect of long-term repeated artistic experience, as done in Green and Bavelier (2006b) and Green et al. (2010). Assessing the unique effect of artistic experiences as opposed to other experiences is more complex and touches upon philosophical questions regarding the definition of artistic experience, which is beyond the scope of this review. Ignoring these philosophical aspects, and assuming commonly accepted views of art, some recent approaches taken in the neuroesthetics literature attempt to compare artistic experience to non-artistic experience in studies examining brain activation by art. One approach is to compare artistic stimuli to similar natural stimuli (e.g., Dio et al., 2011, comparing photos of sculptures of the human body with photos of real bodies, or comparing paintings to scrambled paintings in Wiesmann and Ishai, 2010). Another approach compares different genres of art (e.g., comparing representational and indeterminate paintings in Ishai et al., 2007). Similar comparisons can be used to study the unique effects of long-term exposure to art, for example comparing people who walk to their work through a park with sculptures/art with people walking to their work via the same or a similar park without sculptures, or comparing the effects of playing different genres of video games or hearing different types of music. Such approaches may provide insights into which components are effective in different art forms and genres.
Another approach that can be used is the comparison between artists and non-artists, as described above.

Utilizing continuing experience to induce long-term change of neuropsychological functions may lead to a novel viewpoint on art and a different approach to its creation. Artists can design artworks that aspire to form, in addition to a one-shot influence on experience and affect, continuing experiences which accumulate into a lasting change in specific neurocognitive functions. Similar to the questions an artist asks today (What meaning or experience do I want to convey to my audience? How will I convey it?), artists could consider the questions: What lasting change do I want to induce by ongoing exposure, and on which neurocognitive functions? How can I induce it? Principles of experience-based plasticity discovered within the perceptual, motor, emotional, and executive domains, together with the experience and knowledge acquired in the arts, could form the foundation for this new approach to art as a continuing transformative experience.

# **CONCLUSIONS**

This review links the research of experience-based plasticity with the ongoing and repeated experience of art in its various forms. The integration of knowledge from these different fields offers a potential for both disciplines. Scientists who study plasticity and develop paradigms to induce long-term neurocognitive changes may benefit from the various advantages of the holism, engagement, and the "gist" that art experiences offer. On the other hand, artists may use the experience-based plasticity approach to extend their aesthetic creations from immediate one-shot experiences to ongoing or repeated experiences that induce long-lasting neurocognitive effects.

# **ACKNOWLEDGMENTS**

I would like to thank Barak Blumenfeld, Lior Noy, Oran Singer, and the referees for critical reading of earlier versions of this manuscript, and Tanya Preminger, Tamar Erez, Ehud Ben-Yaakov, Uri Hasson, Eydor Diamant, and Guido Hesselmann for fruitful discussions. This manuscript was partially made possible through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the author and do not necessarily reflect the views of the John Templeton Foundation.

# **REFERENCES**

games. *Curr. Dir. Psychol. Sci.* 18, 321–326.

and aesthetics of indeterminate art. *Brain Res. Bull.* 73, 314–324.

"Neural activity associated with self-paced overt word generation – an fMRI study," *Israeli Society for Neuroscience 19th Annual Meeting*, Israel. [Paper in preparation].


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 April 2011; accepted: 02 April 2012; published online: 24 April 2012.*

*Citation: Preminger S (2012) Transformative art: art as means for long-term neurocognitive change. Front. Hum. Neurosci. 6:96. doi: 10.3389/fnhum.2012.00096*

*Copyright © 2012 Preminger. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **THE ROLE OF THE VISUAL ARTS IN ENHANCING THE LEARNING PROCESS**

**Christopher W. Tyler and Lora T. Likova**

# The role of the visual arts in enhancing the learning process

# *Christopher W. Tyler\* and Lora T. Likova*

*Smith-Kettlewell Brain Imaging Center, The Smith-Kettlewell Eye Research Institute, San Francisco, CA, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Antoni Rodriguez-Fornells, University of Barcelona, Spain*

*Lutz Jäncke, University of Zurich, Switzerland*

#### *\*Correspondence:*

*Christopher W. Tyler, Smith-Kettlewell Brain Imaging Center, The Smith-Kettlewell Eye Research Institute, 2318 Fillmore Street, San Francisco, CA 94115, USA. e-mail: cwt@ski.org*

With all the wealth of scientific activities, there remains a certain stigma associated with careers in science, as a result of the inevitable concentration on narrow specializations that are inaccessible to general understanding. Enhancement of the process of scientific learning remains a challenge, particularly in the school setting. While direct explanation seems the best approach to expedite learning any specific subject, it is well known that the ability to deeply absorb facts and concepts is greatly enhanced by placing them in a broader context of relevance to the issues of everyday life and to the larger goals of improvement of the quality of life and advancement to a more evolved society as a whole. If the sciences can be associated with areas of artistic endeavor, they may be viewed as more accessible and favorable topics of study. There is consequently an urgent need for research in the relationship between learning and experience in the arts because both art education and scientific literacy remain at an inadequate level even in economically advanced countries. The focus of this review is the concept that inspiration is an integral aspect of the artistic experience, both for the artist and for the viewer of the artwork. As an integrative response, inspiration involves not only higher cortical circuitry but its integration with deep brain structures such as the limbic system and medial frontal structures, which are understood to mediate the experience of emotions, motivational rewards, and the appreciation of the esthetic values of the impinging stimuli. In this sense, inspiration can turn almost any occupation in life into an avocation, a source of satisfaction in achieving life goals. Conversely, when inspiration is lacking, the motivation to learn, adapt, and prosper is impeded. Thus, inspiration may be viewed as a potent aspect of human experience in linking art and science.

#### **Keywords: art, learning, neuroscience, limbic system, inspiration**

*"How can we develop techniques in learning to engage the emotions? Learning should not be driven by fear but by desire. Why are the arts considered separate from the sciences? What were the first arts? How did the scientific enquiring mind arise?"*

*d'Amboise (2008)*

To a significant extent, our future will be determined by how we deal with the unprecedented explosion of knowledge with which we are presently confronted. In a world of hyper-technological advancement, there tends to be an intense focus on the technical and scientific aspects of the world around us, with a consequent neglect of other aspects of life that can enhance the learning of complex material, social skills, and overall quality of life. Although more information has been generated in the past century than in all of human history before it, the gulf between science and the arts has grown ever greater, and students are identified as being of one type or the other early on, creating self-fulfilling prophecies of their life trajectories. Even with all the wealth of scientific activities, there remains a certain stigma associated with careers in science, as a result of the inevitable concentration on narrow specializations that are inaccessible to general understanding. This disconnect between art and science may have had unintended consequences. Apart from the danger of creating a generation of scientists who lack an esthetic sense or appreciation of metaphorical expression, and of artists without scientific literacy, opportunities for cross-pollination and mutual benefit are being squeezed out of contention by professional demands.

# **STUDIES OF ARTS, CREATIVITY, AND LEARNING**

Despite the divergence between arts and sciences, a growing body of quantitative research suggests that the learning of science may be enhanced by relationships with the arts. Contemporary research is beginning to explore explicit neuroscientific hypotheses concerning the effects of activities such as drawing, visual esthetics, and dance observation.

Visual art learning is reliant on a complex system of perceptual, higher cognitive, and motor functions, thus suggesting a shared neural substrate and strong potential for cross-cognitive transfer in learning and creativity. Within just a few weeks, for example, human infants can imitate an action such as sticking out the tongue in response to someone sticking out his tongue at them. How does the infant know just what motor action plans to implement based only on a visual input? Mirror neurons may account for this ability, translating visual input to motor output, underlying a connection between visual arts and movement, and the auditory arts and music. From pre-historical times, visual art has been a form of communication deeply imprinted in human nature; the act of experiencing art and esthetic appreciation in the "receiver" also has the power of cross-cognitive effect at any time during individual development. Compositional universals have been shown to govern the design of visual artworks across ages and cultures (Arnheim, 1988; Tyler, 1998, 2007; Ramachandran and Hirstein, 1999).

The ability to tolerate ambiguity and uncertainty during the creative process is an important mental trait. The tolerance for ambiguity is also an important attribute in the learning of science in order to deal with the complexities and ambiguities of scientific knowledge. Unlike its popular stereotype, science is replete with ambiguities and contradictions that have to be resolved in order for learning to proceed. Allusive thinking by appearance alone lends intuitive judgment to overly rational thought and can lead to the discovery of meaningful metaphors (Tucker et al., 1982; Smolucha and Smolucha, 1985; Peterson, 1993). This type of thinking could be developed with focused visual education methods and its applicability shown in a variety of academic disciplines.

In terms of accessible art practice, prior research on neurological patients has shown a conceptual link between drawing and language (Gainotti et al., 1983; Swindell et al., 1988; Kirk and Kertesz, 1989), and these researchers hypothesized that drawing may access the semantic system in a manner that improves cognitive access. Studies exploring the issue of mechanisms shared between different cognitive modalities revealed that mechanisms used to process spatial representations in the visual modality are shared with other modalities, such as the processing of pitch in music (Douglas and Bilkey, 2007). These findings have implications not only for scientific learning, but also for learning, pedagogical principles, and general social and educational policies.

# **THE NEED FOR LEARNING ENHANCEMENT**

The enhancement of learning remains a challenge, particularly in the school setting. While direct explanation seems the best approach to teaching any specific subject on the curriculum, it is well known that the ability to absorb reams of facts and concepts is greatly enhanced by placing them in a broader context of relevance to both the issues of the quality of everyday life and the larger goals of human advancement to a more evolved status of society as a whole. It is these larger goals that evoke the need for research in the relationship between learning and experience in the arts. The need is urgent because arts education and scientific literacy remain at a low level in the U.S. and educational interventions are sorely deficient. To the extent that the sciences can be associated with relevant areas of artistic endeavor, they may be viewed as more accessible and more favorable as a topic of study. Moreover, there is an increasing level of neuroscience research that supports the idea of enhancing transfer of learning abilities from the arts to other cognitive domains.

All too often, the arts are marginalized in our schools. In response to this marginalization, educators have sought to justify the arts in terms of their instrumental value in promoting thinking in non-arts subjects considered more important, such as reading or mathematics (Murfee, 1995). However, there has been little convincing research that the study of the arts promotes academic performance or elevates standardized test scores (Winner and Hetland, 2000). To really understand whether art learning transfers to academic performance, we need first to assess what is actually learned in the arts and then to specify the mechanisms that underlie a transfer hypothesis. Hetland et al. (2007) therefore made a qualitative, ethnographic meta-analysis of the kinds of cognitive skills actually taught in the arts classroom, choosing the visual arts as their point of departure. The goal was to understand what is taught, in order to be able to develop a plausible theoretical transfer hypothesis. Eight "studio habits of mind" were identified as being taught in visual arts classes. Students are taught (1) to observe – to see with acuity; (2) to envision – to generate mental images and imagine; (3) to express – to find their personal voice; (4) to reflect – to think meta-cognitively about their decisions, make critical and evaluative judgments, and justify them; (5) to engage and persist – to work through frustration; (6) to stretch and explore – to take risks, "muck around," and profit from mistakes; and of course (7) to develop craft; and (8) to understand the art world. This work is the first to demonstrate objectively the kinds of thinking skills and working styles taught in arts classes. The group is now investigating the possibility that the skill of envisioning, taught in visual arts classes, may foster geometric reasoning ability.

# **STUDIES OF INTERSENSORY CONNECTIONS AND THE ARTS**

Neuroimaging studies have revealed that visual arts as well as music engage many aspects of brain function, and involve nearly every neural subsystem identified so far (Zeki, 1999; Solso, 2001; Brown et al., 2006; Cross et al., 2006; Levitin, 2006; Likova, 2010a,b). Could this fact account for claims that arts exercise other parts of the brain and improve other cognitive abilities? Experience with the visual arts may be expected to produce similar facilitatory effects through the learning of artistic styles (Hess and Wallsten, 1987), although there is less formal research on the effect of visual art on learning enhancement in general. The visual system is legendary for its ability to analyze the complex interplay among spatial structures in 2D and 3D space. These powerful analytic capabilities are far in advance of what can be achieved by even the most sophisticated computer algorithms, but they are central to any achievement in visual arts (Kubovy, 1986; Gombrich, 1994, 2000; Tyler, 1998; Ramachandran and Hirstein, 1999; Livingstone, 2002). Indeed, neuroscience studies have begun to develop important techniques for the study of the neural circuitry mediating the appreciation of esthetic qualities (Zeki, 2001, 2004; Kawabata and Zeki, 2004; Tononi, 2004). Brain imaging studies have identified the cortical substrates for the encoding of a variety of art-related properties, from primary figure/ground categorization (Likova and Tyler, 2008), and long-range symmetry properties (Tyler, 1994; Norcia et al., 2002; Sasaki et al., 2005; Tyler et al., 2005), through facial expressions (Kanwisher et al., 1997; Zaidel and Cohen, 2005; Chen et al., 2006) to dynamic athletic performances such as dance (Brown et al., 2006; Cross et al., 2006; Brown and Parsons, 2008; Likova, 2010a, 2012a,b). Such experience with the complex structures utilized in the visual arts is likely to make an important contribution to the enhancement of learning in all fields of endeavor.

The analysis of such complex spatial and dynamic spatial structures is one of the key aspects underlying the creativity of advanced thinking. Creative learning is a key aspect of the human thought processes that crosses many domains of neural functioning (Gardner, 1982; Glover et al., 1989; Csikszentmihalyi, 1997). The role of emotional evaluation in the cognitive processes underlying creativity has been emphasized by Damásio (1994), a theme that he has elaborated into other domains of human endeavor in subsequent work. Indeed, Dietrich (2004) has proposed that there are four basic types of creative learning, each mediated by a distinctive neural circuit. Creativity may arise either from a basis of deliberate control or from spontaneous generation. When the result of deliberate control, the prefrontal cortex instigates the creative process; the spontaneous generation may arise from activation of the temporal cortex. Both processing modes, deliberate and spontaneous, can guide neural computation in structures that contribute emotional content and in those that provide cognitive analysis, yielding the four basic types of creativity. This theoretical framework systematizes the interaction between knowledge and creative thinking, and how the nature of this relationship changes as a function of domain and age.

Defining art as a communicative system that conveys ideas and concepts explains why it is possible for the same brain structures that support other cognitive functions, such as human language, to be involved in arts such as music or drawing. This characterization presupposes millions of years of brain evolution and biological adaptive strategies. As a multidisciplinary communicative system, the arts provide an ideal platform for learning about the pleasure of knowing, which in turn provides the motivational inspiration to explore further, to ask questions, analyze and synthesize, and engage in convergent and divergent thinking.

#### **LEARNING AND ACTIVE INVOLVEMENT IN THE ARTS**

The current expansion of interest in the science of learning motivates exploration of the expanded possibilities of conceptual interrelationships offered by training in the arts. The difficult task of understanding and effectively enhancing learning across disciplines, ages, and cultural specificities is a high priority throughout the world, and may be particularly benefited by training in and even exposure to the arts.

Contemporary research is beginning to explore new neuroscientific hypotheses concerning the effects of learning in activities such as musical performance, drawing, visual esthetics, and dance, on learning in non-artistic domains. Neuroimaging studies have started to reveal that the process of drawing shares cortical substrate with writing, access to the semantic system, memory, naming, imagery, constructional abilities, and the ability to estimate precise spatial relations. Learning in the domain of visual art, in particular, is reliant on a complex system of perceptual, higher cognitive, and motor functions, suggesting a shared neural substrate and strong potential for cross-cognitive transfer in learning and creativity. For instance, a case study by Solso (2001) has revealed significant processing differences between the brains of a professional artist and a novice during drawing in the scanner; the comparative analysis of the activation patterns suggests a more effective network of cognitive processing for the brain of the artist. Results consistent with some of these conclusions have also been reported on the basis of differences in alpha rhythm as a function of level of artistic training (Kottlow et al., 2011). Recent neuroimaging studies in our lab have addressed the process of learning to draw by comparing BOLD fMRI brain activity before and after training to draw, and correlating it with the advance in drawing performance. These studies, run in diverse groups of people, from sighted to totally blind from birth, were made possible by a unique Cognitive–Kinesthetic Training Method that Likova developed for learning to draw even under the condition of total blindness. Indeed, in blind subjects who have never had any visual input, training in a spatial drawing skill generates dramatic utilization of occipital lobe resources as early as the primary "visual" cortex for this purely spatial task (despite the complete lack of any visual experience), as well as a reorganization in a network of temporal, parietal, and posterior frontal lobe regions consistent with its multifunctional role (Likova, 2010a,b, 2012a). An additional assessment showed a significant improvement in generic spatial and spatiomotor cognition abilities as well.

Another approach to the neuroanatomical underpinnings of visual art production and appreciation comes from observations of brain damage in established artists (Zaidel and Cohen, 2005), which also provide insight into the relationship between art and other communicative displays by biological organisms, and the role that beauty plays in art. Art should be regarded as a cognitive process in which artists engage the most perplexing issues in present experience and try to find a way of symbolizing them visually so that they can bring coherence to their experience. In consequence, the definition of art is constantly changing in relation to its time. Understanding how we symbolize our experience, how we use symbolic form to organize our psyches, and what the neuroanatomical correlates of these processes are, will have obvious implications for learning. From pre-historical times, visual art has been a form of communication deeply imprinted in human nature. Compositional universals govern the design of visual artworks across ages and cultures, and the act of art experience and appreciation in the "receiver" also has the power of cross-cognitive effect at any time point in individual development. These findings have implications not only for biomedical sciences, but also for learning, pedagogical principles, and general social and educational policies.

Another key aspect that the arts bring to the mix is the creativity involved in the generation of the art work, which was analyzed into its experiential components by Wallas (1926), involving


The Wallas (1926) account is largely cognitive, emphasizing the processes involved in reaching the solution to the problem. He does not specifically address the motivational aspects of how these processes would enhance the learning experience, except in the implied rewardingness of the insight (or moment of illumination) of the problem solution. In Wallas's scheme, the preparation stage corresponds to much of the learning required in the educational process, but what is not mentioned is the *inspiration* that characterizes the motivation for people to take up avocations and hobbies, i.e., the sense of enthusiasm and zeal that some domain of activity is of particular interest or relevance to a person. It is this inspiration, making the preparation stage intrinsically rewarding rather than a painful grind, that can make all the difference to the learning experience.

A fine example of the creative moment in science was described by Andrew Feinberg in a keynote lecture on the expanding field of epigenetics (Seay, 2010). When on an architectural visit to Westminster Abbey, he noticed that adjacent to Isaac Newton's grave is a small plaque indicating Paul Dirac's grave (who was awarded the Nobel Prize for advances in the stochastic theory of quantum mechanics). Next to it is Charles Darwin's grave, with no adjacent plaque, but the juxtaposition gave him the epiphany that the modern version of Darwin's theory would be the stochastic variation in epigenetic processes that Feinberg subsequently developed into a major scientific breakthrough. In this case his preparation was many years of scientific research, but it was the foray into the nonscientific architectural tour that gave rise to the novel insight that took his work to the next level.

# **ARTS, LEARNING, AND INSPIRATION**

Another key aspect of learning that can be facilitated by the arts is the emotional inspiration to be involved in the learning process. Inspiration is an integrative mental function at the intersection of (a) cognitive, (b) emotional, and (c) conative processes. (Conative processes are those goal-directed functions relating to the classic third component of the mind championed by Kant, 1788, and McDougall, 1923, constituting the desire, ambition, and will.) As such, inspiration is an aspect of mental experience that involves not just cortical circuitry but its integration with the limbic system and medial frontal structures that are understood to mediate the experience of emotional desires, motivational rewards, and the appreciation of the integrative esthetic values of the impinging stimuli (Damásio, 1994). This system goes beyond classical concepts of beauty to incorporate the elegance of theoretical concepts, the appreciation of the emotive power of the diverse array of post-modern art installations, the grace and dynamism of athletic performances, the economy and evocativeness of political addresses, the interconnected synergy of natural ecological systems, and innumerable other examples throughout the sphere of our world knowledge. In a sense, inspiration can turn almost any occupation in life into an avocation, a source of satisfaction in achieving life goals. It is when individuals feel themselves part of a larger enterprise that they are inspired to learn, to achieve, and to pursue a meaningful career. Conversely, when their job involves performing the same daily drudgery, inspiration is lacking and they lack motivation to learn, adapt, and prosper.

Thus, inspiration is a component of the emotional response to stimuli and actions, when they are perceived as uplifting or emotionally rewarding. As such it should be expected to be mediated by the limbic system and the reward systems of the brain. An impressive array of neural processing appears to be dedicated to the extraction of reward-related information from environmental stimuli and use of this information in the generation of goal-directed behaviors. In particular, the differential characteristics of activations seen in the dopaminergic mesencephalon, the dorsal striatum, and the orbitofrontal cortex provide distinct examples of the different ways in which reward-related information is processed. Moreover, the differences in activations seen in these three regions demonstrate the different roles they may play in goal-directed behavior (Hollerman et al., 2000). The dopaminergic systems appear to reflect a relatively pure signal of a reward prediction error. The representation of goal-directed behaviors may involve the basal ganglia structures of the putamen, globus pallidus, and striatum (Acevedo et al., 2011; Paulmann et al., 2011), where different subpopulations of neurons differentiate between rewarding and non-rewarding outcomes of behavioral acts and are activated at different stages in the course of goal-directed behaviors, with largely separate populations activated following presentation of conditioned stimuli, preceding reinforcers, and following reinforcers (Apicella et al., 1991; Hollerman et al., 2000). Moreover, unlike the dopamine system, much of the striatal system responds to predicted rewards (Salimpoor et al., 2011). These activations could serve as a component of the neural representation of the appropriate goal-directed behaviors in response to the environmental contingencies associated with desirable goals (Engelmann et al., 2009). Finally, neuronal activations in the orbitofrontal cortex appear to encode the relative motivational significance of different rewards.
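As an illustrative aside, the notion of a reward prediction error can be made concrete with a minimal computational sketch. The sketch below is not taken from any of the cited studies; it assumes a simple Rescorla-Wagner style update, and the function name, learning rate, and reward sequence are all hypothetical choices made for illustration. It shows only the general idea that the error signal is large when a reward is unexpected and shrinks as the reward becomes predicted.

```python
def simulate_prediction_errors(rewards, learning_rate=0.2):
    """Return the reward prediction error on each trial.

    Assumes a Rescorla-Wagner style update (an illustrative choice,
    not a model proposed in the text).
    """
    value = 0.0  # current reward expectation; hypothetical starting point
    errors = []
    for reward in rewards:
        delta = reward - value          # prediction error: received minus expected
        value += learning_rate * delta  # move the expectation toward the outcome
        errors.append(delta)
    return errors


# Ten identical rewards: the error starts at its maximum and decays
# as the reward becomes fully predicted.
errors = simulate_prediction_errors([1.0] * 10)
```

In this toy setting the error decays geometrically across trials, which is one simple way to picture the contrast drawn above between a signal that tracks unpredicted reward and one that continues to respond to predicted reward.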

Further insights into this reward circuit may be obtained from psychopharmacological studies. In particular, cocaine is known not simply for inducing a sense of reward, but for producing an enhanced (though illusory!) sense of well-being, capability, and quasi-omnipotence. These are the core experiences of inspiration, which can evidently be accessed by this biochemical substitute. In functional imaging studies, focal signal increases for acute cocaine infusion are found in such limbic and basal ganglia structures as the caudate, putamen, basal forebrain, nucleus accumbens, thalamus, hippocampus, and parahippocampal gyrus; in the insular, subcallosal, cingulate, lateral prefrontal, temporal, parietal, and striate/extrastriate cortices; and in the midbrain structure of the ventral tegmentum and the pons (Breiter et al., 1997). Similarly, some of these areas are also implicated in the responses to a romantic image (such subcortical structures as the caudate nucleus, globus pallidus, putamen, lateral thalamus, subthalamic nuclei, and ventral tegmental area; Bartels and Zeki, 2004; Aron et al., 2005; Acevedo et al., 2011). Conversely, acute cocaine infusion produced signal *decreases* in the temporal pole, medial frontal cortex, and amygdala (Breiter et al., 1997).

As indicated by these studies, activation of the frontal reward network should not be treated as a unitary mental function, since reward in human experience incorporates a diversity of aspects. In particular, it is worth distinguishing four mutually complementary domains of reward – the appetitive, cognitive, social/conative, and inspirational aspects of reward. The first three aspects may be seen as corresponding to the Freudian mental subdivisions of id, ego, and superego functions – i.e., respectively those of hedonic gratification, of the effectiveness of the reward strategies, and

artistic objects that give rise to these experiences. The extent to which different key parameters play a role in the artistic experience should be investigated parametrically, to determine how these functions map onto the spectrum of artistic expertise.


# **REFERENCES**


a motor simulation de novo: observation of dance by dancers. *Neuroimage* 31, 1257–1267.


New analytic techniques will be necessary for understanding the whole physiological reaction, and will open the opportunity for converging approaches.


# **ACKNOWLEDGMENTS**

This research was supported by NSF/SLC Grants #0824762 to C. W. Tyler and #0846430 to L. T. Likova.

Hirstein. *J. Conscious. Stud.* 7, 17–27.


of the basal ganglia. *PLoS ONE* 6, e17694. doi: 10.1371/journal.pone.0017694


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 11 April 2011; paper pending published: 22 October 2011; accepted: 20 January 2012; published online: 08 February 2012.*

*Citation: Tyler CW and Likova LT (2012) The role of the visual arts in enhancing the learning process. Front. Hum. Neurosci. 6:8. doi: 10.3389/fnhum.2012.00008*

*Copyright © 2012 Tyler and Likova. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **DRAWING ENHANCES CROSS-MODAL MEMORY PLASTICITY IN THE HUMAN BRAIN: A CASE STUDY IN A TOTALLY BLIND ADULT**

**Lora T. Likova**

# Drawing enhances cross-modal memory plasticity in the human brain: a case study in a totally blind adult

# *Lora T. Likova\**

*The Smith-Kettlewell Eye Research Institute, San Francisco, CA, USA*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*David J. McGonigle, Cardiff University, UK*

*Hidenao Fukuyama, Kyoto University, Japan*

#### *\*Correspondence:*

*Lora T. Likova, The Smith-Kettlewell Eye Research Institute, 2318 Fillmore Street, San Francisco, CA 94115, USA. e-mail: lora@ski.org*

In a memory-guided drawing task under blindfolded conditions, we have recently used functional Magnetic Resonance Imaging (fMRI) to demonstrate that the primary visual cortex (V1) may operate as the visuo-spatial buffer, or "sketchpad," for working memory. The results implied, however, a modality-independent or amodal form of its operation. In the present study, to validate the role of V1 in non-visual memory, we eliminated not only the visual input but all levels of visual processing by replicating the paradigm in a congenitally blind individual. Our novel Cognitive-Kinesthetic method was used to train this totally blind subject to draw complex images guided solely by tactile memory. Control tasks of tactile exploration and memorization of the image to be drawn, and memory-free scribbling were also included. FMRI was run before training and after training. Remarkably, V1 of this congenitally blind individual, which before training exhibited noisy, immature, and non-specific responses, after training produced full-fledged response time-courses specific to the tactile-memory drawing task. The results reveal the operation of a rapid training-based plasticity mechanism that recruits the resources of V1 in the process of learning to draw. The learning paradigm allowed us to investigate for the first time the evolution of plastic re-assignment in V1 in a congenitally blind subject. These findings are consistent with a non-visual memory involvement of V1, and specifically imply that the observed cortical reorganization can be empowered by the process of learning to draw.

**Keywords: drawing, blind, brain plasticity, primary visual cortex V1, working memory, visuo-spatial sketchpad, learning, fMRI**

*"... we must look upon artists as persons whose observation of sensuous impression is particularly vivid and accurate, and whose memory for these images is particularly true."*

*Helmholtz, 1871*

# **INTRODUCTION**

We may not be aware of the complexity of drawing, but when analyzed in detail it becomes clear that drawing is an amazing process that requires precise orchestration of multiple brain mechanisms; perceptual processing, memory, precise motor planning and motor control, spatial transformations, emotions, and other diverse higher cognitive functions, are all involved. In terms of the multiple-intelligence theory (Gardner, 1983), drawing heavily employs such categories as bodily-kinesthetic and visuo-spatial intelligence.

This operational complexity may be one reason for the neglect of drawing as an experimental paradigm, being considered too complex to be successfully analyzed. In contrast to other arts, such as music, there have been only a few neuroimaging studies of the neural mechanisms of visual art, and of drawing in particular. In actuality, most of the available research on drawing (e.g., Makuuchi et al., 2003; Ferber et al., 2007; Ogawa and Inui, 2009) was heavily motivated by its importance in neurological tests, such as for the diagnosis of constructional apraxia (Mayer-Gross, 1935; Piercy et al., 1960; de Renzi, 1982; Grossi and Trojano, 1999; Lee et al., 2004).

More than a century ago, in his famous 1871 lecture, Helmholtz pointed out that artists possess not only advanced observational capabilities, but also *enhanced memory* for the observed images. While the first part of this claim is often mentioned in vision science, the second—memory-related—part has been widely neglected, as though it did not reach the right audience.

We sought to understand if there is something especially advanced about artists' memory. And if so, is that advanced memory an inborn artistic trait or can it be engendered by the process of learning to draw? Drawing, and in particular memory-guided drawing, challenges the encoding of detailed spatial representations, their retrieval from memory and "projection" back onto a mental high-resolution "screen," so as to guide the motion of the drawing hand with the requisite precision.

#### **THE ROLE OF THE MEMORY BUFFER**

One theoretical construct that meets these demands is the *visuospatial memory buffer*, also termed the "visuo-spatial sketchpad." In the classic model of working memory (as proposed by Baddeley and Hitch, 1974; Baddeley, 1986, 2000, 2003), this buffer is a major component that instantiates the function of developing and holding in working memory an accurate spatial representation of the retrieved object, providing a "sketch" that can be further spatially manipulated by the central executive to guide goal-directed behaviors. This influential model helps to understand the processes of memory encoding and retrieval for the kinds of spatial representations involved in the drawing task, allowing for the active maintenance of information about stimuli no longer in view. (It is not by chance that this landmark component of the real drawing process, the use of a disposable sketchpad, was the metaphor Baddeley employed for the memory module in question.) This model has provided major insights into functional neuroimaging of memory, and conversely, it has been successively updated based on neuroimaging data. For example, it has recently been proposed that working memory is not restricted to retention only, and that working memory and long-term memory may be functionally interrelated (e.g., Buchsbaum and D'Esposito, 2009; Ishai, 2009; Ranganath, 2009).

Where in the brain may the working memory "sketchpad" be implemented? Previous theoretical and neurophysiological studies in non-human primates (e.g., Mumford, 1991, 1996; Lee et al., 1998; Super et al., 2001a,b; Lee and Mumford, 2003; Super, 2003) had suggested that the primary visual cortex (area V1) may provide for the high-resolution visuo-spatial "sketchpad" function. These suggestions are based on the fact that V1 is unique in being the *largest topographic map* in the brain, with the *highest spatial resolution*, in addition to having a connectivity allowing *parallel* processing of the information from the whole map surface—these features being critically important for a successful "sketchpad" implementation.

Traditionally, however, all areas in the early visual cortex have been considered predominantly bottom-up, purely sensory, and devoted to the visual modality. Nevertheless, increasing evidence has shown that they are also subject to a number of top-down processes. Most recently, early visual cortex has been implicated in visual *memory*. It is now considered, for example, that this cortex, and V1 in particular, are not only important for processing information about the immediate sensory environment, but can also retain specific visual information for working memory over periods of many seconds in the absence of direct input to support higher-order cognitive functions (e.g., Williams et al., 2008; Harrison and Tong, 2009).

# **THE AMODAL MEMORY HYPOTHESIS**

Furthermore, visual cortex can be activated in a number of *nonvisual* perceptual and memory tasks. It has been shown that verbal-memory can generate robust activation in the visual cortex of congenitally blind individuals (Amedi et al., 2003), and furthermore, episodic memory retrieval in congenitally blind individuals was associated with V1 activation (Raz et al., 2005).

The present study extends these results to memory in the tactile modality. We investigated whether V1 is involved in a tactile working memory task in a congenitally blind individual. This question is of high importance for models of the functional architecture of human memory.

We have proposed that a highly demanding *tactile-memory* task, such as drawing guided solely by tactile-memory, is a powerful technique for addressing this question (Likova, 2010a, 2012). Beyond this, drawing has the unique advantage of providing an explicit readout of the memory content recalled during task performance, as it objectively "externalizes" the specific memory representation guiding the motor output in each trial.

Employing this novel memory paradigm in blindfolded subjects, we have recently found that tactile-memory drawing strongly activates V1 (although no visual or even tactile information was available), while massively deactivating the entire extrastriate hierarchy (Likova, 2010a, 2012; Likova and Nicholas, 2010c). It is important to notice that this pattern of activation is quite distinct, almost the *inverse* of that for "classical" high-order functions, such as the known hierarchical pattern for visual imagery. The visual imagery signal propagates in top-down fashion through the visual hierarchy, being strongest in the *higher* extrastriate areas, *decreasing* towards the lower areas (e.g., Ishai and Sagi, 1995; Kreiman et al., 2000; O'Craven and Kanwisher, 2000; Kosslyn et al., 2001; Kosslyn and Thompson, 2003; Tong, 2003; Mechelli et al., 2004; Amedi et al., 2005; Merabet et al., 2005), and often *not* reaching V1 at all; thus, there is still an open debate whether imagery activates V1 itself or not. Notably, this visual imagery "signature" is entirely *opposite* to the occipital pattern generated by our drawing-from-tactile-memory task, which was characterized by having the *strongest* (and *only*) occipital activation in V1, while the extrastriate pathways were "*cut off*" by deactivation.

Consequently, the unique pattern of results in the blindfolded study was not compatible with an explicit role for visual imagery in this form of working memory. Instead, the strong V1 activation was more consistent with the hypothesis of the implementation of a working memory component, such as the spatial memory buffer, in this area. In view of the lack of any visual input under blindfolding, however, our findings suggest a re-conceptualization of the putative buffer as being *modality-independent* or *amodal*.

In the current study, to further probe the *amodal* hypothesis, and to address the essential nature of drawing, we eliminated not only the visual input but any potential higher-level visual processing by selecting a congenitally blind novice. In contrast to late-onset blind individuals, who (similarly to the sighted) have had enough visual stimulation to develop vision and its associated visual imagery, visual memory, etc., congenitally blind individuals have had no access to visual information throughout life. Thus, congenital blindness, and even the wider category of early blindness, is considered to eliminate any visual influences of both bottom-up and top-down nature. In particular, it has been recognized as "clearly true that visual imagery does not account for cross-modal activation of visual cortex for the early blind" (Lacey et al., 2009), and that the congenitally blind are unable to perform visual imagery tasks (e.g., Goyal et al., 2006).

# **DRAWING IN BLIND INDIVIDUALS**

Drawing, and visual art in general, is presumed to be highly dependent on the visual modality (as implicit in its specification as "visual art"). However, there are totally blind people, including those blind from birth, who have been able to develop visual art skills (e.g., Kennedy, 1993, 2000; Heller, 2000; Kennedy and Igor, 2003; Kennedy and Juricevic, 2006; Ponchillia, 2008). Perhaps the most famous congenitally blind artist is Esref Armagan from Turkey, who draws and even paints in color although he has never seen light. Astonishingly, this blind artist is able even to draw in one-point perspective, showing a respectable grasp of how horizontal lines converge to a point in the distance. Amedi et al. (2008) studied this artist using fMRI, and found activation in a widely distributed network, including not only frontal and parietal regions, but also brain areas normally associated with early vision, such as the calcarine sulcus. This is the only neuroimaging study of drawing in the blind prior to our studies. However, as Armagan was a professional painter with many decades of experience, these authors did not have the opportunity to study the process of learning-based brain reorganization itself, but only to look at the completed learning state. To our knowledge, therefore, our studies are the first on *learning* to draw in the blind, and on the corresponding *dynamics* of brain reorganization.

# **THE NEED FOR THE COGNITIVE-KINESTHETIC TRAINING METHOD**

Our philosophy is that, because drawing encompasses a large range of demanding *perception-to-action* components, it provides for elaborated training in active spatial cognition. It forces the learners not just passively to explore the stimuli, but to develop detailed and stable memory representations in order to be able to re-express these representations in an explicit sensory format (i.e., to communicate the memory contents through drawing). These characteristics make drawing a potent paradigm for the study of memory.

Looking at line drawings by Matisse (e.g., **Figure 1**), we see how expressive only a few lines can be! Their appreciation seems such an effortless process that we are not aware of the invisible work of powerful brain mechanisms that provide the artist with the ability to transform 3D objects into their 2D projections by abstracting just the right contours into a line drawing; neither are we aware of how complex is the "inverse transformation" of such 2D drawings into an immediate understanding of the 3D objects that they represent.

It came as a surprise, therefore, to find that, when exposed for the first time to 2D raised-line drawings, many blind people have tremendous difficulty even in tactile recognition and comprehension of the 3D objects depicted. This negative finding, however, provided the opportunity to employ a *learning paradigm* in adults to investigate the developmental evolution of cognitive components of key importance for drawing, such as spatial memory.

In addition to the fact that blind individuals are used to the haptic exploration of 3D shapes rather than their 2D projections or abstract form, the explored 2D images are usually much larger than the pad of the index finger, so "scanning" movements of the finger along with the whole hand are needed to sense the entire image (Loomis and Klatzky, 2008). This requires extensive spatiotemporal binding and memory of the continuously updating image. Thus, although there are some professional blind artists, both recognition of drawings and reproduction by drawing are extremely challenging for blind people. Furthermore, specific psychological barriers have to be faced and overcome, because most blind people find it difficult to believe that they would be able to learn to draw and would not even make the attempt.

All these considerations have been serious obstacles to conducting non-visual drawing studies, and have motivated the development of the Cognitive-Kinesthetic Method (Likova, 2010a,b), which has proven to be both effective and inspirational for blind people. Key components of this learning method are the incorporation of top-down feedback through conceptual and/or spatial interpretations, and encouraging enjoyment from the learning process.

The congenitally blind individual of this study was well-adapted to operating in the everyday spatial world, including longstanding familiarity with complex tactile manipulations and Braille reading, but had no writing or drawing experience. Functional Magnetic Resonance Imaging (fMRI) was run before and after the Cognitive-Kinesthetic training in order to investigate the dynamics of brain reorganization as a function of learning to draw. Although this congenitally blind individual was a mature adult, her brain showed dramatic functional reorganization. Most remarkably, V1, which exhibited no specific involvement before training, was massively recruited in the drawing task after training. Temporal waveform analysis revealed characteristic phases in the progression of the reorganization process.

# **MATERIALS AND METHODS**

# **AN INNOVATIVE EXPERIMENTAL PLATFORM**

As there are no preceding neuroimaging studies of this kind, making these studies possible required the development of a unique conceptual and experimental platform integrating a number of innovations: (1) the Cognitive-Kinesthetic Method to effectively train people to draw without vision, (2) the first multisensory MRI-Compatible Drawing Tablet (for both tactile and visual drawing), incorporating a motion-capture system, (3) the first Method for estimating Topographic Maps in the Blind, and (4) the implementation of standard probabilistic maps in blind individuals. This platform opens up a whole dimension of multimodal sensorimotor processing to neuroimaging studies.

# **SUBJECT AND TRAINING**

The congenitally blind subject CB4 was a 61-year-old right-handed female, totally blind with no light perception, whose blindness resulted from German measles (rubella) contracted by her mother during pregnancy, which severely and permanently damaged the fetal optic nerves. The subject gave informed consent for the experimental protocol approved by the local research ethics committee, Institutional Review Board.

CB4 had not been previously studied by fMRI or behavioral methods of any kind. She is intellectually sophisticated and a fluent Braille reader, highly educated and with lifetime employment, and was highly motivated to participate in the study. Nevertheless, despite her Braille fluency and longstanding familiarity with complex tactile manipulations, she had no experience with writing or drawing. Consequently, her training to draw had to start with the basics, such as the proper holding of the pen and key spatial concepts of the representation of 3D structure on a 2D plane. CB4 had relied heavily on active tactile exploration for her whole life, so it was quite surprising that she did not have a clear idea of elementary geometric concepts such as a straight line vs. a curve, right angles, etc., and was unable to reproduce any simple component through drawing. These issues were manifested at all levels of the experimental process: the tactile recognition and memorization phase, the memory recall in the drawing phase, the understanding of spatial relationships, and even the kinesthetic feedback and self-evaluation of her own performance. For example, she could think she had just drawn a straight line when she had actually drawn an almost closed curve, and so on.

It became clear, however, that these "negatives" could be turned into significant "positives" that would for the first time allow tracking of the full evolution of the neural process of learning to draw. Another advantage was the fact that CB4 was an intelligent adult, able both to readily follow instructions and to express back her introspections.

Interestingly, it seems typical for blind people to expect that they would be unable to perform a task such as drawing without guidance from the non-drawing hand. Even the exceptional blind artist Armagan, despite his many decades of blind drawing, still used a technique that involved "holding the pencil in his right hand to draw, while following the created indentations with his left hand" (Amedi et al., 2008). The same study reported that he "cannot complete his drawings if he is not allowed to use his left hand to follow the indentations created by the drawing," meaning that he relied on tactile *perception* from the left hand to provide the configural feedback.

In contrast, the unique technique by which we trained CB4 taught her to draw without using any tactile feedback from the non-drawing (left) hand, thus focusing the training on the development of an effective *memory representation* to guide the drawing trajectory.

The training was performed for 1–1.5 h per day for five days during the week following the initial fMRI session. Our novel drawing method was able to inspire and to motivate CB4 to acquire the exciting drawing skill. Remarkably, after only a week of training, she advanced significantly relative to her starting level, although her capability was still not satisfactory to her. Two months later she came back for two "refresher" training sessions which she felt brought her up to an adequate skill level. To study the dynamics of the learning process, we ran fMRI before training, as well as after the prolonged period of consolidation and a refresher training session.

#### **EXPERIMENTAL DESIGN**

We used a three-task block paradigm, with interleaved baseline conditions (**Figure 2**). A battery of raised-line models of faces and objects was developed as the drawing targets (**Figure 3**). The three tasks were as follows: *Explore/Memorize* (*E/M*), perceptual exploration and memorization of the model to be drawn; *MemoryDraw* (*MD*), a memory-guided non-visual drawing task; and *Scribble* (*S*), a motor-control and negative memory-control task. Each task duration was 20 s, with a 20 s baseline condition *("RestInterval," RI)* intervening between the tasks, during which the subject rested motionless and was instructed to clear any image from mind. The start of each task or rest interval was prompted by an auditory cue. The whole three-task sequence with interleaved rest intervals (*RI, E/M, RI, MD, RI, S*) was repeated 12 times in each fMRI session.

**FIGURE 2 | Experimental design.** Drawing was investigated in a three-phase paradigm consisting of a memory-guided drawing task, abbreviated as "*MemoryDraw" (MD)*, plus two control tasks: a motor and "*negative*" memory control task "*Scribble" (S)*, and a task of perceptual exploration and memorization of the model to be drawn "*Explore/Memorize" (E/M)*. Each task duration was 20 s, with 20 s rest intervals elapsing between the tasks, with the whole trial sequence being repeated 12 times in each scanning session.

**FIGURE 3 | Raised-line drawing models.** Realistic faces and objects explored by the subject using her left hand in the *E/M* task were drawn from memory in the *MD* task after a 20 s rest interval. Two repetitions of each of the six stimuli were run in each fMRI session for a total of 12 runs per session.

One of the advanced aspects of the experimental design was that the models were always explored with the left hand but drawn by the right hand, thus requiring the subjects to develop a clear mental representation in order to transfer the information to the opposite (drawing) hand. This design ensures that in the *MemoryDraw* task the right (drawing) hand does not have any "haptic knowledge" of the image. Moreover, the fact that the left hand was not allowed to follow the contour drawn by the right hand ensures that the subject learns to draw without relying on any tactile configural feedback. Together, these design features enforce the encoding of a robust memory representation needed to guide the drawing trajectory.

In *Explore/Memorize,* using the left hand only, the subject had to tactually explore a raised-line drawing model on the left slot of the drawing tablet, and to develop a full memory representation of the image in preparation for the *MemoryDraw* task. Then the model image was removed, and the subject rested motionless for 20 s with no image in mind (*RestInterval*). In the following *MemoryDraw* phase, the fiber-optic stylus was used to draw the image (from tactile memory) on the right slot of the tablet with the right hand. *Scribble* was a control for both the generic hand movement and memory involvement; the subject had to move the stylus with the right hand in a random trajectory over the right slot of the tablet, with an extent and rate similar to those of the drawing movements, but under instructions not to plan or imagine any particular trajectory form, avoiding any cognitive content.
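The block timing described above can be laid out explicitly. The following is a minimal sketch, not the authors' stimulus code: it assumes that one pass through the sequence (*RI, E/M, RI, MD, RI, S*) is scanned continuously at the 2 s TR reported in the acquisition section, and the names `Block`, `build_schedule`, and `volumes_for` are hypothetical.

```python
from dataclasses import dataclass

TR_S = 2.0       # repetition time reported in the acquisition section
BLOCK_S = 20.0   # duration of every task block and every rest interval
SEQUENCE = ["RI", "E/M", "RI", "MD", "RI", "S"]  # order given in the text


@dataclass
class Block:
    label: str
    onset_s: float
    duration_s: float


def build_schedule(sequence=SEQUENCE, block_s=BLOCK_S):
    """Return the blocks for one pass through the three-task sequence."""
    return [Block(label, i * block_s, block_s) for i, label in enumerate(sequence)]


def volumes_for(block, tr_s=TR_S):
    """Indices of the fMRI volumes whose acquisition starts inside the block."""
    first = int(round(block.onset_s / tr_s))
    count = int(round(block.duration_s / tr_s))
    return list(range(first, first + count))


schedule = build_schedule()
total_s = schedule[-1].onset_s + schedule[-1].duration_s  # length of one sequence
md_block = next(b for b in schedule if b.label == "MD")
md_volumes = volumes_for(md_block)
```

Under these assumptions one sequence lasts 120 s (60 volumes), and the *MD* block occupies the ten volumes starting 60 s after sequence onset.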

# **TACTILE STIMULUS PRESENTATION AND HAND MOVEMENT CONTROL**

# *Custom-built, multisensory MRI-compatible drawing system*

Running drawing studies in the scanner is not a conventional protocol and posed many unresolved technological problems. We developed a special-purpose drawing system that: (1) is MRI-compatible, (2) is ergonomically adaptable, (3) allows multiple tactile images to be presented in the scanner, (4) captures and records the drawing trajectory with high precision, and (5) provides real-time visual feedback when drawing in the sighted is studied. This system (**Figure 4**) incorporates a dual-slot drawing tablet that is height/distance adjustable and an adapted version of a fiber-optic device for motion-capture of the drawing movements. To our knowledge, this is the first multisensory drawing system to support the fMRI investigation of *tactilely*-guided drawing by providing for the presentation of multiple tactile raised-line images in the scanner without the need of any operator. It also allows us to record relevant behavioral and feedback events and to correlate them with the brain activation for full off-line analysis.

**FIGURE 4 | A subject on the scanner bed operating our novel multimodal MRI-compatible drawing device.** The plexiglass gantry supports a drawing tablet while a fiber-optic drawing stylus captures and records the drawing movements with high precision. The motion capture information synchronized with the fMRI allows the effect of behavioral events to be analyzed.

# *Auditory cue presentation*

The auditory stimuli were presented through Resonance Technologies Serene Sound earphones (Resonance Technologies, Salem, MA). To reduce scanner noise, this equipment employs external ear protectors with perforated ear plugs that conduct the auditory cues directly into the auditory passage while blocking much of the scanner noise.

# **MRI DATA COLLECTION, ANALYSIS, AND VISUALIZATION**

# *fMRI acquisition*

MR data were collected on a Siemens Trio 3T scanner equipped with 8-channel EXCITE capability, a visual stimulus presentation system, and response buttons. A high-resolution anatomical (T1-weighted) volume scan of the entire brain was obtained for each observer (voxel size = 0.8 × 0.8 × 0.8 mm). The fMRI blood-oxygenation-level-dependent (BOLD) responses were collected with EPI acquisition from the whole head coil. There were 34 axial slices at 2 s TR, with TE of 28 ms and flip angle of 80°, providing 3.0 × 3.0 × 3.5 mm voxels throughout the whole brain. The functional activations were processed for slice-time correction and motion correction. The two-phase motion correction consisted of a within-scan correction and a between-scan correction of each scan to the reference scan, both of which used mrVista (Stanford Vision and Imaging Science and Technology) to correct for six parameters of rigid-body motion. An anatomical segmentation algorithm (mrGray) was applied to the T1 scan, ensuring localization of the signal within the cortical gray matter close to the activated neurons and greatly reducing the blood drain artifacts that afflict studies in which cortical segmentation is not used. The activation was specified in terms of the statistical significance (*p* < 0.05) of the signal in each voxel (after Bonferroni correction for the number of gray matter voxels).
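To make the voxel-wise criterion concrete, the sketch below shows how a Bonferroni correction turns a family-wise level of 0.05 into a per-voxel threshold. It is an illustration only: the gray-matter voxel count is a hypothetical placeholder (the paper does not report it), and the conversion to a two-sided *z* criterion is an assumption about how the threshold might be displayed.

```python
from statistics import NormalDist


def bonferroni_threshold(alpha, n_tests):
    """Per-test p threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def two_sided_z(p):
    """z value whose two-sided tail probability equals p (standard normal)."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)


# Hypothetical number of gray-matter voxels; not taken from the paper.
n_gray_voxels = 40_000

p_voxel = bonferroni_threshold(0.05, n_gray_voxels)
z_crit = two_sided_z(p_voxel)
print(f"per-voxel p threshold: {p_voxel:.3g}, two-sided z criterion: {z_crit:.2f}")
```

Because the threshold scales inversely with the number of tests, restricting the analysis to segmented gray matter (rather than the whole volume) also relaxes the per-voxel criterion.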

# *Pre-processing*

The raw DICOM-format data from each fMRI scan were converted to a 4D NIFTI file. Using FSL tools, we ran within-scan and between-scan motion corrections, bringing all functional data into alignment with the fMRI volume acquired closest in time to the T1-weighted "inplane" anatomy. Then we averaged across scans, resulting in a single 4D NIFTI file for that scan session.

# *fMRI time course analyses*

The data were analyzed to estimate the effective neural activation amplitudes (for each task across the 12 repeats of the 3-task sequence in a one-hour scan) by the following procedure. A General Linear Model (GLM) consisting of a (3 + 1)-parameter boxcar neural activation model convolved with an estimated hemodynamic response function (HRF) was fitted to the BOLD responses for each 3-task sequence, combined with a 1-parameter boxcar corresponding to the 8 auditory cue presentations and an additive 4th-order polynomial to capture low-frequency drift in the BOLD signal. Thus, the parameters of the activation model consisted of the boxcar activation amplitudes for the three task periods, combined with the amplitudes of the auditory signals. (The HRF parameters were determined once per session by optimizing this model to a subset of gray matter voxels identified as most responsive to the task/rest alternation frequency in this experiment.)
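The structure of such a GLM can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses a generic difference-of-gammas HRF instead of the session-specific HRF estimated in the paper, places one auditory cue at the start of each 20 s block (the paper's exact cue timing is not reproduced), and fits a single noise-free synthetic time course. All function names are hypothetical.

```python
import math
import numpy as np

TR = 2.0          # s, from the acquisition section
N_VOLS = 60       # one 120 s pass through RI, E/M, RI, MD, RI, S
BLOCK_VOLS = 10   # 20 s blocks at TR = 2 s


def canonical_hrf(tr=TR, length_s=32.0):
    """Generic difference-of-gammas HRF (an assumption, not the paper's HRF)."""
    t = np.arange(0.0, length_s, tr)
    peak = t ** 5 * np.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)
    h = peak - undershoot / 6.0
    return h / h.sum()


def boxcar(block_index):
    """Unit boxcar covering one block of the sequence."""
    b = np.zeros(N_VOLS)
    b[block_index * BLOCK_VOLS:(block_index + 1) * BLOCK_VOLS] = 1.0
    return b


def design_matrix(cue_volumes):
    """Columns: E/M, MD, S, auditory cue, then polynomial drift of order 0..4."""
    hrf = canonical_hrf()
    cols = [np.convolve(boxcar(i), hrf)[:N_VOLS] for i in (1, 3, 5)]
    cue = np.zeros(N_VOLS)
    cue[cue_volumes] = 1.0
    cols.append(np.convolve(cue, hrf)[:N_VOLS])
    x = np.linspace(-1.0, 1.0, N_VOLS)
    cols.extend(x ** order for order in range(5))
    return np.column_stack(cols)


def fit_glm(bold, X):
    """Ordinary least-squares estimate of the regression weights."""
    beta, *_ = np.linalg.lstsq(X, bold, rcond=None)
    return beta


# Synthetic check: build a signal from known weights and recover them.
X = design_matrix(cue_volumes=[0, 10, 20, 30, 40, 50])
true_beta = np.array([0.5, 2.0, 0.2, 0.3, 100.0, 1.0, 0.0, 0.0, 0.0])
bold = X @ true_beta
beta_hat = fit_glm(bold, X)
```

The first three entries of `beta_hat` play the role of the task activation amplitudes for *E/M*, *MD*, and *S*; the fourth is the auditory-cue amplitude, and the remaining five absorb baseline and slow drift.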

# *Voxel-wise parametric maps*

For each task—*E/M*, *MD*, and *S*—statistical parametric maps were generated, based on the estimated activation amplitudes from the above GLM in each voxel that exceeded the noise threshold defined by the variability across the 12 repeats of the 3-task sequence in each one-hour scan. Also, voxel-wise maps of the change in activation following the training period were generated, scaled in terms of *z*-score of the pre-post difference signals.

# *ROI activation analysis*

The effective neural activation amplitudes (bar graphs) for each condition in each region of interest (ROI) were estimated by the same GLM procedure but now applied to the *average* signal across all voxels within the ROI. This procedure also provided high-quality time courses for evaluation of the response dynamics and its comparison across tasks and stages of training.

The confidence intervals were defined by the amplitude variability across the 12 repeats of the 3-task sequence in each one-hour scan. The *dashed lines* and the *error bars* represent confidence intervals for *two different forms* of statistical comparison of the activation levels (i.e., of the beta weights for the event types in the GLM): (1) The *dashed lines* represent the 99% "*zero" confidence interval* (*p* < 0.01, uncorrected) within which the activation amplitudes are not significantly different from zero (i.e., relative to the noise variance for no stimulus-related activation defined as the residual variance after the GLM model fit of the *fMRI time course analyses* section described above); thus this statistical criterion is designed to indicate the significance of each individual activation (at *p* < 0.05, corrected for multiple applications within each figure); (2) The *error bars* are "*difference" confidence intervals* designed to illustrate the *t*-test for the significance of differences *between* activation levels in each figure (i.e., the differences are not significant unless they exceed the confidence intervals for both compared activations), again at *p* < 0.05 (corrected for multiple applications).

In the text, all ROI-comparisons are specified as significant by the *t*-test using a statistical criterion threshold of *p* < 0.05 corrected for multiple comparisons.
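The two ingredients of the ROI analysis, averaging across voxels and summarizing variability across the 12 repeats, can be sketched as below. This is an illustration under assumptions rather than the authors' procedure: the per-repeat amplitudes are synthetic, the helper names are hypothetical, and the critical value 2.201 is the standard two-sided 95% *t* value for 11 degrees of freedom, used uncorrected here, whereas the paper applies corrected criteria.

```python
import numpy as np


def roi_average(voxel_timecourses, roi_mask):
    """Average time course over the voxels selected by a boolean ROI mask.

    voxel_timecourses has shape (n_voxels, n_volumes).
    """
    return voxel_timecourses[roi_mask].mean(axis=0)


def summarize_repeats(amplitudes):
    """Mean and standard error of per-repeat amplitude estimates."""
    a = np.asarray(amplitudes, dtype=float)
    return a.mean(), a.std(ddof=1) / np.sqrt(a.size)


def confidence_interval(mean, sem, t_crit):
    """Symmetric interval mean +/- t_crit * sem."""
    return mean - t_crit * sem, mean + t_crit * sem


# Synthetic per-repeat amplitudes for one task in one ROI (12 repeats).
rng = np.random.default_rng(0)
amplitudes = 1.0 + 0.3 * rng.standard_normal(12)

mean, sem = summarize_repeats(amplitudes)
T_CRIT_95_DF11 = 2.201  # two-sided 95%, 11 degrees of freedom, uncorrected
low, high = confidence_interval(mean, sem, T_CRIT_95_DF11)
```

An amplitude would be read as different from zero at this illustrative level when the interval `(low, high)` excludes zero.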

# *Topographic maps in the blind*

On the one hand, no informed analysis of the visual cortex could be done without knowledge of its retinotopic and functional organization; on the other hand, no retinotopic mapping or visual localizers are possible in the blind, so it was a challenge to localize any specific visual area. To resolve this issue and determine the borders of area V1 in blind participants, we took a three-pronged approach. First, we used the Freesurfer probability map atlas (see http://surfer.nmr.mgh.harvard.edu/fswiki/BrodmannAreaMaps) to transform the primary visual area map back to the blind subject's brain through the Freesurfer spherical surface registration procedure. To verify the process, we first ran this procedure in the brains of sighted subjects, for which we already had individual retinotopic maps; the borders of the retinotopically defined V1 aligned fairly accurately with those from the Freesurfer. Second, we verified the location of the V1 ROI by intersecting with its anatomical marker (the calcarine sulcus). And third, we used an innovative 14-step procedure (Likova, 2010a,b, 2012) that allows us to warp the brains of sighted and blind subjects to the same MNI brain. This innovative three-way comparison enabled us to estimate the corresponding topographic regions in the blind brain. All methods converged very well to the definition of the V1 ROI.

# **RESULTS**

# **ENHANCED V1 ACTIVATION AFTER TRAINING TO DRAW FROM TACTILE MEMORY**

The focus of this analysis is the occipital region along the calcarine sulcus corresponding to the location of area V1. The V1 ROI was determined as explained in the Materials and Methods. **Figure 5** shows a difference map for the *MemoryDraw* task, which represents voxel-wise comparison of the *post-*training BOLD activation relative to the *pre-*training level, projected on inflated representations of the medial views of the two hemispheres. It reveals strong *post-training* enhancement of the V1 activation (orange-yellow coloration within the green outlines) in both the left (LH) and the right (RH) hemispheres.

# **SUBJECT REPORT**

# *Pre-training subject self-report*

Prior to training, subject CB4 reported a complete inability to comprehend the objects depicted by the raised-line models. All three tasks, even scribbling with a pen, were extremely challenging for her, and her performance was correspondingly poor. Although the familiarization session was sufficient to orient the subject to the experimental tasks and equipment, it did not advance her state beyond that of a total novice. Thus, a rudimentary functional organization was expected at this stage. In the first training session (after familiarization and the pre-training fMRI), CB4 spent an average of 183 ± 43 s in completing each drawing.

**FIGURE 5 | Primary visual cortex shows the predominant learning effect in the** *MD* **task.** A voxel-wise comparison, projected on inflated representations of the posterior left (LH) and right (RH) hemispheres, shows the increase (orange-yellow coloration) of the *post*-training BOLD activation in *MD* relative to the *pre*-training level. Dark gray, sulci; light gray, gyri.

# *Post-training subject self-report*

The post-training fMRI session was run eight weeks after the subject went through a week of 1–1.5 h/day training. Two "refresher" training sessions were also conducted in the week of the fMRI session. In the final training session, she spent an average of 23 ± 3 s to complete each drawing, which is a highly significant improvement over the first training session (*p* < 0.0004; *t* = 4.32; df = 19). The subject reported being able to recognize the raised-line drawings and generate a clear memory representation within the 20 s interval of the *E/M* task; and 20 s later in the *MD* task to recall the template from memory, to mentally "dissociate" it from the location where it was explored (the left slot of the tablet), to "project" it to the right-slot and to "trace" it there with the drawing stylus, as instructed during training. This report implies that robust *memory representations*, which are a prerequisite for guiding the complicated drawing movements, were successfully developed during the training period. Moreover, the dissociation from the initial location reflects successful learning of coordinate-transformation, which is known to be an important component of drawing, typically affected in some neurological conditions, such as constructional apraxia (e.g., Makuuchi et al., 2003; Ferber et al., 2007; Ogawa and Inui, 2009).

# **COMPARATIVE PRE/POST-TRAINING ANALYSIS**

Comparison of the pre-training to post-training BOLD responses shows a dramatic enhancement from negligible activation in V1 before training (**Figure 6A**), to a massive task-specific activation as a result of training (**Figure 6B**).

# **CROSS-TASK COMPARISON OF V1 ACTIVATION**

All cross-task ROI-comparisons in the text are specified as significant by the *t*-test using a statistical criterion threshold of *p* < 0.05 corrected for multiple comparisons. The *dashed lines* and the *error bars* represent confidence intervals for two *different forms* of statistical comparison of the activation levels (see "ROI activation analysis" section in "Materials and Methods" for more detail).

**FIGURE 6 | V1 activation in** *MD* **before training (A) and after training (B).** BOLD activation (orange-yellowish coloration) from the *MD* task, derived according to the GLM described in Materials and Methods, and projected on inflated representations of the posterior left (LH) and right (RH) hemispheres is shown for both the pre-training **(A)** and the post-training **(B)** fMRI sessions. Medial views of the posterior part of the brain optimally visualize area V1 (green outlines) along the calcarine sulcus. Scale bars show the color-coding for the *z*-score levels of the activation. Comparison of the *pre-* to *post*-training responses shows a dramatic enhancement from negligible activation in V1 before training **(A)**, to a massive task-specific activation as a result of training **(B)**. Note that, interestingly, the extension of the post-training activation approximately corresponds to the spatial extent of the images (∼10° diameter).

# *Pre-training: lack of task-specificity*

Bar-graphs for the estimated activation in the V1 ROI in each hemisphere (**Figure 7A**) indicate a lack of task-specificity (not significantly different activation levels, at *p* < 0.5, corrected) for the *MD* and both control tasks in the left hemisphere, with similar (NS at *p* > 0.5) activation for *S* in the right hemisphere, but noisy signals for *E/M* and *MD* in the right hemisphere.

# *Post-training: memory task dominance*

Cross-task comparisons of the V1 response for *MD* (red bars) to those for *E/M* (blue) and *S* (green) in the left and right hemispheres after training are shown in **Figure 7B**. Note that, after training, the *MD* response dominates in both left and right V1. As indicated by the confidence intervals, the following relationships are statistically significant (*p* < 0.05, corrected): *MD* > *E/M* and *MD* > *S* in both hemispheres, and *MD* > *E/M* > *S* in the left hemisphere. Thus, *MD* was the task that most strongly activated V1 bilaterally, showing highly significant % BOLD responses at low noise; the *E/M* task gave significantly weaker, left-dominant responses; and the response to the motor-control scribbling task, *S* (which lacks any memory component), was even suppressed in the left hemisphere.

# *Pre/post comparison*

Comparison of the *post-training* response pattern to that *before training* (**Figures 7A,B**) shows the following statistically significant (*p* < 0.05, corrected) relationships: *E/Mpost* ∼= *E/Mpre*, *MDpost* > *MDpre* and *Spost* < *Spre* in both the left and the right hemispheres. This analysis implies a significant change in the V1 response pattern as a function of training. In particular, the V1 response in the *memory*-guided drawing task *MD* was substantially increased, while that of the *non-memory* motor control task *S* was reduced effectively to zero.

# **COMPARISON OF THE TIME-COURSE OF THE BOLD RESPONSE IN V1**

# *Before training: immature BOLD response waveforms in V1 (Figure 8A)*

Analysis of the time course of the BOLD responses underlying the estimated average response amplitude reveals deeper aspects of the neural processing at this initial stage of functional changes. As seen in **Figure 8A**, the average time courses for the sequence of the three task intervals (white bars) show substantial deviations of their waveforms (black lines) from the model prediction fits (color lines). The model takes into account both the task duration and the estimated HRF (see Materials and Methods). The pre-training response waveforms are rudimentary, poorly developed and noisy, with a prominent *transient* nature and *early offsets* long before the end of the 20 s task periods, in spite of the continuous hand movements during the full task period (as evident from the fully-fledged time course in the motor hand area, **Figure 9A**, and from the motion-capture records as well). These early offsets imply that the V1 neural response was essentially a brief transient pulse, suggesting an unsuccessful attempt to activate this area, which was immediately withdrawn. Such undeveloped utilization of V1 is consistent with the subject self-report and drawing performance.
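The contrast between a sustained model prediction and a brief transient pulse can be made concrete with a minimal sketch. It assumes a generic double-gamma HRF (a commonly used canonical form) rather than the HRF estimated in this study, a 1 s sampling step, and a hypothetical 3 s neural pulse for the transient case; all function names and parameter values are illustrative assumptions.

```python
import math

def canonical_hrf(t, peak_shape=6.0, undershoot_shape=16.0, undershoot_ratio=1 / 6):
    """Generic double-gamma HRF at time t (seconds); zero for t <= 0."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (undershoot_shape - 1) * math.exp(-t) / math.gamma(undershoot_shape)
    return peak - undershoot_ratio * under

def boxcar(onset, duration, n_samples, dt):
    """Neural activity model: 1 during [onset, onset + duration), else 0."""
    return [1.0 if onset <= i * dt < onset + duration else 0.0
            for i in range(n_samples)]

def convolve(signal, kernel, dt):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(min(i + 1, len(kernel))):
            acc += signal[i - j] * kernel[j]
        out.append(acc * dt)
    return out

def time_above_half_max(series, dt):
    """Total time (s) the predicted response stays at or above half its peak."""
    half = max(series) / 2
    return sum(dt for v in series if v >= half)

DT = 1.0   # assumed sampling step in seconds
N = 60     # samples covering rest, one task interval, and the following rest
hrf = [canonical_hrf(i * DT) for i in range(32)]

# Sustained activity over a full 20 s task period vs. a hypothetical 3 s pulse.
sustained = convolve(boxcar(10.0, 20.0, N, DT), hrf, DT)
transient = convolve(boxcar(10.0, 3.0, N, DT), hrf, DT)

print("sustained width (s):", time_above_half_max(sustained, DT))
print("transient width (s):", time_above_half_max(transient, DT))
```

Under these assumptions the sustained prediction stays elevated for roughly the task duration, whereas the pulse prediction returns toward baseline well before the task ends, which is the qualitative signature of an early offset.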

# *After training: well-developed BOLD response waveform in V1 (Figure 8B)*

Notably, as a result of training, the V1 temporal waveforms became fully developed, i.e., a good match to the model prediction based on the sustained drawing activity throughout each task period (**Figure 8B**). We no longer see the transient early-offset signals. V1 responded very differently to the two types of drawing: while *MD* generated the strongest signal bilaterally, the non-memory drawing *S* elicited no significant response.

# **MOTOR CORTEX AS A CONTROL: MATURE SIGNALS, CLEAR TASK-SPECIFICITY**

# *Before training: right hand specific, well-fitted BOLD response for both memory and non-memory drawing*

To verify that the rudimentary signals in V1 before training were not a general property of this brain, we also investigated the BOLD waveforms in non-deprived areas such as the hand area in the left motor cortex, which is well known to control right-hand movements. In contrast to V1, this area showed the expected functional specialization even before training: only the two right-hand tasks (*MD* and *S*, red and green bars, respectively) elicited activation, while the left-hand task (*E*/*M*, blue) did not (**Figure 9A**).

Furthermore, the signal waveforms were well-developed (black lines), conforming to the prediction of the neural activation model (color lines). These results thus verify that, despite the transient nature of the signals in V1, the brain of this congenitally blind subject was able to generate normal responses in other areas before training.

# *After training: BOLD response characteristics similar to the pre-training session*

Comparative analysis in the motor area shows that, as in the pre-training session (**Figure 9A**), the post-training signal waveforms were fully-developed and fitted by the model, with equally strong responses to *MD* and *S* (**Figure 9B**). These results indicate that the hand motor area does not discriminate between the memory and non-memory tasks, but treats them as two similar right-hand tasks, both before and after training. In contrast, after training V1 responded strongly to the memory task but not to the non-memory task.

**FIGURE 8 | Response waveform analysis in V1.** The average time courses of BOLD activity (black lines) are shown for the sequence of the three task intervals (white bars); the four dark-gray bars indicate the 20 s rest intervals separating *E*/*M*, *MD*, and *S* tasks. Immature and non-specific transient "bursts" before the CK-training **(A)**, were transformed after training **(B)** into well-developed waveforms for the *memory*-drawing *MD*, in contrast to the loss of any significant response for the *non-memory* drawing *S*.

# **DRAWING RESULTS**

Before training, the drawings of the congenitally blind subject were unrecognizable scrawls (**Figure 10**, middle panels), consistent with her self-report of an inability to comprehend the object depicted by the raised-line drawings. The training, however, was effective in developing CB4's capability to produce well recognizable drawings under non-visual memory guidance (**Figure 10**, right panels), consistent with her reporting of now being able to recognize and recall clearly detailed spatial representations.

**FIGURE 10 | Representative examples of pre- vs. post-training drawings.** The left panels show the respective raised-line models to be drawn. Note the significant advance from practically unrecognizable drawings before training (middle panels) to well-recognizable drawings achieved by this totally blind subject as a result of the Cognitive-Kinesthetic training (right panels).

# **DISCUSSION**

Training to draw in an absolute novice, who has been blind from birth, allowed us to investigate for the first time the evolution of the *temporal dynamics* of the functional reorganization in area V1, and the effect of *learning* to draw from tactile memory. Comparative pre/post-training analysis of the V1 BOLD response waveforms revealed their remarkable change from being transient and task non-specific (**Figure 8A**), to becoming full-fledged and task-specific (**Figure 8B**), and extending to a particular eccentricity in the cortical map (**Figure 6B**).

Except for her blindness and reduced hearing, this congenitally blind subject was in excellent physical and mental health, so there were no reasons to expect abnormal BOLD responses. However, it became evident that a reliance on tactile perception in her everyday life was insufficient to lead to development of the specific V1 functionality demanded by the *MD* task, which showed undeveloped waveforms before training. Conversely, the well-developed response patterns in the hand motor area both before and after training (**Figures 9A,B**) confirmed that the abnormalities in V1 were a signature of its lack of relevant functional specialization before training, not an idiosyncratic subject response characteristic. It is important to stress that the use of the *learning* paradigm as an empirical intervention allowed us to go beyond mere task/activation correlation to the causal inference that the changes in V1 were a result of the training in the tactile-memory drawing task. Moreover, the concurrent measures of the objective memory readout (the drawings recorded by our motion-capture system) and the pre/post subject self-reports converged with fMRI evidence for the causal efficacy of the drawing training.

# **MAIN PRINCIPLES**

This study is based on several basic principles. First, as emphasized by the capability of blind drawing, the "space" domain is not represented solely by vision: although the visual system is the modality best suited to process spatial information, it is not the only one; if deprived of visual input, the brain is capable of employing the "free" visual processing resources in the most relevant way. Second, learning an unusual, demanding task (such as drawing in blind adults) is a fruitful paradigm for studying brain reorganization and its developmental stages. Third, to provide for elaborated training in active spatial cognition, the training task has to encompass a large range of perception-to-action components (such as drawing). Finally, an effective way to force the learners to develop robust memory representations is to have a task that demands the explicit re-expression of these representations through the active motor loop (i.e., to "communicate back" these representations through drawing). Thus, the nonvisual drawing incorporating all these principles is a powerful experimental and memory training paradigm.

# **COGNITIVE-KINESTHETIC TRAINING METHOD AND BLINDNESS**

Typically, studies in blind individuals are not done in a training paradigm, but rely on spontaneous experience-based plasticity and simply compare their current state with that of sighted individuals. The development of the Cognitive-Kinesthetic training method for non-visual drawing emphasizes a different approach. By implementing the above principles, this method presents a powerful research tool and, potentially, a rehabilitation tool. By effectively teaching blind adults on a short time-scale, it allows the natural dissociation of cross-modal processing subsystems. Furthermore, it opens a window to observing the evolution of reorganization at both the neural and behavioral levels, and has the advantage of working with adults who can provide clear introspection and produce complex behavioral measures (as opposed to working with infants, with whom communication is difficult).

Even the pre-training results are remarkable in capturing a very early developmental stage, one that is usually difficult to observe. The non-specific and immature pre-training responses seem to reflect the initial stage of functional "search," when the brain is still "probing" for the best resources before reaching the needed functional capability.

The self-reports of the congenitally blind subject *before* and *after* training were consistent with the changes observed in the V1 BOLD response (i.e., **Figures 7A** and **8A**, before training; and **Figures 7B** and **8B**, after training). Thus, the Cognitive-Kinesthetic training helped to maximize the ability for spatial reasoning and detailed memory representations with clear understanding of how the 2D tactile images being drawn relate to the depicted 3D objects.

# **DRAWING AS A MEMORY PARADIGM**

The particular innovation of the neuroimaging experiments was to incorporate the drawing-based memory paradigm in the context of the Cognitive-Kinesthetic training method, together with the technological advances of the multisensory MRI-compatible system. This novel memory paradigm has the unique advantage of providing an *explicit memory* "readout" of the specific mental representation that guides it. Importantly, as in the case of the blindfolded study (Likova, 2010a, 2012), the Cognitive-Kinesthetic training enabled CB4 to draw from memory the *specific memorized* objects and faces that she had explored, not just some longstanding "clichés," thus showing that the particular *memory-representations* generated during the *tactile exploration* phase were guiding her drawing activity.

# **RELATION TO PREVIOUS STUDIES**

In general, our results are consistent with previous reports in showing that brain areas traditionally considered purely visual, such as the primary visual area V1, can be activated in a cross-modal manner. Braille reading, naming, auditory localization, tactile discrimination, and other non-visual perception tasks can lead in the blind to reorganization and recruitment of visual cortex in a compensatory manner (e.g., Uhl et al., 1991, 1993; Sadato et al., 1996; Cohen et al., 1997; deVolder et al., 1997; Buechel et al., 1998; Hamilton et al., 2000; Burton et al., 2002; Amedi et al., 2003, 2004, 2008; Gizewski et al., 2003; Theoret et al., 2004; Merabet et al., 2005; Pascual-Leone et al., 2005; Voss et al., 2006; Goyal et al., 2006; Borowsky et al., 2007; Ptito et al., 2008).

The present study extends these results to *tactile memory* and the effect of *training*. We asked if area V1 is involved in tactile working memory in a congenitally blind individual who was a total novice in a highly demanding *tactile-memory* task. The rapid recruitment of adult primary "visual" cortex as a result of short-term Cognitive-Kinesthetic training implies the operation of a much faster form of learning-based plasticity than that exhibited by the many decades of self-training in the blind artist Armagan (Amedi et al., 2008).

As the post-training activation in V1 in the memory drawing task was generated without either visual or tactile *sensory* stimulation, this paradigm excludes explanations based not only on visual bottom-up input, but also on direct signals between primary sensory cortices.

# **INTERPRETATION OF V1 AS A MEMORY BUFFER**

To put the present results in specific perspective, we needed concepts relevant to the mechanisms involved in the memory encoding and retrieval for the kinds of spatial structures used in the drawing task. The spatial memory buffer construct logically provides one likely framework. The *MemoryDraw* task is an active task that demands not only a physical sketchpad, but an internal "sketchpad" on which to "project," hold and manipulate the memory representation, so as to be able to use it to guide the trajectory of the complicated drawing movements. These demands closely resemble (although in a non-visual form) the description of the *visuo-spatial memory buffer*, termed the "visuo-spatial sketchpad." In the classic model of working memory (as proposed by Baddeley and Hitch, 1974; Baddeley, 1986, 2000, 2003), the *visuo-spatial sketchpad* is one of four major components, the one that instantiates the function of developing and holding in working memory an accurate spatial representation of the retrieved object to guide action, providing a "*sketch*" that can be further spatially manipulated by the central executive (**Figure 11**). The improvement of drawing performance we observed would not have been possible without recourse to a working memory representation of this kind.

An intriguing issue, of course, is where in the brain such a representational buffer may be located. Previous theoretical and neurophysiological studies have proposed that area V1 is the cortical region best suited to perform such a memory-buffer function (e.g., Mumford, 1996; Lee et al., 1998; Super et al., 2001a,b; Lee and Mumford, 2003; Super, 2003). This area has the special status of being the largest topographic map in the brain, with the highest spatial resolution and parallel processing of the information of the whole map; thus it has been suggested that "instead of being the first stage in a feedforward pipeline, V1 is better described as the unique high-resolution buffer in the visual system" (Lee and Mumford, 2003).

The current study provides a causal manipulation that links the memory enhancement to the increased activation and specialization in V1, consistent with the memory-buffer interpretation. Moreover, an important twist for this interpretation is the lack of any visual stimulation in the congenitally blind case (as it was in the blindfolded study), implying that the buffer is independent of the input modality. Our re-conceptualization of the *visuo-spatial* sketchpad as being *amodal-spatial* is depicted by the yellowish block in **Figure 11**. Indeed, the original motivation for the study was our view that, although vision provides the best spatial representation, space itself is not "owned" by vision, but is inherently an amodal domain. Thus, it is adaptively effective for the brain to ensure the modality-independence of spatial representation.

# **ALTERNATIVE INTERPRETATIONS**

An alternative interpretation of the V1 activation to be evaluated in this congenitally blind case is its possible role in visual imagery. The interrelationships between working memory and imagery are still unclear and challenging. Although each of these major cognitive constructs is treated in various ways across studies, often without any attempt at formal definition, they all accept that both imagery and working memory involve a type of internal representation available to our awareness. In working memory, however, there is a further emphasis on goal-oriented, active maintenance, and use of this conscious representation to guide voluntary action; for this purpose, the multicomponent working memory models incorporate representational buffers, such as the visuo-spatial sketchpad discussed above, plus central executive functions (**Figure 11**).

In general, any form of retrievable and robust spatial memory "sketch," including one involving visual imagery, might in principle provide a mechanism for guiding the drawing movements. On the other hand, it has to be taken into account that the representational format and even the very nature of imagery are not yet resolved and are still a subject of heated debate even in the visual domain. Thus, "the analog-propositional debate, occasionally also called the picture-description debate, is an ongoing and notoriously irreconcilable dispute within cognitive science about the representational format of visual imagery," Stanford Encyclopedia of Philosophy (http://plato.stanford.edu/entries/mental-imagery/).

In contrast to late-onset blind individuals, who (similarly to the sighted) have had enough visual stimulation to develop vision and visual imagery, congenitally blind individuals have had *no* access to visual information. Thus, it is accepted as "clearly true that visual imagery does not account for cross-modal activation of visual cortex for the early blind" (Lacey et al., 2009), and that the congenitally blind are unable to perform visual imagery tasks (e.g., Goyal et al., 2006). Besides, it is logically impossible to ascertain whether the congenitally blind have visual imagery, as they have no previous vision-related experience and hence no basis for such a comparison of subjective qualia.

Besides, an objective property of visual imagery is that its underlying cortical signals propagate in a *top-down* fashion through the occipital visual hierarchy, with the signal being strongest in the higher extrastriate areas (e.g., Ishai and Sagi, 1995; Kreiman et al., 2000; O'Craven and Kanwisher, 2000; Kosslyn et al., 2001; Tong, 2003; Mechelli et al., 2004; Amedi et al., 2005; Merabet et al., 2005), and possibly not reaching V1 at all; thus, there is still an open debate whether imagery activates V1 itself or not. The above characteristics of the imagery response in the visual hierarchy, and in V1 in particular, are highly incompatible with the strong V1 activation in CB4 under our *tactile-memory* task; while, in contrast, this strong and mature activation following the memory training is more consistent with the alternative hypothesis of V1 operating as a memory buffer.

# **WHAT NEURAL MECHANISMS MAY BE ACTIVATING V1?**

The novel memory paradigm, based on training in drawing guided solely by tactile memory, opens an intriguing domain for future research. If the putative spatial buffer or "sketchpad" for working memory is implemented in V1, does imagery involve the same passive buffer? Alternatively, does working memory employ an imagery-specific representational mechanism to occupy our awareness; or do they both utilize a more generic "projection screen" (of either modality-specific or modality-independent nature)? All of these important questions are still open.

In particular, further studies in a wider population are needed to investigate what deeper mechanisms may mediate the cross-modal reorganization observed in V1. There is a range of theoretical possibilities, such as unmasking of preexisting connections, changes in synaptic weights, growth of new connections, modulation of long-range influences, up-regulation of specific transmitter sources, or a combination of a number of different mechanisms (e.g., Florence and Kaas, 1995; Jones, 2000; Raineteau and Schwab, 2001; van Brussel et al., 2011; Merabet et al., 2008b). Indeed, the extent of V1 connectivity is currently undergoing an extensive re-evaluation. Recent electrophysiological and anatomical studies in non-human primates reveal a picture of multiple reciprocal connections at lower hierarchical levels, including the primary areas. An impressive number of interconnections to and from V1 across the brain have been established. In addition to the well-known direct feedback projections to V1 originating from V2, V3, V4, V5 or MT, MST, FEF, LIP, and inferotemporal cortex (Perkel et al., 1986; Ungerleider and Desimone, 1986a,b; Shipp and Zeki, 1989; Rockland, 1994; Budd, 1998; Barone et al., 2000; Suzuki et al., 2000), there are direct feedforward projections to V1 originating from the pulvinar, LGNd, claustrum, nucleus paracentralis, raphe system, locus coeruleus, and the nucleus basalis of Meynert (Ogren and Hendrickson, 1976; Rezak and Benevento, 1979; Graham, 1982; Blasdel and Lund, 1983; Doty, 1983; Perkel et al., 1986; Lachica and Casagrande, 1992; Hendry and Yoshioka, 1994; Adams et al., 2000; Schmolesky, 2000).

Other potential candidate sources of direct V1 activation were considered by Merabet et al. (2007), such as long-range corticocortical connections from multimodal parietal areas (Rockland and Ojima, 2003), or from somatosensory or other primary sensory cortices (Falchier et al., 2002; Cappe and Barone, 2005), plastic changes in subcortical pathways (e.g., Sur and Leamey, 2001), or multisensory interactions within primary sensory areas mediated by a competing balance between a form of direct drive and potentially inhibitory top-down projections from associative cortical areas. Even so, the connections listed above still represent only a subset of the direct and indirect projections that carry signals into and out of V1. That interconnectivity provides a number of potential sources of direct V1 activation, one of which is the pulvinar, since the anatomical connectivity with the pulvinar is particularly strong.

Additionally, Clavagnier et al. (2004) examined feedback projections to area V1 using retrograde tracer injections. Notably, in addition to well-known areas and a number of long-distance feedback connections originating from auditory (A1) and multisensory (STP) cortices, they also found connections from a perirhinal area. The perirhinal-to-V1 connections are of particular interest in the context of our finding of a memory-related role for V1, as they could represent another potential pathway for the involvement of V1 in working memory and the active processing of stored spatial information (or what Clavagnier et al. refer to as "consciousness"). Definitive studies on these issues remain to be conducted, however.

# **IMPLICATIONS FOR BRAIN ORGANIZATION AND PLASTICITY**

The present results in this case of congenital blindness address a series of cogent questions in relation to cortical plasticity. What happens to the "territory" of the visual cortex once the visual input is lost? For example, it is well-known that the two eyes normally "cooperate" to provide our binocular vision capabilities. However, if the inputs to the two eyes become unequal for some reason, cooperation turns into competition. Such binocular competition often alters the cortical organization and results in pathological conditions such as amblyopia. An analogous competition is also operating at the subsequent hierarchical level, that of *cross-modal* interactions. When the diverse sensory inputs are in balance, the cooperative principle determines their integrative work. If this balance is seriously disturbed, as in blindness, then the stronger sensory modality has the capability of invading the visual territory through the compensatory cross-modal mechanisms of brain plasticity and reorganization. Deterioration or loss of vision thus induces cortical changes and corresponding shifts in the cortical weighting of information processing in order to adapt to both the modified infrastructure of available sensory inputs and the new demands on the brain (e.g., Pascual-Leone et al., 2005; Merabet et al., 2005; Merabet et al., 2008a; Borowsky et al., 2007; Ptito et al., 2008; Likova, 2012). The present results in CB4 imply that this invasion is not a "random walk incursion," but is ruled by sophisticated mechanisms accounting for the *task-specific* demands.

In terms of the time-course analysis, our finding of *immature* BOLD waveforms before training indicates that simply measuring some activation level is not a sufficient indicator of functional reorganization. This is an important warning for neuroimaging studies of plasticity: additional criteria, incorporating the characteristics of the response time-course, also have to be taken into account.
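One way such a time-course criterion could be operationalized is sketched below: alongside the mean response amplitude, the observed waveform is correlated with the model prediction. This is only an illustrative assumption, not the analysis used in this study; the function name `pearson_r` and all time-course values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical time courses in arbitrary units (NOT data from the study).
predicted = [0, 0, 1, 3, 5, 6, 6, 6, 6, 5, 3, 1, 0, 0]   # sustained model prediction
well_fit = [0.1, -0.1, 1.2, 2.8, 5.1, 6.2, 5.9, 6.1, 5.8, 5.2, 2.9, 1.1, 0.2, -0.1]
transient = [0, 0, 1, 3, 5, 3, 1, 0, 0, 0, 0, 0, 0, 0]   # early-offset response

for name, observed in (("well-fit", well_fit), ("transient", transient)):
    amplitude = sum(observed) / len(observed)   # mean activation level
    fit = pearson_r(predicted, observed)        # waveform-shape criterion
    print(f"{name}: mean amplitude = {amplitude:.2f}, fit r = {fit:.2f}")
```

In this toy case both responses have a positive mean amplitude, so an amplitude measure alone would register activation in each, while the fit to the sustained prediction separates the mature waveform from the early-offset one.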

# **CONCLUSION**

The present results have multivalent implications. Although there have been a lot of studies on cross-modal *perception* in the blind, there have been only a few studies on cross-modal *memory,* and even fewer have incorporated a *training* process. Thus, to our knowledge, the current study is the first to assess the involvement of area V1 in a *tactile-memory* task, as well as being the first on the *effect of training* on both tactile-memory under visual deprivation and on *learning* to draw in congenital blindness. In our task, the tactile model under the fingers of the left hand is removed after being tactually explored and memorized, so the drawing movements of the right hand are guided solely by memory, with no concurrent tactile input from the model.

The brain of this congenitally blind adult showed rapid changes in the process of learning to draw. This issue is particularly telling in relation to V1, which was massively activated as the skill of drawing from tactile memory was enhanced. It may seem particularly surprising to find such reorganization in V1, whose main role is considered to be the early processing of information from the visual input modality. The implication is that, even late in life, CB4's visual cortex still retained enough plasticity to be selectively accessible for use when the need arose, such as in our demanding memory task. This novel experimental approach, accessing the highest resolution topographic map in the brain (V1), provides a "real-life" yet tractable paradigm for re-evaluating principles of brain architecture. Moreover, the Cognitive-Kinesthetic training method should contribute to effective rehabilitative interventions in the lives of those with sensory disabilities. It is particularly noteworthy that the training was transformative for the congenitally blind individual. Helmholtz' concept of drawing as empowering "particularly vivid and accurate" registration of sensory structures, their storage, and retrieval so as to provide true memory in a form that can be translated to guide the accurate motor-control signals, seems to be true even under blindness, as in the case of CB4.

The present findings, to the extent that they can be generalized, add a compelling slate of evidence against the view of exclusively sensory and unimodal primary cortices, thus being consistent with the idea of a more distributed architecture and increased "task sharing" among sensory processing regions even within the sensory neocortex. In combination with our previous work with the drawing paradigm, these studies propel the emerging re-conceptualization of brain architecture as highly interactive and capable of reorganization even after long-term sensory deprivation.

# **ACKNOWLEDGMENTS**

This research is supported by NSF/SLC Grant to Lora T. Likova. The author thanks Spero Nicholas for his help in data preprocessing and analysis tools, and Christopher W. Tyler for helpful discussions.

# **REFERENCES**

Williams, M., Baker, C. I., Op de Beeck, H. P., Shim, W. M., Dang, S., Triantafyllou, C., and Kanwisher, N. (2008). Feedback of visual object information to foveal retinotopic cortex. *Nat. Neurosci.* 11, 1439–1445.

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 April 2011; accepted: 22 February 2012; published online: 14 May 2012.*

*Citation: Likova LT (2012) Drawing enhances cross-modal memory plasticity in the human brain: a case study in a totally blind adult. Front. Hum. Neurosci. 6:44. doi: 10.3389/fnhum.2012.00044*

*Copyright © 2012 Likova. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# **MISDIRECTION – PAST, PRESENT, AND THE FUTURE**

*Gustav Kuhn<sup>1</sup>\* and Luis M. Martinez<sup>2</sup>\**

*<sup>1</sup> Department of Psychology, Brunel University, Uxbridge, UK*

*<sup>2</sup> Instituto de Neurociencias de Alicante, Consejo Superior de Investigaciones Científicas-Universidad Miguel Hernández, Sant Joan d'Alacant, Spain*

#### *Edited by:*

*Idan Segev, The Hebrew University of Jerusalem, Israel*

#### *Reviewed by:*

*Lutz Jäncke, University of Zurich, Switzerland*

*Shlomo Bentin, The Hebrew University of Jerusalem, Israel*

#### *\*Correspondence:*

*Gustav Kuhn, Department of Psychology, Brunel University, Uxbridge UB8 3PH, UK. e-mail: gustav.kuhn@brunel.ac.uk; Luis M. Martinez, Instituto de Neurociencias de Alicante, Consejo Superior de Investigaciones Científicas-Universidad Miguel Hernández, Campus de Sant Joan, Avenida Ramón y Cajal, S/N, 03550 Sant Joan d'Alacant, Spain. e-mail: l.martinez@umh.es*

Misdirection refers to the magician's ability to manipulate people's attention, thoughts, and memory. It has been argued that some of the techniques used by magicians to orchestrate people's attention and awareness may provide valuable insights into human cognition. In this paper we review the scientific literature, as well as some of the magic literature, on misdirection. We focus on four main points: (1) the magician's concept of misdirection, (2) the paradigms used to study misdirection scientifically, (3) the current scientific findings, and (4) future directions.

**Keywords: misdirection, magic, attention, awareness**

"*The principle of misdirection plays such an important role in magic that one might say that magic is misdirection and misdirection is magic*" Hugard (1960, p. 115)

# **INTRODUCTION**

Over the centuries magicians have developed powerful ways of manipulating people's perception (Christopher, 2006). In recent years there has been much interest in understanding the scientific basis of some of the techniques, as they are thought to provide valuable insights into human cognition (Kuhn et al., 2008a; Macknik et al., 2008; Kuhn, 2010). Much of this work has focused on the concept of misdirection, a technique that is broadly defined as manipulating people's attention, thoughts, and memory. The aim of this paper is to review the scientific, as well as some of the magic literature on misdirection, to identify the differences between the magician's concept and the scientific view, while highlighting the current scientific findings and potential future directions.

Magicians can manipulate people to an extraordinary degree because our subjective impression of the world does not necessarily match reality (Gregory, 2009). For instance, we consciously experience the world as a seamless whole, continuous both in space and time. However, our subjective perception of a scene is actually based on a partial analysis performed by cells located within separate brain areas, each selective to distinct aspects of an object or event in different regions of visual space. In addition, our eyes constantly move as we explore the environment, providing a sequence of multiple views of the objects in our surroundings. Visual continuity is, therefore, a brain construct that depends, among other things, on our ability to store properties of a scene and compare them across these perceptual interruptions. This task is even more challenging considering our usually cluttered visual environment, which is filled with information that is both relevant and irrelevant for our current behavior. The neural mechanisms underlying our capacity to visually interpret the world are still highly debated. Attention, defined as the process by which we select a subset of available information while filtering out the rest (Desimone and Duncan, 1995), seems to play a critical role in determining what (and how) we perceive about ourselves and the environment, hence the famous saying that we only see what we pay attention to. Moreover, we frequently perceive and process events based on expectations, rather than the physical state of the world. For example, Bunzeck et al. (2005) found activations in the auditory cortex during presentation of scenes normally accompanied by characteristic auditory stimuli, thus demonstrating the subjectiveness of perception on a neural level. Magicians have long taken advantage of perceptual processes involving attention and awareness to manipulate their audience's conscious experience during magic tricks. We believe that magical techniques, if used in a controlled, laboratory-like environment, will become an invaluable tool to explore the neural mechanisms and behavioral underpinnings of consciousness, attention, and visual perception.

# **MISDIRECTION – THE MAGICIAN'S CONCEPT**

Misdirection deals with manipulating what people see and remember about an event. Given the complexity of the perceptual process, it may come as little surprise that defining misdirection is rather difficult. As Fitzkee (1945) points out, the magic literature has failed to come up with any satisfactory definition of misdirection. In a literal sense, the prefix "mis" means wrong or wrongly, whilst direction means to point out a way or to guide or to instruct. Misdirection can therefore literally be defined as pointing out the wrong way. Another way of defining misdirection is by focusing on its function. Any magic effect (what the spectator sees) requires a method (the means used to produce the effect). The main purpose of misdirection is to disguise the method and thus prevent the audience from detecting it whilst still experiencing the effect (**Figure 1**; Sharpe, 1988; Lamont and Wiseman, 1999).

Misdirection is central to magic, and has attracted much interest from magicians. Our conscious experience of the world is determined by a cascade of cognitive and neurological processes; generally starting with the encoding of perceptual information, which is then further processed and stored in memory, before being retrieved and thus entering consciousness (Koch, 2004). Alterations to any of these processes will influence our conscious experience and lead to conspicuous failures in awareness such as change blindness (Simons and Rensink, 2005), inattentional blindness (Mack and Rock, 1998; Simons and Chabris, 1999), repetition blindness (Kanwisher, 1987; Whittlesea and Podrouzek, 1995; Whittlesea et al., 1995), visual masking (Macknik, 2006), the attentional blink (Raymond et al., 1992), or simply forgetting or mis-remembering (Loftus, 1979).

As a consequence, magicians have developed techniques that manipulate different levels of this perceptual chain: for example, what we attend to (i.e., manipulating spatial attention), how we remember an event, and how we interpret causality. Whilst much of the practical, as well as theoretical, knowledge about misdirection is typically linked to specific magic tricks, numerous magic scholars have proposed frameworks that formulate some of the general principles of misdirection. For example, Sharpe (1988) distinguished between active and passive misdirection, whereby the former involves those methods that attract spatial attention due to some kind of transient change in sound or movement. Passive misdirection, on the other hand, refers to methods that work by unobtrusively manipulating our minds through the way in which people react to static stimuli. Ascanio and Etcheverry (2000) instead described three degrees of misdirection. The first degree would be when the magician performs two simultaneous actions: the method behind the magic trick (the secret move) and a distractor. Having to attend to both, the spectator cannot focus on the method and that, in general, suffices to make the method go unnoticed. In the second degree, the two actions are not perceptually equivalent, such as when a big move covers a small move, and as a result misdirection is enhanced. Ascanio's third degree would be the same as Sharpe's active misdirection. Magicians often talk about misdirection in terms of creating zones of high and low interest, whereby the former will attract attention at the expense of the latter (**Figure 2**). In fact, Apollo Robins believes that misdirection is not merely a matter of diverting attention away from the secret move. He thinks it is more about the magician's capacity to draw attention to a particular place, which he calls the frame, at a particular time (Robins, 2007; Magic of Consciousness Symposium; http://assc2007.neuralcorrelate.com). This creates a sort of tunnel vision in which any action occurring outside of the frame goes unnoticed and, in addition, the smaller the frame the stronger the sense of misdirection (see also Ascanio and Etcheverry, 2000). Moreover, differences are drawn between manipulations of spatial attention and time perception (Sharpe, 1988; Tamariz, 1988; Lamont and Wiseman, 1999; Ortiz, 2006). Time misdirection works because magicians separate the method from the magical effect and this separation generates false causal links between unrelated actions, preventing the audience from being able to mentally reconstruct the trick. As is apparent from this small, and rather incomplete, review, the concept of misdirection has attracted much interest amongst magicians, and whilst it is somewhat poorly defined and lacks a clear overarching theory, magicians have developed much expertise in how our perception can be manipulated.

**FIGURE 1 | The Conjuror by Hieronymus Bosch (estimated 1475–1505).** The conjuror on the right captures his audience's attention with a game of cups and balls. Cups and balls routines were first introduced more than 2000 years ago and entail a host of classic effects of magic, such as vanishes, appearances, transpositions, and substitutions. Performing a cups and balls trick is highly regarded amongst magicians since it requires a great deal of motor skill and coordination, combined with excellent audience management to effectively misdirect the spectators' attention away from the method. In this painting, misdirection is so powerful that the spectator in the forefront, mesmerized by the conjuror's performance, fails to notice that someone standing behind him is stealing his wallet.

# **MISDIRECTION – THE SCIENTIFIC PARADIGM**

Science relies on clear definitions of concepts. Rather than explaining misdirection as a whole, attempts have been made to link some of the misdirection principles to scientific concepts of perception, and develop paradigms that can be used to explore these mechanisms more systematically. One such paradigm is the Misdirection Paradigm (**Figure 2**), in which participants view a pseudo "magic trick" in which the magician makes a cigaret and lighter disappear (Kuhn and Tatler, 2005). The disappearance of these objects relies on the magician dropping them into his lap, which happens in full view of the observer. However, the misdirection employed by the magician prevents most observers from detecting this event. Crucially, as the method (i.e., the dropping of the objects) takes place in full view, we can use participants' detection of the method (i.e., did you see the object being dropped?) as a probe of the misdirection's effectiveness.

**FIGURE 2 | Zones of high and low interest during a magic trick.** The figure shows the second by second breakdown of a misdirection routine. The magic effect behind the trick was the disappearing of a lighter and a cigaret; the method was for the magician to simply drop the items into his lap. Although the dropping gesture was fully visible, misdirection prevented most of the observers from seeing this event. The dotted and solid ovals represent the areas of high and low interest, respectively. A cigaret is removed from the packet and deliberately placed in the magician's mouth the wrong way round (1–7 s). The magician then pretends to light the cigaret (7 s). The flame creates a high luminance and attracts attention. Both the spectator and magician then notice this mistake, which raises the interest in the cigaret (8 s). The magician then turns the cigaret around, while keeping his gaze fixed on the cigaret and the hand manipulating it (8–9 s). During this maneuver, the hand holding the lighter is lowered to the tabletop and drops the lighter into the magician's lap. This dropping of the lighter happens in a low area of interest. The disappearing lighter is dramatically revealed by snapping his fingers and waving his hands (11 s). The method for making the cigaret disappear relies on it being dropped into the lap. This action is fully visible, with the cigaret dropped from 15 cm above the table top (11 s). Surprisingly, most participants did not see this: at the time the cigaret is dropped it is in an area of low interest (the other hand is an area of high interest). In this case, the high interest is manipulated by three things: (i) surprise: the disappearance of the lighter automatically leads to interest, (ii) social cues: the magician looks at the hand that previously held the lighter and rotates his body in that direction, and (iii) movement and sound: at the time of the drop the magician snaps his fingers and waves his hand, thereby attracting attention. Adapted from Kuhn et al. (2008a).

It has been argued that the mechanism involved in preventing participants from detecting the method is analogous to inattentional blindness (Kuhn and Tatler, 2005; Kuhn and Findlay, 2010). Inattentional blindness refers to the phenomenon whereby people often fail to perceive a fully visible event when engaged in an attentionally demanding distractor task (Mack and Rock, 1998; Simons and Chabris, 1999). Given the similarity between inattentional blindness and misdirection, it has been argued that the principles involved in misdirection rely on inattentional blindness, whereby people's attention is misdirected thus preventing them from perceiving the method (Kuhn and Findlay, 2010). The similarities and differences between inattentional blindness and misdirection have caused much debate. Whilst some have argued for numerous discontinuities between the two (Memmert, 2010; Memmert and Furley, 2010), others have suggested that they do indeed involve very similar concepts (Moran and Brady, 2010; Most, 2010; Kuhn and Tatler, 2011). What is clear from this debate is that whilst inattentional blindness paradigms typically require participants' attention to be distracted using an explicit distractor task (e.g., count the number of basketball passes), the distraction in the misdirection paradigm occurs implicitly through different misdirection principles (Kuhn and Tatler, 2011). Indeed, it is people's failure to realize that they have been misdirected that is crucial, and it is one of the features that distinguishes misdirection from simple distraction (Lamont et al., 2010).

The related phenomenon of change blindness refers to people's failure to notice substantial changes to a visual scene if the visual transient associated with the change is masked (Rensink et al., 1997). Moreover, if attention is captured using a strong attentional cue, participants often fail to notice the change, thus demonstrating that attention is needed to consciously perceive it (O'Regan et al., 1999). Change blindness could also involve a limit in the amount of information about a scene that can be stored in visual short-term memory (vSTM) at any given time, or a limit on the comparison process (Scott-Brown et al., 2000). The exact way that these different aspects of scene perception are involved is still unclear. There are numerous situations in which a magician may switch an item for something else, and misdirection is employed to prevent participants from detecting the change. As such, rather than relying on people's perception of a transient event, their susceptibility toward change blindness offers a valuable probe to investigate the effectiveness of misdirection. For example, in a series of experiments, misdirection has been used to prevent people from seeing an obvious color change to a deck of cards (Teszka et al., 2011). Here linguistic social cues (i.e., asking a question) were used to prevent participants from detecting this change; thus change detection was used to measure the effectiveness of the misdirection. Although the mechanisms underlying inattentional and change blindness may differ substantially (Rensink, 2000), in practice misdirection may be used to induce both types of blindness (Memmert, 2010; Kuhn and Tatler, 2011).

Misdirection has also been used to investigate the mechanisms involved in vSTM. Change blindness has been studied both in the laboratory and in more realistic, real-world situations. In a laboratory setting, it is easy to control for cognitive load, vSTM capacity, and the allocation of attention. However, change blindness protocols often employ rather un-naturalistic viewing conditions in which subjects are asked to perform many repetitions of a task that they know, or even have practiced, beforehand. During natural vision experiments, on the other hand, subjects are naïve to the task but it is difficult to control where they are directing their attention to and whether or not they may be engaged in other, competing, cognitive tasks. Some magic tricks provide a new and unique opportunity to leverage the strengths of the two experimental approaches while avoiding their particular drawbacks. Alonso-Pablos et al. (submitted) have recently used misdirection to study the interaction between attention and vSTM. Their results show that items, cards or human faces in this case, that lie outside the focus of attention can still be effectively stored in vSTM (see also Simons et al., 2002). Moreover, this passive representation of a visual scene is rather rich and, even though it does not give rise to conscious perception, it can be unconsciously retrieved and used in a two-alternative forced choice paradigm as efficiently as the previously attended objects. These results suggest that a classical change detection paradigm might not be the best approach to study the capacity of vSTM (see also Makovski et al., 2006). Interestingly, this passive, unconscious, vSTM was very labile and the authors showed that patter, the casual chitchat used by magicians to distract audiences, can effectively interfere with, and even completely abolish, its contents. These results further illustrate that magicians' intuitions about the potential for distraction of verbal misdirection, involving linguistic social cues, are fundamentally correct.

Where people look provides us with an effective online measure of overt attention (Liversedge and Findlay, 2000; Henderson, 2003). Advances in eye tracking technologies have enabled researchers to accurately measure people's eye movements whilst watching different types of magic tricks. Indeed these studies have demonstrated a high consistency of eye movements, thus illustrating that misdirection is very effective in manipulating where people look (Kuhn and Tatler, 2005; Kuhn and Land, 2006). Macknik et al. (2008) have defined overt misdirection as the magician's actions that divert the spectator's gaze away from the method behind the effect. Covert misdirection, on the other hand, refers to instances in which it is the attention of the audience that is directed away from the method, irrespective of the position of their gaze (e.g., Kuhn and Tatler, 2005). Whilst magicians are mainly concerned with what people see, rather than where they look, misdirection clearly also offers a valuable tool to investigate oculomotor behavior (e.g., Kuhn and Tatler, 2005; Kuhn and Land, 2006; Otero-Millan et al., 2011).

Rather than preventing people from seeing an event, misdirection can also make people perceive illusory events that have not occurred. For example, Triplett (1900; see also Kuhn and Land, 2006) developed the vanishing ball illusion in which a magician is seen throwing a ball up in the air a couple of times, before merely pretending to throw it. Most of the observers claimed to have seen a "ghost ball" leaving the hand on the final throw, thus illustrating that people's perception of an event is largely influenced by expectations, rather than the physical presence of the ball. Kuhn and Land (2006) developed careful measures enabling them to establish the effectiveness of this illusion. Cui et al. (2011) developed a related paradigm in which participants were repeatedly asked to view a video clip of a magician tossing a coin from one hand to the other. On some of the trials the coin was tossed for real, whilst on the other half of the trials the magician merely pretended to toss the coin. On a large proportion of trials, participants claimed to have seen the coin fly from one hand to the other, even though it was not physically present. People's perception of this illusory event could be used to measure the effectiveness of the illusion.
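As a rough illustration of how reports of an illusory event might be turned into a measure of effectiveness, the sketch below scores a set of toss trials in Python. It is not the scoring procedure of any of the studies cited above: the trial records, the `report_rate` helper, and the variable names are hypothetical and chosen only for illustration.

```python
# Hypothetical trial records: (toss_type, reported_seeing_coin_fly).
# The values are invented for illustration and are not data from any study.
trials = [
    ("real", True), ("fake", True), ("fake", False), ("real", True),
    ("fake", True), ("real", True), ("fake", True), ("real", True),
]


def report_rate(trial_records, toss_type):
    """Proportion of trials of one type on which the coin was reported as seen."""
    reports = [seen for kind, seen in trial_records if kind == toss_type]
    if not reports:
        raise ValueError(f"no trials of type {toss_type!r}")
    return sum(reports) / len(reports)


# Reports of a flying coin on pretend tosses are illusory by construction.
illusion_rate = report_rate(trials, "fake")
# Reports on real tosses give a baseline for how often the coin is seen at all.
hit_rate = report_rate(trials, "real")
```

Comparing the two rates, rather than reading the illusion rate alone, helps separate a genuine illusion from a general tendency to report seeing the coin.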

Whilst the magicians' concept of misdirection may be rather broadly defined, scientists have come up with a variety of paradigms that enable us to investigate some of the principles of misdirection scientifically, and even take advantage of these magical techniques to explore the neural and behavioral correlates of visual perception, attention, and awareness.

# **MISDIRECTION – THE SCIENTIFIC FINDINGS**

Numerous studies have now demonstrated that misdirection provides an extremely effective way of manipulating what people see. Rather surprisingly, these studies have consistently shown that people's detection of the event (i.e., the lighter or cigaret drop) was independent of where they were looking (**Figure 3**), thus demonstrating that misdirection generally relies on manipulating covert attention (i.e., attention in the absence of eye movements), rather than overt attention (i.e., where people look; Kuhn and Tatler, 2005; Kuhn et al., 2008b, 2009; Kuhn and Findlay, 2010). However, participants who detected the drop were significantly faster to fixate the location where the event took place in subsequent saccades than those who missed it. These results clearly illustrate that whilst covert and overt attention can be dissociated in space (Posner, 1980), there is a clear temporal link between the two.

**FIGURE 3 | Misdirection works independently of direction of gaze.** An eye-tracker was used to record the subjects' fixation points at the time of the cigaret drop during the magic trick presented in **Figure 2**. **(A)** Results from naïve participants who missed the cigaret drop. **(B)** Naïve participants who detected the cigaret drop. **(C)** Informed participants who missed the cigaret drop. **(D)** Informed participants who detected the cigaret drop. Most of the naïve participants fixated either on the lighter hand, the head, or the area between the lighter hand and the head. Most of the informed participants looked at the lighter hand or the area between the lighter hand and the head. Interestingly, only one informed participant was able to detect the cigaret drop by using his foveal vision, showing that no systematic differences were found between the two conditions. Adapted from Kuhn et al. (2008a).
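One way to probe the claim that detection is independent of gaze position is to compare how far detectors and non-detectors were fixating from the drop location at the time of the drop. The following is a minimal Python sketch, not the analysis used in the studies cited above: the fixation coordinates, the `DROP_LOCATION` value, the helper names, and the choice of a permutation test are all assumptions made for illustration.

```python
import math
import random
from statistics import mean

# Assumed position of the cigaret drop, in degrees of visual angle.
DROP_LOCATION = (0.0, 0.0)

# Hypothetical data: (fixation_x, fixation_y, detected_drop) per participant.
# These values are invented for illustration only.
participants = [
    (8.1, 3.0, False), (7.4, 2.2, True), (9.0, 4.1, False),
    (6.8, 3.5, True), (8.6, 2.9, False), (7.9, 3.8, True),
    (1.0, 0.5, True), (8.3, 3.3, False),
]


def distance_to_drop(x, y, target=DROP_LOCATION):
    """Euclidean distance between a fixation and the drop location."""
    return math.hypot(x - target[0], y - target[1])


def mean_difference(distances, labels):
    """Mean distance of non-detectors minus mean distance of detectors."""
    missed = [d for d, hit in zip(distances, labels) if not hit]
    seen = [d for d, hit in zip(distances, labels) if hit]
    return mean(missed) - mean(seen)


def permutation_p_value(distances, labels, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in mean distance."""
    rng = random.Random(seed)
    observed = abs(mean_difference(distances, labels))
    shuffled = list(labels)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling keeps group sizes fixed, so neither group is ever empty.
        rng.shuffle(shuffled)
        if abs(mean_difference(distances, shuffled)) >= observed:
            extreme += 1
    return extreme / n_permutations


distances = [distance_to_drop(x, y) for x, y, _ in participants]
labels = [hit for _, _, hit in participants]
p_value = permutation_p_value(distances, labels)
```

With real eye-tracking data, a large p-value would be consistent with detection being unrelated to fixation distance, although it would not prove independence.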

Whilst magic works when viewed live as well as on television, the subjective experience of watching a magician face-to-face is clearly different from observing him/her on television. That said, misdirection has been shown to be effective both when viewed in a face-to-face interaction (Kuhn and Tatler, 2005; Tatler and Kuhn, 2007) as well as when observed on a computer monitor (Kuhn et al., 2008b). However, differences did emerge. For example, the misdirection experienced in the face-to-face interaction was more effective than when viewed on a monitor. Moreover, in the face-to-face scenario, instructing participants as to what they were about to see did not influence their eye movement behavior, nor did it improve their detection of the dropped cigaret. However, when the trick was viewed on a computer monitor, prior instructions influenced both detection and eye movement behavior. It has recently been shown that eye movements in a social context vary greatly depending on whether a person is seen for real or on a video screen (Laidlaw et al., 2011), and future research could investigate the role that the presentation medium has on misdirection.

One of the key rules in magic states that magicians should never repeat the same trick using the same method. Indeed all of the published papers to date demonstrate that participants are less susceptible toward misdirection when the same trial is repeated (Kuhn and Tatler, 2005; Kuhn et al., 2008b, 2009; Kuhn and Findlay, 2010; Cui et al., 2011). Whereas some research groups have relied on single presentation of trials, others have opted to use numerous presentations of the same trial (e.g., Cui et al., 2011). Whilst the latter method is clearly advantageous in terms of efficient data collection, the fact that the effectiveness of misdirection is greatly reduced does raise some questions as to the reliability of multiple trial presentations.

Even though many of the misdirection techniques are heavily debated amongst magicians, most would agree that social cues (i.e., where the magician looks) play a fundamental role in misdirection. For example, as Sharpe points out "people tend to look in the same direction as the person they are watching looks" (1988, p. 64). Indeed most of the experimental work supports the view that gaze cues play an important role in manipulating what people see. For example, using the vanishing ball illusion, it has been shown that participants' susceptibility toward the illusion is greatly influenced by the magician's social cues (Kuhn and Land, 2006). When the magician looked at the hand that was concealing the ball, rather than following the imaginary trajectory of the ball, the effectiveness of the illusion was greatly reduced. Using the Misdirection Paradigm, an analysis of people's eye movements showed a strong correlation between where the magician was looking and the observer's gaze (Tatler and Kuhn, 2007). Moreover, using an experimental approach in which the magician's gaze cues were experimentally manipulated, it was shown that the magician's gaze cues influenced both what people saw, as well as where they were looking (Kuhn et al., 2009). Cui et al. (2011), on the other hand, argue that, at least in some routines, perception of magic can be stronger without social cues. Their conclusion is based on findings from the vanishing coin trick, in which the magician either tosses a coin for real, or merely pretends to toss it from one hand to the other. Immediately prior to the toss the magician's gaze is directed toward the observer, and it was thought that this direct eye gaze would capture participants' attention and thus prevent them from distinguishing between the real and the fake toss. The magician's joint attention cues were manipulated by occluding his head using an artificial mask. They found that subjects did not direct their gaze at the magician's face at the time of the toss, and that the illusion was strongest in the presentations where the magician's head was occluded. These results suggest that joint attention plays no role in the perception of this effect. However, it should be noted that the mask itself may have captured people's attention and thus misdirected them from the method. As acknowledged by the authors, further research in which the magician's gaze is experimentally manipulated is required before any final conclusions about the use of social cues in this illusion can be drawn. However, on the whole, the scientific evidence supports the notion that social cues play a pivotal role in misdirection.

Anecdotal evidence from magicians suggests that not everyone is equally deceived by misdirection. To date, however, there is only one experimental study that has investigated individual differences in misdirection. Individuals with autism have rather specific impairments in processing social information, and it is thought that these individuals tend to avoid social information (Nation and Penny, 2008), and in particular tend to be less effective at using joint attention (Leekam et al., 1998). Given the importance that social cues play in misdirecting attention, it was predicted that individuals with autism should be less misdirected and thus less susceptible toward the Vanishing Ball illusion. However, rather surprisingly, it was shown that individuals with autism did make use of the social cues, and in fact were more susceptible toward the Vanishing Ball illusion (Kuhn et al., 2010). This study further highlighted that individuals with autism had particular difficulties in allocating attention fast enough to the relevant location, which may have resulted in higher levels of deception. We are obviously only at the beginning of understanding some of the individual differences in susceptibility toward misdirection, but misdirection clearly offers a valuable tool to investigate individual differences in attentional allocation.

Otero-Millan et al. (2011) investigated the effectiveness of different types of motion trajectories in misdirecting attention. These authors showed that curved motion resulted in different types of eye movements (more smooth pursuit) than rectilinear motion, and participants were less likely to look back at the hand from which attention was being misdirected. These findings offer a valuable starting point for investigating the way in which different movements influence attention.

# **FUTURE DIRECTIONS**

From this review it is apparent that the recent interest in the science of magic has led to great advances in understanding some of the brain mechanisms involved in misdirection. More importantly, the scientific investigations into misdirection have greatly furthered our understanding of visual cognition and perception in general. That said, this science of magic is clearly in its infancy, leaving much scope for future explorations. What direction should this field of study take? One obvious step would be to establish a taxonomy and more unifying theory of misdirection. There are several theoretical texts which try to conceptualize misdirection (Fitzkee, 1945; Sharpe, 1988; Tamariz, 1988, 2007; Ortiz, 2006); however, most knowledge and experience about misdirection is described within the context of specific magic tricks (e.g., Ganson, 1980). Whereas it is debatable whether such an all-inclusive theory of misdirection is feasible (Lamont et al., 2010), a comprehensive and up-to-date review of the magic literature focusing on misdirection would certainly be a valuable starting point for future scientific explorations. Crucially, it would at least make this knowledge accessible to researchers with little background in magic. Whilst some attempts have been made to bridge the gap between magic and science (Fraps, 1998; Lamont and Wiseman, 1999; Macknik et al., 2010), most theory to date has been written from the perspective of the magician, rather than the scientist. A wide-ranging review of the literature on misdirection would certainly require and benefit from close collaboration between the two fields.

In addition, further steps should be taken in understanding the cognitive as well as neural mechanisms involved in misdirection. Magicians are primarily interested in discovering powerful and reliable ways of manipulating the audience's awareness. As scientists, on the other hand, we are interested in understanding the underlying brain mechanisms of this deception. In principle, they could be at a perceptual level or involve higher cognitive processes, such as working memory or attentional mechanisms. For example, Apollo Robins' intuition that misdirection is stronger when the magician draws attention into a small frame could be reminiscent of a recent report showing that, in monkey primary visual cortex (V1), increasing task difficulty enhances neuronal firing rate at the focus of attention and suppresses it in regions surrounding the focus (Chen et al., 2008). Similar center–surround mechanisms of spatial attention (Moran and Desimone, 1985; Treue and Maunsell, 1996; Reynolds et al., 1999; Recanzone and Wurtz, 2000; Martinez-Trujillo and Treue, 2002; Ghose and Maunsell, 2008) have been reported previously in different visual cortical areas, including V4 (Sundberg et al., 2009) and even in motion processing areas such as hMT+/V5 (Moutsiana et al., 2011). It is, therefore, even possible that active and passive, or overt and covert, forms of misdirection have different neural correlates. Thus, magical techniques offer a unique test bed for current theories of visual perception, attention, and awareness. If used in a controlled laboratory environment, they will certainly shed new light on highly debated perceptual phenomena such as change blindness, inattentional blindness, and others. As suggested above, whilst some of the mechanisms used by magicians are likely to be the same as those used in traditional experimental paradigms (e.g., attentional orienting by gaze cues), others may differ and may be specific to magic (e.g., social conformity). Only future research will inform us about the exact relationship between misdirection and other attentional manipulations. We do not argue that misdirection is a concept entirely removed from what has been studied by scientists in the past. The main advantage of studying misdirection is that it allows us to exploit the magicians' real-world experience in attentional manipulation, and as such may inform us about the aspects of the environment responsible for driving attention in the real world.

Misdirection will only be truly understood through empirical investigations using a broad range of new paradigms, each with its own unique merits and pitfalls (Kuhn et al., 2008a; Macknik et al., 2008; Barnhart, 2010). These new avenues of research will allow us to address countless questions that remain unanswered. What makes the techniques used in misdirection such powerful tools to manipulate spatial attention? Can we identify new attentional principles used by magicians, yet ignored by scientists? How does the context in which the magician is observed influence misdirection? How do magicians control the "collective attention" in an audience? Is this a self-organizing process, akin to what happens when an audience falls into synchronized clapping at the end of a play in a theater? What are the neural correlates of these synchronizing strategies employed by magicians? The answers to these questions, and many others, may be just a few steps away if we adopt magical techniques, such as those used in misdirection, as part of our laboratory toolkit to investigate sensory awareness.

# **ACKNOWLEDGMENTS**

Work in the laboratory of Luis M. Martinez is supported by grants CONSOLIDER CSD2007-00023 (European Regional Development Fund) and BFU2007-67834 and BFU2010-22220 (Spanish Ministry of Education and Science).

### **REFERENCES**

token individuation. *Cognition* 27, 117–143.


inattentional blindness reveals temporal relationship between eye movements and visual awareness. *Q. J. Exp. Psychol. (Hove)* 63, 136–146.


(2008). Attention and awareness in stage magic: turning tricks into research. *Nat. Rev. Neurosci.* 9, 871–879.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 June 2011; accepted: 12 December 2011; published online: 06 January 2012.*

*Citation: Kuhn G and Martinez LM (2012) Misdirection – past, present, and the future. Front. Hum. Neurosci. 5:172. doi: 10.3389/fnhum.2011.00172*

*Copyright © 2012 Kuhn and Martinez. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# BRAIN AND ART

**Idan Segev**, **Luis M. Martinez**, and **Robert J. Zatorre**

*"Awakening" – Image by Eberhard E. Fetz*

Could we understand, in biological terms, the unique and fantastic capabilities of the human brain to both create and enjoy art? In the past decade neuroscience has made a huge leap in developing experimental techniques as well as theoretical frameworks for studying emergent properties following the activity of large neuronal networks. These methods, including MEG, fMRI, sophisticated data analysis approaches and behavioral methods, are increasingly being used in many labs worldwide, with the goal of exploring brain mechanisms corresponding to the artistic experience.

The 37 articles composing this unique *Frontiers Research Topic* bring together experimental and theoretical research, linking state-of-the-art knowledge about the brain with the phenomena of Art. They cover a broad scope of topics, contributed by world-renowned experts in vision, audition, somato-sensation, movement, and cinema. Importantly, as we felt that a dialog among artists and scientists is essential and fruitful, we invited a few artists to contribute their insights, as well as their art.

Joan Miró said that "art is the search for the alphabet of the mind." neurobiological alphabet of the Arts. We hope that the wide range of articles in this volume will be highly attractive to brain researchers, artists and the community at large.

Idan Segev, Luis M. Martinez and Robert J. Zatorre