# VISUAL MISMATCH NEGATIVITY (vMMN): A PREDICTION ERROR SIGNAL IN THE VISUAL MODALITY

EDITED BY: Gabor Stefanics, Piia Astikainen and István Czigler

PUBLISHED IN: Frontiers in Human Neuroscience

#### *Frontiers Copyright Statement*

*© Copyright 2007-2015 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

*All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-560-2 DOI 10.3389/978-2-88919-560-2


# **VISUAL MISMATCH NEGATIVITY (vMMN): A PREDICTION ERROR SIGNAL IN THE VISUAL MODALITY**

#### Topic Editors:

**Gabor Stefanics,** Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich & ETH Zurich, Zurich, Switzerland; Laboratory for Social and Neural Systems Research, Department of Economics, University of Zurich, Zurich, Switzerland

**Piia Astikainen,** Department of Psychology, University of Jyväskylä, Jyväskylä, Finland

**István Czigler,** Institute of Cognitive Neuroscience and Psychology, Research Center for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary

Word cloud created using the text of all contributing papers of the Research Topic. Words that appear more frequently in the papers have been printed in larger font

Current theories of visual change detection emphasize the importance of conscious attention to detect unexpected changes in the visual environment. However, an increasing body of studies shows that the human brain is capable of detecting even small visual changes, especially if such changes violate non-conscious probabilistic expectations based on repeating experiences. In other words, our brain automatically represents statistical regularities of our visual environment. Since the discovery of the auditory mismatch negativity (MMN) event-related potential (ERP) component, the majority of research in the field has focused on auditory deviance detection. Such automatic change detection mechanisms operate in the visual modality too, as indicated by the visual mismatch negativity (vMMN) brain potential to rare changes. vMMN is typically elicited by stimuli with infrequent (deviant) features embedded in a stream of frequent (standard) stimuli, outside the focus of attention. In this Research Topic we aim to present vMMN as a prediction error signal. Predictive coding theories account for phenomena such as mismatch negativity and repetition suppression, and place them in a broader context of a general theory of cortical responses. A wide range of vMMN studies has been presented in this Research Topic. Twelve articles address roughly four general sub-themes including attention, language, face processing, and psychiatric disorders. Additionally, four articles focused on particular subjects such as the oblique effect, object formation, and development and time-frequency analysis of vMMN. Furthermore, a review paper presented vMMN in a hierarchical predictive coding framework. Each paper in this Research Topic is a valuable contribution to the field of automatic visual change detection and deepens our understanding of the short-term plasticity underlying predictive processes of visual perceptual learning.

**Citation:** Stefanics, G., Astikainen, P., Czigler, I., eds. (2015). Visual Mismatch Negativity (vMMN): A Prediction Error Signal in the Visual Modality. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-560-2

# Table of Contents


*Is it a face of a woman or a man? Visual mismatch negativity is sensitive to gender category*

Krisztina Kecskés-Kovács, István Sulykos and István Czigler

*Event-related potentials to unattended changes in facial expressions: detection of regularity violations or encoding of emotions?*

Piia Astikainen, Fengyu Cong, Tapani Ristaniemi and Jari K. Hietanen

Kairi Kreegipuu, Nele Kuldkepp, Oliver Sibolt, Mai Toom, Jüri Allik and Risto Näätänen

*Investigating developmental changes in sensory processing: visual mismatch response in healthy children*

Katherine M. Cleary, Franc C. L. Donkers, Anna M. Evans and Aysenil Belger

*Visual mismatch negativity: a predictive coding view*

Gábor Stefanics, Jan Kremláček and István Czigler

*Visual mismatch negativity (vMMN): a prediction error signal in the visual modality*

Gábor Stefanics, Piia Astikainen and István Czigler

# Measuring affective reactivity in individuals with autism spectrum personality traits using the visual mismatch negativity event-related brain potential

# *Leigh C. Gayle, Diana E. Gal and Paul D. Kieffaber\**

*Department of Psychology, The College of William and Mary, Williamsburg, VA, USA*

#### *Edited by:*

*Gabor Stefanics, University of Zurich and ETH Zurich, Switzerland*

#### *Reviewed by:*

*Leslie J. Carver, University of California, San Diego, USA*

*Piia Astikainen, University of Jyväskylä, Finland*

#### *\*Correspondence:*

*Paul D. Kieffaber, Department of Psychology, The College of William and Mary, 540 Landrum Dr., Integrated Science Center, Room 1087, Williamsburg, VA 23185, USA. e-mail: pdkieffaber@wm.edu*

The primary aim of this research was to determine how modulation of the visual mismatch negativity (vMMN) by emotionally laden faces is related to autism spectrum personality traits. Emotionally neutral faces served as the standard stimuli and happy and sad expressions served as vMMN-eliciting deviants. Consistent with prior research, it was anticipated that the amplitude of the vMMN would be increased for emotionally salient stimuli. Extending this finding, it was expected that this emotion-based amplitude sensitivity of the vMMN would be decreased in individuals with higher levels of autism spectrum personality traits as measured by the Adult Autism Spectrum Quotient (AQ). Higher AQ scores were associated with smaller amplitudes of the vMMN in response to happy, but not sad emotional deviants. The fact that higher AQ scores were associated with less sensitivity only to happy emotional expressions is interpreted to be consistent with the negative experience of social interactions reported by individuals who are high on the autism spectrum. This research suggests that the vMMN elicited by deviant emotional expressions may be a useful indicator of affective reactivity and may thus be related to social competency in Autism Spectrum Disorder (ASD).

**Keywords: mismatch negativity (MMN), autism spectrum disorders, affect, ERPs, affective disorders**

#### **INTRODUCTION**

Autism is a group of pervasive developmental disorders, often appearing within the first three years of life, that are characterized by atypical development of social and communication skills. Like a growing number of psychological disorders, including schizophrenia, autism is often considered a "spectrum" disorder encompassing a wide variety of symptom profiles and varying degrees of symptom severity. Currently, there are three categories used to group individuals on the autism spectrum. These categories include autistic disorder, Asperger's disorder, and pervasive developmental disorder not otherwise specified (PDD-NOS), collectively referred to as Autism Spectrum Disorder (ASD) (NIMH, 2011).

Symptoms of ASD include abnormalities in pretend play, social interactions, and verbal and non-verbal communication, as well as patterned or repetitive behaviors and actions such as twirling and banging of the head (Lord et al., 2000). ASD is also typically accompanied by speech and learning difficulties as well as rigid, inflexible routines. These social and communication deficits are most often measured by eye contact, facial expressions and body language, and an evaluation of the child's relationships with peers and family members (American Psychiatric Association, 2000).

Although the etiology of ASD is currently unknown, it is commonly linked with neurobiological, neurochemical, and genetic abnormalities (Newschaffer et al., 2007). In the 1950s Dr. Leo Kanner, who originally described autism as a mental disorder, believed that it was a genetically determined phenomenon (Kanner and Eisenberg, 1956). Presently, the development of ASD is credited to an interaction between genetic and environmental causes.

Contemporary methods for identifying and diagnosing ASD are rooted in behavioral assessments. These methods typically rely on subjective observations of the child's social and learning behaviors by parents, teachers, and psychiatrists (Lord and Risi, 1998). Standardized tests, such as the Autism Spectrum Quotient (AQ), the Checklist for Autism in Toddlers (CHAT), the Autism Diagnostic Observation Schedule (ADOS), and the Autism Diagnostic Interview (ADI), have been developed for the explicit purpose of identifying and quantifying personality and behavioral characteristics thought to occupy the autism spectrum (Baron-Cohen et al., 2001; NIMH, 2011). Although these behavioral techniques have been used to standardize diagnostic criteria internationally (Lord et al., 1997; Lord and Risi, 1998), their weaknesses include the fact that they ultimately rely on subjective assessments of behavior and that they lack tangible physiological and/or neurological markers that may help to distinguish ASD from other disorders or from socially awkward, but otherwise neurotypical children.

One potentially useful procedure for investigating the integrity of neural mechanisms associated with social or emotional competency is the mismatch negativity (MMN) component of the event-related brain potential (ERP) (Behrmann et al., 2006; Zhao and Li, 2006). The MMN component is typically measured in response to the presentation of a deviant stimulus amidst a sequence of repeated, or "standard," stimuli. In the auditory domain, the MMN typically occurs 150–200 ms after a deviant stimulus is presented and can last as long as 300 ms (Näätänen et al., 1978; Näätänen, 2007; Garrido et al., 2009). A visual counterpart to the auditory MMN, the visual mismatch negativity (vMMN), is typically observed over parieto-occipital and inferotemporal scalp sites beginning about 140 ms following stimulus onset (Pazo-Alvarez et al., 2003; Maekawa et al., 2005; Czigler et al., 2006; Czigler and Sulykos, 2010).

In both the visual and auditory modalities, the MMN is often considered to be a pre-attentive reaction to change (Dunn et al., 2008), and additionally is thought to be indicative of the comparison of consecutive stimuli, sensory learning, and perceptual acuity (Garrido et al., 2009). Evidence of the pre-attentive nature of the MMN response is typically garnered from findings demonstrating the presence of MMNs in infants (Cheour et al., 2000) and even in comatose patients (Holeckova et al., 2008; Fischer et al., 2010). This quality of the MMN makes it an attractive candidate as an investigative tool for ASD because it can be measured regardless of an individual's level of cognition and/or developmental status, can easily be compared across populations, is independent of language fluency and can even be measured in individuals who are completely non-verbal.

The vMMN has been identified in response to deviances in color, luminance, image contrast, orientation, direction of motion, and spatial frequencies (e.g., Stagg et al., 2004; Näätänen et al., 2007; Li et al., 2012), as well as to more complex visual stimuli such as emotional images or expressions. Variations of the vMMN task have been performed using pictures that elicit emotional responses. In these studies, emotionally neutral images serve as the standard stimulus, and pleasant or unpleasant pictures that have previously been shown to induce either positive or negative emotions are used as deviants. Used in this way, the vMMN is thought to reflect an unconscious, involuntary reaction to change in emotional valence (Kayser et al., 2000; Delplanque et al., 2004, 2005). Zhao and Li (2006) referred to the emotion-elicited vMMN, which is expressed as a larger, or more negative, N170 component and a smaller, or less positive, P250 as the "expressional mismatch negativity" or eMMN (Zhao and Li, 2006; Astikainen and Hietanen, 2009).

Although comparatively little is known about the eMMN, research indicates that it may express hemispheric specialization of emotion processing (Zhao and Li, 2006; Stefanics et al., 2012). However, the nature of this hemispheric specialization is unclear as some findings suggest a right-lateralization in response to positive emotional expressions (Zhao and Li, 2006) and others a left-lateralization for positive emotional expressions (Stefanics et al., 2012). Moreover, recent imaging research further supports the notion that such measures of affective reactivity may be useful as endophenotypic markers of ASD. Spencer et al. (2011) observed significantly reduced activation in brain regions, including the fusiform face area and superior temporal sulcus, in response to happy emotional images in a group of individuals with autism compared with control participants. Most striking was that there was no difference in measures of neural activity between individuals with autism and a group of unaffected siblings of autistic individuals.

The primary aim of the present research was to determine how modulation of the vMMN by emotional expression is related to measures of autism spectrum personality traits in a sample of developmentally typical adults. Using a procedure very similar to the one used by Zhao and Li (2006), vMMN amplitude was measured in response to faces depicting happy or sad emotional expressions amidst a sequence of neutral emotional expressions. One modification to the procedure used by Zhao and Li (2006) was the addition of a non-emotional deviant stimulus, a neutral expression with a green tint added to the image. This non-emotional deviant was used in order to demonstrate that observed variability in the vMMN could be attributed to the emotional content of the deviants. Consistent with prior research (e.g., Zhao and Li, 2006), it was anticipated that the amplitude of the vMMN would be increased for emotionally salient stimuli and that the vMMN to emotional expressions in particular would be lateralized in the right hemisphere. Extending prior research, it was expected that this emotion-based amplitude sensitivity would be decreased in individuals with higher levels of autism spectrum personality traits, reflecting a decreased sensitivity to affective expression.

# **METHODS**

# **PARTICIPANTS**

Forty-five participants (29 male) without an ASD diagnosis from the College of William and Mary volunteered to participate in this research. The average age of the participants was 19.8 (*SD* = 1.67) years. Each participant provided informed consent and the study was performed in accordance with the rules and regulations of the College of William and Mary's IRB. Eight participants were excluded because of excessive movement artifact in the EEG recordings.

# **MEASURES**

After giving informed consent, participants completed the Adult Autism Spectrum Quotient (AQ) while seated behind a privacy screen. The AQ consists of 50 statements regarding social and communication skills, imagination, attention to detail, and sensitivity to change. Participants endorsed each statement with the following ordinal scale: strongly disagree, disagree, agree, and strongly agree. AQ scores were determined in accordance with Baron-Cohen et al. (2001). A score of 25 or above is considered Asperger's and a score of 32 or above meets criteria for a diagnosis of autism. All but one of the participants in this study fell below the level of Asperger's disorder and all of the participants scored below the level of autistic disorder.

# **STIMULI**

Twelve faces were selected from the NimStim database of standardized expressional faces (Tottenham et al., 2009). The faces included six males and six females, with two black, two white, and two Asian faces within each gender. For each face, one image was selected for each of the neutral, sad, and happy expressions, all with closed mouth expressions.

# **PROCEDURE**

Participants were seated 37 inches from an LCD monitor inside an electronically shielded Faraday chamber and were fitted with a pair of Eartone 3a insert earphones. Participants were instructed to fixate on a crosshair presented in the center of the monitor and to passively view a series of individual faces while performing an auditory distracter task. For the distracter task, participants were asked to listen to an auditory track of short stories taken from Shel Silverstein's *Where the Sidewalk Ends* and to count the number of words that began with the letters "T" and "K." At the end of each block of 115 trials, participants were asked to report the number of words beginning with those letters.

The vMMN procedure included twelve blocks of 115 trials. Each trial consisted of the presentation of 6–10 neutral expressions followed by one deviant expression. Thus, there were 460 instances of each of the three deviant stimulus types over the course of the experiment, which occurred with a probability of ∼0.13 on average. The identity of the face was constant within each block of trials, but was counterbalanced across blocks. There were three deviant stimuli presented amidst the sequences of standard stimuli in each trial block (Näätänen et al., 2004). The standard image for each block was a neutral, or non-expressive, face. Two of the deviants were emotional in nature and included faces with happy or sad facial expressions. The third deviant image in each block was the same as the standard image, but with a green tint added (see **Figure 1**). The occurrence of each of the three categories of deviant stimuli was pseudo-randomly ordered, and each category was equally represented within a block. Each face remained on screen for 150 ms. The inter-stimulus interval was randomized to be between 500 and 700 ms, and the inter-block interval was 10 s.
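The trial structure described above can be sketched in a few lines of Python. This is only an illustration, not the authors' stimulus-presentation code: the function name `make_block`, the stimulus labels, the uniform draw of 6–10 standards, and the way deviant types are balanced and shuffled are all assumptions made for the example.

```python
import random

DEVIANTS = ("happy", "sad", "green")  # illustrative labels, not from the source


def make_block(n_trials=115, min_std=6, max_std=10, seed=None):
    """Return one block as a flat list of stimulus labels.

    Each trial is 6-10 'neutral' standards followed by one deviant.
    Deviant types are balanced as evenly as n_trials allows and then
    shuffled (an assumption; the text says only "pseudo-randomly ordered").
    """
    rng = random.Random(seed)
    # Cycle through the three deviant types until there is one per trial.
    deviants = [DEVIANTS[i % len(DEVIANTS)] for i in range(n_trials)]
    rng.shuffle(deviants)
    block = []
    for deviant in deviants:
        # Number of standards before this deviant (uniform draw is assumed).
        block.extend(["neutral"] * rng.randint(min_std, max_std))
        block.append(deviant)
    return block


block = make_block(seed=1)
n_deviants = sum(label != "neutral" for label in block)
```

Twelve such blocks contain 12 × 115 = 1380 deviants in total, which corresponds to the 460 instances per deviant type given above when the types are balanced over the whole experiment. Because 115 is not divisible by three, this sketch balances the types only approximately within a single block.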

#### **DATA ACQUISITION/ANALYSIS**

Electrophysiological data were recorded continuously at 2000 samples per second using a high-impedance DBPA-1 Sensorium bio-amplifier (Sensorium Inc., Charlotte, VT) with an analog high-pass filter of 0.01 Hz and a low-pass filter of 500 Hz. Recordings were made using a fabric cap bearing 72 Ag-AgCl sintered electrodes. EEG recordings were made using a forehead ground and a reference at the tip of the nose. Vertical and horizontal eye movements were recorded from electrodes placed above and below the eyes and from electrodes placed at the lateral canthi, respectively. All impedances were adjusted to within 0–20 kΩ at the start of the recording session.

EEG data were analyzed off-line using EEGLAB. Data were inspected for excessive artifact, and channels containing excessive artifacts over a majority of the recording time were interpolated using a spherical spline. Channel interpolation was required for 12 of the 45 participants. Of those 12 participants requiring channel interpolation, two required the interpolation of three channels, one required the interpolation of two channels, and nine required interpolation of just a single channel. Data were then corrected for both horizontal and vertical ocular artifacts using independent component analysis (Jung et al., 2000). Following the removal of ocular artifacts, the data were segmented between −200 and 800 ms with respect to stimulus onset. Following segmentation, data were baseline corrected and filtered using an IIR Butterworth filter with a low-pass frequency cutoff (half-amplitude) of 20 Hz. Individual trials with voltages outside a −100 to 100 µV range were excluded from analysis. Segmented data were then averaged over trials for each of the standard and deviant stimulus presentations.
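The last three steps of this pipeline (baseline correction, amplitude-based trial rejection, and averaging) can be illustrated with a small pure-Python sketch for a single channel. It is not the EEGLAB code used in the study. The function names are hypothetical, the baseline is assumed to be the pre-stimulus portion of each epoch (the text does not state the interval), and the 20 Hz low-pass filter is omitted for brevity.

```python
def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline samples from every sample."""
    base = sum(epoch[:n_baseline]) / n_baseline
    return [v - base for v in epoch]


def average_erp(epochs, n_baseline, limit_uv=100.0):
    """Average the epochs that stay within +/- limit_uv after baseline correction.

    epochs: list of equal-length lists of voltages in microvolts (one channel).
    Returns (erp, n_kept).
    """
    kept = []
    for epoch in epochs:
        corrected = baseline_correct(epoch, n_baseline)
        # Reject the whole trial if any sample leaves the allowed range.
        if all(-limit_uv <= v <= limit_uv for v in corrected):
            kept.append(corrected)
    if not kept:
        raise ValueError("no epochs survived artifact rejection")
    n_kept = len(kept)
    # Average sample by sample across the surviving trials.
    erp = [sum(samples) / n_kept for samples in zip(*kept)]
    return erp, n_kept


# Toy data: three 4-sample epochs, first two samples treated as baseline.
toy_epochs = [[1.0, 1.0, 3.0, 5.0], [2.0, 2.0, 2.0, 4.0], [0.0, 0.0, 0.0, 500.0]]
erp, n_kept = average_erp(toy_epochs, n_baseline=2)
```

In the toy data the third epoch contains a 500 µV sample and is rejected, so the average is taken over the remaining two trials. With the recording parameters given above (2000 samples per second, epochs starting at −200 ms), a 200 ms pre-stimulus baseline would correspond to `n_baseline=400`.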

vMMN was identified and measured for each condition in the difference waveform generated by subtracting the ERP in response to the Standard image from the ERP in response to the happy, sad, and control deviant images. Combined with prior research (Zhao and Li, 2006), an evaluation of the grand average difference waveforms in **Figure 2** informed the decision to measure vMMN as the mean amplitude between 150 and 425 ms at parieto-occipital electrodes (PO3, PO4, PO7, PO8). A 3 (Emotion: happy, sad, and control) × 2 [Hemisphere: right (PO4, PO8), left (PO3, PO7)] × 2 [Region: medial (PO3, PO4), lateral (PO7, PO8)] repeated measures ANOVA was used to assess amplitude variability across emotional expressions, hemispheres, and medial/lateral regions. Greenhouse–Geisser correction for violations of sphericity was used where appropriate. Guided by the results of the ANOVA, a Pearson correlation coefficient was used to determine the relationship between vMMN amplitude and scores on the AQ.
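The amplitude measure described above (deviant minus standard, averaged over 150–425 ms) can be written out as a short sketch for one electrode. The function names and default arguments are illustrative assumptions; the sampling rate and epoch start are taken from the acquisition description, and the window is treated as inclusive at both ends.

```python
def time_to_index(t_ms, t_start_ms=-200.0, srate_hz=2000.0):
    """Convert a latency in ms to a sample index within the epoch."""
    return round((t_ms - t_start_ms) * srate_hz / 1000.0)


def vmmn_mean_amplitude(deviant_erp, standard_erp, win_ms=(150.0, 425.0),
                        t_start_ms=-200.0, srate_hz=2000.0):
    """Mean of the deviant-minus-standard difference wave inside win_ms."""
    diff = [d - s for d, s in zip(deviant_erp, standard_erp)]
    lo = time_to_index(win_ms[0], t_start_ms, srate_hz)
    hi = time_to_index(win_ms[1], t_start_ms, srate_hz)
    window = diff[lo:hi + 1]  # inclusive of both window edges
    return sum(window) / len(window)


# Toy ERPs: a flat standard and a deviant that is 2 microvolts more negative
# throughout the measurement window.
n_samples = 2001  # -200 to 800 ms at 2000 samples per second
standard = [0.0] * n_samples
deviant = [-2.0 if 700 <= i <= 1250 else 0.0 for i in range(n_samples)]
amplitude = vmmn_mean_amplitude(deviant, standard)
```

With these parameters the 150–425 ms window spans samples 700 through 1250. A more negative value indicates a larger vMMN, which is the sign convention used in the results that follow.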

# **RESULTS**

#### **vMMN**

The grand average ERP in response to standard and deviant stimuli is depicted in **Figure 3** for electrode PO8. The repeated measures ANOVA (Emotion × Hemisphere × Region) indicated significant main effects of Hemisphere, Region, and Emotion, each qualified by significant 2-way interactions and a significant 3-way interaction. The main effect of Hemisphere, *F*(1, 36) = 16.4, *p* < 0.001, indicated that vMMN amplitude was larger (more negative) in the right by comparison with the left hemisphere. The main effect of Region, *F*(1, 36) = 8.8, *p* = 0.005, indicated larger vMMN amplitudes in lateral by comparison with medial electrode sites. The main effect of Emotion indicated that vMMN amplitudes were larger in response to sad emotional expressions than either happy (*p* < 0.01) or control (*p* < 0.001) expressions, which were not statistically different from one another.

deviant stimuli. **Right:** Scalp topographies of mean vMMN amplitude (deviant minus standard) over the 150–425ms epoch for each condition.

These main effects were qualified by a number of interactions, including the 3-way interaction between Hemisphere, Region, and Emotion, *F*(2, 72) = 3.3, *p* < 0.05. **Figure 4**, depicting the mean vMMN amplitudes for each Emotion, Hemisphere, and Region, facilitates the interpretation of this interaction. Inspection of **Figure 4** reveals that lateralization of the vMMN to the right hemisphere was increased for sad and happy by comparison with control conditions and that this effect was largest at lateral electrode positions (e.g., PO8).

#### **CORRELATION BETWEEN vMMN AMPLITUDE AND AUTISM QUOTIENT**

In order to determine how the vMMN amplitude was related to autism spectrum personality traits, a Pearson correlation was used to evaluate the association between AQ score and vMMN amplitude measured at the right lateral electrode site (PO8). Consistent with the expectation that vMMN may be useful as an indicator of affective reactivity in ASD, there was a significant positive correlation between the vMMN amplitude to happy deviants and score on the AQ, *r*(37) = 0.343, *p* < 0.05 (see **Figure 5**).

**FIGURE 3 | Grand average ERP waveforms at electrode site PO8 for each of the standard and deviant conditions.**

The positive nature of this association indicates that individuals with higher scores on the AQ exhibited smaller (more positive) amplitude vMMN responses to happy emotional expressions. **Figure 6** depicts the grand averaged vMMN to happy emotional expressions and the topography of the mean voltage over the 150–425 ms interval used to quantify vMMN amplitude. The correlations between AQ score and vMMN amplitude to sad and control deviants were not statistically significant.
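For readers who want to check the reported correlation, the sketch below computes a Pearson coefficient from paired scores and converts a coefficient into the usual *t* statistic, t = r·sqrt((n − 2)/(1 − r²)). The function names are illustrative, and the sample size of 37 is an inference from the Participants section (45 recruited minus 8 excluded), not a value stated alongside the correlation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Reported coefficient; n = 37 is assumed (45 participants minus 8 exclusions).
t_reported = t_from_r(0.343, 37)
```

Under that assumption the statistic is about 2.16, which exceeds the two-tailed critical value of roughly 2.03 at 35 degrees of freedom and so agrees with the reported *p* < 0.05. Because vMMN amplitudes are negative voltages, a positive *r* here means that higher AQ scores go with less negative (smaller) vMMN responses.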

#### **DISCUSSION**

The overarching goal of this research was to determine how modulation of the vMMN by emotional expression is related to autism spectrum personality traits as indicated by the AQ. Electrophysiological data revealed a vMMN to emotional expression that was lateralized to the right hemisphere, a finding consistent with prior research (Blonder et al., 1991; Zhao and Li, 2006; Kimura et al., 2011a,b; Stefanics et al., 2012) and the "right hemisphere hypothesis" (Brood et al., 1998) stating that the right hemisphere is specialized for affective processing. Additionally, a significant positive correlation between vMMN amplitude to happy emotional deviants and the level of autistic personality traits suggests that the vMMN may be useful as a tool for measuring affective reactivity in autism.

**response (shaded area indicates 150–425 ms epoch over which the vMMN difference was quantified).**

Significant differences in vMMN amplitude were also observed between happy and sad emotional expressions, irrespective of the participant's AQ score. Similar effects have been described previously and are thought to be attributable to an inherent "negativity bias," which describes a predisposition to allocate early processing resources to negative emotional expressions (Stefanics et al., 2012). In fact, Stefanics et al. (2012) report that this negativity bias may appear as early as 195–275 ms following stimulus onset and be localized to the right hemisphere. However, the fact that the correlation between vMMN amplitude to sad expressions and AQ score was not significant indicates that the impact of "negativity bias" on the vMMN may be independent of the effects of reduced affective reactivity.

The finding that the AQ score was selectively related to vMMN amplitude in response to happy expressions was unexpected in light of related research demonstrating more general deficits of affective processing in ASD, or even contradictory patterns in some cases (e.g., Blair, 2005; Wallace et al., 2011; Mazefsky et al., 2012; Stefanics et al., 2012). However, this finding is consistent with research indicating low levels of approach motivation and diminished positive affect in individuals diagnosed with ASD (Garon et al., 2009). Additionally, research using startle probe methodology indicates an abnormal profile of affective reactivity in individuals with ASD that is driven by an aberrant psychophysiological response to only positive affect (Wilbarger et al., 2009). Imaging research also indicates that, by comparison with developmentally typical children, individuals with ASD exhibit reduced activation of brain areas like the fusiform face area and superior temporal sulcus in response to positive emotions (Spencer et al., 2011). Remarkably, Spencer et al. (2011) also demonstrate that this reduced affective reactivity is present in unaffected siblings of children with autism compared with controls without a family history of autism. Finally, we interpret a selective reduction of affective sensitivity to positive emotion to be consistent with a negative experience of social interactions in general. Whereas a reduction in sensitivity to negative but not positive affect might actually lead to a more positive overall experience in the context of social interactions, a selective deficit in the processing of positive affect would be expected to lead to an overall negative social experience.

One potential limitation of this study is the fact that the current design, similar to the one used by Zhao and Li (2006), did not counterbalance the designation of standard and deviant stimuli over blocks of the experiment. In other words, the expectancy violation which is thought to be elicited by the appearance of the less-probable emotional or control stimuli and thought to give rise to the vMMN component in the present study is confounded with differences between the happy, sad, and control images on physical dimensions other than emotional valence. This confound complicates direct comparisons between the vMMNs to the various emotional deviants because it is impossible to know whether observed differences are due to the change in affective valence or changes on other physical dimensions of the stimuli. However, these methodological concerns are assuaged by the fact that a similar pattern of results has been shown for happy and fearful emotional expressions using a fully counterbalanced design wherein responses to happy and fearful emotional expressions each served as "standards" and "deviants" at different points of the experimental procedure (Stefanics et al., 2012), avoiding the problem of confounding physical differences with violations of affective expectancy.

Another important consideration for future research may be gender differences in measures of affective reactivity using the vMMN. This may be important because 64% of the participants in the present sample were male, whereas data suggest that males are four times more likely than females to be diagnosed with autism (Lord et al., 2000). The present sample was drawn from the participant pool at a small university; thus, it will also be important to determine whether this observed relationship holds in a more diverse population with a more broadly distributed range of traits on the autism spectrum or even an ASD diagnosis.

Notwithstanding these limitations, the present results compel future research to determine how psychophysiological markers, such as the vMMN, may be successfully used to index affective reactivity in individuals with ASD. The fact that the vMMN amplitude was significantly correlated with the measures of behavior indexed by the AQ is important because it complements other recent research indicating that psychophysiological indices like the vMMN are not just epiphenomena, but have explicit behavioral relevance (Stefanics and Czigler, 2012). Moreover, because cognitive-behavioral interventions (CBI) have been shown to be effective in improving social interactions in children with high-functioning autism (Bauminger, 2002), the vMMN may prove to be a useful indicator of treatment efficacy. Finding a tangible neurological marker for ASD could be an important step forward in the development of improved diagnostic procedures and may even reduce the inappropriate labeling of socially awkward, but neurotypical children.

# **REFERENCES**

by emotional valence studied through event-related potentials in humans. *Neurosci. Lett.* 356, 1–4.


electroencephalographic artifacts by blind source separation. *Psychophysiology* 37, 163–178.


*Ment. Retard. Dev. Disabil. Res. Rev.* 4, 90–96.


research of central auditory processing: a review. *Clin. Neurophysiol.* 118, 2544–2590.


stimuli enhance startle response. *Neuropsychologia* 47, 1323–1331.

Zhao, L., and Li, J. (2006). Visual mismatch negativity elicited by facial expressions under non-attentional condition. *Neurosci. Lett.* 410, 126–131.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 August 2012; accepted: 04 December 2012; published online: 19 December 2012.*

*Citation: Gayle LC, Gal DE and Kieffaber PD (2012) Measuring affective reactivity in individuals with autism spectrum personality traits using the visual mismatch negativity event-related brain potential. Front. Hum. Neurosci. 6:334. doi: 10.3389/fnhum.2012.00334*

*Copyright © 2012 Gayle, Gal and Kieffaber. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# Electrophysiological evidence of atypical visual change detection in adults with autism

# *H. Cléry, S. Roux, E. Houy-Durand, F. Bonnet-Brilhault, N. Bruneau and M. Gomot\**

*UMR 930 Imagerie et Cerveau, Inserm, Université François Rabelais de Tours, CHRU de Tours, France*

#### *Edited by:*

*István Czigler, Hungarian Academy of Sciences, Hungary*

#### *Reviewed by:*

*Alexandra Bendixen, University of Leipzig, Germany; Estate M. Sokhadze, University of Louisville, USA*

#### *\*Correspondence:*

*M. Gomot, INSERM U930, Centre de Pédopsychiatrie, CHRU Bretonneau, 2 bd Tonnellé, 37044 Tours Cedex 9, France. e-mail: gomot@univ-tours.fr*

Although atypical change detection processes have been highlighted in the auditory modality in autism spectrum disorder (ASD), little is known about these processes in the visual modality. The aim of the present study was therefore to investigate visual change detection in adults with ASD, taking into account the salience of change, in order to determine whether this ability is affected in this disorder. Thirteen adults with ASD and 13 controls were presented with a passive visual three-stimulus oddball paradigm. The findings revealed atypical visual change processing in ASD. Whereas controls displayed a vMMN in response to deviant stimuli and a novelty P3 in response to novel stimuli, patients with ASD displayed a novelty P3 in response to both deviant and novel stimuli. These results thus suggest atypical orientation of attention toward unattended minor changes in ASD that might contribute to the intolerance of change.

**Keywords: visual change detection, ERPs, vMMN, autism, adults**

# **INTRODUCTION**

Increased attention has been paid in the past 10 years to the study of the event related potential (ERP) evoked by automatic change detection in the visual modality: the visual mismatch negativity (vMMN). This electrophysiological component has been extensively described in healthy adults as a negative component culminating over occipital sites between 150 and 350 ms in response to various deviant stimuli such as direction of movement (Kremlacek et al., 2006), form (Besle et al., 2005), orientation (Astikainen et al., 2008), spatial frequency (Maekawa et al., 2005), and color (Czigler et al., 2004). vMMN is thought to reflect the automatic pre-attentional detection of a difference between the active sensory memory trace of a recent repeated event (standard) and an incoming deviant stimulus (for review see Kimura, 2012), thus reflecting, as proposed in the auditory modality (Näätänen, 1995; Garrido et al., 2009), an online updating of the model for predicting sensory inputs. This response to automatic visual change is also known to be dependent on the degree-of-deviance as shorter MMN latencies have been recorded for greater deviant–standard differences (Czigler et al., 2002). Moreover, if the salience of change exceeds a certain threshold, MMN can be followed by an additional P3a component reflecting involuntary orientation of attention toward the rare event (Czigler, 2007).

vMMN has been investigated in several psychiatric disorders such as major depression (Chang et al., 2011; Qiu et al., 2011), schizophrenia (Urban et al., 2008), and cognitive decline (Tales et al., 2002a,b), which are characterized by sensory and cognitive dysfunction in several domains such as attention, memory, and executive functions.

It is highly relevant to focus on automatic change detection in autism spectrum disorders (ASD) in the light of clinical evidence that individuals with ASD react in an unusual way to unattended events that occur in their environment or that disrupt their routines. These atypical reactions may be expressed in the form of tantrums as a response to change, or in the form of restricted interests and repetitive or stereotyped behaviors that persist with age (Kobayashi and Murata, 1998; Richler et al., 2010). Individuals with ASD try to impose predictability, with insistence on repetition and sameness (McEvoy et al., 1993). Resistance to change may also occur at the sensory level; individuals with ASD clinically display unusual behaviors in response to changes in stimuli in all sensory modalities (Boyd et al., 2010). Moreover, several behavioral studies and results from questionnaires have revealed unusual sensory responses such as hyper-reactivity or hypo-reactivity in all sensory modalities (Khalfa et al., 2004; Leekam et al., 2007; Reynolds and Lane, 2008; Ashwin et al., 2009; Ben-Sasson et al., 2009), both sometimes occurring in the same subject. Such paradoxical responses to sensory stimuli have led to a lack of consensus on the exact nature of the underlying sensory dysfunction, but have been hypothesized to contribute to stereotyped behaviors and quest for sameness (Gerrard and Rugg, 2009). Moreover, study of relationships between clinical and electrophysiological findings has demonstrated that atypical brain reactivity in response to sensory changes occurring in a stimulus sequence is related to the degree of behavioral intolerance of change as assessed by the Behavioral Summarized Evaluation (BSE-R, Barthelemy et al., 1997) (Gomot et al., 2011). As a whole, these features indicate that intolerance of change in ASD may be rooted in basic abnormalities in the processing of sensory information, and especially in the automatic processing of changing stimuli (Gomot and Wicker, 2012).

A substantial body of electrophysiological findings provides evidence for atypical processing of auditory change in ASD subjects compared to typically developing controls but the results in terms of MMN amplitude and latency have been inconsistent (for review see O'Connor, 2012). However, only one study has investigated the brain processes involved in automatic change detection in ASD using scalp potentials (SPs) and scalp current densities (SCDs) mapping (Gomot et al., 2002). This study showed shorter MMN latency in ASD associated with abnormal functioning of a neural network, including the left frontal cortex. These findings strongly suggest particular processing of auditory stimulus change in children with autism that might be related to their behavioral need to preserve sameness.

A few studies have investigated visual change detection in ASD *per se* but the protocols used have mostly involved active target detection (Kemner et al., 1994; Sokhadze et al., 2009). The majority of results indicated smaller P3 amplitude in response to novel visual events in those with ASD than in controls (Courchesne et al., 1989; Ciesielski et al., 1990). In a three stimulus oddball paradigm, Sokhadze et al. (2009) showed that ASD subjects displayed a delayed P3a response to visual novel stimuli, suggesting that individuals with ASD require more time to process the information needed for the successful differentiation of target and novel stimuli. These findings indicating differences in amplitudes and longer latencies in the electrophysiological index of attention-dependent novelty processing suggest unusual processing of violation of sensory expectancy in ASD, possibly due to difficulties in building flexible predictions about an upcoming event.

Maekawa et al. (2011) used a visual oddball paradigm comprising standard, deviant, and target windmill patterns in ASD. The participants were instructed to press a button when they recognized the target while they listened to a story delivered binaurally through earphones. The results revealed intact vMMN in terms of latency and amplitude in response to non-target deviants but a smaller P3 in response to targets. However, it can be argued that the mismatch response recorded in this study did not purely reflect pre-attentional processing of change detection, as stimuli were presented in the attentional visual field.

Only one study has investigated visual change detection in passive conditions in ASD (Cléry et al., 2013), using an oddball paradigm consisting of standard, deviant, and novel stimuli in children with ASD. Findings suggested that neural networks involved in the perception of visual changes in children with ASD are atypical and less modulated by the salience of stimuli than in typically developing children.

Thus no study to date has reported vMMN in adults with ASD in passive conditions. The aim of the study presented here was therefore to investigate automatic deviancy detection in the visual modality in adults with ASD in order to determine whether this pre-attentional ability was affected in this disorder. To verify whether the unusual sensitivity of the neural networks involved in the perception of even a minor change is observable in adults with ASD, the same three-stimulus oddball paradigm as in our previous study conducted in children (Cléry et al., 2013) was used. SPs and SCDs mapping was used to conduct spatio-temporal analyses of brain activation elicited by unattended changing visual stimuli.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Thirteen adults with ASD (11 males and 2 females), aged 18 to 30 [mean age (years; months ± SD): 26; 2 ± 5] were recruited from the Autism Centre of the University Hospital of Tours. Diagnosis was made according to DSM-IV-R criteria (APA, 2000) and using the Autism Diagnostic Observation Schedule-Generic (ADOS-G, fourth module) (social interaction + communication scores mean ± SD: 10 ± 4; threshold for ASD = 7). Intelligence quotients (IQ) were assessed by the Wechsler Adult Intelligence Scale (WAIS-III). This scale provided overall (IQ: 89 ± 19), verbal (vIQ: 91 ± 17), and performance (nvIQ: 88 ± 24) intelligence quotients (mean ± SD). Thirteen healthy volunteers also participated in the study [mean age (years; months ± SD): 24; 3 ± 2; 8 males and 5 females]. None of these healthy adults had a previous history of neurological or psychiatric problems. All participants had normal or corrected-to-normal vision and none were receiving psychotropic medication. The Ethics Committee of the University Hospital of Tours approved the protocol. Written informed consent was obtained from all participants.

# **STIMULI AND PROCEDURE**

Change detection processes were studied using a passive visual oddball paradigm with three types of dynamic stimuli: "Standard" (probability of occurrence *p* = 0.82), "Deviant" (probability of occurrence *p* = 0.09) and "Novel" (probability of occurrence *p* = 0.09). As shown in **Figure 1**, these stimuli consisted of the deformation of a circle into an ellipse either horizontally (Standard) or vertically (Deviant) or into another shape (Novel), adapted from Besle et al. (2005). Each stimulus consisted of seven successive images presented within 140 ms (i.e., 50 images per second), which resulted in apparent motion in the stimuli. The distinction between "deviants" and "novels" was not based on their probability of occurrence but on their salience. Whereas the deviant was always the same stimulus and only differed from the standard in the orientation of the ellipse, novel stimuli were always different non-identifiable shapes. Stimuli were presented with a 650 ms inter-stimulus interval. The viewing distance was set at 120 cm (visual angle 2°). There were 2 runs of 815 dynamic stimuli. To avoid confounds caused by physical features, Deviants were swapped with Standards halfway through the sequence. Total recording lasted 25 min. In order to present the visual stimuli within the visual field but outside the focus of attention, subjects were required to undertake a distractive task. They were asked to fixate the central cross (that appeared at the center of the circles) and to respond as quickly as possible to its disappearance (Target, 9% of the trials). The disappearance of the fixation cross (target) was never in synchrony with the presentation of deviant or novel stimuli but always occurred during a standard trial.
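The sequence structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `make_run`, the rounding of trial counts, the fixed random seed, and the free shuffling of trials are hypothetical choices, not the authors' stimulus-delivery code. The sketch also does not model the swap of Deviants and Standards halfway through the sequence, nor any constraint on consecutive rare stimuli, since the text does not specify how these were implemented.

```python
import random

def make_run(n_trials=815, p_deviant=0.09, p_novel=0.09, p_target=0.09, seed=0):
    """Build one illustrative run as a list of (stimulus, has_target) tuples.

    Trial counts are rounded from the stated probabilities (an assumption).
    """
    rng = random.Random(seed)
    n_deviant = round(n_trials * p_deviant)
    n_novel = round(n_trials * p_novel)
    n_standard = n_trials - n_deviant - n_novel

    stimuli = (["standard"] * n_standard
               + ["deviant"] * n_deviant
               + ["novel"] * n_novel)
    rng.shuffle(stimuli)

    # Targets (disappearance of the fixation cross) are placed only on
    # standard trials, as stated in the text.
    standard_idx = [i for i, s in enumerate(stimuli) if s == "standard"]
    n_target = round(n_trials * p_target)
    target_idx = set(rng.sample(standard_idx, n_target))

    return [(s, i in target_idx) for i, s in enumerate(stimuli)]

run = make_run()
counts = {k: sum(1 for s, _ in run if s == k)
          for k in ("standard", "deviant", "novel")}
n_targets = sum(1 for _, has_target in run if has_target)
```

With 815 trials, rounding gives 73 deviants, 73 novels and 669 standards, which corresponds to proportions of roughly 0.09, 0.09 and 0.82.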

#### **ACQUISITION AND DATA ANALYSIS**

The behavioral responses measured were mean reaction times (in ms) and response accuracy, calculated by taking into account the rates of hits (correct response less than 2 s after target disappearance), false alarms to non-target stimuli (response without target disappearance) and missed targets (no response within 2 s after target disappearance), according to the formula: (targets − missed targets)/(targets + false alarms) × 100 (Simon and Boring, 1990).

Electroencephalographic (EEG) data were recorded from 31 Ag/AgCl electrodes referenced to the nose. Electrodes were placed according to the international 10–10 system (Chatrian et al., 1985): Fz, Cz, Pz, Iz, F3, C3, P3, O1, T3, T5, FC1, CP1, FT3, TP3, PO3 and their homologous locations on the right hemiscalp. Additional electrodes were placed at M1 and M2 (left and right mastoid sites), IM1 and IM2 (midway between M1–Iz and M2–Iz), and FFz (midway between Fz and Fpz). The whole experiment was controlled by a Compumedics NeuroScan EEG system (Synamps amplifier, Scan 4.3, and Stim2 software). The impedance value of each electrode was less than 10 kΩ. In addition, vertical eye movements (EOG) were recorded using two electrodes placed above and below the right eye. The EEG and vertical EOG were filtered with an analog bandpass filter (0.3–70 Hz) and digitized at a sampling rate of 500 Hz. Eye-movement artifacts were eliminated using a spatial filter transform developed by NeuroScan. The spatial filter is a multi-step procedure that generates an average eye blink, utilizes a spatial singular value decomposition based on principal component analysis (PCA) to extract the first component and covariance values, and then uses those covariance values to develop a filter that retains the EEG activity of interest. EEG periods with movement artifacts were manually rejected. EEG epochs were averaged separately for the standard, the deviant and the novel stimuli over a 700 ms analysis period, including a 100 ms pre-stimulus baseline. The ERPs to deviants and novels included at least 120 trials for each subject. MMN was measured from the difference waves obtained by subtracting the standard-stimulus ERP from the deviant-stimulus ERP. Finally, responses to novelty were also examined by subtracting the standard-stimulus ERP from the novel-stimulus ERP.
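The response accuracy formula given above is simple arithmetic, and a minimal Python sketch may make it concrete. The function name `response_accuracy` and the example counts are hypothetical and are not taken from the study data.

```python
def response_accuracy(n_targets, n_missed, n_false_alarms):
    """Accuracy (%) = (targets - missed targets) / (targets + false alarms) * 100."""
    denominator = n_targets + n_false_alarms
    if denominator == 0:
        raise ValueError("accuracy is undefined without targets or false alarms")
    return (n_targets - n_missed) / denominator * 100

# Hypothetical counts for one participant (not from the study):
# 100 targets, 3 missed, 2 false alarms gives about 95.1%.
example = response_accuracy(n_targets=100, n_missed=3, n_false_alarms=2)
```

Because false alarms enter the denominator, a participant who responds indiscriminately is penalized even when no target is missed.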

The ELAN software package for analysis and visualization of EEG-ERPs was used (Aguera et al., 2011). Maximum amplitudes and peak latencies of the sensory ERP and mismatch responses were measured manually for each subject within an 80 ms time window around the peak of the grand average waveforms specific to each group.
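As a rough illustration of the difference-wave and peak-measurement steps, the sketch below uses plain Python lists. The authors measured peaks manually in ELAN; the automated search here is a swapped-in stand-in, and it assumes that the 80 ms window is centered on the grand-average peak. The function names and the synthetic waveform are hypothetical. Only the 500 Hz sampling rate, the 700 ms epoch, and the 100 ms baseline come from the text.

```python
SAMPLING_RATE_HZ = 500                    # from the text
EPOCH_START_MS = -100                     # 100 ms pre-stimulus baseline
MS_PER_SAMPLE = 1000 / SAMPLING_RATE_HZ   # 2 ms per sample
N_SAMPLES = 350                           # 700 ms epoch at 500 Hz

def sample_to_ms(index):
    """Convert a sample index within the epoch to latency in ms."""
    return EPOCH_START_MS + index * MS_PER_SAMPLE

def difference_wave(rare_erp, standard_erp):
    """Subtract the standard-stimulus ERP from the deviant (or novel) ERP."""
    return [r - s for r, s in zip(rare_erp, standard_erp)]

def peak_in_window(wave, center_ms, width_ms=80, polarity=-1):
    """Return (latency_ms, amplitude) of the extreme value in the window.

    polarity=-1 searches for the most negative point, +1 the most positive.
    Centering the window on center_ms is an assumption of this sketch.
    """
    lo, hi = center_ms - width_ms / 2, center_ms + width_ms / 2
    candidates = [(sample_to_ms(i), v) for i, v in enumerate(wave)
                  if lo <= sample_to_ms(i) <= hi]
    return max(candidates, key=lambda point: polarity * point[1])

# Synthetic single-channel data (not study data): a flat standard ERP and
# a deviant ERP with a triangular negative deflection centered at 210 ms.
standard = [0.0] * N_SAMPLES
deviant = [-1.5 * max(0.0, 1 - abs(sample_to_ms(i) - 210) / 40)
           for i in range(N_SAMPLES)]

mismatch = difference_wave(deviant, standard)
latency_ms, amplitude = peak_in_window(mismatch, center_ms=210)
```

On this synthetic waveform the search returns a latency of 210 ms and an amplitude of −1.5, the values built into the toy deflection.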

SP maps were generated using a two-dimensional spherical spline interpolation and a radial projection from Oz (back views) or from Cz (top views), which respects the length of the meridian arcs. SCDs were estimated by computing the second spatial derivative of the interpolated potential distributions (Perrin et al., 1989). Topographic differences were specifically tested in the interactions between groups and electrodes on amplitude-normalized data (McCarthy and Wood, 1985). For each condition, measurements for each subject were normalized by finding the maximum and minimum values across all sites and by subtracting the minimum from each data point, and dividing it by the difference between maximum and minimum.
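The amplitude normalization described above is a min-max rescaling across electrode sites. A minimal Python sketch follows; the function name and the example amplitudes are hypothetical.

```python
def normalize_topography(amplitudes):
    """Rescale amplitudes across sites to the range 0-1: (x - min) / (max - min)."""
    lowest, highest = min(amplitudes), max(amplitudes)
    if highest == lowest:
        raise ValueError("all sites have the same amplitude; cannot normalize")
    return [(a - lowest) / (highest - lowest) for a in amplitudes]

# Hypothetical amplitudes (in µV) at four sites for one subject and condition.
site_amplitudes = [-1.5, -1.6, -0.4, 0.2]
scaled = normalize_topography(site_amplitudes)
```

After this rescaling, every subject's values span the same range, so a Group × Electrode interaction reflects differences in the shape of the scalp distribution rather than in overall amplitude.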

For each condition, amplitude and latency values were submitted to a mixed-model ANOVA with group (Controls vs. ASD) as the between-subjects factor and electrode location [left vs. right Occipito-Parieto-Temporal regions (left OPT: O1, PO3, P3, T5; right OPT: O2, PO4, P4, T6)] as the within-subjects factor. Within each group, the statistical significance of ERP amplitude compared to 0 was tested by Student's *t*-test corrected for multiple comparisons, using the statistical-graphical method of Guthrie and Buchwald (1991), as previously used in several electrophysiological studies (Colin et al., 2002; Vidal et al., 2008; Graux et al., 2012). This method provides a table indicating the minimum number of consecutive time samples that must show a significant difference in the ERP for an effect to be declared significant over a given time period. For our sample of 13 subjects per group and an analysis period of 600 ms (from 0 to 600 ms, i.e., 300 sampling points), the minimum number corresponded to 12 consecutive time points (i.e., 24 ms) with *p*-values below the 0.05 significance level.
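The consecutive-sample criterion can be sketched as a search for sufficiently long runs of significant time points. In this sketch the threshold of 12 samples is taken from the text as a parameter; the code does not derive it from the Guthrie and Buchwald tables. The function name and the *p*-values are hypothetical.

```python
def significant_runs(p_values, alpha=0.05, min_run=12):
    """Return (start, end) index pairs, end exclusive, for every run of at
    least min_run consecutive samples with p < alpha."""
    runs, start = [], None
    for i, p in enumerate(p_values):
        if p < alpha:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    if start is not None and len(p_values) - start >= min_run:
        runs.append((start, len(p_values)))
    return runs

SAMPLING_RATE_HZ = 500
n_samples = 600 * SAMPLING_RATE_HZ // 1000   # 300 points in the 0-600 ms period
min_run_ms = 12 * 1000 / SAMPLING_RATE_HZ    # 12 samples correspond to 24 ms

# Hypothetical p-values for one electrode (not study data).
p_values = [0.5] * n_samples
p_values[50:80] = [0.01] * 30     # 30 consecutive samples (60 ms): retained
p_values[200:205] = [0.01] * 5    # 5 consecutive samples (10 ms): discarded
runs = significant_runs(p_values)
```

In the toy data only the 30-sample run survives, which shows how the criterion filters out brief, isolated significant points.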

# **RESULTS**

#### **BEHAVIORAL RESULTS**

Both groups performed the distractive task well, indicating that all subjects had looked at the screen and thus received the visual stimuli. Indeed, no significant between-group difference was found in either response accuracy (Ctrl: 95.2% ± 3.6; ASD: 94.4% ± 3.3; n.s.) or reaction times (Ctrl: 443 ± 108 ms; ASD: 475 ± 77 ms; n.s.).

#### **ELECTROPHYSIOLOGICAL ANALYSIS**

Both groups presented the same morphology and distribution of responses to standard visual stimuli, clearly localized over occipito-parietal sites, at O1, PO3, P3, T5 in the left hemisphere (left OPT) and at O2, PO4, P4, T6 in the right hemisphere (right OPT) (**Figure 2**). Unless specified, evaluations of left and right OPT responses were therefore calculated by averaging values measured at these four electrode sites on each hemisphere and statistical analyses of variance were conducted on these two sets of electrodes (left and right OPT as within subjects factor).

# **RESPONSES TO STANDARD STIMULI**

The obligatory responses consisted of a negative–positive complex peaking over parieto-occipital regions. In controls, a negative component peaked at a latency of 170 ms (called N2) and was followed by a more central positive wave culminating around 240 ms (P2) (**Table 1**). Compared to those of the controls, the responses in the ASD group to standard stimuli did not differ significantly in latency but displayed significantly smaller amplitudes [N2: *F*(2, 23) = 4.08, *p* < 0.05; P2: *F*(2, 23) = 4.15, *p* < 0.05].

#### **RESPONSES TO DEVIANT AND NOVEL STIMULI**

As shown in **Figure 3**, both groups had almost the same morphology and distribution of responses to the deviant as to the standard stimuli, composed of an N2 peaking over occipito-parietal sites at left OPT and right OPT and a central P2. Compared to controls, the ASD group displayed significantly smaller amplitudes of responses to deviant stimuli, but only for the N2 [*F*(2, 23) = 3.57, *p* < 0.05]. In addition, the P2 in response to deviant stimuli was delayed in the ASD group [*F*(2, 23) = 5.07, *p* < 0.05].

In response to novel stimuli, participants of the control group displayed a biphasic N2, peaking over occipito-parietal sites at left OPT and right OPT, first at 160 ms (early N2) and then at 320 ms (late N2), followed by a novelty P3 culminating at 440 ms (cf. **Table 1**). Compared to controls, adults with ASD did not display comparable responses to visual novelty in terms of morphology. Indeed, they only showed an early N2, also peaking over occipito-parietal sites at left OPT and right OPT at 160 ms. Both groups displayed similar early N2 topography, as indicated by the results of the mixed-model ANOVA: Group (Control vs. ASD) × Hemisphere (left, right) × Electrode site (Occipital, Parieto-Occipital, Parietal, Temporal) [*F*(3, 72) = 0.27, n.s.]. This component was followed by a novelty P3 culminating at 440 ms. Neither the early N2 nor the novelty P3 showed significant between-group differences in terms of amplitude or latency (**Table 1**).

**Table 1 | Mean amplitudes and latencies of the responses to standard, deviant, and novel visual stimuli in each group.**

∗*Significant between-group difference, p* < *0.05.*

#### **DEVIANCE PROCESSING**

The difference waves were obtained by subtracting the standard-stimulus ERP from the deviant-stimulus ERP (**Figure 4A**).

In the control group, vMMN was elicited by the deviant stimuli, peaking over occipito-parietal sites at 210 ms (lOPT: 214 ms ± 22, −1.5 µV ± 1.0; rOPT: 210 ms ± 21, −1.6 µV ± 0.9; frontal: 226 ms ± 28, −1.1 µV ± 0.7) with a frontal negative deflection peaking later at around 230 ms. **Figure 4B** (left panel) shows the amplitudes that differed significantly from 0 at 29 electrode sites between 0 and 600 ms post-stimulus in the control group. Using the criteria defined in the "Materials and Methods" section, two periods of significant amplitude were distinguished: (1) from 180 to 240 ms after stimulus onset over occipito-parietal sites and (2) from 210 to 250 ms over fronto-central sites.

In adults with ASD (**Figure 4A**), a vMMN-like response was observed over occipito-parietal sites from 150 ms, followed as in controls by a frontal negative deflection peaking around 215 ms. Finally, the automatic deviance detection process was completed by an additional significant positive component over occipito-temporo-parietal sites at 460 ms that we labeled Mismatch Positivity (MMP450) (lOPT: 1.55 ± 1.22 µV; rOPT: 1.58 ± 1.35 µV). However, results of the statistical analysis displayed in **Figure 4B** (right panel) indicated that in ASD only the MMP450 was statistically different from 0.

As the two groups did not display the same significant components, no direct statistical comparison between groups was performed.

# **TOPOGRAPHICAL ANALYSES**

#### *Deviant–Standard ERPs*

The time course of the visual change-detection process in the 150–250 ms latency range is presented in **Figure 5A** for each group. The voltage maps in controls displayed negative potential fields over the bilateral occipito-parieto-temporal sites from 200 ms which reached the frontal region at around 230 ms. In the ASD group, SP maps showed a completely different time course of visual change detection. Although non-significant, a first negative potential field was revealed over frontal sites as early as 150 ms, associated with negative activity over infero-temporo-occipital sites, and from 200 ms an additional stable central positive activity was observed. Finally, SP maps calculated at the MMP450 peak latency showed a large bilateral positive activity over the occipito-parietal areas in adults with ASD, whereas no significant activity was measured in controls.

The SCD distributions of the change detection response at the latency of the vMMN for each group are shown in **Figure 5B** (bottom). SCD maps showed the involvement of both occipito-parietal and infero-temporo-occipital regions in both groups, as attested by the bilateral pattern of sinks recorded over occipital and parietal sites.

#### *Comparison of Deviant–Standard and Novel–Standard ERPs*

**Figure 6** shows SP and SCD maps in ASD calculated in the latency range of the novelty P3 in response to novel stimuli (Novel–Standard ERPs) and of the MMP450 recorded in response to deviant stimuli (Deviant–Standard ERPs). SP maps showed for both responses a positive activity over bilateral occipito-parietal regions. SCD maps to both types of stimuli mainly showed bilateral occipito-parietal sources associated with a medial occipito-parietal current sink.

In order to determine whether the MMP450 (deviancy detection) and the novelty P3 (novelty detection) reflect the same component in ASD, we statistically compared the topographies of these two responses, using a mixed-model ANOVA: Condition (deviancy detection vs. novelty detection) × Hemisphere (left, right) × Electrode site (Occipital, Parieto-Occipital, Parietal, Temporal). Adults with ASD displayed a novelty P3 topography similar to that of the MMP450, as no significant topographic differences were found between these two conditions in this group [*F*(3, 36) = 1.12, n.s.]. This indicates that the MMP450 and the novelty P3 represent the same response. Henceforth, the MMP450 in ASD is thus labeled novelty P3.

# **DISCUSSION**

The study presented here is the first to characterize electrophysiological indices of automatic visual deviancy processing in adults with ASD in passive conditions. Using a passive oddball paradigm, an atypical visual process was revealed in adults with ASD compared to controls.

The electrophysiological pattern of obligatory sensory responses to standard stimuli reported here showed the same morphology of response in both groups and consisted of a negative component peaking at around 170 ms (N2) followed by a positive component culminating at around 240 ms (P2). The N2 recorded here could reflect the main motion-onset visual evoked potential described by Kuba et al. (2007) peaking at around 150–200 ms and thought to be generated in the extrastriate temporo-occipital or parietal cortex (Nakamura and Ohtsuka, 1999; Henning et al., 2005). This motion-onset N2 is classically followed by a P2 deflection, usually peaking at around 240 ms and shown to depend on the type of motion presented (Kuba et al., 2007). These two sensory responses displayed significantly smaller amplitudes in adults with ASD than in controls. Such smaller amplitudes were similarly observed in response to deviant visual stimuli. It should be noted that the visual stimuli used consisted of the dynamic deformation of a circle into an ellipse in either one or another direction, resulting in two different shapes and thus involving two visual dimensions: object shape and motion direction. This kind of visual stimuli involving changes in form and motion was chosen to increase the chances of obtaining vMMN by stimulating the mismatch process with two physical stimulus features. Indeed, the visual system is functionally divided into at least two pathways (for review see Farivar, 2009). The ventral pathway is generally specialized for fine detail, static form, and color perception, whereas the dorsal pathway is predominantly responsible for processing and perceiving moving stimuli, locating objects and directing visually guided action. A number of studies have reported low-level perception deficits in ASD, mainly characterized by higher motion coherence thresholds, but intact performance on form coherence tasks, suggesting a specific dysfunction of the visual dorsal pathway (Spencer et al., 2000; Milne et al., 2002; Braddick et al., 2003). The hypothesis of specific dorsal stream vulnerability in ASD has been questioned by findings suggesting an additional ventral stream deficit in ASD (Spencer and O'Brien, 2006) using a spatial-form-coherence detection task. The specific features of our dynamic stimuli could explain the atypical morphology of the sensory response in ASD, as numerous studies pointed to abnormalities in coherent motion perception and in local motion processing in ASD (for review see Simmons et al., 2009). Nevertheless, despite the large number of studies published on visual ERPs in autism, direct comparison of our results with previous findings is not easy as, to our knowledge, no study has reported ERPs in response to stimuli similar to those used in this study.

Visual MMN was identified in the control group, culminating over occipito-parietal sites at around 210 ms, followed by an anterior negative component peaking at 230 ms. This finding confirms previous studies suggesting the location of vMMN generators in both the visual occipital (Czigler et al., 2004; Pazo-Alvarez et al., 2004; Amenedo et al., 2007) and the frontal areas (Czigler et al., 2004; Urakawa et al., 2010). In adults with ASD, the visual MMN was almost absent. However, in view of the SP and SCD maps, it cannot be excluded that adults with ASD displayed a mismatch process comparable to that of the controls, but of smaller amplitude that did not reach significance. All the studies that have investigated vMMN in psychiatric disorders characterized by sensory and cognitive dysfunctions (for review see Maekawa et al., 2012) have revealed a significantly smaller vMMN in psychiatric patients than in controls. Taken together, these results suggest that impaired vMMN generation might help to characterize elementary cognitive processing in psychiatric disorders.

In ASD, the mismatch response was mainly characterized by a significant positive component culminating over bilateral occipito-parietal sites at around 460 ms, which we first labeled MMP450. Increasing the salience of visual change by presenting novel stimuli evoked a biphasic negative deflection (early N2 and late N2) followed by a positive novelty P3 component in controls. Adults with ASD did not display the same morphology of responses to novel stimuli as they only showed an early N2 followed by a novelty P3. Interestingly, the MMP450 recorded in response to deviance and the novelty P3 recorded in response to novel stimuli in ASD appeared at similar latencies and displayed the same scalp topography, thus suggesting that they reflect the same process. Because novelty P3 is thought to reflect involuntary switching of attention toward stimulus changes occurring outside the focus of attention (Pontifex et al., 2009), it can be hypothesized that adults with ASD are more attracted than controls by any visual change (even an insignificant one) occurring unexpectedly in their environment. This finding of a large novelty P3 in response to deviant stimuli is in accordance with our study investigating automatic visual change detection in children with ASD using the same paradigm (Cléry et al., 2013) and supports clinical reports showing that individuals with ASD often tend to be more distractible than controls, suggesting that their attention may in fact be "underselective" (Allen and Courchesne, 2001; Keehn et al., 2012). This may explain why individuals with ASD appear to ignore relevant stimuli in the environment in favor of relatively discrete and apparently meaningless stimuli, but it may also contribute to the exceptional perceptual abilities observed in some individuals with ASD (Mottron et al., 2006; Plaisted-Grant and Davis, 2009). This might be a maladjustment insofar as it leads to distress at small changes in the environment (Happe and Frith, 2006).

Interestingly, patients with ASD displayed a smaller (nonsignificant) vMMN than controls in response to deviant stimuli, suggesting poorer automatic visual change detection in this pathology, but this was followed by an additional large novelty P3 reflecting the involuntary switching of attention toward stimulus changes. This finding raises the question of a possible dissociation of these two components, as it remains surprising that attention could be involuntarily captured by a change without this change first being detected. However, similar cases of dissociation between early change detection negativity and the subsequent P3 have been reported in the auditory modality (Winkler et al., 1998; Sussman et al., 2003; Rinne et al., 2006). Recently Horváth et al. (2008) investigated distraction-related ERP responses using an auditory distraction paradigm and showed that a P3a can be elicited without previous MMN in response to some stimulus features. The authors proposed that the P3a may reflect some possibly higher-level event detection process rather than attention switching itself. Such an observation merits further investigation in the visual modality.

The finding that even small deviance detection involved a novelty P3 response in adults with ASD may be related to results previously obtained in children in the auditory modality by Gomot et al. (2002). Taken together, these findings support the existence of an atypical change detection process acting in several sensory modalities in people with ASD that might contribute to their intolerance of change.

# **ACKNOWLEDGMENTS**

This research was supported by grants from the "Fondation Orange" and the "Région Centre" and by the CHRU Bretonneau, Tours (PHRC). We thank all the subjects and their parents for their time and effort spent participating in this study. Special thanks are due to Pierre Emmanuel Aguera for his valuable help with the use of Elan software.

# **REFERENCES**

autism: an electrophysiological study. *Psychophysiology* 50, 240–252.

auditory distraction? *Biol. Psychol.* 79, 139–147.

high-functioning autism spectrum disorder. *Res. Autism Spectr. Disord.* 5, 201–209.

search in Alzheimer's disease: a deficiency in processing conjunctions of features. *Neuropsychologia* 40, 1849–1857.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 November 2012; accepted: 16 February 2013; published online: 06 March 2013.*

*Citation: Cléry H, Roux S, Houy-Durand E, Bonnet-Brilhault F, Bruneau N and Gomot M (2013) Electrophysiological evidence of atypical visual change detection in adults with autism. Front. Hum. Neurosci. 7:62. doi: 10.3389/fnhum.2013.00062*

*Copyright © 2013 Cléry, Roux, Houy-Durand, Bonnet-Brilhault, Bruneau and Gomot. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# Can eye of origin serve as a deviant? Visual mismatch negativity from binocular rivalry

#### *Manja van Rhijn<sup>1</sup>, Urte Roeber<sup>2,3,4</sup> and Robert P. O'Shea<sup>1</sup>\**

*<sup>1</sup> Discipline of Psychology and Cognitive Neuroscience Research Cluster, School of Health and Human Sciences, Southern Cross University, Coffs Harbour, NSW, Australia*

*<sup>2</sup> BioCog, Institute for Psychology, University of Leipzig, Leipzig, Germany*

*<sup>3</sup> Discipline of Biomedical Science, University of Sydney, Sydney, Australia*

*<sup>4</sup> Discipline of Psychology, School of Health and Human Sciences, Southern Cross University, Coffs Harbour, NSW, Australia*

#### *Edited by:*

*Gabor Stefanics, University of Zurich & ETH Zurich, Switzerland*

#### *Reviewed by:*

*István Czigler, Institute of Cognitive Neuroscience and Psychology, Hungary*

*Jan Kremlacek, Charles University in Prague, Czech Republic*

#### *\*Correspondence:*

*Robert P. O'Shea, Discipline of Psychology and Cognitive Neuroscience Research Cluster, School of Health and Human Sciences, Southern Cross University, Hogbin Drive, Coffs Harbour, NSW 2450, Australia. e-mail: robert.oshea@scu.edu.au*

The visual mismatch negativity (vMMN) is a negative deflection in an event-related potential (ERP) between 200 and 400 ms after onset of an infrequent stimulus in a sequence of frequent stimuli. Binocular rivalry occurs when one image is presented to one eye and a different image is presented to the other. Although the images in the two eyes are unchanging, perception alternates unpredictably between the two images for as long as one cares to look. Binocular rivalry, therefore, provides a useful test of whether the vMMN is produced by low levels of the visual system at which the images are processed, or by higher levels at which perception is mediated. To investigate whether a vMMN can be evoked during binocular rivalry, we showed 80% standards comprising a vertical grating to one eye and a horizontal grating to the other and 20% deviants, in which the gratings either swapped between the eyes (*eye-swap deviants*) or changed their orientations by 45° (*oblique deviants*). Fourteen participants observed the stimuli in 16 four-minute blocks. In eight consecutive blocks, participants recorded their experiences of rivalry by pressing keys; we call this the *attend-to-rivalry* condition. In the remaining eight consecutive blocks, participants performed a demanding task at fixation (a 2-back task), also by pressing keys; we call this the *reduced-attention* condition. We found deviance-related negativity from about 140 ms to about 220 ms after onset of a deviant. There were two noticeable troughs that we call an early vMMN (140–160 ms) and a late vMMN (200–220 ms). These were essentially similar for oblique deviants and eye-swap deviants. They were also essentially similar in the attend-to-rivalry conditions and the reduced-attention conditions. We also found a late, deviance-related negativity from about 270 to about 290 ms in the attend-to-rivalry conditions. We conclude that the vMMN can be evoked during the ever-changing percepts of binocular rivalry and that it is sensitive to the eye of origin of binocular-rivalry stimuli. This is consistent with the vMMN's being produced by low levels of the visual system.

**Keywords: visual mismatch negativity (vMMN), binocular rivalry, event-related potentials (ERP), attention, utrocular processing, eye-of-origin**

# **INTRODUCTION**

How do we process regularities and irregularities in our visual environments? The visual mismatch negativity (vMMN) is the electroencephalographic (EEG) signature of such processing (Czigler and Csibra, 1990). The vMMN arises when participants are exposed to a sequence of identical stimuli, called *standards*, in which every now and then, unpredictably, one of the standards is replaced by a stimulus, a *deviant* that differs in some way from the standards. As the name of the vMMN suggests, deviants yield event-related potentials (ERPs) that are more negative than those from standards.

Pazo-Alvarez et al. (2003) have reviewed studies of the vMMN. They found that deviants can be in the form, orientation, color, size, spatial frequency, and direction of movement of the stimuli. They defined the vMMN as occurring 250–400 ms after the onset of the deviant stimuli, beginning around the time of the second negative deflection in the ERP, the N2. Tales et al. (2009) have shown that the vMMN occurs when participants have withdrawn their attention from the stimuli [for a review, see Czigler (2007)], suggesting it is a sign of pre-attentive, automatic processing of irregularities in the visual environment.

The vMMN is thought to reflect processing that occurs when automatic predictions about upcoming stimuli are violated (Kimura et al., 2011). Based on the level of processing, Winkler and Czigler (2012) have argued that stimuli are represented as perceptual objects.

The phenomenon of binocular rivalry provides a test of the level of processing required for the vMMN. Binocular rivalry [e.g., reviewed by Blake and O'Shea (2009)] occurs when a person is presented with two different images, one to each eye (e.g., vertical lines to one eye and horizontal lines to the other). Instead of seeing a combination of the two images (i.e., a grid), the person sees one image for a second or so with no trace of the other, then the other image for a second or so with no trace of the first, then the first image, and so on, irregularly for as long as the person looks at the rival stimuli. Periods of exclusive visibility of one or the other image are usually separated by brief periods of some ever-changing mosaic or patchwork of the two images. All of this makes the conscious experience of binocular rivalry irregular and complex, yet the stimuli delivered to the eyes are unchanging. If the vMMN is an automatic, unconscious process, it should be possible to find it from a series of binocular-rivalry standards and deviants. However, if the vMMN requires attention (for the deviants to be experienced as rare and as different from the standards), then one would predict that the busy, ever-changing experience of binocular rivalry would banish the vMMN. It is this test we wanted to make.

Our binocular rivalry standards were brief (400 ± 33 ms) displays of vertical lines to one eye and horizontal lines to the other (**Figure 1**). This duration is easily long enough for rivalry to be instigated and to develop into exclusive visibility of one or the other image (Wolfe, 1983; O'Shea and Crassini, 1984). Displays were separated by a briefer display (100 ± 33 ms) of a dark field. These timings allow periods of exclusive visibility to persist over several displays of the rival stimuli (Noest et al., 2007; Klink et al., 2008).

**FIGURE 1 | Illustration of a possible sequence of 10 presentations of experimental stimuli.** In the first (T1), the left eye views a horizontal grating and the right eye views a vertical grating for 400 ± 33 ms, followed by no gratings for 100 ± 33 ms. This illustrates a standard; it is repeated for four presentations (i.e., T1–T4). The fifth presentation (T5) illustrates an eye-swap deviant. This is followed by three more standards followed by an oblique deviant (T9). Then there is a final standard (T10). The red cross in the center of the stimuli represents a red number that changed every 667 ms.

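We had two sorts of otherwise-identical, binocular rivalry deviants: *eye-swap deviants*, in which the gratings swapped between the eyes, and *oblique deviants*, in which the gratings changed their orientations by 45°.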


To test explicitly for the effects of attention on the vMMN, we ran two conditions, one in which participants had to pay attention to their conscious experience of the rivalry by pressing keys to report which of the rival stimuli they were seeing, and another in which they reduced any attention to the rival stimuli and paid attention to a demanding task (a 2-back task) in the center of the rival stimuli.

We found essentially identical vMMNs to both sorts of deviants. Reducing attention shortened the duration of the vMMN.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Seventeen participants volunteered for this experiment. All participants were right-handed and had normal or corrected-to-normal vision and visual acuity. All gave written, informed consent to participate and did so without any incentives, such as payment. The study was approved by Southern Cross University's Human Research Ethics Committee (approval number ECN-11-136).

One participant failed to experience binocular rivalry during a rivalry pre-test and so no other data were obtained from this participant. The data of two other participants were excluded because they did not yield enough epochs for at least one of the ERPs after data pre-processing (see below). Of the remaining 14 participants eight were female. Ages ranged from 21 to 58 years with a mean of 31.79.

#### **APPARATUS AND MATERIALS**

Left-eye and right-eye stimuli were presented on the left and right sides of a linearized, Samsung (2233RZ), 22-inch, color, LCD monitor (1680 × 1050 pixels; running at 60 Hz). Participants viewed stimuli from 57 cm through a Screenscope SA-200-Monitor-type stereoscope with four front-surfaced mirrors, attached to a chin rest. One participant opted to cross fuse the stimuli rather than using the stereoscope (he showed the same pattern of results as the other participants). Participants used a numeric keypad to respond. The experiment was run using a Macintosh Mini. This computer was controlled by custom-written MATLAB scripts using the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997).

Electroencephalography (EEG) data were recorded continuously with a BrainAmp system (Brain Products GmbH, Munich) running on a Dell PC.

#### **STIMULI**

There were three basic sorts of stimuli: grating stimuli, fusion stimuli, and fixation stimuli. *Grating stimuli* consisted of an annulus-shaped patch of achromatic, sine-wave grating shown to one eye and an orthogonal, but otherwise identical patch shown to the other eye. The outer diameter of a patch was 1.65° of visual angle; the inner diameter was 0.67°. Spatial frequency was 3.50 cycles/°, mean luminance was 43.37 cd/m², and contrast was 0.99. They were displayed on a dark background (0.40 cd/m²).

A *fixation stimulus* was confined to the central region of the grating stimuli. It comprised a central, red, one-digit number that changed every 667 ms to another randomly chosen number. The font style was Courier size 18 (0.50° height, ca. 0.30° width) with a pen width of 0.08°. These stimuli were identical in the two eyes.

*Fusion* stimuli were three, continuously presented, concentric, white (86.68 cd/m²), one-pixel-thick rings with diameters such that the smallest one was 50 min of visual angle larger than that of a grating. The diameter of the outer ring was 3.20°, and the rings were evenly spaced 0.10° apart, with a pen width of 0.05°. The fusion stimuli were identical in the two eyes. The fixation and fusion stimuli served to keep the eyes fixated centrally and aligned binocularly.

To form *rival stimuli*, one grating stimulus was shown to one eye and an orthogonally orientated grating stimulus was shown to the other, along with the fixation and fusion stimuli shown to both eyes (**Figure 1**). Some rival stimuli were *standards*; these had one arrangement of gratings to the eyes [e.g., left-eye horizontal (LEH) and right-eye vertical (REV)]. The remaining rival stimuli were *deviants*. There were two sorts: *eye-swap deviants* had the opposite arrangement of gratings to the eyes from the standards (i.e., LEV and REH) and *oblique deviants* had different orientations (e.g., left-eye, left oblique [LELO] and right-eye, right-oblique [RERO]). All rival stimuli had two combinations, one in which the stimuli were presented to the eyes as specified and one in which the stimuli were interchanged between the eyes.

Different stimuli were used to test visual evoked potentials (VEPs). The stimuli consisted of a central, 10-by-10 chequerboard, viewed on a gray background (43.37 cd/m²), with checks of 0.50° that phase reversed every 0.5 s for 50 s. At the same time, central red fixation numbers changed randomly every 667 ms.
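Because the stimulus sizes are specified in degrees of visual angle at a 57 cm viewing distance, it can help to convert them into physical sizes on the screen. The following is a minimal sketch of that conversion, not the authors' code; the function names are hypothetical and only the numeric values come from the text.

```python
import math

VIEWING_DISTANCE_CM = 57.0  # viewing distance stated in the Apparatus section


def visual_angle_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical size (cm) that subtends angle_deg at the given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def cycles_across(diameter_deg, cycles_per_deg):
    """Number of grating cycles spanning a patch of the given diameter."""
    return diameter_deg * cycles_per_deg


outer_cm = visual_angle_to_cm(1.65)   # outer diameter of a grating patch
inner_cm = visual_angle_to_cm(0.67)   # inner diameter of a grating patch
n_cycles = cycles_across(1.65, 3.50)  # cycles across the outer diameter
```

At 57 cm, one degree of visual angle corresponds to just under 1 cm on the screen, which is why this viewing distance is a convenient choice.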

#### **PROCEDURE**

We recorded the participant's sex, age, occupation, and dominant eye/hand. We measured the visual acuity of each participant's left eye, right eye, and both eyes together using the Freiburg Visual Acuity Test (Bach, 2007) at a viewing distance of 3.25 meters.

Then each participant responded in a rivalry pre-test. The participant viewed, for 3 min, binocular rivalry stimuli that were identical to the experimental stimuli except that there were no deviants, and pressed one key whenever and for as long as the vertical bars were visible with no trace of horizontal, and another key whenever and for as long as the horizontal bars were visible with no trace of vertical. The only difference from the standard stimuli in the experiment proper was that there was a continuously presented fixation cross instead of a changing fixation number. The first pre-test trial was then repeated with the opposite eye-orientation combination; order was counterbalanced.

Once the EEG electrodes were attached, we measured each participant's VEPs. The participant's task was to press a key when the fixation number was the same as the second-last number shown. These VEP stimuli were presented once to the left eye while the right eye viewed the gray background, once to the right eye with gray to the left, and once to both eyes. Then they were repeated in the reverse order. Normal VEPs were defined as VEPs showing an N75, a P100, and an N135 that did not differ markedly between the eyes and that were larger for binocular stimulation (Odom et al., 2010; O'Shea et al., 2010). All participants showed normal VEPs.

The experiment proper consisted of 16 blocks. Each block involved 480 consecutive trials comprising 80% (384) standards, 10% (48) eye-swap deviants, and 10% (48) oblique deviants. Each trial was a display of rival stimuli for 400 ms with a uniform random jitter of ±33 ms, followed by the dark background for 100 ms with a uniform random jitter of ±33 ms. Order of trials within each block was randomized afresh for each participant and for each block, with the constraints that the first three and last two trials of each block had to show standard stimuli and that at least two standard-stimuli trials had to follow each deviant. Orientation-eye arrangement of standard rivalry stimuli alternated between blocks. Orientation-eye arrangement in the first block was counterbalanced across participants.
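The ordering constraints described above can be met in several ways. The sketch below shows one possible construction in Python. It is an illustration only: the original experiment used custom MATLAB scripts whose randomization algorithm is not described, and the function name and trial labels here are hypothetical. Each deviant is bundled with the two standards that must follow it, and the remaining standards are scattered at random between these bundles.

```python
import random


def make_block(n_standards=384, n_eye_swap=48, n_oblique=48, rng=None):
    """Return one block of trial labels that satisfies the stated constraints."""
    rng = rng or random.Random()
    deviants = ["eye_swap"] * n_eye_swap + ["oblique"] * n_oblique
    rng.shuffle(deviants)

    lead_in = 3  # the first three trials must be standards
    # Standards left over after the lead-in and the two that follow each deviant.
    spare = n_standards - lead_in - 2 * len(deviants)
    if spare < 0:
        raise ValueError("not enough standards to satisfy the constraints")

    # gaps[i] = extra standards before deviant i; gaps[-1] = after the last bundle.
    gaps = [0] * (len(deviants) + 1)
    for _ in range(spare):
        gaps[rng.randrange(len(gaps))] += 1

    trials = ["standard"] * lead_in
    for extra, deviant in zip(gaps, deviants):
        trials += ["standard"] * extra
        trials += [deviant, "standard", "standard"]
    trials += ["standard"] * gaps[-1]
    return trials


block = make_block(rng=random.Random(1))
```

Because every bundle ends with two standards, the requirement that the last two trials be standards holds automatically. This construction does not sample all valid sequences with equal probability; it is simply one easy way to guarantee the constraints.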

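There were two attention conditions: an *attend-to-rivalry* condition, in which participants recorded their experiences of rivalry by pressing keys, and a *reduced-attention* condition, in which participants performed a demanding 2-back task at fixation, also by pressing keys. Each condition was run in eight consecutive blocks.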


The numbers at fixation that changed every 667 ms to another randomly chosen number appeared in both conditions. Starting condition was counterbalanced over participants. In both conditions the participant was told to minimize eye blinks, and to relax.

#### **MEASUREMENT OF EEG**

EEGs were recorded from 26 active Ag/AgCl electrodes (F7, F3, Fz, F4, F8, FC5, FC1, FCz, AFz, FC2, FC6, T7, C3, Cz, C4, T8, CP5, CP1, CP2, CP6, P7, P3, Pz, P4, P8, O1, Oz, O2) mounted on an elastic cap (actiCap) placed according to the 10–20 system and referenced to FCz, with the ground at AFz. The sampling rate was 500 Hz. A vertical electrooculogram (EOG) was recorded by electrodes above and below the right eye; a horizontal EOG was recorded by placing electrodes near the outer canthi of the eyes. Additionally an electrode was attached to each earlobe.

# **DATA ANALYSIS**

#### *Behavioral data*

From the rivalry pre-test, we determined the mean time of episodes of dominance of one or the other rival stimuli. In the attend-to-rivalry condition, we determined the frequency and response times (RTs) of key releases from 150 to 1500 ms after the onset of a deviant stimulus. These measures let us know whether the deviants were perceived.

In the reduced-attention condition we determined detection and false alarm rates, from which we calculated sensitivities (*d*'), and the RTs for correct responses. These measures let us know whether the participants paid attention to the 2-back task rather than to the rival gratings.
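For readers unfamiliar with the sensitivity measure, *d*' is conventionally computed as the difference between the z-transformed detection (hit) rate and the z-transformed false alarm rate. The sketch below uses only the Python standard library; the function name is hypothetical, and the example rates are the group means reported later in the Results, used here purely for illustration.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index: z(hit rate) - z(false alarm rate).

    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the extremes.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Group-mean rates from the reduced-attention condition (illustration only).
example = d_prime(0.49, 0.01)
```

With these group-mean rates the result is about 2.30. This need not equal a mean *d*' obtained by computing *d*' for each participant and then averaging, which is presumably how the reported mean was derived.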

#### *Electrophysiological data*

In preparation for data analysis, we re-referenced the EEG data offline to the right earlobe and applied a 0.5–35 Hz bandpass filter (Kaiser windowed sinc FIR filter, 1857 points). We extracted epochs of the data from 100 ms before to 400 ms after stimulus (gratings) onset. We excluded from further analysis any epochs preceding, containing, or following a key press within 300 ms. We also excluded any epochs with signals exceeding a moving-window, peak-to-peak amplitude of 200 μV at any EEG channel, or of 100 μV at any EOG channel (moving window width: 200 ms, distance between successive windows: 50 ms). Five data sets contained bad channels, which we corrected using spherical interpolation. The maximum number of channels we interpolated per data set was three. None of the interpolated channels was used in the statistical analysis.
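The moving-window, peak-to-peak rejection criterion can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, each channel is assumed to be a non-empty list of voltages in μV sampled at 500 Hz, and "exceeding" is read as strictly greater than the threshold.

```python
def exceeds_peak_to_peak(samples, threshold_uv, sfreq_hz=500.0,
                         window_ms=200.0, step_ms=50.0):
    """True if any moving window has a peak-to-peak amplitude above threshold."""
    win = max(1, int(round(window_ms * sfreq_hz / 1000.0)))
    step = max(1, int(round(step_ms * sfreq_hz / 1000.0)))
    last_start = max(len(samples) - win, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the end of the epoch is covered
    for start in starts:
        segment = samples[start:start + win]
        if max(segment) - min(segment) > threshold_uv:
            return True
    return False


def reject_epoch(eeg_channels, eog_channels):
    """Apply the 200 uV (EEG) and 100 uV (EOG) criteria to one epoch."""
    return (any(exceeds_peak_to_peak(ch, 200.0) for ch in eeg_channels)
            or any(exceeds_peak_to_peak(ch, 100.0) for ch in eog_channels))
```

A single 150 μV transient would therefore lead to rejection if it occurred on an EOG channel but not if it occurred on an EEG channel.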

We averaged ERPs separately for each stimulus type (standard, eye-swap deviant, oblique deviant) and condition (attention, reduced attention). We then excluded from further analysis two data sets that contained fewer than 100 epochs in any of the ERPs.

To investigate deviance-related differences we formed difference waves by subtracting the ERP to the standard stimuli from the ERPs to either of the deviant stimuli in both conditions. After visual inspection of the data for deviance-related differences, we defined three time windows of interest in each attention condition. Two of the time windows were the same for both attention conditions. Within these time windows we analysed the difference waves at occipital electrodes O1 and O2. We chose occipital electrodes for our analysis because gratings yield most pronounced responses in those electrodes.
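The difference-wave computation and the window averaging can be illustrated with a short Python sketch. It assumes that each ERP is a list of voltages sampled at 500 Hz, that the epoch starts 100 ms before stimulus onset, and that both window endpoints are included; these conventions and the function names are assumptions made for illustration, not details taken from the authors' analysis scripts.

```python
def difference_wave(deviant_erp, standard_erp):
    """Deviant-minus-standard difference wave, sample by sample."""
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


def mean_amplitude(wave, t_start_ms, t_end_ms,
                   epoch_start_ms=-100.0, sfreq_hz=500.0):
    """Mean voltage of the wave between t_start_ms and t_end_ms (inclusive)."""
    first = int(round((t_start_ms - epoch_start_ms) * sfreq_hz / 1000.0))
    last = int(round((t_end_ms - epoch_start_ms) * sfreq_hz / 1000.0))
    window = wave[first:last + 1]
    return sum(window) / len(window)


# Synthetic example: a deviant that is 1 uV more negative from 130 to 160 ms.
standard = [0.0] * 250
deviant = [0.0] * 250
for i in range(115, 131):  # samples corresponding to 130-160 ms
    deviant[i] = -1.0
early_vmmn = mean_amplitude(difference_wave(deviant, standard), 130, 160)
```

In this synthetic case the mean amplitude in the 130–160 ms window is −1 μV, and it is zero in any window that does not overlap the simulated deflection.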

We also calculated voltage maps for the various time windows to show the pattern of activity over all electrodes.

# **RESULTS AND DISCUSSION**

#### **BEHAVIORAL DATA**

#### *Rivalry pre-test*

The mean duration of episodes of dominance of one or the other rival stimuli was 2086 ms (we give the standard deviation, SD, in parentheses after each mean, in this case 990 ms). The distributions of times showed pronounced positive skew. All this is consistent with rivalry reported by others (Fox and Herrmann, 1967; Levelt, 1967; Cogan, 1973; Zhou et al., 2004). That is, rivalry produced an ever-changing, unpredictable, sequence of percepts from which no regularity could be discerned.

#### *Attend-to-rivalry condition*

The mean duration of episodes of dominance of one or the other rival stimuli was 1567 ms (630 ms). The distributions of times showed pronounced positive skew. The general pattern is consistent with rivalry reported by others. The distribution was also bimodal. There was an early, sharp peak, between 600 and 700 ms, and a later, broader peak around 1200 ms. The early peak is likely due to the episodes of dominance that were terminated by the occurrence of a deviant (see below); the later peak is likely due to naturally occurring rivalry alternations.

About 20% (12%) of all eye-swap deviants had no preceding key press, meaning that participants were experiencing some form of patchy dominance or combination of the rival images. Of the remaining trials, 74% (18%) resulted in a key release between 150 and 1500 ms after onset of the deviant. RTs were 691 (90) ms. That is, participants noticed the eye-swap deviants.

About 15% (9%) of all oblique deviants had no preceding key press. This difference from 20% for the eye-swap deviants must arise from sampling error, because oblique and eye-swap deviants were presented at random. Of the remaining oblique-deviant trials, 78% (13%) resulted in a key release between 150 and 1500 ms after onset of the deviant. This is not significantly different from the percentage of key releases for eye-swap deviants. RTs were 691 (56) ms, not significantly different from those for eye-swap deviants. That is, participants noticed both sorts of deviants equally.

We repeated these analyses with maximum window durations of 1000 and 650 ms. Apart from reducing the number of key releases and shortening the RTs, we found no significant differences for these measures from oblique deviants and from eye-swap deviants.

#### *Reduced-attention condition*

We defined a 2-back target as being detected when the participant pressed the key between 150 and 1000 ms after its occurrence. Participants detected on average (standard deviation) 49% (20%) of the 2-back targets. False alarm rate was 1% (0.7%). Mean *d*' was 2.23 (0.65). Participants' correct responses had an RT of 663 (72) ms. These results show that participants performed the 2-back task quite well but far from perfectly, suggesting that the task was demanding and occupied most, if not all, of their attention.

We did not ask participants if they were aware of the rivalry alternations during the reduced-attention condition. (Neither, for that matter, did we ask participants if they were aware of the fixation numbers, or of 2-back targets, during the attend-to-rivalry condition.) However, it is likely that participants noticed some rivalry alternations, especially if they had paid attention to rivalry in their first eight blocks. All we can really offer are our own impressions from pilot testing: we felt that the 2-back task occupied our attention completely; however, occasionally we would notice a rivalry alternation, especially if it were abrupt. It was as if such alternations engaged attention exogenously.

#### **EEG DATA**

On average there were 1264 (418) accepted epochs per participant for standard stimuli, 228 (82) for eye-swap deviants, and 225 (79) for oblique deviants in the attend-to-rivalry condition, and 1917 (178) accepted epochs for standard stimuli, 323 (28) for eye-swap deviants, and 323 (29) for oblique deviants in the reduced-attention condition.

**Figure 2** displays grand-averaged ERPs elicited by standard stimuli, by eye-swap deviants, and by oblique deviants and their difference waves (eye-swap deviants minus standards, oblique deviants minus standards) at the right hemisphere (O2) for both conditions separately. Activity was largest at electrodes O1 and O2 within all time windows of interest. Data for the analyses were mean voltages across each time window and electrode.

The ERPs in both conditions show a similar pattern of deflections, starting with a pronounced positivity at about 100 ms (P1), a negativity at about 170 ms (N1), and a second positivity at about 250 ms (P2).

In both conditions, the earliest deviance-related negativity occurs at about 140 ms. Although this is earlier than Pazo-Alvarez et al. (2003) defined as being the vMMN, it is similar to results found by others for orientation changes in gratings (e.g., Winkler et al., 2005; Astikainen et al., 2008; Kimura et al., 2010). Certainly it is a deviance-related negativity.

In the attend-to-rivalry condition, this negativity sustains until about 350 ms with a second trough at about 280 ms for both types of deviants. In the reduced-attention condition, this earliest negativity sustains until about 250 ms for eye-swap deviants with another trough at about 200 ms, whereas it sustains only until about 170 ms for oblique deviants. For both eye-swap and oblique deviants in the reduced-attention condition we also see a deviance-related positivity at P1 that does not occur in the attend-to-rivalry condition.

**Figure 3** displays voltage maps for the difference waves for both sorts of deviants and for both attention conditions for all four time windows. The voltage maps show that the largest voltages were in the occipital electrodes, which is to be expected for visual stimuli, and that generally the two sorts of deviants yielded similar maps. There were two major differences:


We chose four time periods spanning 30 ms each within which we analysed amplitudes for the difference waves shown in **Figure 2**.

**FIGURE 2 | Left panel:** ERPs (colored traces) and difference waves (black traces) from electrode O2 for the attend-to-rivalry condition. The gray vertical rectangles show the time windows for which we analysed the data. **Center panel:** Representation of the electrode array on a schematic head. **Right panel:** Same as the left panel for the reduced-attention condition. In the left panel, there is a clear negativity visible in the difference waves from about 140 ms after onset to about 350 ms. Difference waves from the two sorts of deviants are similar. In the right panel, there is a clear negativity visible in the difference waves from about 140 ms after onset to about 250 ms.

The statistical tests we report below confirm our characterization of the results. We tested whether the amplitudes of the difference waves differed from zero using one-tailed *t*-tests; we tested for differences in the difference waves among the various experimental conditions with repeated-measures ANOVAs with factors type of deviant (eye-swap vs. oblique), attention condition (attend-to-rivalry vs. reduced-attention), and hemisphere (left vs. right).
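The one-tailed tests against zero can be sketched with the Python standard library. The example is illustrative only: the function names are hypothetical, the decision uses the tabulated one-tailed critical value for 13 degrees of freedom at α = 0.05 (about 1.771) rather than an exact *p* value, and it assumes one mean difference-wave amplitude per participant.

```python
import math
from statistics import mean, stdev

T_CRIT_ONE_TAILED_05_DF13 = 1.771  # tabulated critical value, df = 13


def one_sample_t(values, mu=0.0):
    """t statistic for testing whether the mean of values differs from mu."""
    n = len(values)
    return (mean(values) - mu) / (stdev(values) / math.sqrt(n))


def is_significantly_negative(values, t_crit=T_CRIT_ONE_TAILED_05_DF13):
    """One-tailed test that the mean amplitude is below zero (14 participants)."""
    return one_sample_t(values) < -t_crit


# Hypothetical amplitudes (uV) for 14 participants, for illustration only.
amplitudes = [-1.0, -2.0] * 7
t_value = one_sample_t(amplitudes)
```

For the positivities in the P1 window, the same statistic would be compared with the positive critical value instead.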

#### *82–112 ms (P1)*

One-tailed *t*-tests yielded significant positivities for eye-swap and oblique deviants in the reduced-attention condition at both electrodes [eye-swap deviants: *t*(13) = 2.24, *p* = 0.022 [O1], *t*(13) = 4.31, *p* < 0.001 [O2]; oblique deviants: *t*(13) = 1.78, *p* = 0.05 [O1], *t*(13) = 2.39, *p* = 0.017 [O2]], but not in the attend-to-rivalry condition [eye-swap deviants: *t*(13) = −1.57, *p* = 0.07 [O1], *t*(13) = −0.24, *p* = 0.407 [O2]; oblique deviants: *t*(13) = 0.11, *p* = 0.458 [O1], *t*(13) = 1.34, *p* = 0.099 [O2]]. That is, in the reduced-attention condition, deviants elicited larger positivities than in the attend-to-rivalry condition, *F*(1, 13) = 10.21, *p* = 0.007, partial η<sup>2</sup> = 0.440.

The positivities presumably arise from adaptation, or "refractoriness" (Kimura, 2012, p. 145): the standards are seen much more often than the deviants, so are processed by adapted neurons, whereas the deviants are rare, so are processed by less-adapted neurons. It is possible the lack of a positivity for the attend-to-rivalry condition comes from a ceiling effect in the ERPs: both standards and deviants yield P1s greater than 2 μV. There is no such ceiling effect in the reduced-attention condition.

#### *130–160 ms (early vMMN)*

In the early time window within the first deviance-related negativity we found significant negativities for eye-swap deviants at both occipital electrodes in both conditions [attend-to-rivalry condition: *t*(13) = −3.02, *p* = 0.005 [O1], *t*(13) = −3.49, *p* = 0.002 [O2]; reduced-attention condition: *t*(13) = −3.13, *p* = 0.004 [O1], *t*(13) = −3.51, *p* = 0.002 [O2]]. That is, eye-swap deviants showed a more negative response than standard stimuli whether attention was directed to or withdrawn from the rival gratings. Differences from 0 for the oblique deviants failed to reach significance in the attend-to-rivalry condition [O1: *t*(13) = −1.01, *p* = 0.165; O2: *t*(13) = −1.58, *p* = 0.069] and at the left hemisphere in the reduced-attention condition [O1: *t*(13) = −1.53, *p* = 0.075]. These differences between the two attention conditions failed to reach significance in the ANOVA, *F*(1, 13) = 0.95, *p* = 0.347. In other words, there is a vMMN to both sorts of deviants in the early time window.

#### *196–226 ms (late vMMN)*

In the late time window within the first deviance-related negativity we found significant negativities for eye-swap deviants at both occipital electrodes in both conditions [attend-to-rivalry condition: *t*(13) = −1.99, *p* = 0.034 [O1], *t*(13) = −1.99, *p* = 0.034 [O2]; reduced-attention condition: *t*(13) = −2.21, *p* = 0.023 [O1], *t*(13) = −3.50, *p* = 0.002 [O2]]. That is, eye-swap deviants show a more negative response than standard stimuli whether attention was directed to or withdrawn from the gratings. All oblique deviants showed negativities, significantly so in the attend-to-rivalry condition at the right hemisphere [O2: *t*(13) = −1.93, *p* = 0.038] but not at the left hemisphere [O1: *t*(13) = −1.73, *p* = 0.054] or in the reduced-attention condition at either hemisphere [O1: *t*(13) = −0.63, *p* = 0.271; O2: *t*(13) = −0.61, *p* = 0.276]. These differences between the two types of deviants failed to reach significance in the ANOVA, *F*(1, 13) = 1.61, *p* = 0.227, again leading us to conclude that similar vMMNs occurred to both sorts of deviants.

#### *266–296 ms (late negativity)*

The deviance-related negativity following the P2 component of the ERPs occurs in the attend-to-rivalry condition only, *F*(1, 13) = 9.43, *p* = 0.009, partial η<sup>2</sup> = 0.420. It is significantly negative for eye-swap and oblique deviants at both occipital electrodes [eye-swap deviants: *t*(13) = −4.00, *p* = 0.001 [O1], *t*(13) = −3.01, *p* = 0.005 [O2]; oblique deviants: *t*(13) = −2.39, *p* = 0.016 [O1], *t*(13) = −2.38, *p* = 0.017 [O2]]. That is, eye-swap and oblique deviants show a more negative response than standard stimuli when attention was directed to the gratings.
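As a consistency check, the partial η<sup>2</sup> values reported for the two significant attention effects can be recovered from the *F* ratios and their degrees of freedom using the standard identity partial η<sup>2</sup> = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The short Python sketch below applies it; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Attention effects reported for the P1 window and the late negativity.
p1_effect = round(partial_eta_squared(10.21, 1, 13), 3)
late_effect = round(partial_eta_squared(9.43, 1, 13), 3)
```

Both values agree with those reported in the text (0.440 and 0.420).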

# **GENERAL DISCUSSION**

We found a deviance-related negativity to eye-swap deviants during binocular rivalry from 140 to 250 ms after onset of the stimuli in both attention conditions and that persisted until about 350 ms when attention was on the rival gratings. We also found similar results for oblique, control deviants. We conclude that this negativity is the vMMN.

We have to admit to at least two limitations on the experimental evidence for our conclusion:


If we can accept that the deviance-related negativity we have found is the vMMN, then there are at least two further conclusions:


Nevertheless, there is abundant evidence for low-level processing of regularities and irregularities from other studies than ours both for visual input (e.g., Czigler, 2007) and for auditory input (the MMN; e.g., Sussman, 2007; Sadia et al., 2013), but we like to think that binocular rivalry presents a stringent test of this in that its experience is unpredictable (Fox and Herrmann, 1967; Levelt, 1967; Zhou et al., 2004). It is also consistent with the electrodes from which we found the vMMN—occipital electrodes over the visual areas of the brain—and with the early time of ERP differences in response to changes to one of the rival stimuli of which participants are either aware or not (Roeber and Schröger, 2004; Roeber et al., 2008, 2011; Veser et al., 2008). It is also consistent with our finding a vMMN in the reduced-attention condition; the 2-back task was so demanding that participants either missed seeing most of the changes in orientation of the gratings or missed seeing all of them.

We have painted low and high levels with a rather broad brush. It is quite possible that there are levels within those levels at which the comparisons between some model of regularities in visual input and the visual input to a lower level are made (Garrido et al., 2009). Our point is that these lower levels really are low—close to the neurons in the visual cortex that first combine the inputs from the left eye and right eye, because these are the first neurons that can encode eye of origin.

As we said, we cannot rule out that some aspect of the experience of deviants yields the vMMN, because the participants experienced the deviants in the attend-to-rivalry condition. We are conducting other research with deviants that are presented to only one eye during binocular rivalry (Roeber et al., submitted). Our preliminary results suggest that vMMNs can be evoked by deviants that are invisible because of rivalry suppression. But we can rule out, in the current study, that a participant could figure out the rule that defines a deviant from his or her experience of orientations in the attend-to-rivalry condition, because that experience is unpredictable. To understand this, we have illustrated in **Table 1** some examples of sequences of experienced orientations from what rivalry is *not*.

In **Table 1**, we show 15 presentations of the stimuli, from left to right (i.e., T1, T2, and so on). We show four cases, each one representing a successively closer approximation of the experience of binocular rivalry. For each case, we show what consciousness would be like if it were contributed to only by the left eye (LE), only by the right eye (RE) and as if binocular vision simply summed up the inputs from the LE and RE. The orientations are coded as V for vertical, and H for horizontal. We show three eye-swap deviants in the yellow columns. We give an asterisk if a deviant could possibly be experienced as a deviant.

In the first case, we show what would happen if consciousness consisted of simply summing the input from the LE alone and from the RE alone. Note that each eye alone yields three clear deviants, but that with both eyes open, there are no deviants. We know from EOG electrodes that all participants kept both eyes open for all accepted epochs, so this case demonstrates that the vMMN must arise from eye-of-origin information. We also know that binocular vision did *not* sum the LE and RE input; rather there was binocular rivalry.
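The logic of this first case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 15-item sequence, the choice of vertical for the left eye on standards, and the helper names `monocular_views` and `count_oddballs` are hypothetical and are not taken from **Table 1** itself.

```python
# Hypothetical 15-presentation sequence: "S" = standard, "D" = eye-swap deviant.
sequence = list("SSSSDSSSSDSSSSD")


def monocular_views(kind):
    """Return (left_eye, right_eye) orientations for one presentation.

    Assumption: standards show V to the left eye and H to the right eye;
    an eye-swap deviant exchanges the two gratings between the eyes.
    """
    return ("V", "H") if kind == "S" else ("H", "V")


def count_oddballs(stream):
    """Count items that differ from the most frequent item in the stream."""
    most_common = max(set(stream), key=stream.count)
    return sum(item != most_common for item in stream)


left = [monocular_views(k)[0] for k in sequence]
right = [monocular_views(k)[1] for k in sequence]
# "Summing" both eyes is modelled as the unordered set of orientations present.
summed = [frozenset(monocular_views(k)) for k in sequence]

left_deviants = count_oddballs(left)
right_deviants = count_oddballs(right)
summed_deviants = count_oddballs(summed)
```

Each monocular stream contains three oddballs, whereas the summed stream contains none, because swapping the gratings between the eyes leaves the set of orientations unchanged.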

In the second case, we show what would happen if rivalry were like a participant's alternately winking one or the other eye for 1 s each. In each eye, this yields pairs of presentations of gratings (i.e., two 400-ms presentations plus two 100-ms ITIs) interspersed by pairs of presentations of darkness. Again each eye alone could generate a vMMN, but both eyes do not reveal any clear deviants (although it is possible over longer sequences there could be some rules that could identify deviants). But again, we know that binocular rivalry is *not* like alternately winking the eyes at a regular rate.

In the third case, we show what would happen if rivalry were like a participant's alternately winking one or the other eye for a random time from 1 to 3 s (this temporal sequence is more like that of a typical experience of rival images than the second case). Again each eye alone could generate a vMMN, but both eyes do not reveal any clear deviants (although it is possible over longer sequences there could be some rules that could identify deviants). But again, we know that binocular rivalry is *not* like randomly, alternately winking the eyes.

In the fourth case, we show what would happen if rivalry were like the third case, except that at transitions from one percept to the next, participants saw composites of the images from each eye. All of this makes the experience of rivalry unpredictable, ruling out any vMMNs being developed to experience of both eyes.

**Table 1 | Possible sequences of 15 stimuli (standards and deviants) and percepts that are closer and closer approximations to the experience of rivalry.**


A further complication is that visibility of a stimulus from one eye during rivalry is *not* as we have represented it (that is, as if one eye were closed); rather, it is simply an attenuation of visibility (Fox and Check, 1966; Alais et al., 2010). Moreover, composites are neither simple nor stable: they are complex, representing superimpositions or patchworks, and they are dynamic. All of this should serve to make the experience of rivalry completely unpredictable and to prevent any regularities from being extracted against which to contrast deviants.

In conclusion, our study is a first step on a journey to prove that eye of origin can serve as a deviant that will yield a vMMN and to combine the fields of research into binocular rivalry and into processing of regularities in visual input. We look forward to our and others' taking further steps on this journey.

#### **ACKNOWLEDGMENTS**

We thank Jess Bultitude who helped collect the data and Bradley N. Jack for general help and for commenting helpfully on earlier drafts of this paper. We are also grateful to Andreas Widmann and Duncan Blair for their invaluable help in setting up the SCU EEG Research Laboratory, in which this research was conducted.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 March 2013; accepted: 25 April 2013; published online: 15 May 2013.*

*Citation: van Rhijn M, Roeber U and O'Shea RP (2013) Can eye of origin serve as a deviant? Visual mismatch negativity from binocular rivalry. Front. Hum. Neurosci. 7:190. doi: 10.3389/fnhum.2013.00190*

*Copyright © 2013 Van Rhijn, Roeber and O'Shea. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# Object-related regularities are processed automatically: evidence from the visual mismatch negativity

# *Dagmar Müller\*, Andreas Widmann and Erich Schröger*

*Institut für Psychologie, Universität Leipzig, Leipzig, Germany*

#### *Edited by:*

*Gabor Stefanics, University of Zurich & ETH Zurich, Switzerland*

#### *Reviewed by:*

*Stefan Berti, Johannes Gutenberg University Mainz, Germany Marta Garrido, The University of Queensland, Australia*

#### *\*Correspondence:*

*Dagmar Müller, Institut für Psychologie, Universität Leipzig, Seeburgstraße 14-20, D-04103 Leipzig, Germany e-mail: dagmar\_mueller@uni-leipzig.de*

One of the most challenging tasks of our visual systems is to structure and integrate the enormous amount of incoming information into distinct coherent objects. It is an ongoing debate whether or not the formation of visual objects requires attention. Implicit behavioral measures suggest that object formation can occur for task-irrelevant and unattended visual stimuli. The present study investigated pre-attentive visual object formation by combining implicit behavioral measures and an electrophysiological indicator of pre-attentive visual irregularity detection, the visual mismatch negativity (vMMN) of the event-related potential. Our displays consisted of two symmetrically arranged, task-irrelevant ellipses, the objects. In addition, there were two discs of either high or low luminance presented on the objects, which served as targets. Participants had to indicate whether the targets were of the same or different luminance. In separate conditions, the targets either usually were enclosed in the same object or in two different objects (standards). Occasionally, the regular target-to-object assignment was changed (deviants). That is, standards and deviants were exclusively defined on the basis of the task-irrelevant target-to-object assignment but not on the basis of some feature regularity. Although participants did not notice the regularity nor the occurrence of the deviation in the sequences, task-irrelevant deviations resulted in increased reaction times. Moreover, compared with physically identical standard displays deviating target-to-object assignments elicited a negative potential in the 246–280 ms time window over posterio-temporal electrode positions which was identified as vMMN. With variable resolution electromagnetic tomography (VARETA) object-related vMMN was localized to the inferior temporal gyrus. Our results support the notion that the visual system automatically structures even task-irrelevant aspects of the incoming information into objects.

**Keywords: deviance detection, human ERP, prediction error, object formation, variable resolution electromagnetic tomography (VARETA), visual mismatch negativity**

# **INTRODUCTION**

In everyday life our visual system is challenged with a multitude of information which has to be structured into coherent objects. There is a long-standing debate on whether or not the formation of visual objects requires attention. Evidence supporting the significance of attention for visual object formation for example comes from experiments in which participants searched for targets defined by a conjunction of two features. Reaction times in such experiments typically increase with the number of objects presented on the display thus suggesting that attention had to be shifted serially in order to form feature-conjunctions (Treisman and Gelade, 1980). Moreover, when objects bearing two different features were presented outside the focus of attention participants reported the occurrence of illusory conjunctions, i.e., the combination of features originally belonging to different items (Treisman and Schmidt, 1982). The opposing view, i.e., the approach of pre-attentive or automatic object formation, receives support from studies showing that participants judged two task-relevant features more accurately and/or faster when the features belonged to one object compared with when the features belonged to two different objects overlapping in space (e.g., Duncan, 1984; for a review of similar studies see, Scholl, 2001). Additional evidence for automatic object formation comes from another line of experiments which showed that the processing of centrally presented targets was affected by the organization of task-irrelevant and unattended elements presented in the background (e.g., Driver et al., 2001; Kimchi and Razpurker-Apfeld, 2004; Lamy et al., 2006; Kimchi and Peterson, 2008; Shomstein et al., 2010).

In such behavioral studies, automatic object formation is indicated solely by the responses given by the participants. Event-related potentials (ERPs), which can be elicited by task-irrelevant, unattended aspects of the stimulation, may be exploited for investigating automatic object formation, as such an approach could shed light on the temporal characteristics of automatic object formation as well as on the related cortical structures. In the auditory modality, several studies used the mismatch negativity (MMN) component to demonstrate automatic grouping of sounds into objects (e.g., Ritter et al., 2000; Atienza et al., 2003; Winkler et al., 2003; Sussman et al., 2007). The MMN is elicited when the actual stimulus deviates from a prediction generated on the basis of some regularity inherent to the preceding stimulus sequence (for a recent review see, Näätänen et al., 2011). In the past two decades it has been shown that there is an analogous mechanism extracting regularities from the visual environment and thus generating predictions about upcoming visual stimuli (for reviews see, Pazo-Alvarez et al., 2003; Czigler, 2007; Kimura et al., 2011b). If the actual input features an irregularity and thus mismatches the predicted stimulus, a prediction error occurs which is thought to be reflected by the visual mismatch negativity (vMMN) component (Kimura et al., 2011b; Winkler and Czigler, 2012). It was convincingly shown that this mechanism operates in an automatic manner. That is, regularities are extracted even though they are not relevant for the task at hand and even when any possible intentional processing is prevented by masking (Kogai et al., 2011) or by presenting irregularities within the time window of the "attentional blink" (Berti, 2011). 
Recent studies have shown that this automatic system is capable of indicating not only highly salient violations of feature-regularities (e.g., a red-colored stimulus within a sequence of green-colored stimuli) but also less salient violations of regularities related to feature conjunctions (Winkler et al., 2005), facial emotional expressions (e.g., Astikainen and Hietanen, 2009; Chang et al., 2010; Kimura et al., 2011a; Stefanics et al., 2012), vertical mirror symmetry (Kecskes-Kovacs et al., 2013) or hand laterality (Stefanics and Czigler, 2012).

In the present study we investigated whether task-irrelevant violations of the regular assignment of single elements into visual objects elicited the vMMN. The elicitation of vMMN would indicate that the formation of visual objects can take place automatically which would make an important contribution to the aforementioned debate on the role of attention in visual object formation. In a previous study we could show that the automatic visual regularity detection system indexed by the vMMN is sensitive to object information: task-irrelevant color-irregularities were processed differently when the irregularities belonged to the same object compared with when they belonged to different objects (Müller et al., 2010), thus supporting automatic object formation by an electrophysiological measure. However, it is critically noteworthy that the highly salient color-irregularities used in this design could have induced involuntary attention shifts toward the task-irrelevant objects (e.g., Hopfinger and Mangun, 2001; Theeuwes, 2004). Thus, we designed the present experiment to rule out that object-specific processing is contingent on the processing of salient irregularities. Our displays consisted of two symmetrically arranged, task-irrelevant ellipses, the objects. In addition, there were two task-relevant discs of either high or low luminance presented on the objects. Thus, each of the ellipses and the discs presented on it should be combined to a common object based on the Gestalt principle of common region (Palmer, 1992). Participants had to judge the luminance of the discs (same vs. different, *p* = 0.5, respectively). We investigated object-related processing by varying the assignment of task-irrelevant objects and task-relevant discs. Frequently presented standard displays were characterized by a regular disc-to-object assignment, i.e., in two separate conditions regularly the discs either belonged to the same object or to different objects. 
In contrast, occasionally occurring deviant displays (*p* = 0.125) were characterized by a non-salient change in the regular disc-to-object assignment (see **Figure 1** for illustration). That is, standard displays and deviant displays consisted of the same elements, but differed only with regard to the task-irrelevant disc-to-object assignment. If in such a design deviant displays indeed elicit the vMMN we can draw a twofold conclusion: (1) As regularities and irregularities in our design are solely defined by object-related characteristics deviant displays will elicit the vMMN only if object-related information is encoded before the irregularity detection system checks the actual input, i.e., the elicitation of vMMN would support the notion of automatic object formation. (2) As standard displays and deviant displays in our design are not confounded by physical differences the elicitation of vMMN would show that the automatic visual irregularity detection system is not restricted to the detection of salient lower-order irregularities based on physical differences between standards and deviants but is also sensitive to the detection of non-salient higher-order irregularities.

# **MATERIAL AND METHODS**

#### **PARTICIPANTS**

Sixteen healthy students (10 women and 6 men, aged 18–30 years, mean age = 24.9 years) participated in the experiment for either course credit or payment. All of them reported normal or corrected-to-normal vision. Written informed consent was obtained from all of them according to the ethical code of the World Medical Association (Declaration of Helsinki). Data of two additional participants were excluded due to excessive eye movements which resulted in rejecting more than 50% of the trials from EEG analysis.

# **STIMULI AND PROCEDURE**

Stimulus presentation and the collection of behavioral responses were realized using the MATLAB toolbox Cogent2000v1.28. Stimuli were presented on a 19-inch color monitor (ViewSonic Graphics Series G90fB) set at a resolution of 1024 × 768 with a refresh rate of 100 Hz. We used a chinrest to maintain the viewing distance at 50 cm. Each test display consisted of two white ellipses (each subtending a visual angle of 7.97 × 3.43°, 148.3 cd/m<sup>2</sup>), two discs (diameter 1.72°), and a centrally presented white fixation cross (0.57 × 0.57°). Ellipses were arranged in parallel and flanked the fixation cross. The distance between the center of each ellipse and the center of the display was 2.52°. In different displays ellipses were pseudo-randomly tilted 45° either to the left or to the right in relation to the vertical midline. Displays containing left- and right-tilted ellipses occurred equiprobably within each block. In the following, the ellipses will be referred to as the "objects." The two discs were presented equally likely at two adjacent out of four possible positions (up, low, left, right, each 3.43° off the display-center) and were either of low luminance (dark-gray, 14.55 cd/m<sup>2</sup>) or high luminance (light-gray, 80.6 cd/m<sup>2</sup>). In different displays the two discs were of either the same luminance (i.e., both discs were either dark-gray or light-gray) or different luminance (i.e., one disc was dark-gray and the other light-gray). Displays containing discs of the same luminance vs. different luminance occurred equiprobably within each block. As the luminance of the discs was task-relevant, the discs will be referred to as targets. In two separate experimental conditions we varied the target-to-object assignment: usually (*P* = 0.875) the targets were presented on either the same object (standards of "same-object-standard-condition") or on different objects (standards of "different-object-standard-condition"). Occasionally and unpredictably (*P* = 0.125), the regular assignment of the targets to the objects was exchanged: targets were presented on either different objects (deviants of "same-object-standard-condition") or on the same object (deviants of "different-object-standard-condition"). That is, deviants were exclusively defined on the violation of the regular target-to-object assignment whereas there were no physical differences between standard- and deviant-displays. All stimuli were presented against a black background. The fixation cross was shown constantly throughout a block (see **Figure 1** for an illustration of the design).
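For readers who want to relate the visual angles above to physical sizes on the screen, the standard relation size = 2 · d · tan(θ/2) can be applied at the reported viewing distance of 50 cm. The following Python sketch is an illustration only; the function names `angle_to_cm` and `cm_to_angle` are hypothetical, and the conversion to pixels is omitted because the visible screen dimensions are not reported.

```python
import math

VIEWING_DISTANCE_CM = 50.0  # viewing distance reported in the text


def angle_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent (cm) of a stimulus subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def cm_to_angle(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (degrees) subtended by a stimulus of size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Extents reported in the text, converted to centimetres on the screen.
ellipse_long_cm = angle_to_cm(7.97)
ellipse_short_cm = angle_to_cm(3.43)
disc_diameter_cm = angle_to_cm(1.72)
```

Under this relation the ellipses come out at roughly 7 × 3 cm and the discs at roughly 1.5 cm in diameter.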

Each test-display was shown for 100 ms and was followed by an inter-stimulus interval of 1400 ms. Standard and deviant displays were presented in randomized order with the restriction that deviant-displays were always followed by at least two standard-displays. Stimuli were delivered in blocks of 128 trials each. The experiment consisted of 8 blocks of the "same-object-standard-condition" and 8 blocks of the "different-object-standard-condition," respectively. Blocks were presented in pseudo-randomized order with the restriction that four blocks of each condition were included in the first and second half of the experiment, respectively. Including individual breaks between the blocks the experiment lasted about 1 h.
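One simple way to generate a block that satisfies this restriction is sketched below in Python. It is not the authors' procedure (the experiment was run with Cogent 2000 in MATLAB, and the randomization routine is not described); the function name `make_block` and the unit-shuffling approach are assumptions made for illustration. Each deviant is bundled with the two standards that must follow it, and these bundles are shuffled together with the remaining standards.

```python
import random


def make_block(n_trials=128, p_deviant=0.125, min_gap=2, rng=None):
    """Return a list of "S"/"D" labels in which every deviant is followed
    by at least `min_gap` standards.

    Assumption: the block may start with a deviant; the text does not say
    whether blocks began with a run of standards.
    """
    rng = rng or random.Random()
    n_deviants = round(n_trials * p_deviant)
    n_free_standards = n_trials - n_deviants * (1 + min_gap)
    if n_free_standards < 0:
        raise ValueError("too many deviants for the requested minimum gap")
    # Each deviant travels with the standards that must follow it.
    units = [["D"] + ["S"] * min_gap for _ in range(n_deviants)]
    units += [["S"] for _ in range(n_free_standards)]
    rng.shuffle(units)
    return [label for unit in units for label in unit]


block = make_block(rng=random.Random(0))
```

With the default arguments a block holds 128 trials, 16 of which are deviants, matching the reported deviant probability of 0.125.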

Participants were instructed to indicate as quickly and as accurately as possible whether the two discs presented in each test display had the same or different luminance, i.e., disc-luminance was task-relevant whereas the disc-to-object assignment defining deviant- and standard-displays was task-irrelevant. Responses were given by pressing the outermost left/right button of a 4-button response pad with the left/right index finger. Response-to-button assignment (i.e., same/different luminance required left/right button presses and vice versa) was changed after completing the first half of experimental blocks. Subjects completed a training block of 32 trials in order to become acquainted with the task. In contrast to the experimental blocks, in the training block the duration of test displays was increased to 300 ms. At the end of each block participants got feedback on their performance (mean reaction times and number of incorrect responses). We motivated the participants to focus on the task by rewarding each block in which they reached a certain criterion (not exceeding five incorrect responses, i.e., a hit rate of 96.1% minimum) by paying 25 cents. In addition, the participant with the highest performance received a book token of 10 Euro value.

After completing the experiment, we asked the subjects whether they noticed something specific in the design of the experiment. If they did not comment on the relation between task-relevant discs and objects by themselves, we explicitly asked whether they realized that there was a "default" target-to-object assignment within each block which infrequently changed.

# **ELECTROPHYSIOLOGICAL RECORDING**

The electroencephalogram (EEG) was recorded continuously with a BrainAmp amplifier system (Brain Products GmbH, Munich, Germany) from 60 active electrodes mounted into an elastic cap according to the extended international 10–20 system (Fp1, Fp2, AF3, AF4, F7, F5, F3, F1, Fz, F2, F4, F6, F8, FT7, FC5, FC3, FC1, FC2, FC4, FC6, FT8, T7, C5, C3, C1, Cz, C2, C4, C6, T8, TP9, TP7, CP5, CP3, CP1, CPz, CP2, CP4, CP6, TP8, TP10, P7, P5, P3, P1, Pz, P2, P4, P6, P8, PO9, PO7, PO3, POz, PO4, PO8, PO10, O1, Oz, O2). Horizontal and vertical eye movements were monitored by electrodes placed at the outer canthi of both eyes and above (electrode at position Fp2 was used) and below the right eye, respectively (electro-oculogram, EOG). An electrode attached at the tip of the nose served as off-line reference. Additional active electrodes placed at position FCz and AFz served as on-line reference and ground electrode, respectively. Data were filtered online (0.1–250 Hz bandpass) and sampled at 500 Hz.

# **ANALYSIS OF BEHAVIORAL DATA**

We calculated mean reaction times (RTs) and mean hit rates separately for the two stimulus types (standards vs. deviants) and the two target-to-object assignments (discs in the same object vs. discs in different objects). For the calculation of mean RTs, RTs related to incorrect responses and RTs outside a range individually defined by the mean RT calculated from all correct responses ± 2 standard deviations were excluded. Both RTs and hit rates were subjected to repeated measures ANOVAs with the factors of STIMULUS TYPE and TARGET-TO-OBJECT ASSIGNMENT, i.e., we compared responses given to physically identical deviant- and standard-stimuli obtained across the two different experimental conditions (see also **Figure 1** for an illustration of the comparisons).
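The RT trimming rule described above can be illustrated with a short Python sketch. It is a hedged illustration rather than the authors' analysis code: the function name `trimmed_mean_rt`, the example values, and the use of the sample standard deviation are assumptions, since the text does not specify which standard deviation estimator was used.

```python
from statistics import mean, stdev


def trimmed_mean_rt(rts, correct, n_sd=2.0):
    """Mean RT of correct responses lying within mean +/- n_sd standard deviations.

    rts     : reaction times in ms for one participant
    correct : booleans, True where the response was correct
    """
    correct_rts = [rt for rt, ok in zip(rts, correct) if ok]
    centre = mean(correct_rts)
    spread = stdev(correct_rts)  # assumption: sample SD (n - 1 denominator)
    lower, upper = centre - n_sd * spread, centre + n_sd * spread
    kept = [rt for rt in correct_rts if lower <= rt <= upper]
    return mean(kept)


# Hypothetical data: one very slow response and one incorrect response.
example_rts = [500, 510, 490, 505, 495, 1500, 502, 498, 300, 507]
example_correct = [True] * 8 + [False, True]
example_mean = trimmed_mean_rt(example_rts, example_correct)
```

In this toy example the incorrect response (300 ms) is dropped first, and the 1500 ms response then falls outside the ±2 SD range, so the mean is taken over the remaining eight RTs.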

#### **ANALYSIS OF ELECTROPHYSIOLOGICAL DATA**

Offline, EEG activity was re-referenced to the activity recorded from an electrode placed at the tip of the nose, and EEG and EOG activity was filtered (0.5–40 Hz band-pass digital FIR filter with a length of 1025 points). EEG and EOG activity was epoched from 100 ms before to 700 ms after the onset of test displays. The first 100 ms of each epoch served as the baseline interval. Epochs containing signal changes exceeding 100 μV at any electrode, epochs related to displays to which participants did not respond (misses) or responded incorrectly (mistakes), epochs immediately following misses and mistakes, and epochs related to standard displays directly following a deviant display were excluded from further analysis. Epochs were averaged separately for standards and deviants presented in the "same-object-standard-condition" and in the "different-object-standard-condition," respectively. On average (mean ± SD), there were 586 ± 50/99 ± 7 epochs for standards/deviants from the "same-object-standard-condition" and 577 ± 70/97 ± 13 epochs for standards/deviants from the "different-object-standard-condition" available for each participant.
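The baseline correction and amplitude-based rejection steps can be sketched for a single channel as follows. This is an illustration under assumptions: "signal changes exceeding 100 μV" is interpreted here as a peak-to-peak criterion, which the text does not state explicitly, and the names `baseline_correct` and `is_artifact` as well as the toy epochs are hypothetical.

```python
SAMPLING_RATE_HZ = 500  # sampling rate reported for the recording
BASELINE_MS = 100       # pre-stimulus interval used as baseline
N_BASELINE = SAMPLING_RATE_HZ * BASELINE_MS // 1000  # number of baseline samples


def baseline_correct(epoch, n_baseline=N_BASELINE):
    """Subtract the mean of the pre-stimulus samples from every sample."""
    baseline = sum(epoch[:n_baseline]) / n_baseline
    return [value - baseline for value in epoch]


def is_artifact(epoch, threshold_uv=100.0):
    """Assumption: reject when the peak-to-peak range exceeds the threshold."""
    return max(epoch) - min(epoch) > threshold_uv


# Two toy single-channel epochs (in microvolts), 400 samples = 800 ms each.
clean_epoch = [2.0] * 50 + [10.0] * 350
noisy_epoch = [2.0] * 50 + [150.0] * 350
kept_epochs = [baseline_correct(e) for e in (clean_epoch, noisy_epoch)
               if not is_artifact(e)]
```

In a real data set the rejection test would be applied across all electrodes of an epoch, and the behavioral exclusion criteria listed above would be applied as well.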

To analyse genuine deviant-specific ERP responses, we calculated difference waves by subtracting ERPs elicited by standard displays from those elicited by physically identical deviant displays (i.e., standard-ERPs from the "same-object-standard-condition" were subtracted from deviant-ERPs from the "different-object-standard-condition" and standard-ERPs from the "different-object-standard-condition" were subtracted from deviant-ERPs from the "same-object-standard-condition").
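The cross-condition pairing used for these difference waves is easy to get wrong, so a minimal Python sketch may help. The data structure, the toy amplitude values, and the names `PAIRING` and `difference_wave` are hypothetical; only the pairing of conditions follows the description above.

```python
# Which condition supplies the deviant and which the standard for each
# physical display type (pairing taken from the description in the text).
PAIRING = {
    "discs_in_same_object": {
        "deviant": "different-object-standard-condition",
        "standard": "same-object-standard-condition",
    },
    "discs_in_different_objects": {
        "deviant": "same-object-standard-condition",
        "standard": "different-object-standard-condition",
    },
}

# Toy averaged ERPs (microvolts, three time points) per condition and stimulus type.
erps = {
    "same-object-standard-condition": {
        "standard": [0.0, 1.0, -2.0],
        "deviant": [0.0, 1.0, -3.5],
    },
    "different-object-standard-condition": {
        "standard": [0.0, 1.25, -2.25],
        "deviant": [0.0, 1.5, -3.0],
    },
}


def difference_wave(display_type):
    """Deviant-minus-standard wave for physically identical displays."""
    source = PAIRING[display_type]
    deviant = erps[source["deviant"]]["deviant"]
    standard = erps[source["standard"]]["standard"]
    return [d - s for d, s in zip(deviant, standard)]


same_object_diff = difference_wave("discs_in_same_object")
different_objects_diff = difference_wave("discs_in_different_objects")
```

The point of the pairing is that both terms of each subtraction come from displays with the same disc-to-object assignment, so any difference reflects the violated regularity rather than a physical difference.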

Visual inspection revealed that deviant and standard ERPs differed prominently at posterio-temporal electrode sites at about 260 ms latency, i.e., in the N2 latency range. Accordingly, we determined individual N2 peak latencies at electrode sites P5/6, P7/8, and PO7/8 in the 230–290 ms time range separately for each stimulus type (standard vs. deviant) and each target-to-object assignment (discs in the same object vs. discs in different objects). As the N2 peaked slightly earlier in trials in which discs belonged to the same object compared with trials in which discs belonged to different objects [main effect of factor TARGET-TO-OBJECT ASSIGNMENT, *F*(1, 15) = 7.28, *p* = 0.017, η<sub>p</sub><sup>2</sup> = 0.33], we adapted the position of the 30-ms time windows used for computing individual mean amplitudes accordingly (246–276 ms/250–280 ms for trials in which discs belonged to the same object/to different objects). In addition to the posterio-temporal region of interest (ROI), which comprises the collapsed mean amplitudes at P5/6, P7/8, and PO7/8, we selected a frontal ROI (AF3/4, F3/4, F5/6) to check for the occurrence of frontal deviant-related effects (Czigler et al., 2002). We tested for the significance of differences between standard- and deviant-responses by conducting a repeated measures ANOVA with the factors of STIMULUS TYPE × TARGET-TO-OBJECT ASSIGNMENT × HEMISPHERE (left vs. right) × ROI (posterio-temporal vs. frontal). Follow-up analyses comparing standard- and deviant-responses separately for the left and the right hemisphere and the two ROIs were carried out by paired, two-tailed Student's *t*-tests. The alpha level criterion for all statistical analyses was set to 0.05. Effect sizes are presented as partial eta square (η<sub>p</sub><sup>2</sup>).
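Partial eta squared can be recovered from a reported *F* value and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (*F* · df<sub>effect</sub>) / (*F* · df<sub>effect</sub> + df<sub>error</sub>). The Python sketch below applies this identity as a consistency check on the N2 latency effect reported above; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# N2 latency effect reported in the text: F(1, 15) = 7.28.
n2_latency_effect = partial_eta_squared(7.28, 1, 15)
```

Rounded to two decimals this gives 0.33, in agreement with the reported effect size.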

We plotted voltage topography and scalp current density (SCD) maps of ERPs elicited by deviants and standards, and of the deviant-minus-standard difference potentials. Calculation and plotting was carried out by using the *sphspline* plug-in (Widmann, 2006) for EEGLAB (Delorme and Makeig, 2004). As there were no striking differences in the distribution of deviant-related activity between the two target-to-object assignments, we collapsed the data obtained in the two conditions. The time window was set to 246–280 ms, thus equally comprising the peaks of deviant-related activity of both target-to-object assignments. Furthermore, we applied Variable Resolution Electromagnetic Tomography (VARETA, Bosch-Bayard et al., 2001) in order to localize cortical generators of deviant-related activity. The VARETA technique uses a discrete spline distributed inverse model to estimate the spatially smoothest intracranial distribution of primary current densities that correspond to the EEG-signals measured at the scalp. In doing so VARETA estimates the smoothing parameter voxel-wise, thus allowing for variable amounts of spatial smoothness and localizing discrete and distributed sources with equal accuracy (Bosch-Bayard et al., 2001; Pizzagalli, 2007). We mapped possible sources on a 3D regular grid model (3244 voxels, 7 mm grid spacing) based on the probabilistic brain tissue maps available from the Montreal Neurological Institute (MNI, Evans et al., 1993) which restricts sources to the gray matter. Significant activations were displayed as 3D-images by computing statistical parametric maps of the estimated primary current densities based on a voxel-by-voxel Hotelling *T*<sup>2</sup>-test against zero. Random field theory (Worsley et al., 1996) was applied to correct thresholds for spatial dependencies between voxels. To localize deviant-specific activation we contrasted the solutions obtained for deviants with those obtained for standards.

# **RESULTS**

#### **BEHAVIORAL DATA**

When we asked for specifics of the design at the end of the experiment, five out of our 16 participants reported that the task-relevant discs and the enclosing ellipses (i.e., the objects) were somehow related: they noticed that the targets could be enclosed in either the same object or in different objects. However, all but one<sup>1</sup> did not report spontaneously that they realized any difference in the frequency of the occurrence of the two types of target-to-object assignments. Even after we presented a figure displaying both target-to-object assignments and explicitly inquired whether they occurred with different frequencies, none of the participants reported that they realized the occurrence of frequently and infrequently presented assignments within one block, i.e., participants neither realized object-based regularities nor violations of these regularities.

However, a repeated measures ANOVA on the reaction times with the factors STIMULUS TYPE (standards vs. deviants) and TARGET-TO-OBJECT ASSIGNMENT (discs belonging to the same object vs. discs belonging to different objects) showed that the performance of the participants was significantly influenced by the (unnoticed) object-based regularities: participants responded significantly faster in trials with frequently presented target-to-object assignments (i.e., in standard trials, mean RT 505 ± 14 ms *SEM*) than in trials with infrequent target-to-object assignments [i.e., in deviant trials, 515 ± 14 ms, main effect of factor STIMULUS TYPE: *F*(1, 15) = 35.5, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.70]. Furthermore, participants responded slightly faster when discs belonged to different objects than when discs belonged to the same object [507 ± 14 ms vs. 513 ± 14 ms, main effect of factor TARGET-TO-OBJECT ASSIGNMENT: *F*(1, 15) = 5.75, *p* = 0.03, η<sup>2</sup><sub>p</sub> = 0.28]. There was no interaction of the two factors [*F*(1, 15) = 0.35, *p* > 0.5]. On average participants responded correctly in 95.76% ± 0.5 of all trials. Hit rates were not significantly affected by either STIMULUS TYPE or TARGET-TO-OBJECT ASSIGNMENT (both *F* < 1). The interaction of the two factors only marginally failed to reach significance [*F*(1, 15) = 4.52, *p* = 0.051, η<sup>2</sup><sub>p</sub> = 0.23]. However, none of the possible follow-up comparisons reached significance [all *t*(df = 15) < −1.65, all *p* > 0.1 even without correction for multiple comparisons]. Behavioral results are summarized in **Table 1**.
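The partial eta squared values reported for the behavioral ANOVAs can be recovered from the *F* statistics and their degrees of freedom, since η<sup>2</sup><sub>p</sub> = *F* · df<sub>effect</sub> / (*F* · df<sub>effect</sub> + df<sub>error</sub>). The following minimal Python sketch illustrates this consistency check; the function name and the dictionary layout are illustrative choices and not part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and effect sizes as reported in the text, all with df = (1, 15).
reported = {
    "RT, STIMULUS TYPE": (35.5, 0.70),
    "RT, TARGET-TO-OBJECT ASSIGNMENT": (5.75, 0.28),
    "Hit rates, interaction": (4.52, 0.23),
}

for label, (f_value, eta_reported) in reported.items():
    eta = partial_eta_squared(f_value, df_effect=1, df_error=15)
    print(f"{label}: computed {eta:.2f}, reported {eta_reported:.2f}")
```

The same relation can be applied to the ERP amplitude ANOVAs, which share the degrees of freedom (1, 15).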

#### **ELECTROPHYSIOLOGICAL DATA**

**Figure 2** displays the grand average ERPs elicited by deviant and standard displays superimposed with the respective deviant-minus-standard difference waveforms, separately for the two target-to-object assignments (discs belonging to the same object vs. discs belonging to different objects). Deviant and standard displays of both target-to-object assignments elicited a representative sequence of prominent visual ERP components at posterior electrode sites: P1 peaking at 95 ms, N1 at 150 ms, P2 at 205 ms and N2 at around 260 ms, which was followed by a broad-peaked P3b in the 350–550 ms latency range (**Figure 2**). In the P1 and N1 latency range deviant and standard ERPs are nearly perfectly matched. In contrast, in the N2 latency range deviant ERPs clearly show a more negative response than standard ERPs. Visual inspection revealed that deviant-specific responses were most prominent at posterior-temporal electrode sites (**Figure 2**, lower row) whereas there were no deviant-specific responses at frontal electrode sites (**Figure 2**, upper row). The posterior-temporal distribution of deviant-specific responses is also illustrated by the corresponding potential maps and SCD maps (**Figure 3**, upper and middle row). Visual inspection further revealed that there were no differences between deviant and standard ERPs at frontocentral electrode sites at latency ranges around 400 ms poststimulus, i.e., we did not find evidence that deviants elicit the P3a component.

**Table 1 | Behavioral performance.** Reaction times (RT, ms) and hit rates (%) for deviants and standards, by target-to-object assignment (discs belonging to the same object vs. discs belonging to different objects).

*Reaction times (RT) and hit rates are displayed separately for deviant (red outlines) and standard trials (blue outlines) for the two target-to-object assignments, respectively. SEM are given in parentheses. Cells containing responses given within one experimental condition are marked by identical gray-scale and line-style (dark-gray cells with solid outlines correspond to the "same-object-standard-condition," light-gray cells with dashed outlines correspond to the "different-object-standard-condition"). Responses given to physically identical deviants and standards are contrasted line-by-line. Asterisks indicate significant differences between deviant- and standard-responses averaged over the two target-to-object assignments (\*\*\*p < 0.001).*

<sup>1</sup>One participant reported that displays containing discs belonging to the same object occurred more frequently throughout the whole experimental session.

Results of a repeated measures ANOVA conducted on the mean amplitudes in the N2 latency range with the factors STIMULUS TYPE (standards vs. deviants) × TARGET-TO-OBJECT ASSIGNMENT (discs in the same object vs. discs in different objects) × HEMISPHERE (left vs. right) × ROI (posterior-temporal vs. frontal) confirmed that deviants exhibited significantly more negative amplitudes than standards [main effect of STIMULUS TYPE: *F*(1, 15) = 28.19, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.65]. This effect was restricted to the posterior-temporal ROI [interaction of STIMULUS TYPE × ROI, *F*(1, 15) = 53.59, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.78]. A significant threefold interaction of STIMULUS TYPE × HEMISPHERE × ROI [*F*(1, 15) = 5.56, *p* = 0.032, η<sup>2</sup><sub>p</sub> = 0.27] suggests that deviant-specific responses found at the posterior-temporal ROI were more accentuated in the right hemisphere (−2.6 μV ± 0.4 SEM vs. −2.3 μV ± 0.4 in the right vs. left hemisphere). Follow-up analyses, however, failed to reach significance [*t*(df = 15) = −1.67, *p* > 0.1]. In general, amplitudes at the posterior-temporal ROI were more negative than amplitudes at the frontal ROI [main effect of ROI, *F*(1, 15) = 20.58, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.58]. Amplitudes were modulated neither by TARGET-TO-OBJECT ASSIGNMENT [*F*(1, 15) = 3.14, *p* = 0.1] or HEMISPHERE [*F*(1, 15) = 0.03, *p* = 0.87] alone nor by any of the other possible interactions of factors [all *F*(1, 15) < 3.25, all *p* > 0.09]. Mean amplitudes of deviant and standard responses for the two target-to-object assignments are summarized separately for the posterior-temporal ROI and the frontal ROI, respectively, in **Table 2**.
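As an illustration of the dependent measure entering this ANOVA, the sketch below computes deviant-minus-standard mean amplitudes for a posterior-temporal and a frontal ROI. It is a minimal sketch on synthetic data and not the analysis pipeline of the study: the array layout, the 500 Hz sampling grid, the noise level, and the simulated deflection are assumptions. Only the electrode groupings (see Table 2) and the 246–280 ms window (named in the Discussion) are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: one (electrode, time) array per stimulus type, in microvolts.
times_ms = np.arange(-100, 500, 2)  # assumed 500 Hz sampling grid
posterior_roi = ["P5", "P6", "P7", "P8", "PO7", "PO8"]
frontal_roi = ["AF3", "AF4", "F3", "F4", "F5", "F6"]
electrodes = posterior_roi + frontal_roi

# Synthetic stand-ins for grand averages: deviants carry an extra posterior negativity.
standard = rng.normal(0.0, 0.1, size=(len(electrodes), times_ms.size))
deflection = -2.0 * np.exp(-((times_ms - 263.0) ** 2) / (2.0 * 20.0 ** 2))
deviant = standard.copy()
for name in posterior_roi:
    deviant[electrodes.index(name)] += deflection


def mean_amplitude(erp, roi, start_ms, end_ms):
    """Average an ERP over the electrodes of a ROI and over a latency window."""
    rows = [electrodes.index(name) for name in roi]
    window = (times_ms >= start_ms) & (times_ms <= end_ms)
    return float(erp[rows][:, window].mean())


posterior_difference = (mean_amplitude(deviant, posterior_roi, 246, 280)
                        - mean_amplitude(standard, posterior_roi, 246, 280))
frontal_difference = (mean_amplitude(deviant, frontal_roi, 246, 280)
                      - mean_amplitude(standard, frontal_roi, 246, 280))
print(f"posterior-temporal ROI: {posterior_difference:.2f} uV")
print(f"frontal ROI: {frontal_difference:.2f} uV")
```

In the simulated data the deviant-specific negativity is present at the posterior-temporal ROI only, which mirrors the STIMULUS TYPE × ROI pattern described above.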

The potential map of the deviant-minus-standard difference waves reveals a broadly distributed occipito-temporal two-peaked negative potential (**Figure 3**, upper row, right column). The corresponding SCD topography exhibits prominent bilateral occipito-temporal sinks accompanied by a weak source over the central occipital region and distributed weak sources over fronto-central areas (**Figure 3**, middle row, right column). Source analyses conducted by the VARETA approach show that brain activity elicited by deviant-trials is generated in the posterior part of the inferior/middle temporal gyrus (MNI coordinates *X*, *Y*, *Z*: 50/−50, −62, −10) and at the occipital pole (17/−17, −95, −1, **Figure 3**, lower row, left column). Activity elicited by standard-trials is generated more superiorly in the middle temporal gyrus (50/−50, −62, −2) and at the occipital pole (15/−15, −98, −2), too (**Figure 3**, lower row, middle column). We contrasted source localizations obtained for deviants and standards for highlighting regions with deviant-specific activation. Deviant-specific activation was generated in the inferior temporal gyrus (50/−50, −62, −10, **Figure 3**, lower row, right column) and showed a right-hemispheric accentuation.

**Table 2 | Mean amplitudes (μV) elicited by deviants (red outlines) and standards (blue outlines) at posterior-temporal ROI (electrodes P5/6, P7/8, PO7/8) and frontal ROI (electrodes AF3/4, F3/4, F5/6) in the N2-latency range.**

*Responses are displayed separately for the two target-to-object assignments. SEM are given in parentheses. As in Table 1 cells containing responses given within one experimental condition are marked by identical gray-scale and line-style (dark-gray cells with solid outlines correspond to the "same-object-standard-condition," light-gray cells with dashed outlines correspond to the "different-object-standard-condition"). Responses given to physically identical deviants and standards are contrasted line-by-line. Asterisks indicate significant differences between deviant- and standard-responses averaged over the two target-to-object assignments (\*\*\*p < 0.001).*

*Numerically the vMMN-amplitudes differed between the two target-to-object assignments (−2.83 ± 0.4 µV vs. −2.13 ± 0.4 µV when discs belonged to the same vs. different objects). However, within the present data this difference does not reach significance [interaction between the factors STIMULUS TYPE × TARGET-TO-OBJECT ASSIGNMENT F(1, 15) = 0.3, p = 0.1 when we conducted the ANOVA for the posterior ROI only].*

### **DISCUSSION**

In the present study, we investigated automatic visual object formation by testing whether task-irrelevant violations of object-based regularities are capable of (1) influencing implicit behavioral measures and (2) eliciting the vMMN—an automatic ERP-component which indexes the detection of a mismatch between the actual stimulus and a prediction generated on the basis of regularities extracted from the preceding sequence of stimuli (Kimura et al., 2011b). Importantly, in the present design violations of object-based regularities were exclusively related to the (non-salient) assignment of task-relevant elements of the display to the objects, i.e., there were no salient violations of feature-regularities. Indeed, our participants did not notice any object-related regularity or any violation of regularity within the sequence of stimuli. Nevertheless, task-irrelevant violations of object-related regularities resulted in increased reaction times. This result is in line with behavioral studies which showed that the regular organization of task-irrelevant background elements influences the processing of task-relevant items via perceptual grouping (e.g., Driver et al., 2001; Kimchi and Razpurker-Apfeld, 2004; Russell and Driver, 2005; Lamy et al., 2006; Kimchi and Peterson, 2008; Shomstein et al., 2010).

Extending these behavioral indicators, our study provides electrophysiological evidence for automatic visual object formation: compared with physically identical standard displays, those displays violating the regular target-to-object assignment elicited more negative potentials in the 246–280 ms time window over posterior-temporal electrode positions. Latency and topography of this negative difference potential correspond to the characteristics of the vMMN elicited in an experiment designed to disentangle effects of sensory, N1-refractoriness-based deviance detection from genuine cognitive effects based on the violation of predictions (Kimura et al., 2009). Our results show that in the P1-N1 latency range ERPs elicited by deviant- vs. standard-displays were nearly perfectly matched. Thus, we could show that the visual system is capable of detecting violations of higher-level regularities automatically even if those violations are not accompanied by N1-refractoriness-effects. Moreover, as we did not find any evidence for the elicitation of the P3a component, which is considered an indicator of involuntary attention shifts (for a review see Escera et al., 2000), we conclude that the irregularities in our design indeed were detected without shifting attention toward the task-irrelevant aspects of the displays. In contrast, in visual studies investigating the processing of salient lower-order regularities/irregularities the elicitation of N1-differences/vMMN was accompanied by the elicitation of the P3a. For this reason in those studies the behavioral impairment observed in the processing of irregular stimuli was ascribed to costs related to involuntary attention shifts toward task-irrelevant aspects of the stimuli (Berti and Schröger, 2001, 2004, 2006; Kimura et al., 2008a,b).
In contrast, in our design we observed increased reaction times for irregular displays compared with regular displays without an accompanying P3a (for similar results obtained in a visual multi-deviant design see, Grimm et al., 2009). Thus, the differences in the reaction times could be (at least partly) due to the facilitated processing of regular displays rather than the exclusive impaired processing of irregular displays. However, the elicitation of vMMN in our design suggests that the processing of irregular displays was associated with genuine costs, too.

So far, the automatic detection of higher-level regularities in the visual modality has been shown by means of facial emotional expressions only [reviewed in Winkler and Czigler (2012)]. Our results show that the detection of higher-level regularities is not restricted to the ecologically highly important emotional expression of human faces but extends to rather general element-to-object assignments, as regularities and irregularities in our design were solely defined on the basis of object-related characteristics. The elicitation of vMMN by object-related irregularities suggests that the process of object formation must have preceded the process of irregularity detection, i.e., our results support the notion of automatic visual object formation based on the Gestalt principle of common region. This conclusion fits with a recently published article reporting the elicitation of vMMN by violations of a conditional rule: task-irrelevant stimuli were presented pairwise in close temporal proximity; regularly, both stimuli within one pair had the same color, whereas irregularly the second stimulus of a pair took on a different color (Stefanics et al., 2011). As both colors occurred equiprobably within one experimental block, regularities/irregularities were defined on the basis of the relation between the two elements of a pair (e.g., if the first stimulus is green then the second stimulus is green, too). Pairs of stimuli in this design can be seen as objects based on the Gestalt principle of temporal proximity. Thus, as in our study, object formation must have preceded the process of irregularity detection, which suggests automatic visual object formation to be a more general mechanism.
Such automatically formed object representations were recently suggested to be regarded as components of generative models which on the one hand predict the specifics of the upcoming stimulation and which on the other hand are modified by mismatches between the predicted and the actual stimulus (Winkler and Czigler, 2012).

In our study, we identified brain structures related to the violation of object-based regularities by computing SCD maps and applying VARETA. Our SCD maps show a bilateral occipital/occipito-temporal distribution of deviant-specific negative potentials in the 246–280 ms time range. Source analysis carried out by the VARETA technique localized our object-related vMMN to the posterior part of the inferior temporal gyrus (Brodmann's area 37). In numerous articles the inferior temporal gyrus, a structure belonging to the ventral pathway of visual information processing, was shown to be associated with high-order visual object processing in humans or macaques (e.g., Baizer et al., 1991; Goodale and Milner, 1992; Malach et al., 1995; Ishai et al., 1999; Haxby et al., 2001; Grill-Spector and Malach, 2004). The localization of object-related effects of irregularity detection in the inferior temporal gyrus corroborates our recently published localization data (Müller et al., 2012). There, a combination of object-related and feature (color)-related irregularities generated activation in the inferior temporal gyrus, too. Moreover, the aforementioned vMMN studies on facial emotional expressions, i.e., on material containing higher-order regularities, also found activation related to regularity violation in the inferior temporal gyrus (Kimura et al., 2011a; Stefanics et al., 2012). In contrast, vMMN studies based on feature (orientation and/or color)-related regularities localized deviant-specific activity to earlier anatomical structures of the cortical visual system (occipital lobe, BA 19: Kimura et al., 2010; middle occipital gyrus: Urakawa et al., 2010a,b; occipital fusiform regions, BA 17, 18, 19/7: Yucel et al., 2007). The activation of different feature-/stimulus-specific cortical structures by different types of deviants parallels results from irregularity detection in the auditory modality (e.g., Alain et al., 1999; Rosburg, 2003; Grimm et al., 2006).
Interestingly, in all of the vMMN studies cited above deviant-specific activity based on higher-order irregularities or on feature irregularities was additionally found in prefrontal cortical regions (mainly the inferior frontal/medial frontal cortex). It seems plausible to assume that this deviant-specific prefrontal activation indicates involuntary attention shifts toward ecologically relevant irregularities in either facial expression (Kimura et al., 2011a; Stefanics et al., 2012) or hand laterality (Stefanics et al., 2012) and toward salient feature irregularities, respectively (Kimura et al., 2010; Urakawa et al., 2010a,b; Yucel et al., 2007). In contrast, the source localization of our non-salient object-related irregularities does not show a prefrontal activation, which might again underline that in our design there was no involuntary attentional shift and object-related information was indeed processed automatically. However, there are alternative suggestions regarding the functional role of the frontal generator of the auditory MMN: (1) sensitivity tuning for irregularity detection in the auditory modality (e.g., Doeller et al., 2003), (2) inhibiting the tendency to respond to task-irrelevant auditory irregularities (Rinne et al., 2005), or (3) updating predictive models on the nature of upcoming stimuli (e.g., Garrido et al., 2009). The latter alternative is also taken into account for explaining the function of the frontal generators of the vMMN (Kimura et al., 2011a). The vMMN studies reporting combined cortical activation of feature-/stimulus-specific regions as well as of frontal regions suggest that the detection of irregularities in both the visual and the auditory modality works in a comparable hierarchically organized manner (for a model see Garrido et al., 2009). In contrast, our results as well as several studies on irregularity detection in the auditory modality suggest that irregular stimuli can elicit a mismatch response even without an accompanying frontal activation (for a review on the frontal generator of the auditory MMN see Deouell, 2007). It remains a topic of further studies to investigate under which conditions irregularity detection is indicated by both stimulus-specific and frontal activation.

In conclusion, our results show (1) that object-based irregularities are automatically detected, presumably by the visual subsystem encoding and/or processing object-related information. That is, we showed that object formation based on the Gestalt principle of common region must have occurred before the visual input was checked for the occurrence of regularities/irregularities. As the visual regularity extraction process was shown to work automatically (Berti, 2011; Kogai et al., 2011), we conclude that the process of object formation, which in our design necessarily preceded the regularity extraction process, should work automatically, too. Thus, our results support the notion of automatic visual object formation, which parallels findings from the auditory modality for which the occurrence of automatic object formation has also been proved (e.g., Ritter et al., 2000; Atienza et al., 2003; Winkler et al., 2003; Sussman et al., 2007). (2) Although closely connected to our first conclusion, we can additionally state that the detection of irregularities within sequences of visual stimuli is not restricted to salient stimulus attributes but also works for non-salient higher-order stimulus attributes, which emphasizes the sensitivity of the processes extracting regularities from our environment.

# **ACKNOWLEDGMENTS**

This work was financially supported by the German Research Foundation (No. Schr 375/16). The experiment was realized using Cogent 2000 developed by the Cogent 2000 team at the FIL and the ICN and Cogent Graphics developed by John Romaya at the LON at the Wellcome Department of Imaging Neuroscience. We thank Katrin Bernhard and Jana Eliasova for assistance in data acquisition.

# **REFERENCES**


*J. Psychophysiol.* 21, 224–230. doi: 10.3389/fnhum.2011.00046


doi: 10.1146/annurev.neuro.27.070203.144220


*Psychophysiology* 48, 4–22. doi: 10.1111/j.1469-8986.2010.01114.x


*Psychophys.* 67, 606–623. doi: 10.3758/BF03193518


in the perception of objects. *Cogn. Psychol.* 14, 107–141. doi: 10.1016/0010-0285(82)90006-8


*Sci. U.S.A.* 100, 11812–11815. doi: 10.1073/pnas.2031891100


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 March 2013; accepted: 23 May 2013; published online: 10 June 2013.*

*Citation: Müller D, Widmann A and Schröger E (2013) Object-related regularities are processed automatically: evidence from the visual mismatch negativity. Front. Hum. Neurosci. 7:259. doi: 10.3389/fnhum.2013.00259*

*Copyright © 2013 Müller, Widmann and Schröger. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# Task difficulty affects the predictive process indexed by visual mismatch negativity

# *Motohiro Kimura\* and Yuji Takeda*

*Cognition and Action Research Group, Human Technology Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan*

#### *Edited by:*

*István Czigler, Institute of Cognitive Neuroscience and Psychology, Hungary*

#### *Reviewed by:*

*Toshihiko Maekawa, Kyushu University, Japan János Horváth, Research Centre for Natural Sciences, Hungary*

#### *\*Correspondence:*

*Motohiro Kimura, Cognition and Action Research Group, Human Technology Research Institute, National Institute of Advanced Industrial Science and Technology, Central 6, 1-1-1 Higashi, Tsukuba, Ibaraki 305-8566, Japan e-mail: m.kimura@aist.go.jp*

Visual mismatch negativity (MMN) is an event-related brain potential (ERP) component that is elicited by prediction-incongruent events in successive visual stimulation. Previous oddball studies have shown that visual MMN in response to task-irrelevant deviant stimuli is insensitive to the manipulation of task difficulty, which supports the notion that visual MMN reflects attention-independent predictive processes. In these studies, however, visual MMN was evaluated in deviant-minus-standard difference waves, which may lead to an underestimation of the effects of task difficulty due to the possible superposition of N1-difference reflecting refractory effects. In the present study, we investigated the effects of task difficulty on visual MMN, less contaminated by N1-difference. While the participant performed a size-change detection task regarding a continuously-presented central fixation circle, we presented oddball sequences consisting of deviant and standard bar stimuli with different orientations (9.1 and 90.9%) and equiprobable sequences consisting of 11 types of control bar stimuli with different orientations (9.1% each) at the surrounding visual fields. Task difficulty was manipulated by varying the magnitude of the size-change. We found that the peak latencies of visual MMN evaluated in the deviant-minus-control difference waves were delayed as a function of task difficulty. Therefore, in contrast to the previous understanding, the present findings support the notion that visual MMN is associated with attention-demanding predictive processes.

**Keywords: attention, event-related brain potential, perceptual load, predictive process, prediction error, task difficulty, visual mismatch negativity**

# **INTRODUCTION**

#### **PREDICTIVE PROCESSES INDEXED BY VISUAL MISMATCH NEGATIVITY**

The ability to extract sequential rules embedded in the temporal structure of sensory events and to predict upcoming sensory events based on the extracted sequential rules is crucial for successful adaptation to the external environment (e.g., Mumford, 1992; Friston, 2003, 2005). Recent electrophysiological studies have shown that such predictive processes in vision are well reflected by visual mismatch negativity (MMN), an event-related brain potential (ERP) component (for reviews, see Pazo-Alvarez et al., 2003; Czigler, 2007; Kimura et al., 2011; Kimura, 2012; Winkler and Czigler, 2012). Visual MMN is a negative-going ERP component with a posterior scalp distribution that usually emerges at around 150–400 ms after the onset of visual events. This component has been most typically observed in response to infrequent deviant stimuli that are randomly inserted among frequent standard stimuli (i.e., an oddball sequence). Importantly, however, the elicitation of visual MMN is not limited to such physically deviant stimuli, but rather includes a variety of stimuli that violate concrete or abstract sequential rules (e.g., Czigler et al., 2006; Kimura et al., 2010b, 2012; Stefanics et al., 2011). This leads to the notion that visual MMN emerges when a current visual event is incongruent with visual events that are predicted on the basis of extracted sequential rules (i.e., prediction error account of visual MMN; Kimura et al., 2011; Kimura, 2012).

#### **ATTENTION-INDEPENDENT PREDICTIVE PROCESSES**

One of the unique aspects of visual MMN elicitation is its automaticity. In most previous studies, visual MMN has been observed in response to deviant stimuli when oddball sequences are unrelated to the task and are not actively attended by the participant. This indicates that the elicitation of visual MMN is largely automatic and obligatory. This notion is further strengthened by the finding that visual MMN elicited by task-irrelevant deviant stimuli is insensitive to the manipulation of task difficulty (Heslenfeld, 2003; Pazo-Alvarez et al., 2004). Heslenfeld (2003) presented task-irrelevant oddball sequences consisting of deviant and standard grating stimuli with different spatial frequencies at the peripheral visual fields while the participant performed a visuo-motor tracking task that involved a small, continuously moving rectangle presented at the central visual field. The difficulty of the tracking task was manipulated among three levels (easy, moderate, and difficult) by varying the speed and frequency of changes in direction of the moving rectangle. Visual MMN elicited by deviant stimuli did not differ as a function of task difficulty. Pazo-Alvarez et al. (2004) obtained similar results. They presented task-irrelevant oddball sequences consisting of deviant and standard grating stimuli with different directions of motion at the peripheral visual fields while the participant performed a discrimination task that involved small colored digits discretely presented at the central visual field. The difficulty of the discrimination task was manipulated between two levels (easy and difficult) by asking the participant to perform either a task that involved the digit numbers (easy) or a task that involved the combination of both the digit numbers and the color of digits (difficult). Visual MMN elicited by deviant stimuli did not differ between the two task-difficulty conditions. 
According to the perceptual load theory of attention (Lavie and Tsal, 1994; Lavie, 1995, 2005), the task difficulty in the perceptual discrimination (i.e., perceptual load) of task-relevant information is one of the critical factors that determine the amount of attentional allocation to peripherally presented task-irrelevant information. Therefore, the lack of a task difficulty effect suggests that visual MMN is insensitive to the amount of attentional allocation, which leads to the notion that visual MMN reflects attention-independent predictive processes.

#### **PRESENT STUDY**

Although the results described by Heslenfeld (2003) and Pazo-Alvarez et al. (2004) support the notion that attention-independent predictive processes underlie visual MMN, this idea needs to be studied further. In these previous studies, visual MMN was evaluated by comparing ERPs elicited by infrequent deviant stimuli to those elicited by frequent standard stimuli (i.e., deviant-minus-standard difference waves). However, more recent studies have questioned the validity of this comparison for the evaluation of visual MMN (see e.g., Czigler, 2007; Kimura et al., 2011; Kimura, 2012). This is because, due to the large difference in probability between deviant and standard stimuli, the state of refractoriness (or the level of habituation) of afferent neurons that specifically respond to the feature value of deviant stimuli can be drastically lower than that of afferent neurons that specifically respond to the feature value of standard stimuli. In other words, the amplitudes of visual evoked potentials (in particular, N1) in response to deviant stimuli can be substantially greater than those of N1 in response to standard stimuli. As a result, the classical visual MMN extracted in deviant-minus-standard difference waves [we refer to this effect as deviant-related negativity (DRN)] can include not only visual MMN elicited by deviant stimuli (i.e., prediction error effects) but also the N1 difference between deviant and standard stimuli (i.e., refractory effects) (for a more detailed discussion, see e.g., Czigler et al., 2002; Kenemans et al., 2003; Kimura et al., 2009). If we consider that these two effects often overlap each other both spatially and temporally in deviant-minus-standard difference waves (see e.g., Maekawa et al., 2005; Kimura et al., 2009, 2010b), it is possible that the effects of task difficulty on visual MMN have been underestimated in previous studies (Heslenfeld, 2003; Pazo-Alvarez et al., 2004).
For example, if we assume that visual MMN and N1-difference contribute to DRN, as illustrated in **Figure 1A** (for similar empirical data, see e.g., Kimura et al., 2009, 2010b), neither the reduction of amplitudes (**Figure 1B**) nor the delay of latencies of visual MMN (**Figure 1C**) may be detected, at least with common ERP analyses that focus on the peak of DRN.
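The argument can be made concrete with a toy model in which DRN is the sum of an N1-difference and a visual MMN. The Python sketch below is only an illustration under stated assumptions: both components are modeled as Gaussian deflections, and all amplitudes, latencies, and widths are arbitrary values chosen for the demonstration, not values taken from Figure 1 or from any data set.

```python
import numpy as np

times_ms = np.arange(0, 501)  # 1 ms resolution, assumed


def gaussian_component(amplitude_uv, peak_ms, width_ms):
    """Gaussian-shaped deflection; a simplifying assumption about component shape."""
    return amplitude_uv * np.exp(-((times_ms - peak_ms) ** 2) / (2.0 * width_ms ** 2))


def peak(wave):
    """Latency (ms) and amplitude (uV) of the most negative point of a wave."""
    idx = int(np.argmin(wave))
    return float(times_ms[idx]), float(wave[idx])


# Hypothetical N1-difference: large, early, identical in all scenarios.
n1_difference = gaussian_component(-2.0, 160.0, 25.0)


def modeled_drn(vmmn_amplitude_uv, vmmn_peak_ms):
    """DRN modeled as N1-difference plus a smaller, later visual MMN."""
    return n1_difference + gaussian_component(vmmn_amplitude_uv, vmmn_peak_ms, 35.0)


baseline = peak(modeled_drn(-1.0, 220.0))  # reference scenario
reduced = peak(modeled_drn(-0.5, 220.0))   # visual MMN amplitude halved
delayed = peak(modeled_drn(-1.0, 260.0))   # visual MMN delayed by 40 ms

print("DRN peak amplitude change:", round(reduced[1] - baseline[1], 2), "uV (true change: 0.5 uV)")
print("DRN peak latency change:", delayed[0] - baseline[0], "ms (true change: 40 ms)")
```

With these assumed parameters the DRN peak is dominated by the N1-difference, so both the amplitude reduction and the latency delay of the visual MMN are strongly attenuated in peak-based DRN measures.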

**FIGURE 1 | (A)** A schematic illustration of DRN, visual MMN, and N1-difference. **(B)** The modeled DRN as the sum of visual MMN with reduced amplitudes and N1-difference. **(C)** The modeled DRN as the sum of visual MMN with delayed latencies and N1-difference.

By considering this possibility, in the present study, we examined the effects of task difficulty on visual MMN, less contaminated by N1-difference. We used a so-called "equiprobable" protocol that allows for the reliable dissociation of visual MMN and N1-difference (e.g., Kimura et al., 2009; for the original protocol, see Schröger and Wolff, 1996; Schröger, 1997; Jacobsen and Schröger, 2001). While the participant performed a size-change detection task for a small fixation circle that was continuously presented at the central visual field, we presented either (1) typical oddball sequences consisting of the randomized presentation of deviant and standard bar stimuli with different orientations (e.g., 5.0 and 37.7° to the right from the horizontal; 9.1 and 90.9%, respectively) or (2) equiprobable sequences consisting of the randomized presentation of 11 types of equiprobable control bar stimuli with different orientations (5.0, 21.4, 37.7, 54.1, 70.5, 86.8, 103.2, 119.5, 135.9, 152.3, and 168.6°; 9.1% each) at the surrounding visual fields in separate blocks (see **Figure 2B**). In this protocol, while the deviant stimuli should elicit visual MMN, the standard and control stimuli should not. This is because the standard and control stimuli do not violate any sequential rule. In addition, N1 elicited by the deviant stimuli should be equal to (or even smaller than) N1 elicited by the control stimuli, and should be greater than N1 elicited by the standard stimuli. This is because the probability of the deviant and control stimuli is kept the same (9.1%) and is lower than the probability of the standard stimuli (90.9%), and further, the physical separation among control stimuli (ca. 45.0°, on average) is kept greater than that between deviant and standard stimuli (ca. 32.7°) (for more detailed information, see Schröger and Wolff, 1996; Schröger, 1997; Jacobsen and Schröger, 2001; Kimura et al., 2009, 2010b). Thus, visual MMN (and possibly a small polarity-reversed N1-difference) should be extracted by comparing ERPs elicited by the deviant stimuli to those elicited by the control stimuli (i.e., deviant-minus-control difference waves), while N1-difference should be extracted by comparing ERPs elicited by the control stimuli to those elicited by the standard stimuli (i.e., control-minus-standard difference waves). The difficulty of the size-change detection task for the central fixation circle was manipulated among three levels (easy, moderate, and difficult) by varying the magnitude of the size-change. With this experimental design, we examined the effects of task difficulty on visual MMN as well as DRN and N1-difference, and investigated whether or not the predictive processes reflected by visual MMN are truly attention-independent.
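The subtraction logic of the equiprobable protocol can be sketched as follows. This is a minimal Python illustration on synthetic waveforms, assuming the idealized case in which deviant and control stimuli elicit identical N1 responses; the Gaussian component shapes and all parameter values are hypothetical.

```python
import numpy as np

times_ms = np.arange(0, 500, 2)  # assumed sampling grid


def bump(amplitude_uv, peak_ms, width_ms):
    """Gaussian-shaped deflection used as a stand-in for an ERP component."""
    return amplitude_uv * np.exp(-((times_ms - peak_ms) ** 2) / (2.0 * width_ms ** 2))


# Hypothetical components: N1 is attenuated for frequent standards (refractoriness)
# and equally large for the equally rare deviant and control stimuli.
n1_frequent = bump(-1.0, 150.0, 25.0)
n1_rare = bump(-3.0, 150.0, 25.0)
prediction_error = bump(-1.5, 250.0, 35.0)  # elicited by deviants only

erp_standard = n1_frequent
erp_control = n1_rare
erp_deviant = n1_rare + prediction_error

visual_mmn = erp_deviant - erp_control     # deviant-minus-control
n1_difference = erp_control - erp_standard  # control-minus-standard
drn = erp_deviant - erp_standard            # deviant-minus-standard

# By construction, DRN is the sum of the two dissociated effects.
print(np.allclose(drn, visual_mmn + n1_difference))
```

If N1 to deviants were smaller than N1 to controls, the deviant-minus-control wave would additionally contain the small polarity-reversed N1-difference mentioned above.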

# **METHODS**

#### **PARTICIPANTS**

Twenty-two undergraduate and graduate university students (7 women, 15 men; age range = 20–25 years, mean = 21.5 years) participated in this experiment. Twenty-one participants were right-handed and one was left-handed. All participants had normal or corrected-to-normal vision and were free of neurological or psychiatric disorders. Written informed consent was obtained from each participant after the nature of the study had been explained. The experiment was approved by the National Institute of Advanced Industrial Science and Technology (AIST) Safety and Ethics committee.

for the central fixation circle in three task-difficulty conditions (easy, moderate, and difficult).

#### **STIMULI AND PROCEDURE**

All stimuli were presented on a 17-inch cathode ray tube (CRT) display (Sony, Trinitron Multiscan G220), which was controlled by programs written in MATLAB (Mathworks, Inc.) with the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) installed on a computer (Apple, MacPro 1,1; NVIDIA, GeForce 7300GT). **Figure 2A** shows an example of the stimulus display consisting of a central fixation circle and surrounding bars. Eleven types of bar stimuli were used (**Figure 2B**). Each bar stimulus consisted of eight gray bars (luminance of 14.5 cd/m<sup>2</sup> and visual angle of 3.0◦ (length) × 0.4◦ (width) from a viewing distance of 70 cm, respectively) at eight surrounding locations (3.3◦ upper, lower, left, and right and 4.6◦ upper-left, upper-right, lower-left, and lower-right from the center of the display to the center of each bar, respectively) against a black background (luminance of 0.1 cd/m<sup>2</sup>). The 11 types of surrounding bar stimuli differed in the orientation of the bars (5.0, 21.4, 37.7, 54.1, 70.5, 86.8, 103.2, 119.5, 135.9, 152.3, and 168.6◦ to the right from the horizontal (ca. 16.4◦ step), respectively). The exposure duration of the surrounding bar stimuli was fixed at 250 ms and the stimulus onset asynchrony was fixed at 500 ms (i.e., the inter-stimulus interval was fixed at 250 ms) in all conditions.

These surrounding bar stimuli were presented in 23 types of stimulus sequences (**Figure 2B**): 22 oddball sequences and one equiprobable sequence. In the oddball sequences, two types of surrounding bar stimuli (deviant and standard, 11 and 110 times/block, i.e., 9.1 and 90.9%, respectively) were presented in random order, with the exception that a standard stimulus was presented at least 11 times at the beginning of each block and each deviant stimulus was followed by at least one standard stimulus. In the equiprobable sequence, 11 types of surrounding bar stimuli (control, 11 times/block each, i.e., 9.1% each) were presented in random order, with the exception that each control stimulus was followed by at least one control stimulus with a different orientation. Through the use of these stimulus sequences, we could ensure that, on average, the physical properties of deviant, standard, and control stimuli were the same, which allowed us to evaluate visual MMN as well as N1-difference and DRN without any contamination by the effects of physical differences in the eliciting stimuli. Also, in these stimulus sequences, the probability of control stimuli was kept the same as that of deviant stimuli (9.1%), and the physical separation among control stimuli (ca. 45.0◦, on average) was kept greater than that between deviant and standard stimuli (ca. 32.7◦), which guaranteed that the state of refractoriness for control stimuli was equal to (or possibly even lower than) that for deviant stimuli.
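The ordering constraints on an oddball block can be illustrated with a short sketch. The Python function below is hypothetical: the experiment itself was programmed in MATLAB with the Psychophysics Toolbox, and the actual randomization routine is not described. The sketch assumes that the rule "each deviant is followed by at least one standard" also forbids a deviant in the final position of a block.

```python
import random

def make_oddball_block(n_deviant=11, n_standard=110, n_lead_in=11, rng=None):
    """Return a list of 'S'/'D' labels for one oddball block (illustrative only)."""
    rng = rng or random.Random()
    # Glue one standard to every deviant so that a deviant is never followed
    # by another deviant and never ends the block (assumption, see above).
    n_free_standards = n_standard - n_lead_in - n_deviant
    units = ["DS"] * n_deviant + ["S"] * n_free_standards
    rng.shuffle(units)
    # The block starts with the required run of standards.
    return ["S"] * n_lead_in + list("".join(units))

block = make_oddball_block(rng=random.Random(1))
print(len(block), block.count("D"), block.count("S"))  # 121 11 110
```

Shuffling the glued units, rather than shuffling single stimuli and rejecting invalid orders, yields a valid sequence on the first draw while keeping every admissible ordering equally likely.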

In addition to the surrounding bar stimuli, a gray fixation circle (luminance of 14.5 cd/m<sup>2</sup> and visual angle of 1.1 × 1.1◦) was continuously presented at the center of the display throughout the blocks (**Figure 2A**). From time to time, the size of the fixation circle suddenly became smaller. The mean frequency of the size-change was four times/block (ranging from three to five times/block) and the exposure duration of the size-changed fixation circle was 100 ms in all conditions. To ensure that the timing of the size-change was independent of the surrounding bar stimulation, we segmented the whole period of each block into consecutive 50-ms intervals and randomly selected two consecutive intervals for the size-change (i.e., 100 ms), with the exception that the size-change did not occur within the 5.5-s interval at the beginning of each block (where the first 11 surrounding bar stimuli were presented) and at least a 1.5-s interval was inserted between a size-change and the subsequent size-change. To manipulate the task difficulty, three levels of magnitude of the size-change were used in separate blocks (**Figure 2C**): from 1.1 × 1.1◦ to 0.6 × 0.6◦ in the easy condition, to 0.8 × 0.8◦ in the moderate condition, and to 1.0 × 1.0◦ in the difficult condition.
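The timing rule for the size-changes can be sketched as follows. This is a minimal illustration rather than the original procedure: the function name and parameters are hypothetical, the block length of 60,500 ms is derived from 121 stimuli at a 500-ms stimulus onset asynchrony, and the 1.5-s spacing is assumed to run from the offset of one size-change to the onset of the next.

```python
import random

SLOT_MS = 50  # the block is divided into consecutive 50-ms intervals

def schedule_size_changes(n_changes, block_ms=60500, lead_in_ms=5500,
                          min_gap_ms=1500, change_ms=100, rng=None):
    """Return sorted onset times (ms) of the size-changes within one block."""
    rng = rng or random.Random()
    first_slot = lead_in_ms // SLOT_MS             # no change in the first 5.5 s
    last_slot = (block_ms - change_ms) // SLOT_MS  # change must end within the block
    while True:  # rejection sampling: redraw until the spacing rule holds
        slots = sorted(rng.sample(range(first_slot, last_slot + 1), n_changes))
        onsets = [s * SLOT_MS for s in slots]
        gaps = [b - (a + change_ms) for a, b in zip(onsets, onsets[1:])]
        if all(g >= min_gap_ms for g in gaps):
            return onsets

print(schedule_size_changes(4, rng=random.Random(7)))
```

Because the onsets are drawn on a 50-ms grid that is unrelated to the 500-ms stimulus cycle, the size-changes are not systematically aligned with the onsets of the surrounding bars.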

The experiment consisted of 66 blocks (11 blocks for the easy oddball condition, 11 blocks for the moderate oddball condition, 11 blocks for the difficult oddball condition, 11 blocks for the easy equiprobable condition, 11 blocks for the moderate equiprobable condition, and 11 blocks for the difficult equiprobable condition), each of which consisted of the presentation of 121 surrounding bar stimuli. For half of the participants, oddball sequences #1–11 and the equiprobable sequence (**Figure 2B**) were used, while for the other half of the participants, oddball sequences #12–22 and the equiprobable sequence (**Figure 2B**) were used. The order of these blocks was randomized across participants.

The participant was seated in a reclining chair in a sound-attenuated, electrically shielded, dimly lit room. Before the start of the experiment, the participant was instructed to focus on a fixation circle, ignore surrounding bars, and press a button with the right index finger as quickly and accurately as possible when the fixation circle became smaller. The participant was also asked to minimize any eye movement and blinking during each block. Before the start of each block, the participant was informed about the magnitude of the size-change of the fixation circle in the upcoming block (i.e., large, medium, or small).

#### **RECORDINGS**

The electroencephalogram (EEG) was recorded with a digital amplifier (Nihon-Kohden, Neurofax EEG1100) and silver-silver chloride electrodes placed at 26 scalp sites (Fp1, Fp2, F7, F3, Fz, F4, F8, FCz, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, PO7, PO3, POz, PO4, PO8, O1, Oz, and O2 according to the extended International 10–20 System). All electrodes were referenced to the nose tip. To monitor blinks and eye movements, vertical and horizontal electrooculograms (EOGs) were also recorded with two electrodes above and below the right eye and two electrodes at the right and left outer canthi of the eyes, respectively. The impedance of all electrodes was kept below 10 kΩ. The EEG and EOG signals were digitized at a sampling rate of 1000 Hz and bandpass-filtered at 1–30 Hz with a finite impulse response (FIR) filter. The EEG and EOG signals time-locked to the onset of surrounding bar stimuli were then averaged for nine categories defined by three stimulus types (deviant, standard, and control) and three task difficulties (easy, moderate, and difficult). Averaging epochs were 600 ms featuring a 100-ms pre-stimulus baseline. In the averaging procedure, (1) the first three epochs in each block, (2) epochs during which the size-change of the fixation circle occurred and the two subsequent epochs, (3) epochs during which the participant made a button press and the two subsequent epochs, (4) epochs preceded by deviant stimuli, and (5) epochs in which the signal changes exceeded ± 80μV on any of the electrodes, were excluded. As a result, the averaging number for deviant, standard, and control stimuli was, on average, 89, 735, and 921 times for the easy condition, 89, 735, and 919 times for the moderate condition, and 91, 745, and 928 times for the difficult condition, respectively.

#### **DATA ANALYSIS**

#### *Behavioral performance*

Behavioral performance was measured in terms of reaction time (ms), hit rate (%), and false alarm (times/block). Responses were scored as a hit if the button was pressed within 200–1000 ms after the onset of the change in the fixation circle. Responses outside this period were classified as a false alarm. These measures were subjected to repeated-measures ANOVAs with two factors: 2 Sequences (Oddball vs. Equiprobable) and 3 Task difficulties (Easy, Moderate, vs. Difficult). The Greenhouse–Geisser ε correction for the violation of sphericity was applied when appropriate. Effect sizes were calculated as partial eta squared (η<sup>2</sup>). *Post-hoc* comparisons involved paired *t*-tests with the Bonferroni correction.
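The scoring rule for button presses can be written out as a small sketch. The function below is hypothetical and makes one assumption that the text does not state: a second press that falls inside the response window of an already detected size-change is counted as a false alarm.

```python
def score_responses(change_onsets_ms, press_times_ms, window=(200, 1000)):
    """Classify button presses as hits or false alarms (illustrative only)."""
    lo, hi = window
    detected = set()      # indices of size-changes that received a hit
    reaction_times = []
    false_alarms = 0
    for press in sorted(press_times_ms):
        # First undetected size-change whose response window contains the press.
        target = next((i for i, onset in enumerate(change_onsets_ms)
                       if i not in detected and lo <= press - onset <= hi), None)
        if target is None:
            false_alarms += 1
        else:
            detected.add(target)
            reaction_times.append(press - change_onsets_ms[target])
    hit_rate = 100.0 * len(detected) / len(change_onsets_ms)
    return hit_rate, reaction_times, false_alarms

# Made-up timestamps (ms): two hits, one press that is too early, one repeated press.
print(score_responses([10000, 20000, 30000], [10450, 20100, 30500, 31200]))
```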

#### *ERPs and difference waves*

Grand-average deviant-minus-standard difference waves were calculated for the three task-difficulty conditions. In the difference waves, a bilateral parieto-occipital (PO7 and PO8) maximum negativity (DRN) that peaked at 188 ms (easy condition, PO8), 196 ms (moderate condition, PO8), and 203 ms (difficult condition, PO8) was observed. To decompose DRN into N1-difference and visual MMN (and possibly, small polarity-reversed N1-difference), grand-average control-minus-standard and deviant-minus-control difference waves were then calculated for the three task-difficulty conditions, respectively. In the control-minus-standard difference waves, a bilateral parieto-occipital (PO7 and PO8) maximum negativity (N1-difference) that peaked at 193 ms (easy condition, PO8), 196 ms (moderate condition, PO8), and 197 ms (difficult condition, PO8) was observed. In the deviant-minus-control difference waves, a right parieto-occipital (PO8) maximum negativity (visual MMN) that peaked at 186 ms (easy condition, PO8), 193 ms (moderate condition, PO8), and 225 ms (difficult condition, PO8) was observed.
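The decomposition rests on a simple identity: at every time point, the deviant-minus-standard wave equals the sum of the deviant-minus-control and control-minus-standard waves. The sketch below illustrates this with made-up amplitudes; the function names and the numbers are hypothetical and are not data from the study.

```python
def difference_wave(a, b):
    """Point-by-point difference a - b between two ERPs of equal length."""
    return [x - y for x, y in zip(a, b)]

def negative_peak_latency(wave, times_ms, t_min=100, t_max=300):
    """Latency of the most negative sample within [t_min, t_max] ms."""
    candidates = [i for i, t in enumerate(times_ms) if t_min <= t <= t_max]
    return times_ms[min(candidates, key=lambda i: wave[i])]

# Made-up amplitudes (microvolts) at five latencies.
times_ms = [150, 175, 200, 225, 250]
deviant  = [-1.0, -4.0, -5.5, -3.0, -1.0]
control  = [-1.0, -3.0, -4.0, -2.0, -1.0]
standard = [-1.0, -2.0, -2.0, -1.0, -1.0]

drn     = difference_wave(deviant, standard)  # deviant-minus-standard
n1_diff = difference_wave(control, standard)  # control-minus-standard
vmmn    = difference_wave(deviant, control)   # deviant-minus-control

# DRN is the sum of the two components at every sample.
print(all(abs(d - (n + v)) < 1e-12 for d, n, v in zip(drn, n1_diff, vmmn)))
print(negative_peak_latency(drn, times_ms))
```

Because the identity holds sample by sample, any attention effect that is confined to one component can be masked in the DRN when the other component dominates the summed wave.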

#### *Scalp distributions of N1-difference and visual MMN*

To compare the scalp distributions of N1-difference and visual MMN, the mean amplitudes of the control-minus-standard and deviant-minus-control difference waves (within the 11-ms time-windows including ± 5 ms from the corresponding peak) at 13 posterior electrodes in the three task-difficulty conditions were subjected to repeated-measures ANOVAs with three factors: 2 Difference waves (Control-minus-standard vs. Deviant-minus-control), 13 Electrodes (T5, P3, Pz, P4, T6, PO7, PO3, POz, PO4, PO8, O1, Oz, vs. O2), and 3 Task difficulties (Easy, Moderate, vs. Difficult). Further, the same analysis was performed on the amplitude values that were normalized by vector length, where, for each of the six conditions defined by two difference waves and three task-difficulty conditions, each amplitude value was divided by the square root of the sum of the squared amplitudes over the 13 electrode locations (McCarthy and Wood, 1985). The Greenhouse–Geisser ε correction for the violation of sphericity was applied when appropriate. Effect sizes were calculated as partial η<sup>2</sup>. *Post-hoc* comparisons involved paired *t*-tests with the Bonferroni correction.
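The vector-length normalization can be sketched in a few lines. The function name and the three-electrode example are hypothetical; the study applied the normalization over 13 electrodes.

```python
import math

def normalize_by_vector_length(amplitudes):
    """Divide each amplitude by the root of the summed squared amplitudes."""
    length = math.sqrt(sum(a * a for a in amplitudes))
    return [a / length for a in amplitudes]

# Made-up amplitudes at three electrodes: same topography, doubled strength.
weak   = [-1.0, -2.0, -2.0]
strong = [-2.0, -4.0, -4.0]
print(normalize_by_vector_length(weak))
print(normalize_by_vector_length(strong))
```

Two distributions that differ only in overall strength map onto the same normalized profile, so a Difference wave × Electrode interaction that survives normalization points to a difference in the shape of the scalp distribution rather than in amplitude alone.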

#### *Mean amplitudes of DRN, N1-difference, and visual MMN*

To test the significance of the elicitation of DRN, N1-difference, and visual MMN, the mean amplitudes of the deviant-minus-standard, control-minus-standard, and deviant-minus-control difference waves (within the 11-ms time-windows including ± 5 ms from the corresponding peak) at an electrode (PO8, where these components had the maximum amplitudes) in the three task-difficulty conditions were subjected to one-tailed paired *t*-tests. The effect sizes are presented as *d*-values. Further, to compare the mean amplitudes of each component among the three task-difficulty conditions, the mean amplitudes of each component were subjected to repeated-measures ANOVAs with one factor: 3 Task difficulties (Easy, Moderate, vs. Difficult). The Greenhouse–Geisser ε correction for the violation of sphericity was applied. Effect sizes were calculated as partial η<sup>2</sup>. *Post-hoc* comparisons involved paired *t*-tests with the Bonferroni correction.
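As a rough illustration of this step, the sketch below computes a windowed mean amplitude and a *t* statistic against zero. A paired *t*-test on two ERPs is equivalent to a one-sample test on their difference wave, which is what the code uses. The function names are hypothetical, and the effect size formula (|mean| / SD, i.e., |*t*| / √*n*) is an assumption; it appears consistent with the reported values (e.g., 8.79 / √22 ≈ 1.87).

```python
import math

def mean_amplitude(wave, times_ms, peak_ms, half_width_ms=5):
    """Mean of the samples within peak_ms +/- half_width_ms."""
    window = [v for v, t in zip(wave, times_ms) if abs(t - peak_ms) <= half_width_ms]
    return sum(window) / len(window)

def t_against_zero(values):
    """One-sample t statistic against zero and the effect size |mean| / SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    t = mean / (sd / math.sqrt(n))
    return t, abs(mean) / sd

# Made-up per-participant mean amplitudes (microvolts) of a difference wave.
t, d = t_against_zero([-1.0, -2.0, -3.0])
print(round(t, 2), round(d, 2))
```

At the 1000-Hz sampling rate used here, a window of ± 5 ms around the peak contains 11 samples.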

#### *Peak latencies of DRN, N1-difference, and visual MMN*

To estimate the peak latencies of DRN, N1-difference, and visual MMN, a jackknife method was used (Miller et al., 1998; Ulrich and Miller, 2001; Kiesel et al., 2008). With regard to 22 sub-grand-average difference waves at PO8 electrode for each component in each task-difficulty condition, the peak latency was evaluated as the time at which the difference waves reached the peak amplitude of each component, within 100–300 ms after stimulus onset. To compare the peak latencies of each component among the three task-difficulty conditions, the evaluated peak latencies of each component were then subjected to repeated-measures ANOVAs with one factor: 3 Task difficulties (Easy, Moderate, vs. Difficult). The Greenhouse–Geisser ε correction for the violation of sphericity was applied. The *F*-values were corrected according to Ulrich and Miller (2001). The effect sizes are shown as partial η<sup>2</sup>. *Post-hoc* comparisons involved paired *t*-tests with the Bonferroni correction.
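The jackknife procedure can be sketched as follows. Each sub-grand-average omits one participant, the peak latency is read from that average, and the resulting *F*-value is divided by (*n* − 1)<sup>2</sup> as proposed by Ulrich and Miller (2001). The function names and toy waveforms are hypothetical; this is an illustration of the general technique, not the code used in the study.

```python
import math

def jackknife_peak_latencies(waves, times_ms, t_min=100, t_max=300):
    """Peak latency of each leave-one-out sub-grand-average (most negative sample)."""
    n = len(waves)
    totals = [sum(samples) for samples in zip(*waves)]  # sum over participants
    candidates = [i for i, t in enumerate(times_ms) if t_min <= t <= t_max]
    latencies = []
    for wave in waves:
        # Average of the other n - 1 participants.
        sub_average = [(tot - x) / (n - 1) for tot, x in zip(totals, wave)]
        latencies.append(times_ms[min(candidates, key=lambda i: sub_average[i])])
    return latencies

def jackknife_se(values):
    """Jackknife standard error of the mean of the leave-one-out estimates."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt((n - 1) / n * sum((v - mean) ** 2 for v in values))

def corrected_f(f_uncorrected, n):
    """Correction of F for jackknifed scores: F / (n - 1)^2."""
    return f_uncorrected / (n - 1) ** 2

# Toy waveforms with minima at 180, 200, and 220 ms for three participants.
times_ms = list(range(0, 400, 10))
waves = [[(t - peak) ** 2 for t in times_ms] for peak in (180, 200, 220)]
print(jackknife_peak_latencies(waves, times_ms))  # [210, 200, 190]
```

Leave-one-out averages vary far less than single-participant waves, which is why the uncorrected test statistics are inflated and must be scaled down before they are evaluated.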

#### *Mean amplitudes of visual evoked potentials*

To examine the effects of task difficulty on visual evoked potentials, the mean amplitudes of standard and control ERPs (within each of 20 consecutive 10-ms time-windows from 100 to 300 ms) at the PO8 electrode in the three task-difficulty conditions were subjected to repeated-measures ANOVAs with two factors: 2 Stimuli (Standard vs. Control) and 3 Task difficulties (Easy, Moderate, vs. Difficult). The Greenhouse–Geisser ε correction for the violation of sphericity was applied. Effect sizes were calculated as partial η<sup>2</sup>. *Post-hoc* comparisons involved paired *t*-tests with the Bonferroni correction.

# **RESULTS**

#### **BEHAVIORAL PERFORMANCE**

The mean reaction time in the oddball condition was 459 ms (*SD* = 79), 463 ms (77), and 480 ms (77), while that in the equiprobable condition was 459 ms (78), 467 ms (76), and 479 ms (79), in the easy, moderate, and difficult conditions, respectively. Two-Way ANOVAs (2 Sequences × 3 Task difficulties) revealed a significant main effect of Task difficulty [*F*(2, 42) = 11.49, *p* < 0.001, ε = 0.97, partial η<sup>2</sup> = 0.35]. *Post-hoc* comparisons showed that the reaction time in the difficult condition was longer than those in both the easy (*p* < 0.001) and moderate conditions (*p* < 0.05). The hit rate in the oddball condition was 95.4% (*SD* = 5.3), 94.5% (6.9), and 88.4% (11.9), while that in the equiprobable condition was 95.3% (6.3), 93.4% (8.6), and 87.6% (10.9), in the easy, moderate, and difficult conditions, respectively. Two-Way ANOVAs (2 Sequences × 3 Task difficulties) revealed a significant main effect of Task difficulty [*F*(2, 42) = 21.24, *p* < 0.001, ε = 0.74, partial η<sup>2</sup> = 0.50]. *Post-hoc* comparisons showed that the hit rate in the difficult condition was lower than those in both the easy (*p* < 0.001) and moderate conditions (*p* < 0.01). The false alarm was negligible in all conditions (on average, less than 0.1 times/block). Two-Way ANOVAs (2 Sequences × 3 Task difficulties) revealed no significant effects (*F*s < 1.0).

#### **ERPS AND DIFFERENCE WAVES**

**Figure 3** shows the grand-average ERPs and EOGs in response to deviant, standard, and control stimuli in the easy (left column), moderate (middle column), and difficult conditions (right column). **Figure 4A** (left column) shows the traditional, grand-average deviant-minus-standard difference waves in the three task-difficulty conditions. A posterior negativity (DRN) that peaked at 188 ms (easy condition, PO8), 196 ms (moderate condition, PO8), and 203 ms (difficult condition, PO8) was observed. **Figure 4A** (middle column) shows the grand-average control-minus-standard difference waves in the three task-difficulty conditions. A posterior negativity (N1-difference) that peaked at 193 ms (easy condition, PO8), 196 ms (moderate condition, PO8), and 197 ms (difficult condition, PO8) was observed. **Figure 4A** (right column) shows the grand-average deviant-minus-control difference waves in the three task-difficulty conditions. A posterior negativity (visual MMN) that peaked at 186 ms (easy condition, PO8), 193 ms (moderate condition, PO8), and 225 ms (difficult condition, PO8) was observed; there was no clear sign of polarity-reversed N1.

**FIGURE 3 | Grand-average ERPs and EOGs in response to deviant, standard, and control stimuli in the easy (left column), moderate (middle column), and difficult conditions (right column).**

**FIGURE 4 | (A)** Grand-average deviant-minus-standard difference waves (left column), grand-average control-minus-standard difference waves and topographical maps of N1-difference (middle column), and grand-average deviant-minus-control difference waves and topographical maps of visual MMN (right column), in the three task-difficulty conditions (easy, moderate, and difficult). **(B)** Grand-average mean amplitudes of DRN, N1-difference, and visual MMN in the three task-difficulty conditions (electrode: PO8). Error bars indicate standard errors of the mean. **(C)** Grand-average peak latencies of DRN, N1-difference, and visual MMN in the three task-difficulty conditions (electrode: PO8). Error bars indicate standard errors of the mean with a jackknife method. Asterisks indicate a significant difference (*p* < 0.01).

#### **SCALP DISTRIBUTIONS OF N1-DIFFERENCE AND VISUAL MMN**

**Figure 4A** (middle and right columns) also shows topographical maps of N1-difference and visual MMN in the three task-difficulty conditions (within the 11-ms time-windows including ± 5 ms from the corresponding peak), respectively. The N1-difference had a scalp distribution that peaked at bilateral parieto-occipital electrodes (PO7 and PO8), while the visual MMN had a scalp distribution that peaked at a right parieto-occipital electrode (PO8), regardless of the task-difficulty condition. Three-Way ANOVAs (2 Difference waves × 13 Electrodes × 3 Task difficulties) performed on the mean amplitudes of difference waves (within the 11-ms time-windows including ± 5 ms from the corresponding peak) revealed significant main effects of Difference wave [*F*(1, 21) = 12.96, *p* < 0.01, partial η<sup>2</sup> = 0.38] and Electrode [*F*(12, 252) = 26.21, *p* < 0.001, ε = 0.19, partial η<sup>2</sup> = 0.56], as well as a significant interaction of Difference wave × Electrode [*F*(12, 252) = 7.12, *p* < 0.001, ε = 0.25, partial η<sup>2</sup> = 0.25]. Importantly, the significant interaction of Difference wave × Electrode was also present in the same Three-Way ANOVAs performed on the normalized mean amplitudes [*F*(12, 252) = 4.28, *p* < 0.01, ε = 0.26, partial η<sup>2</sup> = 0.17]. *Post-hoc* comparisons revealed that the interaction mainly arose from the fact that the scalp distribution of N1-difference was bi-lateralized, while that of visual MMN was more right-lateralized.

#### **MEAN AMPLITUDES OF DRN, N1-DIFFERENCE, AND VISUAL MMN**

**Figure 4B** shows the grand-average mean amplitudes of DRN, N1-difference, and visual MMN in the three task-difficulty conditions (within the 11-ms time-windows including ± 5 ms from the corresponding peak at PO8 electrode). For the DRN, the mean amplitude was −3.04μV (*SE* = 0.34) in the easy condition, −3.14μV (0.37) in the moderate condition, and −2.62μV (0.23) in the difficult condition. One-tailed paired *t*-tests showed that DRN was significantly elicited in the easy [*t*(21) = −8.79, *p* < 0.001, *d* = 1.87], moderate [*t*(21) = −8.59, *p* < 0.001, *d* = 1.83], and difficult conditions [*t*(21) = −11.24, *p* < 0.001, *d* = 2.39]. However, a One-Way ANOVA (3 Task difficulties) revealed no significant effect (*F* = 1.0). For the N1-difference, the mean amplitude was −1.90μV (0.27) in the easy condition, −2.25μV (0.29) in the moderate condition, and −2.15μV (0.26) in the difficult condition. One-tailed paired *t*-tests showed that N1-difference was significantly elicited in the easy [*t*(21) = −6.92, *p* < 0.001, *d* = 1.48], moderate [*t*(21) = −7.80, *p* < 0.001, *d* = 1.66], and difficult conditions [*t*(21) = −8.26, *p* < 0.001, *d* = 1.76]. However, a One-Way ANOVA (3 Task difficulties) revealed no significant effect (*F* = 1.2). For the visual MMN, the mean amplitude was −1.17μV (0.33) in the easy condition, −0.89μV (0.34) in the moderate condition, and −0.93μV (0.31) in the difficult condition. One-tailed paired *t*-tests showed that visual MMN was significantly elicited in the easy [*t*(21) = −3.56, *p* < 0.01, *d* = 0.76], moderate [*t*(21) = −2.64, *p* < 0.05, *d* = 0.56], and difficult conditions [*t*(21) = −2.97, *p* < 0.01, *d* = 0.63]. However, a One-Way ANOVA (3 Task difficulties) revealed no significant effect (*F* = 1.9).

#### **PEAK LATENCIES OF DRN, N1-DIFFERENCE, AND VISUAL MMN**

Peak latencies were calculated by a jackknife method (Miller et al., 1998; Ulrich and Miller, 2001; Kiesel et al., 2008). **Figure 4C** shows the grand-average peak latencies of DRN, N1-difference, and visual MMN in the three task-difficulty conditions (PO8 electrode). For the DRN, the peak latency was 188.7 ms (*SE* = 0.36) in the easy condition, 197.2 ms (0.25) in the moderate condition, and 203.5 ms (0.33) in the difficult condition. A One-Way ANOVA (3 Task difficulties) revealed no significant effect (*F*<sub>corrected</sub> = 1.3). For the N1-difference, the peak latency was 194.3 ms (0.36) in the easy condition, 197.6 ms (0.16) in the moderate condition, and 197.9 ms (0.10) in the difficult condition. A One-Way ANOVA (3 Task difficulties) revealed no significant effect (*F*<sub>corrected</sub> = 1.0). For the visual MMN, the peak latency was 185.9 ms (0.22) in the easy condition, 194.5 ms (0.81) in the moderate condition, and 226.2 ms (0.24) in the difficult condition. A One-Way ANOVA (3 Task difficulties) revealed a main effect of Task difficulty [*F*<sub>corrected</sub>(2, 42) = 4.35, *p* < 0.05, ε = 0.64, partial η<sup>2</sup> = 0.17]. *Post-hoc* comparisons showed that the peak latency of visual MMN was longer in the difficult condition than in the easy condition [*t*<sub>corrected</sub>(21) = 5.59, *p* < 0.01].

#### **MEAN AMPLITUDES OF VISUAL EVOKED POTENTIALS**

**Figure 5A** shows the grand-average ERPs and EOGs in response to standard (left column) and control stimuli (right column) in the three task-difficulty conditions. **Figure 5B** shows the results of Two-Way ANOVAs (2 Stimuli × 3 Task difficulties) that were performed on the mean amplitudes of ERPs elicited by standard and control stimuli (within each of 20 consecutive 10-ms time-windows from 100 to 300 ms). Reflecting the larger N1 in response to control stimuli compared to standard stimuli (see the control-minus-standard difference waves shown in the middle panel of **Figure 4A**), a significant main effect of Stimulus was revealed for the 14 consecutive 10-ms time-windows from 140 to 280 ms [*F*s(1, 21) = 6.44–88.01, *p*s < 0.05–0.001, partial η<sup>2</sup>s = 0.24–0.81]. With regard to the Task-difficulty factor, a significant main effect of Task difficulty was revealed for the 3 consecutive 10-ms time-windows from 120 to 150 ms (i.e., the latency range of P1) [*F*s(2, 42) = 4.04–5.68, *p*s < 0.05–0.001, εs = 0.93–0.96, partial η<sup>2</sup>s = 0.16–0.21]. *Post-hoc* comparisons showed that P1 elicited by both standard and control stimuli was smaller in the difficult condition than in the easy condition (*p*s < 0.05). Further, a significant interaction of Stimulus × Task difficulty was revealed for the 2 consecutive 10-ms time-windows from 110 to 130 ms (i.e., the latency range of P1) [*F*s(2, 42) = 3.56–4.78, *p*s < 0.05, εs = 0.89–0.92, partial η<sup>2</sup>s = 0.15–0.19]. *Post-hoc* comparisons showed that P1 elicited by control stimuli was smaller in the difficult condition than in the easy condition (*p*s < 0.05), while P1 elicited by standard stimuli did not differ among the three task-difficulty conditions.
Importantly, there was no significant main effect or interaction related to the Task-difficulty factor for the time-windows from 150 to 300 ms (i.e., the latency range of N1 and P2, where DRN, N1-difference, and visual MMN emerged in the difference waves, see **Figure 4A**).

# **DISCUSSION**

Previous studies have shown that DRN (most likely, consisting of visual MMN and N1-difference) is insensitive to the manipulation of task difficulty (Heslenfeld, 2003; Pazo-Alvarez et al., 2004), which supported the notion that visual MMN reflects attention-independent predictive processes. By taking into account the possible underestimation of the effect of task difficulty on visual MMN due to the superposition of N1-difference, we examined the effects of task difficulty on visual MMN, less contaminated by N1-difference, and investigated whether or not the predictive processes indexed by visual MMN are truly attention-independent.

#### **EFFECTS OF TASK DIFFICULTY ON VISUAL MMN**

Behavioral performance in the size-change detection task for the central fixation circle deteriorated (i.e., reaction times became slower and hit rates decreased) as the task difficulty increased, although no significant difference was observed between the easy and moderate conditions. These results confirm that the difficulty of the size-change detection task was successfully manipulated by varying the magnitude of the size-change (at least there was a difference between the difficult condition and the other two conditions).

In traditional deviant-minus-standard difference waves, a posterior negativity was observed at around 100–300 ms (DRN). The latency and scalp distribution are highly similar to those of DRN observed in previous studies (see e.g., Pazo-Alvarez et al., 2003; Czigler, 2007; Kimura, 2012). The DRN was then decomposed into N1-difference and visual MMN. In control-minus-standard difference waves, a posterior negativity at around 100–300 ms with no clear hemispheric dominance (N1-difference) was observed, while in deviant-minus-control difference waves, a posterior negativity at around 150–300 ms with clear right hemispheric dominance (visual MMN) was observed; there was no clear sign of polarity-reversed N1 in deviant-minus-control difference waves. The latency and scalp distribution of these two components are similar to those of N1-difference and visual MMN observed in recent studies, respectively (e.g., Kimura et al., 2009, 2010b). The significantly different scalp distributions of these two components are also consistent with the recent finding that N1-difference evaluated in control-minus-standard difference waves and visual MMN evaluated in deviant-minus-control difference waves are generated from distinct cortical areas (Kimura et al., 2010a). Further, the clear right hemispheric dominance observed for the latter posterior negativity is a characteristic of visual MMN (e.g., Kimura et al., 2009, 2010b, 2012). These observations suggest that DRN can be decomposed into N1-difference reflecting refractory effects and visual MMN reflecting prediction error effects.<sup>1</sup>

Neither the mean amplitudes nor the peak latencies of DRN were affected by the task difficulty, which is consistent with previous studies which showed that task difficulty does not affect DRN (Heslenfeld, 2003; Pazo-Alvarez et al., 2004). Task difficulty also did not affect the mean amplitudes or peak latencies of N1-difference. This result implies that task difficulty did not significantly influence the refractoriness state of afferent neurons that engage in N1. Unlike the findings regarding DRN and N1-difference, while task difficulty did not affect the mean amplitudes of visual MMN, it did affect the peak latencies of visual MMN: the peak latencies were significantly delayed in the difficult condition compared to the easy condition. This result implies that, while task difficulty did not significantly influence visual MMN elicitation itself, it strongly influenced the speed (or efficiency) of visual MMN elicitation.

The delay of peak latencies of visual MMN is not attributable to the modulation of visual evoked potentials elicited by control stimuli as a function of task difficulty. The amplitudes of ERPs elicited by control as well as standard stimuli in the latency range of P1 were slightly smaller in the difficult condition than in the easy condition: 110–150 ms for the control stimuli and 130–150 ms for the standard stimuli. This result is consistent with previous studies which demonstrated that the amplitude of P1 is a reliable index of spatial attention allocation (Hillyard et al., 1995; Mangun and Hillyard, 1995; Hillyard and Anllo-Vento, 1998) and the amplitude of P1 elicited by task-irrelevant peripheral stimuli is reduced as the task difficulty is increased from easy to difficult, via decreasing the amount of spatial attention allocated to the task-irrelevant peripheral stimuli (Handy and Mangun, 2000; Handy et al., 2001). Importantly, unlike the amplitudes of ERPs in the P1 latency range, those of ERPs in the subsequent N1 and P2 latency range were not affected by the manipulation of task difficulty for both control and standard stimuli: 150–300 ms, including the latency range of visual MMN as well as DRN and N1-difference in the difference waves. This result ensures that the delayed peak latency of the posterior negativity in the deviant-minus-control difference waves truly represents the modulation of visual MMN elicited by deviant stimuli.

The result that the peak latencies of visual MMN were delayed with an increase in the task difficulty is compatible with the expectation from the perceptual load theory (Lavie and Tsal, 1994; Lavie, 1995, 2005). This theory proposed that, as the perceptual load of task-relevant central information increases, a greater portion of the attention resources is needed for the perceptual processing of this information, and as a result, fewer residual attention resources are available to be involuntarily allocated for the perceptual processing of task-irrelevant peripheral information. The present effect of task difficulty on visual MMN can be interpreted as follows: as the difficulty of the size-change detection task increased from easy to difficult, a greater portion of the attention resources became necessary for detection of the size-change and fewer residual attention resources became involuntarily allocated to task-irrelevant surrounding bar stimuli, which caused the less rapid (less efficient) elicitation of visual MMN.

The present findings may be in line with a recent finding that visual MMN in response to task-irrelevant deviation is sensitive to the congruency between the feature dimension of task-irrelevant deviation and that of task-relevant target (Czigler and Sulykos, 2010). In that study, the authors presented taskirrelevant oddball sequences consisting of deviant and standard bar stimuli with either different colors or different orientations at the peripheral visual fields in separate blocks while the participant performed either a color- or an orientation-change detection task regarding a continuously presented shape at the central visual field in separate blocks. They found reduced amplitudes and delayed peak latencies for visual MMN in response

<sup>1</sup>One may argue that the posterior negativity extracted in the deviant-minus-control difference waves may still include N1-difference, since the smallest physical separation between control stimuli (ca. 16.4°) was smaller than the physical separation between deviant and standard stimuli (ca. 32.7°), and thus, the state of refractoriness for control stimuli might be higher than that for deviant stimuli. To rule out this possibility, we calculated ERPs elicited by control stimuli that were preceded by other control stimuli with at least ca. 32.7° physical separation. We found that, even with the newly calculated control ERPs, the same statistically significant pattern of results regarding the visual MMN and N1-difference was obtained. This indicates that the present equiprobable protocol was sufficient for keeping the state of refractoriness for control stimuli equal to (or even lower than) that for deviant stimuli.

to deviant stimuli when the feature dimensions of deviation and target were congruent (e.g., color deviant stimuli under a color-change detection task) compared to when they were incongruent (e.g., color deviant stimuli under an orientation-change detection task). They interpreted the reduction of amplitudes and delay of peak latencies of visual MMN in terms of the competition for feature-specific attentional resources (Desimone and Duncan, 1995): when congruent, the processing of task-relevant target and task-irrelevant deviation compete for feature-specific attentional resources, which leads to the suppression of visual MMN in response to the task-irrelevant deviation. More interestingly, although they did not consider the effects of task difficulty on visual MMN, as in the present study, their results showed delayed peak latencies of visual MMN for the difficult task (i.e., the color-change detection task) relative to the easy task (i.e., the orientation-change detection task) (however, their study evaluated visual MMN in deviant-minus-standard difference waves, and thus it is possible that the delayed peak latency may represent the modulation of N1 difference). Although the experimental and analysis procedures differed between the present study and that reported by Czigler and Sulykos (2010), in a broad context, the findings in these studies consistently shed new light on the attention-sensitive nature of visual MMN (for another example, see Kimura et al., 2010d; but see also Winkler et al., 2005; Berti, 2011, for contrasting examples).

In summary, we found that visual MMN can be affected by the manipulation of task difficulty. This result suggests that visual MMN is not necessarily insensitive to the amount of attentional allocation. In contrast to the previous understanding, the present study supports the notion that visual MMN involves attention-demanding predictive processes.

#### **THEORETICAL AND PRACTICAL IMPLICATIONS**

The present findings suggest that at least some portion of predictive processes underlying visual MMN elicitation is attention-demanding. According to the predictive framework of visual MMN (Kimura et al., 2011; Kimura, 2012), the elicitation of visual MMN requires the contribution of multiple predictive processes: (1) the extraction of sequential rules embedded in the temporal structure of successive visual stimulation, (2) the establishment of a predictive model that encodes the extracted sequential rules, (3) the formation of a temporally-aligned prediction about forthcoming visual events on the basis of the predictive model, and (4) the comparison of the current and predicted visual events. Visual MMN is the output of these predictive processes: when incongruence has been detected via the comparison, visual MMN emerges. According to this framework, the delay of visual MMN elicitation observed in the present study implies that the comparison process required more time as the task difficulty increased from easy to difficult. At present, it is difficult to determine whether the delay represents the direct influence of attention on the comparison process (cf. Berti, 2011) or is the result of attentional influence on processes earlier than the comparison process (cf. Kimura et al., 2010b,c), providing no clue as to which part of the predictive process is attention-demanding. Determination of the attention-sensitivity of each process should be an important challenge in future visual MMN research, which could lead to the establishment of an integrative theory of sensory prediction and attention.

Research in this area should be important not only for theoretical development but also for practical progress. To date, visual MMN has been used in clinical studies as an effective tool for investigating preattentive visual processing, and has shed new light on its abnormality in the elderly (e.g., Tales et al., 2002) and several clinical populations (e.g., Tales and Butler, 2006; Tales et al., 2008; Urban et al., 2008; Chang et al., 2011; Qiu et al., 2011). With regard to such clinical applications, the present findings offer two implications. First, although visual MMN can be reliably regarded as a reflection of automatic visual processing (in that the elicitation of visual MMN does not require attention to be actively directed to visual stimulation), it can no longer be regarded as a reflection of preattentive visual processing (in that not all of the predictive processes that underlie visual MMN elicitation can be considered to be attention-independent). Second, possible attentional influences on visual MMN should always be taken into account: significant between-group differences in visual MMN may represent differences in automatic visual processing itself or may represent differences in attentional influences on automatic visual processing. A better understanding of the attention-sensitivity of the aforementioned predictive processes would be helpful for optimizing experimental design, so that such an ambiguous interpretation can be avoided.

Finally, the present findings suggest that visual MMN may be an effective tool in ergonomics (human factors) studies. In this research field, there has been a substantial interest in the utility of ERPs for the assessment of mental workload in the laboratory or real-world tasks (Donchin et al., 1986; Kramer and Weber, 2000). One of the major ERP procedures for assessing the mental workload is the so-called "probe" technique. In this procedure, while the participant performs a certain primary task, stimuli that are unrelated to the primary task (i.e., probe stimuli) are presented concurrently. To date, it has been suggested that P300, sensory evoked potentials, or other ERPs in response to probe stimuli can be used to assess the mental workload in the primary task (e.g., Kramer et al., 1983, 1995; Wickens et al., 1983; Ullsperger et al., 2001). Although the conditions for the application of visual probe stimuli would be fairly limited compared to those for the application of auditory or somatosensory probe stimuli, the utility of visual MMN in ergonomics applications may deserve more attention, given the unique (automatic but still attention-sensitive) nature of visual MMN.

#### **CONCLUSIONS**

The present study demonstrated that visual MMN can be affected by the manipulation of task difficulty, which suggests that visual MMN is sensitive to the amount of attentional allocation. In contrast to the previous understanding, the present finding supports the notion that visual MMN involves attention-demanding predictive processes.

#### **REFERENCES**


and visuocortical processing: event-related potentials reveal sensory-level selection. *Psychol. Sci.* 12, 213–218. doi: 10.1111/1467-9280.00338


of the data of Kimura et al. (2009). *Neurosci. Lett.* 485, 198–203. doi: 10.1016/j.neulet.2010.09.011


characterization of mismatch negativity to a visual stimulus. *Clin. Neurophysiol.* 116, 2392–2402. doi: 10.1016/j.clinph.2005.07.006


of sequential regularity violation. *Front. Hum. Neurosci.* 5:46. doi: 10.3389/fnhum.2011.00046


doi: 10.1097/00001756-200205240-00014


a psychophysiological analysis of the reciprocity of information-processing resources. *Science* 221, 1080–1082. doi: 10.1126/science.6879207


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 March 2013; accepted: 24 May 2013; published online: 12 June 2013.*

*Citation: Kimura M and Takeda Y (2013) Task difficulty affects the predictive process indexed by visual mismatch negativity. Front. Hum. Neurosci. 7:267. doi: 10.3389/fnhum.2013.00267*

*Copyright © 2013 Kimura and Takeda. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# **HUMAN NEUROSCIENCE**

# The visual mismatch negativity elicited with visual speech stimuli

# *Benjamin T. Files<sup>1</sup>, Edward T. Auer Jr.<sup>2</sup> and Lynne E. Bernstein<sup>1,2</sup>\**

*<sup>1</sup> Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, USA*

*<sup>2</sup> Communication Neuroscience Laboratory, Department of Speech and Hearing Science, George Washington University, Washington, DC, USA*

#### *Edited by:*

*István Czigler, Hungarian Academy of Sciences, Hungary*

#### *Reviewed by:*

*Erich Schröger, University of Leipzig, Germany*

*Paula P. Alvarez, University of Santiago de Compostela, Spain*

#### *\*Correspondence:*

*Lynne E. Bernstein, Communication Neuroscience Laboratory, Department of Speech and Hearing Science, George Washington University, 550 Rome Hall, Washington, DC 20052, USA e-mail: lbernste@gwu.edu*

The visual mismatch negativity (vMMN), deriving from the brain's response to stimulus deviance, is thought to be generated by the cortex that represents the stimulus. The vMMN response to visual speech stimuli was used in a study of the lateralization of visual speech processing. Previous research suggested that the right posterior temporal cortex has specialization for processing simple non-speech face gestures, and the left posterior temporal cortex has specialization for processing visual speech gestures. Here, visual speech consonant-vowel (CV) stimuli with controlled perceptual dissimilarities were presented in an electroencephalography (EEG) vMMN paradigm. The vMMNs were obtained using the comparison of event-related potentials (ERPs) for separate CVs in their roles as deviant vs. their roles as standard. Four separate vMMN contrasts were tested, two with the perceptually *far* deviants (i.e., "zha" or "fa") and two with the *near* deviants (i.e., "zha" or "ta"). Only *far* deviants evoked the vMMN response over the left posterior temporal cortex. All four deviants evoked vMMNs over the right posterior temporal cortex. The results are interpreted as evidence that the left posterior temporal cortex represents speech contrasts that are perceived as different consonants, and the right posterior temporal cortex represents face gestures that may not be perceived as different CVs.

**Keywords: speech perception, visual perception, lipreading, scalp electrophysiology, mismatch negativity (MMN), hemispheric lateralization for speech**

# **INTRODUCTION**

The visual mismatch negativity (vMMN) paradigm was used here to investigate visual speech processing. The MMN response was originally discovered and then extensively investigated with auditory stimuli (Näätänen et al., 1978, 2011). The classical auditory MMN is generated by the brain's automatic response to a change in repeated stimulation that exceeds a threshold corresponding approximately to the behavioral discrimination threshold. It is elicited by violations of regularities in a sequence of stimuli, whether the stimuli are attended or not, and the response typically peaks 100–200 ms after onset of the deviance (Näätänen et al., 1978, 2005, 2007). The violations that generate the auditory MMN can range from low-level stimulus deviations such as the duration of sound clicks (Ponton et al., 1997) to high-level deviations such as speech phoneme category (Dehaene-Lambertz, 1997). More recently, the vMMN was confirmed (Pazo-Alvarez et al., 2003; Czigler, 2007; Kimura et al., 2011; Winkler and Czigler, 2012). It too is elicited by a change in regularities in a sequence of stimuli, across different levels of representation, including deviations caused by spatiotemporal visual features (Pazo-Alvarez et al., 2004), conjunctions of visual features (Winkler et al., 2005), emotional faces (Li et al., 2012; Stefanics et al., 2012), and abstract visual stimulus properties such as bilateral symmetry (Kecskes-Kovacs et al., 2013) and sequential visual stimulus probability (Stefanics et al., 2011).

Speech can be perceived visually by lipreading, and visual speech perception is carried out automatically by hearing as well as by hearing-impaired individuals (Bernstein et al., 2000; Auer and Bernstein, 2007). Inasmuch as perceivers can visually recognize the phonemes (consonants and vowels) of speech through lipreading, the stimuli are expected to undergo hierarchical visual processing from simple features to complex representations along the visual pathway (Grill-Spector et al., 2001; Jiang et al., 2007b), just as are other visual objects, including faces (Grill-Spector et al., 2001), facial expression (Li et al., 2012; Stefanics et al., 2012), and non-speech face gestures (Puce et al., 1998, 2000, 2007; Bernstein et al., 2011). Crucially, because the vMMN deviation detection response is thought to be generated by the cortex that represents the standard and deviant stimuli (Winkler and Czigler, 2012), it should be possible to obtain the vMMN in response to deviations in visual speech stimuli. However, previous studies in which a speech vMMN was sought produced mixed success in obtaining a deviance response attributable to visual speech stimulus deviance detection (Colin et al., 2002, 2004; Saint-Amour et al., 2007; Ponton et al., 2009; Winkler and Czigler, 2012). A few studies have even sought an auditory MMN in response to visual speech stimuli (e.g., Sams et al., 1991; Möttönen et al., 2002).

The present study took into account how visual stimuli conveying speech information might be represented and mapped to higher levels of cortical processing, say for speech category perception or for other functions such as emotion, social, or gaze perception. That is, the study was specifically focused on the perception of the physical visual speech stimulus. The distinction between representations of the forms of exogenous stimuli vs. representation of linguistic categories is captured in linguistics by the terms *phonetic form* vs. *phonemic category*. Phonetic forms are the exogenous physical stimuli that convey the linguistically relevant information used to perceive the speech category to which the stimulus belongs. Visual speech stimuli convey linguistic phonetic information primarily via the visible gestures of the lips, jaw, cheeks, and tongue, which support the system of phonological contrasts that underlie speech phonemes (Yehia et al., 1998; Jiang et al., 2002; Bernstein, 2012). Phonemic categories are the consonant and vowel categories that a language uses to differentiate and represent words. If visual speech is processed similarly to auditory speech stimuli, functions related to higher-level language processing, such as categorization and semantic associations, are carried out beyond the level of exogenous stimulus form representations (Scott and Johnsrude, 2003; Hickok and Poeppel, 2007).

This study was concerned with the implications for cortical representation of visual speech stimuli in the case that speech perception is generally left-lateralized. There is evidence for form-based speech representations in high-level visual areas, and there is evidence that they are left-lateralized (Campbell et al., 2001; Bernstein et al., 2011; Campbell, 2011; Nath and Beauchamp, 2012). For example, Campbell et al. (1986) showed that a patient with right-hemisphere posterior cortical damage failed to recognize faces but had preserved speech lip-shape recognition, and that a patient with left-hemisphere posterior cortical damage failed to recognize speech lip-shapes but had preserved face recognition.

Recently, evidence for hemispheric lateralization was obtained in a study designed to investigate specifically the site(s) of specialized visual speech processing. Bernstein et al. (2011) applied a functional magnetic resonance imaging (fMRI) block design while participants viewed video and point-light speech and nonspeech stimuli and tiled control stimuli. Participants were imaged during localizer scans for three regions of interest (ROIs), the fusiform face area (FFA) (Kanwisher et al., 1997), the lateral occipital complex (LOC) (Grill-Spector et al., 2001), and the human visual motion area V5/MT. These three areas were all under-activated by speech stimuli. Although both posterior temporal cortices responded to speech and non-speech stimuli, only in the left hemisphere was an area found with differential sensitivity to speech vs. non-speech face gestures. It was named the *temporal visual speech area* (TVSA) and was localized to the posterior superior temporal sulcus and adjacent posterior middle temporal gyrus (pSTS/pMTG), anterior to cortex that was activated by non-speech face movement in video and point-light stimuli. TVSA is similarly active across video and point-light stimuli. In contrast, right-hemisphere activity in the pSTS was not reliably different for speech vs. non-speech face gestures. Research aimed at non-speech face gesture processing has also produced evidence of right-hemisphere dominance for non-speech face gestures, with a focus in the pSTS (Puce et al., 2000, 2003).

The approach in the current study was based on predictions for how the representation of visual speech stimuli should differ for the right vs. left posterior temporal cortex under the hypothesis that the left cortex has tuning for speech, but the right cortex has tuning for non-speech face gestures. Specifically, lipreading relies on highly discriminable visual speech differences. Visual speech phonemes are not necessarily as distinctive as auditory speech phonemes. Visual speech consonants are known to vary in terms of how distinct they are from each other, because some of the distinctive speech features used by listeners (e.g., voicing, manner, nasality, place) to distinguish phonemes are not visible or are less visible to lipreaders (Auer and Bernstein, 1997; Bernstein, 2012). A left posterior temporal cortex area specialized for speech processing, part of an extensive speech processing pathway, is expected to be tuned to represent linguistically useful exogenous phonetic forms, that is, forms that can be mapped to higher-level linguistic categories, such as phonemes. However, when spoken syllables (e.g., "zha" and "ta") do not provide enough visual phonetic feature information, their representations are expected to generalize. That is, the indistinct stimuli activate overlapping neural populations. This is depicted in **Figure 1**, for which the visually *near* (perceptual categories are not distinct) syllables "ta" and "zha" are represented by almost completely overlapping ovals in the box labeled *left posterior temporal visual cortex*. The perceptually *far* stimulus "fa," a stimulus that shares few visible phonetic features with "zha," is depicted within its own non-overlapping oval in that box. Here, using the vMMN paradigm, a deviance response was predicted for the left hemisphere with the stimuli "zha" vs. "fa," representing a *far* contrast. But the *near* contrast "zha"-"ta," depicted in **Figure 1**, was not predicted to elicit the vMMN response by the left posterior temporal cortex for "zha" or for "ta" syllables.

In contrast, the right posterior temporal cortex, with its possible dominance for processing simple non-speech face motions such as eye open vs. closed, and simple lips open vs. closed (Puce et al., 2000, 2003), was predicted to generate a deviance response to both perceptually *near* and *far* speech stimulus contrasts. The depiction in **Figure 1** for the right posterior temporal cortex shows that the stimulus differences are represented there more faithfully (i.e., there are more neural units that are not in common). The right posterior temporal cortex is theoretically more concerned with perception of non-speech face gestures, for example, gestures related to visible emotion or affect: The representations may even be more analog in the sense that they are not used as input to a generative system that relies on combinations of representations (i.e., vowels and consonants) to produce a very large vocabulary of distinct words.

Even very simple low-level visual features or non-speech face or eye motion in the speech video clips can elicit the vMMN (Puce et al., 2000, 2003; Miki et al., 2004; Thompson et al., 2007). With natural speech production, phonetic forms vary from one production to the next. An additional contribution to variability is the virtually inevitable shifts in the talker's head position, eye gaze, eyebrows, etc., from video recording to recording. Subtle differences are not necessarily so obvious on a single viewing, but the vMMN paradigm involves multiple stimulus repetitions, which can render subtle differences highly salient.

The approach here was to use two recordings for each consonant and to manipulate the stimuli to minimize non-phonetic visual cues that might differentiate the stimuli. The study design took into account the likelihood that the deviance response to speech stimuli would be confounded with low-level stimulus differences, if it involved a stimulus as standard (e.g., "zha") vs. a different stimulus as deviant (e.g., "fa"). Therefore, the vMMN was sought using the event-related potentials (ERPs) obtained with the same stimulus (e.g., "zha") in its two possible roles of standard and deviant. Stimulus discriminability was verified prior to ERP recording. During ERP recording, participants monitored for a rare target phoneme to engage their attention and hold it at the level of phoneme categorization, rather than at the level of stimulus discrimination.


# **METHOD**

# **PARTICIPANTS**

Participants were screened for right-handedness (Oldfield, 1971), normal or corrected to normal vision (20/30 or better in both eyes using a traditional Snellen chart), normal hearing, American English as a first and native language, and no known neurological deficits. Lipreading was assessed with a screening test that has been used to test a very large sample of normal hearing individuals (Auer and Bernstein, 2007). The screening cutoff was 15% words correct in isolated sentences to assure that participants who entered the EEG experiment had some lipreading ability. Forty-nine individuals were screened (mean age = 23 years), and 24 (mean age = 24, range 21–31, 18 female, lipreading score *M* = 28.7% words correct) met the inclusion criteria for entering the EEG experiment. The EEG data from 11 participants (mean age = 23.2, range 19–31, 7 female, lipreading score *M* = 33.0) were used here. The other 13 were excluded: one participant was lost to contact, one ended the experiment early, two had unacceptably high initial impedance levels and were not recorded, and nine had high electrode impedances, excessive bridging between electrodes, or unacceptable noise levels. Informed consent was obtained from all participants. Participants were paid. The research was approved by the Institutional Review Boards at George Washington University and at the University of Southern California.

# **STIMULI**

# *Stimulus dissimilarity*


The stimuli for this study were selected to be of predicted perceptual and physical dissimilarities. Estimates of the dissimilarities and the video speech stimuli themselves were obtained from Jiang et al. (2007a), which gives a detailed description of the methods for predicting and testing dissimilarity. Based on the dissimilarity measures in Jiang et al. (2007a), the stimulus pair "zha"—"fa," with modeled dissimilarity of 4.04, was chosen to be perceptually *far*, and the stimulus pair "zha"—"ta," with modeled dissimilarity of 2.28 was chosen to be perceptually *near*. In a subsequent study, Files and Bernstein (submitted) tested whether the modeled dissimilarities among a relatively large selection of syllables correctly predicted stimulus discriminability, and they did.

# *Stimulus video*

Stimuli were recorded so that the talker's face filled the video screen, and lighting was from both sides and slightly below his face. A production quality camera (Sony DXC-D30 digital) and video recorder (Sony UVW 1800) were used simultaneously with an infrared motion capture system (Qualisys MCU120/240 Hz CCD Imager) for recording 3-dimensional (3D) motion of 20 retro-reflectors affixed to the talker's face. The 3D motion recording was used by Jiang et al. (2007a) in developing the dissimilarity estimates. There were two video recordings of each of the syllables, "zha," "ta," and "fa" that were used for eliciting the vMMNs. Two tokens of "ha," and of "va" were used as targets to control attention during the vMMN paradigm. All video was converted to grayscale.

In order to reduce differences in the durations of preparatory mouth motion across stimulus tokens and increase the rate of data collection, some video frames were removed from slow uninformative mouth opening gestures. But most of the duration differences were reduced by removing frames from the final mouth closure. No frames were removed between the sharp initiation of articulatory motion and the quasi-steady-state portion of the vowel.

During the EEG experiment, the video clips were displayed contiguously through time. To avoid responses due to minor variations in the position of the head from the end of one token to the beginning of the next, morphs of 267 ms were generated (Abrosoft's FantaMorph5) to create smooth transitions from one token to the next. The morphing period corresponded to the inter-stimulus-interval.

The first frame of each token was centered on the video monitor so that a motion-capture dot that was affixed at the center of the upper lip was at the same position for each stimulus. Also, stimuli were processed so that they would not be identifiable based solely on the talker's head movement. This was done by adding a small amount of smooth translational motion and rotation to each stimulus on a frame-by-frame basis. The average motion speed was 0.5 pixels per frame (0.87° of visual angle/s), with a maximum of 1.42 pixels per frame (2.5°/s). Rotation varied between plus and minus 1.2° of tilt, with an average change of 0.055° of tilt per frame (3.28°/s) and a maximum change of 0.15° of tilt per frame (9.4° of tilt/s). A stationary circular mask with radius 5.5° of visual angle and luminance equal to the background masked off the area around the face of the talker.
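The paired units reported for translational motion imply a fixed conversion between per-frame and per-second values. The following sketch is not part of the original methods; it back-computes that factor from the reported average speed and checks it against the reported maximum. The function and variable names are illustrative, and the display geometry and frame rate behind the factor are not stated in the text.

```python
# Reported translation speeds: pixels per frame and degrees of visual angle per second.
MEAN_PX_PER_FRAME = 0.5
MEAN_DEG_PER_S = 0.87
MAX_PX_PER_FRAME = 1.42
MAX_DEG_PER_S = 2.5


def implied_scale(px_per_frame: float, deg_per_s: float) -> float:
    """Return the (deg/s) per (pixel/frame) factor implied by one reported pair."""
    return deg_per_s / px_per_frame


def to_deg_per_s(px_per_frame: float, scale: float) -> float:
    """Convert a per-frame pixel speed to deg/s using an implied scale factor."""
    return px_per_frame * scale


# Derive the factor from the average speed, then apply it to the maximum.
scale = implied_scale(MEAN_PX_PER_FRAME, MEAN_DEG_PER_S)
predicted_max = to_deg_per_s(MAX_PX_PER_FRAME, scale)

print(f"implied scale: {scale:.2f} (deg/s) per (px/frame)")
print(f"predicted maximum: {predicted_max:.2f} deg/s (reported: {MAX_DEG_PER_S})")
```

The predicted maximum agrees with the reported value to within rounding, which suggests the two translation figures share one conversion.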

# *Stimulus alignment and deviation points*

The two tokens of each consonant (e.g., "zha") varied somewhat in their kinematics, so temporal alignments had to be defined prior to averaging the EEG data. We developed a method to align tokens of each syllable. Video clips were compared frame by frame separately for "zha," "fa," and "ta." In addition, mouth opening area was measured as the number of pixels encompassed within a manual tracing of the vermillion border in each frame of each stimulus. Visual speech stimulus information is widely distributed on the talking face (Jiang et al., 2007a), but mouth opening area is a gross measure of speech stimulus kinematics. **Figure 2** shows the mouth-opening area and video of the lips for the three different consonant-vowel (CV) stimuli and the two different tokens of each of them. The stimuli began with a closed neutral mouth and face, followed by the gesture into the consonant, followed by the gesture into the /a/ vowel ("ta," "fa," "zha"). Consonant identity information develops across time and continues to be present as the consonant transitions into the following vowel. The steep mouth opening gesture into the vowel partway through the stimulus was considered a possible landmark for temporal alignment, because it is a prominent landmark in the mouth area trace, but using this landmark in some cases brought the initial part of the consonant into gross misalignment. The frames comprising the initial gesture into the consonant were chosen to be the relevant landmark for alignment across tokens, because they are the earliest indication of the consonant identity (Jesse and Massaro, 2010).
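The mouth-opening measure described above, a count of the pixels enclosed by a manual tracing, can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the tracing is stored as a polygon of (x, y) vertices and that a pixel counts as enclosed when its center lies inside the polygon. All names and the example outline are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Ray-casting test: True if (x, y) lies inside the closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mouth_opening_area(polygon: List[Point], width: int, height: int) -> int:
    """Count pixels whose centers fall inside the traced outline."""
    return sum(
        point_in_polygon(col + 0.5, row + 0.5, polygon)
        for row in range(height)
        for col in range(width)
    )


# Hypothetical tracing: a 4 x 3 pixel rectangle inside a 10 x 10 frame.
outline = [(2.0, 3.0), (6.0, 3.0), (6.0, 6.0), (2.0, 6.0)]
area = mouth_opening_area(outline, width=10, height=10)
print(area)
```

Repeating the count for every frame of a token would give an area trace over time of the kind plotted in Figure 2.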

The question was then, when did the image of one consonant (e.g., "fa") deviate from the image of the other (e.g., "zha"). The MMN is typically elicited by stimulus deviation, rather than stimulus onset (Leitman et al., 2009), and this deviation onset point is used to characterize the relative timing of the vMMN. Typically, ERPs to visual stimuli require steep visual energy change (Besle et al., 2004), but visual speech stimulus onset can be relatively slow-moving, depending on the speech phonetic features. Careful examination of the videos shows that differences in the tongue are visible across the different consonants. The "zha" is articulated by holding the tongue in a quasi-steady-state somewhat flattened position in the mouth. This articulation is expected to take longer to register as a deviation, because of its subtle initial movement. The "ta" and "zha" stimuli vary primarily in terms of tongue position, which is visible but difficult to discern without attention to the tongue inside the mouth aperture. The deviation onset point here was defined as the first frame at which there was a visible difference across consonants. The 0-ms points in this report are set at the relevant deviation point and vMMN times are reported relative to the deviation onset.

#### **PROCEDURES**

#### *Discrimination pre-test*

To confirm the discriminability of the consonants comprising the critical contrasts in the EEG experiment, participants carried out a *same-different* perceptual discrimination task that used "zha"— "fa", and "zha"—"ta" *different* stimulus pairs. The two tokens of each syllable were combined in each of four possible ways and in both possible orders. *Same* pairs used different tokens of the same syllable, so that accurate discrimination required attention to consonant category. This resulted in six unique *same* pairs and 16 unique *different* pairs. To reduce the difference in number of *same* pairs vs. the number of *different* pairs, the *same* pairs were repeated, resulting in 12 *same* pairs and 16 *different* pairs per block, for a total of 28 pairs per block. During each trial, the inter-stimulus interval was filled by a morph transition from the end of the first token to the start of the second lasting 267 ms. Instructions emphasized that the tokens might differ in various ways, but that the task was to determine if the initial consonants were the same or different. Eleven blocks of pseudo-randomly ordered trials were presented. The first block was used for practice to ensure the participants' familiarity with the task, and it was not analyzed.
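The pair counts given above can be reproduced by enumerating the token combinations. The sketch below is illustrative only; the token labels 1 and 2 and the tuple representation are assumptions introduced here.

```python
from itertools import product

TOKENS = (1, 2)                       # two recorded tokens per syllable
SYLLABLES = ("zha", "fa", "ta")
CONTRASTS = (("zha", "fa"), ("zha", "ta"))

# "Same" pairs: the two different tokens of one syllable, in both orders.
same_pairs = [
    ((syl, first), (syl, second))
    for syl in SYLLABLES
    for first, second in product(TOKENS, repeat=2)
    if first != second
]

# "Different" pairs: every token combination of a contrast, in both orders.
different_pairs = []
for syl_a, syl_b in CONTRASTS:
    for tok_a, tok_b in product(TOKENS, repeat=2):
        different_pairs.append(((syl_a, tok_a), (syl_b, tok_b)))
        different_pairs.append(((syl_b, tok_b), (syl_a, tok_a)))

# Each same pair appears twice per block to reduce the imbalance.
block = same_pairs * 2 + different_pairs
print(len(same_pairs), len(different_pairs), len(block))
```

The enumeration yields six unique same pairs, 16 unique different pairs, and 28 trials per block, matching the design as described.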

#### *vMMN procedure*

EEG recordings were obtained during an oddball paradigm in which standard, deviant, and target stimuli were presented. If one stimulus category is used as the standard and a different category stimulus is used as the deviant in deriving the vMMN, the vMMN also contains a response to the physical stimuli (Czigler et al., 2002). In order to compare ERPs to standards vs. deviants, holding the stimulus constant, each stimulus was tested in the roles of deviant and standard across different recording blocks (**Table 1**)<sup>1</sup>.
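The same-stimulus comparison described above amounts to subtracting the average ERP for a syllable in its standard role from the average ERP for that syllable in its deviant role. A minimal sketch follows, assuming epochs are equal-length lists of voltage samples already aligned to the deviation onset. The function names and toy values are hypothetical and do not represent the authors' analysis pipeline.

```python
from typing import List, Sequence


def average_erp(epochs: Sequence[Sequence[float]]) -> List[float]:
    """Average equal-length single-trial epochs sample by sample."""
    n_trials = len(epochs)
    return [sum(samples) / n_trials for samples in zip(*epochs)]


def difference_wave(deviant_epochs: Sequence[Sequence[float]],
                    standard_epochs: Sequence[Sequence[float]]) -> List[float]:
    """Deviant-minus-standard ERP for one stimulus tested in both roles."""
    deviant = average_erp(deviant_epochs)
    standard = average_erp(standard_epochs)
    return [d - s for d, s in zip(deviant, standard)]


# Toy epochs (microvolts) for one syllable, e.g. "zha", in each role.
zha_as_deviant = [[0.0, -2.0, -1.0], [0.0, -4.0, -1.0]]
zha_as_standard = [[0.0, -1.0, -1.0], [0.0, -1.0, -1.0], [0.0, -1.0, -1.0]]

vmmn = difference_wave(zha_as_deviant, zha_as_standard)
print(vmmn)
```

Because both averages come from the same physical stimulus, any remaining difference reflects the stimulus's role in the sequence rather than its low-level visual properties.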

EEG recording comprised 40 stimulus blocks divided across four block types (**Table 1**). Each block type had one *standard* consonant (i.e., "zha," "fa," or "ta"), one *deviant* consonant (i.e., "zha," "fa," or "ta"), and one *target* consonant (i.e., "ha," or "va"). The "zha" served as *deviant* or *standard* with either "fa" or "ta." Thus, four vMMNs were sought: (1) "zha" in the context of "ta" (*near*); (2) "ta" in the context of "zha" (*near*); (3) "zha" in the



*Each block had a standard syllable, a deviant syllable and a target syllable. aDissimilarity measures the difference between the standard and the deviant syllable.*

context of "fa" (*far*); and (4) "fa" in the context of "zha" (*far*). Each vMMN was based on 10 stimulus blocks with the vMMN stimulus in either deviant or standard role. During each block, a *deviant* was always preceded by five to nine *standards*. At the beginning of a block, the *standard* was presented 9 times before the first *deviant*. The inter-stimulus-interval was measured as the duration of the morphs between the end of a stimulus and the beginning of the next, which was 267 ms.

To ensure that the visual stimuli were attended, participants were instructed to monitor the stimuli carefully for a *target* syllable. At the start of each block, the target syllable was identified by presenting it six times in succession. A *target* was always preceded by three to five *standards*. Participants were instructed to press a button upon detecting the target, which they were told would happen rarely. In each block, the *target* was presented four times, and the *deviant* was presented 20 times. In all, 85.4% of stimuli in a block were standards, 12.1% were deviants and 2.4% were targets. This corresponded to 200 *deviant* trials and ∼1400 *standard* trials per contrast per subject. The first *standard* trial following either a *deviant* trial or a *target* trial was discarded from analysis, because a standard following something other than a standard might generate an MMN (Sams et al., 1984; Nousak et al., 1996). This resulted in 1160 *standard* trials for computing the vMMN.
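The trial counts above can be checked with a few lines of arithmetic. The sketch below is only one reading that is consistent with the reported figures: it assumes that exactly one standard was discarded after every deviant and every target, and it treats the approximate value of 1400 standards per contrast as if it were exact. The variable names are illustrative.

```python
# Per-block counts stated in the text.
DEVIANTS_PER_BLOCK = 20
TARGETS_PER_BLOCK = 4
BLOCKS_PER_CONTRAST = 10

# Approximate figure from the text (~1400 standards per contrast per subject),
# treated here as exact for the purpose of the check.
APPROX_STANDARDS_PER_CONTRAST = 1400

deviants_per_contrast = DEVIANTS_PER_BLOCK * BLOCKS_PER_CONTRAST

# Assumption: one standard is dropped after each deviant and after each target.
discarded_standards = (DEVIANTS_PER_BLOCK + TARGETS_PER_BLOCK) * BLOCKS_PER_CONTRAST
standards_analyzed = APPROX_STANDARDS_PER_CONTRAST - discarded_standards

print(deviants_per_contrast, discarded_standards, standards_analyzed)
# prints: 200 240 1160
```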

Participants were instructed to take self-paced breaks between blocks, and longer breaks were enforced every 10 blocks. Recording time was ∼4.5 h per participant. After EEG recording, electrode locations were recorded for each subject using a 3-dimensional digitizer (Polhemus, Colchester, Vermont).

#### **EEG RECORDING AND OFFLINE DATA PROCESSING**

EEG data were recorded using a 62-electrode cap that was configured with a modified 10–20 system for electrode placement. Two additional electrodes were affixed at mastoid locations, and bipolar EOG electrodes were affixed above and below the left eye and at the external canthi of the eyes to monitor eye movements. The EEG was amplified using a high input impedance amplifier (SynAmps 2, Neuroscan, NC). It was digitized at 1000 Hz with a 200 Hz low-pass filter. Electrode impedances were measured, and the inclusion criterion was 35 kOhm.

Offline, data were band-pass filtered from 0.5 to 50 Hz with a 12-dB/octave rolloff FIR zero phase-shift filter using EDIT 4.5 software (Neuroscan, NC). Eyeblink artifacts were removed using EDIT's blink noise reduction algorithm (Semlitsch et al., 1986). Data were epoched from 100 ms before video onset to 1000 ms after video onset. Epochs were baseline-corrected by subtracting

<sup>1</sup>This approach does not account for different refractoriness or adaptation due to different probabilities of stimulus presentation (Schröger and Wolff, 1996; Czigler et al., 2002; Kimura et al., 2009). However, an additional set of control recordings would have been needed to take this into account, and here the focus was not on isolating a unique MMN component. Also, with such control recordings the experiment would have been excessively long (see General Discussion).

the average of the voltage measurements from −100 to +100 ms for each electrode and then average-referenced.

Artifact rejection and interpolation were performed using custom scripts calling functions in EEGLAB (Delorme and Makeig, 2004). Epochs in which no electrode voltage exceeded 50 µV at any point in the epoch were included. For those epochs in which only one electrode exceeded the 50 µV criterion, the data for that electrode were interpolated using spherical spline interpolation (Picton et al., 2000). This procedure resulted in inclusion of 91% of the EEG sweeps. To correct for variation in electrode placement between subjects, individual subject data were projected onto a group average set of electrode positions using spherical spline interpolation (Picton et al., 2000).
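The epoch inclusion rule can be sketched as a small classifier. This is a minimal illustration in plain Python, not the custom EEGLAB scripts used in the study. The function name `classify_epoch` is hypothetical, the criterion is assumed to apply to absolute voltage, and epochs with two or more offending electrodes are assumed to be rejected (the text implies this but does not state it). The spherical spline interpolation step itself is not shown.

```python
def classify_epoch(epoch, threshold_uv=50.0):
    """Classify one epoch under a 50 µV criterion.

    epoch: list of electrodes, each a list of voltages (µV) over time.
    Returns (label, bad_electrodes), where label is 'keep', 'interpolate',
    or 'reject' and bad_electrodes lists the offending electrode indices.
    """
    # Assumption: the criterion applies to the absolute voltage.
    bad = [i for i, channel in enumerate(epoch)
           if any(abs(v) > threshold_uv for v in channel)]
    if not bad:
        return "keep", bad          # no electrode exceeded the criterion
    if len(bad) == 1:
        return "interpolate", bad   # repair the single offending electrode
    return "reject", bad            # assumption: two or more bad electrodes


# Toy epochs with two electrodes and two time samples each.
print(classify_epoch([[1.0, -2.0], [3.0, 4.0]]))
print(classify_epoch([[1.0, -60.0], [3.0, 4.0]]))
print(classify_epoch([[1.0, -60.0], [55.0, 4.0]]))
```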

#### **ANALYSES OF DISCRIMINATION DATA**

*Same-different* discrimination sensitivity was measured with *d*′ (Green and Swets, 1966). The hit rate was the proportion of *different* responses to trials with different syllables. The false alarm rate was the proportion of *different* responses for same pairs. If the rate was zero it was replaced with 1/(2*N*), and if it was one it was replaced by 1 − 1/(2*N*), where *N* is the number of trials (Macmillan and Creelman, 1991). Because this is a *same-different* design, *z*(*hit rate*) − *z*(*false alarm rate*) was multiplied by √2 (Macmillan and Creelman, 1991).

Target detection during the EEG task was also evaluated using *d*′, but the measure was *z*(*hit rate*) − *z*(*false alarm rate*). A response within 4 s of the target presentation was considered a *hit*, and a *false alarm* was any response outside this window. All non-target syllables were considered distracters for the purpose of calculating a false alarm rate. To assess differences in target detection across blocks, *d*′ was submitted to repeated-measures ANOVA.
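The two sensitivity measures described in this section can be written compactly. The following is a sketch using only the Python standard library. The function names and the `same_different` flag are illustrative and not taken from the original analysis code. The flag switches on the √2 scaling used for the discrimination pre-test, and leaving it off gives the unscaled measure used for target detection.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_rate(count, n_trials):
    """Proportion count/n_trials, with 0 -> 1/(2N) and 1 -> 1 - 1/(2N)."""
    if count == 0:
        return 1.0 / (2 * n_trials)
    if count == n_trials:
        return 1.0 - 1.0 / (2 * n_trials)
    return count / n_trials


def d_prime(hits, n_different, false_alarms, n_same, same_different=False):
    """z(hit rate) - z(false alarm rate), optionally scaled by sqrt(2)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d = z(adjusted_rate(hits, n_different)) - z(adjusted_rate(false_alarms, n_same))
    return sqrt(2) * d if same_different else d


# Toy block: 16 different pairs and 12 same pairs, as in the pre-test design.
print(d_prime(hits=14, n_different=16, false_alarms=2, n_same=12,
              same_different=True))
```

Perfect performance (all hits, no false alarms) stays finite because the extreme rates are replaced before the *z* transform.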

#### **ANALYSES OF EEG DATA**

#### *Overview*

*A priori*, the main hypothesis was that visual speech stimuli are processed by the visual system to the level of representing the exogenous visual syllables. Previous research had suggested that there was specialization for visual speech stimuli by left posterior temporal cortex (Campbell et al., 2001; Bernstein et al., 2011; Campbell, 2011; Nath and Beauchamp, 2012). Previous research also suggested that there was specialization for non-speech face motion by right posterior temporal cortex (Puce et al., 1998, 2000, 2007; Bernstein et al., 2011). Therefore, the *a priori* anatomical regions of interest (ROI) were the bilateral posterior temporal cortices. However, rather than merely selecting electrodes of interest (EOI) over scalp locations approximately over those cortices and carrying out all analyses with those EOIs, a more conservative, step-by-step approach was taken, which allowed for the possibility that deviation detection was carried out elsewhere in cortex (e.g., Sams et al., 1991; Möttönen et al., 2002).

In order first to test for reliable stimulus deviation effects, independent of temporal window or spatial location, global field power (GFP; Lehmann and Skrandies, 1980; Skrandies, 1990) measures were compared statistically across standard vs. deviant for each of the four different vMMN contrasts. The GFP analyses show the presence and temporal interval of a deviation response anywhere over the scalp. The first 500 ms post-stimulus deviation was examined, because that interval was expected to encompass any possible vMMN.

Next, source analyses were carried out to probe whether there was evidence for stimulus processing by posterior temporal cortices, consistent with previous fMRI results on visual speech perception (Bernstein et al., 2011). Distributed dipole sources (Tadel et al., 2011) were computed for the responses to standard stimuli and for the vMMN waveforms. These were inspected and compared with the results of Bernstein et al. (2011) and also with results from previous EEG studies that presented source analyses (Bernstein et al., 2008; Ponton et al., 2009). The inspection focused on the first 500 ms of the source models.

After examining the source models, EOIs were sought for statistical testing of vMMNs, taking into account the ERPs at individual electrode locations. For this level of analysis, an approach was needed to guard against double-dipping, that is, use of the same results to select and test data for hypothesis testing (Kriegeskorte et al., 2009). Because we did not have an independent localizer (i.e., an entirely different data set with which to select EOIs), as is recommended for fMRI experiments, we ran analyses on several different electrode clusters over posterior temporal cortices. Because all those results were highly similar, only one set of EOI analyses is presented here.

A coincident frontal positivity has also been reported for Fz and/or Cz in conjunction with evidence for a vMMN (Czigler et al., 2002, 2004). The statistical tests for the vMMN were carried out separately on ERPs from electrodes Fz and Cz to assess the presence of a frontal MMN. These tests also served as a check on the validity of the EOI selection. Fz and Cz electrodes are commonly used for testing the auditory MMN (Näätänen et al., 2007). If the same results were obtained on Fz and Cz as with the EOIs, the implication would be that EOI selection was biased toward our hypothesis that the posterior temporal cortices are responsible for visual speech form representations. The results for Fz and Cz were similar to each other but different from the EOI results, and only the Fz results are presented here. None of the Cz results were statistically reliable. ERPs evoked by target stimuli were not analyzed, because so few target stimuli were presented.

#### *Global field power*

GFP (Lehmann and Skrandies, 1980; Skrandies, 1990) is the root mean squared average-referenced potential over all electrodes at a time sample. The GFP was calculated for each standard and deviant ERP per stimulus and per subject. The analysis window was 0–500 ms post stimulus deviation. Statistical analysis of group mean GFP differences between standard and deviant, within syllable, used randomization testing (Blair and Karniski, 1993; Nichols and Holmes, 2002; Edgington and Onghena, 2007) of the null hypothesis of no difference between the evoked response when the stimulus was a *standard* vs. the evoked response when the stimulus was a *deviant*. The level of re-sampling was the individual trial.
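As an illustration of this definition, GFP at each time sample is the root mean square of the electrode voltages after subtracting their mean (the average reference). The sketch below uses plain Python lists. The function name and the data layout (electrodes by time samples) are assumptions made for the example.

```python
from math import sqrt


def global_field_power(erp):
    """Return GFP per time sample.

    erp: list of electrodes, each a list of voltages over time samples.
    """
    n_electrodes = len(erp)
    n_samples = len(erp[0])
    gfp = []
    for t in range(n_samples):
        column = [erp[e][t] for e in range(n_electrodes)]
        mean_v = sum(column) / n_electrodes  # average reference at this sample
        mean_square = sum((v - mean_v) ** 2 for v in column) / n_electrodes
        gfp.append(sqrt(mean_square))
    return gfp


# Two electrodes, two samples: opposite voltages give GFP 1, equal voltages give 0.
print(global_field_power([[1.0, 2.0], [-1.0, 2.0]]))
# prints: [1.0, 0.0]
```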

Surrogate mean GFP measures were generated for each subject by permuting the single-trial labels (i.e., *standard* or *deviant*) 1999 times and then computing mean GFP differences (*deviant* minus *standard*) for these permutation samples. These single-subject permutation mean GFP differences were averaged across subjects to obtain a permutation distribution of group mean GFP differences within the ERPs for a particular syllable. To avoid bias due to using a randomly generated subset of the full permutation distribution, the obtained group mean GFP difference was included in the permutation distribution, resulting in a total of 2000 entries in the permutation distribution. The *p*-value for a given time point was calculated as the proportion of surrogate group mean GFP difference values in the permutation distribution that were as or more extreme than the obtained group mean GFP difference, resulting in a two-tailed test.
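The resampling scheme can be sketched as follows. This is a simplified illustration, not the authors' code. The `statistic` argument stands in for "mean GFP of deviants minus mean GFP of standards" and is supplied by the caller, the function names are hypothetical, and "as or more extreme" is implemented as a comparison of absolute values, which is an assumption about how the two-tailed test was computed.

```python
import random


def group_permutation_distribution(subjects, statistic, n_perm=1999, seed=0):
    """Build the permutation distribution of a group mean difference.

    subjects: list of (deviant_trials, standard_trials), one pair per subject.
    statistic(deviant_trials, standard_trials) -> list of values over time.
    Returns (observed, distribution); the observed series is entry 0, so the
    distribution holds n_perm + 1 entries.
    """
    rng = random.Random(seed)

    def group_mean(series_per_subject):
        n = len(series_per_subject)
        return [sum(values) / n for values in zip(*series_per_subject)]

    observed = group_mean([statistic(dev, std) for dev, std in subjects])
    distribution = [observed]
    for _ in range(n_perm):
        surrogate = []
        for dev, std in subjects:
            pooled = dev + std
            rng.shuffle(pooled)  # permute the single-trial labels
            surrogate.append(statistic(pooled[:len(dev)], pooled[len(dev):]))
        distribution.append(group_mean(surrogate))
    return observed, distribution


def two_tailed_p(observed, distribution):
    """Proportion of entries at least as extreme as the observed value."""
    n = len(distribution)
    return [sum(abs(entry[t]) >= abs(obs) for entry in distribution) / n
            for t, obs in enumerate(observed)]
```

Because the observed series is itself one of the entries, the smallest attainable *p*-value is 1/2000 when 1999 permutations are drawn.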

To correct for multiple comparisons over time, a threshold length of consecutive *p*-values <0.05 was established (Blair and Karniski, 1993; Groppe et al., 2011). The threshold number of consecutive *p*-values was determined from the permutation distribution generated in the corresponding uncorrected test. For each entry in the permutation distribution, a surrogate *p*-value series was computed as though that entry were the actual data. Then, the largest number of consecutive *p*-values <0.05 in that surrogate *p*-value series was computed for each permutation entry. The threshold number of consecutive *p*-values was the 95th percentile of this null distribution of run lengths. This correction, which offers weak control over family-wise error rate and is appropriate when effects persist over many consecutive samples (Groppe et al., 2011), is similar to one used with parametric statistics (Guthrie and Buchwald, 1991) but requires no assumptions or knowledge about the autocorrelation structure of the underlying signal or noise.
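A sketch of the run-length correction is given below. It assumes a permutation distribution stored as a list of time series (one per entry, observed data included), two-tailed *p*-values based on absolute values, and a nearest-rank definition of the 95th percentile. These choices, like the function names, are illustrative assumptions and not details confirmed by the text.

```python
from bisect import bisect_left


def longest_run_below(p_values, alpha=0.05):
    """Length of the longest run of consecutive p-values below alpha."""
    longest = current = 0
    for p in p_values:
        current = current + 1 if p < alpha else 0
        longest = max(longest, current)
    return longest


def run_length_threshold(distribution, alpha=0.05, percentile=95):
    """Percentile of the null distribution of maximum run lengths."""
    n = len(distribution)
    n_samples = len(distribution[0])
    # p-value series for every entry, treating that entry as the actual data.
    p_series = [[0.0] * n_samples for _ in range(n)]
    for t in range(n_samples):
        magnitudes = sorted(abs(entry[t]) for entry in distribution)
        for i, entry in enumerate(distribution):
            n_extreme = n - bisect_left(magnitudes, abs(entry[t]))
            p_series[i][t] = n_extreme / n
    run_lengths = sorted(longest_run_below(p, alpha) for p in p_series)
    rank = max(1, -(-percentile * n // 100))  # nearest-rank (ceiling) percentile
    return run_lengths[rank - 1]


print(longest_run_below([0.2, 0.01, 0.02, 0.3, 0.04, 0.03, 0.01, 0.5]))
# prints: 3
```

A run of significant samples in the real data would then be compared against the returned threshold.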

#### *EEG distributed dipole source models*

EEG sources were modeled with distributed dipole source imaging using Brainstorm software (Tadel et al., 2011). In lieu of having individual anatomical MRI data for source space and forward modeling, the MNI/Colin 27 brain was used. A boundary element model (Gramfort et al., 2010) was fit to the anatomical model using a scalp model with 1082 vertices, a skull model with 642 vertices, and a brain model with 642 vertices. The cortical surface was used as the source space, and source orientations were constrained to be normal to the cortical surface. Cortical activity was estimated using depth-weighted minimum-norm estimation (wMNE; Baillet et al., 2001).

EEG source localization is generally less precise than some other neuroimaging techniques (Michel et al., 2004). Simulations comparing source localization techniques resulted in a mean localization error of 19.6 mm when using a generic brain model (Darvas et al., 2006). Because similar methods and a generic brain model were used here, the localization error is estimated at ∼20 mm. Therefore, the source solutions found here serve as useful visualization tools and as a guide to EOI selection but are not intended for drawing conclusions about precise anatomical localization.

#### *vMMN analyses*

The vMMN analyses used the same general approach as the GFP analyses rather than the more pervasive analysis of difference waveforms. To assess the reliability of the vMMNs for each stimulus, the average of the ERPs for the EOIs for the token-as-standard was compared with the average of the ERPs for the token-as-deviant using a standard paired-samples permutation test (Edgington and Onghena, 2007) with the subject mean ERP as the unit of re-sampling. A threshold number of consecutive *p*-values <0.05 was established to correct for multiple comparisons using the same criterion (Blair and Karniski, 1993) as described above for the GFP analyses. The EOI cluster results that are presented are from the clusters left P5, P3, P1, PO7, PO5, and PO3, and right P2, P4, P6, PO4, PO6, and PO8<sup>2</sup>. We also carried out comparisons of the difference waveforms across *near* vs. *far* contrasts. These were a general check on the hypothesis that *far* contrasts were different from *near* contrasts.
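A paired-samples permutation test with the subject as the unit of re-sampling is commonly implemented by randomly flipping the sign of each subject's deviant-minus-standard difference. The sketch below takes that approach as an assumption, since the text does not spell out the implementation. The function name, the random sampling of sign patterns (rather than full enumeration), and the absolute-value comparison are likewise illustrative choices.

```python
import random


def paired_permutation_p(deviant, standard, n_perm=1999, seed=0):
    """Two-tailed paired permutation test at each time sample.

    deviant, standard: lists (one per subject) of EOI-cluster mean ERP values
    over time. Returns (observed_group_mean_difference, p_values).
    """
    rng = random.Random(seed)
    n_subjects = len(deviant)
    n_samples = len(deviant[0])
    diffs = [[d - s for d, s in zip(dev, std)]
             for dev, std in zip(deviant, standard)]

    def group_mean(signs):
        return [sum(sign * sub[t] for sign, sub in zip(signs, diffs)) / n_subjects
                for t in range(n_samples)]

    observed = group_mean([1] * n_subjects)
    counts = [1] * n_samples  # the observed entry is part of the distribution
    for _ in range(n_perm):
        # Swapping a subject's labels is the same as negating the difference.
        signs = [rng.choice((1, -1)) for _ in range(n_subjects)]
        surrogate = group_mean(signs)
        for t in range(n_samples):
            if abs(surrogate[t]) >= abs(observed[t]):
                counts[t] += 1
    return observed, [c / (n_perm + 1) for c in counts]
```

With 11 subjects there are only 2048 distinct sign patterns, so full enumeration would also be feasible. Random sampling is used here to mirror the GFP procedure.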

In some cases in which a vMMN is observed, a coincident frontal positivity has also been reported for Fz and/or Cz (Czigler et al., 2002, 2004). The statistical tests for the vMMN were carried out separately on ERPs from electrodes Fz and Cz to assess the presence of a frontal MMN.

# **RESULTS**

#### **BEHAVIORAL RESULTS**

The purpose of testing behavioral discrimination was to confirm that the stimulus pair discriminability was predicted correctly. The 49 screened participants were tested, and the EEG data from 11 of them are reported here. Discrimination *d*′ scores were compared across groups (included vs. excluded participants) using analysis of variance (ANOVA) with the within-subjects factor of stimulus distance (*near* vs. *far*) and between-subjects factor of group (included vs. excluded). The groups were not reliably different, and group did not interact with stimulus distance.

*Far* pairs were discriminated better than *near* pairs, *F*(1, 47) = 591.7, *p* < 0.001, mean difference in *d*′ = 3.13. Within the EEG group, mean *d*′ for the *far* stimulus pairs was reliably higher than for the *near* stimulus pairs, paired-*t*(10) = 12.25, *p* < 0.001, mean difference in *d*′ = 3.02. Mean *d*′ was reliably above chance for both *near*, *t*(10) = 8.09, *p* < 0.001, *M* = 1.40, and *far*, *t*(10) = 15.62, *p* < 0.001, *M* = 4.51, stimulus pairs.

Detection *d*′ of "ha" or "va" during EEG recording was high, group mean *d*′ = 4.83, range [3.83, 5.91]. The two targets were detected at similar levels, paired-*t*(10) = 0.23, *p* = 0.82. For neither target syllable was there any effect of which syllable was the standard in the EEG recording block.

#### *ERPs across vMMN stimulus pairs*

The ERP group mean data sets for the four stimulus pairs were inspected for data quality. **Figures S1–S2** show the montages for each of the vMMN data sets.

#### *GFP results*

GFP measures were computed for each standard and deviant syllable. Holding syllable constant, the standard vs. deviant GFP was compared to determine whether and, if so, when a reliable effect of stimulus deviance was present in each of the four stimulus conditions (i.e., "zha" in the *near* context, "zha" in the *far* context,

<sup>2</sup>The alternate EOI clusters that were analyzed were: left (TP7 CP5 P7 P5), right (CP6 TP8 P6 P8); left (CP5 CP3 CP1 P7 P5 P3 P1 PO7 PO5 PO3 CB1) right (CP2 CP4 CP6 P2 P4 P6 P8 PO4 PO6 PO8 CB2); and left (TP7 CP5 CP3 CP1 P7 P5 P3 P1 PO7 PO5 PO3 CB1), right (CP2 CP4 CP6 TP8 P2 P4 P6 P8 PO4 PO6 PO8 CB2).

"fa" a *far* contrast, and "ta" a *near* contrast). All of the stimulus contrasts resulted in reliable effects. **Figure 3** summarizes the GFP results for each vMMN. The reliable GFP difference for "zha" in the *far* context was 200–500 ms post-deviation onset. For "zha" in the *near* context, there were two intervals of reliable difference, 268–329 and 338–500 ms post-deviation onset. The reliable difference for "fa" was 52–500 ms post-deviation onset. The reliable difference for "ta" was 452–500 ms post-deviation onset.

#### *Distributed dipole source models*

Dipole source models were computed using ERPs obtained with standard stimuli ("zha," "fa," and "ta") in order to visualize the spatiotemporal patterns of exogenously driven responses to the stimuli. **Figures 4**–**6** show the dipole source strength at 20-ms intervals starting from 90 ms after onset of visible motion until 670 ms for the group mean ERPs. The images are thresholded to only show dipole sources stronger than 20 pA·m. The figures show images starting at 90 ms post-stimulus onset, because no suprathreshold sources were obtained earlier. The images continue through 690 ms to indicate that posterior activity rises and falls within the interval, as would be expected in response to a temporally unfolding stimulus.

The right hemisphere overall appeared to have stronger and more sustained responses focused on posterior temporal cortex. Additionally, the right posterior temporal activation was more widespread but with a more inferior focus compared to that in left posterior temporal cortex. Variations in the anatomical locations of the foci of activity across **Figures 4**–**6** suggest the possibility that activation sites varied as a function of syllable. But these variations cannot be interpreted with confidence given the relatively low spatial resolution of these distributed dipole source models.

The temporal differences across syllable are more interpretable. Variation across syllables is attributed to differences in stimulus kinematics. The "fa" standard (**Figure 4**) resulted in sustained right hemisphere posterior temporal activity from ∼120 to 490 ms relative to stimulus onset and sustained left hemisphere posterior temporal activity from ∼170 to 270 ms. The "zha" standard (**Figure 5**) resulted in sustained right hemisphere posterior temporal activity from ∼190 to 430 ms and sustained left hemisphere posterior temporal activity from ∼190 to 390 ms. The "ta" standard (**Figure 6**) resulted in sustained right hemisphere posterior temporal activity from ∼150 to 250 ms and sustained left hemisphere posterior temporal activity from ∼150 to 230 ms. The shorter period of sustained activity for "ta" vs. "fa" and "zha" can be explained by its shorter (fewer frames) initial articulatory gesture (**Figure 2**).

Some fronto-central and central activity emerged starting 220 to 280 ms post-stimulus onset, particularly with "zha" and "fa." No other prominent activations were obtained elsewhere during the initial periods of sustained posterior temporal activity.

Dipole source models were also computed on the vMMN difference waveforms (**Figures S3**–**S6**), resulting in lower signal strength in posterior temporal cortices in comparison with models based on the standard ERPs. The models support the presence of deviance responses in those cortical areas and higher right posterior activity for *far* contrasts than *near* contrasts. All of the difference waveform models demonstrate patterns of asymmetric frontal activity with greatest strength generally beyond 200 ms post-deviation that seems attributable to attention to the deviant.

**FIGURE 4 | Source images for "fa" standard.** Images show the depth-weighted minimum norm estimate of dipole source strength constrained to the surface of the cortex using a boundary element forward model and a generic anatomical model at 20-ms intervals starting from 90 ms after onset of visible motion for the group mean ERPs for syllable "fa" as *standard*. The time indicated by the cyan bar indicates the time at which "fa" visibly differs from "zha." Images are thresholded at 20 pA·m. Initial activity is in the occipital cortex. At 150 ms after syllable onset, the bilateral posterior temporal activity begins that lasts until 290 ms in the left hemisphere and until 490 ms in the right hemisphere. Activation in the right posterior temporal cortex is more widespread and inferior to that on the left. Fronto-central activity is visible from 250 to 510 ms post-stimulus onset.

#### *vMMN results*

ERPs of EOI clusters for each syllable contrast and hemisphere were submitted to analyses to determine the reliability of the deviance responses. Thus, there were four vMMN analyses per

**FIGURE 5 | Source images for "zha" standard.** Images show the depth-weighted minimum norm estimate of dipole source strength constrained to the surface of the cortex using a boundary element forward model and a generic anatomical model at 20-ms intervals starting from 90 ms after the onset of visible motion for the group mean ERPs for syllable "zha" as *standard*. The cyan bar indicates the time at which "zha" visibly differs from "fa," and the magenta bar indicates the time at which "zha" visibly differs from "ta." Images are thresholded at 20 pA·m. Initial activity is in the occipital cortex. At 190 ms after syllable onset, strong, widespread bilateral posterior temporal activity begins that lasts until 290 ms, with weaker activations recurring through 610 ms post-stimulus onset. Activation in the right posterior temporal cortex is more widespread and inferior to that on the left. Fronto-central activity is visible from 270 to 490 ms post-stimulus onset.

hemisphere. They were for "zha" in its *near* or *far* context, "fa" in the *far* context, and "ta" in the *near* context. Summaries of the results are given in **Table 2**. The duration (begin points to end points) of reliable deviance responses varied across syllables

**FIGURE 6 | Source images for "ta" standard.** Images show the depth-weighted minimum norm estimate of dipole source strength constrained to the surface of the cortex using a boundary element forward model and a generic anatomical model at 20-ms intervals starting from 90 ms after the onset of visible motion for the group mean ERPs for syllable "ta" as *standard*. The magenta bar indicates the time at which "ta" visibly differs from "zha." Images are thresholded at 20 pA·m. Initial activity is in occipital cortex. At 130 ms after syllable onset, bilateral posterior temporal activity begins that fades by 250 ms post-stimulus onset, but then recurs from 330 to 590 ms on the right and from 330 to 470 ms on the left. Fronto-central activity is visible from 270 to 470 ms post-stimulus onset.

(from 50 to 185 ms) and varied in mean voltage (from −0.35 to −0.85µV).

**Figure 7** shows the statistical results for the EOI cluster waveforms for each contrast and hemisphere. The theoretically predicted results were obtained. All of the right-hemisphere contrasts resulted in reliable deviance responses. They were "zha" in the *near* context from 239 to 288 ms post-deviation onset, "zha" in the *far* context from 324 to 500 ms post-deviation onset, "ta" from 449 to 500 ms post-deviation onset, and "fa" from 300 to 442 ms post-deviation onset. Only the *far* contrasts resulted in reliable left-hemisphere deviance responses. They were "zha" in the *far* context from 322 to 497 ms post-deviation onset and "fa" from 251 to 435 ms post-deviation onset.

#### *Comparison of far vs. near vMMNs*

Difference waveforms were computed using the standard type of approach to the vMMN, that is, by subtracting the EOI cluster ERPs to standards from the response to deviants for each stimulus contrast and hemisphere on a per-subject basis. The magnitudes of the vMMN waveforms were then compared between *far* and *near* contrasts using the resampling method that was applied to the analyses of standards vs. deviants.

The "zha" *near* and *far* vMMN waveforms were found to be reliably different (**Figure 8**). On the left, the difference wave for "zha" in the *far* context was reliably larger (i.e., more negative) than for "zha" in the *near* context (320 to 443 ms post-deviation onset), not unexpectedly as the *near* context did not result in an observable vMMN. On the right, the difference wave was also reliably larger for "zha" in the *far* context (from 331 to 449 ms post-deviation onset), although both contexts were effective. The results were similar when the vMMN waveforms were compared between "fa" vs. "ta" (**Figure 9**). On the left, the difference wave for "fa" was reliably larger than for "ta" (309–386 ms post-deviation onset). On the right, the difference wave was also reliably larger for "fa" (from 327 to 420 ms post-deviation onset).

#### *Fronto-central results*

ERPs were analyzed based on recordings from electrodes Fz and Cz, because these electrodes are typically used to obtain an auditory MMN (Kujala et al., 2007), and positivities on these electrodes have been reported for vMMNs (Czigler et al., 2002, 2004). Results with Fz (**Figure 9**) showed reliable effects for "ta," "fa," and "zha" *far*. None of the Cz results were reliable (**Figure 9**). Reliable differences with the deviant ERPs more positive were found on Fz for both of the *far* contrasts, from 282 to 442 ms post-deviation onset for "fa" and from 327 to 492 ms post-deviation onset for "zha" in the *far* context. These positive differences occur at similar times as the posterior temporal vMMNs but with opposite polarity. A reliable positivity was also obtained for "ta" from 151 to 218 ms post-deviation onset, but no reliable difference was obtained for "zha" in the *near* context.

# **GENERAL DISCUSSION**

This study investigated the brain's response to visual speech deviance, taking into account that (1) responses to stimulus deviants are considered to be generated by the cortex that represents the stimulus (Winkler and Czigler, 2012), and (2) there is evidence that exogenous visual speech processing is lateralized to left posterior temporal cortex (Campbell, 1986; Campbell et al., 2001; Bernstein et al., 2011). Taken together, these observations imply that the right and left posterior temporal cortices represent visual speech stimuli differently, and therefore that their responses to stimulus deviance should differ.

We hypothesized that the right posterior temporal cortex, for which there are indications of representing simple nonspeech face gestures (Puce et al., 2000, 2003), would generate


#### **Table 2 | Summary of reliable vMMNs.**

*All times are relative to deviance onset. LPT, left posterior temporal; RPT, right posterior temporal.*

*aThe p-value corresponds to the entire indicated time window and is corrected for multiple comparisons over time.*

*bThe mean is the group average deviant minus standard, averaged over the period from the begin to end points.*

the deviance response to both perceptually *near* and perceptually *far* speech stimulus changes (**Figure 1**). In contrast, the left hemisphere, for which there are indications of specialization (Campbell et al., 2001; Bernstein et al., 2011; Campbell, 2011; Nath and Beauchamp, 2012) for representing the exogenous stimulus forms of speech, would generate the deviance response only to perceptually *far* speech stimulus changes. That is, it would be tuned to stimulus differences that are readily perceived as different consonants (**Figure 1**).

Two vMMNs were sought for *far* stimulus deviations (one for "zha" and one for "fa"), and two vMMNs were sought for *near* stimulus deviations (one for "zha" and one for "ta"). The "zha" stimulus was used to obtain a perceptually *near* and a perceptually *far* contrast in order to hold consonant constant across perceptual distances. Reliable vMMN contrasts supported the predicted hemispheric effects. The left-hemisphere vMMNs were obtained only with the highly discriminable (*far*) stimuli, but the right-hemisphere vMMNs were obtained with both the *near* and *far* stimulus contrasts. There were also reliable differences between vMMN difference waveforms as a function of perceptual distance, with larger vMMN difference waveforms associated with larger perceptual distances.

#### **EVIDENCE FOR THE vMMN DEVIANCE RESPONSE WITH SPEECH STIMULI**

Previous reports have been mixed concerning support for a posterior vMMN specific to visual speech form-based deviation (i.e., deviation based on the phonetic stimulus forms). An early study failed to observe any vMMN in a paradigm in which a single visual speech token was presented as a deviant and a single different speech token was presented as a standard (Colin et al., 2002). A more recent study (Saint-Amour et al., 2007) likewise failed to obtain a vMMN response.

In Colin et al. (2004), a posterior difference between the ERP evoked by a standard syllable and the ERP evoked by a deviant syllable was obtained on Oz (from 155 to 414 ms), but this difference was attributed to low-level (non-speech) stimulus differences and not to speech syllable differences, because the effect involved two different stimuli. A subsequent experiment controlling for stimulus difference found no vMMN for visual speech alone. For example, the original deviance detection could have arisen at a lower level such as the temporal or spatial frequency differences between the stimuli, or it could have been the result of shifts in the talker's eye gaze across stimuli. A study by Möttönen et al. (2002) used magnetoencephalography (MEG) to record the deviance response with a single standard ("ipi") vs. a single deviant ("iti"). The mismatch response was at 245–410 ms on the left and 245–405 ms on the right. But again, these responses cannot be attributed exclusively to deviance. They could be attributable to consonant change.

Winkler et al. (2009) compared the ERPs to a "ka" stimulus in its roles as standard vs. deviant and reported a late occipital difference response, and possibly also an earlier negative difference peak at 260 ms on occipital electrodes that did not reach significance. In their study, the vMMN is not attributable to lower-level stimulus attributes that changed.

Ponton et al. (2009) used a similar approach in attempting to obtain vMMNs for "ga" and "ba." A reliable vMMN was obtained for "ba" only. The authors speculated that the structure of the "ga" stimulus might have impeded being able to obtain a reliable vMMN with it. The stimulus contained three early rapid discontinuities in the visible movement of the jaw, which might have each generated their own C1, P1, and N1 responses, resulting in the oscillatory appearance of the obtained vMMN difference waveforms. Using current density reconstruction modeling (Fuchs et al., 1999), the "ba" vMMN was reliably localized only to the right posterior superior temporal gyrus, peaking around 215 ms following stimulus onset. The present study suggests that the greater reliability for localizing the right posterior response could be due to generally more vigorous responding by that hemisphere.

As suggested in Ponton et al. (2009), whether a vMMN is obtained for speech stimuli could depend on stimulus kinematics. The current study took into account kinematics and the different deviation points across the different stimulus pairs. Inasmuch as the vMMN is expected to arise following deviation onset (Leitman et al., 2009), establishing the correct time point from which to measure the vMMN is critical. A method was devised here to establish the onset of stimulus deviation. The method was fairly gross, involving inspection of the video frames and measurement of the lip-opening area to align the stimuli within phoneme category and establish deviation across categories (**Figure 2**), but it resulted in good correspondence of the vMMN latencies across stimuli and with previous positive reports (Ponton et al., 2009; Winkler et al., 2009).
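The idea of locating a deviation onset from lip-opening area can be illustrated with a minimal sketch. It is not the procedure used in the study, which relied on inspection of video frames; the threshold criterion, the function names, and the frame rate below are all hypothetical and serve only to show how a frame index could be converted into a time point from which a vMMN latency is measured.

```python
def deviation_onset_frame(area_a, area_b, threshold):
    """Return the index of the first frame at which two lip-opening-area
    traces differ by more than `threshold`, or None if they never do.

    The threshold rule is an illustrative assumption, not the study's method.
    """
    for i, (a, b) in enumerate(zip(area_a, area_b)):
        if abs(a - b) > threshold:
            return i
    return None


def frame_to_ms(frame_index, frame_rate_hz):
    """Convert a video frame index to milliseconds from stimulus onset."""
    return 1000.0 * frame_index / frame_rate_hz


# Hypothetical lip-opening areas (arbitrary units) for two aligned stimuli.
stimulus_a = [0.0, 1.0, 2.0, 3.0]
stimulus_b = [0.0, 1.0, 4.0, 6.0]
onset = deviation_onset_frame(stimulus_a, stimulus_b, threshold=1.5)
onset_ms = frame_to_ms(onset, frame_rate_hz=30)  # frame rate is assumed
```

ERP latencies would then be expressed relative to `onset_ms` rather than to the first video frame.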

The distributed dipole models of the standard stimuli here (**Figures 4**–**6**) suggest that the posterior temporal cortex responds to speech stimuli by 170–190 ms post-stimulus onset and continues to respond for ∼200 ms. This interval is commensurate with the reliable vMMNs here (**Table 2**), which were measured using the electrode locations approximately over the posterior temporal response foci in the distributed dipole models. The results here are considered strong evidence that there is a posterior visual speech deviance response that is sensitive to consonant dissimilarity, but that detailed attention to stimulus attributes may be needed on the part of researchers in order to obtain it reliably.

# **HEMISPHERIC ASYMMETRY OF VISUAL SPEECH STIMULUS PROCESSING**

Beyond demonstrating that visual speech deviance is responded to by high-level visual cortices, the current study focused on the hypothesis that the right and left posterior temporal cortices would demonstrate lateralized processing. The distributed dipole source models (**Figures 4**–**6**) show somewhat different areas of posterior temporal cortex to have been activated by each of the standard stimuli. In addition, during the first 400–500 ms poststimulus onset, the activation appears to be greater for the right hemisphere.

There are published results that support functional anatomical asymmetry for processing non-speech face stimuli. For example, the right pSTS has been shown to be critically involved in processing eye gaze stimuli (Ethofer et al., 2011). In an ERP study alternating mouth open and mouth closed stimuli, the most prominent effect was a posterior negative potential around 170 ms which appeared to be larger on the right but was not reliably so (Puce et al., 2003). The researchers point out that the low spatial resolution with ERPs precludes the possibility of attributing their obtained effects exclusively to pSTS, because close cortical areas such as the human motion processing area (V5/MT) could also contribute to activation that appears to be localized to pSTS. Thus, although there is evidence in their study and here of different functional specialization across hemispheres, the indeterminacies with EEG source modeling preclude strong statements about the specific neuroanatomical regions activated within the posterior temporal cortices. However, an fMRI study (Bernstein et al., 2011), in which localizers were used, did show that V5/MT was under-activated by visual speech in contrast with non-speech stimuli.

The left posterior temporal EOI deviance responses here are consistent with the temporal visual speech area (TVSA) reported by Bernstein et al. (2011) and are generally consistent with observations in other neuroimaging studies of lipreading (Calvert and Campbell, 2003; Paulesu et al., 2003; Skipper et al., 2005; Capek et al., 2008). The TVSA appears to be in the pathway that is also attributed with multisensory speech integration (Calvert, 2001; Nath and Beauchamp, 2011). The current results are consistent with the suggestion (Bernstein et al., 2011) that visual speech stimuli are extensively processed by the visual system prior to being mapped to higher-level speech representations, including semantic representations, in more anterior temporal cortices (Scott and Johnsrude, 2003; Hickok and Poeppel, 2007).

The right- vs. left-hemisphere vMMN results could be viewed as paradoxical under the assumption that sensitivity to speech stimulus deviation is evidence for specialization for speech. That is, the four vMMNs on the right might seem to afford more speech processing information than the two on the left. Here, the *near* deviant stimuli were discriminable as different patterns of speech gestures. But the obtained *d′* discrimination measures that were ∼1.4 for *near* contrasts are commensurate with previous results that showed the stimuli are not reliably labeled as different speech phonemes (Jiang et al., 2007a). Stimulus categorization involves generalization across small and/or irrelevant stimulus variation (Goldstone, 1994; Jiang et al., 2007b). Neural representations are the recipients of convergent and divergent connections, such that different lower-level representations can map to the same higher-level representation, and similar lower-level representations can map to different higher-level representations (Ahissar et al., 2008). Small stimulus differences that do not signal different phonemes could be mapped to the same representations on the left but mapped to different representations on the right (**Figure 1**).

The vMMNs on the left are explicitly not attributed to phoneme category representations but to the representation of the exogenous stimulus forms that are mapped to category representations, an organizational arrangement that is observed for non-speech visual object processing (Grill-Spector et al., 2006; Jiang et al., 2007b). This type of organization is also thought to be true for auditory speech processing, which is initiated at the cortical level with basic auditory features (e.g., frequencies, amplitudes) that are projected to exogenous phonetic stimulus forms, and then to higher-level phoneme, syllable, or lexical category representations (Binder et al., 2000; Scott et al., 2000; Eggermont, 2001; Scott, 2005; Hickok and Poeppel, 2007; Obleser and Eisner, 2009; May and Tiitinen, 2010; Näätänen et al., 2011).

Thus, the sensitivity of the left posterior temporal cortex to larger deviations only is expected for a lateralized language processing system that needs exogenous stimulus representations that can be reliably mapped to higher-level categories (Binder et al., 2000; Spitsyna et al., 2006). The deviation detection on the right could be more tightly integrated into a system responsive to social and affective signals (Puce et al., 2003), for which an inventory of categories such as phonemes that are combinatorically arranged is not required. For example, the right-hemisphere sensitivity to smaller stimulus deviations could be related to processing of emotion or visual attention stimuli (Puce et al., 1998, 2000, 2003; Wheaton et al., 2004; Thompson et al., 2007).

### **DISSIMILARITY**

Here, four vMMNs were sought in a design incorporating between- and within-consonant category stimuli and estimates of between-consonant category perceptual dissimilarity (Files and Bernstein, submitted; Jiang et al., 2007a). The perceptual dissimilarities were confirmed, and the vMMNs were consistent with the discrimination measures: Larger *d′* was associated with larger vMMNs as predicted based on the expectation that the extent of neuronal representation overlap is related to the magnitude of the vMMN (Winkler and Czigler, 2012) (**Figure 1**). The direct comparison of the vMMN difference waves showed that, while holding stimulus constant (i.e., "zha"), the magnitude of the vMMN varied reliably with the context in which it was obtained. In the far ("fa") context, the vMMN was larger than in the near ("ta") context. To our knowledge, this is the first demonstration of predicted and reliable relative difference in the vMMN as a function of visual speech discriminability. This finding was also supported by the results for the other two stimuli, "ta" and "fa."
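The discrimination measure referred to above is assumed here to be the signal-detection sensitivity index d′. As a minimal sketch, the standard yes/no form d′ = z(hit rate) − z(false-alarm rate) can be computed as follows. The exact formula, the log-linear correction, and the response counts are assumptions for illustration; the text does not specify how the measure was derived from the discrimination task.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Yes/no sensitivity index: z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count) keeps both rates away
    from 0 and 1, where the inverse normal CDF is undefined. The correction
    is an illustrative choice, not something stated in the article.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical response counts for a near and a far consonant contrast.
near = d_prime(hits=38, misses=12, false_alarms=12, correct_rejections=38)
far = d_prime(hits=48, misses=2, false_alarms=3, correct_rejections=47)
```

With counts like these, the far contrast yields the larger d′, which is the ordering the vMMN magnitudes are reported to follow.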

These results converge with previous results on the relationship between visual speech discrimination and the physical visual stimuli. Jiang et al. (2007a) showed that the perceptual dissimilarity space obtained through multidimensional scaling of visual speech phoneme identification can be accounted for in terms of a physical (i.e., 3D optical) perceptually (linearly) warped multidimensional speech stimulus space. Files and Bernstein (in submission) followed up on those results and showed that the same dissimilarity space successfully predicts perceptual discrimination of the consonants. That is, the modeled perceptual dissimilarities based on perceptually warped stimulus differences predicted discrimination results and the deviance responses here.

The controlled dissimilarity factor in the current experiment afforded a unique approach to investigation of hemispheric specialization for visual speech processing. An alternate approach would be to compare ERPs obtained with speech vs. non-speech face gestures, as has been done in an fMRI experiment (Bernstein et al., 2011). However, that particular approach could introduce uncontrolled factors such as different salience of speech vs. nonspeech stimuli. The current vMMN results also contribute a new insight about speech perception beyond that obtained within the Jiang et al. (2007a), and Files and Bernstein (in submission) perceptual studies. Specifically, the results here suggest that two types of representations can contribute to the perceptual discriminability of visual speech stimuli, speech consonant representations and face gesture representations.

# **MECHANISMS OF THE vMMN RESPONSE**

One of the goals of vMMN research, and MMN research more generally, has been to establish the mechanism/s that are responsible for the brain's response to stimulus deviance (Jaaskelainen et al., 2004; Näätänen et al., 2005, 2007; Kimura et al., 2009; May and Tiitinen, 2010). A main issue has been whether the cortical response to deviant stimuli is a so-called "higher-order memory-based process" or a neural adaptation effect (May and Tiitinen, 2010). The traditional paradigm for deriving the MMN (i.e., subtracting the ERP based on responses to standards from the ERP based on responses to deviants when deviant and standard are the same stimulus) was designed to show that the deviance response is a memory-based process. But the issue then arose whether the MMN is due entirely instead to refractoriness or adaptation of the same neuronal population activated by the same stimulus in its two different roles. The so-called "equiprobable paradigm" was designed to control for effects of refractoriness separate from deviance detection (Schroger and Wolff, 1996, 1997). The current study did not make use of the equiprobable paradigm, and we did not seek to address through our experimental design the question whether the deviance response is due to refractoriness/adaptation or a separate memory mechanism. We do think that our design rules out low-level stimulus effects and points to higher-level deviance detection responses at the level of speech processing.
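The subtraction that defines the traditional paradigm can be sketched in a few lines. The example below averages single-trial epochs for one stimulus in its deviant and standard roles and subtracts the standard ERP from the deviant ERP. The function names, the toy amplitudes, and the 200–300 ms measurement window are hypothetical and are not taken from the study's analysis pipeline.

```python
def mean_waveform(trials):
    """Average equal-length single-trial epochs sample by sample."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def difference_wave(deviant_trials, standard_trials):
    """Deviant-minus-standard ERP for one stimulus in its two roles."""
    deviant_erp = mean_waveform(deviant_trials)
    standard_erp = mean_waveform(standard_trials)
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean of the wave over samples whose time lies in [start_ms, end_ms]."""
    values = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return sum(values) / len(values)


# Toy epochs (microvolts) for the same stimulus as standard and as deviant.
times_ms = [0, 100, 200, 300, 400]
standard = [[0.0, 1.0, 0.5, 0.0, 0.0], [0.0, 1.0, 1.5, 0.0, 0.0]]
deviant = [[0.0, 1.0, -1.0, -0.5, 0.0], [0.0, 1.0, -1.0, -1.5, 0.0]]

vmmn = difference_wave(deviant, standard)
window_mean = mean_amplitude(vmmn, times_ms, 200, 300)
```

Because the same physical stimulus contributes to both averages, a nonzero difference cannot be explained by stimulus differences alone, although it does not by itself separate adaptation from a memory-based comparison.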

The stimuli presented in the current vMMN experiment were not merely repetitions of the exact same stimulus. Deviants and standards were two different video tokens whose stimulus attributes differed (see **Figure 2**). These stimulus differences were such that it was necessary to devise a method to bring them into alignment with each other and to define deviation points, which were different depending on which vMMN was being analyzed. Furthermore, the stimuli were slightly jittered in position on the video monitor during presentation to defend additionally against low-level effects of stimulus repetition. Thus, the deviation detection at issue was relevant to consonant stimulus forms. We interpret the lateralization effects to be the result of the left hemisphere being more specialized for linguistically-relevant stimulus forms and the right hemisphere being more specialized for facial gestures that, while not necessarily being discrete categories, were nevertheless detected as different gestures (Puce et al., 1996). However, these results do not adjudicate between explanations that attempt to separate adaptation/refractoriness from an additional memory comparison process.

#### **vMMN TO ATTENDED STIMULI**

The auditory MMN is known to be obtained both with and without attention (Näätänen et al., 1978, 2005, 2007). Similarly, the vMMN can be elicited in the absence of attention (Winkler et al., 2005; Czigler, 2007; Stefanics et al., 2011, 2012). Here, participants were required to attend to the stimuli and carry out a phoneme-level target detection task. Visual attention can result in attention-related ERP components in a similar latency range as the vMMN. A negativity on posterior lateral electrodes is commonly observed and is referred to as the *posterior N2*, *N2c*, or *selection negativity* (SN) (Folstein and Petten, 2008). However, the current results are not likely attributable to the SN, as the magnitude of the vMMN increased with perceptual dissimilarity of the standard from the deviant, whereas the SN is expected to increase with perceptual similarity of the deviant to a task-relevant target (Baas et al., 2002; Proverbio et al., 2009). Here, the *target* consonant was chosen to be equally dissimilar from both the *standard* and the *deviant* stimuli in a block, and this dissimilarity was similar across blocks. Therefore, differences in vMMN across syllables are unlikely to be attributable to the similarity of the *deviant* to the *target*: The task was constant in terms of the discriminability of the target, but the vMMNs varied in amplitude.

#### **NO AUDITORY MMN**

Results of this study do not support the hypothesis that visual speech deviations are exogenously processed by the auditory cortex (Sams et al., 1991; Möttönen et al., 2002). This possibility received attention previously in the literature (e.g., Calvert et al., 1997; Bernstein et al., 2002; Pekkola et al., 2005). Seen vocalizations can modulate the response of auditory cortex (Möttönen et al., 2002; Pekkola et al., 2006; Saint-Amour et al., 2007), but the dipole source models of ERPs obtained with standard stimuli (**Figures 4**–**6**) do not show sources that can be attributed to the region of the primary auditory cortex. Nonetheless, the Fz and Cz ERPs obtained with standards and deviants were compared in part because of the possibility that an MMN reminiscent of an auditory MMN (Näätänen et al., 2007) might be obtained. Instead, a reliable positivity was found for the two *far* syllable contrasts. The timing of this positivity was similar to that of the vMMN observed on posterior temporal electrodes but was opposite in polarity. Similar positivities have been reported for other vMMN experiments and could reflect inversion of the posterior vMMN or some related but distinct component (Czigler et al., 2002, 2004).

#### **SUMMARY AND CONCLUSIONS**

Previous reports on the vMMN with visual speech stimuli were mixed, with relatively little evidence obtained for a visual deviation detection response. Here, the details of the visual stimuli were carefully observed for their deviation points. The possibility was taken into account that across hemispheres the two posterior temporal cortices represent speech stimuli differently. The left posterior temporal cortex, hypothesized to represent visual speech forms as input to a left-lateralized language processing system, was predicted to be responsive to perceptually large deviations between consonants. The right hemisphere, hypothesized to be sensitive to face and eye movements, was predicted to detect both perceptually large and small deviations between consonants. The predictions were shown to be correct. The vMMNs that were obtained for the perceptually *far* deviants were reliable bilaterally over posterior temporal cortices, but the vMMNs for the perceptually *near* deviants were reliably observed only over the right posterior temporal cortex. The results support a left-lateralized visual speech processing system.

### **ACKNOWLEDGMENTS**

We thank our test subjects for their participation. We thank Silvio P. Eberhardt, Ph.D. for designing the hardware used in the experiment, developing the software for stimulus presentation, and his help preparing the stimuli. This research was supported by NIH/NIDCD DC008583. Benjamin T. Files was supported by NIH/NIDCD T32DC009975.

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Human\_Neuroscience/10.3389/fnhum.2013.00371/abstract

**Figure S1 | (A)** ERP montage for "zha," in the *far* context. Group mean ERPs for "zha" as *standard* in blocks with "fa" as *deviant*, and "zha" as *deviant* in blocks with "fa" as *standard*. **(B)** ERP montage for "zha," in the *near* context. Group mean ERPs for "zha" as *standard* in blocks with "ta" as *deviant*, and "zha" as *deviant* in blocks with "ta" as *standard*. Each sub-axis shows the ERP on a different electrode, and the location of each axis maps to the location of that electrode on a head as seen from above, with the nose pointed up toward the top of the figure. The light green boxes show the electrodes of interest selected for subsequent vMMN analyses. Times shown are relative to deviation onset.

**Figure S2 | (A)** ERP montage for "fa," in the *far* context. Group mean ERPs for "fa" as *standard* in blocks with "zha" as *deviant* and "fa" as *deviant* in blocks with "zha" as a *standard*. **(B)** ERP montage for "ta," in the *near* context. Group mean ERPs for "ta" as *standard* in blocks with "zha" as *deviant* and "ta" as *deviant* in blocks with "zha" as *standard*. The light green boxes show the electrodes of interest selected for subsequent vMMN analyses. Times shown are relative to deviation onset.

**Figure S3 | Source images for "ta"** *near* **vMMN.** Images show the depth-weighted minimum norm estimate of dipole source strength constrained to the surface of the cortex using a boundary element forward model and a generic anatomical model at 20-ms intervals from 0 to 500 ms post-deviation onset. Images are thresholded at 20 pA·m. Foci of activity are scattered and transient, but focal activation occurs in fronto-central cortex throughout the time depicted, in right lateral occipital cortex from 0 to 40 ms, right posterior temporal cortex from 160 to 220 ms and 340 to 500 ms. Activation in the left hemisphere is scattered and transient throughout the time depicted.

**Figure S4 | Source images for "fa"** *far* **vMMN.** Images show the depth-weighted minimum norm estimate of dipole source strength constrained to the surface of the cortex using a boundary element forward model and a generic anatomical model at 20-ms intervals from 0 to 500 ms post-deviation onset. Images are thresholded at 20 pA·m. Strong focal activity occurs in right lateral occipital cortex starting at ∼260 ms, spreading into right posterior temporal cortex by 340 ms and expanding to include large swaths of posterior right cortex through the end of the temporal interval. In the left hemisphere, posterior temporal activity begins at ∼280 ms and continuing through 400 ms at which time a more inferior focus in posterior/middle temporal cortex emerges and continues through the end of the temporal interval. Left fronto-central activity begins at ∼300 ms and continues through to the end of the interval.

**Figure S5 | Source images for "zha"** *near* **vMMN.** Images show the depth-weighted minimum norm estimate of dipole source strength constrained to the surface of the cortex using a boundary element forward model and a generic anatomical model at 20-ms intervals from 0 to 500 ms post-deviation onset. Images are thresholded at 20 pA·m. Strong activity in right posterior temporal/lateral occipital cortex begins at ∼200 ms and proceeds through to 360 ms and then recurs from 440 ms to the end of the temporal interval. In the left hemisphere, activity is scattered and transient, but there are hotspots of activity in inferior frontal cortex from 200 to 240 ms, in fronto-central cortex from 320 to 420 ms and inferior posterior temporal cortex from 360 ms to the end of the interval depicted.

**Figure S6 | Source images for "zha"** *far* **vMMN.** Images show the depth-weighted minimum norm estimate of dipole source strength constrained to the surface of the cortex using a boundary element forward model and a generic anatomical model at 20-ms intervals from 0 to 500 ms post-deviation onset. Images are thresholded at 20 pA·m. Focal activity in right posterior temporal cortex begins at 240 ms and continues through the end of the temporal interval, spreading to posterior inferior temporal and lateral occipital cortex at ∼340 ms. Right fronto-lateral activity begins at 260 ms and continues through the end of the interval. Left fronto-central activity begins at 220 ms and continues through the end of the interval. Left posterior temporal activity occurs from 200 to 380 ms and in a slightly more inferior region from 460 ms to the end of the temporal interval.

# **REFERENCES**

(2011). Visual phonetic processing localized using speech and nonspeech face gestures in video and point-light displays. *Hum. Brain Mapp*. 32, 1660–1676. doi: 10.1002/hbm.21139

recognition and lipreading. A neurological dissociation. *Brain* 109(Pt 3), 509–521. doi: 10.1093/brain/109.3.509


of categorical phoneme perception in adults. *Neuroreport* 8, 919–924. doi: 10.1097/00001756-199703030-00021


EEG source imaging. *Clin. Neurophys.* 115, 2195–2222. doi: 10.1016/j.clinph.2004.06.001


related prefrontal negativity larger to irrelevant stimuli that are difficult to suppress. *Behav. Brain Funct.* 5, 25.


21, 434–441. doi: 10.1111/j.1469-8986.1984.tb00223.x


*Neuroimage* 25, 76–89. doi:


and vMMN) linking predictive coding theories and perceptual object representations. *Int. J. Psychophysiol.* 83, 132–143. doi: 10.1016/j.ijpsycho.2011.10.001


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 April 2013; accepted: 26 June 2013; published online: 16 July 2013.*

*Citation: Files BT, Auer ET Jr and Bernstein LE (2013) The visual mismatch negativity elicited with visual speech stimuli. Front. Hum. Neurosci. 7:371. doi: 10.3389/fnhum.2013.00371*

*Copyright © 2013 Files, Auer and Bernstein. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# Altered visual information processing systems in bipolar disorder: evidence from visual MMN and P3

# *Toshihiko Maekawa<sup>1,2</sup>\*, Satomi Katsuki<sup>1</sup>, Junji Kishimoto<sup>3</sup>, Toshiaki Onitsuka<sup>1</sup>, Katsuya Ogata<sup>2</sup>, Takao Yamasaki<sup>2</sup>, Takefumi Ueno<sup>1</sup>, Shozo Tobimatsu<sup>2</sup> and Shigenobu Kanba<sup>1</sup>*

*<sup>1</sup> Department of Neuropsychiatry, Faculty of Medical Sciences, Kyushu University, Fukuoka, Japan*

*<sup>2</sup> Department of Clinical Neurophysiology, Faculty of Medical Sciences, Kyushu University, Fukuoka, Japan*

*<sup>3</sup> Center for Clinical and Translational Research, Kyushu University Hospital, Fukuoka, Japan*

#### *Edited by:*

*Gabor Stefanics, University of Zurich and ETH Zurich, Switzerland*

#### *Reviewed by:*

*Piia Astikainen, University of Jyväskylä, Finland Gabor Stefanics, University of Zurich and ETH Zurich, Switzerland*

#### *\*Correspondence:*

*Toshihiko Maekawa, Department of Clinical Neurophysiology, Faculty of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan e-mail: t-mae@npsych.med.kyushu-u.ac.jp*

**Objective:** Mismatch negativity (MMN) and P3 are unique ERP components that provide objective indices of human cognitive functions such as short-term memory and prediction. Bipolar disorder (BD) is an endogenous psychiatric disorder characterized by extreme shifts in mood, energy, and ability to function socially. BD patients usually show cognitive dysfunction, and the goal of this study was to assess their altered visual information processing via visual MMN (vMMN) and P3 using windmill pattern stimuli.

**Methods:** Twenty patients with BD and 20 healthy controls matched for age, gender, and handedness participated in this study. Subjects were seated in front of a monitor and listened to a story via earphones. Two types of windmill patterns (standard and deviant) and white circle (target) stimuli were presented on the monitor in random order, each for 200 ms with an 800-ms inter-stimulus interval. Stimuli were presented at 80% (standard), 10% (deviant), and 10% (target) probabilities. The participants were instructed to attend to the story and press a button as soon as possible when the target stimuli were presented. Event-related potentials (ERPs) were recorded throughout the experiment using 128-channel EEG equipment. vMMN was obtained by subtracting responses to standard stimuli from responses to deviant stimuli, and P3 was evoked by the target stimulus.

**Results:** Mean reaction times for target stimuli in the BD group were significantly longer than those in the control group. Additionally, mean vMMN-amplitudes and peak P3-amplitudes were significantly lower in the BD group than in controls.

**Conclusions:** Abnormal vMMN and P3 in patients indicate a deficit of visual information processing in BD, which is consistent with their increased reaction time to visual target stimuli.

**Significance:** Both bottom-up and top-down visual information processing are likely altered in BD.

**Keywords: bipolar disorder, bottom-up, top-down, visual mismatch negativity, visual information processing, windmill pattern, lithium**

# **INTRODUCTION**

Bipolar disorder (BD) is a chronic illness characterized by recurring mood episodes of depression, mania, or mixed states, which often lead to debilitating clinical and functional outcomes. Many patients (30–60%) experience occupational impairment and social dysfunction even during inter-episode euthymic states (Kam et al., 2011). Indeed, a meta-analysis has concluded that BD is characterized by significant deficits in a broad range of cognitive functions that also persist into euthymic phases, including verbal memory, sustained attention, aspects of executive functions, and emotional processing (Andersson et al., 2008). Moreover, BD has been reliably associated with enduring cognitive deficits and abnormal neurophysiological responses such as amplitude- and latency-modulated event-related potentials (ERPs) (Johannesen et al., 2012).

Several auditory ERP components have been found to be impaired in BD (Thaker, 2008), and after much subsequent investigation, are acknowledged as promising potential biomarkers. Mismatch negativity (MMN) is an important auditory ERP that reflects the detection of deviations from an auditory regularity, and is elicited even when attention is not directed to the stimuli. Therefore, MMN is considered to be an index of pre-attentive auditory information processing (Näätänen, 1990). While no abnormalities have been reported in the few studies that have investigated auditory MMN (aMMN) in BD patients (Catts et al., 1995; Umbricht et al., 2003; Salisbury et al., 2007; Hall et al., 2009), the most recent study (Domján et al., 2012) revealed a prolonged pitch-deviant aMMN latency in patients with BD. Although it is increasingly evident that some of these auditory deficits are common in BD, potential visual dysfunction has not yet been sufficiently clarified, in particular, with regard to the preattentive (automatic) information processing underlying visual MMN (vMMN). Because individuals with psychiatric disorders often show abnormalities in both auditory and visual information processing (Maekawa et al., 2012), we believe that like aMMN, vMMN is a promising potential biomarker for psychiatric disorders such as BD.

Regarding other ERPs, individuals with BD differ from healthy control subjects in ERP measures of auditory processing elicited by "oddball" discrimination tasks. In these tasks, participants must identify infrequent target tones presented within a series of frequent (standard) tones. Standard tones elicit the P1, N1, and P2 ERP components, whereas target tones additionally elicit the N2 and P3 ERPs. ERP studies of BD have mainly focused on P3, which is a positive-going wave that peaks approximately 300 ms after the presentation of a target tone and is believed to be an index of selective attention and general cognitive efficiency. The peak latency of this component is believed to reflect stimulus-evaluation speed independent of reaction time, whereas its amplitude may represent neural activity underlying attention and memory processes involved in updating stimulus representations (Polich, 2004). Several studies have reported P3 abnormalities in BD patients, the most consistent being increased P3 latency (Muir et al., 1991; Strik et al., 1998; Thaker, 2008; Hall et al., 2009). However, other studies have found no differences on this measure (Salisbury et al., 1998, 1999). Similarly, while several studies have found reduced P3 peak amplitudes in BD patients (Muir et al., 1991; Salisbury et al., 1998, 1999; O'Donnell et al., 2004; Hall et al., 2009), others have found no amplitude differences (Souza et al., 1995; Strik et al., 1998). Clinically, compared with healthy controls, P3 amplitude was not reduced in a sample of patients suffering from first-episode affective psychosis (primarily BD) (Salisbury et al., 1998). However, reduced P3 amplitude has been reported in BD patients who were in remission for 6 months, suggesting that this measure indexes a relatively stable deficit that remains even after an extended euthymic period (Kaya et al., 2007).

Visual information processing occurs in several stages, with low-level processing occurring up through primary visual cortex (V1), and high-level processing occurring in upstream visual association areas. It is well known that P1 (the first positive ERP peak after stimulus onset) reflects lower-level visual processing (for a review, Tobimatsu and Celesia, 2006). Studies have shown a reduced P1 in BD patients, suggesting that lower-level visual information processing may be abnormal in BD (Yeap et al., 2009). Alternatively, the reduced P1 may result from deficits in top-down selective attention. Selective attention is the process whereby a subset of input is selected preferentially for further processing, and has two major aspects: bottom-up and top-down. Bottom-up attention is automatically driven by stimulus properties, whereas top-down attention refers to a volitional focusing of attention on a location and/or an object based on current behavioral goals (Ciaramelli et al., 2008). These streams can operate in parallel but bottom-up attention occurs more quickly than top-down attention (e.g., Treisman et al., 1992). Two specific ERP components are candidates for attentional biomarkers, with visual mismatch negativity (vMMN) and visual P3 indicating bottom-up and top-down attention, respectively (Maekawa et al., 2005, 2009). Here, the paradigm settings for standard, deviant, and target stimuli allowed us to test visual information processing systematically, unlike most vMMN studies that have investigated only pre-attentive (automatic) visual information processing. The purpose of the present study was therefore to evaluate bottom-up and top-down visual information-processing systems in BD patients, and to test the relationships between clinical and demographic measurements and vMMN and visual P3 in BD patients.

# **METHODS**

#### **PARTICIPANTS**

Twenty patients with BD (10 females; mean age: 40.8 years; mean education: 14.6 years; time since diagnosis: 11.5 years) and 20 healthy non-medicated control participants (NC; 10 females; mean age: 41.5 years; mean education: 14.5 years) without a family history of mental illness were recruited. All participants were right-handed, between 18 and 60 years of age, and had completed grade-school-level education. Exclusion criteria for the participants included a history of a head injury that resulted in loss of consciousness, a history of treatment with electroconvulsive therapy, or a history of substance abuse. For control participants, exclusion criteria included a history of substance abuse or a diagnosis of any current or past Axis I psychiatric illness. Groups did not differ significantly from each other in terms of gender, age, or education years. The patients were recruited from Kyushu University Hospital. This study was approved by the Research Ethics Committee of Kyushu University Hospital and all participants gave written informed consent. Diagnosis of BD was made using a clinical interview. DSM-IV-TR (American Psychiatry Association, 2000) diagnoses of all patients were confirmed by two experienced psychiatrists. Participants were free of any diagnosed neurological disorders and had normal or corrected-to-normal vision.

The clinical state of patients at the time of testing was assessed using the Structured Interview Guide for Hamilton Depression Rating Scale (SIGH-D) (Williams, 1988) and the Young Mania Rating Scale (YMRS) (Young et al., 1978).

All patients were taking at least one psychotropic medication. To simplify medication status, we focused on mood-stabilizer dosages (lithium and valproic acid).

Participants' demography, clinical measurements, and medication information are summarized in **Table 1**.

### **VISUAL STIMULI AND PROCEDURES**

Visual stimuli, apparatus, procedures, and ERP-recording procedures were the same as in our previous studies of healthy adults (Maekawa et al., 2005, 2009) and autism spectrum disorder (Maekawa et al., 2011).

Circular black-white windmill patterns with 90% contrast were presented on a 20-inch CRT monitor and controlled using a ViSaGe graphics board (Cambridge Research Systems Ltd, Rochester, Kent, UK). The visual stimulus subtended 5.8° of visual angle in diameter at a viewing distance of 114 cm. Participants were seated comfortably in a semi-dark room. To divert attention away from the visually deviant stimuli as much as possible, participants were instructed to focus on a story delivered binaurally through earphones while fixing their gaze on the center of the monitor. Moreover, they were instructed to press a button with their right thumb as soon as they recognized a target stimulus on the monitor. Between trial blocks, they were asked to fill out a questionnaire regarding the context of the story that they had heard (story questionnaire).
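For readers who wish to reproduce the display geometry, the relation between visual angle, viewing distance, and physical stimulus size can be sketched as follows. This is an illustrative calculation only: the function name is hypothetical, and the code assumes the stated angle is the full angle subtended by the stimulus diameter, centered on fixation.

```python
import math


def stimulus_diameter_cm(visual_angle_deg: float, viewing_distance_cm: float) -> float:
    """Physical diameter of a centered stimulus subtending the given full visual angle."""
    half_angle_rad = math.radians(visual_angle_deg / 2.0)
    return 2.0 * viewing_distance_cm * math.tan(half_angle_rad)


# Values taken from the text: 5.8 degrees at a viewing distance of 114 cm.
diameter = stimulus_diameter_cm(5.8, 114.0)
print(f"{diameter:.2f} cm")
```

Under these assumptions the stimulus diameter on the monitor works out to roughly 11.5 cm.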

Standard, deviant, and target stimuli were presented in a random order for 200 ms on the computer monitor (**Figure 1**). The inter-stimulus interval (ISI) was 800 ms. Stimulus probabilities were 80% (standard), 10% (deviant), and 10% (target).

ERP recordings comprised two sessions. Standard and deviant stimuli (6-vane and 24-vane windmill patterns) were counterbalanced across sessions, while the target (non-patterned white circle) remained the same throughout the experiment. The total number of stimuli presented was 1800 (1440 standard, 180 deviant, and 180 target stimuli).
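The stimulus schedule described above can be illustrated with a short sketch. Several details are assumptions made for illustration and are not stated in the text: the 1800 stimuli are split evenly across the two sessions, the order is an unconstrained shuffle, and all names (`make_session_sequence`, `PATTERNS`, the seeds) are hypothetical.

```python
import random
from collections import Counter

PROBABILITIES = {"standard": 0.8, "deviant": 0.1, "target": 0.1}


def make_session_sequence(n_trials, probabilities, seed):
    """Return a shuffled list of stimulus labels with exact proportions."""
    sequence = []
    for label, p in probabilities.items():
        sequence.extend([label] * round(n_trials * p))
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    rng.shuffle(sequence)
    return sequence


# Assumption: 900 stimuli per session (the text gives only the total of 1800).
session_a = make_session_sequence(900, PROBABILITIES, seed=1)
session_b = make_session_sequence(900, PROBABILITIES, seed=2)

# Counterbalancing: the windmill pattern used as the standard in one session
# is the deviant in the other; the target is the same in both sessions.
PATTERNS = {
    "session_a": {"standard": "6-vane", "deviant": "24-vane", "target": "white circle"},
    "session_b": {"standard": "24-vane", "deviant": "6-vane", "target": "white circle"},
}

totals = Counter(session_a) + Counter(session_b)
print(dict(totals))
```

The totals reproduce the 1440 standard, 180 deviant, and 180 target stimuli reported in the text.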

#### **Table 1 | Participants' demography, clinical measurements, and medication status information.**


*Values are expressed as mean (SD).*

*BD, bipolar disorder; NC, normal control; YMRS, young mania rating scale; SIGH-D, structured interview guide for Hamilton depression rating scale.*

### **ERP RECORDINGS**

EEG was recorded from 128 scalp sites referenced to Cz, using a high-density electroencephalography (EEG) system (Net Station 4.1 Software, Electrical Geodesics, Inc., USA). The impedances of all 128 electrodes were maintained below 50 kΩ. EEG was digitized at 500 Hz, filtered online using a 0.05–200 Hz band-pass filter, and stored on a computer.

#### **DATA AND STATISTICAL ANALYSES**

To characterize each subject's degree of attention, the accuracy of answers to the story questionnaire was calculated. Questionnaires consisted of 40 questions, such as "what was the name of the hero?" or "How many persons participated in the operation?" Additionally, reaction time (RT) and accuracy for the target stimuli were also measured as indices of participants' task performance.

EEG data were filtered off-line with a bandpass of 0.05–30 Hz. Digital codes synchronized to the stimulus onset were also stored. At the end of the experiments, EEG epochs associated with each stimulus type were extracted from the continuous record. Epochs with amplitude values exceeding a threshold of ±70μV were discarded automatically. Artifact-free epochs were then segregated by stimulus code and averaged for each subject. The amplitudes of the ERPs were measured relative to a 100-ms pre-stimulus baseline. The grand average across all subjects in each stimulus condition was also computed. To compare our findings with those of previous studies (Maekawa et al., 2005, 2009, 2011), the average of the two electrodes on either side of the nose (electrodes 126 and 127) was adopted as the reference. Eye movements and blinks were measured from bipolar electrodes above and below the eyes (right, electrodes 14 and 126; left, electrodes 21 and 127). Mean trial numbers for standard, deviant, and target stimuli were 902.7 ± 254.9, 105.1 ± 29.6, and 114.5 ± 34.2 in the BD group and were 908.3 ± 263.4, 115.3 ± 35.3, and 131.8 ± 37.3 in the NC group. There were no significant differences in the number of trials for stimulus type or between subject groups.
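The epoch-level steps described above (threshold-based rejection, baseline correction, and averaging) can be sketched for a single channel as follows. This is a minimal illustration, not the analysis software actually used; the function names are hypothetical, and applying the ±70 μV criterion to the epoch before baseline correction is an assumption, since the text does not specify the order.

```python
def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline (pre-stimulus) samples."""
    baseline = sum(epoch[:n_baseline]) / n_baseline
    return [v - baseline for v in epoch]


def is_artifact_free(epoch, threshold_uv=70.0):
    """True if no sample exceeds the +/- threshold (in microvolts)."""
    return all(abs(v) <= threshold_uv for v in epoch)


def average_erp(epochs, n_baseline, threshold_uv=70.0):
    """Average the artifact-free, baseline-corrected epochs of one stimulus type."""
    kept = [baseline_correct(e, n_baseline)
            for e in epochs if is_artifact_free(e, threshold_uv)]
    if not kept:
        raise ValueError("no artifact-free epochs")
    n = len(kept)
    return [sum(samples) / n for samples in zip(*kept)], n


# Toy data with 2 baseline samples; at 500 Hz a 100-ms baseline is 50 samples.
toy_epochs = [
    [1.0, 1.0, 3.0, 5.0],
    [2.0, 2.0, 2.0, 6.0],
    [0.0, 0.0, 90.0, 0.0],  # exceeds 70 microvolts and is discarded
]
erp, n_kept = average_erp(toy_epochs, n_baseline=2)
print(erp, n_kept)
```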

**FIGURE 1 | Three stimulus types used in the present study: six-vane circular black-white windmill pattern stimulus, 24-vane stimulus, and an un-patterned, white circle stimulus.** The two windmill pattern stimuli were adopted as standard or deviant stimuli (counterbalanced across sessions) and the white circle was always used as the target stimulus. The ratio of standard, deviant, and target stimuli was 8:1:1.

Basic ERPs that indicate common neurophysiological information processing were assessed using the P1, N1, and P2 components at Oz. Inspection of all ERP waveforms showed that the P1, N1, and P2 components were clearly identifiable in each subject (see **Table 2**). The time windows for P1, N1, and P2 peak amplitudes for standard and deviant stimuli were set at 90 ± 30 ms, 120 ± 40 ms, and 220 ± 50 ms after stimulus onset, respectively. These time windows were considered to include P1, N1, and P2 peaks in all participants (Luck, 2005). Time windows for P1, N1, and P2 peak amplitudes for the target stimuli were set to 80–140 ms, 140–200 ms, and 200–300 ms after stimulus onset, respectively (Luck, 2005). Basic ERP measurements (P1, N1, and P2 peak amplitudes/latencies) at Oz were subjected to a repeated-measures analysis of variance (ANOVA) with stimulus type (standard, deviant) as the within-subject variable and participant group (BD, NC) as the between-subject variable. We recognize that measuring basic ERP time-window amplitudes (P1-N1-P2) might be more appropriate than measuring their peak amplitudes (Picton et al., 2000). However, because most previous ERP studies regarding BD have measured peak amplitudes, choosing the same measure makes comparisons between studies more meaningful. Note that stimulus type did not include the target stimulus because its pattern was distinct from that of standard and deviant stimuli (i.e., a white circle *vs.* a windmill pattern).
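The peak measurement within a fixed time window can be sketched as below. The sketch assumes a single-channel averaged waveform sampled at 500 Hz whose first sample lies 100 ms before stimulus onset; the function names and the toy waveform are hypothetical and serve only to illustrate the windowed peak search.

```python
SAMPLING_RATE_HZ = 500  # from the recording description
BASELINE_MS = 100       # epoch assumed to start 100 ms before stimulus onset


def ms_to_index(t_ms):
    """Convert a post-stimulus time in ms to a sample index within the epoch."""
    return round((t_ms + BASELINE_MS) * SAMPLING_RATE_HZ / 1000)


def find_peak(waveform, start_ms, end_ms, polarity):
    """Return (latency_ms, amplitude) of the extreme value in the window.

    polarity is +1 for a positive peak (P1, P2) and -1 for a negative peak (N1).
    """
    lo, hi = ms_to_index(start_ms), ms_to_index(end_ms)
    best = max(range(lo, hi + 1), key=lambda i: polarity * waveform[i])
    latency_ms = best * 1000 / SAMPLING_RATE_HZ - BASELINE_MS
    return latency_ms, waveform[best]


# Toy waveform: flat except for a positive peak at 96 ms and a negative peak at 130 ms.
toy = [0.0] * 300
toy[ms_to_index(96)] = 2.5
toy[ms_to_index(130)] = -3.0

p1 = find_peak(toy, 60, 120, polarity=+1)   # window 90 +/- 30 ms
n1 = find_peak(toy, 80, 160, polarity=-1)   # window 120 +/- 40 ms
print(p1, n1)
```

Because the P1 and N1 windows overlap, the polarity argument is what keeps the two searches from selecting the same deflection.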

In all the participants, the response to deviant stimuli at the Oz was more negative than that to standard stimuli during the 150–350 ms following stimulus onset (combined BD and NC groups; paired *t*-test: *t* = 5.186, *P* < 0.001). The time window was justified by visual inspection of grand averaged waveforms and difference waveforms from each participant (see **Figure 2**), consistent with our previous studies (Maekawa et al., 2005, 2009, 2011). **Figure 3** shows the selected electrodes of interest. Electrode numbers 65, 66, and 70 represented the left occipitotemporal region (blue circles), 62, 72, and 75 the mid-occipitoparietal region (red circles), and 83, 84, and 90 the right occipitotemporal region (green circles). Note that Oz corresponds to channel 75 in the EGI net, and this channel was included in the ROI that was later used for statistical analysis. An ANOVA was performed for vMMN mean amplitudes with electrode site (right occipitotemporal, mid-occipitoparietal, left occipitotemporal) being the within subject variable and participant group (BD, NC) the between subject variable. Bonferroni *post-hoc* analysis was performed when significant main effects or interactions were observed.

Attentive visual information processing was evaluated by the N2 and P3 components, which were evoked only in response to the target stimulus. N2 and P3 peak amplitudes/latencies and vMMN mean amplitude were subjected to a repeated-measures ANOVA with electrode site (Fz, Cz, Pz, Oz) being the within-subject variable and participant group (BD, NC) the between-subject variable.

that MMN1 distributes around Pz area, while the lower panel demonstrates that MMN2 spread over the right occipitotemporal area.

*\*P* < *0.05.*

*Values are expressed as mean (SD). BD, bipolar disorder; NC, normal control.*

# **RESULTS**

Although behavioral task performance was successfully measured for all participants, data from two participants in each group were excluded from the ERP analyses because of excessive artifacts in their ERP recordings. Following these exclusions, there were 18 participants in each group. There were no significant differences in sex ratio, age, or education years between the groups.

#### **BEHAVIORAL TASK PERFORMANCE DATA**

Questionnaire accuracy, response accuracy, and reaction time for target stimuli were evaluated and compared between BD and NC groups using a one-way ANOVA. There was a marginally significant difference in mean accuracy rates for questions related to the story context [BD, 89.4%; NC, 96.7%; *F*(1, 19) = 3.15, *P* = 0.084], which may indicate a deficit in either attention or short-term memory in BD. Regarding target-stimulus detection, accuracy did not differ between groups [BD, 84.3 ± 2.5%; NC, 89.8 ± 2.5%; *F*(1, 19) = 2.52, *P* = 0.12]. However, compared with the NC group, BD patients showed significantly delayed RTs [BD, 467.4 ± 15.1 ms; NC, 402.4 ± 15.1 ms; *F*(1, 19) = 9.19, *P* = 0.0044].

robust P3 were observed in each group's waveforms. While P3 latency was not significantly different between the two groups, P3 amplitudes were significantly smaller in the BD group (*P* < 0.05). The P3 amplitude gradient for the NC group is steeper than that in the BD group, which roughly corresponds to the statistical differences.

#### **BASIC ERPs (P1-N1-P2)**

Grand-averaged ERP waveforms in response to the standard and deviant stimuli are shown in **Figure 2**. Positive (P1)-negative (N1)-positive (P2) deflections were elicited by all three stimulus types and were maximal at Oz. N2-P3 complexes appeared only with the target stimulus, and were maximal at Pz (**Figure 4**). Mean latency and peak amplitudes for each common ERP component (P1, N1, or P2) are shown in **Table 2**.

# *P1*

A main effect of group was found for both amplitude and latency of the P1 component [amplitude: *F*(1, 17) = 11.63, *P* = 0.002; latency: *F*(1, 17) = 9.01, *P* = 0.005]<sup>1</sup>. P1 latency was significantly shorter and amplitude was significantly smaller in the BD group compared with the NC group. Neither a main effect of stimulus nor an interaction between stimulus and group was found.

#### *N1*

Although there were no main effects or interactions for N1 amplitude, a main effect of group was observed for N1 latency [*F*(1, 17) = 12.08, *P* = 0.001], with latency in the BD group being significantly shorter than that in the NC group. Neither a main effect of stimulus nor an interaction between group and stimulus was found.

<sup>1</sup>Analysis of ERP component amplitudes was also carried out on mean amplitudes measured over the time windows defined for the peak search. The results of this analysis corroborated the finding that P1 amplitudes for both the standard and deviant stimuli in the BD group were significantly smaller than those of the NC group [*F*(1, 34) = 11.59, *P* = 0.002 and *F*(1, 34) = 9.80, *P* = 0.004, respectively]. One-way ANOVA did not show any significant differences between subject groups for either N1 or P2 mean amplitude.

# *P2*

There were no main effects or interactions for either P2 amplitude or latency.

# **N2-P3 COMPLEX AND vMMNs**

#### *N2*

An ANOVA of amplitude with electrode site (Fz, Cz, Pz, Oz) × participant group (BD, NC) showed a significant main effect of electrode [*F*(3, 32) = 17.59, *P* < 0.001, partial η<sup>2</sup> = 0.62]. Bonferroni *post-hoc* comparisons showed that amplitudes at Cz and Pz were significantly larger (more negative) than at Fz (Cz: *P* < 0.001, Pz: *P* < 0.001). There were no significant main effects of participant group or any interactions.

Analysis of latency revealed main effects of both electrode [*F*(3, 32) = 10.36, *P* < 0.001, partial η<sup>2</sup> = 0.49] and participant group [*F*(1, 34) = 18.54, *P* < 0.001, partial η<sup>2</sup> = 0.35]. There were no significant interactions. *Post-hoc* analysis showed that N2 latencies at Pz and Oz were significantly shorter than those at Fz and Cz (Pz: *P* < 0.001 for Fz, *P* = 0.009 for Cz; Oz: *P* < 0.001 for Fz, *P* = 0.008 for Cz).
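As a consistency check on the effect sizes reported above, partial η<sup>2</sup> can be recovered from an *F* value and its degrees of freedom using the standard identity partial η<sup>2</sup> = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The short sketch below applies this identity to the N2 statistics; the function name is hypothetical, and the calculation is offered only as an illustration, not as the procedure used by the statistics package.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# N2 statistics as reported in the text.
n2_amplitude_electrode = partial_eta_squared(17.59, 3, 32)
n2_latency_electrode = partial_eta_squared(10.36, 3, 32)
n2_latency_group = partial_eta_squared(18.54, 1, 34)

for value in (n2_amplitude_electrode, n2_latency_electrode, n2_latency_group):
    print(round(value, 2))
```

The recovered values agree with the reported effect sizes of 0.62, 0.49, and 0.35 after rounding to two decimals.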

# *P3*

ANOVAs for both amplitude and latency revealed a significant main effect of participant group [amplitude: *F*(1, 34) = 11.66, *P* = 0.02, partial η<sup>2</sup> = 0.26; latency: *F*(1, 34) = 4.44, *P* = 0.042, partial η<sup>2</sup> = 0.12]. There were no significant main effects of electrode site or any interactions (**Figure 5**).

# *vMMNs*

latency among the electrode sites. Error bar: standard error of mean,

Difference waveforms were constructed by subtracting waveforms generated in response to standard stimuli from those to the deviants. Topographical distributions were inspected to verify that the vMMN occurred around the Oz electrode 150–350 ms after stimulus onset in all participants. Therefore, vMMN amplitude was calculated for each participant as the mean amplitude during that interval. The vMMN consisted of an early peak (MMN1) with a latency between 150 and 200 ms, located predominantly over the parietal area, and a late peak (MMN2) with a latency between 200 and 350 ms located over the temporal area (**Figure 2**).
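The deviant-minus-standard subtraction and the mean-amplitude measure can be sketched as follows. The sketch assumes single-channel averaged waveforms sampled at 500 Hz that begin 100 ms before stimulus onset; the function names and toy waveforms are hypothetical.

```python
SAMPLING_RATE_HZ = 500  # from the recording description
BASELINE_MS = 100       # epoch assumed to start 100 ms before stimulus onset


def difference_wave(deviant, standard):
    """Deviant-minus-standard difference waveform, sample by sample."""
    if len(deviant) != len(standard):
        raise ValueError("waveforms must have the same length")
    return [d - s for d, s in zip(deviant, standard)]


def mean_amplitude(waveform, start_ms, end_ms):
    """Mean amplitude over an inclusive post-stimulus time window."""
    lo = round((start_ms + BASELINE_MS) * SAMPLING_RATE_HZ / 1000)
    hi = round((end_ms + BASELINE_MS) * SAMPLING_RATE_HZ / 1000)
    window = waveform[lo:hi + 1]
    return sum(window) / len(window)


# Toy waveforms: the deviant response is 2 microvolts more negative
# than the standard response between 150 and 350 ms.
standard = [1.0] * 300
deviant = [1.0] * 300
for i in range(125, 226):  # samples corresponding to 150-350 ms
    deviant[i] = -1.0

diff = difference_wave(deviant, standard)
vmmn = mean_amplitude(diff, 150, 350)
pre_window = mean_amplitude(diff, 0, 100)
print(vmmn, pre_window)
```

A negative mean amplitude in the 150–350 ms window corresponds to the deviant response being more negative than the standard response, which is the sign convention used for vMMN in the text.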

While ANOVA revealed no significant main effects or interactions for MMN1 amplitude, significant main effects of both electrode site [*F*(2, 33) = 3.25, *P* = 0.049, partial η<sup>2</sup> = 0.17] and participant group [*F*(1, 34) = 42.01, *P* < 0.001, partial η<sup>2</sup> = 0.55] were observed for MMN2. The interaction between electrode site and participant group was also significant [*F*(2, 33) = 3.48, *P* = 0.042, partial η<sup>2</sup> = 0.18]. *Post-hoc* analysis with multiple comparisons showed that the response at the right occipitotemporal region was significantly larger than that at the mid-occipitoparietal region (*P* = 0.043), but only in the NC group (BD: *P* = 1.000; NC: *P* < 0.001).

### **RELATIONSHIPS BETWEEN ERP COMPONENTS AND DEMOGRAPHIC AND CLINICAL MEASURES**

We conducted a multiple regression analysis to examine the relationship between ERP-component amplitudes (MMN2 and P3) and demographic and clinical variables among BD patients. Because mean amplitudes collected from some electrodes might reduce the statistical power, we adopted the amplitude of MMN2 at Oz and that of P3 at Pz for this analysis. The demographic and clinical variables tested with the model were age, sex, years of education, symptom score (YMRS and SIGH-D), mood-stabilizer dosage (lithium or valproic acid), and illness onset age. **Figure 6** shows scatterplots of ERP amplitudes and mood-stabilizer dosage. Among BD patients, Spearman's rank correlation analysis revealed that there was a significant relationship between lithium dosage and MMN2 mean amplitude (*R* = 0.48, *P* = 0.043). A significant effect of dosage on P3 amplitude was also observed (*R* = −0.42, *P* = 0.043). Because the number of patients taking valproic acid was small (i.e., five), valproic acid dosage was removed from the correlation analysis.
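For illustration, Spearman's rank correlation of the kind reported above can be computed as the Pearson correlation of rank-transformed values, with tied values sharing their average rank. The sketch below is self-contained; the function names are hypothetical and the dosage and amplitude values are invented toy numbers, not data from this study. No significance test is included.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical toy values (dosage in mg/day, amplitude in microvolts).
toy_dosage = [200, 400, 400, 600, 800]
toy_amplitude = [5.0, 4.0, 4.5, 3.0, 1.0]
rho = spearman_rho(toy_dosage, toy_amplitude)
print(round(rho, 3))
```

Because the statistic depends only on ranks, it is insensitive to the measurement scale of either variable, which suits a small sample with unevenly spaced dosages.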

**FIGURE 6 | Spearman's rank correlation analysis for MMN2 (left) and P3 (right) as a function of lithium dosage within the BD patient group.** In both panels, MMN2 and P3 amplitudes are smaller with increasing lithium dosage.

∗*P* < 0.01; ∗∗*P* < 0.001.

P3 peak amplitude was also significantly correlated with age (*R* = −0.43, *P* = 0.036). Other demographic and clinical variables were not found to be related to MMN2 mean-amplitude or P3 peak-amplitude.

# **DISCUSSION**

The present study used ERP responses to a non-social visual stimulus to determine whether or not patients with BD show significant differences in visual information processing when compared with healthy individuals. The major differences between BD and control groups are summarized as follows. (1) BD patients performed marginally worse on the auditory story-context questions and had significantly slower reaction times. (2) The P1 response to standard and deviant stimuli in BD patients was significantly earlier and smaller than that in the NC group. (3) N1 response latency to standard and deviant stimuli in BD patients was significantly shorter than that in the NC group. (4) The N2 latency to the target stimulus in BD patients was significantly delayed and the P3 component was smaller in BD subjects than that in the NC group. (5) MMN2 amplitude in the right occipitotemporal area in BD patients was significantly smaller than that in the NC group. (6) Both MMN2 and P3 amplitudes were significantly correlated with lithium dosage in BD patients. Thus, the ERP profiles for the two groups contained more differences than we expected. Possible explanations for such differences in ERP are discussed below.

#### **ALTERATION OF EARLY VISUAL PROCESSING IN BD**

We found that compared with controls, the early visual potential (P1) in BD patients was altered, with a significantly shorter latency and smaller amplitude. Most previous studies have not found any significant difference of P1 latency in BD. We presume that other components (such as C1) may overlap the P1 period in the present study. Because the signal to noise ratio can increase because of enormous P1-amplitude reduction, a hidden C1 can emerge that may be mistaken for P1. Even so, the P1-amplitude reduction seen here is consistent with reduction observed in ERP studies of endogenous neuropsychiatric disorders such as schizophrenia (Yeap et al., 2008). Moreover, the reduction seen here was very similar to deficits we reported in patients with schizophrenia using the identical paradigm (Maekawa et al., 2008). This suggests that visual sensory-processing deficits are common to both conditions. Strikingly, these findings are fairly consistent with results from another study in which reduced P1 amplitude to a geometric stimulus (isolated-check image) was demonstrated in BD patients (Yeap et al., 2009). Because the weight of evidence suggests that the P1 deficit is endophenotypic for schizophrenia (Hirano et al., 2010), it will be important for future investigations to establish whether this marker of visual dysfunction indexes shared genetic liability between schizophrenia and BD.

Contrary to our expectation, N1 latency was significantly shorter in the BD group than in controls. Whereas ERP studies in BD patients often do not focus on the N1 component, abnormal N1 latency to auditory and/or visual stimuli in BD has been reported (Andersson et al., 2008; Fridberg et al., 2009; Lijffijt et al., 2009). Auditory and visual N1 may have different neurophysiological roles because they are generated from different structures in the brain (supratemporal and extrastriate cortices, respectively). Even so, it is well known that both are modulated by attention (for a review, see Näätänen, 1988). Because the N1 latency and amplitude depend on stimulus conditions (e.g., stimulus type, ISI, intensity, arousal, or attention), interpreting it in terms of a mechanism for illness is sometimes difficult (Rosburg et al., 2008).

#### **ABNORMAL ATTENTIVE PROCESSING IN BD**

The P3, including P3a and P3b, is the most-tested ERP component in patients with BD. Although most studies show abnormal P3 amplitude and/or latency in the grand averaged waveforms in BD patients, whether real statistical differences exist has been controversial (significant: Andersson et al., 2008; Schulze et al., 2008; Fridberg et al., 2009; Hall et al., 2009; Ryu et al., 2010; Jahshan et al., 2012; Johannesen et al., 2012; non-significant: Salisbury et al., 1998, 1999; Bestelmeyer, 2012; Domján et al., 2012). Generally, the group differences and effect size found in ERP measures are not as convincing as the neurophysiological differences, allowing no firm conclusions. However, we found a significantly delayed N2 latency and smaller P3 amplitude. The N2-P3 complex in response to target stimuli is usually called the N2b-complex, and underlies attentional processing for target detection (Näätänen, 1990). P3 peak latency is believed to reflect stimulus-evaluation speed independent of reaction time, whereas its amplitude may represent neural activity underlying attention and memory processes involved in updating stimulus representation (Polich, 2004). Despite significantly delayed reaction time, prolonged N2 latency, and reduced P3 amplitude in our BD group, patients followed the contexts of the stories during the examination as well as the control group did. Behavioral and neural results indicate that BD patients here likely had a deficit in attention that was obvious behaviorally and neurally, but not clinically.

# **ABNORMAL PRE-ATTENTIVE PROCESSING IN BD**

This is the first report regarding vMMN in patients with BD. Although the existence of a visual analogue of auditory MMN (aMMN) has long been debated, some studies (Pazo-Alvarez et al., 2003; Maekawa et al., 2005, 2009) have demonstrated genuine vMMN that meets the MMN criteria. vMMN is often described as a negativity measured at the occipital electrodes between 150 and 350 ms after the onset of an infrequent (deviant) visual stimulus inserted in a sequence of frequently presented (standard) visual stimuli (Pazo-Alvarez et al., 2003; Czigler, 2007). vMMN is assumed to have similar properties to aMMN, but in the visual modality. Moreover, it can be evoked pre-attentively, which reflects the memory representation of visual-stimulation regularity (Czigler, 2007). Although a number of studies have found converging evidence for the existence of vMMN, there has been little vMMN research related to neuropsychiatric disorders (for a review, see Maekawa et al., 2012). Because there have been few BD reports regarding neurocognitive dysfunction in areas such as sustained attention, selective attention, or visual working memory (Balanzá-Martínez et al., 2008), we hypothesized that vMMN could be a sensitive biomarker for detecting deficits of pre-attentive (automatic) information processing in BD patients. As expected, results in this study demonstrated that vMMN was evoked in the parieto-occipito-temporal area in both subject groups (see topographical maps in **Figure 2**). Moreover, vMMN comprised an early phase (100–150 ms, MMN1) and a later one (200–350 ms, MMN2), identical to our previous findings (Maekawa et al., 2005), and MMN2 in the BD group was significantly smaller than that in the NC group.
Several studies have demonstrated the existence of two vMMN components (e.g., Astikainen and Hietanen, 2009; Kimura et al., 2009) and investigated the sources of vMMN for motion (Pazo-Alvarez et al., 2004; Yucel et al., 2007; Cléry et al., 2013), direction (Kimura et al., 2010), face (Kimura et al., 2012), color (Urakawa et al., 2010a,b; Müller et al., 2012), shape (Kecskés-Kovács et al., 2013), and handedness (Stefanics and Czigler, 2012). While neural activation in the occipital lobe was commonly observed in these studies, several activations in other regions were also reported (for instance, posterior parietal cortex, anterior premotor cortex, orbitofrontal cortex, and temporal cortex). Two recent studies (Müller et al., 2012; Kecskés-Kovács et al., 2013) suggest that the earlier component is localized to retinotopically organized regions of the visual cortex and that the later one is generated from the middle occipital gyrus. The early phase has been characterized by deviance-related low-level activation (not allowing for memory issues), while the later one corresponds to the detection of changes based on memory comparisons. Therefore, our vMMN findings, especially MMN2, suggest that patients with BD have limited visual information processing that underlies deficits in pre-attentive memory-based detection of changes in the visual world.

Regarding vMMN laterality between BD and NC groups, MMN2 amplitude in the right occipitotemporal area in the BD group was smaller than that in the NC group. Two reports of vMMN in patients with major depressive disorder (MDD) have been published (Chang et al., 2010; Qiu et al., 2011). Chang et al. (2010) showed that expression-related vMMN at the P8 electrode was smaller in MDD patients compared with normal controls, suggesting a dysfunction in pre-attentive processing of emotional faces. Qiu et al. (2011) tested vMMN for duration-deviant stimuli in MDD and found that patients had dysfunctional visual-duration processing in the pre-attentive stage. Although vMMN for short-duration deviants did not differ across groups, vMMN for long-duration deviants was significantly smaller in the right occipitotemporal area of MDD patients. Thus, vMMN is smaller in both MDD and BD patients compared with healthy controls. Although BD and MDD are considered to be different types of illnesses, finding that both conditions are associated with altered vMMN in the right occipitotemporal area implies a common abnormality in visual information processing.

To date, only two studies have examined the association between vMMN amplitude and behaviorally relevant factors. Stefanics and Czigler (2012) showed that vMMN amplitude to deviant right-hand stimuli correlated with behavioral preference to use the right hand. They concluded that continuously monitoring the identity of the left or right hand is a prerequisite for the ability to automatically transform observed actions into an observer's egocentric spatial reference frame. Gayle et al. (2012) demonstrated that vMMN in individuals with autism-spectrum personality traits was less sensitive to happy emotional expressions, and correlated well with Adult Autism Spectrum Quotient scores. They suggested that vMMN elicited by deviant emotional social expressions may be a useful indicator of affective reactivity and may thus be related to social competency in autism spectrum disorder. These reports indicate that the vMMN is not merely an epiphenomenon but a pre-attentive measure relevant to behavior.

# **CHANGES IN BOTTOM-UP AND TOP-DOWN SENSORY PROCESSING IN BD**

From classical selective-attention capacity-model theory (Kahneman et al., 1992), attention resources in humans are finite and cognitive processing can work successfully only when they are shared correctly. Under a task condition that overloads attention processing, operating efficiency is apparently decreased. A well-known working memory model (Baddeley, 2001), developed from a dual storage model (Atkinson and Shiffrin, 1971), suggests that sensory information (stimulus) automatically enters into sensory registers and is kept as a sensory memory (∼500 ms). If selective attention is directed to the sensory memory, intentional processing can work. The sensory register consists of the phonological loop, visuospatial sketchpad, and central executive. The phonological loop and visuospatial sketchpad are controlled and integrated by the central executive system. Therefore, vMMN underlying sensory memory and/or a prediction system (Stefanics et al., 2012) could reflect bottom-up visual processing (Winkler and Czigler, 2012), while P3 underlying the central executive system could represent top-down visual processing (Saida et al., 2013). According to a more recent model (Friston, 2005), auditory MMN emerges when the incoming stimulus is incongruent with events that are predicted on the basis of learned statistical regularities of the stimulus properties. In line with this, vMMN emerges when a current visual event is incongruent with visual events that are predicted on the basis of extracted sequential rules (i.e., prediction error account of vMMN). Moreover, in this model, forward, backward, and lateral neural connections in the human brain underlie the vMMN. Predictive memory representations of environmental regularities are generated by interactions between multiple levels of a hierarchical system in the brain. 
Therefore, from these models, our results here can be interpreted as showing abnormal visual working memory systems in BD patients, including both bottom-up and top-down processing.

# **CORRELATION BETWEEN LITHIUM AND ERPS**

Correlation analysis revealed significant mutual relationships between age and P3 that were consistent with several previous studies in healthy subjects (Polich, 1991, 2007; Juckel et al., 2012), but are beyond the scope of the current report. Both MMN2 and P3 amplitudes were negatively correlated with lithium dosage (**Figure 6**) and to the best of our knowledge this is the first report of such correlation.

Several researchers have reported effects of lithium on neuropeptides, cognition, attention, and verbal memory (Bell et al., 2005; Senturk et al., 2007; Nishino et al., 2012; Chiu et al., 2013). For example, brain-derived neurotrophic factor (BDNF), which is an important neurotrophin for learning and memory via neurogenesis (Nishino et al., 2012), is itself affected by lithium, and the relationship may be explained by the glutamatergic system present in BD patients. In rats, phosphorylation of the NMDA receptor NR2B subunit at Tyr1472 is reduced, suggesting that lithium works to protect against glutamate excitotoxicity in cerebral neurons (Hashimoto et al., 2007). Thus, lithium can affect neurons through its interactions with BDNF, and increasing evidence establishes correlations between BDNF secretion and the glutamate system. In contrast, there is little evidence about its neurophysiological effect on vMMN and P3. One auditory MMN report (Jahshan et al., 2012) showed that there were no significant differences in auditory MMN or P3a between BD patients taking lithium and those who were not. Mood stabilizers such as lithium are given to BD patients to control their affective state, and their effects on attention and cognitive function are therefore very important for patients' quality of life. More meticulous investigation and prudent interpretation of this issue are therefore needed.

#### **METHODOLOGICAL CONSIDERATIONS**

Essentially, vision is always accompanied by attention, which is a basic difference from audition. Therefore, we took scrupulous care to control attention in the present experimental design. Participants listened to a story throughout the experiments, and the accuracy of their answers to a questionnaire related to the story was 89.4% and 96.7% in the BD and NC groups, respectively. In addition, accuracy for detecting the correct target was 84.3% and 89.8% in the BD and NC groups, respectively. The high behavioral accuracies assured us that the subjects divided their attention between the auditory task and visual target identification. However, we could not directly measure how much attention was diverted toward the deviant stimuli. Even so, the attention-specific N2b-P3 complex was activated only by the target stimuli for each condition but not by the deviant stimuli. This result supports the idea that attention was shifted away from the deviant stimulus. In addition, we have previously tested for attentional leak to the deviant stimulus and concluded that our vMMN satisfies the definition of MMN (Maekawa et al., 2005). Moreover, several studies that carefully controlled the direction of attention (e.g., Czigler et al., 2002, 2004; Heslenfeld, 2003; Kimura et al., 2009, 2010; Stefanics et al., 2012; Stefanics and Czigler, 2012) reported vMMNs similar to the ones we present here. Therefore, even if attention leaked toward the deviant stimulus, we believe that it would not significantly alter our present results.

# **CONCLUSION**

Our study is the first to simultaneously investigate both bottom-up and top-down visual information processing in patients with BD. vMMN exhibits properties of early automatic memory-based comparison processing, whereas P3 indexes higher-level, attention-dependent cognitive functions. The deficits in visual information processing that BD patients exhibit seem to be present from the very early stages all the way to higher-level cognitive functions.

# **ACKNOWLEDGMENTS**

This manuscript was supported in part by Grants-in-Aid for Scientific Research from the Ministries of Education, Culture, Sports, Science, and Technology (40448436), and of Health, Labour and Welfare (H23 Kokoro-ippan-002), Japan.

# **REFERENCES**

A., Sánchez-Moreno, J., Salazar-Faile, J., et al. (2008). Neurocognitive endophenotypes (Endophenocognitypes) from studies of relatives of bipolar disorder subjects: a systematic review. *Neurosci. Biobehav. Rev.* 32, 1426–1438. doi: 10.1016/j.neubiorev.2008.05.019


Relationships between auditory event-related potentials and mood state, medication, and comorbid psychiatric illness in patients with bipolar disorder. *Bipolar Disord.* 11, 857–866. doi: 10.1111/j.1399-5618.2009.00758.x


60, 2027–2034. doi: 10.1016/j.neuroimage.2012.02.019


A. (2004). Auditory event-related potential abnormalities in bipolar disorder and schizophrenia. *Int. J. Psychophysiol.* 53, 45–55.


(2007). Progressive and interrelated functional evidence of post-onset brain reduction in schizophrenia. *Arch. Gen. Psychiatry* 64, 521–529. doi: 10.1001/archpsyc.64.5.521


affective disorder. *Biol. Psychiatry* 37, 300–310. doi: 10.1016/0006-3223(94)00131-L


T., et al. (2003). How specific are deficits in mismatch negativity generation to schizophrenia? *Biol. Psychiatry* 53, 1120–1131. doi: 10.1016/S0006-3223(02)01642-6


*Eur. Arch. Psychiatry Clin. Neurosci.* 258, 305–316. doi: 10.1007/s00406-008-0802-2


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 February 2013; accepted: 09 July 2013; published online: 26 July 2013. Citation: Maekawa T, Katsuki S, Kishimoto J, Onitsuka T, Ogata K, Yamasaki T, Ueno T, Tobimatsu S and Kanba S (2013) Altered visual information processing systems in bipolar disorder: evidence from visual MMN and P3. Front. Hum. Neurosci. 7:403. doi: 10.3389/fnhum.2013.00403*

*Copyright © 2013 Maekawa, Katsuki, Kishimoto, Onitsuka, Ogata, Yamasaki, Ueno, Tobimatsu and Kanba. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

# Visual mismatch negativity in the dorsal stream is independent of concurrent visual task difficulty

#### *Jan Kremláček<sup>1</sup>\*, Miroslav Kuba<sup>1</sup>, Zuzana Kubová<sup>1</sup>, Jana Langrová<sup>1</sup>, Jana Szanyi<sup>1</sup>, František Vít<sup>1</sup> and Michal Bednář<sup>2</sup>*

*<sup>1</sup> Department of Pathological Physiology, Faculty of Medicine, Charles University in Prague, Hradec Králové, Czech Republic*

*<sup>2</sup> Department of Rehabilitation, Faculty of Medicine, Charles University in Prague, Hradec Králové, Czech Republic*

#### *Edited by:*

*Gabor Stefanics, University of Zurich & ETH Zurich, Switzerland*

#### *Reviewed by:*

*Fruzsina Soltész, GlaxoSmithKline, UK Risto Näätänen, University of Helsinki, Finland*

#### *\*Correspondence:*

*Jan Kremláček, Department of Pathological Physiology, Faculty of Medicine, Charles University in Prague, Simkova 870, 500 38 Hradec Králové, Czech Republic e-mail: jan.kremlacek@lfhk.cuni.cz*

The manipulation of attention can produce mismatch negativity-like components that are not necessarily connected to the unintentional sensory registration of the violation of probability-based regularity. For clinical purposes, attentional bias should be quantified because it can vary substantially among subjects and can decrease the specificity of the examination. This experiment targets the role of attention in the generation of visual mismatch negativity (vMMN). The visual regularity was generated by a sequence of two radial motions while subjects focused on visual tasks in the central part of the display. Attentional load was systematically varied across three levels: no-load, easy, and difficult. Rare deviant and frequent standard motions were presented with a 10/60 ratio in oddball sequences. Data from 12 subjects were recorded from 64 channels and processed. vMMN was identified within the interval of 142–198 ms. The mean amplitude was evaluated during this interval in the parietal and fronto-central regions. A general linear model for repeated measures was applied to the mean amplitude with a three-factor design and showed a significant difference between standard and deviant stimuli [*F*(1, 11) = 17.40, *p* = 0.002] and between regions [*F*(1, 11) = 8.40, *p* = 0.01]; however, no significant effect of the task [*F*(2, 22) = 1.26, *p* = 0.30] was observed. The unintentional detection of irregularity during the processing of the visual motion was independent of the attentional load associated with handling the central visual task. The experiment did not demonstrate an effect of attentional load manipulation on mismatch negativity (MMN) induced by the motion sequence, which supports the clinical utility of this examination. However, the stimulation paradigm used should be further optimized to generate a mismatch negativity that is stable enough to be usable not only for group comparisons but also for single-subject assessment.

**Keywords: visual mismatch negativity, visual motion, magnocellular pathway, dorsal stream, attention, irrelevant stimulus processing**

# **INTRODUCTION**

A specific component of the event-related potential (ERP), called Mismatch Negativity (MMN), denotes an electrophysiological correlate of the brain's detection of an unintentional disruption in the regularity of temporal events. The underlying mechanism is currently attributed to the conflict (error) between sensory input and a prediction and is involved in the processes of perceptual learning (Garrido et al., 2009). Originally, the MMN was described in the auditory modality (Naatanen et al., 1978) as a sensory intelligence within the primary sensory cortex that registers deviant events in a series of standard events (Naatanen et al., 2001). Recent studies on this topic identified an analogous response in the visual modality (vMMN) (Pazo-Alvarez et al., 2003).

Similar to the MMN in the auditory modality, utilizing the vMMN may represent a promising approach for the study of implicit perceptual learning in neuropsychiatric patients, as it is an inexpensive and non-invasive method. This method has previously generated positive results in patients with diseases such as Alzheimer disease (Tales and Butler, 2006; Tales et al., 2008), schizophrenia (Urban et al., 2008), depression (Chang et al., 2011), and autism (Cléry et al., 2013) or in abusers of methamphetamine (Hosak et al., 2008; Kremlacek et al., 2008).

Initially the MMN was recognized as a component independent of attention [in the auditory modality it can be elicited during coma or sleep—see (Näätänen et al., 2011)] and is different from the neuronal fatigue response [i.e., it can be elicited in response to an omitted stimulus (Czigler et al., 2006)]. Genuine MMN reflects a biologically important mechanism for the detection of irregularities in the environment (Czigler et al., 2007).

The MMN, as an electrophysiological marker of specific sensory discrimination, can be confounded by concurrent processes that mimic its appearance. One such process is the aforementioned neural fatigue response (refractoriness), during which a neural population of cells shows repetition-induced suppression of responses to standard stimuli, while another neural population of cells responds to different features of the deviant stimulus without suppression. Attention-related negative components can also confound processes (Czigler, 2007) that are connected to the MMN, as attention can change the ERP response in early visual processing without sensory discrimination (Luck et al., 2000)<sup>1</sup>. For this reason a vMMN review (Czigler, 2007) addressed the issue of attention and noted the necessity to control for this potentially confounding effect.

Because the measurement of the vMMN has to control for refractoriness and attention bias, the procedure is typically long and is paired with a demanding task; thus, its clinical utility is limited as the attentional resources of neuro-psychiatric patients are restricted.

Visual processing is initially anatomically separated into three pathways (parvo-, magno- and konio-cellular). It is generally accepted that the parvocellular (sustained) system conducts information about form and color to the ventral stream and that the magnocellular (transient) system predominantly carries motion information to the dorsal stream (Ungerleider and Mishkin, 1982; Livingstone and Hubel, 1988). Although the separate inputs are heavily interconnected in the later stages of processing, it is possible, to some extent, to activate the dorsal stream separately by using stimuli with a low spatial frequency, low contrast, and high temporal frequency (Kuba et al., 2007).

The transient/magnocellular system is considered to be faster than the parvocellular system and is engaged in exogenous attention processing (Steinman et al., 1997; Abrams and Christ, 2003; Laycock et al., 2008) [although not exclusively (Ries and Hopfinger, 2011)] and therefore might be more suitable for vMMN examination.

Because of selective deficits within the previously mentioned streams in some neuro-ophthalmic disorders, such as open angle glaucoma, multiple sclerosis, neuroborreliosis, amblyopia, among others (Kubova et al., 1996; Arakawa et al., 1999; Szanyi et al., 2012), the examination of the vMMN along the magnocellular pathway/dorsal stream might bring new information.

In our previous study, we used a paradigm for vMMN generation through the activation of the magnocellular pathway that met the requirements for refractoriness elimination (Kremlacek et al., 2006). For the experiment described in this study, we modified our previous design. We used radial motion (Kremlacek et al., 2004) for more effective standard/deviant peripheral activation and we applied an interleaved numeric task of different stimulus dimension for the control of attention. The interleaved design shortened the examination time and the use of numbers in the center of the visual field allowed for additional manipulations with attentional involvement.

The aim of this study was to evaluate the effect of task difficulty on an electrophysiological correlate of the violation of probability-based regularity, induced by the activation of magnocellular input via a motion sequence. We also sought to determine a sufficient level of task difficulty to allow for unbiased vMMN examination during clinical use.

# **METHODS**

# **SUBJECTS**

We examined a group of twelve healthy adult subjects (aged 21–61 years, 3 females) with no ophthalmologic or neurological abnormalities and with normal or corrected-to-normal visual acuity. Informed consent was obtained from each subject after they received an explanation of the test procedure. The study was approved by the Ethical Committee of the Faculty of Medicine in Hradec Kralove and experiments were conducted in accordance with the Declaration of Helsinki (World Medical Association, 2004).

# **STIMULI**

The stimulus consisted of a low contrast (10%) sinusoidal circular pattern outside of the central 10° of the visual field of 36 × 47°. The spatial frequency of the pattern decreased toward the periphery, from 0.4 to 0.2 c/°. The pattern changed every 200 ms in a sequence of expansion (100 ms) and contraction (100 ms) or in the opposite sequence (contraction followed by expansion), with a velocity from 12.5 to 25°/s, to keep the temporal frequency of 5 Hz constant within the stimulus field.
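The velocity range quoted above follows from holding the temporal frequency fixed: for a drifting grating, temporal frequency is velocity multiplied by spatial frequency. The short Python check below uses only the values given in the text. The function name is ours, and pairing the slow velocity with the high spatial frequency (and the fast velocity with the low one) is our assumption about how the values correspond.

```python
def temporal_frequency_hz(velocity_deg_per_s: float, spatial_freq_c_per_deg: float) -> float:
    """Temporal frequency (Hz) = velocity (deg/s) x spatial frequency (cycles/deg)."""
    return velocity_deg_per_s * spatial_freq_c_per_deg


# Assumed pairing: 0.4 c/deg with 12.5 deg/s, 0.2 c/deg with 25 deg/s.
near_centre = temporal_frequency_hz(12.5, 0.4)
far_periphery = temporal_frequency_hz(25.0, 0.2)

print(f"near centre:   {near_centre:.2f} Hz")
print(f"far periphery: {far_periphery:.2f} Hz")
```

Under that pairing, both ends of the pattern give the same temporal frequency, which matches the 5 Hz stated in the text.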

In between the motion sequences, the pattern was stationary for 600 ms. During this stationary phase, the fixation point in the center of the stimulus field was changed to a randomly selected digit from 1 to 8 for 200 ms.

The vMMN was elicited by a change in the sequence of the expanding/contracting radial motions while the subject visually fixated on the central part of the display. The ratio between deviant and standard stimuli was 0.17. In half of the recorded blocks, the standard stimulus was an expanding/contracting motion and the deviant was a contracting/expanding motion. During the second half of the blocks, the stimuli were interchanged (see **Figure 1**).

To explore the relationship between the vMMN and the amount of attention allocated outside the standard/deviant stimuli, we used three tasks: a simple central fixation requiring no overt behavioral response and an oddball task of two difficulties. During the oddball task, subjects were instructed to press a handheld button as soon as the number 1 (easy task) or the numbers 1, 4, or 8 (difficult task) appeared. The target to non-target ratio was 0.30 for both the difficult and easy tasks. The number of target stimuli was the same in both oddball tasks and it was twice the number of deviant stimuli.

The entire session consisted of 7 blocks and each block included three tasks that were presented pseudo-randomly in three sub-blocks, each lasting one minute. Stimulus presentation in each block was terminated when 10 deviant and 20 target stimuli were delivered. The number of standard and non-target stimuli was different in each block but corresponded with the previously mentioned probabilities. Between sub-blocks there were 5 s breaks and between blocks there were 15 s breaks with short joke texts presented on the screen to keep the subjects alert.
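As a rough consistency check on the timing described above, the sketch below computes the expected length of one sub-block and the number of deviant epochs available per subject. It rests on two assumptions that the text does not state outright: that the "10/60" ratio means 10 deviants per 60 standards, and that each stimulus cycle lasts 800 ms (200 ms of motion plus 600 ms stationary). The constant and function names are illustrative only.

```python
MOTION_S = 0.2                # one motion sequence (two 100 ms phases), from the text
STATIONARY_S = 0.6            # stationary phase between motion sequences, from the text
DEVIANTS_PER_SUBBLOCK = 10    # sub-block ends after 10 deviants, from the text
STANDARDS_PER_SUBBLOCK = 60   # ASSUMPTION: "10/60" read as deviants per standards
ANALYSED_BLOCKS = 6           # 7 blocks minus the familiarization block
TASKS = 3                     # fixation only, easy, difficult


def subblock_duration_s(n_deviants: int, n_standards: int) -> float:
    """Expected sub-block length if every stimulus occupies one full cycle."""
    return (n_deviants + n_standards) * (MOTION_S + STATIONARY_S)


ratio = DEVIANTS_PER_SUBBLOCK / STANDARDS_PER_SUBBLOCK
duration = subblock_duration_s(DEVIANTS_PER_SUBBLOCK, STANDARDS_PER_SUBBLOCK)
deviant_epochs = ANALYSED_BLOCKS * TASKS * DEVIANTS_PER_SUBBLOCK

print(f"deviant/standard ratio: {ratio:.2f}")
print(f"expected sub-block duration: {duration:.0f} s")
print(f"deviant epochs per subject: {deviant_epochs}")
```

Under these assumptions the expected sub-block length comes out close to the one minute reported, and the ratio rounds to the stated 0.17.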

<sup>1</sup>The role of attention in the generation of the MMN is complicated because the MMN was shown to depend on the manipulation of attention, mainly during the formation of the response to standard stimuli (e.g., building a memory trace) (Sussman et al., 2002). When subjects ignored a regular pattern of oddball design, the MMN was generated as the result of sensory discrimination; however, when they were instructed to pay attention to the pattern in the same oddball sequence, the MMN diminished (Sussman et al., 2002). Currently, it is accepted that perceptual learning, which is a necessary process in MMN generation, can be influenced by attention (Sussman, 2007); however, the process should be unintentional (Kimura, 2012).

The first block was used to familiarize the subjects with the tasks. The experiment timing and stimulus appearance are depicted in **Figure 1**. The stimuli were presented on a 21-inch computer monitor (Mitsubishi Diamond Pro 2070 SB, Japan). The monitor was driven using PsychToolbox (Brainard, 1997) at 100 Hz. A mean screen luminance of 21 cd/m<sup>2</sup> was used for all stimuli.

scheme **(B)** shows a temporal diagram of events occurring in the peripheral

#### **RECORDING**

vMMN acquisition was performed in a darkened, sound attenuated, electromagnetically shielded room, with a background luminance of 1 cd/m<sup>2</sup>. The subjects were seated and instructed to fixate on the center of the stimulus field.

Responses were recorded from 68 unipolar electrodes, including four EOG electrodes. The right earlobe (A2) served as a reference. The signal amplifier had a bandwidth of 0.3–100 Hz (Alien technik s.r.o., Czech Republic). The EEG was sampled at a rate of 1024 Hz and saved for off-line processing.

# **ANALYSIS**

The data were processed using EEGLAB (Delorme et al., 2011) and custom routines in MATLAB release 2013a (MathWorks, USA). The recorded EEG was digitally band-pass filtered (0.5–30.0 Hz) and divided into epochs from −99 to 400 ms relative to the onset of a standard/deviant stimulus. The baseline was defined as the mean amplitude in the pre-stimulus period from −99 to 0 ms for each epoch. Epochs with amplitudes outside the range of ±50 µV were rejected (18% of all epochs). Channels with artifacts were removed and replaced by spatially interpolating the signal using EEGLAB. Using this method, we interpolated one channel in 6 subjects, two channels in 3 subjects, and three channels in one subject. To keep the session as short as possible, every second target was presented immediately after a deviant stimulus, which systematically contaminated the responses to deviant stimuli, and to a lesser extent the responses to standard stimuli, with the readiness potential (Bereitschaftspotential). The linear trend in the epochs was removed to eliminate the bias caused by the preparation (expectation) of responding to the oddball task. In each subject, we evaluated responses to the standard stimuli immediately preceding responses to the deviant stimuli (6 × 3 × 10 epochs). The responses to direct and "inverted" stimuli were pooled for the analysis.
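The per-epoch steps described above (baseline subtraction, amplitude-based rejection, and linear detrending) can be sketched for a single channel as follows. This is a minimal illustration in plain Python, not the authors' EEGLAB/MATLAB routines. The function names, the 10 ms sampling grid, and the choice to fit the trend over the whole epoch are assumptions.

```python
from statistics import mean


def baseline_correct(epoch, times_ms):
    """Subtract the mean of the pre-stimulus samples (t < 0 ms) from every sample."""
    offset = mean(v for v, t in zip(epoch, times_ms) if t < 0)
    return [v - offset for v in epoch]


def exceeds_threshold(epoch, limit_uv=50.0):
    """True if any sample lies outside +/- limit_uv (rejection criterion from the text)."""
    return any(abs(v) > limit_uv for v in epoch)


def remove_linear_trend(epoch, times_ms):
    """Subtract the least-squares straight line fitted over the whole epoch (assumed fit range)."""
    t_mean, v_mean = mean(times_ms), mean(epoch)
    sxx = sum((t - t_mean) ** 2 for t in times_ms)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times_ms, epoch))
    slope = sxy / sxx
    return [v - (v_mean + slope * (t - t_mean)) for v, t in zip(epoch, times_ms)]


# Hypothetical epoch: a slow drift (such as a readiness potential) and nothing else.
times_ms = list(range(-100, 400, 10))
drift_only = [0.01 * t + 2.0 for t in times_ms]

corrected = baseline_correct(drift_only, times_ms)
keep = not exceeds_threshold(corrected)
detrended = remove_linear_trend(corrected, times_ms)
print(keep, max(abs(v) for v in detrended))
```

Note that subtracting a fitted line also removes the epoch mean, so the pre-stimulus baseline is no longer exactly zero after detrending. Whether that matters depends on the order of operations, which the text does not specify.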

The period containing a possible vMMN was identified from the local maxima of the global mean field power of the deviant-minus-standard ERPs aggregated across subjects, tasks, and blocks. Statistical analysis was performed on the mean amplitudes from the selected periods in the fronto-central and parietal regions, which were selected according to the vMMN distribution (see **Figure 3**).
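Global field power is commonly computed as the standard deviation across channels at each time point. The sketch below applies that common definition to a deviant-minus-standard difference and picks out local maxima. It is illustrative only: the text says the intervals were identified visually, and the toy data and function names here are hypothetical.

```python
from statistics import pstdev


def global_field_power(erp):
    """erp: list of channels, each a list of samples.

    Returns one value per sample: the population standard deviation
    across channels at that time point.
    """
    n_samples = len(erp[0])
    return [pstdev([channel[i] for channel in erp]) for i in range(n_samples)]


def local_maxima(values):
    """Indices of samples strictly larger than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


# Hypothetical grand-average ERPs: 3 channels x 5 samples.
standard = [[0.0, 0.0, 0.0, 0.0, 0.0]] * 3
deviant = [[0.0, 1.0, 3.0, 1.0, 0.0],
           [0.0, -1.0, -3.0, -1.0, 0.0],
           [0.0, 0.0, 0.0, 0.0, 0.0]]

difference = [[d - s for d, s in zip(dev_ch, std_ch)]
              for dev_ch, std_ch in zip(deviant, standard)]
gfp = global_field_power(difference)
peaks = local_maxima(gfp)
print(peaks)
```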

A general linear model for repeated measures was applied to the mean amplitude with a three-factor design: condition (standard and deviant), region (fronto-central and parietal), and task (fixation only, easy and difficult task). The results are reported as statistically significant if *p* < 0.05.
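For a two-level within-subject factor such as condition, the main-effect *F* in a fully within-subject design equals the squared paired *t* statistic computed on each subject's amplitudes averaged over the other factors. The sketch below illustrates this relationship only. The data are invented, the function name is ours, and it is not a substitute for the full three-factor model used in the study.

```python
from math import sqrt
from statistics import mean, stdev


def condition_main_effect(standard, deviant):
    """Main effect of a two-level within-subject factor.

    standard, deviant: one value per subject, already averaged over
    the other factors (here: region and task).
    Returns (F, df_effect, df_error).
    """
    diffs = [d - s for s, d in zip(standard, deviant)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t * t, 1, n - 1


# Hypothetical per-subject mean amplitudes in microvolts (NOT the study's data).
standard_uv = [0.0, 0.0, 0.0]
deviant_uv = [1.0, 2.0, 3.0]

f_value, df1, df2 = condition_main_effect(standard_uv, deviant_uv)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

With 12 subjects the same computation gives the degrees of freedom (1, 11) reported for the condition and region effects.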

The correlation between age and visually evoked potentials (Kuba et al., 2012) suggests that age might be used as a covariate in our analysis. We examined the correlation between age and the vMMN, but there was no significant correlation; therefore, only within subject factors without age as a covariate were used in the general linear model.

# **RESULTS**

# **BEHAVIORAL ANALYSIS**

properties of the peripheral stimuli.

The reaction time for the easy task was 343 ± 46 ms, while for the difficult task subjects responded 392 ± 51 ms after the target number. The reaction times for the easy task were significantly shorter [paired *t*-test *t*(9) = 5.8, *p* < 0.001]. Due to a response box error, three subjects were excluded from the reaction time analysis.

#### **ELECTROPHYSIOLOGICAL DATA**

Based on the global mean field power of aggregated vMMN, three intervals were visually identified: 142–198, 265–322, and 323–400 ms (see **Figure 3**). The vMMN reached a maximum in two regions: the fronto-central (F1, Fz, F2, FC1, FCz, FC2, C1, Cz, and C2) and the parietal regions (CP1, CPz, CP2, P1, Pz, P2, PO1, POz, and PO2). The mean amplitude was evaluated in the aforementioned intervals and the regions of interest. The aggregated ERPs, together with the localization of electrodes, are depicted in **Figure 2**.
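The mean amplitude for a region and interval can be sketched as a simple average over the selected channels and samples. The channel list and the 142–198 ms window come from the text. The data layout (a dictionary of per-channel sample lists), the function name, and the toy values are assumptions made for illustration.

```python
FRONTO_CENTRAL = ["F1", "Fz", "F2", "FC1", "FCz", "FC2", "C1", "Cz", "C2"]


def mean_amplitude(erp_by_channel, times_ms, channels, t_start_ms, t_end_ms):
    """Average over all samples in [t_start_ms, t_end_ms] and all listed channels."""
    idx = [i for i, t in enumerate(times_ms) if t_start_ms <= t <= t_end_ms]
    values = [erp_by_channel[ch][i] for ch in channels for i in idx]
    return sum(values) / len(values)


# Hypothetical ERP: four samples per channel, identical across channels.
times_ms = [100, 150, 190, 250]
erp = {ch: [0.0, -1.0, -3.0, 0.0] for ch in FRONTO_CENTRAL}

amp = mean_amplitude(erp, times_ms, FRONTO_CENTRAL, 142, 198)
print(amp)
```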

A general linear model for repeated measures was applied to the mean amplitudes with a three-factor design and showed a significant difference for only the first interval. The mean amplitudes are listed in **Table 1**. Statistical significance was reached for the factor of condition [*F*(1, 11) = 17.40, *p* = 0.002] and for region [*F*(1, 11) = 8.40, *p* = 0.014] but not for task [*F*(2, 22) = 1.26, *p* = 0.30]. The analysis also indicated a task × region interaction [*F*(2, 22) = 4.16, *p* = 0.029], showing that the amplitudes in the fronto-central region decreased with the difficulty of the task, while they increased in the parietal area. This interaction did not involve the standard/deviant condition; thus, it will not be discussed further. The other interactions did not reach statistical significance [condition × task *F*(2, 22) = 0.66, *p* = 0.527; region × condition × task *F*(2, 22) = 0.65, *p* = 0.534].

# **DISCUSSION**

Our experiments have shown that the vMMN, evoked by a sequence of motions in the periphery of the visual field, was not modulated by the difficulty of tasks that subjects solved in the central part of the visual field. A previous study by Pazo-Alvarez et al. (2004) used a similar design: a central task to control the attentional load and two moving gratings that appeared in the periphery and defined the standard/deviant condition by their direction of motion. They, in agreement with our results, did not find any effect of task difficulty on the generation of the vMMN. Our results are also similar to those of a study using a continuous performance task in the central part of the screen and standard/deviant stimuli presented as a grating in the periphery of the visual field (Heslenfeld, 2003). That study did not report an effect of task difficulty on the vMMN found in the interval of 160–200 ms over the occipital, temporal or parietal areas.

**Table 1 | The table shows the mean amplitudes and standard deviations in the selected interval of 142–198 ms, from fronto-central and parietal derivations, for the standard and deviant conditions that were grouped together for the three different tasks.**

*The grand average ERPs, regions and the intervals of interest are depicted in Figure 2.*

**two regions.** A schematic layout of the recording electrodes with indication of the fronto-central (full black circles) and the parietal (full gray circles) regions of interest is in the left portion of the figure. The top three rows display responses from the three tasks separately, and the fourth row shows all tasks together. The interval of interest, for which the mean amplitude was evaluated, is depicted as a gray rectangle along the horizontal axis.

However, our findings contradict several studies regarding the MMN in the auditory (for review see Sussman, 2007) and visual domains (Kimura et al., 2008; Czigler and Sulykos, 2010) in which the attentional load or direction of attention modulated the MMN. Such modulations are in agreement with the general effect of attention on the ERP (Luck et al., 2000). Some of these results do not directly contradict ours. For example, the changes in the vMMN induced by attention to a task were restricted to interactions within the same stimulus dimension (i.e., the task was focused on color and the regularity was broken by a color change) (Czigler and Sulykos, 2010), whereas Heslenfeld's, Pazo-Alvarez's and our experiments violated regularity in a domain different from the one used in the task. Another study (Kimura et al., 2008) presented deviant, standard and target stimuli in the same location, and therefore, overt attention was also oriented to the deviant stimulus. This limits direct comparisons with our results because, in our experiment, overt attention was directed away from the standard/deviant stimuli.

There are also studies regarding brain metabolism with designs similar to ours. In an fMRI study, the perception of visual stimuli, such as optical flow, was modulated by the difficulty of an unrelated, spatially isolated task (Rees et al., 1997). Another similar study showed an effect of task difficulty on the perception of irrelevant color deviants (Yucel et al., 2007). The difference between these findings and ours, as well as those of other electrophysiological studies (Heslenfeld, 2003; Pazo-Alvarez et al., 2004), may be attributed to the use of a different technique. ERP reflects transient, phase-locked events related to neural activity, whereas the blood oxygen level-dependent signal corresponds to sustained metabolic activity. It is possible to use an event-related fMRI design, but this approach cannot differentiate among processes occurring on a millisecond time scale. This discrepancy between electrophysiological and metabolic studies might be addressed in an experiment that records EEG and fMRI simultaneously.

Our results also contradict the "load theory" (Lavie et al., 2004), which states that the perception of a distractor depends on the task load and that the distractor is perceived when there are available attentional resources. Our results show that the distractors (i.e., the standard and deviant stimuli) were processed by the sensory cortex, but there was no modulation of the response by task difficulty. One explanation might be that the tasks were so demanding that they exhausted all attentional resources. However, this seems unlikely because one of the tasks only required fixation on the center of the screen. Another possibility is that the tasks were insufficiently difficult, such that the attentional resources were altered so negligibly that the vMMN was not modulated. This is also unlikely because, in response to the deviant stimuli, there should be an attentional shift in the 200–300 ms interval (Heslenfeld et al., 1997) or at a later time point in a P3a component (Squires et al., 1975). We did not detect these components, and our results did not show an effect of task *per se*, nor its interaction with the condition factor (the standard/deviant stimuli).

Thus, we speculate that our experimental design presented so many transient changes (approximately 8/s—motion-onset, motion-reversal, motion-offset, pattern-on, and pattern-off, all happened within 600 ms; see **Figure 1**) that the standard/deviant difference was not salient enough to systematically capture subjects' attention despite the generation of the electrophysiological correlate in the vMMN. Some of the subjects were questioned after the experiment and they reported a lack of awareness of the peripheral regularity violation. Unfortunately, we do not have behavioral responses from all subjects; however, the data suggest that the attentional involvement in the peripheral stimuli was low.

The observation that the vMMN generated in our design did not change with task difficulty might be useful because it is desirable to dissociate the effect of attentional bias from the genuine vMMN.

One of the goals of this study was to verify that the described protocol was suitable for a fast and reliable examination of the vMMN. In addition to the ability to elicit the vMMN, we found the following advantages of our design: (a) the sequence of motion in two directions avoided the possibility of refractoriness within the dorsal stream because the durations of the expanding and contracting motions within the single stimulus were equal; (b) the deviant stimuli did not elicit systematic changes or shifts in attention; (c) the responses to irrelevant stimuli were independent of central task difficulty; and (d) the radial motion avoids optokinetically induced eye movements.

However, this design has the following disadvantages: (a) we recorded small vMMN amplitudes, which makes the clinical use of this design difficult; and (b) the sequence had numerous target events that contaminated the responses to the irrelevant stimuli with slow readiness potentials, which subsequently had to be removed.

#### **ACKNOWLEDGMENTS**

This study was supported by the Grant Agency of the Czech Republic 309/09/0869 and by the P37/07 (PRVOUK) program.

# **REFERENCES**

EEGLAB, SIFT, NFT, BCILAB, and ERICA: new tools for advanced EEG processing. *Comput. Intell. Neurosci.* 2011, 1–12. doi: 10.1155/2011/130714

and Langrova, J. (2006). Visual mismatch negativity elicited by magnocellular system activation. *Vision Res.* 46, 485–490. doi: 10.1016/j.visres.2005.10.001


auditory processing in ageing and different clinical conditions. *Clin. Neurophysiol.* 123, 424–458. doi: 10.1016/j.clinph.2011.09.020


auditory organization. *Brain Res. Cogn. Brain Res.* 13, 393–405. doi: 10.1016/S0926-6410(01)00131-8


Visual mismatch negativity highlights abnormal pre-attentive visual processing in mild cognitive impairment and Alzheimer's disease. *Neuropsychologia* 46, 1224–1232. doi: 10.1016/j.neuropsychologia.2007.11.017


*Research Involving Human Subjects.* Available online at: http://www.wma.net/en/30publications/10policies/b3/

Yucel, G., McCarthy, G., and Belger, A. (2007). fMRI reveals that involuntary visual deviance processing is resource limited. *Neuroimage* 34, 1245–1252. doi: 10.1016/j.neuroimage.2006.08.050

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 April 2013; accepted: 11 July 2013; published online: 30 July 2013.*

*Citation: Kremláček J, Kuba M, Kubová Z, Langrová J, Szanyi J, Vít F and Bednář M (2013) Visual mismatch negativity in the dorsal stream is independent of concurrent visual task difficulty. Front. Hum. Neurosci. 7:411. doi: 10.3389/fnhum.2013.00411*

*Copyright © 2013 Kremláček, Kuba, Kubová, Langrová, Szanyi, Vít and Bednář. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Oscillatory characteristics of the visual mismatch negativity: what evoked potentials aren't telling us

# *George Stothart \* and Nina Kazanina*

*School of Experimental Psychology, Faculty of Science, University of Bristol, Bristol, UK*

#### *Edited by:*

*Gabor Stefanics, University of Zurich & ETH Zurich, Switzerland*

#### *Reviewed by:*

*Lluís Fuentemilla, University of Barcelona, Spain Jordi Costa-Faidella, University of Barcelona, Spain*

*\*Correspondence: George Stothart, School of Experimental Psychology, Faculty of Science, University of Bristol, 12a Priory Road, Bristol BS8 1TU, UK*

*e-mail: george.stothart@bristol.ac.uk*

The visual mismatch negativity (vMMN) response is typically examined by subtracting the average response to the standard stimulus from the response to the deviant. This approach, however, can omit a critical element of the neural response, i.e., the non-phase-locked ("induced") oscillatory activity. Recent investigations of the oscillatory characteristics of the auditory mismatch negativity (aMMN) identified a crucial role for theta phase locking and power. Oscillatory characteristics of the vMMN from 39 healthy young adults were investigated in order to establish whether theta phase locking plays a similar role in the vMMN response. We explored changes in phase locking, overall post-stimulus spectral power, and non-phase-locked spectral power compared to baseline (−300 to 0 ms). These were calculated in the frequency range of 4–50 Hz and analysed using a non-parametric cluster-based analysis. vMMN was found intermittently in a broad time interval 133–584 ms post-stimulus and was associated with an early increase in theta phase locking (75–175 ms post-stimulus) that was not accompanied by an increase in theta power. Theta phase locking in the absence of an increase in theta power has been associated with the distribution and flow of information between spatially disparate neural locations. Additionally, in the 450–600 ms post-stimulus interval, deviant stimuli yielded a stronger decrease in non-phase-locked alpha power than standard stimuli, potentially reflecting a shift in attentional resources following the detection of change. The examination of oscillatory activity is crucial to the comprehensive analysis of a neural response to a stimulus and, when combined with evoked potentials (EPs), provides a more complete picture of neurocognitive processing.

**Keywords: mismatch negativity (MMN), visual attention, theta oscillations, alpha oscillations, phase locking, evoked potentials, induced oscillations**

# **INTRODUCTION**

Mismatch negativity (MMN) is an electrophysiological response that reflects the automatic detection of change in the sensory environment, and is elicited by violating an established regularity in a sequence of sensory stimuli. Such violations can take the form of simple physical changes in the stimulus properties, e.g., a change in pitch of an acoustic stimulus (Paavilainen et al., 1993), to abstract deviations in the relationships between stimuli, e.g., missing a step in a musical scale (Brattico et al., 2006), or a non-symmetrical stimulus in a sequence of symmetrical stimuli (Kecskés-Kovács et al., 2012). Since its first description (Näätänen et al., 1978; Näätänen and Michie, 1979) it has become an established tool in the investigation of sensory processing and attention, and a marker of cognitive decline across a variety of conditions (see Näätänen et al., 2011 for a review). After the initial focus on the auditory MMN (aMMN), there is now an established body of evidence for MMN in the visual modality, the vMMN (see Pazo-Alvarez et al., 2003; Kimura et al., 2011; Winkler and Czigler, 2012 for reviews).

The typical method for measuring the MMN response is to subtract the evoked potential (EP) response to the standard repeating stimulus from that of the deviant stimulus, i.e., the stimulus that violates the regularity established by the standard. The resulting difference wave reflects the neural processing difference between the standard and deviant stimuli. Statistical techniques vary but typically the aim is then to establish the duration and magnitude of any deviation in the difference wave from zero.
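
The subtraction described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names and the toy amplitudes are hypothetical, and each epoch is assumed to be a plain list of voltages on a common time axis.

```python
def average_epochs(epochs):
    """Average a list of equally long epochs sample by sample."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


def difference_wave(deviant_avg, standard_avg):
    """Deviant-minus-standard difference wave, sample by sample."""
    if len(deviant_avg) != len(standard_avg):
        raise ValueError("averages must have the same number of samples")
    return [d - s for d, s in zip(deviant_avg, standard_avg)]


# Toy data (hypothetical values in microvolts, four samples per epoch).
standard_epochs = [[0.0, 1.0, 2.0, 1.0], [0.0, 3.0, 2.0, 1.0]]
deviant_epochs = [[0.0, 1.0, 0.0, 0.0], [0.0, 1.0, -2.0, 0.0]]

standard_avg = average_epochs(standard_epochs)
deviant_avg = average_epochs(deviant_epochs)
diff = difference_wave(deviant_avg, standard_avg)
```

Negative values in `diff` mark samples where the deviant response is more negative than the standard response, which is the direction expected for a mismatch negativity.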

In addition to examining electrophysiological responses with amplitude as a function of time, the same responses can also be examined in the frequency domain, in order to examine the oscillatory characteristics of a response as a function of time. Since the first observation of event related oscillatory changes (Berger, 1929), oscillatory activity has been increasingly shown to play a key role in sensory, cognitive and motor processes (see Başar et al., 2001; Ward, 2003; Buzsáki and Draguhn, 2004 for reviews). Typically oscillations are separated into the following bands for analysis: delta (0–4 Hz), theta (4–8 Hz), alpha (8–14 Hz), beta (14–30 Hz), and gamma (30+ Hz). In the visual modality, processes that are involved with or influenced by the vMMN response have been associated with different oscillatory processes. Distracter suppression and selective attention processes have been linked with alpha oscillation changes (Foxe and Snyder, 2011), whereas object feature binding (Gray et al., 1989; Tallon-Baudry and Bertrand, 1999) and visual working memory (Tallon-Baudry et al., 1998; Rizzuto et al., 2003) have been associated with increases in gamma and theta oscillations. Mishra et al. (2012) identified concomitant increases in theta phase locking and power as oscillatory markers of visuo-spatial attention.

It should be noted that a stimulus can elicit both stimulus phase-locked (sometimes known as "evoked") and non-phase-locked (sometimes known as "induced") oscillatory changes; however, it is only the phase-locked activity that will sum to form the characteristic peaks and troughs of a recognizable EP in the time domain. Non-phase-locked activity, because its phase does not align from trial to trial, will fail to sum to any meaningful activity in an averaged EP in the time domain. Non-phase-locked oscillatory activity has been suggested to play an important role in the synchronization and desynchronization of functional networks in the brain (Bastiaansen et al., 2012).
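
A small simulation may make the cancellation concrete. The sketch below is illustrative only: the 6 Hz frequency, 200 trials, 500 ms window and 1000 Hz sampling rate are arbitrary assumptions, and `simulate_average` is a hypothetical helper. Each trial is a unit-amplitude sinusoid whose phase is either fixed (phase-locked) or drawn at random (non-phase-locked).

```python
import math
import random


def simulate_average(n_trials, phase_locked, freq_hz=6.0, n_samples=500,
                     fs=1000.0, seed=0):
    """Return (trial-averaged waveform, mean single-trial power)."""
    rng = random.Random(seed)
    total = [0.0] * n_samples
    power = 0.0
    for _ in range(n_trials):
        # Phase-locked trials share one phase; the others get a random phase.
        phase = 0.0 if phase_locked else rng.uniform(0.0, 2.0 * math.pi)
        trial = [math.sin(2.0 * math.pi * freq_hz * i / fs + phase)
                 for i in range(n_samples)]
        total = [a + b for a, b in zip(total, trial)]
        power += sum(x * x for x in trial) / n_samples
    return [x / n_trials for x in total], power / n_trials


evoked_avg, evoked_power = simulate_average(200, phase_locked=True)
induced_avg, induced_power = simulate_average(200, phase_locked=False)

evoked_peak = max(abs(x) for x in evoked_avg)    # close to 1
induced_peak = max(abs(x) for x in induced_avg)  # close to 0
```

Single-trial power is the same in both cases, yet only the phase-locked oscillation survives averaging. This is why an averaged EP can miss induced activity entirely.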

Recently the oscillatory characteristics of the aMMN have been examined using time-frequency analyses. Fuentemilla et al. (2008) demonstrated that the frontal and temporal sources of the aMMN were differentially modulated by stimulus phase-locked theta power increase and theta phase locking. The frontal source of the aMMN showed an increase in stimulus phase-locked theta power following deviant stimuli and an increase in phase locking. The temporal sources however showed an increase in theta phase locking in the absence of any increase in power. In a magnetoencephalographic (MEG) study of aMMN, Hsiao et al. (2009) demonstrated partially converging results with Fuentemilla et al., specifically an increase in theta phase locking and power in response to deviant stimuli at temporal sources, and an increase in theta phase locking at frontal sources only. Changes in power at the frontal sources were not reported so it is unclear if findings on the frontal sources matched those of Fuentemilla et al. Furthermore, Hsiao's study demonstrated increases in power and phase locking to deviant stimuli that were greater in the right hemisphere than the left. Ko et al. (2012) also demonstrated that aMMN was associated with increases in both theta power and phase locking, peaking at fronto-central electrodes and stronger on the right hemisphere. Bishop and Hardiman (2010) examined the aMMN response in single trials using principal components analysis and also found a significant increase in theta phase locking. Although there are differences in the results and in analysis techniques between studies (e.g., electrode sites chosen for analyses), a clear role for theta oscillatory activity in the aMMN response has emerged, with a trend for right hemispheric dominance.

The present study investigated the role of neural oscillations in the vMMN response. By examining visual evoked potentials (VEPs), stimulus phase-locked and non-phase-locked spectral power change, and inter trial phase locking (ITPL) across a range of frequencies (4–50 Hz), we were able to examine whether theta activity plays a similar role in the generation of the vMMN response as has been demonstrated in aMMN.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Thirty-nine healthy younger adults [aged 18–31, mean age 20.0 (±2.3), 13 males] gave consent to participate in the study. Participants were recruited from the University of Bristol student population and declared themselves to be in normal health. All had normal or corrected-to-normal vision and were right hand dominant [mean Edinburgh Handedness Inventory score 94.4 (±14.4)]. Seventeen of the younger adults were control participants in a previously reported study examining vMMN in healthy ageing (Stothart et al., 2013), however the study only examined vMMN in the classic time-amplitude domain and no time-frequency analyses have been previously reported. All appropriate approvals for our procedures were obtained from the Ethics Committee of the Faculty of Science at the University of Bristol. Participants provided written informed consent before participating and were free to withdraw at any time.

#### **STIMULI**

Stimuli were presented using Presentation software version 12.2 (Neurobehavioral Systems, Inc).

#### **PROCEDURE**

Using a paradigm previously developed by Tales et al. (1999), participants were instructed to fixate and attend exclusively to a small blue frame (1.3 × 1.3 cm) at the center of a monitor situated 0.5 m directly in front of them (**Figure 1A**). Periodically, the center of the blue frame turned red (the target stimulus) (**Figure 1B**) and the participant had to respond to it as quickly as possible by pressing a hand-held button. Participants were instructed to ignore any other stimuli that appeared on screen and focus solely on the target stimuli. The target presentation was a rare event for which subjects would have to maintain a sharp attentional focus, thereby reducing the likelihood of attending to the standards and deviants. A larger blue frame (10.5 × 10.5 cm) defined the area within which the standard and deviant stimuli were presented. The standards, single white bars (3.9 × 1.2 cm), were presented simultaneously above and below the central blue square (**Figure 1C**); deviants, double white bars equal to the standards in total area (3.9 × 0.6 cm × 2) and brightness, were presented in the same locations (**Figure 1D**). The symmetrical location of standards and deviants about the target area was intended to minimize any tendency for gaze fixation to be biased away from the central square. The target, standard and deviant stimuli were each presented for 200 ms, with a randomized interstimulus interval (ISI) of 612–642 ms. Furthermore, the targets and deviants were presented in a pseudo-random sequence among the standards with at least two standards preceding each deviant. The ratio of standards:deviants:targets was 16:1:1. Standards and deviants were not counterbalanced as it has been previously demonstrated using this paradigm that it is the rareness of the deviant rather than the subtle difference in stimulus characteristics that elicits the vMMN response (Stagg et al., 2004). The stimuli were shown in one block lasting 11 min containing 640 standards, 40 deviants, and 40 targets.
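
One way to build a sequence that satisfies these constraints is sketched below. The text does not describe how the original sequence was generated, so the construction is an assumption: each deviant is glued to two preceding standards and the resulting units are shuffled. Drawing the ISI as a uniform integer number of milliseconds is likewise an assumption, and `make_sequence` is a hypothetical name.

```python
import random


def make_sequence(n_standard=640, n_deviant=40, n_target=40, seed=0):
    """Shuffle stimulus units so every deviant follows at least two standards."""
    if n_standard < 2 * n_deviant:
        raise ValueError("need at least two standards per deviant")
    rng = random.Random(seed)
    # Each deviant travels with its own two leading standards.
    units = [["standard", "standard", "deviant"] for _ in range(n_deviant)]
    units += [["target"] for _ in range(n_target)]
    units += [["standard"] for _ in range(n_standard - 2 * n_deviant)]
    rng.shuffle(units)
    return [stimulus for unit in units for stimulus in unit]


sequence = make_sequence()

# Assumed: ISI drawn uniformly from 612-642 ms in whole milliseconds.
isi_rng = random.Random(1)
isis_ms = [isi_rng.randint(612, 642) for _ in sequence]
```

Because the two standards are attached to their deviant before shuffling, the constraint holds by construction and no rejection sampling is needed.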

#### **EEG RECORDING**

EEG signals were continuously recorded from 64 Ag/AgCl electrodes fitted on an elasticized cap in a standard electrode layout using a common FCz reference. Signals were sampled at a rate of 1000 Hz using a BrainAmp DC amplifier (Brain Products GmbH). Impedances were kept below 5 kΩ and all signals were online low-pass filtered at 250 Hz during recording. Recordings were analysed offline using Brain Electrical Source Analysis software version 5.3 (BESA GmbH). Artifacts including blinks and eye movements were corrected using BESA automatic artifact correction (Berg and Scherg, 1994) and any remaining epochs containing artifact signals exceeding ±100 μV were rejected. The rejection rate never exceeded 10% of trials for each participant and condition. Data were re-referenced offline to a common average reference. Epochs from −300 to 600 ms were defined around stimulus onset and baseline corrected using the pre-stimulus interval (−300 to 0 ms).
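
Three of the steps above (amplitude-based rejection, common average re-referencing and baseline correction) are simple enough to sketch directly. The functions below are hypothetical illustrations on plain Python lists, not BESA's implementation, and the toy values are invented. At 1000 Hz, the −300 to 0 ms baseline corresponds to the first 300 samples of an epoch.

```python
def is_clean(epoch, limit_uv=100.0):
    """True if no sample exceeds the +/-100 microvolt rejection limit."""
    return all(abs(x) <= limit_uv for x in epoch)


def common_average_reference(samples_by_channel):
    """Subtract the across-channel mean from every channel at each sample."""
    n_channels = len(samples_by_channel)
    means = [sum(column) / n_channels for column in zip(*samples_by_channel)]
    return [[x - m for x, m in zip(channel, means)]
            for channel in samples_by_channel]


def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline (pre-stimulus) samples."""
    offset = sum(epoch[:n_baseline]) / n_baseline
    return [x - offset for x in epoch]


# Toy epoch: 300 baseline samples at 5 uV, then 600 samples at 3 uV.
epoch = [5.0] * 300 + [3.0] * 600
corrected = baseline_correct(epoch, n_baseline=300)

# Toy two-channel, two-sample recording for the re-referencing step.
rereferenced = common_average_reference([[1.0, 2.0], [3.0, 6.0]])
```

After correction the baseline sits at zero and the post-stimulus segment reads −2 µV, so amplitudes are expressed relative to the pre-stimulus level.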

#### **ERP ANALYSIS**

To confirm the presence of a vMMN the amplitudes of seven electrodes, O1, Oz, O2, PO9, PO10, PO7, and PO8, were averaged to form an occipital region of interest. Electrode selection was defined based on a recent study of vMMN using an identical paradigm (Stothart et al., 2013). Examination of grand average evoked responses and mean spectral power maps across the scalp confirmed that the overwhelming majority of neural activity was located in the occipital region, was highly consistent across the seven electrodes, and that the electrode selection was appropriate. The averaged response to the standard stimuli was subtracted from that to the deviant stimuli to create a difference waveform and a 40 Hz low-pass filter applied (only for the VEP analysis, not applied for frequency analysis). Sequential one sample *t*-tests were then applied to the difference waveforms using the method outlined by Guthrie and Buchwald (1991). The number of consecutive time points necessary to indicate an epoch of significant difference between the standard and deviant responses was obtained from a simulation using an autocorrelation estimated from the data. Time intervals with values of *p* < 0.05 that lasted for the required duration, 15 consecutive time points (i.e., 15 ms), were accepted as significantly different epochs.
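
The run-length criterion can be illustrated with a short sketch. It covers only the final step: a one-sample *t* statistic at every time point, followed by a search for runs of consecutive significant points. The simulation that yields the required run length is not reproduced, so `min_run` is simply passed in. The function names and toy data are hypothetical; the critical value 2.262 is the two-tailed 5% point of the *t* distribution with 9 degrees of freedom, matching the 10 toy subjects rather than the 39 participants of the study.

```python
import math
import statistics


def one_sample_t(values):
    """t statistic for the null hypothesis that the mean of values is zero."""
    n = len(values)
    mean = sum(values) / n
    sd = statistics.stdev(values)
    if sd == 0.0:
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / (sd / math.sqrt(n))


def significant_runs(diff_waves, t_crit, min_run):
    """Return (first, last) indices of runs of >= min_run significant points.

    diff_waves[s][t] is the difference wave of subject s at time point t.
    """
    t_vals = [one_sample_t(column) for column in zip(*diff_waves)]
    runs, start = [], None
    for i, t in enumerate(t_vals + [0.0]):  # the sentinel closes a final run
        if abs(t) > t_crit:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    return runs


# Toy data: 10 subjects, 60 time points, a -3 uV effect at points 20-39.
diff_waves = [[(1.0 if s % 2 == 0 else -1.0) + (-3.0 if 20 <= t < 40 else 0.0)
               for t in range(60)]
              for s in range(10)]

runs = significant_runs(diff_waves, t_crit=2.262, min_run=15)
```

Isolated significant points, which are expected by chance when hundreds of autocorrelated samples are tested, are discarded because they never reach the minimum run length.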

#### **TIME-FREQUENCY ANALYSES**

In order to further characterize the vMMN response, original epochs (i.e., not subjected to any offline filtering) from all 64 electrodes were transformed into the time-frequency domain using a complex demodulation approach, implemented by BESA Source Coherence module version 5.3 (Hoechstetter et al., 2004). Complex demodulation was applied using a sampling step of 2 Hz for frequencies between 4 and 50 Hz and a finite impulse response filter with a sampling step of 25 ms at latencies between −300 and 600 ms relative to the stimulus onset. Changes in spectral power were calculated relative to pre-stimulus baseline (−300 to 0 ms) in two different ways. First, demodulation of all activity (stimulus phase-locked and non-phase-locked) was performed by calculating spectral content of each epoch on a trial-by-trial basis and then averaging it ("overall spectral power"). Second, in order to specifically assess non-phase-locked activity, we subtracted the participant's average response in the time-frequency domain from each individual trial, and then averaged the trials to create averages of the non-phase-locked spectral power only. ITPL values (i.e., the degree to which oscillatory phase is correlated from trial to trial, ranging from 0 to 1, with values approaching 1 indicating highly correlated phase values) were calculated for the overall spectral activity. For the analysis of phase locking values the number of standard trials was matched to the deviant trials by selecting the standard preceding the deviant for analysis. This equated signal to noise ratios between standard and deviant stimuli and removed the potential influence of the number of presentations on correlation values.
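
The three measures can be written down compactly for a single time-frequency point. The sketch assumes each trial contributes one complex coefficient (amplitude and phase) at that point, and it reads "subtracting the average response" as subtracting the across-trial mean of those coefficients. Both are interpretive assumptions, and the function names are hypothetical rather than BESA's API.

```python
import cmath
import math


def overall_power(coeffs):
    """Mean single-trial power (phase-locked plus non-phase-locked)."""
    return sum(abs(z) ** 2 for z in coeffs) / len(coeffs)


def non_phase_locked_power(coeffs):
    """Mean power after removing the across-trial average from each trial."""
    mean = sum(coeffs) / len(coeffs)
    return sum(abs(z - mean) ** 2 for z in coeffs) / len(coeffs)


def itpl(coeffs):
    """Length of the mean unit phase vector: 0 = random phase, 1 = identical.

    Assumes no coefficient is exactly zero (its phase would be undefined).
    """
    units = [z / abs(z) for z in coeffs]
    return abs(sum(units) / len(units))


# Four trials with amplitude 2 and identical phase (fully phase-locked).
locked = [cmath.rect(2.0, 0.3) for _ in range(4)]
# Four trials with amplitude 2 and phases spread evenly around the circle.
spread = [cmath.rect(2.0, k * math.pi / 2.0) for k in range(4)]
```

Both toy sets have the same overall power (4.0), but `locked` has an ITPL near 1 with almost no non-phase-locked power, whereas `spread` has an ITPL near 0 and all of its power is non-phase-locked. ITPL discards amplitude, so it can rise without any change in power, which is the pattern at issue in this study.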

#### **STATISTICAL ANALYSIS**

All time frequency data were analysed using a non-parametric cluster based permutation approach. This approach, using Fieldtrip software (Oostenveld et al., 2011), and described by Maris and Oostenveld (2007), controls for multiple comparison testing when computing statistics across multiple frequency and time points. First, an independent samples *t*-test between the standard and deviant conditions was calculated for each sample point. Significant values (alpha < 0.01) were clustered based on their adjacency in time, space and frequency, and the *t*-values for all points in this cluster were summed. The critical *p*-value for each cluster was calculated using the Monte Carlo estimate. For each cluster this involved randomly dividing the data into two subsets and calculating a new summed *t*-value. This was repeated 10,000 times and the proportion of random partitions that resulted in a larger summed *t*-value than the one observed in the real data was identified. If the summed *t*-value of the observed data cluster was higher than 95% of the random partitions (i.e., less than an alpha-level of 0.05, two-tailed), then the cluster was considered to represent a significant difference between the two conditions. This technique allows for the evolution of spectral activity across time to be observed without the need for reductive averaging across arbitrary time windows, grouping of frequencies into bands or imposing spatial constraints on cluster size. It should be noted that the initial alpha value for cluster formation was lowered from alpha < 0.05 to alpha < 0.01 in order to reduce the likelihood of large clusters spanning the entire dataset, a potential problem in cluster based permutation testing highlighted recently by Mensen and Khatami (2013).

For the purposes of effective visualization, time frequency data are presented for the averaged activity across the seven electrodes used in the ERP analyses (i.e., O1, Oz, O2, PO9, PO10, PO7, and PO8). Grand average and statistical plots based on the cluster based permutation analysis for all 64 electrodes are available in **Supplementary Figure A**.
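
The logic of a cluster based permutation test can be shown on a much smaller problem. The sketch below is not the Fieldtrip procedure used here: it works along a single dimension (time) and swaps in a paired sign-flip permutation scheme in place of the independent-samples partitioning described above. All names and the toy data are hypothetical. The cluster-forming threshold of 3.106 is the two-tailed 1% critical value of the *t* distribution with 11 degrees of freedom, matching the 12 toy subjects.

```python
import math
import random


def t_values(diffs):
    """Paired t statistic per time point; diffs[s][t] = deviant - standard."""
    n = len(diffs)
    out = []
    for column in zip(*diffs):
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def clusters(tvals, t_crit):
    """(start, end, summed t) for runs of adjacent same-sign significant points."""
    found, start = [], None
    for i, t in enumerate(tvals + [0.0]):  # the sentinel closes a final run
        sig = abs(t) > t_crit
        if start is not None and (not sig or (t > 0) != (tvals[start] > 0)):
            found.append((start, i - 1, sum(tvals[start:i])))
            start = None
        if sig and start is None:
            start = i
    return found


def cluster_permutation_test(diffs, t_crit, n_perm=500, seed=0):
    """Return (start, end, mass, Monte Carlo p) for each observed cluster."""
    rng = random.Random(seed)
    observed = clusters(t_values(diffs), t_crit)
    null_max = []
    for _ in range(n_perm):
        # Under the null, the sign of each subject's difference is arbitrary.
        flipped = []
        for row in diffs:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * x for x in row])
        masses = [abs(m) for _, _, m in clusters(t_values(flipped), t_crit)]
        null_max.append(max(masses, default=0.0))
    results = []
    for start, end, mass in observed:
        exceed = sum(1 for m in null_max if m >= abs(mass))
        results.append((start, end, mass, (exceed + 1) / (n_perm + 1)))
    return results


# Toy data: 12 subjects, 30 time points, a true effect at points 10-19.
data_rng = random.Random(1)
diffs = [[data_rng.gauss(0.0, 1.0) + (3.0 if 10 <= t < 20 else 0.0)
          for t in range(30)]
         for _ in range(12)]

results = cluster_permutation_test(diffs, t_crit=3.106)
best = max(results, key=lambda r: abs(r[2]))
```

Comparing every observed cluster against the distribution of the largest cluster found in each permutation is what controls the family-wise error rate: only one test statistic per permutation enters the null distribution, however many sample points were examined.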

# **RESULTS**

#### **VEP**

A clear vMMN response was observed, see **Figures 2A,B**. Sequential *t*-tests corrected for multiple comparisons identified three epochs (133–263 ms, 297–352 ms, 377–584 ms) in which deviant responses were significantly more negative than standards. Target stimuli elicited clear attentional components, i.e., P3b, that were not present in the responses to either the standard or deviant stimuli. The mean percentage of targets detected was 98.4% (±0.02) and the mean median reaction time was 390.8 ms (±48.4). There were no false alarm responses to any deviant stimuli for any participants.

#### **CHANGES IN OVERALL (PHASE-LOCKED AND NON-PHASE-LOCKED) SPECTRAL POWER**

Standard and deviant stimuli elicited an increase in the overall spectral power, greatest at 6 Hz, between approximately 75 and 175 ms (see **Figures 3A,B**). The increase was greatest at right occipital and parietal channels and was absent from the analysis of non-phase-locked activity (see **Figure 4**). Cluster based permutation analysis demonstrated that it was not significantly different between standard and deviant conditions (demonstrated by the absence of significant differences in the 75–175 ms interval in **Figure 3C**), suggesting that it was a counterpart in the time-frequency domain of the P1 and N1 VEPs. A prominent reduction in overall power in a broad range of approximately 6–24 Hz was found at latencies 150–600 ms and was strongest at right occipital and parietal channels; the reduction was more pronounced for deviants than for standards. The same reduction was observed in non-phase-locked power analysis, suggesting it was non-phase locked in origin, and will be discussed below.

#### **CHANGES IN NON-PHASE-LOCKED SPECTRAL POWER**

A decrease in non-phase-locked spectral power, greatest at 14 Hz, was observed to both standard and deviant stimuli, see **Figures 4A,B**. The decrease, strongest at right occipital and parietal channels, began at 150 ms and lasted until approximately 525 ms in the standard condition and 600 ms in the deviant. (The exact timings and frequency ranges vary slightly from electrode to electrode.) Cluster based permutation testing demonstrated that the decrease in alpha spectral power was significantly stronger for deviant than for standard stimuli in the time interval between approximately 450 and 600 ms (Monte Carlo *p* = 0.0059), see **Figures 4C** and **2C**. It should be noted that when the number of standard trials was matched to the deviant trials the stronger decrease in alpha spectral power to deviant stimuli was maintained. As previously highlighted, this alpha power decrease was also present in the analysis of the overall spectral power (see **Figure 3**), however its maintenance in the present analysis demonstrates that it largely resulted from non-phase-locked oscillatory change.

**Spectral power change (µV²) at the occipital region of interest compared to baseline (−300 to 0 ms) for (A) standard and (B) deviant stimuli.** Plot … Monte Carlo permutation correction for multiple comparisons. Non-significant differences are masked in white.

#### **INTER TRIAL PHASE LOCKING**

Standard and deviant stimuli showed increased ITPL, greatest at 6–8 Hz, between approximately 75 and 350 ms, see **Figures 5A,B**. The increase in ITPL was broadly distributed across the scalp, although still strongest at occipital and parietal electrode sites. Cluster based permutation testing demonstrated that deviant stimuli elicited a significantly greater increase in ITPL compared to standard stimuli between 75 and 225 ms (Monte Carlo *p* = 0.0004, **Figures 5C** and **2D**).

#### **FREQUENCY BAND ANALYSIS**

In order to enable a direct comparison between the current study and previous studies which performed time-frequency analyses on separate, conventionally-defined frequency bands, we also conducted our statistical analyses for the following *averaged* frequency bands: theta (4–8 Hz), alpha (8–14 Hz), beta (14–30 Hz), and gamma (30–50 Hz). As in the main analysis, cluster based permutation tests of spectral power change and ITPL were calculated for the 0–600 ms interval compared to baseline (−300 to 0 ms), and only clusters that survived 10,000 Monte Carlo permutations are displayed. Observations from the main cluster analyses were replicated. Specifically, a decrease in overall and non-phase-locked spectral power in the alpha band was larger to deviants than standards in the 400–600 ms time interval (overall: Monte Carlo *p* = 0.003; non-phase-locked: Monte Carlo *p* = 0.004). The spatial distribution was similar to that in the main analyses, i.e., strongest at right occipital and parietal electrode sites. There were no other significant changes in either overall or non-phase-locked power in any other frequency band. ITPL was larger to deviant stimuli than to standard stimuli in the 0–250 ms interval in the theta band only (Monte Carlo *p* = 0.001). The spatial distribution was similar to that in the main cluster analysis, i.e., broadly distributed across the scalp, although strongest at occipital and parietal electrode sites.

# **DISCUSSION**

Both standard and deviant stimuli elicited large and equivalent increases in overall (i.e., phase-locked and non-phase-locked) theta power between 75 and 175 ms. The timing of this spectral change in the theta range strongly suggests it was a spectral counterpart of the P1-N1 complex in the VEP. In the simplest case, if the P1-N1 peaks were simply two peaks in a continuous sinusoidal wave, the frequency of that wave would be in the theta/low alpha range, an explanation also proposed by Klimesch et al. (2004).
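
The sinusoid argument comes down to simple arithmetic: if a positive peak and the following negative peak are half a cycle apart, the implied frequency is the reciprocal of twice their separation. The helper below is a hypothetical illustration. P1 and N1 latencies are not reported in this section, so the separations used are chosen only to mark the edges of the theta band.

```python
def implied_frequency_hz(peak_to_trough_ms):
    """Frequency of a sinusoid whose adjacent peak and trough are this far apart."""
    return 1000.0 / (2.0 * peak_to_trough_ms)


upper_theta = implied_frequency_hz(62.5)   # 8.0 Hz
lower_theta = implied_frequency_hz(125.0)  # 4.0 Hz
```

Any peak-to-trough separation between 62.5 and 125 ms therefore maps onto the 4–8 Hz theta band, and shorter separations fall in the alpha range.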

vMMN was found to be associated with an increase in ITPL in the theta range, peaking at 6–8 Hz, which coincided with the early vMMN epoch observed in the VEP grand average waveform (133–263 ms). Hence, theta phase locking appears to play a role in the generation of the vMMN response, as it does in the aMMN response. This was followed by a significant decrease in non-phase-locked spectral power in the high alpha range, peaking at 14 Hz, which coincided with the late vMMN epoch (377–584 ms). These alpha-range oscillatory changes were strongest at right hemisphere occipital and parietal electrode sites.

The increase in theta ITPL in response to deviant stimuli was not accompanied by a significant increase in theta power, pointing toward a striking similarity between vMMN and aMMN generation. Recall that Fuentemilla et al. (2008) reported that the temporal subcomponent of the aMMN (generated by sources of aMMN in the auditory cortex) was driven by theta phase realignment without concurrent spectral power modulation. Similarly, in our study the early phase of the vMMN was purely a product of theta phase realignment that was not accompanied by theta power increase. The theta realignment effect had a broad scalp distribution but was most pronounced at the occipital sites, consistent with the idea of primary vMMN generators in the visual cortex plus additional generators in the frontal cortex (see below). More generally, theta phase locking (with or without concurrent theta power increase) was found in all other previous studies of aMMN (see Introduction).

Cumulatively the vMMN and aMMN findings point toward an important role for phase locking in the underlying mechanisms of the auditory and visual MMN. Phase locking has been suggested to play a key role in the linking of spatially disparate areas together into transitory neural networks. For example, theta and gamma phase locking between medial temporal lobe and hippocampal structures has been associated with successful memory formation (Fell et al., 2001; Rizzuto et al., 2003). Visual working memory performance has also been associated with increased beta phase locking within separate areas of the extra-striate cortex (Tallon-Baudry et al., 2001) and increased theta phase coherence (i.e., increased phase locking between two sites rather than within one) between prefrontal and posterior areas (Sarnthein et al., 1998). The aMMN network is thought to be comprised of bilateral temporal sources located in the auditory cortex with an additional frontal source(s) (Giard et al., 1991; Deouell, 2007), with theta phase locking proposed as a possible mechanism for the functional connection of these areas (Fuentemilla et al., 2008). Significant theta phase locking was also observed in frontal electrode sites as well as occipital and parietal sites in the current study, a similar pattern to that observed in the aMMN studies. There have been no studies to date examining the possibility of a frontal source in the vMMN response, however the prefrontal cortex has been suggested to form a key part of the feedback loop in Kimura's predictive coding model of the vMMN response (Kimura, 2012). Given that vMMN can also result in attentional orientation similar to the aMMN, for which the frontal source is suggested to be primarily responsible, the investigation of the existence of a frontal vMMN source is an interesting avenue for future work.

The decrease in spectral power between 8 and 20 Hz (spanning two classic oscillatory frequency bands, alpha, 8–14 Hz and beta, 14–30 Hz) was observed in both standard and deviant responses from 200 ms post-stimulus. This decrease peaked at 12–14 Hz (often known as the "high alpha" range) and was significantly stronger in the deviants than standards in the interval between approximately 375 and 575 ms. In previous research, alpha oscillations have often been examined by averaging the spectral power in a narrow pre-selected band around 10 Hz, e.g., 8–12 Hz. This, however, does not take into account the variation in peak alpha power across individuals, something that has been shown to vary considerably across individuals and age groups (Doppelmayr et al., 1998). The variation in individual peak alpha power can mean that upper alpha ranges can extend up to 15 Hz, i.e., into what is classically thought of as the beta range (Hanslmayr et al., 2005). In the main analysis of the current study we avoided the shortcomings resulting from "banding" a continuous frequency range and calculated spectral power using complex demodulation with a sampling step of 2 Hz across the 4–50 Hz range. The spectral resolution of 2 Hz has the disadvantage of blurring some distinctions, e.g., a 13 Hz oscillation is likely to appear as both 12 and 14 Hz activity. Yet, it is our view that by not pre-selecting a narrow range around 10 Hz to explore alpha oscillations we have avoided the issue of fuzziness of the band boundaries and are consequently able to obtain a more sensitive measure of event related spectral change.

Reductions in post-stimulus alpha power (or "alpha desynchronization") are considered to be reflective of increased activation within a cortical region, and, vice versa, increases in power reflective of decreased activation. For example, in a visual task alpha oscillations increase in sensorimotor regions and reduce in occipital regions, whereas during a motor task the opposite pattern occurs (Pfurtscheller, 1992). Previous EEG studies of visual processing found reduction of non-phase-locked alpha power with an occipital locus approximately 450–600 ms post-visual stimulus (e.g., Müller and Keil, 2002; Gazzaley et al., 2008). In particular, the alpha power reduction reported by Gazzaley et al. (2008) closely matched both the timing (approximately 400–600 ms post-visual stimulus) and broad spectral profile (i.e., peaking within the alpha range but extending into the beta range) of that observed in the current study.

Furthermore, the reduction in alpha power, although found for both standards and deviants, was stronger with deviant stimuli, corroborating the link between alpha desynchronization and task-specific effects (Klimesch et al., 1997). Gazzaley et al. (2008) found that reduction in alpha power was stronger for task-relevant than task-irrelevant visual stimuli. Müller and Keil (2002) used a feature-based attention paradigm and found significantly stronger alpha desynchronization in upper alpha power for non-targets that contained the attended feature (e.g., green color) than those that did not. They suggested that the effect may have been caused by an initiation (although not execution) of a motor response for non-targets that contained the attended feature. Similarly, in our study deviants represent the rare stimuli which activate change detection mechanisms and thus can be regarded as more task-relevant than standards, i.e., deviants require more inhibition than standards. In particular, as the change-detection mechanism underlying MMN plays an important role in the orientation of attention to novel or unexpected events, it may be that the increased alpha desynchronization with deviants reflects in part the shifting of attentional resources following a detection of change, represented by the earlier increase in theta phase locking. Finally, it may also be worth noting that the alpha suppression has not been observed in the studies of aMMN, hinting at the possibility of different functional value of alpha oscillations in visual vs. auditory modality (cf. Hsiao et al., 2009 for increased alpha power post-auditory stimulus in an oddball paradigm). Further investigation of the oscillatory activity associated with the vMMN response may help test and develop these hypotheses.

In sum, the vMMN response has distinct oscillatory characteristics that are typically lost in the averaging process used to measure VEPs. Theta phase locking is associated with the early vMMN epoch, and may reflect the temporary functional connection of the cortical areas involved in the vMMN response. It also suggests a common oscillatory mechanism behind the aMMN and vMMN responses. The examination of oscillatory changes alongside grand average VEP waveforms provides a more complete picture of the event related neural changes in the vMMN response.

### **ACKNOWLEDGMENTS**

This work was supported by the Biotechnology and Biosciences Research Council, UK.

# **REFERENCES**


# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Human\_Neuroscience/10.3389/fnhum.2013.00426/abstract

**Supplementary Figure A | Sixty-four channel grand average plots for all time-frequency analyses for (A) standard, (B) deviant and (C) significant difference** *t***-values (i.e., deviant minus standard,** *p <* **0.05) after Monte Carlo permutation correction for multiple comparisons.** Non-significant differences are masked in white. X and Y axes scales for all electrode plots are indicated by the blank plot at the bottom of the figure. It should be noted that statistical data is based on the cluster based permutation analyses across all channels, timepoints and frequencies, i.e., controlling for multiple comparisons across spatial/temporal/spectral dimensions. Electrodes are presented in the plots individually in order to show the presence and absence of effects at each electrode in a more detailed manner than using scalp maps.


(1997). Brain oscillations and human memory: EEG correlates in the upper alpha and theta band. *Neurosci. Lett.* 238, 9–12. doi: 10.1016/S0304-3940(97)00771-4


*Psychol. (Amst.)* 42, 313–329. doi: 10.1016/0001-6918(78)90006-9


doi: 10.1016/j.neurobiolaging.2012.08.012


(MMN and vMMN) linking predictive coding theories and perceptual object representations. *Int. J. Psychophysiol*. 83, 132–143. doi: 10.1016/j.ijpsycho.2011.10.001

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 February 2013; accepted: 15 July 2013; published online: 01 August 2013.*

*Citation: Stothart G and Kazanina N (2013) Oscillatory characteristics of the visual mismatch negativity: what evoked potentials aren't telling us. Front. Hum. Neurosci. 7:426. doi: 10.3389/fnhum.2013.00426*

*Copyright © 2013 Stothart and Kazanina. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Automatic processing of unattended lexical information in visual oddball presentation: neurophysiological evidence

# *Yury Shtyrov<sup>1,2,3</sup>\*, Galina Goryainova<sup>4</sup>, Sergei Tugin<sup>4,5</sup>, Alexey Ossadtchi<sup>4,6</sup> and Anna Shestakova<sup>4,7</sup>*

*<sup>1</sup> Center of Functionally Integrative Neuroscience, Institute for Clinical Medicine, Aarhus University, Aarhus, Denmark*

*<sup>2</sup> Centre for Languages and Literature, Lund University, Lund, Sweden*

*<sup>3</sup> Medical Research Council, Cognition and Brain Sciences Unit, Cambridge, UK*

*<sup>4</sup> Department of Higher Nervous Activity and Psychophysiology, Saint Petersburg State University, Saint Petersburg, Russia*

#### *Edited by:*

*Gabor Stefanics, University of Zurich and ETH Zurich, Switzerland*

#### *Reviewed by:*

*Andriy Myachykov, Northumbria University, UK; Xiaodong Wang, Nanyang Technological University, Singapore*

#### *\*Correspondence:*

*Yury Shtyrov, Center of Functionally Integrative Neuroscience, Aarhus University Hospital, Building 10G, 5th floor, Nørrebrogade 44, Aarhus University, 8000 Aarhus, Denmark e-mail: yury.shtyrov@cfin.au.dk*

Previous electrophysiological studies of automatic language processing revealed early (100–200 ms) reflections of access to lexical characteristics of speech signal using the so-called mismatch negativity (MMN), a negative ERP deflection elicited by infrequent irregularities in unattended repetitive auditory stimulation. In those studies, lexical processing of spoken stimuli became manifest as an enhanced ERP in response to unattended real words, as opposed to phonologically matched but meaningless pseudoword stimuli. This lexical ERP enhancement was explained by automatic activation of word memory traces realized as distributed strongly intra-connected neuronal circuits, whose robustness guarantees memory trace activation even in the absence of attention on spoken input. Such an account would predict the automatic activation of these memory traces upon any presentation of linguistic information, irrespective of the presentation modality. As previous lexical MMN studies exclusively used auditory stimulation, we here adapted the lexical MMN paradigm to investigate early automatic lexical effects in the visual modality. In a visual oddball sequence, matched short word and pseudoword stimuli were presented tachistoscopically in perifoveal area outside the visual focus of attention, as the subjects' attention was concentrated on a concurrent non-linguistic visual dual task in the center of the screen. Using EEG, we found a visual analogue of the lexical ERP enhancement effect, with unattended written words producing larger brain response amplitudes than matched pseudowords, starting at ∼100 ms. Furthermore, we also found significant visual MMN, reported here for the first time for unattended perifoveal lexical stimuli. The data suggest early automatic lexical processing of visually presented language which commences rapidly and can take place outside the focus of attention.

**Keywords: brain, language, event-related potential (ERP), mismatch negativity (MMN, vMMN), lexical memory trace, visual word comprehension**

#### **INTRODUCTION**

In spite of years of productive research in psycho- and neuro-linguistics as well as psychophysiology and cognitive neuroscience, neurobiological mechanisms underlying the human language function remain poorly understood. Some of the questions still hotly debated in language sciences are the time course of linguistic processes in the brain and the degree of their dependence on attentional control. When exactly are word representations assessed by the brain? How automatic is this process and/or does it require our conscious control? While some scientists have traditionally argued for a lexico-semantic access at 350–400 ms (see e.g., Friederici, 2002; Hagoort, 2008), some more recent evidence is pointing toward a much earlier onset of these processes, at ∼50–200 ms (Pulvermüller et al., 2009; MacGregor et al., 2012). Similarly, whereas some accounts of linguistic processes imply attentional control over them, there are strong indications of a large degree of automaticity in e.g., lexico-semantic and syntactic processes, at least at their earliest stages (for a review, see e.g., Shtyrov, 2010).

A substantial contribution to this debate came from a body of recent investigations using non-attend designs, where the subjects are not given a stimulus-related task and, furthermore, are distracted from auditory linguistic stimuli by an alternative primary task. This is done in order to ensure that no interference can come from attentional biases and stimulus-specific behavioral strategies<sup>1</sup>. A large number of these studies have used the so-called mismatch negativity (MMN) brain response, an early component of auditory event-related potentials (ERPs). MMN shows high sensitivity to unexpected changes in a monotonous stream of unattended sounds, reflected in electroencephalographic (EEG) recordings as an increased fronto-central negativity with temporo-frontal sources (Näätänen et al., 2007). When these sounds are meaningful speech elements, for example words or morphemes of a native language, they show a characteristic ERP amplitude increase over acoustically similar and psycholinguistically matched stimuli that do not form meaningful language units. Dubbed "lexical enhancement," this phenomenon, which most often occurs at about 100–200 ms, has been investigated in different experimental settings, languages and imaging modalities (EEG, MEG, fMRI; see e.g., Korpilahti et al., 2001; Shtyrov and Pulvermüller, 2002; Shtyrov et al., 2005, 2008). This word-specific brain response shows sensitivity to a number of psycholinguistic word properties: its amplitude changes with word frequency (Alexandrov et al., 2011; Shtyrov et al., 2011), its surface topography and underlying cortical sources show specificity to word semantics (Shtyrov et al., 2004; Pulvermüller et al., 2005), its latency correlates with psycholinguistically determined word recognition times (Pulvermüller et al., 2006), etc. This has led to firm conclusions that lexical MMN response reflects activation of neural memory traces for stimulus words, which occurs rapidly after the information at the auditory input allows for word identification (Pulvermüller and Shtyrov, 2006). Importantly, this activation takes place when the subjects' attention is removed from the linguistic stimuli. Furthermore, modulation of attention levels (using task demands and experimental instructions) does not affect the strength of this early word-elicited response (Garagnani et al., 2009; Shtyrov et al., 2010). These latter findings imply that the early word-specific activation is largely automatic and does not strongly depend on the level of attentional control. This automaticity could be attributed to the robustness of distributed neuronal networks that act as neural word memory traces in the brain. Importantly, these findings of early automatic lexical activation could also be replicated outside the MMN oddball paradigm, in an ecologically more valid presentation of multiple unrepeated words and pseudowords, provided their acoustic and phonological features are tightly controlled (MacGregor et al., 2012). In sum, this body of evidence suggests that the brain may be capable of automatic lexical analysis of spoken language even in the absence of attention on the linguistic input.

<sup>1</sup>The distraction from spoken language stimuli is usually implemented by means of a primary visual task, such as watching a film or playing a computer game, although within-modality distraction to contralaterally presented nonspeech auditory stimuli has also been successfully used (Pulvermüller et al., 2008).

Such an account would predict the automatic activation of these memory traces upon *any* presentation of linguistic information, irrespective of the modality in which it is presented. To date, however, linguistic experiments in the visual modality have not been able to explore this phenomenon, as they have usually presented stimuli in the focus of attention. In terms of the speed of lexico-semantic activation, a number of visual studies provide a similar picture of rapid and early access to word information in the brain, as seen in visual ERPs at latencies between 100–200 ms (e.g., Ortigue et al., 2004; Hauk et al., 2006). Such studies, however, cannot easily address the question of automaticity of neural lexical access. Indeed, it is not straightforward to present *unattended* words visually: if the stimulus falls within the focus of the visual field, it enters the attended area, which is why visual research mostly deals with active processing of attended stimuli. One approach to study subconscious visual word processing is masked priming (e.g., Dehaene et al., 2001; Henson, 2003) where a "probe" word may be preceded by a "prime" stimulus, which is masked and presented so briefly that the subject is not able to consciously register it. Masked priming studies have indeed reported a number of effects produced by such "invisible" word stimuli, including evidence of lexico-semantic access to them (e.g., Brown and Hagoort, 1993; Kiefer, 2002), although at later latencies than in the auditory studies above. However, such experiments, on the one hand, *do* require vigilant attention to the linguistic input (and thus reduce awareness rather than remove attention). On the other hand, priming studies (masked priming included) more likely assess the interactions between the prime and the probe rather than the processing of the subliminal stimulus *per se*. A similar comment can be made with respect to the visual Stroop task which famously demonstrated behaviorally (e.g., Glaser and Glaser, 1989) the automaticity in access of individual words<sup>2</sup>: whilst the experimental instruction *per se* does not explicitly encourage word processing, the stimulus words themselves in the Stroop task are nevertheless presented in the focus of attention. Thus, the automaticity of neural processing of *unattended* visual language remains obscure.

To complement the earlier auditory MMN studies and bridge the gap between them and the visual modality in linguistic processing, we set out to address the issue of early lexical automaticity in the visual domain. For this, it seems essential to remove the focus from the visual linguistic input (similar to the previous auditory research above) and to record activations caused by unattended stimuli *per se*. For maximum compatibility with the previous research, we decided to adapt the auditory lexical MMN paradigm to the visual modality. A visual analogue of the auditory MMN (vMMN) is known to occur for presentation of at least non-linguistic graphical stimuli (Czigler et al., 2006). This usually involves a primary task such as tracking geometrical shapes in the center of the visual field, while unattended stimuli (frequent "standards" and rare unexpected "deviants") are flashed on the periphery of the visual field in oddball sequences, similar to those used auditorily. vMMN can be elicited independently of attention (Berti, 2011) by deviance in color (Czigler et al., 2002), orientation (Astikainen and Hietanen, 2009; Kimura et al., 2010), movement (Pazo-Alvarez et al., 2003), spatial frequency (Heslenfeld, 2003), contrast (Stagg et al., 2004) and even in abstract sequential regularities (e.g., "if, then . . . " rules; Stefanics et al., 2011) in visual stimulation. Whilst having been linked to neural automatic visual change detection and short-term memory (Czigler and Pato, 2009), vMMN has remained virtually unexplored with respect to its sensitivity to long-term representations, such as word-specific lexical memory circuits.

<sup>2</sup>The Stroop effect demonstrates automatic access to the meaning of a visually presented word in an experimental task which does not encourage semantic processing or even reading as such. When the name of a color (e.g., "green" or "red") is printed in a color not denoted by the name (e.g., the word "red" printed in blue ink), color naming takes longer and is more prone to errors than in a non-conflict situation. This and an entire family of similar effects suggest that lexico-semantic information (including individual word semantics) is assessed automatically even though this is not required by the visual task, leading to mutual interference between the two accessed representations (Brown et al., 1995).

Motivations for applying MMN methodology to language lie, on the one hand, with the earliness and automaticity of this cognitive ERP (Shtyrov and Pulvermüller, 2007). These properties make it instrumental for uncovering the earliest attention-independent neurophysiological indices of language processing, without any confounds associated with active tasks and attention variation (Pettigrew et al., 2004; Pulvermüller and Shtyrov, 2006; Näätänen et al., 2007). From the other, more methodological point of view, the use of a small set of well-controlled stimuli minimizes stimulus variance and associated brain response smearing, allowing for a finer degree of precision in locating and analysing any minute short-lived early activations (Shtyrov and Pulvermüller, 2007). Further, as the MMN is a difference response (obtained as a deviant-minus-standard ERP subtraction), this helps to rule out purely sensory confounds arising from divergence of physical stimulus features, by incorporating identical physical contrasts into different linguistic contexts. An advantage of the visual presentation, on the other hand, is its potential ability to overcome inherent problems of spoken stimulus presentation, such as variability in word length, in sound energy distribution across the waveform's duration, in word-specific recognition points etc. Unlike auditory stimuli that unfold over time, visual words are available in full instantly and can be presented for a strictly defined period of time, which can be fully matched across stimuli and conditions.

To test the presence of early automatic lexical effects in visual oddball presentation, we adapted the established lexical MMN approach to the visual modality. In line with non-linguistic visual MMN research (see e.g., Pazo-Alvarez et al., 2003), we engaged our experimental participants in a primary non-linguistic task continuously present in the center of the visual field. While the subjects were focused on this primary task, words and pseudowords matched for physical properties were briefly (100 ms) flashed just outside the fovea (2.5°) in oddball sequences. All sequences had identical single-letter visual standard-deviant contrasts, while the exact lexical status of the standard and deviant stimuli (as either words or pseudowords) was systematically modulated. To control for purely sensory effects, further non-linguistic control stimuli were used, and a low-level visual baseline condition was applied to parcel out the primary task contribution to visual responses. The subjects' neural responses to the stimulation were recorded using EEG. Based on the previous research, we expected to observe an early reflection of lexical differences, most likely as an increase in word-elicited activation relative to pseudoword ERPs. We also expected a visual MMN in the form of a difference between the deviant and standard brain responses.

# **METHODS**

#### **SUBJECTS**

Sixteen healthy right-handed (handedness assessed according to Oldfield, 1971) native Russian-speaking volunteers (6 males; age range 18–24, mean 21.2 y.o.) with normal vision and no record of neurological diseases were presented with visual stimuli in 6 experimental conditions. All subjects gave their written consent to take part in the study and were paid for their participation. The experiments were performed in accordance with the Declaration of Helsinki with approval of the University of St. Petersburg Ethics Committee.

# **STIMULI**

#### *Oddball stimuli*

As linguistic stimuli in the visual oddball presentation, we employed four sets of controlled monosyllabic three-letter words and pseudowords of the Russian language (**Table 1**). All stimuli were closely matched in their properties: (1) the two words in each standard-deviant pair shared the first two letters (always consonant-vowel), (2) the visual/orthographic contrasts between the standard and the deviant stimuli were identical in all conditions, and comprised a change between word-final consonants "К" [k]<sup>3</sup> and "Н" [n], (3) the four sets differed only in the first letter ("M" [m], "T" [t], "Ф" [f], "Б" [b], which was however the same letter within each set), (4) because of transparency in Russian orthography, the sets possessed equal phonetic similarity and identical phonetic contrasts in the auditory domain, which could be important to control in case of their covert articulation, even though it is unlikely to take place given the procedures employed (see below). All words were lexically unambiguous nouns common in the Russian language and had similarly high lexical frequency of occurrence (range: 1.51–2.08 log instances per million; determined according to Sharoff, 2001), as did stimulus-initial and stimulus-final bigrams (2.27–3.18 and 3.04–3.12, respectively). Whilst matched visually and orthographically, the four sets systematically differed in the lexical status of the standard and deviant stimuli. All possible combinations were included: standard word vs. deviant word, standard pseudoword vs. deviant pseudoword, standard word vs. deviant pseudoword and standard pseudoword vs. deviant word (see **Table 1**).

To validate our choice of lexical stimuli and ensure that they were perceived as meaningful words vs. meaningless pseudowords by all experimental participants, we administered a behavioral rating questionnaire to all participants (after the EEG recording). This included answering questions on stimulus lexicality ("how confident are you that this is a real word in the Russian language") and frequency ("how often do you encounter this word or use it yourself") on a 7-point Likert scale. This rating study fully confirmed the intended strong word-pseudoword distinction (lexicality ratings: 6.9 words vs. 1.6 pseudowords [*F*(1, 15) = 692, *p* < 0.0001]; frequency rating: 6.1 vs. 1.2 [*F*(1, 15) = 252, *p* < 0.0001])<sup>4</sup>.

<sup>3</sup>Original Cyrillic letters given, with Latin transcription approximating their pronunciation in square brackets.

**Table 1 | Visual word, pseudoword and non-word stimuli used in oddball sequences (Latinized transcription in square brackets, English translation in italics).**

*Note that the stimuli are very similar orthographically, while their lexical status is modulated systematically. Visual standard-deviant contrast (К/Н [k/n]) is identical across all conditions.*

In addition to the 4 word and pseudoword conditions, a non-word stimulus set was included to control for lower-level sensory/sublexical factors. To match this set with the main 4 conditions, it employed the same visual contrast (К/Н) combined with the non-orthographic hashmark and ampersand symbols, which are not typically used in Russian (see **Table 1**).

The textual stimuli were presented tachistoscopically for 100 ms, with stimulus onset asynchrony jittered between 800–1000 ms (mean 900 ms), in black font-face (Arial 14 pt) on grey background (**Figure 1**). Two copies of each stimulus were simultaneously displayed at symmetric locations in the left and right hemifields at 2.5° angle from the center of the screen. Such a symmetric bilateral presentation was used in order to ensure that, while the complete information is presented to both visual hemifields, the participant's gaze is not prompted to saccade from the central task to the orthographic stimuli (the risk of which could be higher with a single asymmetric presentation).

<sup>4</sup>This is particularly important as some of the pseudowords may have a niche meaning in highly specialized technical vocabularies with a restricted scope of use. As established by the behavioral rankings, the volunteers were not familiar with these stimuli, all of which were thus perceived as meaningless pseudowords.

**FIGURE 1 | An example of the visual stimulation employed and a schematic demonstration of the visual sequence.** The subjects' task was to focus on the center of the screen to detect combinations of two concentric circles, which were present continuously but changed colors pseudorandomly at every SOA refresh. At the same time, unattended orthographic stimuli were presented briefly (100 ms) at symmetrical locations on visual periphery (at 2.5° angle from the center to the left and to the right) in oddball sequences containing frequent standard and rare unexpected deviant stimuli (see also **Table 1**). In addition to the set of oddball blocks, a sensory visual baseline condition was included that only contained concentric circles but no orthographic stimuli on the flanks.

#### *Non-linguistic primary task stimuli*

As a primary task, which the participants were instructed to concentrate on, they were presented with 2 concentric circles of different colors (**Figure 1**): all possible combinations of red, green, blue and yellow were used. These combinations were displayed in the center of the screen and changed in synchrony with the orthographic stimuli that appeared on visual periphery. However, unlike the latter, these were kept on the screen for the entire duration of the SOA (to avoid strong visual onset and offset responses) such that the circles were seen as present continuously, with their colors changing.

# **PROCEDURE**

The subjects were instructed to fixate their gaze on the center of the screen where a fixation cross was displayed, and to focus on a dual visual task of detecting color circle combinations presented in the focus of their visual attention. This dual task required tracing the color of both the inner and the outer circles and reacting only to a particular combination of colors/locations (i.e., when the task was to detect "inner red, outer blue" target, responses to any other combination—including "inner blue, outer red"—were considered incorrect). Responses were given by pressing a button with the left index finger. In addition, the subjects were requested to count the number of target combinations and report them at the end of the block. Target combination probability was 15%. As the experiment consisted of six blocks, a different target combination was used in each block. The order of target color combinations was counterbalanced across subjects, and, within each block, stimulus sequences were randomized individually. A short training sequence, using similar (but not identical) stimuli was run in the beginning of each experiment.

While the subjects concentrated on this primary task, unattended orthographic stimuli were presented at the flanks. Each standard-deviant pair was presented in a separate block, where 600 frequent standard stimuli were pseudo-randomly interspersed with 100 deviant ones. There were at least two standard presentations between any two deviants. The subjects were not informed of the orthographic stimuli, and the task did not encourage attention on them. On the contrary, the very brief presentation of these stimuli (100 ms) that appeared perifoveally at the same time as the color combinations were changing in the focus of their attention ensured maximum distraction from the textual stimulation.
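The block structure described above (600 standards, 100 deviants, at least two standards between consecutive deviants, and an SOA jittered between 800 and 1000 ms) can be illustrated with a minimal sketch. The function names are hypothetical, and the randomization scheme and the uniform jitter distribution are assumptions: the text specifies only the counts, the spacing constraint and the SOA range, not the procedure actually used to generate the sequences.

```python
import random


def make_oddball_sequence(n_standard=600, n_deviant=100, min_gap=2, seed=None):
    """Return a pseudo-random list of 'standard'/'deviant' labels.

    Assumed scheme (not taken from the paper): reserve `min_gap` standards
    between each pair of consecutive deviants, then scatter the remaining
    standards at random over all gaps, including before the first and after
    the last deviant.
    """
    if n_deviant < 1:
        raise ValueError("need at least one deviant")
    rng = random.Random(seed)
    # gaps[0] precedes the first deviant, gaps[-1] follows the last one.
    gaps = [0] + [min_gap] * (n_deviant - 1) + [0]
    spare = n_standard - sum(gaps)
    if spare < 0:
        raise ValueError("not enough standards to satisfy the spacing rule")
    for _ in range(spare):
        gaps[rng.randrange(len(gaps))] += 1
    sequence = []
    for i, gap in enumerate(gaps):
        sequence.extend(["standard"] * gap)
        if i < n_deviant:
            sequence.append("deviant")
    return sequence


def jittered_soas(n_trials, low_ms=800.0, high_ms=1000.0, seed=None):
    """Draw one SOA per trial; a uniform distribution is assumed here."""
    rng = random.Random(seed)
    return [rng.uniform(low_ms, high_ms) for _ in range(n_trials)]
```

Under this scheme a block may begin with a deviant; if a run of initial standards is required to establish the regularity, the first gap can be given its own minimum.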

In addition to the four word/pseudoword sets and one nonword set, one further condition was included that contained only the primary visual detection task but no text stimuli. This was done in order to establish the baseline level of brain activation related purely to the colored geometric shapes, which could later be used to parcel out text-related brain responses from those related to the concurrent non-linguistic task.

### **EEG RECORDING AND PRE-PROCESSING**

During the visual presentation, the subjects' EEG was registered using a 32-channel EEG setup (Mitsar, St. Petersburg, Russia) and 10-mm gold-plated electrodes (Grass Products, Warwick RI, USA) placed on the scalp according to the 10–20% electrode configuration system, with linked mastoids as a reference electrode. To control for vertical and horizontal eye movements, electrooculogram (EOG) readings were taken via two electrodes placed below the left eye and lateral to its outer canthus. The sampling rate was 500 Hz. Electrode impedances were kept below 5 kΩ.

EEG data analysis was carried out offline using EMSE Suite (Source Signal, La Mesa CA, USA). Data were re-referenced to average reference, band-pass filtered (1–30 Hz) and bipolar electro-oculogram channels were reconstructed for vertical (VEOG) and horizontal (HEOG) eye movements from monopolar EOG recordings. Continuous data were then epoched into segments starting 100 ms before stimulus onset and ending 600 ms thereafter. The prestimulus interval of −100 to 0 ms was used as a baseline. Any epoch with signal variation exceeding 100 µV was discarded, as were those that coincided with any target stimuli and the ones immediately following them, to minimize button-press-related movement artifacts. The remaining artifact-free epochs were then averaged separately for each stimulus type (standard/deviant, word/pseudoword etc.). Finally, ERPs obtained for the control primary task-only block were subtracted from those obtained in the text stimulation blocks, in order to remove any contribution of attended geometric shapes into the responses, and concentrate on the effects of unattended orthographic stimuli *per se*.
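The epoching, baseline correction, artifact rejection and averaging steps can be sketched for a single channel as follows. This is an illustration only, not the EMSE Suite procedure: the function names are hypothetical, "signal variation" is interpreted here as peak-to-peak amplitude within the epoch (an assumption), and the exclusion of target-related epochs is omitted.

```python
def extract_epochs(signal_uv, onsets, sfreq=500, pre_ms=100, post_ms=600,
                   reject_uv=100.0):
    """Cut baseline-corrected epochs around stimulus onsets (sample indices).

    Epochs whose peak-to-peak amplitude exceeds `reject_uv` are dropped;
    peak-to-peak is an assumed reading of 'signal variation'.
    """
    pre = int(round(pre_ms * sfreq / 1000))    # 50 samples at 500 Hz
    post = int(round(post_ms * sfreq / 1000))  # 300 samples at 500 Hz
    epochs = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal_uv):
            continue  # epoch would run past the recording edges
        segment = signal_uv[start:stop]
        baseline = sum(segment[:pre]) / pre    # mean of the -100 to 0 ms interval
        segment = [v - baseline for v in segment]
        if max(segment) - min(segment) > reject_uv:
            continue  # artifact rejection
        epochs.append(segment)
    return epochs


def average_epochs(epochs):
    """Point-by-point mean across epochs, i.e., the ERP for one stimulus type."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def subtract_control(erp, control_erp):
    """Remove the task-only (baseline block) ERP from a text-block ERP."""
    return [a - b for a, b in zip(erp, control_erp)]
```

In this sketch the onsets for each stimulus type (standard or deviant, word or pseudoword) would be passed separately, so that one average is obtained per type before the control subtraction.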

# **EEG STATISTICAL ANALYSIS**

For an unbiased data-driven analysis, overall activation strength of the ERPs was first quantified as the global root mean square (RMS) of the ERP responses across all scalp electrodes. To this end, the grand average response was calculated across all word and pseudoword stimuli collapsed (standards and deviants included) for each electrode. Then, for each time point, the square root was calculated on the mean of squared amplitudes across all electrodes, producing a single global RMS response. Finally, the most prominent peaks in this global RMS were identified. These were found at ∼110 and 250 ms, which coincided with the well-known ERP responses to visual/written stimuli: N1/P100 and N250 (Oken et al., 1987; Carreiras et al., 2009; Lee et al., 2012). Mean amplitudes across 20-ms time windows centered on these peaks were used for a more detailed further analysis. A smaller deflection was found at ∼350–400 ms corresponding to the established N350/N400 effects (Bentin et al., 1999; Lau et al., 2008); this period was therefore used as a 3rd time window in statistical assessment of the ERPs.
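The global RMS and the window-mean measure described above can be written out as a short sketch. The function names and the data layout (one list of amplitudes per electrode) are assumptions made for illustration; the epoch start of −100 ms and the 500 Hz sampling rate are taken from the recording description.

```python
import math


def global_rms(erp_by_channel):
    """Global RMS per time point: sqrt of the mean squared amplitude
    across electrodes. `erp_by_channel` holds one list per electrode."""
    n_channels = len(erp_by_channel)
    return [
        math.sqrt(sum(v * v for v in sample) / n_channels)
        for sample in zip(*erp_by_channel)
    ]


def window_mean(trace, center_ms, width_ms=20, sfreq=500, epoch_start_ms=-100):
    """Mean amplitude in a window of `width_ms` centered on `center_ms`.

    Latencies are converted to sample indices relative to the epoch start;
    both window edges are included.
    """
    half = width_ms / 2
    first = int(round((center_ms - half - epoch_start_ms) * sfreq / 1000))
    last = int(round((center_ms + half - epoch_start_ms) * sfreq / 1000))
    window = trace[first:last + 1]
    return sum(window) / len(window)
```

For the first peak, `window_mean(trace, 110)` would average the samples between 100 and 120 ms. Note that the RMS discards polarity, which is why the statistics below are run on the ERP amplitudes rather than on the RMS curve.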

For statistical analysis, window-mean amplitudes extracted from each electrode in a 25-electrode array (organized in a 5 × 5 grid) covering most of the scalp were submitted to analyses of variance using factors Lexicality (words vs. pseudowords/nonwords), Stimulus Type (Standard—Deviant) and Topography (electrode location). For these statistics, data were taken from ERP responses prior to the RMS procedures, in order to allow assessment of possible polarity and topography differences.

# **RESULTS**

All stimulus conditions elicited pronounced ERP responses, with the most prominent peaks in the global response visible at ∼110, ∼250, and ∼375 ms (see **Figure 2**). The first peak exhibited posterior negativity combined with frontal positivity, whereas the reverse (posterior positivity with centro-frontal negativity) was seen for the second peak; the third deflection showed a posterior centro-parietal negativity typical for the N400 time range. Using these overall activity maxima to identify latencies of interest, we then compared window-mean ERP amplitudes at these main activation peaks between different stimuli. Statistical comparison between activation in response to meaningful words as opposed to matched meaningless pseudowords showed a main effect of Lexicality as early as in the first time window (centered at 110 ms), where words produced a significantly stronger response than pseudowords [*F*(1, 15) = 5.76, *p* = 0.03; see **Figures 2**, **3**]. This difference was visible as a more negative word deflection at posterior sites [*F*(1, 15) = 5.04, *p* = 0.04], and a more positive one at fronto-central leads [*F*(1, 15) = 5.05, *p* = 0.04]. A non-significant tendency for the same effect could also be observed in the second time window, and, finally, its fully significant rebound took place at the third peak [*F*(1, 15) = 4.93, *p* = 0.04].

A similar difference was revealed by a comparison between words and non-linguistic control stimuli in the first peak [*F*(1, 15) = 9.76, *p* = 0.01] and, although only marginally significant, in the last peak as well [*F*(1, 15) = 3.71, *p* = 0.07]. Interestingly, although visual inspection suggested strong difference between non-word symbols and words also in the second interval (∼250 ms), this main effect was not significant when data from the entire electrode array were tested (*p* > 0.7). However, as ANOVA indicated a near-significant interaction between Lexicality and Topography for this contrast [*F*(4, 16) = 2.60, *p* = 0.055], we followed it up with planned comparisons. These showed that the word-non-word difference in this interval was indeed significant but only at the electrodes to the left of the midline [*F*(1, 15) = 4.16, *p* = 0.048] and not at any other sites, likely due to strong between-subject variability in this effect. Pseudowords, in turn, did not differ statistically from the non-linguistic controls in any of the analyzed periods, although visual inspection did suggest a possible discrepancy in the two later intervals.

Direct comparison between standard and deviant stimuli revealed a main effect of Stimulus Type, that is, a significant MMN response, with a more negative deviant than standard response at posterior electrodes accompanied by an increased positivity frontally (**Figure 4**). This contrast was strongly significant in the 100–120 ms time window [*F*(1, 15) = 7.37, *p* = 0.016] as well as in the 240–260 ms one [*F*(1, 15) = 18.51, *p* = 0.001]. Although the latter difference, unlike that in the first peak, could be better described as a posterior *decrease* in positivity and anterior *decrease* in negativity for deviants (amounting to a total decrease in the global RMS curve as well), the net deviant-standard subtraction showed the same relative trend, and the difference topography was thus similar to that in the first peak. In the final time window, no significant mismatch response was found.

Interestingly, whereas we found clear main effects of Lexicality and Stimulus Type, no significant interactions between these factors arose in any of the analysis windows, and vMMN as such did not statistically differ between conditions. Finally, the subjects' performance on the primary behavioral task showed average 85% accuracy indicating good compliance with experimental instructions; mean reaction time was 753 ms.


# **DISCUSSION**

We recorded ERPs elicited by unattended perifoveally presented meaningful words and orthographically and psycholinguistically matched meaningless pseudowords in a visual oddball sequence, while the subjects were distracted from these materials by a non-linguistic dual feature detection visual task presented in the focus of their attention in the center of the screen. We found (1) an effect of lexicality, i.e., differences in neural responses to words and pseudowords (as well as between words and non-word control stimuli), and (2) evidence of differential processing of standard vs. deviant stimuli, i.e., the visual correlate of MMN for these lexical stimuli. These effects spanned in time from ∼100 to ∼400 ms, in line with the previous literature on neural word processing and lexical memory trace activation (Bentin et al., 1999; Martin-Loeches et al., 2005; Hauk et al., 2006; Lau et al., 2008; Carreiras et al., 2009; Pulvermüller et al., 2009). Below, we will discuss these findings in more detail.

# **LEXICALITY EFFECTS**


The main effect of word-pseudoword difference manifested as an *increased word activation* that started very early (from ∼100 ms) and, with variable significance, was visible across the response epoch until ∼400 ms. As words and pseudowords were matched for orthographic and psycholinguistic features, it is unlikely that this effect was driven by low-level perceptual differences. Instead, we suggest that it is lexical familiarity *per se*, i.e., the presence of established memory representations for the meaningful word stimuli, that caused this difference. This is further supported by the remarkable similarity between the present effect and the so-called lexical enhancement in passive *auditory* ERPs. As reviewed in the Introduction, the lexical ERP enhancement has been explained by the activation of a word memory trace in the brain, as opposed to a purely sensory activity for meaningless pseudowords that do not possess memory representations in the brain and thus no corresponding memory trace activation is possible (Shtyrov et al., 2010; MacGregor et al., 2012). In the visual modality, lexical features have been known to affect responses already in the 100–160 ms time range, although those results were obtained for attended and actively processed stimuli (e.g., Ortigue et al., 2004; Hauk et al., 2006), whereas the effect we report here takes place outside of the focus of attention. Previous studies using the masked priming paradigm have also found lexico-semantic effects dependent on 'invisible' prime words (e.g., Dehaene et al., 2001; Naccache and Dehaene, 2001; Diaz and McCarthy, 2007), although their EEG correlates have largely been located in a later time frame, predominantly in the 400 ms range (e.g., Brown and Hagoort, 1993; Kiefer, 2002). At this later time range, the N400 response typically shows a reduction in amplitude for related prime-probe combinations. Here, we also report a later lexicality effect reaching into the ∼400 ms time range (in addition to the early differences not typically reported in the N400 literature). One important difference between these paradigms, however, is that in the masked priming designs the stimuli usually *are* attended in an active linguistic task (e.g., lexical decision), even though they may escape awareness through masking manipulation. Here, instead, the stimuli are outside the focus of attention while the subjects' task is strictly non-linguistic and does not encourage attentive linguistic processing in any way. Further, while the priming paradigm is typically aimed at revealing relationships between the prime and the probe stimuli, here we are addressing the processing of the unattended stimulus *per se* and show that lexical familiarity strongly affects brain responses to such stimuli. Taken together, the current result appears to provide strong evidence of automatic processing of unattended written language, with lexical memory trace activation/access taking place even when this is irrelevant for task requirements and when attention is diverted away from written words. Automatic access to linguistic information in the visual modality has long been suggested in behavioral psycholinguistic research (e.g., Glaser and Glaser, 1982; Brown et al., 1995; Naccache and Dehaene, 2001). Here, we show such access neurophysiologically and, furthermore, demonstrate its rapid onset and dynamic timecourse in the brain's activity.

It has been argued that the bases for such automatic lexical activations are distributed neural circuits acting as long-term memory traces for words. Such memory circuits are formed through the process of associative learning in language acquisition, and thus possess strong internal connections that afford memory trace activation automatically, even in the absence of attention (Garagnani et al., 2009; Shtyrov, 2010; Shtyrov et al., 2010). Pseudowords/non-words, on the contrary, do not have such representations, leading to a smaller overall activity under non-attend presentation conditions. Automaticity and rapid speed of lexical activations are likely a consequence of the high ecological value and social validity of linguistic communications, which are automatically processed by the brain for any potentially important messages. Previously established in the auditory modality, this automaticity is shown here in the visual modality as well, suggesting similarity in neural word access irrespective of the exact presentation mode.

Although the overall surface topography of the brain responses found here is similar to that known from previous visual studies (e.g., Bentin et al., 1999; Hauk et al., 2006; Lau et al., 2008; Carreiras et al., 2009), the exact brain loci of the automatic lexical familiarity effect cannot be established given the low-resolution EEG method used. For this, future studies are necessary that may employ high-density EEG and/or MEG with neuroanatomically-based source analysis to reveal the cortical origins of these lexicality effects. In previous auditory experiments using similar paradigms in fMRI and MEG, these were found in superior- and middle-temporal cortices as well as in inferior-frontal cortex, predominantly in the left hemisphere (Shtyrov et al., 2005, 2008, 2011; Pulvermüller et al., 2006). Further areas, such as the inferior-temporally located visual word-form area as well as the angular gyrus, are known to be involved in written word processing (Price, 2001); their involvement in unattended word processing also remains to be addressed in future research.

Interestingly, while the prominent word response around the typical P1/N1 range (∼100 ms) here takes the form of a posterior negativity accompanied by frontal positivity, the N170 deflection often found for orthographic materials (e.g., Maurer et al., 2008; Wang et al., 2013) is not obviously present here. There are a few possible explanations for this pattern of results. The most critical difference between this and the earlier visual orthographic studies is the mode of presentation. Rather than presenting the stimuli in the visual focus, as has conventionally been done in N170 studies, we showed them perifoveally, where the density of receptors on the retina is reduced (Diaz-Araya and Provis, 1992). Further, the presentation was tachistoscopic, i.e., very brief, which may have also influenced the amplitude of common visual ERPs, including N170. This subtle presentation of the orthographic stimuli was also subject to interference from a massive non-linguistic central stimulus (**Figure 1**). Alternatively, such a subtle mode of presentation may have also led to a delay in the response peak; this could mean that the deflection at ∼250 ms may, at least in part, be attributed to a weakened and delayed N170. To answer this question with any certainty, future studies will be necessary that directly compare responses to lexical stimuli using different presentation modes.

# **vMMN TO ORTHOGRAPHIC STIMULI**

In line with previous research into visual MMN (see e.g., Czigler et al., 2002; Pazo-Alvarez et al., 2003; Stagg et al., 2004; Astikainen and Hietanen, 2009; Czigler and Pato, 2009; Kimura et al., 2010; Berti, 2011; Stefanics et al., 2011, 2012), we found that unattended presentation of standard and deviant stimuli in a visual oddball sequence does elicit a vMMN. The current results showed the same relative polarity difference, namely more negative (or less positive at later times) posterior activity for the deviant than standard stimuli, as that seen with basic visual contrasts in previous vMMN research. The contrasts used in those earlier studies typically included color changes, movement direction, checkerboards and other simple visual objects. Similar to those previous studies, the vMMN seen here occurred early on and took place between 100 and 260 ms, although non-significant effects lasted for longer. The important new finding here is the vMMN elicitation by a subtle orthographic contrast, the change of a single letter in a tachistoscopically presented textual stimulus. This, to our knowledge, is the first demonstration of a vMMN effect for *unattended linguistic materials*, suggesting that they are processed automatically early on even when presented outside the foveal attention spot. The only other linguistic vMMN study available to date is a very recent work by Wang et al. (2013), who have shown, using Chinese hieroglyphic characters, vMMN's sensitivity to phonological information. In that study, even though the subjects were not instructed to read the visually presented characters and were instead asked to detect their color, the vMMN was nevertheless strongly influenced by the phonological properties of the stimuli. The important difference between that work and our study is that Wang et al. deviated from the classic vMMN approach, by presenting the stimuli in the focus of visual attention and subjecting them to an explicit behavioral task.
In our present work, we have followed more strictly the conventions for visual MMN research by locating the stimuli outside the visual focus of attention and ensuring that the subjects did not perform any stimulus-related activity at all, by distracting them with a spatially distinct primary task. Conceptually, while the current study is focused on automatic lexical effects, the Wang et al. paper deals with automatic extraction of phonological information. The two studies are therefore complementary in various aspects and, together, point toward early automaticity of different types of visual language processing.

While linguistic materials (including vowels, syllables, words and even phrases) have been known to elicit robust auditory MMNs (Pulvermüller and Shtyrov, 2006; Shtyrov and Pulvermüller, 2007), the same is shown here in visual modality, suggesting a certain similarity in linguistic MMN elicitation across modalities. There is, however, an important difference between the previous auditory results and the current visual findings. Auditory MMN research suggested a dominating role of the deviant stimulus's lexical status in eliciting memory trace activation, while reports of lexicality/familiarity effects for frequent standard stimuli have been less consistent (cf. Shtyrov and Pulvermüller, 2002; Jacobsen et al., 2004, 2005). Here, however, we observed no interaction at all between the factors of Lexicality (word vs. pseudoword) and Stimulus Type (standard vs. deviant), and vMMN as such did not statistically differ between conditions. This suggests that, on the one hand, lexical familiarity effects are elicited by standards and deviants alike, and, on the other hand, that vMMN is equally elicited by different stimuli regardless of their lexical familiarity. Given that previous auditory research is not entirely consistent and that the current study is the first foray into the lexical vMMN, it may be premature to discuss whether this difference is due to the modality of presentation, the rigorous within-modality distraction task or possibly some other factors. We would therefore prefer to refrain from addressing this question until further studies using different languages and experimental manipulations are carried out. Similarly, the cortical locus of the lexical vMMN in the brain can only be assessed in future high-density EEG/MEG and possibly fMRI research and cannot be resolved by this first study using a low-resolution EEG methodology.

Finally, application of the vMMN to neurolinguistic processes may open new avenues for this research. Unlike auditorily presented spoken words, visual text does not gradually unfold over time, which allows for stricter control over physical stimulus properties and thus opens a possibility to use a wider range of stimuli. It may also lead to application of linguistic MMN paradigms to situations in which auditory designs are not ideal, such as in noisy environments (e.g., inside an MR scanner) or with hearing-impaired participants, in order to ascertain the degree of automatic linguistic processing in various populations (Shtyrov et al., 2012).

# **CONCLUSIONS**

In a visual oddball sequence, matched short word and pseudoword stimuli were presented tachistoscopically in the perifoveal area outside the visual focus of attention, while the subjects' attention was concentrated on a concurrent non-linguistic visual dual task in the center of the screen. Using EEG, we found:

• A visual analogue of the lexical ERP enhancement effect, with unattended written words producing larger brain response amplitudes than matched pseudowords as early as at 100–120 ms;

• A significant visual MMN at 100–260 ms, here reported for the first time for unattended perifoveally presented lexical stimuli.

The data show a high degree of similarity with earlier auditory research into the neural time course of automatic language processing in the brain. This, in turn, suggests similar or even shared mechanisms of unattended language access in visual and auditory modalities. The current results indicate early and automatic lexical processing of visually presented language in the brain that commences rapidly and may take place outside the focus of visual attention, even under a strong distraction from linguistic input.

# **ACKNOWLEDGMENTS**

This research was supported by the UK Medical Research Council (MRC core project code MC-A060-5PQ90) and by the Russian Ministry for Education and Science (Targeted Federal Programme 'Scientific and scientific-pedagogical personnel of innovative Russia', contract 8488). We also wish to thank Alexander A. Alexandrov, Lucy J. MacGregor and Yury E. Shelepin for their input at different stages of this work, and two anonymous referees for their helpful comments and constructive critique on an earlier version of this paper.

# **REFERENCES**


21, 1395–1411. doi: 10.1037/0278-7393.21.6.1395


network of brain regions. *J. Cogn. Neurosci*. 19, 1768–1775. doi: 10.1162/jocn.2007.19.11.1768


*Lond. B Biol. Sci.* 363, 1055–1069. doi: 10.1098/rstb.2007.2159


for an automatic spreading activation account of N400 priming effects. *Brain Res. Cogn. Brain Res*. 13, 27–39. doi: 10.1016/S0926-6410(01)00085-4


Oken, B. S., Chiappa, K. H., and Gill, E. (1987). Normal temporal variability of the P100. *Electroencephalogr. Clin. Neurophysiol*. 68, 153–156. doi: 10.1016/0168-5597(87)90042-6

Oldfield, R. C. (1971). The assessment and analysis of handedness: the Edinburgh Inventory. *Neuropsychologia* 9, 97–113. doi: 10.1016/0028-3932(71)90067-4


Tracking speech comprehension in space and time. *Neuroimage* 31, 1297–1305. doi: 10.1016/j.neuroimage.2006.01.030


*Received: 07 May 2013; accepted: 14 July 2013; published online: 09 August 2013. Citation: Shtyrov Y, Goryainova G, Tugin S, Ossadtchi A and Shestakova A (2013) Automatic processing of unattended lexical information in visual oddball presentation: neurophysiological evidence. Front. Hum. Neurosci. 7:421. doi: 10.3389/fnhum.2013.00421*

*Copyright © 2013 Shtyrov, Goryainova, Tugin, Ossadtchi and Shestakova. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Unattended and attended visual change detection of motion as indexed by event-related potentials and its behavioral correlates

# *Nele Kuldkepp<sup>1,2</sup>\*, Kairi Kreegipuu<sup>1</sup>, Aire Raidvee<sup>1,2</sup>, Risto Näätänen<sup>1,3,4</sup> and Jüri Allik<sup>1,5</sup>*

*<sup>1</sup> Institute of Psychology, University of Tartu, Tartu, Estonia*

*<sup>2</sup> Doctoral School of Behavioural, Social and Health Sciences, University of Tartu, Tartu, Estonia*

*<sup>3</sup> Center of Integrative Neuroscience, University of Aarhus, Aarhus, Denmark*

*<sup>4</sup> Cognitive Brain Research Unit, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland*

*<sup>5</sup> Estonian Academy of Sciences, Tallinn, Estonia*

#### *Edited by:*

*Gabor Stefanics, University of Zurich and ETH Zurich, Switzerland*

#### *Reviewed by:*

*Erich Schröger, University of Leipzig, Germany; Gábor Csukly, Semmelweis University, Hungary*

#### *\*Correspondence:*

*Nele Kuldkepp, Institute of Psychology, University of Tartu, Näituse 2, 50409 Tartu, Estonia e-mail: nele.kuldkepp@ut.ee*

Visual mismatch negativity (vMMN) is a negative-going component amongst cognitive event-related potentials. It reflects an automatic change-detection process that occurs when an infrequent stimulus is presented that is incongruent with the representation of a frequent (standard) event. In our research we use visual motion (more specifically motion direction changes) to study vMMN. Since movement in the visual field is hard for the brain to ignore, the question at hand is whether the detection of motion direction changes depends on attention directed to the stimulus. We present a new continuous whole-display stimulus configuration, where the attention-capturing primary task of motion onset detection is in the central part of the visual display and the visual oddball sequence is in the background. The visual oddball paradigm consisted of 85% standard and 15% deviant events, motion direction change being the deviant. We show that even though the unattended visual oddball sequence does not affect performance in the demanding behavioral primary task, the differences appearing in that sequence are registered by the brain and reflected in two distinguishable vMMN components at occipital and parietal scalp locations. When attention is directed toward the visual oddball sequence, we only see different processing of standards and deviants in later time windows and task-related activity at frontal scalp locations. Our results are obtained under strict attention manipulation conditions.

**Keywords: visual mismatch negativity (vMMN), attention, oddball paradigm, motion detection, event-related potential (ERP)**

# **INTRODUCTION**

It is both necessary and possible for the human visual system to quickly and effectively detect sudden changes in the visual field even if those changes appear in the visual periphery or attention is not directed to them. This automatic change-detection mechanism has been demonstrated by means of the visual mismatch negativity (vMMN) component of event-related potentials (ERPs). Like its auditory counterpart (auditory MMN, Näätänen et al., 1978; for reviews see Näätänen and Winkler, 1999; Näätänen et al., 2007), the vMMN component is elicited by infrequent visual stimuli (i.e., deviants) in a stream of frequent stimuli (i.e., standards) that obey some sequential regularity. It is a negative deflection and usually peaks around 150–400 ms after the onset of a visual stimulus. Researchers have argued that vMMN is elicited when an infrequent stimulus is incongruent with the sensory memory trace of a frequent stimulus (a memory-mismatch account), or when it is incongruent with a prediction made for the upcoming stimulus on the basis of regularities in the preceding stimulus sequence (a prediction-error account) (for reviews see Pazo-Alvarez et al., 2003; Czigler, 2007; Kimura et al., 2011; Kimura, 2012).

Proof of the existence of vMMN remained elusive for some time, and only relatively recently has solid evidence started to accumulate that MMN exists not only in the auditory but in the visual system as well. Up to now, vMMN has been obtained for differences in several visual features, such as stimulus color (Czigler et al., 2002, 2004; Clifford et al., 2010), location (Berti and Schröger, 2004, 2006), luminance (Stagg et al., 2004), orientation (Astikainen et al., 2004, 2008; Kimura et al., 2009; for left/right hands with different orientation see Stefanics and Czigler, 2012), spatial frequency (Kenemans et al., 2010; Sulykos and Czigler, 2011), duration of the visual stimulus (Qiu et al., 2011), motion direction changes (Lorenzo-López et al., 2004; Pazo-Alvarez et al., 2004a; Kremláček et al., 2006; Amenedo et al., 2007), as well as more abstract sequential regularities (Stefanics et al., 2011; Kimura et al., 2012), object formation (Müller et al., 2010) or deformation (Besle et al., 2005) and stimuli carrying emotional content (Zhao and Li, 2006; Astikainen and Hietanen, 2009; Kimura et al., 2012; Stefanics et al., 2012). As Sulykos and Czigler (2011) have already pointed out, a vast majority of vMMN studies have concentrated on the automatic processing of features that are supposed to be processed by the parvocellular system. In the current study we investigate change-detection processes in motion perception, which is typically thought of as a domain of the magnocellular system. Low-level motion perception is widely recognized as a vital function of the visual system, and changes in speed and direction of motion are processed automatically without the necessary involvement of focused attention (Cavanagh, 1992). Therefore, motion could be a useful tool to investigate automatic change detection.

One of the main characteristics of the MMN component is its independence of attention: the magnitude of MMN can be approximately the same irrespective of the signal being attended or not (for auditory modality see Näätänen et al., 2007; for visual modality see Pazo-Alvarez et al., 2003; Kimura et al., 2009). Thus, when applying an experimental paradigm to elicit vMMN, the visual stimuli forming deviants and standards are usually task-irrelevant and there is a behavioral primary task that has to capture the subject's attention. To study automatic change detection in auditory modality, multimodal studies are often conducted, using a visual primary task [see Escera and Corral (2007) for some examples]. There have been studies investigating the intermodal effects of stimulation, showing that the amplitudes of ERPs are enhanced to stimuli in the attended modality (Alho et al., 1992; Wei et al., 2002). The stimulation and focused attention in one sensory modality has the capacity to affect perceptions in another modality (Besle et al., 2005; Bendixen et al., 2010; Salminen et al., 2013) and auditory and visual sensory memory are not completely differentiated from each other. Also, Czigler (2007) has pointed out that visual primary tasks guide attention more effectively than auditory, the latter becoming background stimuli too easily in case of continuous stimulation. So while for vMMN studies the primary task sometimes is a task in the auditory modality (e.g., listening to some story or radio play, or reacting to specific sounds: Astikainen et al., 2004; Maekawa et al., 2005, 2009; Fisher et al., 2010), a majority of studies have applied the vMMN paradigm and the primary task both in visual modality. 
One of the approaches is to use a sequence of stimuli where occasional stimuli function as targets and a behavioral task is related to them (e.g., subjects have to give a manual reaction whenever the targets appear in between the standard and deviant stimuli or when stimuli carrying standard or deviant properties also have target properties: Tales et al., 1999; Berti and Schröger, 2004, 2006; Kimura et al., 2009; Berti, 2011). A step forward is to have a stimulus sequence where target stimuli are presented in the central part of the visual field and standards and deviants in the periphery (Lorenzo-López et al., 2004; Pazo-Alvarez et al., 2004a; Kremláček et al., 2006). The question is whether attention is truly withheld from the non-target stimuli in such sequential presentations, where the stimuli are separated in time [an issue also raised critically by Czigler (2007)]. To control for this issue, it is rather common to use a central primary task, while at the same time vMMN-eliciting stimulus sequences appear in adjacent locations or the visual periphery (some examples of the different stimuli used: Müller et al., 2010; Qiu et al., 2011) and the time-course of stimulus presentation of the two areas is not connected. It has been found, though, that vMMN amplitudes for stimuli presented in the lower and upper visual hemifield differ (being larger in the lower visual hemifield) (Czigler et al., 2004; Amenedo et al., 2007; Sulykos and Czigler, 2011; Müller et al., 2012; for motion onset evoked potentials see Kremláček et al., 2004). This discrepancy has not been shown for horizontal hemifield locations (Pazo-Alvarez et al., 2004b). The issue of stimulus location has lately been raised critically by Müller et al. (2012), who argue that block-wise stimulus presentation in lower/upper hemifields does not rule out attention shifts to task-irrelevant stimuli.
Building on the studies indicating vMMN differences due to stimulus presentation location, we propose an experimental design that uses a central primary task and, for standard and deviant stimulus presentation, the whole peripheral visual field, which should eliminate the exogenous location effects.

The relative motion between an observer and the visual scene creates optic flow, which is monitored for the purpose of guiding locomotion (Gibson, 1950). It is very likely that changes in the optic flow pattern are detected automatically at a relatively low level of processing and do not require focused attention to be noticed. The main goal of this study is to investigate the processing of changes in motion flow direction in conditions either requiring focused attention or not. It is predicted that unexpected changes in the flow pattern elicit a vMMN response whose magnitude is nearly identical irrespective of attention paid to that change. The observer's task was to detect motion onset of a central area which was surrounded by a peripheral area filled with a horizontally moving pattern. The peripheral area was moving independently of the central one and an oddball paradigm was applied there to elicit vMMN. In an attention-neutral task the observer was asked to execute a simple reaction as soon as the central target started to move. In an attention-demanding task the observer was instructed to press one of two keys depending on the relative motion direction between the central and peripheral moving patterns. Since one of the main properties of the MMN is attention-independence (Näätänen et al., 2007), it is expected that the vMMN elicited by the peripheral flow pattern is independent of attention allocated to it.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Forty-nine volunteer observers (mean age 21.2 ± 2.3 years, 14 male) took part in the experiment. They all had normal or corrected-to-normal vision. The participants signed a written consent and the study was approved by the Research Ethics Committee of the University of Tartu [based on The Code of Ethics of the World Medical Association (Declaration of Helsinki)].

#### **APPARATUS AND STIMULI**

Stimulus presentation programs were created using Matlab (Math Works, Inc.). Stimuli were generated with a Cambridge ViSaGe visual stimulus generator (Cambridge Research Systems Ltd., Rochester, UK) and presented on a Mitsubishi Diamond Pro 2070SB 22″ monitor (active display area 20″, frame rate 140 Hz), which from the viewing distance of 90 cm subtended 27.6° in width and 20.5° in height. The display elements were target and background vertical sine gratings with the following parameters: minimal and maximal luminance 0.13 and 128.2 cd/m<sup>2</sup>, respectively; spatial frequency 0.65 c/°; Michelson contrast 99.8%. Around the central fixation point, a round area was separated by a 1.2° gap, forming a target area, which had a diameter of 8.26°. The whole screen area outside the gap served as a background (the stimulus configuration is schematically depicted in **Figure 1**). These specific stimulus parameters showed no background effect on target motion detection in a previous behavioral study (Kuldkepp et al., 2011). Based on that, we expect that when the subject is not paying attention to the background, we can study automatic processing of deviant stimuli there. The background was regularly horizontally moving (200 ms motion, 600 ms pause, velocity 1.6°/s) and an oddball paradigm (85% standards, 15% deviants) was applied there with horizontal motion direction change as a deviant. In the pilot study for this experiment [unpublished data; results were reported at the 5th Conference on Mismatch Negativity (MMN) and its Clinical and Scientific Applications, 2009, in Budapest, Hungary], we found no exogenous effects of motion direction on either vMMN amplitude or latency and therefore used rightward motion as a standard and leftward motion as a deviant.
At the same time the target area was also horizontally moving: each motion trial had a duration of 2225 ms (velocity 0.6°/s, equal left-right probability), and the random inter-stimulus interval (ISI) was 500, 750, 1000, 1250, or 1500 ms.
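The stimulus geometry above is given in degrees of visual angle at a 90 cm viewing distance. As a minimal sketch (the function names are illustrative and not part of the original Matlab stimulus code), the standard relation between angular and physical size, size = 2·d·tan(θ/2), can be used to translate these values into on-screen lengths and to see how far the background grating travels during one 200 ms motion period:

```python
import math

def angle_to_size_cm(angle_deg, distance_cm):
    """Physical extent (cm) of a centrally viewed object subtending angle_deg."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

def size_to_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by a centrally viewed extent of size_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

VIEWING_DISTANCE_CM = 90.0  # from the text

# Target area diameter of 8.26 degrees, expressed in cm on the screen
target_diameter_cm = angle_to_size_cm(8.26, VIEWING_DISTANCE_CM)

# Background displacement in one motion period: 1.6 deg/s for 200 ms
background_shift_deg = 1.6 * 0.200

# The same displacement in grating cycles at 0.65 cycles per degree
background_shift_cycles = background_shift_deg * 0.65

print(f"target diameter: {target_diameter_cm:.1f} cm")
print(f"background shift: {background_shift_deg:.2f} deg "
      f"({background_shift_cycles:.3f} cycles)")
```

Under these values the target area is roughly 13 cm across, and each background motion event shifts the grating by about 0.32°, i.e., about a fifth of one grating cycle.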

# **PROCEDURE**

The subjects sat 90 cm from the monitor screen in a semidarkened, electrically shielded room and were instructed to keep their eyes on the fixation point. In the "Ignore" condition the subjects had to pay attention only to the target area and to respond as quickly as possible to its motion onset by pressing a corresponding button on the response box (i.e., give a simple reaction). In the "Attend" condition, the instruction was to react to the motion onset of the target area but, depending on whether it was moving in the same direction as the background or in the opposite direction, to press one of the two corresponding buttons on the response box (i.e., make a choice reaction). One experimental session lasted for about 13 min.
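A rough plausibility check of the number of background events per session can be sketched from the timing values given above. The calculation assumes, beyond what the text states, that the 800 ms background cycle ran continuously for the whole 13 min session and that deviants were drawn independently with probability 0.15; the actual sequence constraints are not described, so the figures are only indicative:

```python
MOTION_MS = 200      # background motion duration (from the text)
PAUSE_MS = 600       # background pause duration (from the text)
SESSION_MIN = 13     # approximate session length (from the text)
P_DEVIANT = 0.15     # deviant probability (from the text)

cycle_ms = MOTION_MS + PAUSE_MS

# Assumption: the background cycle repeats without interruption
n_events = SESSION_MIN * 60 * 1000 // cycle_ms

expected_deviants = P_DEVIANT * n_events

# Assumption: independent draws, so a deviant is preceded by another
# deviant with probability P_DEVIANT and would be excluded from averaging
expected_usable_deviants = expected_deviants * (1 - P_DEVIANT)

print(n_events, round(expected_deviants), round(expected_usable_deviants))
```

Under these assumptions a session contains about 975 background events and about 146 deviants, of which roughly 124 are not preceded by another deviant. This is of the same order as the mean number of deviants per subject entering the ERP averages reported in the data analysis section, although artifact rejection also affects that number.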

# **EEG RECORDING AND DATA ANALYSES**

Electroencephalography (EEG) was recorded with a BioSemi Active Two system (BioSemi, Amsterdam, Netherlands) using 32 active electrodes (placement based on the international 10/20 system). Reference electrodes were placed on the ear lobes. To monitor blinks and eye movements, the vertical electrooculogram was recorded with electrodes below and above the right eye and the horizontal electrooculogram with electrodes at the right and left outer canthi of the eyes. Online recording was done in DC mode with a 1024 Hz sample rate and a 0.16–100 Hz band-pass filter. Offline data analyses were done using Brain Vision Analyzer 1.05 (Brain Products GmbH, Munich, Germany). The signals were filtered from 1 to 30 Hz (24 dB/octave). Ocular correction was done using a built-in algorithm (Gratton et al., 1983). Artifact rejection was done with the following criteria: maximal allowed voltage step 50 µV; maximal allowed absolute difference of two values in the segment 100 µV; minimal and maximal allowed amplitudes −100 and 100 µV; no more than 100 ms of consecutive low activity (0.5 µV). Nine participants' data were excluded from the final analyses due to technical problems with EEG recording or excessive artifacts. As we were interested in the change detection process in two different attention conditions, EEG data for background events were used for the ERP analyses. We extracted epochs of 700 ms duration (including a 100 ms pre-stimulus period) around background motion onset to calculate ERPs to standard and deviant events. Deviants that occurred right after another deviant were excluded from the analyses. As a result, the mean number of deviants per subject was 124.
Also, only standards that were preceded by other standards (i.e., repetitive standards) were included (the first standard after a deviant event might be considered to be a deviant itself in an oddball paradigm, since the deviant also forms a trace to be compared with, but due to its rarity the trace is not reinforced; Näätänen and Winkler, 1999). The number of deviants and standards to be compared in the individual recordings was equalized as much as possible by selecting random segments amongst standard events (the allowed difference criterion between the number of deviants and standards was four segments). For most of the recordings, the percentage of random segments was between 16 and 22. Since we did not allow bad intervals, there were also recordings where the random segments percentage was 24, 26, 32, and 58; for five recordings we had to allow bad intervals to get enough standards for comparison. As a result, the mean number of standard events included in the analyses was 124. The selected responses for deviant and standard events were averaged for each subject. In the resulting waveforms, mean amplitude values were calculated for each 25 ms latency window in the 100–400 ms post-stimulus time range for each subject. Difference waveforms (vMMN) were calculated for both recordings of each subject ("Ignore" and "Attend" condition) individually by subtracting the ERP waveform of the standard event from the ERP waveform of the deviant event. In the resulting vMMN waveforms, mean amplitude values were calculated on the same basis as described above. One-way and repeated-measures analyses of variance (ANOVA), paired *t*-tests for dependent samples and single-sample *t*-tests were used for statistical analyses; the normality of residuals was tested for each comparison.
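The event selection and amplitude measures described above can be illustrated with a minimal sketch. It is not the Brain Vision Analyzer pipeline used in the study; the function names are hypothetical, the first event of a sequence is skipped because it has no predecessor (an assumption), and a 1 ms sampling grid is used for simplicity rather than the 1024 Hz recording rate:

```python
from statistics import fmean

def eligible_indices(is_deviant):
    """Split event indices into usable deviants and usable standards.

    Deviants directly following a deviant are dropped, and only standards
    preceded by a standard are kept, so in both cases the predecessor
    must be a standard.
    """
    deviants, standards = [], []
    for i in range(1, len(is_deviant)):
        if is_deviant[i - 1]:
            continue  # predecessor was a deviant: exclude this event
        (deviants if is_deviant[i] else standards).append(i)
    return deviants, standards

def difference_wave(deviant_erp, standard_erp):
    """Deviant-minus-standard waveform, sample by sample."""
    return [d - s for d, s in zip(deviant_erp, standard_erp)]

def window_means(erp, times_ms, start=100, stop=400, width=25):
    """Mean amplitude in consecutive windows [lo, lo + width) from start to stop."""
    means = []
    for lo in range(start, stop, width):
        samples = [v for v, t in zip(erp, times_ms) if lo <= t < lo + width]
        means.append(fmean(samples))
    return means

# Toy oddball sequence: 1 marks a deviant, 0 a standard
sequence = [0, 0, 1, 1, 0, 0, 0, 1, 0]
deviant_idx, standard_idx = eligible_indices(sequence)

# Toy epoch time axis: 700 ms including a 100 ms pre-stimulus period
times = list(range(-100, 600))
toy_erp = [float(t) for t in times]  # placeholder waveform, not real data
amplitudes = window_means(toy_erp, times)
```

With the default arguments, `window_means` returns twelve values, one per 25 ms window between 100 and 400 ms, which corresponds to the latency windows entered into the statistical tests.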

To check whether a frontal vMMN was absent [as shown for motion stimuli, for example, by Pazo-Alvarez et al. (2004a)], we pooled together the electrodes from the frontal area (AF3, AF4, F3, F4, and Fz) [there were no hemispheric differences: in the "Ignore" condition *F*(22, 934) = 0.31, *p* = 0.99; in the "Attend" condition *F*(22, 934) = 1.44, *p* = 0.09] and compared the mean amplitudes of standard and deviant waveforms in all latency windows for the "Ignore" and "Attend" conditions. There were no significant differences except in three latency windows in the "Attend" condition [*t*(39) = −2.07, *p* = 0.046 for 225–250 ms; *t*(39) = −2.31, *p* = 0.03 for 350–375 ms; *t*(39) = −3.46, *p* < 0.01 for 375–400 ms latency], the difference wave being positive (as seen in **Figure 2**) and probably reflecting an attention-related P3 component.

To pool the single electrodes together based on their location, we first checked for hemispheric differences in mean vMMN amplitudes for all latency windows in parietal left vs. right regions and found none [in the "Ignore" condition *F*(22, 934) = 0.56, *p* = 0.95; in the "Attend" condition *F*(22, 934) = 1.3, *p* = 0.16]; therefore, we pooled all the electrodes in parietal areas together. The electrodes from the occipital area of interest were also pooled together. The following two areas were formed for further analyses: Occipital (comprised of O1, O2, and Oz electrodes) and Parietal (comprised of P3, P4, P7, P8, PO3, PO4, and Pz electrodes). Focus on the parietal and occipital scalp areas is supported by previous results (e.g., Pazo-Alvarez et al., 2004a) showing reliable vMMNs for moving stimuli at those locations.

#### **BEHAVIORAL DATA RECORDING AND ANALYSES**

For the purposes of within-subjects comparisons, we excluded from the behavioral analyses the data of the same nine subjects who were excluded from the final EEG analyses. The subjects' reactions (the button presses) were recorded online in ms. For the "Attend" condition, the reactions were also classified as either correct or incorrect (depending on whether the subject had estimated correctly if target and background area were moving in the same or opposite direction) in the offline analyses. Very fast (<100 ms) and slow (>1000 ms) reactions were excluded from the analyses. To be sure the subjects were participating actively and directing or not directing their attention to the background (depending on the task at hand), we first calculated the hit rates based on target motion trials and subjects' answers. Since the question of interest is how the deviant motion in the background affects reactions to the primary task, we included only those trials in the further analyses where both areas (target and background) had been moving together for at least 100 ms and excluded the ones where either or both of the areas were not moving. The differences between RTs were compared by one-way and factorial ANOVA; the normality of residuals was tested for each comparison.
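The reaction-time exclusion rule and the hit-rate calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `trim_rts` and `hit_rate` are hypothetical, the boundary values 100 ms and 1000 ms are treated as retained, and the example values are invented rather than taken from the study.

```python
from statistics import mean

def trim_rts(rts_ms, fastest_ms=100.0, slowest_ms=1000.0):
    """Drop responses faster than fastest_ms or slower than slowest_ms."""
    return [rt for rt in rts_ms if fastest_ms <= rt <= slowest_ms]

def hit_rate(n_responses, n_target_trials):
    """Percentage of target trials that received a response."""
    return 100.0 * n_responses / n_target_trials

# Hypothetical reaction times in ms (not data from the study).
example_rts = [80.0, 250.0, 300.0, 1200.0, 100.0, 1000.0]
kept = trim_rts(example_rts)   # the 80 ms and 1200 ms responses are removed
mean_rt = mean(kept)
```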

# **RESULTS**

# **BEHAVIORAL DATA**

Subjects detected the motion onset of a central target (as indicated by button presses) during 79.6% of all trials in the "Ignore" condition and gave direction estimations on 70.4% of trials in the "Attend" condition. After including only the trials where target and background areas were both moving, mean reaction time (RT) for the "Ignore" condition was 265.2 (*SD* = 116.2) ms. RTs to target motion onset did not differ during standard and deviant background motion: *F*(1, 1241) = 0.78, *p* = 0.38 for 266.5 (*SD* = 115.2) ms and 258.5 (*SD* = 121.5) ms, respectively. In the "Attend" condition, mean RT was 279.2 (*SD* = 131.9) ms, which differed from the mean RT in the "Ignore" condition [*F*(1, 2928) = 8.92, *p* = 0.003]. This is expected, since RT increases with the number of response alternatives (Teichner and Krebs, 1974). In the "Attend" condition, there was a significant difference between the RTs in correct vs. incorrect direction estimations [*F*(1, 1683) = 5.54, *p* = 0.02]. A closer look shows that this difference arises from the trials with deviant motion direction in the background. During standard stimuli, RTs for correct and incorrect answers did not differ: *F*(1, 1441) = 0.46, *p* = 0.50, for 283.6 (*SD* = 136.6) ms and 277.7 (*SD* = 130) ms, respectively. During deviant stimuli, RTs were significantly shorter for incorrect direction estimations [*F*(1, 242) = 5.04, *p* = 0.03], mean RT for the correct answers being 295.8 (*SD* = 139.3) ms and for the incorrect answers 255.3 (*SD* = 125) ms.

# **EEG DATA**

Deviant waveforms in Parietal and Occipital areas are more negative than standard waveforms in both experimental conditions (**Figure 2**). Mean amplitudes of standard and deviant waveforms in both areas of interest were compared (repeated measures ANOVA, Benjamini-Hochberg correction applied). The results (**Tables 1**, **2**, **Figure 2**) show significant differences in early latency windows in both areas only in the "Ignore" condition. The highest vMMN mean amplitude emerges in the 125–150 ms latency range in the Occipital area and in the 150–175 ms time window in the Parietal area. Significant vMMN amplitudes in later time windows are present in both areas in the "Attend" condition starting from around 275 ms and in the Occipital area in the "Ignore" condition starting from 250 ms. Comparisons (repeated measures ANOVA, Benjamini-Hochberg correction) between "Ignore" and "Attend" conditions in both areas and all time windows separately did not show statistically significant differences, although in the 150–175 ms latency range the difference approached significance in both Occipital [*F*(1, 39) = 3.2, *p* = 0.09] and Parietal [*F*(1, 39) = 3.03, *p* = 0.09] areas. An analogous tendency was seen in the 300–325 ms latency range in the Parietal area [*F*(1, 39) = 3.04, *p* = 0.09].

*\*Marked probabilities are significant after Benjamini-Hochberg correction allowing for 5% false positives.*
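For readers unfamiliar with the Benjamini-Hochberg correction mentioned above, the following Python sketch shows the standard step-up procedure at a 5% false discovery rate. It is a generic illustration, not the authors' analysis script; the function name `benjamini_hochberg` and the example *p*-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test survives BH correction.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * q, and declare the k smallest p-values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            significant[idx] = True
    return significant

# Hypothetical p-values for four latency windows (not results from the study).
flags_a = benjamini_hochberg([0.001, 0.02, 0.04, 0.5])
flags_b = benjamini_hochberg([0.03, 0.02, 0.035, 0.9])
```

The second example illustrates the step-up property: the two smallest *p*-values miss their own rank thresholds, yet they are still marked significant because the third-ranked value falls below its threshold.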

# **DISCUSSION**

It is common to stress that our very survival depends critically on being able to perceive the movement of significant objects (e.g., falling tree, running predator etc.) that are approaching us or have otherwise been set in motion by an action or some force. Considering the importance of motion perception, it is not surprising that the visual system is particularly sensitive to it (Palmer, 1999) by developing specialized neurological mechanisms tuned to the fast detection of motion (e.g., Newsome and Paré, 1988). Neurons selective to motion direction that are found in higher levels (layer 4B) of the magnocellular pathway are known for their fast temporal resolution (Livingstone and Hubel, 1988). Also, there is evidence of a pre-attentive, automatic change detection mechanism sensitive to motion direction in the human visual system (e.g., Pazo-Alvarez et al., 2004a). Given that, it is not surprising that there was a stronger deflection in response to an unexpected direction of motion (relative to the regularly directed motion) in unattended than attended situation, the main difference being the emergence of an early vMMN component in the "Ignore" condition that was missing in the "Attend" condition. It is important to note that the difference in standard and deviant stimuli was defined by the direction of motion, not by any other physical attribute of the stimuli. What is surprising is that although deviant and standard stimuli are both quickly detected by our brain, the difference between them is, for some reason, quickly (i.e., during the first couple of hundred ms) processed only during the "Ignore" condition. This is unexpected in the light of previous research (Wei et al., 2002) showing two vMMN components in the attended and an earlier negativity only in unattended condition (but see also Maekawa et al., 2005, who report 2 vMMN components emerging in unattended conditions, although they did not have an attended condition to compare with). 
It is also well known from studies in the auditory modality that MMN should be similarly elicited when subjects direct their attention away from or toward the standard and deviant stimuli (for an overview, see Näätänen et al., 2007). Our puzzling result may be caused by an unknown artifact whose origin is difficult to trace. However, it is also possible that the results reflect a principal difference between auditory and visual processing. Compared to auditory MMN, it took approximately two decades to establish the mere existence of vMMN, and one of the probable reasons is a difference between auditory and visual attention. The fact that an early vMMN is not seen in the "Attend" condition might reflect the executive attention process in the visual modality. Schröger (1997) has suggested that attention affects the encoding of the available sensory information, so it seems possible that when the features of standard and deviant stimuli (i.e., motion direction) are actively processed for conducting a difficult primary task (as was the case in our experiment), the visual top-down attention might suppress the automatic change-detection mechanism responsible for the emergence of vMMN (although there are opposite results, e.g., Kimura et al., 2010, showing vMMN only under attention).

**Table 2 | Mean amplitudes of standard, deviant and difference (vMMN) waveforms and repeated measures ANOVA results showing the comparison of standard and deviant mean amplitude for each latency window and condition in Parietal area for 40 subjects.**

*\*Marked probabilities are significant after Benjamini-Hochberg correction allowing for 5% false positives.*

It has been argued (see Czigler et al., 2002; Kimura et al., 2009; Kimura, 2012), that the difference between standard and deviant events near the latency range associated with N1 or the early detection could be mainly due to stimulus-specific refractoriness and not reflect a "genuine" mismatch between stimuli. In other words, because of the different probability of standards and deviants (in our study 85 and 15%, respectively) the level of habituation for afferent neuronal populations responding to differential features of either stimulus (horizontal motion direction in our study) is different and early ERP amplitudes related to deviant stimuli could be larger than for standard stimuli. We can easily eliminate the refractoriness-hypothesis, because exactly the same stimulus configuration and probabilities of stimulus types are used in both attention conditions and there is no significant difference in early processing of standards vs. deviants in the "Attend" condition. Also, Kimura (2012) has suggested that for separating N1 ERP component from the "genuine" vMMN the latter has to be outside the range of a usual N1 peak. The early posterior negativity visible in vMMN waveform in the "Ignore" condition of the current study has the highest mean amplitude between 150–200 ms in Parietal and 125–175 ms in Occipital locations. For motion onset of complex stimulus displays the N1 peak has been found below 150 ms (Kremláček et al., 2004) and Kremláček et al. (2006) report an even larger negative component around 110 ms in a vMMN-eliciting paradigm that is probably N1 (they see differences between standard and deviant stimuli that are interpreted as vMMN starting from 145 ms). Based on these findings we can assume that the early significant difference between standard and deviant responses in the "Ignore" condition (as shown in **Figure 2** and **Tables 1**, **2**) is in concordance with the features of vMMN.

In addition, we see a second negative-going difference between standard and deviant events starting from around 250 and 275 ms in both posterior areas in both conditions (although it did not yield statistical significance in Parietal area in "Ignore" condition). This difference waveform has two amplitude peaks in the "Ignore" condition, first one in the N2 time range that has been reported by some researchers to be a "genuine" vMMN (e.g., Czigler et al., 2006; Kimura et al., 2009). In the "Attend" condition, we see a more continuous negative waveform, which would suggest the difference in the N2 time range as well as already in the P3 time range (visible in the deviant and standard waveforms), the latter reflecting task-related activity (Näätänen and Winkler, 1999). We see again that the component associated with automatic deviance detection (here in the N2 latency range) is better separated from latter activity in the "Ignore" condition, which is in concordance with the notion of an attenuated MMN response under focused attention (Näätänen et al., 2007).

When looking at the behavioral results, we see that in the "Ignore" condition there is no difference between participants' reaction times during standard or deviant background motion. We have shown this independence of target motion onset detection from background motion for the same stimulus configuration in our previous paper (Kuldkepp et al., 2011). Interestingly, although the effect of background motion is not visible in behavioral responses, it is evident in the ERP results, meaning that events that do not manifest themselves in our behavior can nevertheless be noticed and registered by our brain. Hence, we have shown that the discrimination of changes in the unattended visual field is possible for complex visual stimuli.

In the "Attend" condition, we see a somewhat surprising result, namely that in case of incorrect direction estimations RTs are significantly shorter if there is a deviant event on the background. The result that a deviant event facilitates incorrect answers (i.e., subjects make more mistakes) has been shown before (Escera and Corral, 2007). But the result of shorter RTs contradicts many of the previous findings showing prolonged behavioral responses in case of task-irrelevant deviant or novel events (for visual modality see for example Czigler and Sulykos, 2010; for auditory-visual cross-modal paradigm Bendixen et al., 2010; for an overview Escera and Corral, 2007). On the other hand, there are studies showing facilitation effects on performance in case of novel or deviant events on some occasions, for example when the rare events carry ecological importance or some informational content [see Wetzel et al. (2012) and SanMiguel et al. (2010) for auditory-visual paradigms]. One explanation to such results is the enhancement of arousal by stimuli that are motivationally significant, which in turn improves performance or readiness to respond. This notion is also supported by Wetzel et al. (2012) who report the facilitation effect to be larger for (ecologically more significant) novel stimuli than artificial deviants. Chen et al. (2010) have argued that novel or deviant events might draw more attention than frequent standard events, which results in subjects being more confident about their decision and answering more quickly. This explanation is plausible with the decreased RTs, because these results are obtained in the "Attend" condition. The facilitation effect seen in our results can be partly explained by both the arousal component and the attention component of the orienting response. It still remains unclear why the deviant event facilitates only incorrect and not correct answers. 
For example, we can exclude the notion of motion direction being a motivationally significant stimulus (as suggested by studies showing cultural preferences of direction, see for example Spalek and Hammad, 2005) and affecting the performance, because there were no exogenous effects of motion direction (as stated in the Materials and Methods section). The result that deviant events facilitate incorrect direction estimations therefore needs to be further explored, because we restricted the analyses of behavioral data to only those trials where there was motion occurring in both central and background area of the display and the number of trials was quite low (although the normality of residuals was controlled).

One might ask whether we manipulated subjects' attention effectively enough. We have four arguments to support a positive answer to that question. First, the stimulus configuration was chosen based on previous behavioral results of background and target interaction (Kuldkepp et al., 2011). More specifically, we determined the configuration of central and background visual field partition, where the background motion did not affect the detection of motion onset in the central area. We consider these behavioral results to be a solid ground for designing an experiment with a primary motion detection task in the center to investigate vMMN (elicited by background motion) under ignore conditions. Our current results support this approach since there is a clear difference between "Ignore" and "Attend" conditions for vMMN in early latency windows that is not due to state of refractoriness as explained before. Second, we see a positive amplitude peak in the P3 latency range in Frontal scalp area only in the "Attend" condition, which reflects attention-specific task activity (see Pazo-Alvarez et al., 2003, for an overview of N2b-P3a complex findings in the vMMN research). Third, when we look at the number of target trials and the number of subjects' manual responses, we see a high percentage of answered events in both conditions, which suggests that the subjects were actively participating in the task given to them. For example, in the "Attend" condition the task was to estimate if the target and background areas are moving in the same or opposite direction, but due to different time intervals there could have been a situation when the background was stationary during target motion onset. Taking this into consideration, the 70.4% answer rate is very high for such a difficult task. Fourth, we see that the mean RT in the "Ignore" condition is in an expected range for a motion onset detection task.
For the same stimulus size and velocity the mean RT was 277.9 (*SD* = 74.9) ms in our previous study (Kuldkepp et al., 2011). This confirms that the subjects were in fact actively participating in detecting any motion onset and responding to it as quickly as possible.

In the line of research of visual motion perception and psychophysics it is rather common to use experimental paradigms which incorporate the whole visual display area (e.g., Raidvee et al., 2011; Hanada, 2012; for visual evoked potentials see Kremláček et al., 2004). Surprisingly, stimulus configurations extending over the entire display are not often reported in vMMN research (except for a stimulus configuration used in several studies by Kremláček and colleagues, see Kremláček et al., 2006; Hosák et al., 2008; Urban et al., 2008), although it would be a reasonable way of eliminating the stimulus location effects caused by discrete stimulus presentations. Importantly, this is the first study to show vMMN to motion direction changes with a display where the sequence of target events is separate from the sequence of standard and deviant events, the latter being continuous. We have therefore solved two problems that existed in previous vMMN studies using moving stimuli and have been critically raised by Czigler (2007) and Kimura (2012). First, the problem of target events appearing in the same time-sequence with standard and deviant events (e.g., Kremláček et al., 2006), and secondly, the problem of standard and deviant displays being non-continuous [e.g., separated by a blank screen like in Lorenzo-López et al. (2004)].

It has been argued (for an overview, see Kimura, 2012) that in an oddball type of MMN paradigm the more prominent processing of a deviant event could be due to its rareness. New vMMN paradigms with equiprobable stimulus presentation have been shown to be effective for controlling the state of refractoriness (see for example Czigler et al., 2006 and Kimura et al., 2009). Derived from that, future directions with continuous whole-display stimulus configurations should include more equal stimulus proportions. In the line of motion detection research this would also mean including different motion directions instead of only horizontal motion and instead of sine-wave gratings probably a random-dot display [where the orientation of elements in the stimulus display would not play a role, see for example Raidvee et al. (2011)].

In conclusion, we have proposed a stimulus configuration for studying change-detection processes in a typical optic flow pattern and for manipulating subjects' attention. We obtained two deviant-related negativities that we consider to be vMMN responses in parietal and occipital scalp locations. The first negativity has its peak around 150 ms and is evident only in the "Ignore" condition, and the second emerges in latency windows starting from 225 ms and is more evidently separated from the P3 difference again in the "Ignore" condition in occipital location. We also see that even if the deviant and standard stimulus events do not affect the behavior (as is the case in the "Ignore" condition), our brain is able to process those events automatically.

# **ACKNOWLEDGMENTS**

This research was supported by the Estonian Science Foundation (grant #8332), the Estonian Ministry of Education and Research (Institutional Research Grant IUT02-13 and SF0180029s08) and Primus grant (#3-8.2/60) from the European Social Fund to Anu Realo. The authors thank two reviewers and Dr. Piia Astikainen for helpful comments, as well as Hels Hinrikson for language corrections, Kertu Saar for help with the figures and Tiit Mogom for technical help.

### **REFERENCES**

a new crossmodal paradigm. *Neuropsychologia* 48, 2130–2139. doi: 10.1016/j.neuropsychologia.2010.04.004

attention. *J. Psychophysiol.* 21, 251–264. doi: 10.1027/0269-8803.21.34.251

*Neurophysiol.* 116, 2392–2402. doi: 10.1016/j.clinph.2005.07.006

mismatch-related brain activity. *Brain Res.* 1398, 64–71. doi: 10.1016/j.brainres.2011.05.009


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 29 July 2013; published online: 14 August 2013. Citation: Kuldkepp N, Kreegipuu K, Raidvee A, Näätänen R and Allik J (2013) Unattended and attended visual change detection of motion as indexed by event-related potentials and its behavioral correlates. Front. Hum. Neurosci. 7:476. doi: 10.3389/fnhum.2013.00476 Copyright © 2013 Kuldkepp, Kreegipuu, Raidvee, Näätänen and Allik. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Is it a face of a woman or a man? Visual mismatch negativity is sensitive to gender category

# *Krisztina Kecskés-Kovács 1,2\*, István Sulykos 1,3 and István Czigler 1,3*

*<sup>1</sup> Experimental Psychology, Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary*

*<sup>2</sup> Department of Experimental Psychology, Institute of Psychology, University of Debrecen, Debrecen, Hungary*

*<sup>3</sup> Department of Cognitive psychology, Institute of Psychology, Eötvös Loránd University, Budapest, Hungary*

#### *Edited by:*

*Piia Astikainen, University of Jyväskylä, Finland*

#### *Reviewed by:*

*George Stothart, University of Bristol, UK Kairi Kreegipuu, University of Tartu, Estonia*

#### *\*Correspondence:*

*Krisztina Kecskés-Kovács, Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Pusztaszeri str. 59-67, 1394 Budapest, PO Box 398, Hungary e-mail: kecskes.kovacs.krisztina@ttk.mta.hu*

The present study investigated whether gender information for human faces was represented by the predictive mechanism indexed by the visual mismatch negativity (vMMN) event-related brain potential (ERP). While participants performed a continuous size-change-detection task, random sequences of cropped faces were presented in the background, in an oddball setting: either various female faces were presented infrequently among various male faces, or vice versa. In Experiment 1 the inter-stimulus-interval (ISI) was 400 ms, while in Experiment 2 the ISI was 2250 ms. The ISI difference had only a small effect on the P1 component, however the subsequent negativity (N1/N170) was larger and more widely distributed at longer ISI, showing different aspects of stimulus processing. As deviant-*minus*-standard ERP difference, a parieto-occipital negativity (vMMN) emerged in the 200–500 ms latency range (∼350 ms peak latency in both experiments). We argue that regularity of gender on the photographs is automatically registered, and the violation of the gender category is reflected by the vMMN. In conclusion the results can be interpreted as evidence for the automatic activity of a predictive brain mechanism, in case of an ecologically valid category.

**Keywords: event-related potential (ERP), gender, perceptual categorization, automatic change detection, facial processing, visual mismatch negativity (vMMN), passive oddball paradigm**

### **INTRODUCTION**

In social interactions of everyday life face recognition is a fundamental function. The human perceptual system can identify categorical attributes of faces, e.g., female, male, happy, fearful, unfamiliar, familiar, etc. Research on face perception has concentrated on active, attended paradigms, even though face perception is associated with automatic processes. In our study we were interested in the automaticity of category-formation, more specifically the discrimination of female and male faces. To this end we examined this issue using event-related potentials (ERPs), particularly the visual mismatch negativity (vMMN) component. This component is sensitive to registration of environmental regularities and environmental changes even if the visual stimuli are not connected to the attended events, i.e., vMMN is an index of mismatch between the representation of the regularities and the deviant event, without the involvement of attentional processing. VMMN is considered an error signal reflecting the discrepancy between the expected and actual stimulation (Kimura, 2011; Winkler and Czigler, 2012). VMMN is usually investigated using passive oddball paradigms. In such paradigms, a frequently presented type of stimuli (standard) acquires the representation of regularity, and another, infrequently presented type of stimuli (deviant) violates this regularity. The difference between the ERPs to the deviant and standard is the vMMN component. VMMN is elicited by events physically different from the regular members of stimulus sequences (e.g., color, Czigler et al., 2002; orientation, Astikainen et al., 2008; spatial frequencies, Heslenfeld, 2003; movement, Pazo-Alvarez et al., 2004).
Results of a number of studies provide evidence that the sensitivity of vMMN is not restricted to the detection of infrequent changes of elementary features; vMMN is elicited by deviant sequential relationships (Kimura et al., 2011), and the conjunction of visual deviant features (Winkler et al., 2005). Furthermore, it has been shown that the system underlying vMMN is sensitive to perceptual categorization in the color domain (Athanasopoulos et al., 2010; Clifford et al., 2010; Mo et al., 2011), and in Gestalt organization, like vertical symmetry (Kecskés-Kovács et al., 2013), and laterality of human hands as a category (Stefanics and Czigler, 2012).

Before we introduce the main question and procedure of the current study, it is worth mentioning some potentially relevant characteristics of face recognition. The influential functional model of facial processing, developed by Bruce and Young (1986), suggested several face processing units. Among these units the structural encoding module is especially prominent in the context of the present study. This module configures the representation and description of the faces. A well-investigated ERP correlate of the structural encoder is a negative ERP component with 170 ms post-stimulus latency (N170) that reflects the neural mechanisms of face detection (for a review see Bentin et al., 1996).

Concerning the categorical aspects of facial processing, in vMMN studies so far only emotional expressions were investigated (Zhao and Li, 2006; Astikainen and Hietanen, 2009; Stefanics et al., 2012). Zhao and Li (2006), in a modified crossmodal delayed response paradigm where participants performed an acoustic tone discrimination task and ignored the face stimuli (neutral standards vs. happy or sad deviants), obtained vMMN to emotional deviants in an earlier (110–120 ms) and a later (∼300 ms) latency range. In this study the negative difference was termed expression mismatch negativity (EMMN). However, in this experiment the different facial expressions were produced by a single actor. Therefore, it is possible that the ERP difference was due to low-level feature changes. Similarly, in another study (Susac et al., 2004) the facial emotions were expressed by a single actor. The possibility of low-level visual effects was eliminated by Astikainen and Hietanen (2009). In their study emotions were presented by different actors. In this study the passive oddball paradigm and the task-related events were in the auditory modality. The ERPs to the deviant face stimuli (happy vs. fearful) were more negative in an earlier (140–160 ms) and in a later (280–320 ms) latency range. Astikainen et al. regarded the second vMMN as the relevant index of emotional change detection. The authors suggested that in the earlier latency range (140–160 ms) the deviant-related negativity was a consequence of the change of the face-related N170 ERP component.

**Abbreviations:** ANOVA, analysis of variance; ERP, event-related potential; vMMN, visual mismatch negativity; ISI, inter-stimulus-interval.

In a passive oddball paradigm Stefanics et al. (2012) introduced a visual detection task at the center of the visual field with a fixation cross. The vMMN-related face stimuli were presented parafoveally. They observed deviant-related negativities in two latency ranges (150–220 and 250–360 ms). Furthermore, they found different hemispheric lateralization for the positive and negative automatic emotional processes (fearful-right, happy-left hemisphere).

In the majority of studies face identity varied within sequences; consequently, at the level of elementary visual features different cell populations were stimulated. Therefore, stimulus variability decreased the possibility of stimulus-specific refractoriness of exogenous activity (May and Tiitinen, 2010). However, the similar latency (Astikainen and Hietanen, 2009) and the decreased effect in an equal probability control (Li et al., 2012) indicate that refractoriness of higher order processing structures specific to face processing may be involved in the early sub-component of emotion-related vMMN<sup>1</sup>. However, the later negative effect (in the 250–360 ms range) seems to be a valid genuine vMMN. On the basis of the results of vMMN studies with facial stimuli we expected a similar latency range for the gender-related deviant-*minus*-standard difference potentials. This vMMN component would reflect the higher level of the automatic change detection that is related to facial gender categories.

Gender-related categorical perception was investigated less frequently than the emotional categories. In their behavioral study, Campanella et al. (2001) examined the processes of gender perception in a delayed matching task with morphed unfamiliar face pairs. They found a morphing main effect (the participants identified gender easily if the distance of morph was large between pictures). Additionally, and more interestingly, it was easier to discriminate between-gender pairs than within-gender pairs, even if the morphing differences were identical.

In a second relevant study Mouchetant-Rostaing et al. (2000) examined three types of gender processing. In the first condition all faces were of the same gender (female or male), which prevented gender discrimination. In the second condition, both genders were presented, but gender itself was irrelevant to the participant's task. The third condition was an explicit gender discrimination task. The main finding was that gender processes are different from the structural encoding of faces (N170). A gender categorization effect was observed (in the second and third conditions) in a later epoch (200–250 ms), which might reflect more general gender categorization processes. As the ERP results suggest, the representation and encoding of gender information from the face appears to be automatic.

In the present study, we investigated whether gender category was capable of eliciting vMMN, when male faces as deviants were presented in a sequence of female faces and *vice versa*.

# **EXPERIMENT 1 MATERIALS AND METHODS**

#### *Participants*

Participants were 14 healthy adults [6 women; mean age = 21.16 years, standard deviation (*SD*) = 1.52 years]. They had normal or corrected-to-normal vision. Written informed consent was obtained from every participant before the experimental procedure. The study was conducted in accordance with the Declaration of Helsinki, and accepted by the United Committee of Ethics of the Psychology Institutes in Hungary.

#### *Stimuli*

The stimuli were 80 cropped faces with neutral expression, 40 of each gender, taken from public internet databases (www.findaface.ch; we attempted to avoid pictures with emotional expressions, i.e., the photographs were of the "college yearbook" type). Using Photoshop (CS4) software, black and white pictures were created with a specific cropping (the size of the cropping mask was 1024 × 1024 pixels, i.e., 12.9°)<sup>2</sup>.

<sup>1</sup>The term N1/N170 indicates that visual stimuli in general elicit a posterior negative exogenous component. However, this negativity is usually larger when facial stimuli are presented. Therefore, the negative peak can be considered an aggregate of the N1 component and a putative face-related activity. The equal probability control refers to a method that was developed by Schröger and Wolff (1996). Two aspects are prominent in the control design: (1) the control and deviant stimuli are presented with the same features and probabilities; (2) the control sequence has no sequential regularity rule (Schröger and Wolff, 1996; Jacobsen and Schröger, 2001).

<sup>2</sup>In a control experiment we assessed the gender-related discriminability of the stimuli. It was an active two-stimulus oddball paradigm with the infrequent stimuli as targets. The sequences were similar to the vMMN sequences; the participants (*n* = 14) were instructed to identify the rare gender category and to respond with a button press. We measured reaction time (RT) and accuracy. Before each block (the block order was randomized) the participants were informed about the current target category (female or male). According to the results, the hit rate was over 80% (mean hit rate of targets = 98.21%, *SD* = 3.61%). Median reaction time was 504.16 ms (*SD* = 65.52 ms). There was no difference between male and female target stimuli. Therefore, we concluded that the faces were valid and discriminable members of the two categories. However, it must be noted that the experimental design used did not make it possible to examine a prototype effect (good and worse members of the categories). Therefore, it is conceivable that if we had used only prototype faces, the vMMN effect would have been more prominent. In a subsequent study using morphing methods (e.g., 50% female and 50% male, or 20% male and 80% female, etc.) we could examine the sensitivity of vMMN to within-category or between-category effects. Finally, to verify that our stimuli were emotionally neutral, we ran a control behavioral experiment. It was a three-alternative forced-choice task. The participants (*n* = 14) were instructed to judge the emotional expressions of the faces on the basis of the following categories: negative facial expression (1st value), neutral facial expression (2nd value), or positive facial expression (3rd value). According to the results, the mean score of male faces was 2.04 (*SD* = 0.53) and the mean score of female faces was 2.26 (*SD* = 0.50). We consider the emotional expressions of the faces to have been balanced between the two face categories, i.e., we measured no emotion-related vMMN. Therefore, the difference between the ERPs to the deviant and standard would be a valid gender effect.

Stimulus duration was 300 ms and the inter-stimulus interval (ISI; i.e., non-stimulated interval) was 400 ms (see **Figure 1**).

The background was gray (36.67 cd/m²). The mean luminance of female faces was 54.31 cd/m² (*SE* = 2.0 cd/m²). Male faces were presented with a mean luminance of 47.78 cd/m² (*SE* = 2.0 cd/m²). Stimuli appeared on a 17" monitor (Samsung SyncMaster 740B, 60-Hz refresh rate) at a viewing distance of 1.2 m in a dimly lit and soundproof room.

The probability of frequent stimuli (standard) was 0.8 and the probability of infrequent stimuli (deviant) was 0.2. We applied two conditions (female deviant and male deviant). In one of the conditions female faces were the frequent (standard) and male faces were the infrequent (deviant) stimuli. In the other condition these probabilities were reversed. There were 600 stimuli (480 standards and 120 deviants) in a condition. The order of presentation of conditions was counterbalanced across participants. The number of consecutive standards varied pseudorandomly from two to nine. Successive stimuli were never physically identical.

#### *Task*

Participants performed a simple reaction time (RT) task. The center of the screen was the task field, which included a gray circle (0.81° with 36.67 cd/m²). The target was a dark cross (0.45 cd/m²), continuously presented at the center of the circle. The participants were instructed to detect changes of the dark cross (the cross changed at random intervals between 5 and 15 s). The cross consisted of a shorter (0.37°) and a longer line (0.75°), and responses were required for each reversal of the size of the lines. Central fixation was required, and participants were asked to respond as quickly and as correctly as possible. The participants responded to the changes by pressing a button.
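The sequence constraints given in the *Stimuli* section (480 standards and 120 deviants per condition, two to nine consecutive standards, no immediate repetition of a physically identical picture) can be illustrated with a short sketch. This is not the authors' stimulus-control script; the function names, the redistribution procedure, and the choice to end each run of standards with a deviant are assumptions made for illustration.

```python
import random


def make_run_lengths(n_deviants=120, n_standards=480, lo=2, hi=9,
                     n_swaps=5000, rng=None):
    """Return one run length of standards per deviant, each within [lo, hi]."""
    rng = rng or random.Random()
    base, extra = divmod(n_standards, n_deviants)
    if base < lo or base + (1 if extra else 0) > hi:
        raise ValueError("constraints cannot be satisfied")
    # Start from equal runs, then move single standards between runs.
    runs = [base + (1 if i < extra else 0) for i in range(n_deviants)]
    for _ in range(n_swaps):
        i, j = rng.randrange(n_deviants), rng.randrange(n_deviants)
        if i != j and runs[i] > lo and runs[j] < hi:
            runs[i] -= 1  # the total number of standards is preserved
            runs[j] += 1
    rng.shuffle(runs)
    return runs


def make_sequence(runs, n_faces=40, rng=None):
    """Build (kind, face_index) pairs; consecutive items are never identical."""
    if n_faces < 2:
        raise ValueError("need at least two exemplars per category")
    rng = rng or random.Random()
    seq = []
    for run in runs:
        for kind in ["standard"] * run + ["deviant"]:
            face = rng.randrange(n_faces)
            # Redraw if the same picture would be shown twice in a row.
            while seq and seq[-1] == (kind, face):
                face = rng.randrange(n_faces)
            seq.append((kind, face))
    return seq


if __name__ == "__main__":
    rng = random.Random(0)
    sequence = make_sequence(make_run_lengths(rng=rng), rng=rng)
    print(len(sequence), sequence[:6])
```

Starting from equal run lengths and moving single standards between runs keeps the totals fixed at 480 and 120 while still producing variable runs within the stated bounds.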

#### *EEG measuring*

EEG was recorded (DC-30 Hz, sampling rate 500 Hz; Synamps2 amplifier, NeuroScan recording system) with Ag/AgCl electrodes placed at 61 locations according to the extended 10–20 system using an elastic electrode cap (EasyCap). The right mastoid was used as the reference during recording; offline, the data were re-referenced to the average activity. Horizontal EOG was recorded with a bipolar configuration between electrodes positioned lateral to the outer canthi of the two eyes. Vertical eye movements were monitored with a bipolar montage between electrodes placed above and below the right eye. The EEG signal was band-pass filtered offline, with cutoff frequencies of 0.1 and 30 Hz (24 dB slope). Epochs of 800 ms duration (including a 100 ms pre-stimulus interval) were extracted for each event and averaged separately for standard and deviant stimuli (from the two conditions, female and male deviants). The mean voltage during the 100 ms pre-stimulus interval was used as the baseline for amplitude measurements, and epochs with an amplitude change exceeding ±70 μV on any channel were rejected from further analysis.
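The epoching steps described above (800 ms epochs with a 100 ms pre-stimulus baseline at 500 Hz, rejection at ±70 μV) can be sketched in plain Python. The function names are hypothetical, and treating "amplitude change" as the absolute baseline-corrected voltage is an assumption; the recording software may apply a different criterion.

```python
def extract_epoch(signal, onset, sfreq=500, tmin_ms=-100, tmax_ms=700):
    """Cut one baseline-corrected epoch from a single-channel signal.

    `signal` is a list of voltages in microvolts and `onset` is the sample
    index of stimulus onset. Returns None if the epoch does not fit.
    """
    pre = int(round(-tmin_ms * sfreq / 1000))   # 50 samples = 100 ms
    post = int(round(tmax_ms * sfreq / 1000))   # 350 samples = 700 ms
    start, stop = onset - pre, onset + post
    if start < 0 or stop > len(signal):
        return None
    epoch = signal[start:stop]
    baseline = sum(epoch[:pre]) / pre           # mean pre-stimulus voltage
    return [v - baseline for v in epoch]


def is_clean(epoch_channels, threshold_uv=70.0):
    """True if no sample on any channel exceeds the rejection threshold.

    Assumption: the criterion is applied to absolute baseline-corrected
    values; `epoch_channels` holds one epoch (list of floats) per channel.
    """
    return all(abs(v) <= threshold_uv
               for channel in epoch_channels for v in channel)


if __name__ == "__main__":
    flat = [5.0] * 1000
    epoch = extract_epoch(flat, onset=200)
    print(len(epoch), is_clean([epoch]))
```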

Only responses from the third to ninth standard after a deviant were included in the standard-related ERPs. To identify change-related activities, ERPs from standard stimuli were subtracted from ERPs from deviant stimuli of the respective condition.
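A minimal sketch of the selection rule and the deviant-*minus*-standard subtraction follows. The names are hypothetical, and excluding standards that precede the first deviant of a sequence is an assumption, since the text does not specify how they were handled.

```python
def select_standards(labels, first=3, last=9):
    """Indices of standards in positions first..last after a deviant."""
    keep = []
    since_deviant = None  # None until the first deviant has occurred
    for i, label in enumerate(labels):
        if label == "deviant":
            since_deviant = 0
        elif since_deviant is not None:
            since_deviant += 1
            if first <= since_deviant <= last:
                keep.append(i)
    return keep


def average(epochs):
    """Point-by-point mean of equally long epochs (lists of floats)."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def difference_wave(deviant_epochs, standard_epochs):
    """Deviant-minus-standard difference potential."""
    deviant_erp = average(deviant_epochs)
    standard_erp = average(standard_epochs)
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


if __name__ == "__main__":
    labels = ["deviant"] + ["standard"] * 5 + ["deviant"] + ["standard"] * 2
    print(select_standards(labels))
```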

#### *Analyses and comparisons*

As the results of the majority of vMMN studies suggest, we expected the deviant-*minus*-standard difference to emerge over the posterior electrode locations. However, to reinforce this expectation, we defined a channel matrix on the basis of the results of point-by-point *t*-tests (criterion: minimum 10 consecutive significant data points, i.e., 20 ms; see e.g., Guthrie and Buchwald, 1991) applied across all scalp locations. The largest significant difference (deviant-*minus*-standard) appeared on a matrix of ten electrodes (P7, PO3, POz, PO4, P8, PO7, O1, Oz, O2, and PO8). This matrix consisted of two rows (factor of anteriority: anterior and posterior) and five columns (factor of laterality: left, left-middle, middle, right-middle, and right).
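The consecutive-points criterion can be sketched as follows. The paired *t* statistic is computed per time point across participants, and a run counts only if at least 10 successive points exceed a critical value. The function names are hypothetical; the critical value of 2.160 (two-tailed α = 0.05, df = 13) is an assumption, because the text does not state the significance level of the point-by-point tests.

```python
import math


def paired_t(deviant, standard):
    """Paired t statistic for one time point (one value per participant)."""
    diffs = [d - s for d, s in zip(deviant, standard)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((x - mean) ** 2 for x in diffs) / (n - 1)
    if var == 0:
        if mean == 0:
            return 0.0
        return math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def significant_runs(t_values, t_crit, min_len=10):
    """(first, last) sample indices of runs with |t| > t_crit of min_len+."""
    runs = []
    start = None
    # A trailing 0.0 acts as a sentinel that closes a run at the end.
    for i, t in enumerate(list(t_values) + [0.0]):
        if abs(t) > t_crit:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


if __name__ == "__main__":
    T_CRIT_DF13 = 2.160  # assumed: two-tailed alpha = 0.05, 14 participants
    t_series = [0.0] * 5 + [3.0] * 12 + [0.0] * 3 + [3.0] * 4
    print(significant_runs(t_series, T_CRIT_DF13))
```

At a 500 Hz sampling rate each sample spans 2 ms, so `min_len=10` corresponds to the 20 ms stated in the text.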

On the basis of previous face-related vMMN (e.g., Stefanics et al., 2012; face stimuli elicited vMMN-related negativity in the 150–360 ms latency range), difference potentials were identified as vMMN from grand-average waveforms in the 202–498 ms range. In vMMN-related analyses of variance (ANOVAs) the mean amplitude values of this epoch were used.
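Computing the mean amplitude of a latency window amounts to converting latencies to sample indices and averaging. The sketch below assumes the epoch layout described in the EEG section (500 Hz, epoch starting 100 ms before stimulus onset); the function names are hypothetical.

```python
def ms_to_sample(t_ms, sfreq=500, tmin_ms=-100):
    """Sample index within an epoch for a latency relative to onset."""
    return int(round((t_ms - tmin_ms) * sfreq / 1000))


def mean_amplitude(erp, start_ms, end_ms, sfreq=500, tmin_ms=-100):
    """Mean voltage of `erp` between two latencies (both inclusive)."""
    first = ms_to_sample(start_ms, sfreq, tmin_ms)
    last = ms_to_sample(end_ms, sfreq, tmin_ms)
    window = erp[first:last + 1]
    return sum(window) / len(window)


if __name__ == "__main__":
    erp = [0.0] * 400                 # one 800 ms epoch at 500 Hz
    erp[151:300] = [-2.0] * 149       # toy negativity in the vMMN window
    print(mean_amplitude(erp, 202, 498))
```

The same helper applies to the ±20 ms windows around the P1 and N1/N170 peaks (for example, 126–166 ms for a peak at 146 ms).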

Three-Way ANOVAs were conducted with factors of *Stimulus Type* (standard and deviant), *Anteriority* (anterior and posterior), and *Laterality* (left, left-middle, middle, right-middle, and right). Amplitude and peak latency values of the P1 and N1/N170 components were analyzed in similar ANOVAs. However, faces elicit a more negative response at lateral occipital electrode locations, especially over the right hemisphere (especially at the PO8 electrode, and at PO7, P7, and P8; see Bentin et al., 1996, for a review).

When appropriate, Greenhouse-Geisser correction of the degrees of freedom was applied and the ε values are reported in the results. Effect sizes of significant effects are reported as partial eta-squared. Furthermore, significant interactions were further specified by Tukey HSD *post-hoc* tests. Surface distributions were compared using vector-scaled amplitude values (McCarthy and Wood, 1985). Additionally, we calculated the mean amplitude value of two exogenous components (P1 and N1/N170) in a ±20 ms epoch around the peak amplitude value of the group average. Moreover, rare deviant responses included both types (female and male) of visual events violating sequential regularities.
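The effect sizes and ε values reported in the results can be related to the *F* statistics as follows. This is a sketch assuming the standard relation between *F* and partial eta-squared (the text does not state the formula that was used); the worked numbers take the *Laterality* effect reported for the N1/N170 range in Experiment 1, *F*(4, 52) = 14.84 with ε = 0.48, as an example.

$$
\eta_p^2 = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}} = \frac{14.84 \times 4}{14.84 \times 4 + 52} = \frac{59.36}{111.36} \approx 0.53
$$

$$
df'_{\text{effect}} = \varepsilon \cdot df_{\text{effect}} = 0.48 \times 4 = 1.92, \qquad df'_{\text{error}} = \varepsilon \cdot df_{\text{error}} = 0.48 \times 52 = 24.96
$$

The second line shows how the Greenhouse-Geisser ε scales both degrees of freedom before the *p* value is evaluated; the text reports the uncorrected degrees of freedom together with ε.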

# **RESULTS**

#### *Behavioral results*

The participants performed the primary task with hit rates over 80% (mean hit rate = 95.30%, *SD* = 5.18%). The median RT was 485.5 ms (*SD* = 125.00 ms). There was no difference in performance between the conditions.

#### *Event-related potentials*

**Figure 2** shows the ERPs to deviant and standard stimuli, and the deviant-*minus*-standard difference potentials. As **Figure 2** shows, stimuli elicited a large positive component within the 126–166 ms range (P1) with amplitude maximum at the PO8 channel location (146 ms). We obtained no P1 amplitude and latency difference for frequent and infrequent stimuli. The P1 was followed by a small negative component in the 180–220 ms latency range (N1/N170), and a long-lasting positivity in the 202–498 ms range. **Figure 3** (upper panel) shows the topographic maps of exogenous components to standard stimuli and the surface distribution of the deviant-*minus*-standard difference potentials in the 202–498 ms range. Furthermore, **Table 1** shows the peak amplitude values of the P1 and N1/N170 components and the largest negative values of the difference potentials.

As **Figure 2** shows, in the 202–498 ms latency range the ERP to deviants was more negative than the ERP to standards.

On the basis of *t*-tests we obtained a significant deviant-standard difference within the 160–498 ms latency range. Due to the similarity of the earlier part of this range to the latency range of a negative epoch of the ERPs (N1/N170) and the dissimilarity of the later deviant-related negativity to the long-lasting ERP positivity, we conducted separate ANOVAs for earlier and later effects.

As for the P1 component, amplitudes (mean of the 126–166 ms range) were compared between deviant and standard stimuli. We obtained no significant difference. However, in the N1/N170 range (180–220 ms) the negativity was larger to deviant stimuli than to standards. In the Three-Way ANOVA the *Stimulus Type* and *Laterality* main effects were significant [*F*(1, 13) = 9.75, *p* < 0.01, η² = 0.42 and *F*(4, 52) = 14.84, *p* < 0.001, η² = 0.53, ε = 0.48, respectively]. These significant effects indicate larger negative responses to deviant compared to standard gender face stimuli, and both types of stimuli elicited larger responses at the lateral electrode locations (PO7, P7 and PO8, P8) than at the midline. Finally, the *Anteriority* × *Laterality* interaction [*F*(4, 52) = 14.03, *p* < 0.001, η² = 0.51, ε = 0.48] was due to the larger negativity over the posterior locations at the extreme lateral locations. The latencies of P1 to standard and deviant stimuli were almost identical within the electrode matrix.

**components (P1, N1/N170, Long-lasting positivity) and the deviant-minus-standard difference potential in the 202–498 ms range.** Color represents the amplitude values.

It is possible that, instead of the emergence of an early memory mismatch effect (vMMN; Zhao and Li, 2006; Stefanics et al., 2012), the deviant-related negativity is an amplitude modulation of the N1/N170 component. For that reason we compared the surface distribution of the N1/N170 to the standard stimuli and the distribution of the early difference potential (deviant-*minus*-standard difference). To this end, vector-scaled amplitude values (McCarthy and Wood, 1985) were used in an ANOVA with factors of *Component* (standard and difference potential), *Anteriority*, and *Laterality*. In this analysis there was neither a significant main effect of *Component* [*F*(1, 13) = 0.22, *p* = 0.64, η² = 0.01] nor any significant interaction. Accordingly, we obtained no evidence of genuine mismatch activity in the earlier latency range, i.e., the early difference is an increased amplitude of the N1/N170 component.
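Vector scaling removes overall amplitude differences between two scalp distributions so that only their shapes are compared. A minimal sketch, assuming the usual form of the procedure in which each map is divided by its vector length (the root of the summed squared amplitudes across electrodes); the function name is hypothetical, and whether scaling was applied per participant or to group means is not stated in the text.

```python
import math


def vector_scale(amplitudes):
    """Scale one topographic map (one value per electrode) to unit length."""
    length = math.sqrt(sum(a * a for a in amplitudes))
    if length == 0:
        raise ValueError("cannot scale an all-zero map")
    return [a / length for a in amplitudes]


if __name__ == "__main__":
    # Two toy maps with the same shape but different overall amplitude.
    component = [-1.0, -2.0, -4.0]
    difference = [-0.5, -1.0, -2.0]
    print(vector_scale(component))
    print(vector_scale(difference))
```

After scaling, maps that differ only by a multiplicative factor become identical, so a remaining *Component* × electrode interaction would point to a genuine difference in distribution.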

The deviant-related activity (vMMN) was analyzed in the 202–498 ms latency range. All main effects were significant: *Stimulus Type* [*F*(1, 13) = 18.83, *p* < 0.001, η² = 0.59]; *Anteriority* [*F*(1, 13) = 14.86, *p* < 0.01, η² = 0.53]; and *Laterality* [*F*(4, 52) = 43.84, *p* < 0.001, η² = 0.50, ε = 0.59]. We consider the significant difference between deviant and standard as the vMMN component. Furthermore, the interaction of *Anteriority* and *Laterality* was also significant [*F*(4, 52) = 11.09, *p* < 0.001, η² = 0.46, ε = 0.73], showing that ERPs were larger over the posterior, extreme right locations.


**Table 1 | Mean amplitude values (µV) and mean latency values (ms) of the event-related potentials to standard and deviant face stimuli (Standard error of the mean in parenthesis).**

#### **DISCUSSION**

Experiment 1 demonstrated that the face stimuli elicited two ERP components in the earlier latency range (up to 220 ms). The large P1 component was insensitive to the probability of genders. Faces elicited an N1/N170 component; however, the negativity had a small amplitude. More importantly, we recorded more negative responses to rare (deviant) stimuli than to standard stimuli, although this difference (deviant-*minus*-standard) can be an amplitude modulation of the N1/N170 component.

The ISI of Experiment 1 was shorter (400 ms) than the ISI of typical studies reporting a fairly large N170. The short ISI might contribute to the attenuated exogenous activity (e.g., Czigler, 1979; Liu et al., 2010). On the one hand, considering the N170 component as an index of the structural encoding of faces (cf. Bentin et al., 1996), it is possible that the repeated presentation of the structural features characteristic of a gender (male or female) may saturate the processes underlying this component; therefore, as mentioned above, the early deviant-*minus*-standard difference effect was a manifestation of the refractoriness of the face-specific activity. On the other hand, in this experiment genders were effectively discriminated, and considering that structural encoding is a necessary stage of such discrimination (the N170 component being an index of structural encoding processes), it seems that the amplitude of the N170 component is unrelated to successful encoding. As an alternative, there is no close connection between the processes underlying the N170 component and the encoding processes necessary for gender discrimination<sup>3</sup>.

The main finding of this experiment is the long-lasting deviant-related posterior negativity to facial stimuli of the infrequent gender. The emergence of this deviant-related negativity in the later latency range cannot be explained as a refractoriness effect, because in this range exogenous activity was mainly positive. Refractoriness of positive ERP components to standards, and the lack of refractoriness to deviant stimuli, should result in a positive, instead of a negative, difference potential. Therefore, the posterior negativity of the 202–498 ms range is considered a vMMN, elicited by the ecologically significant change of gender category. The findings of Experiment 1 provided further evidence that the violation of the rule "members of a particular category are presented sequentially" is automatically registered in the perceptual system.

In summary, in the present experiment, due to the short ISI, the face-related exogenous activity was unexpectedly small. In Experiment 2 we introduced a longer ISI. Besides the possibility of an enlarged N1/N170 component, we investigated the ISI effect on the vMMN component. Because this component is considered to depend on the short-term registration of sequential rules (Astikainen et al., 2008), we expected an enlarged N1/N170 and a reduction of vMMN amplitude in Experiment 2.

# **EXPERIMENT 2 MATERIALS AND METHODS**

#### *Participants*

Participants were 12 healthy adults (3 women; mean age = 21.50 years, *SD* = 1.78 years). They had normal or corrected-to-normal vision. Written informed consent was obtained from every participant before the experimental procedure. The study was conducted according to the Declaration of Helsinki, and accepted by the United Committee of Ethics of the Psychology Institutes in Hungary.

#### *Stimuli, procedure, EEG measuring, and data processing*

Experiment 2 was almost identical to Experiment 1. We manipulated only the ISI that was between 2000 and 2500 ms (mean ISI: 2250 ms; we calculated with a pseudorandom value drawn from the standard uniform distribution on the interval). The other stimulus parameters, the participant's task, EEG recording (except the online filter: DC-100 Hz) and data processing were the same as Experiment 1.
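The jittered ISI can be illustrated by scaling a draw from the standard uniform distribution onto the 2000–2500 ms interval, as the text describes. The function name is hypothetical and the presentation software actually used is not specified.

```python
import random


def draw_isi_ms(lo=2000.0, hi=2500.0, rng=None):
    """Draw one ISI in ms: lo + (hi - lo) * U, with U uniform on [0, 1)."""
    rng = rng or random
    return lo + (hi - lo) * rng.random()


if __name__ == "__main__":
    rng = random.Random(1)
    print([round(draw_isi_ms(rng=rng)) for _ in range(5)])
```

The expected value of such draws is the interval midpoint, 2250 ms, which matches the mean ISI given in the text.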

#### **RESULTS**

#### *Behavioral results*

All participants performed the primary task with hit rates over 80% [mean hit rate = 97.84% (*SD* = 5.56%)]. The median RT was 463.38 ms (*SD* = 123.16 ms). There was no difference in performance between the conditions.

#### *Event-related potential data*

**Figure 4** shows the ERPs to deviant and to standard stimuli, and the deviant-*minus*-standard difference potential. As the figure shows, in this experiment the P1 component was followed by a large N1/N170 component. The other aspects of the ERPs were similar to the ERPs in Experiment 1, since deviant stimuli elicited a long-lasting negativity within the 202–498 ms latency range. **Figure 5** shows the surface distribution of the exogenous components and the difference potential (in construction identical to that of **Figure 3**), and **Table 2** shows the amplitude values of the exogenous components and the difference potential.

<sup>3</sup>Dering et al. (2011) claim that the N170 component is sensitive to cropped face stimuli, whereas the face processes are related to an earlier positive component, to the P1.

The P1 had a wide posterior distribution, while the N1/N170 component emerged over the bilateral posterior locations. The difference potential had a restricted distribution over the posterior locations.

We applied the same statistical analyses as in Experiment 1. The deviant-*minus*-standard difference potential was statistically significant within the 120–480 ms latency range.

The P1 component was maximal at the PO8 channel location (126 ms). We obtained no significant *Stimulus Type* effect on this component. However, the *Anteriority* [*F*(1, 11) = 12.12, *p* < 0.01, η² = 0.52] and *Laterality* [*F*(4, 44) = 5.92, *p* < 0.001, η² = 0.35, ε = 0.40] main effects were significant. According to the Tukey HSD tests, the P1 component was larger at the posterior row and the P1 amplitude was larger at the midline locations (*p* < 0.01 in all cases). As for latency values, in a similar ANOVA no significant effect appeared.

Unlike in Experiment 1, we obtained no significant N1/N170 amplitude difference between the ERPs to deviants and standards. In the Three-Way ANOVA the bilateral maxima of this component are reflected by the significant *Laterality* main effect [*F*(4, 44) = 22.30, *p* < 0.001, η² = 0.67, ε = 0.50]. In addition, the *Anteriority* main effect was also significant [*F*(1, 11) = 9.96, *p* < 0.01, η² = 0.47]. According to the Tukey HSD test, the N1/N170 component had larger negative values at the bilateral posterior locations (at the P7, PO7 and P8, PO8 channels, *p* < 0.01 in all cases). The *Anteriority* × *Laterality* interaction was also significant [*F*(4, 44) = 12.45, *p* < 0.001, η² = 0.53, ε = 0.50]; this effect was due to the more negative values (for both deviants and standards) at the posterior and lateral locations. In ANOVAs on the N1/N170 latency values there were neither significant main effects nor interactions.

**Table 2 | Mean amplitude values (µV) and mean latency values (ms) of the event-related potentials to standard and deviant face stimuli (Standard error of the mean in parenthesis).**

In the 202–498 ms latency range the Three-Way ANOVA resulted in significant main effects of *Stimulus Type* [*F*(1, 11) = 7.40, *p* < 0.01, η² = 0.40] and *Laterality* [*F*(4, 44) = 17.37, *p* < 0.001, η² = 0.61, ε = 0.57]. The *Stimulus Type* main effect indicated enhanced negativity to the changes of gender category (vMMN), even when the ISI increased to 2250 ms. In addition, the *Laterality* main effect showed that the vMMN maxima were located at the PO7 and PO8 channel locations. Finally, the significant *Stimulus Type* × *Anteriority* interaction [*F*(1, 11) = 7.00, *p* < 0.05, η² = 0.38] reflected a larger vMMN at the posterior locations. Latency values of the late long-lasting components to standards and deviants were not different.

#### **DISCUSSION**

The vMMN effect of Experiment 2 replicated the results of Experiment 1, even if the ISI was longer. The amplitude of the exogenous activity (N1/N170) increased at longer ISI. As a plausible explanation, at longer ISI the refractory effect on this component dissipated, and the lack of deviant-related N1/N170 difference was due to the saturation of the amplitude, even in the case of standard stimuli.

### **COMPARISON OF THE FIRST AND SECOND EXPERIMENT'S RESULTS**

**Figure 6** compares the ERPs and difference potentials of Experiment 1 (short ISI) and Experiment 2 (long ISI) at the PO8 channel location. The figure illustrates three obvious differences: the latencies of the P1 and N1/N170 components were longer in Experiment 1, and the N1/N170 amplitude was more negative in Experiment 2. As a less evident difference, the P1 amplitude was larger in Experiment 1.

The P1 amplitude difference was investigated in an ANOVA with factors of *Experiment* (short ISI in Experiment 1 vs. long ISI in Experiment 2; between-group factor) and *Laterality* (PO7 and PO8 channels). The *Experiment* main effect was significant [*F*(1, 24) = 5.16, *p* < 0.05, η² = 0.17], showing that the amplitude values of the P1 component were indeed larger in Experiment 1.

As for the latency difference, in a similar ANOVA the main effect of *Experiment* was significant [*F*(1, 24) = 39.56, *p* < 0.001, η² = 0.62], indicating that the latency was shorter in Experiment 2. Furthermore, for the latency values the significant *Experiment* × *Laterality* interaction [*F*(1, 24) = 4.64, *p* < 0.05, η² = 0.16] shows that in Experiment 2 the P1 latency was longer over the right (PO8) side.

**FIGURE 6 | Comparison of the ERPs and difference potentials of Experiment 1 and Experiment 2.** The epoch range and mean amplitude values are presented (the shaded area marks the latency range of negative differences). We found amplitude and latency differences between the exogenous components.

The N1/N170 amplitudes and latencies were analyzed in similar ANOVAs. The amplitude values of the N1/N170 component were different between the two experiments [*F*(1, 24) = 8.61, *p* < 0.01, η² = 0.26]. Furthermore, the difference was larger over the right side, as indicated by the significant *Experiment* × *Laterality* interaction [*F*(1, 24) = 10.85, *p* < 0.005, η² = 0.31]. Latency values of the N1/N170 component were also significantly different in the two experiments [*F*(1, 24) = 10.74, *p* < 0.01, η² = 0.30], showing a shorter latency in the case of the longer ISI (Experiment 2).

Finally, we compared the vMMN scalp distributions (202–498 ms) in the two experiments at a 2 × 5 electrode matrix (P7, PO3, POz, PO4, P8, PO7, O1, Oz, O2, PO8). In this ANOVA we used vector-scaled amplitude values (McCarthy and Wood, 1985). The main effect of *Experiment* and the interactions (*Experiment* × *Anteriority*; *Experiment* × *Laterality*; *Experiment* × *Anteriority* × *Laterality*) were not significant, i.e., we obtained no significant difference between the vMMN distributions.

# **GENERAL DISCUSSION**

Concerning the exogenous components, the P1 component had a larger amplitude in Experiment 1 than in Experiment 2 (8.06 vs. 6.04 μV); that is, lengthening the ISI (400 vs. 2250 ms) did not enlarge P1, so there was no refractory effect on this component. In general, refractoriness is attributed to the decreased responsiveness of neurons to "fast" repeated input (see Näätänen and Picton, 1987, for a review). Therefore, with the longer interval between successive stimuli, larger exogenous components should have occurred. However, the characteristics of the exogenous P1 component (i.e., larger amplitude and longer latency at the short ISI) did not follow the prediction based on the refractory theory.

On the contrary, at the longer ISI (2250 ms) the human face stimuli elicited an enlarged N1/N170 component (≤5.26 μV at PO8). The posterior bilateral distribution of the negativity corresponded to findings on the face-related N170 component (e.g., Bentin et al., 1996). With the short ISI (Experiment 1) we obtained a small deviant-related difference in the N1/N170 range when infrequent deviant gender stimuli were embedded in a sequence of frequent patterns. However, this difference disappeared at the longer ISI (Experiment 2). Therefore, the small early amplitude difference, as an N1/N170 modulation in Experiment 1, has to be treated with caution. Nevertheless, category-specific refractoriness is a possible explanation, although the long ISI of the present study is beyond the interval that is sensitive to refractoriness effects (Coch et al., 2005).

In both experiments facial stimuli belonging to the infrequent gender category of a sequence elicited vMMN. The results of the present study are in line with the behavioral results of Campanella et al. (2002) showing the sensitivity of the perceptual system to gender as a category. In this study vMMN emerged as a long-lasting ERP component with an onset of ∼200 ms post stimulus, and terminated at ∼500 ms. The onset time corresponds to the results of other vMMN studies, where another facial category, emotional expression, established the sequential regularity (Zhao and Li, 2006; Astikainen and Hietanen, 2009; Li et al., 2012; Stefanics et al., 2012). The results of Susac et al. (2004) also supported the possibility of a late vMMN to facial category change without an amplitude difference in an earlier (N1/N170) latency range.

In comparison to the emotion-related vMMN, in the present study the duration of the vMMN component was unusually long. As a possible explanation for this long-lasting negativity, gender category processes might be tested on many levels and/or in several cycles of re-entrant mechanisms. In other words, we suggest that the automatic identification of the gender difference is a fairly complex process; hence the vMMN activity extended over the 200–500 ms post-stimulus interval, especially in the case of cropped faces (i.e., without the ears and hair).

As the results of Experiment 2 show, vMMN emerged even if the ISI was longer than ∼2000 ms. This finding provided ample evidence that the representation of this facial category survives several seconds. Contrary to the absence of an ISI effect on vMMN, the P1 and N1/N170 components were sensitive to the ISI, but these relationships were complex. As a function of ISI the latency of these components decreased in both cases. However, P1 amplitude decreased and N1/N170 amplitude increased at the longer ISI. The ISI effect on the N1/N170 amplitude is attributed to the refractoriness at short ISI, but at this stage we have no explanation for the other ISI-related differences. As for the N1/N170 component, Mouchetant-Rostaing et al. (2000) demonstrated that the N170 component is insensitive to the processing of gender, and supporting this finding, the present results show gender-related facial processing even in the case of a compromised N1/N170. However, the contribution of the processes underlying the P1 component to facial processing is a viable possibility (Dering et al., 2011).

At a more general level, the present results provide converging evidence about the automatic development of category-related information, and the automatic detection of events different from the predicted category. Athanasopoulos et al. (2010) obtained larger vMMN in Greek participants for two variants of blue than in British participants. In the Greek language the two variants have different labels, whereas in English there is only one. Clifford et al. (2010) obtained larger vMMN to between-category colors than to within-category ones, even if the distances in the color space were equal. Finally, Mo et al. (2011) obtained larger within category vMMN if deviants were presented to the right side (i.e., left hemisphere processing). No category-specific difference appeared to left half-field stimulation. vMMN appeared to be sensitive to symmetry as a perceptual category (Kecskés-Kovács et al., 2013) and to hand laterality (Stefanics and Czigler, 2012), and it also emerged to deviant facial emotions (Stefanics and Czigler, 2012). The question to be answered in relation to such vMMN results is whether these effects are based on the activation of a common set of physical features, or whether the stimuli activate the category code, and this code has a top-down effect on stimulus processing. In comparison to other categories, the specificity of the color domain is that the physical stimuli are continuous (visible spectrum) and the categories are products of the perceptual system. Not surprisingly, this characteristic of the color system provided a methodological possibility for investigating language-related effects of vMMN in the studies by Clifford et al. (2010) and Mo et al. (2011); the within- and between-category stimuli were at equal distances within the color space, and therefore such results are difficult to explain without the activation of a category code.
Hand laterality is a markedly different category type: it is inherently dichotomous, the distinctive features are relatively simple, and it seems to be impossible to produce continuity between the left and right hand. Gender as a perceptual characteristic of facial stimuli seems to be an "intermediate" case: the category (female and male) is obviously dichotomous, but on the basis of the present results it is challenging to decide whether vMMN resulted from the emergence of the category representation or from a set of different physical stimulus features. In this study a large set of photographs with different individual features and structural characteristics was presented; therefore the latter possibility is less probable. However, using morphing methods, it is possible to create intermediate stimuli. In further studies it would be possible to investigate the sensitivity of vMMN to within-category and between-category photographs, using similar distances on a morph scale.

In sum, vMMN components were elicited in both the first and the second experiment. Deviants elicited a posterior negativity within a comparable latency range in both experiments. The automatic detection of changes in the gender category of faces was unaffected by the ISI manipulation, as also indicated by the similar distributions of the enhanced negativities.

In conclusion, in both experiments we found robust vMMN effects, showing that vMMN is sensitive to perceptual categorization processes. Accordingly, emergence of a vMMN response to deviant gender of human faces demonstrates that posterior visual areas automatically registered the unattended gender information, and detected regularities of gender facial stimuli.

# **ACKNOWLEDGMENTS**

Supported by the Hungarian Research Fund (OTKA 104462). We thank Krisztián Samu, PhD (Budapest University of Technology and Economics, MOMEI) for technical support. We thank Orsolya Kolozsvári for helpful comments.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 04 April 2013; accepted: 16 August 2013; published online: 04 September 2013.*

*Citation: Kecskés-Kovács K, Sulykos I and Czigler I (2013) Is it a face of a woman or a man? Visual mismatch negativity is sensitive to gender category. Front. Hum. Neurosci. 7:532. doi: 10.3389/fnhum.2013.00532*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Kecskés-Kovács, Sulykos and Czigler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Event-related potentials to unattended changes in facial expressions: detection of regularity violations or encoding of emotions?

#### *Piia Astikainen<sup>1</sup>\*, Fengyu Cong<sup>2</sup>, Tapani Ristaniemi<sup>2</sup> and Jari K. Hietanen<sup>3</sup>*

*<sup>1</sup> Department of Psychology, University of Jyväskylä, Jyväskylä, Finland*

*<sup>2</sup> Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland*

*<sup>3</sup> Human Information Processing Laboratory, School of Social Sciences and Humanities, University of Tampere, Tampere, Finland*

#### *Edited by:*

*Gabor Stefanics, University of Zurich and Swiss Federal Institute of Technology Zurich, Switzerland*

#### *Reviewed by:*

*Guillaume A. Rousselet, University of Glasgow, UK Andrea Tales, University of Swansea, UK*

#### *\*Correspondence:*

*Piia Astikainen, Department of Psychology, University of Jyväskylä, Ylistönmäentie 33, building Y33, PO Box 35, 40014 Jyväskylä, Finland e-mail: piia.astikainen@jyu.fi*

Visual mismatch negativity (vMMN), a component in event-related potentials (ERPs), can be elicited when rarely presented "deviant" facial expressions violate regularity formed by repeated "standard" faces. vMMN is observed as differential ERPs elicited between the deviant and standard faces. It is not clear, however, whether differential ERPs to rare emotional faces interspersed with repeated neutral ones reflect true vMMN (i.e., detection of regularity violation) or merely encoding of the emotional content in the faces. Furthermore, a face-sensitive N170 response, which reflects structural encoding of facial features, can be modulated by emotional expressions. Because its latency and scalp topography are similar to those of vMMN, these two components are difficult to separate. We recorded ERPs to neutral, fearful, and happy faces in two different stimulus presentation conditions in adult humans. For the oddball condition group, frequently presented neutral expressions (*p* = 0.8) were rarely replaced by happy or fearful expressions (*p* = 0.1), whereas for the equiprobable condition group, fearful, happy, and neutral expressions were presented with equal probability (*p* = 0.33). Independent component analysis (ICA) revealed two prominent components in both stimulus conditions in the relevant latency range and scalp location. A component peaking at 130 ms post stimulus showed a difference in scalp topography between the oddball (bilateral) and the equiprobable (right-dominant) conditions. The other component, peaking at 170 ms post stimulus, showed no difference between the conditions. The bilateral component at the 130-ms latency in the oddball condition conforms to vMMN. Moreover, it was distinct from N170, which was modulated by the emotional expression only. The present results suggest that future studies on vMMN to facial expressions should take into account possible confounding effects caused by the differential processing of the emotional expressions as such.

**Keywords: equiprobable condition, facial expressions, independent component analysis, oddball condition, visual mismatch negativity**

# **INTRODUCTION**

Other people's facial expressions convey socially important information about other individuals' emotions and social intentions (Keltner et al., 2003). Therefore, it is not surprising that facial expressions are, among other biologically and socially significant information, processed automatically and rapidly in the brain (e.g., Adolphs, 2002; Palermo and Rhodes, 2007).

Because of its good time resolution, measurement of event-related potentials (ERPs) has been widely used in studies investigating the early stages of facial information processing. An ERP component called visual mismatch negativity (vMMN; the visual counterpart of mismatch negativity, defined originally in the auditory modality, Näätänen et al., 1978; for a review see Näätänen et al., 2010) is a feasible tool for studying automatic encoding of several types of visual stimuli, including faces. vMMN is elicited by rare ("deviant") stimuli interspersed with repeated ("standard") stimuli and is observed as a differential ERP response between these two. vMMN can be observed in conditions where the participants are instructed to ignore the visual stimuli eliciting the vMMN and attend to other visual stimuli (e.g., Stefanics et al., 2012) or auditory stimuli (e.g., Astikainen and Hietanen, 2009). In addition to changes in low-level visual features, such as orientation of a bar (e.g., Astikainen et al., 2008) or color (e.g., Czigler et al., 2002), it has also been associated with changes in complex visual features, including human hands (Stefanics and Czigler, 2012) and facial expressions (Susac et al., 2004; Zhao and Li, 2006; Astikainen and Hietanen, 2009; Chang et al., 2010; Kimura et al., 2011; Li et al., 2012; Stefanics et al., 2012).

vMMN is considered to reflect a process of detecting a mismatch between the representation of the repeated standard stimulus in transient memory and the current sensory input (Czigler et al., 2002; Astikainen et al., 2008; Kimura et al., 2009) similarly to auditory MMN (for the trace-mismatch explanation of MMN, see Näätänen, 1990). The standard stimuli can also be physically variant, but if they form sequential regularity, deviant stimuli violating this regularity elicit vMMN (Astikainen and Hietanen, 2009; Kimura et al., 2010, 2011; Stefanics et al., 2011, 2012; for a review see Kimura, 2012). For example, serially presented pictures of faces can be of different identities, but a vMMN is elicited if, say, rare fearful faces are interspersed among emotionally neutral faces, suggesting that a representation of a "neutral face" can be abstracted among several low-level features (Astikainen and Hietanen, 2009). Along the same lines, vMMN elicitation has recently been linked to the predictive coding theories (Friston, 2005), which postulate a predictive error between the neural model based on the representations of visual objects in memory and the actual perceptual input (Winkler and Czigler, 2012).

vMMN to facial expressions has been reported at different latency ranges, starting from 70 up to 360 ms post-stimulus (Susac et al., 2004; Zhao and Li, 2006; Astikainen and Hietanen, 2009; Chang et al., 2010; Kimura et al., 2011; Li et al., 2012; Stefanics et al., 2012), and sometimes multiple responses with different latencies have been reported (Astikainen and Hietanen, 2009; Chang et al., 2010; Li et al., 2012; Stefanics et al., 2012). Nevertheless, a consistent finding has been a vMMN observed around the same latency (∼130–200 ms after stimulus onset) and in the same scalp location (parieto-occipital region) as the well-known face-sensitive N170 response (Zhao and Li, 2006; Astikainen and Hietanen, 2009; Chang et al., 2010; Stefanics et al., 2012). The N170 was originally associated with the structural encoding of faces (Bentin et al., 1996), but several studies have shown its sensitivity to emotional expressions (Batty and Taylor, 2003; Eger et al., 2003; Caharel et al., 2005; Williams et al., 2006; Blau et al., 2007; Leppänen et al., 2007; Schyns et al., 2007; Japee et al., 2009; Vlamings et al., 2009; Wronka and Walentowska, 2011; for studies showing no emotional modulation of N170, see Eimer and Holmes, 2002; Eimer et al., 2003; Holmes et al., 2003, 2005; Ashley et al., 2004; Santesso et al., 2008). Because the N170 can be modulated by emotional expressions, and because its latency and scalp topography can resemble the vMMN to facial expressions, differentiating these two components is difficult. This is especially true in vMMN studies in which emotional faces have been used as deviant faces among neutral standard faces (Zhao and Li, 2006; Astikainen and Hietanen, 2009; Chang et al., 2010; Li et al., 2012) since the differential response could result from enhanced N170 responses to emotional vs. neutral faces.

Moreover, assuming that vMMN to emotional facial expressions could be separated from N170, there might be an additional confounding factor to consider. Namely, it is unclear whether such a differential response (seemingly similar to vMMN) reflects a true mismatch response, that is, a response indicating regularity violation. The other possibility is that the differential response reflects, solely or in part, varying levels of sensitivity to different facial emotions. A few recent studies have elucidated this issue. In the study by Stefanics et al. (2012), regularity violations involved rare changes in emotional expressions (infrequent fearful face among happy faces and vice versa) of constantly changing facial identities. A rarely presented facial expression elicited differential ERPs relative to the same emotion when it was presented as a frequent one (i.e., happy standard vs. happy deviant face, fearful standard vs. fearful deviant face) at 70–120 ms latency for the fearful faces, and at 170–360 ms latency covering N170 and P2 components for both the fearful and happy faces. In the study by Kimura et al. (2011), an immediate repetition of an emotional expression was presented as a deviant stimulus violating the pattern of constantly changing (fearful and happy) emotions while the participants were attending to faces wearing eyeglasses. This stimulus condition elicited responses associated with the regularity violation at relatively long latencies: ∼280 ms after the onset of the fearful faces and 350 ms after the onset of the happy faces. In both of these studies, the experimental paradigms allowed the analysis of vMMN by comparing the ERPs elicited by two identical pictures (e.g., fearful faces presented as a standard and as a deviant stimulus). Since differential ERPs, i.e., vMMN, to these physically identical pictures were found, confounding by emotional processing as such can be ruled out.

The existing findings of vMMN as an index of regularity violation in facial expression processing are, however, only from experiments which applied happy and fearful faces in the stimulus series, i.e., all the expressions used in the experiments were emotional expressions (Kimura et al., 2011; Stefanics et al., 2012). vMMN as an index of regularity violation and its possible confounding by emotional expression encoding remains an open question in the case of expressive vs. neutral faces. In our previous study (Astikainen and Hietanen, 2009), neutral standard faces of constantly changing identities were presented, and the regularity violations were rarely presented fearful or happy faces. A differential response was elicited by the rare expressions at 150–180 ms latency, but this study left open the question of functional independence between N170 and vMMN as well as the question of the emotional confounding of the vMMN response. The same holds true for a study in which regularity of "neutral expression" was violated by happy and sad faces while changing "identities" (i.e., low-level visual features) of schematic faces were used (Chang et al., 2010).

In order to reveal the process underlying the differential responses to deviant emotional vs. neutral standard faces (regularity violation vs. encoding of emotional expression) and its independence from (emotion-modulated) N170, we recorded ERPs in two different conditions presented to two groups of adult humans. For one group, happy, fearful, and neutral faces were presented in random order and with an equal probability (*p* = 0.33 for each; equiprobable condition). For the other group, fearful and happy faces were rarely (*p* = 0.1 for both) and pseudo-randomly (at least two neutral faces were presented between the emotional faces) interspersed with the neutral ones (oddball condition). In both conditions, the facial identities changed from trial to trial, requiring abstraction of the regularity in the facial expressions from among several low-level features. Independent component analysis (ICA) was applied to the data. ICA functions as a spatial filter for the ERP data, and the peak amplitudes of the components projected to the sensor space can be further used in statistical analysis.

We expected to find two separate components in the oddball condition: emotion-modulated N170 and vMMN. In the equiprobable condition, we expected to observe only emotion-modulated N170. It was also possible that no differential response would be found in the equiprobable condition if N170 was not sensitive to the emotional facial expressions in the present stimulus presentation condition.

### **METHODS**

#### **PARTICIPANTS**

Twenty native Finnish-speaking volunteers participated in the study. For half of the participants, the stimuli were presented in the oddball condition whereas the other half viewed the stimuli in the equiprobable condition (see below). Both groups comprised two male and eight female participants. In the "oddball" group, the age range was 19–35 years and mean age 23.9 years (median 23.5). In the "equiprobable" group, the age range was 20–42 years with a mean age of 24.6 years (median 21.5). All the participants were right-handed and had self-reported normal hearing and vision (corrected with eyeglasses if necessary), and no diagnosed neurological or psychiatric disorders. A written informed consent was obtained from the participants before their participation. The experiment was undertaken in accordance with the Declaration of Helsinki. The ethical committee of the University of Jyväskylä approved the research protocol.

### **PROCEDURE**

During the experiment, the participants sat in a comfortable chair in a dimly-lit room. The participants were instructed to attend to a recording of a radio play. The play was presented via a loudspeaker placed about 50 cm above the participant's head, where the volume of the recording equaled that of a normal speaking voice. Visual stimuli were presented on a computer screen (Eizo Flexscan, 17 inch CRT display) approximately one meter away from the participant. The participants were monitored during the recordings via a video camera positioned on top of the screen. The participants were asked to fix their gaze on a cross in the middle of the screen. The participants were informed that the visual stimuli would be faces, and they were instructed to concentrate on the radio play and pay no attention to the faces.

#### **STIMULI**

The visual stimuli were pictures of faces of four different models (male actors PE and JJ, female actors MF and NR) from Pictures of Facial Affect (Ekman and Friesen, 1976). Pictures of a neutral, fearful and happy expression from each model were used. The stimulus presentation was controlled with E-Prime software (Psychology Software Tools, Inc., Sharpsburg, MD, USA).

The pictures of faces, occupying an area of 4 × 5° of visual angle, were presented at fixation for 200 ms. The stimulus-onset asynchrony (SOA) was 700 ms. The faces were presented in two different conditions (a between-subjects variable). In a modified oddball condition, two different deviant stimulus types, fearful faces and happy faces, were infrequently interspersed between frequently presented neutral standard faces. Standards and deviants were presented pseudo-randomly with the restriction that no less than two standards would occur between consecutive deviants. Among the 1600 stimuli were 160 happy faces (*p* = 0.1) and 160 fearful faces (*p* = 0.1). In the equiprobable condition, the stimulus presentation was otherwise the same, except that all the expressions were presented pseudo-randomly (there were no immediate repetitions of stimuli from the same emotion category), and with equal probability (*p* = 0.33). The number of stimuli in each emotion category was the same as the number of deviants in the oddball condition; that is, 160 stimuli in each emotion category were presented. In both stimulus presentation conditions the facial identity in the pictures changed from trial to trial.
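The constraints of the oddball sequence described above can be illustrated with a short sketch. It is not the E-Prime procedure used in the study; the function name `make_oddball_sequence`, the seed, and the way the surplus standards are distributed are assumptions made only for illustration.

```python
import random


def make_oddball_sequence(n_total=1600, n_per_deviant=160, min_gap=2, seed=0):
    """Build a pseudo-random oddball sequence of expression labels.

    Hypothetical helper: every deviant is preceded by at least `min_gap`
    neutral standards, so consecutive deviants are always separated by
    at least `min_gap` standards.
    """
    rng = random.Random(seed)
    deviants = ["fearful"] * n_per_deviant + ["happy"] * n_per_deviant
    rng.shuffle(deviants)

    n_deviants = len(deviants)
    n_standards = n_total - n_deviants
    reserved = min_gap * n_deviants  # standards needed to satisfy the gap rule
    if reserved > n_standards:
        raise ValueError("not enough standards for the requested gap")

    # Spread the remaining standards at random over the slots before each
    # deviant and the slot after the last deviant.
    extra = [0] * (n_deviants + 1)
    for _ in range(n_standards - reserved):
        extra[rng.randrange(n_deviants + 1)] += 1

    sequence = []
    for slot, deviant in enumerate(deviants):
        sequence.extend(["neutral"] * (min_gap + extra[slot]))
        sequence.append(deviant)
    sequence.extend(["neutral"] * extra[n_deviants])
    return sequence
```

With the default arguments the sequence contains 1280 neutral standards (*p* = 0.8) and 160 deviants of each type (*p* = 0.1 each), matching the proportions reported above.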

#### **ELECTROENCEPHALOGRAPHY RECORDING**

Electroencephalogram was recorded with Brain Vision Recorder software (Brain Products GmbH, Munich, Germany) at Fz, F3, F4, Cz, C3, C4, Pz, P3, P4, P7, P8, Oz, O1, and O2 according to the international 10–20 system. An average reference was applied. Eye movements and blinks were measured from bipolar electrodes, one placed above the left eye and the other lateral to the right orbit. The signals from the electrodes were amplified, sampled at a rate of 1000 Hz, digitally band pass filtered from 0.05 to 100 Hz, and stored on a computer disk.

#### **DATA ANALYSIS**

The signals from the electrodes were filtered (Butterworth zero phase filter: 0.1–30 Hz, 24 dB/octave roll-off) and 700 ms stimulus-locked segments were extracted (from −100 to +600 ms). Segments with signal amplitudes outside the range of −100 to 100 μV in any recording channel, including the EOG channel, were omitted from further analysis. The segments were corrected by their baseline values (mean amplitude during the 100-ms pre-stimulus period). In the equiprobable condition, all the responses left after the artifact rejection were averaged for each participant. On average, 136 trials for the fearful (min = 107, max = 155, median = 141), 135 trials for the happy (min = 96, max = 154, median = 142), and 135 trials for the neutral expression (min = 105, max = 155, median = 139) were available. In the oddball condition, only responses to standards immediately preceding the deviants were averaged. This procedure allows the same number of segments, and thus a similar signal-to-noise ratio, for both standards and deviants. On average, the number of analyzed trials for the fearful and happy deviants and the neutral standards immediately preceding them was 138 (fear trials: min = 85, max = 160, median = 149; happy trials: min = 78, max = 158, median = 153). **Figure 1** depicts the ERPs to the happy, fearful, and neutral faces in the oddball condition and **Figure 2** to those in the equiprobable condition.
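The segment-level steps above (amplitude-based rejection, baseline correction, averaging) can be sketched as follows. This is a minimal illustration under stated assumptions, not the software pipeline used in the study: the function names and the nested-list data layout are hypothetical, and filtering and segment extraction are assumed to have been done already.

```python
def baseline_correct(samples, n_baseline):
    """Subtract the mean of the first n_baseline samples (pre-stimulus period)."""
    baseline = sum(samples[:n_baseline]) / n_baseline
    return [value - baseline for value in samples]


def is_clean(segment, limit_uv=100.0):
    """True if no sample in any channel lies outside +/- limit_uv microvolts."""
    return all(abs(value) <= limit_uv for channel in segment for value in channel)


def average_clean_segments(segments, n_baseline, limit_uv=100.0):
    """Reject, baseline-correct and average segments.

    segments: list of segments; each segment is a list of channels and each
    channel a list of samples in microvolts (assumed layout).
    Returns (per-channel average waveform, number of segments kept).
    """
    kept = [seg for seg in segments if is_clean(seg, limit_uv)]
    if not kept:
        raise ValueError("all segments were rejected")
    corrected = [[baseline_correct(ch, n_baseline) for ch in seg] for seg in kept]
    n_channels = len(corrected[0])
    n_samples = len(corrected[0][0])
    average = [
        [sum(seg[c][s] for seg in corrected) / len(corrected) for s in range(n_samples)]
        for c in range(n_channels)
    ]
    return average, len(kept)
```

At the reported sampling rate of 1000 Hz, the 100-ms pre-stimulus baseline corresponds to `n_baseline = 100` samples.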

Next, differential ERPs (expressive minus neutral face responses) were calculated separately for the fearful and happy faces. In this way, the brain activities common to the emotional and neutral faces were removed. The differential ERPs were processed by an approach including a wavelet filter and ICA. The approach and its benefits have been thoroughly described by Cong et al. (2011a,b, 2012). In this approach, ICA is applied to the averaged ERPs (see also Makeig et al., 1997; Vigario and Oja, 2008; Kalyakin et al., 2008, 2009; Cong et al., 2011a,b). This is different from the commonly used application of ICA to concatenated single-trial EEG data (for an N170 study, see e.g., Desjardins and Segalowitz, 2013). Briefly, the method was as follows. A wavelet filter was applied to the difference wave.

**FIGURE 1 | Grand-averaged ERPs in the oddball condition.** Raw ERPs to happy and fearful deviant faces and to the neutral standard faces immediately preceding them. Stimulus onset at time 0.

**FIGURE 3 | Grand-averaged differential ERPs (emotional minus neutral face).** Raw ERPs.

Ten levels were set to decompose the signal with the reverse biorthogonal wavelet of order 6.8, and the coefficients at levels 5–8 were chosen for the reconstruction. The wavelet filter was selected so that it could be assumed to remove sensor noise and frequencies irrelevant for the studied components (Cong et al., 2011b, 2012). This has been found to be advantageous in the following ICA decomposition (Cong et al., 2011a,b). **Figure 3** shows the wavelet-filtered differential responses in both stimulus presentation conditions.
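To give a rough idea of what retaining levels 5–8 means in the frequency domain, the nominal dyadic bands can be computed as below. This is only an approximation under two assumptions that the text does not state explicitly: that the decomposition was run on data at the recorded 1000 Hz sampling rate, and that the detail coefficients at level *k* nominally cover the band from fs/2^(k+1) to fs/2^k. Real wavelet filters have overlapping, not sharp, band edges.

```python
def detail_band_hz(level, fs_hz):
    """Nominal (low, high) band in Hz of the detail coefficients at a given
    level of a dyadic wavelet decomposition (idealized band edges)."""
    return fs_hz / 2 ** (level + 1), fs_hz / 2 ** level


fs = 1000.0  # sampling rate reported for the recording; assumed unchanged
bands = {level: detail_band_hz(level, fs) for level in range(5, 9)}

# Nominal pass band of a reconstruction that keeps levels 5-8 only
passband = (
    min(low for low, _ in bands.values()),
    max(high for _, high in bands.values()),
)
```

Under these assumptions the retained levels span roughly 2–31 Hz, which is broadly in line with the 0.1–30 Hz range of the preceding Butterworth filter on the high-frequency side.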

The filtered and averaged differential ERPs (responses to fearful and happy faces minus responses to neutral standard faces) were next decomposed by ICASSO software for ICA (Himberg et al., 2004). The unmixing matrix was randomly initialized 100 times, and FastICA with the hyperbolic tangent function (Hyvärinen, 1999) was run 100 times for each setting to extract 14 components each time. The 1400 components obtained in the 100 runs were then clustered into 14 groups using agglomerative hierarchical clustering with the average-linkage criterion. Finally, the centroid of each cluster was sought and was regarded as one component by ICASSO (Himberg et al., 2004). The stability of the ICA decomposition was satisfactory: the mean of the index of quality (Iq) of the 560 ICA components (2 expressions × 2 conditions × 10 participants × 14 components) was 0.90 (standard deviation = 0.10, min = 0.45, max = 0.99, and median = 0.94). This index indicates the stability of the decomposition for each ICA component. If the Iq is close to "1," multiple runs of ICA decomposition give similar results, which means that the ICA decomposition is stable. Otherwise, if the Iq is close to "0," multiple runs of ICA decomposition give very different results.
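The logic of the quality index can be sketched as the average similarity among the estimates inside a cluster minus their average similarity to estimates outside it. The code below follows our reading of the definition in Himberg et al. (2004); it is not the ICASSO implementation, and the function name and the toy similarity matrix are hypothetical.

```python
def cluster_quality_index(similarity, labels, cluster):
    """Iq-style index for one cluster.

    similarity: square matrix (list of lists) of similarities between
        component estimates, e.g., absolute correlations in [0, 1].
    labels: cluster label of each estimate.
    Returns mean intra-cluster similarity minus mean similarity between
    members of the cluster and all other estimates.
    """
    inside = [i for i, lab in enumerate(labels) if lab == cluster]
    outside = [i for i, lab in enumerate(labels) if lab != cluster]
    if not inside:
        raise ValueError("empty cluster")
    # Self-similarities are included, as in a 1/|C|^2 double sum.
    intra = sum(similarity[i][j] for i in inside for j in inside) / len(inside) ** 2
    if not outside:
        return intra
    extra = sum(similarity[i][j] for i in inside for j in outside) / (
        len(inside) * len(outside)
    )
    return intra - extra


# Toy example: four estimates forming two compact, well-separated clusters
toy_similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
toy_labels = [0, 0, 1, 1]
iq_first = cluster_quality_index(toy_similarity, toy_labels, 0)
iq_second = cluster_quality_index(toy_similarity, toy_labels, 1)
```

A compact cluster that is isolated from the others yields a value near 1, whereas a cluster whose members resemble outside estimates as much as each other yields a value near 0.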

After the estimation of the independent components by ICASSO, the desired components were chosen based on a peak latency between 100 and 200 ms and subsequent evaluation of the component's scalp topography (posterior negativity) when the component was projected back to the electrodes. This projection corrects the inherent polarity and variance indeterminacy of ICA (Makeig et al., 1999). It also allows conventional statistical analysis to be performed on peak amplitudes to reveal the experimental effects. Two components with peak latencies of ∼130 and 170 ms after stimulus onset were found for each participant in both conditions.
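The selection and back-projection steps can be sketched as below. The helper names, the use of the absolute maximum as the peak, and the list-based data layout are assumptions for illustration; the scalp-topography check described above is a separate, visual step that the sketch does not cover.

```python
def peak_latency_ms(component, times_ms):
    """Latency of the largest absolute deflection of a component time course."""
    peak_index = max(range(len(component)), key=lambda i: abs(component[i]))
    return times_ms[peak_index]


def select_components(components, times_ms, low_ms=100.0, high_ms=200.0):
    """Indices of components whose peak latency falls within [low_ms, high_ms]."""
    return [
        k
        for k, component in enumerate(components)
        if low_ms <= peak_latency_ms(component, times_ms) <= high_ms
    ]


def back_project(mixing_column, component):
    """Project one component to the electrodes (outer product).

    mixing_column: weight of the component at each electrode.
    Returns one row of samples per electrode, in the units of the data.
    """
    return [[weight * sample for sample in component] for weight in mixing_column]
```

Because the back-projection multiplies the mixing weights by the time course, flipping the sign or rescaling both in opposite directions leaves the projected waveform unchanged, which is why amplitudes measured at the electrodes are free of the polarity and variance indeterminacy.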

The peak amplitudes of the electrode-field projection of ICA components were submitted to repeated measures multivariate analysis of variance (MANOVA) with within-subjects factors of Expression (fearful vs. happy) and Electrode (Fz, F3, F4, Pz, P7, P8, Oz), and with a between-subjects factor Condition (oddball condition vs. equiprobable condition). Channel selection was based on visual inspection of the grand averaged scalp topography maps and previous findings for N170 (e.g., Blau et al., 2007) and vMMN to facial emotions (e.g., Astikainen and Hietanen, 2009; Stefanics et al., 2012). *P*-values smaller than 0.05 were considered significant. *T*-tests were two-tailed, and their test values are reported whenever the *p*-value is smaller than 0.055. Partial eta squared (η<sub>p</sub><sup>2</sup>) is reported as the effect size estimate for MANOVA.

# **RESULTS**

**Figures 1**, **2** show the raw grand-averaged ERPs for the oddball and equiprobable groups. In **Figure 3**, the grand-averaged differential responses (emotional minus neutral) for both conditions are presented. In these figures, differential responses to both emotional expressions and in both stimulus presentation conditions can be observed in the lateral parietal and occipital electrodes in the latency range under inspection (100–200 ms post stimulus).

The ICA decomposition showed two separate components at the relevant latency range for both the oddball and equiprobable conditions. The earlier, henceforth 130-ms component, had a mean latency of 134 ms, and the later, henceforth 170-ms component, had a mean latency of 165 ms in the posterior electrode sites. **Figures 4**, **5** illustrate the back-projected components in individual participants as waveforms at electrodes P7 and P8.

#### **130-MS COMPONENT**

**Figure 6** shows the scalp potential maps for the 130-ms component back-projected to the electrodes in the oddball and in the equiprobable conditions.

A MANOVA revealed a significant main effect of Electrode, *F*(6, 13) = 29.4, *p* < 0.0001, η<sub>p</sub><sup>2</sup> = 0.931, reflecting the positive polarity of the differential response in the frontal electrode sites and negative polarity in the posterior electrodes. Importantly, an Electrode × Condition interaction was found, *F*(6, 13) = 5.00, *p* = 0.007, η<sub>p</sub><sup>2</sup> = 0.698, indicating a non-homogeneous scalp distribution in ERP amplitudes between the two conditions. The other main effects or other interaction effects were nonsignificant [Expression: *F*(1, 18) = 0.45, *p* = 0.834, η<sub>p</sub><sup>2</sup> = 0.003; Expression × Condition: *F*(1, 18) = 0.39, *p* = 0.846, η<sub>p</sub><sup>2</sup> = 0.002; Electrode × Expression: *F*(6, 13) = 0.76, *p* = 0.617, η<sub>p</sub><sup>2</sup> = 0.259; Electrode × Expression × Condition: *F*(6, 13) = 1.67, *p* = 0.206, η<sub>p</sub><sup>2</sup> = 0.435].
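For readers who wish to relate the reported effect sizes to the test statistics, partial eta squared can be recovered from an exact *F* value and its degrees of freedom. The sketch below uses the standard identity for an exact *F* test; that the statistics software used in the study computed the effect size in this way is an assumption, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an exact F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of Electrode for the 130-ms component: F(6, 13) = 29.4
eta_electrode = round(partial_eta_squared(29.4, 6, 13), 3)    # 0.931
# Electrode x Condition interaction: F(6, 13) = 5.00
eta_interaction = round(partial_eta_squared(5.00, 6, 13), 3)  # 0.698
```

Both values agree with the effect sizes reported for these two effects.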

Because Expression showed no effect, subsequent *t*-tests with the mean amplitude values averaged over responses to fearful and happy faces were applied separately to data measured from each electrode in order to compare the responses between the stimulus presentation conditions. The stimulus presentation condition had a significant effect on differential responses (emotional minus neutral) at the P7 electrode, *t*(18) = 3.38, *p* = 0.011 (mean difference 0.31 μV, 95% confidence interval 0.12–0.51 μV), and at the Pz electrode, *t*(18) = 2.13, *p* = 0.047 (mean difference 0.21 μV, 95% confidence interval 0.003–0.42 μV). There was also a marginally significant effect at the Oz electrode, *t*(18) = 2.07, *p* = 0.053 (mean difference 0.27 μV, 95% confidence interval 0.04–0.55 μV). For all these electrodes (P7, Pz, Oz), the difference-wave amplitudes were larger (more negative) in the oddball condition than in the equiprobable condition (**Figure 7**). For the other electrodes, no significant differences between the conditions were observed. Amplitude values differed from zero at all electrode sites and in both stimulus presentation conditions. **Table 1** shows the *t*-values, *p*-values, mean differences, and 95% confidence intervals for the differential response amplitude values at each electrode tested against zero (one-sample *t*-test).
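The between-group comparisons above have 18 degrees of freedom, which is what a pooled-variance independent-samples *t*-test gives for two groups of ten participants. A minimal sketch of that test is shown below; that the pooled-variance form was the one used is an inference from the degrees of freedom, and the function name and the toy data in the usage note are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance independent-samples t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances, n - 1 denominator)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df
```

For example, `independent_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])` returns a *t* value of −1 with 8 degrees of freedom; with two groups of ten amplitude values it returns 18 degrees of freedom, as in the tests reported above.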


Since visual inspection of the scalp topographies suggested that the lateral parietal activity was right dominant in the equiprobable condition, while no such lateralization existed in the oddball condition, the Electrode × Condition effect was studied further. Pair-wise comparisons were applied to the amplitude values recorded at P7 and P8 separately for each condition. The statistics confirmed the visual observation, showing that, in the equiprobable condition, amplitude values were larger at the right parietal electrode site than at the left (i.e., P8 vs. P7), *t*(9) = 3.19, *p* = 0.022 (Bonferroni corrected, mean difference 0.62 μV, 95% confidence interval 0.18–1.06 μV). In the oddball condition, no such difference was found, *t*(9) = 0.48, *p* = 0.641 (mean difference 0.05 μV). **Figure 8A** depicts the lateralization index for both conditions (oddball and equiprobable). Comparison of the lateralization indexes in the oddball and equiprobable conditions indicated a significant difference between the conditions, *t*(18) = 2.36, *p* = 0.030 (mean difference 0.40 μV, 95% confidence interval 0.04–0.75 μV).

#### **170-MS COMPONENT**

**Figure 9** shows the scalp potential maps for the back-projected 170-ms component. A MANOVA indicated a main effect for electrode, *F*(6, 13) = 14.11, *p* < 0.0001, η<sub>p</sub><sup>2</sup> = 0.867. No other main effects or any of the interaction effects were significant (Expression: *F*(1, 18) = 0.35, *p* = 0.563, η<sub>p</sub><sup>2</sup> = 0.019; Electrode × Condition: *F*(6, 13) = 0.68, *p* = 0.669, η<sub>p</sub><sup>2</sup> = 0.239; Expression × Condition: *F*(1, 18) = 0.66, *p* = 0.800, η<sub>p</sub><sup>2</sup> = 0.004; Electrode × Expression: *F*(6, 13) = 0.56, *p* = 0.752, η<sub>p</sub><sup>2</sup> = 0.207; Electrode × Expression × Condition: *F*(6, 13) = 0.34, *p* = 0.901, η<sub>p</sub><sup>2</sup> = 0.137). The effect for electrode resulted from the amplitudes being of positive polarity in the anterior electrodes and of negative polarity in the posterior electrodes (**Figures 9**, **10**). Amplitude values averaged for the fearful and happy expressions differed from zero at all the electrode sites and in both stimulus presentation conditions. **Table 2** shows the *t*-values, *p*-values, mean differences, and 95% confidence intervals for the differential response amplitude values at each electrode tested against zero (one sample *t*-test).

**back-projected to the electrodes.** Map for the equiprobable condition group on left and map for the oddball condition group on right.

**FIGURE 7 | Mean amplitude values (µV), confidence intervals, and scatterplots of the individual participants' values for each electrode in the equiprobable condition (EQ) and oddball condition (OB) for the 130-ms component (differential response; emotional minus neutral face).** An asterisk (∗) indicates a significant difference (*p* < 0.05) between conditions at P7 and Pz electrodes.

*One sample t-tests (tested against 0) for the emotional minus neutral differential responses. OB, oddball condition; EQ, equiprobable condition.*

**Figure 8B** shows the lateralization index for both conditions (oddball and equiprobable). No statistically significant difference was found in the lateralization indexes between the conditions, *t*(18) = 1.06, *p* = 0.304.

# **DISCUSSION**

We presented two groups of adults with a series of pictures of faces: for one group the faces were presented in an oddball condition, for the other group the faces were presented in an equiprobable condition. Facial identities changed on a trial-by-trial basis. For the oddball condition group, most of the faces expressed neutral emotion, with rare happy and fearful faces randomly violating this regularity. For the equiprobable condition group, neutral, happy, and fearful faces were presented with equal probability and formed no regularity in the stimulus series.

Differential responses to emotional expressions (fearful–neutral and happy–neutral) were calculated, wavelet filtering was applied to the averaged data in order to increase the signal-to-noise ratio, and the independent components were extracted with the open-source ICA software ICASSO (Himberg et al., 2004). We found two separate components for the emotional faces in both the oddball and equiprobable conditions: one at a latency of ∼130 ms and the other at a latency of ∼170 ms after stimulus onset.

#### **Table 2 | 170-ms component.**

*One sample t-tests (tested against 0) for the emotional minus neutral differential responses. OB, oddball condition; EQ, equiprobable condition.*

The 170-ms component conforms to the face-sensitive N170 response. The scalp topography of the component extracted from the differential response included a lateral occipito-parietal negativity and frontal positivity, and was thus similar to the topography previously reported for N170 itself (e.g., Bentin et al., 1996; Ashley et al., 2004; Williams et al., 2006; Blau et al., 2007). Blau et al. (2007) also found that subtracting responses to task-irrelevant fearful faces from those to neutral faces provided a topography that was highly similar to the topography of N170. The frontal positivity was most likely a so-called vertex positive potential (VPP), known to be elicited by the same brain generators as N170 (Joyce and Rossion, 2005). The present ICA-based data analysis also supports this view.

As expected, the 170-ms component did not differ between the stimulus presentation conditions, but it differentiated between the emotional and neutral faces. Emotional modulation of N170 has also been reported in several other studies (Batty and Taylor, 2003; Eger et al., 2003; Caharel et al., 2005; Williams et al., 2006; Blau et al., 2007; Leppänen et al., 2007; Schyns et al., 2007; Japee et al., 2009; Vlamings et al., 2009; Wronka and Walentowska, 2011). Since there are also studies in which no N170 modulation for emotional expressions has been found (Eimer and Holmes, 2002; Eimer et al., 2003; Holmes et al., 2003, 2005; Ashley et al., 2004; Susac et al., 2004; Santesso et al., 2008), future studies should explore the factors influencing this modulation. These could be related to several methodological choices, for example, the behavioral tasks the participants are asked to perform during stimulus presentation. Recently, the location of the reference electrode has also been suggested to have an effect on the N170 modulation by emotional expression (Rellecke et al., 2013).

The finding of a 130-ms component in both conditions was unexpected. It was observed as enhanced parieto-occipital negativity and frontal positivity to the emotional faces in comparison to neutral faces in both conditions. Importantly, however, differences in topography between the conditions were observed: the topography was bilateral over the lateral parietal sites in the oddball condition while it was more right-dominant in the equiprobable condition. The bilateral posterior topography of the component conforms to vMMN to facial expressions (Astikainen and Hietanen, 2009; Stefanics et al., 2012). Also, the observed frontal positivity in the present data replicates our previous findings of the vMMN topography to emotional faces (Astikainen and Hietanen, 2009). The current data suggest that this differential response is not due to detection of the regularity violation, but more generally related to emotional processing, since it was also elicited in the equiprobable condition. Indeed, in some previous studies investigating ERPs to facial expressions, but not applying the oddball condition, a frontal positivity to emotional expressions relative to neutral ones at a latency corresponding to that observed in the present study has been found (Eimer and Holmes, 2002; Kiss and Eimer, 2008). The latency of the 130-ms component is in line with the earliest differential responses to changes in facial expressions (Susac et al., 2004, 2010; Astikainen and Hietanen, 2009; Chang et al., 2010; Stefanics et al., 2012). The fact that elicitation of the 130-ms component was also observed in the equiprobable condition suggests that it reflects both the detection of the regularity violations and encoding of the emotional information in the faces. This finding calls for appropriate control conditions, such as an equiprobable condition, in future studies of vMMN to facial expressions.

In accordance with our previous finding of vMMN in an experimental paradigm similar to the oddball condition applied here (Astikainen and Hietanen, 2009), the 130-ms component showed no difference between fearful and happy expressions in either condition (oddball or equiprobable). Also, in a study in which schematic faces were applied, sad and happy faces as rare changes among neutral standard faces elicited equally large amplitudes (Chang et al., 2010). On the other hand, a so-called "negativity bias" in response latencies has been reported in two previous vMMN studies using facial expressions (Kimura et al., 2011; Stefanics et al., 2012). In these studies, fearful faces as deviants elicited differential responses in clearly earlier latency ranges than happy faces; for example, the differential response found in the 70–120-ms latency range to fearful faces was absent for happy faces (Stefanics et al., 2012). In the present study, the fearful and happy deviants elicited the same components (130- and 170-ms components), and they were also similar in their latencies. The lack of a negativity bias in our study is not, however, in conflict with the previous findings by Kimura et al. (2011) and Stefanics et al. (2012), because we only analyzed the components in the relevant latency range for N170 (100–200 ms post stimulus). Inspection of the components extracted by ICA for the entire post-stimulus time period might have revealed emotion-specific components in the earlier and later latency ranges as well.

There are some limitations in the present study. First, the emotional expressions were not presented with the same probability in the oddball and equiprobable conditions (*p* = 0.1 and *p* = 0.33, respectively). It is thus possible that the between-condition effects were induced not solely by the differences in cognitive expectation (present only in the oddball condition, in which the high probability of the neutral standard faces formed it), but also by the differences in the probability of the emotional expressions as such. In the future, one should investigate whether the response amplitude of vMMN or N170 to emotional faces is influenced by the presentation probability within the stimulus sequence. Second, the study was conducted with a limited sample size. Future studies should aim to replicate the findings with a larger number of participants. Third, the current study is based on EEG data recorded with a montage of 14 electrodes. More sensors could have allowed, for example, estimation of the locations of sources (for a magnetoencephalography study of facial processing, see Smith et al., 2009). Finally, the present study does not reveal to which specific diagnostic features in faces the observed components are responses (see e.g., Schyns et al., 2009). However, in the present study, several different facial identities were used in the pictures and there were no immediate repetitions of them. Our results might thus reflect abstraction of emotion-related features among several changing low-level features.

In sum, we found two separate components in the 100–200-ms latency range for changes in emotional expressions. The component peaking at ∼170 ms post stimulus showed no difference between the stimulus presentation conditions and it was identified as the face-sensitive N170 response. A component peaking at 130 ms post stimulus was different in its scalp topography in the oddball and the equiprobable conditions, i.e., when the presented face violates the regularity formed by the standard faces in comparison to the condition in which no regularity is present. Future studies of vMMN to facial expressions should apply relevant control conditions to avoid the confounding effect of the encoding of emotional expressions as such.

# **ACKNOWLEDGMENTS**

The authors thank Mr. Petri Kinnunen for his help in constructing the stimulus conditions and MSc Joona Muotka for statistical advice. The study was supported by the Academy of Finland (project no. 140126).

#### **SUPPLEMENTARY MATERIALS**

A data sample and Matlab scripts for running the data processing, including wavelet filtering and ICA, can be found online at: http://www.frontiersin.org/Human\_Neuroscience/10.3389/fnhum.2013.00557/abstract

# **REFERENCES**


of rapid brain responses to six basic emotions. *Cogn. Affect. Behav. Neurosci.* 3, 97–110. doi: 10.3758/ CABN.3.2.97


318–326. doi: 10.1111/j.1469-8986. 2007.00634.x


M., Ratner, K. G., Roesch, E. B., et al. (2008). Electrophysiological correlates of spatial orienting towards angry faces: a source localization study. *Neuropsychologia* 46, 1338–1348. doi: 10.1016/j. neuropsychologia.2007.12.013


fearful facial expressions primarily mediated by coarse low spatial frequency information. *J. Vis.* 9, 1–13. doi: 10.1167/9.5.12


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 22 August 2013; published online: 11 September 2013.*

*Citation: Astikainen P, Cong F, Ristaniemi T and Hietanen JK (2013) Event-related potentials to unattended changes in facial expressions: detection of regularity violations or encoding of emotions? Front. Hum. Neurosci. 7:557. doi: 10.3389/fnhum.2013.00557*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Astikainen, Cong, Ristaniemi and Hietanen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Oblique effect in visual mismatch negativity

# *Endre Takács<sup>1,2</sup>\*, István Sulykos<sup>1,2</sup>, István Czigler<sup>1,2</sup>, Irén Barkaszi<sup>1,3</sup> and László Balázs<sup>1</sup>*

*<sup>1</sup> Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary*

*<sup>2</sup> Faculty of Education and Psychology, Eötvös Loránd University, Budapest, Hungary*

*<sup>3</sup> Department of Cognitive Psychology, Institute of Psychology, Eötvös Loránd University, Budapest, Hungary*

#### *Edited by:*

*Piia Astikainen, University of Jyväskylä, Finland*

#### *Reviewed by:*

*Helen Clery, Institut National de la Santé et de la Recherche Médicale, France Dagmar Müller, University of Leipzig, Germany*

#### *\*Correspondence:*

*Endre Takács, Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Pusztaszeri Str., 59–67, 1394 Budapest, P.O. Box 398, Hungary e-mail: takacs.endre@ttk.mta.hu*

We investigated whether visual orientation anisotropies (known as the oblique effect) exist in non-attended visual changes using event-related potentials (ERP). We recorded visual mismatch negativity (vMMN), which signals violation of sequential regularities. In the visual periphery, unattended, task-irrelevant Gábor patches were displayed in an oddball sequence while subjects performed a tracking task in the central field. A moderate change (50°) in the orientation of stimuli revealed no consistent change-related components. However, we found orientation-related differences around 170 ms in occipito-temporal areas in the amplitude of the ERPs evoked by standard stimuli. In a supplementary experiment we determined the amount of orientation difference that is needed for change detection in an active, attended paradigm. Results exhibited the classical oblique effect; subjects detected 10° deviations from cardinal directions, while the threshold for deviations from oblique directions was 17°. These results provide evidence that perception of change can be accomplished at significantly smaller orientation differences than those needed to elicit vMMN. In Experiment 2 we increased the orientation change to 90°. The deviant-minus-standard difference was negative in occipito-parietal areas, between 120 and 200 ms after stimulus onset. VMMNs to changes from cardinal angles were larger and more sustained than vMMNs evoked by changes from oblique angles. Changes from cardinal orientations represent a more detectable signal for the automatic change detection system than changes from oblique angles; thus, increased vMMN to these "larger" deviances might be considered a variant of the magnitude of deviance effect rarely observed in vMMN studies.

**Keywords: visual mismatch negativity (vMMN), event-related potential (ERP), unconscious processing, attention, oblique effect, oddball paradigm**

# **INTRODUCTION**

The oblique effect, a well-known phenomenon in visual orientation research, denotes that the nervous system is more sensitive to stimuli of cardinal (vertical and horizontal) than of oblique orientations. Various experimental methods demonstrate this anisotropy, e.g., contrast sensitivity for gratings (Campbell et al., 1966; Caelli et al., 1983), visual acuity (Berkley et al., 1975), vernier acuity (Corwin et al., 1977), setting stimuli parallel (Andrews, 1967) and reproduction of stimulus orientation (Gentaz et al., 2001).

The oblique effect most likely originates from the visual cortex (Li et al., 2003, but see Vidyasagar and Urbas, 1982). In a wide range of mammalian species, more cells in the visual cortex respond preferentially to cardinal than to oblique stimuli (Mansfield, 1974; Levitt et al., 1994; Coppola et al., 1998; Li et al., 2003; Xu et al., 2006). The fact that the oblique effect emerges even if light is projected directly onto the retina indicates that neither the optics of the eyeball nor the pupil is responsible for the effect (Campbell et al., 1966; Mitchell et al., 1967).

In humans, a larger fMRI response was registered to cardinal than to oblique stimuli in V1 (Furmanski and Engel, 2000). In electrophysiological studies, unequal responses to cardinal and oblique orientations have been obtained in steady-state potentials (Maffei and Campbell, 1970; May et al., 1979; Skrandies, 1984; Moskowitz and Sokol, 1985), in transient ERPs (Yoshida et al., 1975; Arakawa et al., 2000; Proverbio et al., 2002), and in MEG (Koelewijn et al., 2011).

Orientation anisotropies were also demonstrated in visual search. In these experiments an oblique stimulus pops out more easily among vertical stimuli than a vertical stimulus among oblique stimuli (Treisman and Gormican, 1988; Cavanagh et al., 1990). According to the interpretation by Treisman and Gormican (1988), the visual system treats vertical lines as the default, primary value, while oblique lines carry an additional feature (vertical plus a deviation from vertical). These features are perceived preattentively, without the need for individual examination of every element in the display. In contrast, the lack of a feature can only be detected by serial inspection of every stimulus, so increasing the number of distractor elements monotonically increases the reaction time. These results imply that there are essential differences between oblique and vertical orientations. It is important to note that the direction of the asymmetry switches if an aperture having the same orientation as the oblique stimuli is placed over the display, i.e., in this case the vertical stimulus pops out. However, with a round aperture, which is neutral in orientation, the oblique stimulus pops out again, demonstrating that the basis of the phenomenon is the oblique effect, but that environmental cues also play an important role. Others have pointed out the influence of vestibular and somatosensory input (Marendaz, 1998; Lipshits and McIntyre, 1999).

In the majority of papers dealing with the oblique effect, stimuli were in the focus of attention; however, the visual search anisotropy indicates that the oblique effect may also be present at pre-attentive levels. Investigation of automatic visual change detection may also support the view that the oblique effect is a fundamental phenomenon in visual perception.

Automatic, unconscious deviance detection is indicated by the auditory (MMN, for review see Näätänen and Winkler, 1999; Näätänen et al., 2007) and visual mismatch negativities (vMMN, for review see Pazo-Alvarez et al., 2003; Czigler, 2007; Kimura et al., 2011). VMMN is usually investigated in the passive oddball paradigm, where standard stimuli are infrequently replaced by deviant stimuli. VMMN can be recorded in various experimental conditions. In one subset of experiments, vMMN-related stimuli are presented in the unattended, task-irrelevant part of the visual field, while subjects are engaged in a task presented in the center of the visual field (e.g., Tales et al., 1999; Czigler et al., 2002). In another type of experiment, a single object is presented and certain features, like the shape of a line segment's end, are used for the task, while some other features, like the orientation of the line, are used for vMMN elicitation (e.g., Kimura et al., 2010a). VMMN also emerges in conditions where subjects perform a primary auditory task concurrently with unattended visual stimuli (e.g., Astikainen et al., 2004). In most cases vMMN is a negative component within the 120–400 ms latency range over posterior areas, identified in the deviant-minus-standard difference wave of ERPs. Auditory and visual MMN are considered to emerge whenever the regularity of the incoming discrete elements is automatically registered and, as a result of comparison processes, the violation of the regularity by a new event is detected (Winkler and Czigler, 2012). Upon detection of such a mismatch, MMN or vMMN emerges, reflecting a prediction error.

At least one portion of both the auditory and visual MMN originates from sensory areas of the brain. Studies aimed at localizing vMMN (Yucel et al., 2007; Kimura et al., 2010b; Urakawa et al., 2010; Müller et al., 2012) agree that it has generators in the visual cortex. Deviant-related negative components are occasionally found at anterior electrode sites (Czigler et al., 2004; Clery et al., 2012), and frontal sources have also been demonstrated with fMRI (Clery et al., 2013; Cléry et al., 2013) and LORETA (Kimura et al., 2010b). VMMN can be elicited by simple visual deviances, such as motion direction (Pazo-Alvarez et al., 2004), orientation, spatial frequency (Heslenfeld, 2003), color (Czigler et al., 2002), or shape (Maekawa et al., 2005). Studies utilizing orientation change are relatively numerous (Astikainen et al., 2004, 2008; Czigler and Pató, 2009; Flynn et al., 2009; Kimura et al., 2009, 2010a; Czigler and Sulykos, 2010; Sulykos and Czigler, 2011; Sulykos et al., 2013).

In this study we set out to investigate the possibility of orientation anisotropies in vMMN. In a series of experiments we examined whether the system underlying vMMN was more sensitive to orientation deviations from cardinal than from oblique angles. In the first experiment we used a modest change of orientation (50°). While subjects performed a visuomotor tracking task in the center of the visual field, Gábor patches with various orientations were presented in the periphery in an oddball sequence. Infrequent changes in orientation occurred between oblique and cardinal as well as between oblique and oblique orientations. Our main hypothesis was that visual deviance detection is easier if the change occurs relative to cardinal than relative to oblique angles, and that this will manifest itself in increased vMMN to such changes. This would be in concordance with the findings and theory of Treisman and Gormican (1988). We also expected reduced vMMN to changes from oblique to oblique orientations compared to the other two relations involving cardinal stimuli, as suggested by the literature on the oblique effect. We also investigated whether the oblique effect is manifested in the exogenous ERP components.

We considered the tracking task to be especially appropriate, because this task guarantees continuous and constant attentional demand, while the vMMN-related stimuli are presented as separate, individual objects in a separate part of the visual field. Taking into account the frame effects reported in visual search studies (Treisman and Gormican, 1988), sources of visual orientation were eliminated from the experimental environment by placing a black circular aperture over the computer screen, and by providing no background light in the room.

Following the first electrophysiological experiment we conducted a psychophysical measurement in order to assess the threshold for orientation change detection in an active paradigm with stimuli similar to those used in the passive oddball experiment (i.e., Gábor patches). In addition, the psychophysical measurement allowed us to assess an observation reported earlier: in contrast to the auditory modality, where MMN is thought to be elicited by any discriminable difference (Sams et al., 1985; Näätänen et al., 2007), in the visual modality significantly larger differences are necessary to evoke vMMN. For example, in a study by Czigler et al. (2002) a pink-black grating changing to a red-black grating elicited no vMMN, although in an active paradigm such a color change is easy to detect.

#### **EXPERIMENT 1**

#### **METHODS**

#### *Participants*

Seventeen healthy students volunteered for this experiment (12 females, mean age: 22.5 years, age range: 18–32 years) either for modest financial compensation or for course credit. All subjects had normal or corrected-to-normal vision and gave written informed consent after the nature of the experiment had been explained to them. The experiment was approved by the Joint Ethical Committee of the Hungarian Psychology Institutes.

#### *Stimuli and procedure*

*Task-irrelevant stimuli.* Task-irrelevant stimuli were Gábor patches (circular grayscale images of Gaussian-windowed sinusoidal gratings; Gaussian standard deviation: 0.17; phase: 45°; trim-value: 0.25, spatial frequency: 3) in two concentric circles (see **Figure 1**). A circular aperture (radius: 6.2°) was placed over the monitor in order to remove all external orientation cues. The first circle from the center of the screen had 12 patches (diameter: 1.6°). The second, outer circle consisted of 16 patches (diameter: 1.9°). Measured from the center of the screen to the center of the patches, the distances were 3.4 and 5.2°, respectively. Care was taken to ensure that the patches of the inner and outer circles did not form radial lines that could serve as an orientation reference. The background was gray (3.1 cd/m<sup>2</sup>). Stimulus display time was 100 ms, and the inter-stimulus interval was 450 ± 50 ms (random jitter) to avoid evoking steady-state potentials. ERPs were recorded to these task-irrelevant stimuli.
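As an illustration of what a Gaussian-windowed sinusoidal grating is, the sketch below computes the contrast of a single Gábor patch at a given point. It is a minimal sketch under stated assumptions: the coordinate units (normalized patch units), the orientation convention, and the function name are hypothetical, the trim-value is not modeled, and the code does not reproduce the stimulus software used in the experiment.

```python
import math


def gabor_value(x, y, orientation_deg, spatial_freq=3.0, sigma=0.17, phase_deg=45.0):
    """Contrast of a Gabor patch at point (x, y), in the range [-1, 1].

    x and y are in normalized patch units with the patch center at (0, 0).
    This unit choice is an assumption; the text does not state the units.
    """
    theta = math.radians(orientation_deg)
    # Coordinate perpendicular to the stripes: contrast of the carrier is
    # constant along the stripe direction (cos(theta), sin(theta)).
    u = -x * math.sin(theta) + y * math.cos(theta)
    carrier = math.cos(2.0 * math.pi * spatial_freq * u + math.radians(phase_deg))
    # Circular Gaussian window that fades the grating toward the edge.
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
    return envelope * carrier
```

Changing only `orientation_deg` (for example from 0 to 50) rotates the stripes while the envelope stays the same, which is the kind of change used for the standard-deviant pairs.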

*Task-relevant stimuli.* Subjects performed a tracking task in a circular task field (1.3°) in the center of the screen. They were asked to keep a continuously moving dot inside a small circle by following its movements with a trackball (Kensington, Orbit optical trackball). When the dot was inside the circle, the circle was blue (0.9 cd/m<sup>2</sup>), but when the dot left the circle, the circle switched to red (6.6 cd/m<sup>2</sup>).

Subjects were seated in a reclining chair in a sound-attenuated room, 1.2 m from a 17-inch LCD monitor (refresh rate: 60 Hz). No background light was provided in the room.

Task-irrelevant Gábor patches were presented in a pseudorandom oddball sequence, where standards had 83.1% probability. Deviant stimuli were preceded by 3–7 standard stimuli. In one block there were 374 standard and 76 deviant stimuli. Every block was presented twice. As **Table 1** illustrates, eight types of standard-deviant pairs were tested: 0 vs. 50° (left from horizontal), 22.5 vs. 72.5°, 90 vs. 140°, and 112.5 vs. 162.5°, each also with the roles of standard and deviant reversed. In total, 20 blocks were presented, each lasting approximately 4 min. The order of blocks was counterbalanced across participants.
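The constraints described above (374 standards, 76 deviants, each deviant preceded by 3–7 standards) can be satisfied by a simple randomization scheme. The following is a minimal sketch of one such scheme, not the procedure actually used to build the sequences; the function name, the seed parameter, and the choice to end the block with a deviant are assumptions made for illustration.

```python
import random


def make_oddball_sequence(n_standards=374, n_deviants=76, min_gap=3, max_gap=7, seed=0):
    """Return a list of "S"/"D" labels with min_gap..max_gap standards before each deviant."""
    rng = random.Random(seed)
    gaps = [min_gap] * n_deviants
    remaining = n_standards - min_gap * n_deviants
    if remaining < 0 or remaining > (max_gap - min_gap) * n_deviants:
        raise ValueError("standards cannot be distributed within the gap limits")
    # Hand out the remaining standards one by one to randomly chosen gaps
    # that have not yet reached the maximum length.
    while remaining > 0:
        i = rng.randrange(n_deviants)
        if gaps[i] < max_gap:
            gaps[i] += 1
            remaining -= 1
    sequence = []
    for gap in gaps:
        sequence.extend(["S"] * gap)
        sequence.append("D")
    return sequence


block = make_oddball_sequence()
print(len(block), block.count("S") / len(block))  # 450 stimuli, about 0.831 standards
```

With these counts the proportion of standards is 374/450, which corresponds to the 83.1% stated in the text.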

*ERP recording.* EEG was recorded with a NeuroScan system (DC–100 Hz, sampling rate 500 Hz) using Ag/AgCl electrodes in an elastic electrode cap (EasyCap) at 61 channels from standard locations of the extended 10–20 system. The ground electrode was attached to the lower forehead, and the reference electrode was placed on the tip of the nose. The data were re-referenced offline to the average of all channels. Horizontal and vertical EOG were recorded with a bipolar montage below and lateral to the eyes. EEG was filtered offline with a 0.1–30 Hz bandpass filter (24 dB/octave slope).
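Re-referencing to the channel average amounts to subtracting, at every time point, the mean voltage across channels from each channel. The sketch below shows only this arithmetic on plain lists; the function names and the data layout are assumptions for illustration, and details such as how the original nose reference is handled are not covered.

```python
def rereference_to_average(samples):
    """Re-reference one time point: subtract the mean across channels from each channel."""
    mean_across_channels = sum(samples) / len(samples)
    return [value - mean_across_channels for value in samples]


def rereference_recording(data):
    """data[t][ch]: list of time points, each a list of channel voltages in microvolts."""
    return [rereference_to_average(time_point) for time_point in data]
```

After this step the voltages at each time point sum to zero across channels.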



*Line segments solely illustrate the orientation of Gábor patches. The standarddeviant pairs highlighted by the same color were used to calculate difference waves.*

EEG and EOG activities were averaged for epochs beginning 100 ms before and extending until 400 ms after stimulus onset. The mean voltage of the first 100 ms served as the baseline. Epochs containing amplitude changes exceeding 50 μV at any channel were rejected from the analysis. Standards preceded by at least three other standards were averaged. ERPs recorded in "standards only" sequences were all averaged, regardless of their positions. After artifact rejection, on average 126.5 epochs for deviants (*SD* = 20.0; range: 64–148), 226.7 epochs for standards (*SD* = 36.9; range: 112–267) and 153.6 epochs for the "standards only" stimuli (*SD* = 26.8; range: 46–178) were included in the mean for one subject.
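The epoching steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and data layout are hypothetical, and "amplitude changes exceeding 50 μV" is interpreted here as the peak-to-peak amplitude within an epoch, which is an assumption.

```python
SAMPLING_RATE_HZ = 500   # from the recording description
PRE_MS, POST_MS = 100, 400
REJECT_UV = 50.0         # rejection threshold in microvolts


def ms_to_samples(ms):
    """Convert a duration in milliseconds to a number of samples."""
    return ms * SAMPLING_RATE_HZ // 1000


def extract_epoch(channel, onset_index):
    """Cut one channel from -100 to +400 ms around onset and subtract the baseline mean."""
    n_pre = ms_to_samples(PRE_MS)
    start = onset_index - n_pre
    if start < 0:
        raise ValueError("not enough samples before stimulus onset")
    epoch = channel[start:onset_index + ms_to_samples(POST_MS)]
    baseline = sum(epoch[:n_pre]) / n_pre
    return [value - baseline for value in epoch]


def is_artifact(epoch_channels):
    """True if the peak-to-peak amplitude of any channel exceeds the threshold."""
    return any(max(ch) - min(ch) > REJECT_UV for ch in epoch_channels)


def average_epochs(epochs):
    """Average the accepted epochs; epochs[i][ch][t] -> erp[ch][t]."""
    kept = [e for e in epochs if not is_artifact(e)]
    if not kept:
        raise ValueError("all epochs were rejected")
    n_channels, n_samples = len(kept[0]), len(kept[0][0])
    return [
        [sum(e[ch][t] for e in kept) / len(kept) for t in range(n_samples)]
        for ch in range(n_channels)
    ]
```

At 500 Hz, each epoch spans 250 samples, of which the first 50 form the baseline.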

*Analysis.* To analyze change-related activities, we calculated difference waves by subtracting the ERP elicited by a stimulus as a standard from the ERP elicited by the very same stimulus as a deviant (Kujala et al., 2007). **Table 1** depicts how the difference waves were calculated. The pairs of standard and deviant stimuli that were used to calculate the difference waves, which formed the basis of further analyses, are highlighted in different colors. For instance, the difference wave for horizontal stimuli (0°) was calculated by subtracting Oddball1 sequence standard ERPs from Oddball2 sequence deviant ERPs. Although these ERPs were recorded in separate blocks, this way physically identical stimuli were subtracted from each other, and the only difference between them was their role as a standard or a deviant.
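The subtraction described above can be written out for the horizontal (0°) example. The sketch assumes ERPs are stored as lists of voltages per sample; the dictionary keys and the numeric values are hypothetical placeholders, not recorded data.

```python
def difference_wave(deviant_erp, standard_erp):
    """Deviant-minus-standard difference for one channel and one physical stimulus."""
    if len(deviant_erp) != len(standard_erp):
        raise ValueError("ERPs must have the same number of samples")
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


# Hypothetical ERPs (microvolts) to the 0-degree stimulus in its two roles.
erps = {
    ("Oddball2", "deviant", 0.0): [0.2, -1.5, -2.0],
    ("Oddball1", "standard", 0.0): [0.1, -0.5, -1.0],
}
diff_0deg = difference_wave(
    erps[("Oddball2", "deviant", 0.0)],
    erps[("Oddball1", "standard", 0.0)],
)
```

Because both ERPs come from the same physical stimulus, any remaining difference is attributable to the role of the stimulus in the sequence rather than to its orientation.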

In two conditions stimuli changed from oblique to cardinal<sup>1</sup> (0 and 90°), in two other conditions from cardinal to oblique (50 and 140°), and in four conditions from oblique to oblique (22.5, 72.5, 112.5, and 162.5°). Four additional orientations (45, 67.5, 135, and 167.5°<sup>2</sup>) were presented in separate blocks without deviants ("standards only" conditions).

Negative-going difference waves were considered to be valid vMMN responses if point-by-point *t*-tests (see, e.g., Guthrie and Buchwald, 1991) were significant at the 0.05 level at two or more adjacent parieto-occipital channels in five consecutive time points (10 ms) between 100 and 250 ms after stimulus onset.
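The criterion above can be sketched in code. The example below is a minimal sketch under stated assumptions: the two adjacent channels are required to be significant at the same five consecutive samples, negativity is tested by comparing a one-sample *t* statistic with a negative critical value supplied by the caller, and all function and parameter names are hypothetical.

```python
import math


def t_statistic(values):
    """One-sample t statistic of the values tested against zero."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    if variance == 0.0:
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(variance / n)


def significant_negative(diff, t_crit):
    """diff[subject][t] for one channel -> list of booleans, one per time point."""
    n_samples = len(diff[0])
    return [
        t_statistic([subject[t] for subject in diff]) < -t_crit
        for t in range(n_samples)
    ]


def has_vmmn(sig_by_channel, adjacent_pairs, first, last, run_length=5):
    """True if an adjacent channel pair is jointly significant for run_length
    consecutive samples within the index range [first, last]."""
    for a, b in adjacent_pairs:
        run = 0
        for t in range(first, last + 1):
            if sig_by_channel[a][t] and sig_by_channel[b][t]:
                run += 1
                if run >= run_length:
                    return True
            else:
                run = 0
    return False
```

With epochs starting at −100 ms and a 2-ms sampling interval, the 100–250 ms range corresponds to sample indices 100 to 175, so a call could look like `has_vmmn(sig, pairs, 100, 175)`.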

For orientation-related amplitude differences we compared the mean amplitude of the ERPs to standard stimuli in 40-ms-wide time windows centered on the latency of the N1b subcomponent at six parieto-occipital channels (PO7, POz, PO8, and O1, Oz, O2), where this component was most evident by visual inspection. For linear regression models we report the *R*<sup>2</sup> coefficient of determination, *F*- and *p*-values.
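The window-based amplitude measure can be sketched as follows, assuming epochs that start 100 ms before stimulus onset and are sampled at 500 Hz, as described in the recording section. The function name and the inclusive treatment of the window edges are assumptions.

```python
def mean_amplitude(erp, center_ms, width_ms=40, epoch_start_ms=-100, sampling_rate_hz=500):
    """Mean voltage of a single-channel ERP in a window of width_ms centered on center_ms."""
    ms_per_sample = 1000 / sampling_rate_hz
    first = round((center_ms - width_ms / 2 - epoch_start_ms) / ms_per_sample)
    last = round((center_ms + width_ms / 2 - epoch_start_ms) / ms_per_sample)
    window = erp[first:last + 1]  # both window edges are included
    return sum(window) / len(window)
```

For a window centered on 170 ms, this averages the samples from 150 to 190 ms, that is, indices 125 to 145 of the epoch.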

Tracking task performance was assessed by calculating tracking efficiency, the percentage of time during which the dot was located inside the circle. A repeated-measures ANOVA was performed to compare tracking efficiency in conditions where stimuli changed from cardinal to oblique (50, 140°), from oblique to cardinal (0, 90°) and from oblique to oblique orientations (22.5, 72.5, 112.5, and 162.5°).

<sup>2</sup>Accidentally, we recorded ERPs to 167.5◦ instead of 157.5◦, which would have been midway between 135 and 180◦.

Greenhouse-Geisser correction was applied when appropriate. Significant interactions were further specified by Tukey HSD *post-hoc* tests. Effect sizes are reported as partial eta squared (η<sup>2</sup>).
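Partial eta squared can be recovered from a reported *F* value and its degrees of freedom, which gives a quick consistency check on the statistics in this article. The sketch below uses the standard identity η<sup>2</sup> = (*F* · df<sub>effect</sub>) / (*F* · df<sub>effect</sub> + df<sub>error</sub>); the function name is ours, and the input values are the ones reported in the Results sections.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error) triples as reported in the text.
reported = {
    "Experiment 1, tracking efficiency": (3.4, 2, 34),
    "Psychophysics, cardinality": (10.6, 1, 11),
    "Experiment 2, vMMN latency": (55.0, 1, 18),
}

for label, (f_value, df1, df2) in reported.items():
    print(label, round(partial_eta_squared(f_value, df1, df2), 2))
```

The rounded results agree with the effect sizes given in the text (0.17, 0.49, and 0.75, respectively).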

#### **RESULTS**

#### *Behavioral results*

Repeated measures ANOVA on tracking efficiency with the factor condition revealed a significant effect, *F*(2, 34) = 3.4, *p* < 0.05, η<sup>2</sup> = 0.17. Tracking efficiency was 81.4% (*SE* = 1.73%) in blocks where stimuli changed from oblique to cardinal, 80.6% (*SE* = 1.80%) in blocks where stimuli changed from cardinal to oblique, and 81.8% (*SE* = 1.69%) in blocks where stimuli changed from oblique to oblique orientations. *Post-hoc* comparison showed that the latter two conditions differed significantly from each other.

#### *ERP results*

The responses to standard and deviant stimuli display a positivity-negativity-positivity sequence at the Oz channel (see **Figure 3**) that could be identified as a P1-N1a-P2 response. These components peak at 94, 112, and 240 ms, respectively. Between the N1a and P2 components, another negative deflection with a latency of 170 ms (N1b) could be observed at lateral, occipito-temporal channels.

*Visual mismatch negativity.* Difference waves for eight stimulus orientations (0, 50, 22.5, 72.5, 90, 140, 112.5, 162.5◦) were calculated. Point-by-point *t*-tests revealed that vMMN emerged in only four of the eight conditions. **Figure 2** displays grand-average waveforms and topographic voltage maps for vMMN in these conditions. As **Table 2** shows, in all four conditions there was an early time interval (latencies between 120 and 140 ms) for vMMN. In three conditions, vMMN also appeared in a later time interval, with peak latency falling between 198 and 230 ms.

*Exogenous differences.* We compared ERPs evoked by standard stimuli in the twelve available orientations: 0, 22.5, 45, 50, 67.5, 72.5, 90, 112.5, 135, 140, 162.5, and 167.5◦. To simplify the illustration of the orientation effect, **Figure 3** shows only three orientations: a cardinal (0◦) and two oblique angles (22.5, 45◦).

<sup>1</sup>Although for the 0◦ vMMN the standards come from a block where stimuli changed from 0 to 50◦ (i.e., from cardinal to oblique), according to the widely accepted view in change detection studies, it is the deviancy that determines the condition, which in this case is the rare appearance of the 0◦ deviants in successions of 50◦ standards (i.e., a change from oblique to cardinal). This reasoning applies to every condition (50◦ condition, 90◦ condition, etc.).

#### **Table 2 | vMMN in Experiment 1.**


*Channels that exhibited vMMN. Latency and peak amplitude were measured on grand-average waveforms.*

As **Figure 3** illustrates, around 170 ms (in the time range of the N1b sub-component at occipital channels) there are orientation-related amplitude differences. Although responses were positive in voltage in most cases, N1b subcomponent is a negativity shaped by the adjacent dominant P2 wave. **Figure 4** shows mean amplitudes averaged across subjects at PO8 channel, where N1b component reached its maximum. Amplitudes were highly dependent on the orientation of stimuli.

In order to build a linear regression model, we defined a new variable, deviancy from cardinal orientation, which equals the difference between the given orientation and the closest cardinal orientation (e.g., 72.5◦ has a 17.5◦ deviancy from cardinal orientation, because the closest cardinal orientation is 90◦). A simple linear regression analysis was conducted at five posterior leads (PO7, O1, Oz, O2, PO8), in which mean amplitudes were predicted from the independent variable, deviancy from cardinal orientation. **Table 3** displays the results of the regression analyses. The high coefficient of determination (*R*<sup>2</sup>) indicates that the orientation of Gábor patches is a good predictor of the N1b amplitude.
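The predictor and the regression can be sketched as follows. The deviancy function follows the definition in the text, treating 0, 90, and 180◦ as cardinal orientations. The regression is ordinary least squares with a single predictor. The amplitude values are arbitrary placeholders generated from a straight line purely to exercise the code; they are not the measured N1b amplitudes, and the function names are ours.

```python
def deviancy_from_cardinal(orientation_deg):
    """Angular distance (degrees) to the closest cardinal orientation."""
    remainder = (orientation_deg % 180.0) % 90.0
    return min(remainder, 90.0 - remainder)


def simple_linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x; returns slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# The twelve standard orientations used in Experiment 1.
orientations = [0, 22.5, 45, 50, 67.5, 72.5, 90, 112.5, 135, 140, 162.5, 167.5]
deviancies = [deviancy_from_cardinal(o) for o in orientations]

# Placeholder amplitudes (NOT data): an exact linear function of deviancy.
placeholder_amplitudes = [2.0 - 0.05 * d for d in deviancies]

slope, intercept, r_squared = simple_linear_regression(deviancies, placeholder_amplitudes)
print(deviancies)
print(slope, intercept, r_squared)
```

Note that several orientations share the same deviancy (e.g., 22.5, 67.5, and 112.5◦ all map to 22.5◦), so the model treats them as equivalent regardless of the direction of tilt.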

#### **DISCUSSION FOR EXPERIMENT 1**

To conclude, in Experiment 1 vMMNs were evoked sporadically, only in four conditions out of eight. We observed orientation-related amplitude differences in the latency range of occipito-temporal, lateral N1b component, around 170 ms.

Sulykos and Czigler (2011) presented similar Gábor patches in their experiment. Orientation-related vMMN was elicited with 130 and 132 ms peak latency at lower and upper visual field stimulation, respectively. The differences found in the earlier time interval (120–140 ms) in the present study correspond to these latency ranges. However, in the present study we obtained vMMN only in half of the conditions, and there was no oblique-related order in the emergence of deviant-related negativity. Since our hypothesis was based on finding valid vMMN responses in all conditions, or at least in those involving cardinal stimuli, conclusions pertaining to the existence of an oblique effect on vMMN could not be drawn from the data of the present experiment.

**FIGURE 3 | Exogenous differences at occipital leads in Experiment 1 for standard stimuli.** Significant differences were found in the time range of N1b subcomponent (150–190 ms), which are indicated by the rectangular boxes. P1, N1a, N1b, and P2 components are marked where they are most evident. For the sake of visibility, we display just three angles.


**Table 3 | Statistics for N1b exogenous differences in Experiment 1.**

*Linear regression analysis.*

In contrast to vMMN, amplitude changes of an exogenous component, the N1b, suggest that the visual system was able to precisely map the orientation of Gábor patches and that ERP methods were suitable for detecting these responses. In the light of these results, the lack of reliable vMMN is even more surprising. It is clear that the processing of orientation did not raise difficulties for the visual system, even though the stimuli were in the visual periphery and outside the focus of attention.

The small, but significant difference in tracking efficiency between two conditions (changes from cardinal to oblique vs. changes from oblique to oblique) was an unexpected finding. Czigler and Sulykos (2010) demonstrated subtle interactions between the task-relevant and irrelevant stimuli in a similar experimental setup.

In a supplementary experiment we tried to determine the amount of orientation difference that is needed for change detection in an active, attended paradigm. Subjects were required to detect orientation changes of Gábor patches while reading aloud numbers that appeared in the center simultaneously with the patches. The short (100 ms) and simultaneous display of numbers and Gábor patches prevented subjects from using eye movements to fixate on the Gábor patches. In this way subjects detected orientation change through peripheral vision, as in Experiment 1. The tracking task used in Experiment 1 would not have provided the required control over subjects' eye movements.

Our goal was to reproduce the classical oblique effect with this type of stimulus array, that is, Gábor patches with moderately high spatial frequencies in concentric circles. In addition, we could assess the earlier observation (Czigler et al., 2002) that vMMN could be registered with deviances considerably larger than those that could be detected in an active paradigm.

# **PSYCHOPHYSICAL MEASUREMENTS**

#### **METHODS**

#### *Participants*

Eighteen subjects were recruited for this experiment. Five subjects were excluded due to a high number of false alarms, that is, more than three false alarms in any of the four blocks. An additional subject was excluded due to very low performance in one block. The final cohort therefore consisted of 12 volunteers (7 females, mean age: 21.6 years, age range: 18–30 years). This sample partly overlapped with the sample of Experiment 1; eight subjects participated in both experiments.

#### *Stimuli and procedure*

In this experiment the central and peripheral visual fields were both task-relevant. The peripheral stimuli were identical to the Experiment 1 stimuli, i.e., Gábor patches in two concentric circles were presented. In the center of the display random numbers from 1 to 9 (color: magenta, 7.8 cd/m<sup>2</sup>, size: 0.5◦) were presented. The background was gray (3.1 cd/m<sup>2</sup>). The stimulus duration was 100 ms and the inter-stimulus interval was 1500 ms.

The peripheral and central stimuli always appeared simultaneously. The task was to read the numbers aloud while detecting changes in the orientation of the Gábor patches. Participants were instructed to press a button with their dominant hand upon detecting any change in the background. Subjects were video-monitored in real time by a research assistant to make sure they kept reading the numbers aloud.

Gábor patches were arranged in an oddball sequence. Standards were 0◦ (horizontal), 22.5, 90 and 112.5◦. Standard probability was 77.8%. At least 2 and at most 5 standards followed a deviant stimulus. Deviant stimuli differed in orientation from standards. The size of this difference changed throughout the experiment depending on the subject's responses, but deviants were most of the time<sup>3</sup> oblique orientations. A one-up, one-down staircase procedure was used. Until the first reversal, the step size was 10◦, and then it was reduced to 1◦. The initial difference was set to 20◦ counterclockwise from the standard stimuli. The threshold was calculated as the mean of the last six reversals out of a total of 11 reversals. Subjects were tested in four blocks for each standard orientation.
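The staircase rule can be illustrated with a small simulation. The parameters (initial difference 20◦, step size 10◦ until the first reversal and 1◦ afterwards, 11 reversals, threshold as the mean of the last six) come from the text. Everything else is an assumption made for the sketch: the observer is a deterministic stand-in that detects every change at or above a fixed value, the reversal level is recorded as the difference at which the response changed, and the function name is ours.

```python
def run_staircase(detects, start_diff=20.0, big_step=10.0, small_step=1.0,
                  n_reversals=11, n_averaged=6, max_trials=1000):
    """Return the threshold estimate of a one-up, one-down staircase.

    detects: callable taking the orientation difference (degrees) and
             returning True if the simulated observer reports the change.
    """
    diff = start_diff
    step = big_step
    last_direction = None  # -1: difference was reduced, +1: increased
    reversals = []
    for _ in range(max_trials):
        # A detection makes the next deviant harder, a miss makes it easier.
        direction = -1 if detects(diff) else 1
        if last_direction is not None and direction != last_direction:
            reversals.append(diff)  # the response changed at this level
            step = small_step       # finer steps after the first reversal
            if len(reversals) == n_reversals:
                return sum(reversals[-n_averaged:]) / n_averaged
        last_direction = direction
        diff = max(diff + direction * step, 0.0)
    raise RuntimeError("staircase did not reach the required number of reversals")


# Hypothetical observer with a sharp threshold at 12.5 degrees.
estimate = run_staircase(lambda diff: diff >= 12.5)
print(estimate)
```

A one-up, one-down rule converges on the difference that is detected on about half of the trials, which is why the reversal levels end up alternating around the simulated observer's threshold.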

The order of blocks was counterbalanced across participants. Prior to this experiment, participants performed Experiment 1, then they had a short break while the EEG-cap was removed and they washed and dried their hair.

# **RESULTS**

**Figure 5** displays mean thresholds. On the mean threshold data of the four conditions we performed a repeated measures ANOVA with the factor cardinality (cardinal: 0, 90◦ vs. oblique: 22.5, 112.5◦). A cardinality main effect emerged, *F*(1, 11) = 10.6, *p* < 0.01, η<sup>2</sup> = 0.49, reflecting that thresholds were lower when standards were cardinal orientations (threshold: 10.08◦; *SE* = 1.99) than when standards were obliquely oriented (threshold: 16.57◦; *SE* = 2.38).

#### **DISCUSSION FOR PSYCHOPHYSICAL MEASUREMENTS**

Thresholds for detecting orientation deviants were 10 and 16.5◦ in this experiment. Thresholds were significantly lower when standards were cardinal stimuli than when standards were oblique orientations, exhibiting the classical oblique effect. This finding is also in line with the results and theory of Treisman and Gormican (1988).

These thresholds, 10 and 16.5◦, are appreciably smaller than the 50◦ orientation change that did not elicit reliable vMMN in Experiment 1. These results provide evidence that changes in the visual periphery can be perceived at considerably smaller orientation differences than those required to elicit vMMN.

<sup>3</sup>When, e.g., standards were 22.5◦, it was possible that deviants were 90◦ for one or more presentations, but as the average thresholds (10◦ and 16.5◦) indicate, deviants fluctuated around 10, 39, 100, and 129◦ for the 0, 22.5, 90, and 112.5◦ standard orientations, respectively. Each of these deviants is an oblique orientation.

One major difference should be noted between this experiment and the vMMN experiment. Although in this task central vision was occupied with detecting random numbers, subjects still attended consciously to the Gábor patches. The design of our vMMN experiment was intended to prevent subjects from consciously attending to the stimuli used to elicit ERPs. These two experiments therefore differ in a major feature (attentive vs. non-attentive processing) that could account for the markedly different results. However, it is possible that the lack of vMMN is attributable to a low signal-to-noise ratio that results from presenting an orientation change (50◦) too small for reliable vMMN emergence. To test this possibility, and in an attempt to record reliable vMMN, we increased the orientation change to 90◦ in the next experiment.

#### **EXPERIMENT 2**

#### **METHODS**

#### *Participants*

Nineteen subjects (11 females, mean age: 21.4 years, age range 19–25 years) participated in this experiment. None of them took part in the previous experiments.

#### *Stimuli and procedure*

Task-irrelevant stimuli were similar to the stimuli in Experiment 1, with the following exceptions. First, Gábor patches were displayed in three circles<sup>4</sup>; the centers of the patches in the first circle were 1.9◦ from the center of the display and the Gábor patches were 1.3◦ in size. The first circle consisted of eight patches. The other two circles and the task-relevant stimuli were identical to those in Experiment 1.

As shown in **Table 4**, there were four stimulus conditions: 0 vs. 90◦ (and vice versa) and 45 vs. 135◦ (and vice versa). As every block was repeated twice, there were eight blocks presented altogether, and all of them were intended to measure vMMN.


*Line segments solely illustrate the orientation of Gábor patches. The standard-deviant pairs highlighted by the same color were used to calculate difference waves.*

*Analysis.* For each subject an average of 134.6 epochs (*SD* = 8.9, range: 99–147) was included in the mean response to deviants, and 249.6 epochs (*SD* = 16.9, range: 173–272) in the standard response. The analysis was identical to that of Experiment 1, with the following exceptions. For every stimulus condition we determined the latency of the vMMN response, based on the grand-average difference waveforms. The latency was measured at the channel where the difference wave reached its maximum between 100 and 250 ms. Mean amplitudes of the deviant-minus-standard difference wave were measured around this latency in 60 ms wide windows, in the same time interval for every subject.
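The two measurement steps, locating the vMMN peak on the grand average and averaging each subject's difference wave in a 60 ms window around it, can be sketched as below. The sketch assumes that the "maximum" of the vMMN is its most negative point and that samples are 2 ms apart; the waveform is a synthetic triangular negativity, not recorded data, and the function names are ours.

```python
def peak_latency(wave, times_ms, start_ms=100.0, end_ms=250.0):
    """Latency (ms) of the most negative point within the search interval."""
    candidates = [(v, t) for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return min(candidates)[1]


def mean_amplitude(wave, times_ms, center_ms, width_ms=60.0):
    """Mean amplitude in a window of `width_ms` centered on `center_ms`."""
    half = width_ms / 2.0
    values = [v for v, t in zip(wave, times_ms)
              if center_ms - half <= t <= center_ms + half]
    return sum(values) / len(values)


# Synthetic difference wave: a triangular negativity peaking at 150 ms.
times = list(range(0, 300, 2))
wave = [-max(0.0, 1.0 - abs(t - 150) / 50.0) for t in times]

latency = peak_latency(wave, times)           # measured on the grand average
amplitude = mean_amplitude(wave, times, latency)  # applied to each subject's wave
print(latency, amplitude)
```

Fixing the window from the grand average and then applying the same interval to every subject avoids selecting a different, noise-driven peak for each individual.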

For statistical analyses of vMMN a 2 × 3 grid of parietal and occipital channels was used (PO3, POz, PO4; PO7, Oz, PO8). Repeated measures ANOVA was applied to the mean amplitude values of the difference wave with the factors cardinality (cardinal: 0 and 90◦; oblique: 45 and 135◦), anteriority (anterior: PO3, POz, PO4; posterior: PO7, Oz, PO8) and laterality (left: PO3, PO7; midline: POz, Oz; right: PO4, PO8).

Orientation-related amplitude differences were analyzed in two time-intervals, between 100 and 140 ms for N1a component and between 150 and 190 ms for N1b component.

#### **RESULTS**

#### *Behavioral results*

Repeated measures ANOVA on tracking efficiency with factor cardinality revealed no significant effects. Tracking efficiency was 78.9% (*SE* = 1.56%) in cardinal blocks and 79.6% (*SE* = 1.59%) in oblique blocks.

#### *ERP results*

As **Figure 6A** shows, similar waveforms were obtained for standard and deviant stimuli as before. On Oz channel P1-N1a-P2 sequence was elicited, with similar latencies (94, 116, and 240 ms) as in the previous experiment. The occipito-temporal N1b component with 180 ms peak latency was more pronounced in this experiment.

<sup>4</sup>Our rationale for this change was to improve the signal-to-noise ratio and achieve larger ERP responses. In Experiment 1 we had considerably smaller evoked potentials (e.g., on **Figure 3**, ERPs at the Oz channel are around 5.5 μV in amplitude) than in Experiment 2, where the same ERPs were around 9 μV in amplitude (**Figure 6**).

*Visual mismatch negativity.* In the four conditions (0, 45, 90, and 135◦) deviant-minus-standard difference waves were calculated (see **Figure 6A**). Visual inspection and point-by-point *t*-tests revealed that vMMN responses were present in every condition between 100 and 200 ms, with maxima between 134 and 162 ms (see **Table 5**). On anterior channels positive components were present with latencies similar to those of the posterior vMMNs. Around 270 and 340 ms the difference waves were positive with a parieto-occipital maximum in their scalp distribution.

In a repeated measures ANOVA on mean vMMN amplitudes with factors cardinality, anteriority and laterality a main effect of cardinality, *F*(1, 18) = 5.3, *p* < 0.05, η<sup>2</sup> = 0.23, was obtained, revealing more negative amplitudes in response to cardinal stimuli.

**FIGURE 6 | (A)** Event-related activity and deviant-minus-standard difference potentials at the location having largest amplitude of the difference potentials in Experiment 2. Intervals marked in gray signaled significant deviant-minus-standard differences by point-by-point *t*-tests. **(B)** Topographic voltage maps of the deviant-minus-standard difference potentials in the time-window used for statistical analysis.

**Table 5 | Mean peak amplitudes and mean latencies used for statistical analyses.**


To investigate peak latency differences, a repeated measures ANOVA was performed with the same factors as above. A cardinality main effect emerged, *F*(1, 18) = 55.0, *p* < 0.00001, η<sup>2</sup> = 0.75, which was due to shorter latencies in response to oblique angles. We also found an anteriority main effect, *F*(1, 18) = 17.2, *p* < 0.001, η<sup>2</sup> = 0.49, reflecting shorter latencies at the anterior row of channels (142 vs. 148 ms).

The question arises whether latency differences reflect earlier timing of vMMN in oblique conditions. Since both waveforms and topographical voltage maps exhibited close concordance within the cardinal (0 and 90◦) and within the oblique (45 and 135◦) stimulus conditions, we collapsed these responses; **Figure 7** shows the collapsed records.

As a descriptive analysis of onset and offset times, **Table 6** displays the first and last time points where point-by-point *t*-tests were significant in the time intervals of cardinal and oblique vMMN responses. Differences between these conditions are notable in offset times only, which suggests that latency differences between oblique and cardinal conditions do not imply earlier timing for oblique vMMNs.
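Onset and offset times of this kind can be read off a significance mask as the first and last significant time points. The sketch below assumes one boolean per time point for a given channel and a 2 ms sampling step; the two masks are hypothetical and do not reproduce the values in **Table 6**.

```python
def onset_offset(significant, times_ms):
    """First and last significant time points (ms), or None if there are none."""
    sig_times = [t for flag, t in zip(significant, times_ms) if flag]
    if not sig_times:
        return None
    return sig_times[0], sig_times[-1]


times = list(range(100, 252, 2))

# Hypothetical masks: same onset, later offset in the cardinal condition.
cardinal_mask = [120 <= t <= 200 for t in times]
oblique_mask = [120 <= t <= 170 for t in times]

cardinal = onset_offset(cardinal_mask, times)
oblique = onset_offset(oblique_mask, times)

# Cardinal minus oblique, as in the last row of the table.
difference = (cardinal[0] - oblique[0], cardinal[1] - oblique[1])
print(cardinal, oblique, difference)
```

In this made-up case the onset difference is zero and only the offsets differ, which is the pattern the paragraph above describes: a later peak can arise from a longer response rather than from a later start.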

*Exogenous differences.* **Figure 8** depicts visual evoked potentials to four standards (0, 45, 90, and 135◦). Orientation-related amplitude differences were evident already in the time interval of the N1a component around 120 ms post-stimulus. Response amplitudes to vertical (90◦) and marginally to horizontal (0◦) appeared to be less negative than to oblique orientations.

Repeated measures ANOVA conducted with factors stimulus (0, 45, 90, and 135◦) and channels (PO7, O1, Oz, O2, PO8) revealed differences between orientations. The stimulus main effect was significant, *F*(3, 54) = 11.61, ε = 0.89, *p* < 0.0001, η<sup>2</sup> = 0.39. *Post-hoc* tests indicated that it was due to significant differences between amplitudes to 90◦ and to every other orientation. In addition, a stimulus × channel interaction was found, *F*(12, 216) = 2.64, ε = 0.32, *p* < 0.05, η<sup>2</sup> = 0.13. According to *post-hoc* comparisons, responses to 90◦ differed from responses to 0◦ at O1 and Oz, from responses to 45◦ at every channel, and from responses to 135◦ at the O1, Oz, O2, and PO8 channels. Responses to horizontal (0◦) differed from oblique orientations only at Oz (0 vs. 45◦) and at PO8 (0 vs. 135◦). So we can conclude that around 120 ms N1a amplitudes to the vertical orientation were less negative, while the horizontal orientation exhibited an almost negligible difference.

**evoked by cardinal and oblique stimuli.**


**Table 6 | vMMN onset and offset times (in ms) at parieto-occipital channels based on grand-average waveforms.**

*Last row displays difference between these measures (cardinal minus oblique).*

Differences around 170 ms, in the time interval of the N1b component, were much clearer between orientations. We conducted a repeated measures ANOVA with factors stimulus (0, 45, 90, and 135◦) and channels (PO7, O1, Oz, O2, PO8). A stimulus main effect emerged, *F*(3, 54) = 8.96, ε = 0.89, *p* < 0.001, η<sup>2</sup> = 0.33; *post-hoc* tests revealed significant differences between all cardinal-oblique pairs, and no difference between cardinal-cardinal and oblique-oblique pairs. We obtained a stimulus × channel interaction, *F*(12, 216) = 4.36, ε = 0.36, *p* < 0.01, η<sup>2</sup> = 0.20. *Post-hoc* comparisons between cardinal and oblique orientations showed that differences were significant mainly at the three occipital channels (O1, Oz, O2), with the exception of the 0 vs. 135◦ and 90 vs. 135◦ contrasts at the O2 channel, which were not significant. At parieto-occipital channels (PO7 and PO8) the differences did not reach significance, with the exception of the 0◦ vs. 45◦ contrast. Summing up, orientation-related differences in Experiment 2 were significant only in the occipital area, but we could nevertheless replicate the findings of the previous experiment about the orientation-related N1b differences.

#### **DISCUSSION FOR EXPERIMENT 2**

The results of this experiment indicate that reliable orientation vMMN with Gábor patches could be obtained with the largest possible (90◦) orientation change. An oblique effect was found: vMMN to cardinal angles exhibited larger amplitudes and had 20 ms longer peak latencies. The vMMN to cardinal orientations had onset times similar to those of the oblique vMMN, but its peak latency was prolonged due to its larger amplitude and later offset. Orientation-related amplitude differences were present already around 120 ms, but the oblique effect could be observed only around 170 ms, in the time interval of the N1b component.

#### **GENERAL DISCUSSION**

The main question of our study was whether an important feature of the visual system, the increased sensitivity for cardinal (horizontal and vertical) orientations (so-called oblique effect) influences pre-attentive processing reflected by the vMMN.

In our first experiment a 50◦ deviancy did not elicit reliable vMMN. Nonetheless, the largest possible orientation deviancy, 90◦, did elicit vMMN. The deviant-minus-standard difference wave was maximal over occipital areas between 120 and 200 ms, its peak falling between 134 and 162 ms. Stimuli changing from cardinal to cardinal orientations evoked longer and larger responses, exhibiting a variant of the oblique effect.

Other studies investigating orientation vMMN obtained reliable vMMN in response to smaller deviances than we did. Czigler and Sulykos (2010) observed vMMN to bar stimuli changing from oblique to oblique orientations using 30 and 60◦ deviances. Astikainen et al. (2008) were able to register vMMN to 36◦ orientation changes for stimuli changing from oblique to oblique orientations. In the interference condition of the experiment of Sulykos et al. (2013), 30◦ deviances evoked vMMN. They used Gábor stimuli similar to ours, but only in the lower visual field. Kimura et al. (2009, 2010a) presented 36◦ deviances and also obtained vMMN. However, there is one important issue that we should consider. In our experiment every source of external orientation cues was removed. This was achieved by using a circular aperture over the screen, by providing no background light, and by presenting stimuli in both the upper and lower visual fields. In this way only orientation cues from the vestibular and somatosensory systems remained available to the subjects. The studies mentioned above did not control for this aspect, so it is possible that, e.g., the outline of the computer screen facilitated the evaluation of orientation and the operation of automatic deviance detection.

It is a key question how to interpret the increased response to cardinal changes. Although it has not been directly assessed in many vMMN studies, it is presumably a tenable assumption that the stronger the rule, the larger the response to its violation, simply because the change approximates the threshold of the vMMN system in more experimental trials. The representation of cardinal stimuli is more potent in the visual system, so their presence and deviations from them are more easily detected. While we were able to register electrophysiological responses reflecting fine differentiation of orientation between 150 and 190 ms over the occipital areas, and we could assume that the brain precisely mapped the orientation of Gábor patches, this occurred later than the vMMN, which appeared around 120 ms after stimulus onset. Still, this argument does not account for why the response was also more sustained, rather than showing a simple amplitude difference.

The oblique effect found in vMMN might correspond to the magnitude of deviance effect first observed in auditory processing (Näätänen et al., 1982; Sams et al., 1985, but see Horváth et al., 2008). In the case of the auditory MMN, a larger deviancy between the standard and deviant stimulus results in an MMN response with larger amplitudes and shorter latencies (Kujala et al., 2007). The existence of this phenomenon in the visual domain is uncertain. Czigler and Sulykos (2010) obtained similar vMMNs to 30 and 60◦ deviancy with stimuli changing from oblique to oblique orientations. Maekawa et al. (2005) used windmill patterns, and according to their results, vMMN (or as they label it, "deviant-related negativity," DRN) did not show an increase of amplitude with increasing magnitude of deviance; only the latency of the second negativity, between 200 and 300 ms with maxima over temporal areas, decreased.

In our study, stimuli changing from cardinal to cardinal could be regarded as stronger stimuli, and the perceived difference between them as larger than the difference between oblique orientations, even though the differences were exactly the same in degrees. We interpret the sustained response to the more salient cardinal changes as an indication that the visual system commits more computational resources to changes that could be of greater importance.

Recently Cléry et al. (2013) found another version of the magnitude of deviance effect using fMRI and a passive oddball paradigm. In their experiment the shape of the circular stimuli changed dynamically: for standard stimuli it stretched out horizontally into an ellipse, and for deviant stimuli it stretched out vertically. The novel stimuli changed gradually to an irregular shape. The differences between responses elicited by deviant and novel stimuli were apparent in the visual cortex (BA 18 and 19) and in the medial frontal cortex (BA 8). In the anterior cingulate cortex only novel stimuli evoked significant activity compared to baseline. Despite the fact that fMRI and ERP results are sometimes difficult to compare due to their widely different spatial and temporal resolution, in this case some parallels could be drawn. According to the authors, extrastriate differences might signal the activation of the visual areas that are responsible for vMMN generation (or for other higher sensory processes), while the differential fMRI response in the anterior cingulate cortex might show the contribution of the areas responsible for the generation of the P3a component that is usually elicited by novel, non-target stimuli (Courchesne et al., 1975). However, because in this experiment the type of deviancy between the standard and deviant stimuli (vertical or horizontal stretching) was not the same as the deviancy between the deviant and novel stimuli (vertical stretching or changing to an irregular shape), it is difficult to compare their results with ours. Still, it seems that the generators of the posterior part of vMMN are able to give more than all-or-nothing responses to visual deviancies.

We also found exogenous, orientation-related differences around 150–190 ms in the amplitude of the ERPs evoked by the standard stimuli. It is an important question why we observed these differences in a latency range that is quite late for visual orientation processing. Area V1 contains cells selective for orientation, and Gábor patches stimulate these as well. Visual processing in the striate area (V1) is signaled by the C1 visual evoked potential, 50–90 ms after stimulus onset (Clark et al., 1995). Surprisingly, few studies (e.g., Song et al., 2010) have reported orientation-related differences in this component. Unfortunately, we were not able to examine this component due to the simultaneous stimulation of the upper and lower visual fields.

The first signs of orientation-related processing emerged between 100 and 140 ms in Experiment 2, where vertical (90◦) stimuli elicited a less negative N1a component than the other stimuli. Horizontal orientations evoked a slightly different response than oblique stimuli. Arakawa et al. (2000) found an oblique effect in the P100 component; at low spatial frequencies ERPs to cardinal orientations exhibited longer latencies than those to oblique stimuli, while at high spatial frequencies the relationship was reversed. Proverbio et al. (2002) reported orientation-related differences in the P1 and P3 components; vertical stimuli elicited larger amplitudes than oblique stimuli (they did not look at horizontal stimuli). A study conducted by Yoshida et al. (1975) found differences in N1-P2 peak-to-peak amplitude; cardinal stimuli evoked larger responses than oblique stimuli. They obtained waveforms similar to ours using circular black and white gratings as stimuli; a P1-N1-P2 sequence was elicited with peak latencies of 110–120, 180–190, and 270–280 ms, respectively. Since they used only one active electrode (Oz), it is unclear whether the N1 in their study had a scalp topography similar to ours.

Our knowledge about the N1b wave is rather limited. This occipito-temporal component usually peaks around 170–180 ms. In the experiment of Clark and Hillyard (1996) it was maximal at 180 ms and was elicited by nontarget circular black and white checkerboards on contralateral sides. In our experiment this wave displayed a bilateral distribution due to bilateral stimulus presentation. In the Clark and Hillyard (1996) study target stimuli evoked larger N1b responses, but the latency and scalp topography remained unaffected. The authors localized this component to the ventral-lateral visual cortex. This extrastriate area is engaged in object identification and belongs to the ventral visual pathway (Ungerleider and Haxby, 1994). This raises the possibility that in our experiments the visual system treated Gábor patches as objects and reprocessed the orientation of these objects during N1b. Others using everyday objects pointed out that processing of the orientation of objects could be tied to the dorsal occipito-parietal system (Valyear et al., 2006), so it is unclear what the brain sources of the N1b component obtained in the present experiment are. Examining the role of attention, Hopf et al. (2002) showed that a negativity peaking at 165 ms is increased if subjects perform a discrimination task compared to simple detection. In our study Gábor patches were unattended, so presumably only detection took place, and the N1b modulation was clearly a result of differences in the physical characteristics of the stimuli.

To sum up, the visual evoked potentials to standard stimuli provided evidence that orientation-related processing could be tracked until 190 ms after stimulus onset. Although vertical stimuli elicited a different N1a in an earlier time interval (around 120 ms), N1b was the component that precisely mapped the orientation of stimuli. While theoretical assumptions suggest that Gábor patches are primarily processed in V1, it is possible that extrastriate areas play a role as well; our findings corroborate this notion. The vMMN emerged earlier (onset time ∼120 ms) than N1b (around 170 ms) in both cardinal and oblique stimulus conditions. It is possible that the precise orientation of the stimuli was established only after the process marked by N1b, and that visual deviance detection was not able to utilize this input. This could account for the widely different thresholds of visual deviance detection in the passive (90◦) and active (10–17◦) paradigms.

The other feasible explanation is that the orientation of stimuli is determined in earlier levels of visual processing, possibly in V1, and the N1b component only signals the reprocessing of the stimulus as an object. In this case we can conclude that vMMN did not emerge to some of the differences that the visual system can detect, but only for considerably larger differences that exceed its own threshold.

Other studies provided further evidence that the sensitivity of active visual deviance detection is independent of the vMMN. In the experiment of Czigler et al. (2007) vMMN could be registered if the SOA between the stimulus and the backward mask was at least 40 ms. However, when the stimulus-mask SOA was increased up to 174 ms, the magnitude of the vMMN remained the same. In attended conditions participants responded to deviants with a Go-NoGo response. In this case performance increased monotonically up to the longest stimulus-mask SOA (174 ms). Lyyra et al. (2012) combined the change blindness paradigm with vMMN. Change blindness refers to the phenomenon that human subjects are usually slow or unable to detect sudden but minor changes in successive pictures of complex, natural scenes. The authors presented such pictures in oddball sequences while subjects tried to detect the change, and examined the ERPs recorded up to the point at which the change was detected. The authors hypothesized that vMMN would emerge even before the behavioral detection. Successful behavioral change detection occurred in the absence of vMMN with a 500 ms inter-stimulus interval (ISI), probably because the sensory memory crucial for vMMN elicitation decayed during this interval. With a shorter, 100 ms ISI, behavioral change detection was unchanged, but this time vMMN also emerged in posterior areas. Relevant to our question, vMMN was not a necessary prerequisite of explicit change detection in their study. Our results suggest a similar dissociation: the processes responsible for the discrimination performance in the active paradigm are not the same as those generating vMMN.

While in the auditory modality the threshold for MMN more or less corresponds to the behavioral threshold in an active, attended paradigm, this does not seem to be the case in the visual domain. Alho et al. (1992) presented rectangular black and white gratings in their experiment, and the deviant stimuli differed from the standard in height. Only the larger of the two deviances elicited a posterior negativity. In the study of Czigler et al. (2002) colored-black gratings were presented, and the results were similar: only the larger deviancy evoked vMMN. In summary, this phenomenon was demonstrated with three different types of visual deviancy: shape, color, and orientation.

It is of particular interest what the functional significance of this dissimilar sensitivity is. The auditory MMN could serve as a basis of subsequent orienting response, and vMMN might have a similar role (see Czigler et al., 2006). It would not be functional if every discriminable change in a sequence elicited an orienting reaction, because it would lead to unnecessary distraction from the primary task. In addition, since humans gather information mainly from vision, the processing of stimuli in the focus of attention is substantial, and stimulation in the background is secondary. Auditory perception operates often outside the focus of attention, so automatic, unconscious perceptual processes might have a more central role than in vision.

#### **ACKNOWLEDGMENTS**

We thank Zsuzsanna D'Albini for data collection. This study was supported by the National Research Fund of Hungary (OTKA 104462).

#### **REFERENCES**


of stimulus orientation on spatial frequency function of the visual evoked potential. *Exp. Brain Res.* 131, 121–125. doi: 10.1007/s002219900274


*Neurosci. Lett.* 368, 231–234. doi: 10.1016/j.neulet.2004.07.025


retinotopic and topographic analyses. *Hum. Brain Mapp.* 2, 170–187. doi: 10.1002/hbm.460020306


*Neuroreport* 22, 669–673. doi: 10.1097/WNR.0b013e32834973ba


study. *Brain Res. Cogn. Brain Res.* 13, 139–151. doi: 10.1016/S0926-6410(01)00103-3


modality. *J. Psychophysiol.* 27, 1–6. doi: 10.1027/0269-8803/a000085


*Neuroimage* 34, 1245–1252. doi: 10.1016/j.neuroimage.2006.08.050

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 02 September 2013; published online: 23 September 2013.*

*Citation: Takács E, Sulykos I, Czigler I, Barkaszi I and Balázs L (2013) Oblique effect in visual mismatch negativity. Front. Hum. Neurosci. 7:591. doi: 10.3389/fnhum.2013.00591*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Takács, Sulykos, Czigler, Barkaszi and Balázs. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# *Kairi Kreegipuu<sup>1</sup>\*, Nele Kuldkepp<sup>1,2</sup>, Oliver Sibolt<sup>1</sup>, Mai Toom<sup>1</sup>, Jüri Allik<sup>1,3</sup> and Risto Näätänen<sup>1,4,5</sup>*

*<sup>1</sup> Department of Experimental Psychology, Institute of Psychology, University of Tartu, Tartu, Estonia*

*<sup>2</sup> Doctoral School of Behavioural, Social and Health Sciences, Tartu, Estonia*

*<sup>3</sup> Estonian Academy of Sciences, Tallinn, Estonia*

*<sup>4</sup> Center of Integrative Neuroscience, University of Aarhus, Aarhus, Denmark*

*<sup>5</sup> Cognitive Brain Research Unit, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland*

#### *Edited by:*

*Piia Astikainen, University of Jyväskylä, Finland*

#### *Reviewed by:*

*István Czigler, Hungarian Academy of Sciences, Hungary*

*Lun Zhao, Jinan Military General Hospital, China*

#### *\*Correspondence:*

*Kairi Kreegipuu, Department of Experimental Psychology, Institute of Psychology, University of Tartu, Näituse 2, Tartu 50409, Estonia e-mail: kairi.kreegipuu@ut.ee*

Our brain is able to automatically detect changes in sensory stimulation, including in vision. A large variety of changes of features in stimulation elicit a deviance-reflecting event-related potential (ERP) component known as the mismatch negativity (MMN). The present study has three main goals: (1) to register vMMN using a rapidly presented stream of schematic faces (neutral, happy, and angry; adapted from Öhman et al., 2001); (2) to compare vMMNs elicited by angry and happy schematic faces in two different paradigms, a traditional oddball design with frequent standard and rare target and deviant stimuli (12.5% each) and a version of an optimal multi-feature paradigm with several deviant stimuli (altogether 37.5%) in the stimulus block; (3) to compare vMMNs to subjective ratings of valence, arousal and attention capture for happy and angry schematic faces, i.e., to estimate the effect of the affective value of stimuli on their automatic detection. Eleven observers (19–32 years, six women) took part in both experiments, an oddball and an optimum paradigm. Stimuli were rapidly presented schematic faces and an object with face-features that served as the target stimulus to be detected by a button-press. Results show that a vMMN-type response at posterior sites was equally elicited in both experiments. Post-experimental reports confirmed that the angry face attracted more automatic attention than the happy face, but the difference did not emerge directly at the ERP level. Thus, when studying change detection in facial expressions, we encourage the use of the optimum (multi-feature) design in order to save time and other experimental resources.

**Keywords: visual mismatch negativity, optimal design, oddball design, angry schematic face, happy schematic face**

#### **INTRODUCTION**

We are built to perform sparingly. For example, we do not expend perceptual resources at a stable (i.e., highly predictable) level of stimulation. The situation is different with changes in stimulation. The change could be a possible signal of an error, challenge, danger or just a need to react, which triggers a specific neuronal response in the brain, a mismatch negativity (MMN; Näätänen et al., 1978; Näätänen and Michie, 1979). The MMN is a change detection component of the event-related potential (ERP) curve that is obtained when the averaged ERP for the frequent standard stimulus is subtracted from that for the rare deviant stimulus. Since its discovery in the auditory modality in 1978, the MMN has been reported to reflect any discriminable changes (see Näätänen et al., 2007 for a recent review). Further support for the view of the MMN as a general reflection of deviance detection in the brain comes from studies of different modalities – vision (e.g., Czigler et al., 2002; Berti and Schröger, 2004; Lorenzo-López et al., 2004; Kremláček et al., 2006; Astikainen and Hietanen, 2009; Stefanics et al., 2012, etc., see a review in Czigler, 2007), touch (e.g., Kekoni et al., 1997; Shinozaki et al., 1998; Astikainen et al., 2001) and olfaction (e.g., Krauel et al., 1999). Thus, the MMN can be viewed as the most general cortical indicator of an unfulfilled prediction.

Establishing the MMN in vision (i.e., vMMN) still took some time and effort, and the main reason is obviously the different relation between vision and attention. One of the necessary properties of the MMN is its independence of attention (Näätänen, 1992; Maekawa et al., 2005) – it is even better observed in a passive (ignore) condition than in an attended condition. It is, of course, much harder to achieve an attention-free testing situation in vision than in hearing due to eyeblinks and the directedness of gaze. Maekawa et al. (2005) stated that for vision, it is only possible to have participants engaged with a task when one keeps the stimuli under investigation absolutely irrelevant. No more rigorous guidelines have been given. Even perfect performance in the overt auditory task does not guarantee that no attentional resources are left over for the visual stimuli. This doubt has also been recently expressed by Stefanics et al. (2012). Researchers have solved the problem of attention control by using different practices. Some have used a story-listening task with a later check of its contents (e.g., Zhao and Li, 2006; Astikainen and Hietanen, 2009; Maekawa et al., 2009; Li et al., 2012); others have introduced a visual target detection task unrelated to the vMMN stimuli (counting targets in Chang et al., 2010; button presses as a response to the target: Tales et al., 1999, 2008; Tales and Butler, 2006). Sometimes the target is presented in the center of the visual field, while standards and deviants appear more peripherally (e.g., in Kremláček et al., 2006; Stefanics et al., 2012; Kuldkepp et al., 2013).

"fnhum-07-00714" — 2013/10/25 — 11:28 — page 1 — #1

Because the (v)MMN has clinical value (e.g., Tanaka et al., 2001; Horimoto et al., 2002; Tales et al., 2002, 2008; Lorenzo-López et al., 2004; Tales and Butler, 2006; Hosák et al., 2008; Urban et al., 2008; Kenemans et al., 2010; all recently reviewed together with MMN studies by Näätänen et al., 2011, 2012), it certainly calls for rigorous and standardized measurement procedures. Furthermore, a systematic look at this clinical work also reveals the same important discrepancy (i.e., difficulties in controlling one's attention) between auditory and visual MMNs. For example, there are reports of two generators of the MMN in the auditory modality: a more perceptual, feature-related supratemporal generator and a cognitively higher, attention-switching frontal generator (Giard et al., 1990). In vision, the vMMN has mainly been found at parieto-occipital and occipito-temporal sites and only seldom at frontal sites (Wei et al., 2002; Astikainen et al., 2008; Hosák et al., 2008). However, Kimura et al. (2010, 2011) recently proposed the existence of two overlapping vMMNs: a more posterior sensory vMMN reflecting refractoriness and N1, and a more fronto-central cognitive or memory-dependent vMMN.

The original oddball design for measuring the MMN with 10–20% of deviant signals has great value as a clean experimental procedure but, at the same time, it is very time-consuming. This is the main reason why Näätänen et al. (2004) developed a new paradigm ("Optimum 1"). Optimum 1 is a multi-feature paradigm that allows recording of multiple MMNs in a session with four to five deviants, 10–12.5% of each. The deviants differ from the standard in one feature, and can be presented alternately with the standard stimulus. Recently, Fisher et al. (2011) modified this paradigm by making it shorter and showing that three deviants (frequency, intensity, and duration of the sound) also elicited attenuated MMNs that were still of a reasonable size. As far as we know, the optimal multi-feature paradigm has not yet been applied in the context of visual automatic change detection (vMMN). Still, some researchers (Zhao and Li, 2006; Astikainen and Hietanen, 2009) have successfully presented two deviants (fearful or sad and happy faces) equiprobably in the same session (5 or 10%, for these studies, respectively). This is a very close approximation to the simple form of the optimum design. However, these authors did not compare their results to oddball data, which we believe is worth doing.

Obviously, we cannot easily stop participants from blinking but we can help their brain by presenting stimuli that are difficult to ignore or that are even considered to be processed automatically, like movements (Gibson, 1950) or faces (Palermo and Rhodes, 2007). In this study, we deal with schematic emotional faces for at least three reasons – *automaticity*, *simplicity* and *relevance*. (1) By automaticity, we mean that faces have often been reported as having been processed with high priority and without conscious effort and attention (Palermo and Rhodes, 2007). Contrary to many other objects, there is even a face-specific ERP component, N170, indicating fast detection of faces (Bentin et al., 1996). Studies also show that a basic categorization between a face and a non-face takes place in an even earlier time range (at about 100 ms, Pegna et al., 2004). Furthermore, probably due to the evolutionary processes, it is the expressional value of a face that most likely gets preferentially processed (Palermo and Rhodes, 2007). There have been several demonstrations that this automatic emotion processing of a face is asymmetric, favoring an angry or threatening face over a neutral or a happy one (Hansen and Hansen, 1988; Öhman et al., 2001; Weymar et al., 2011 with pop-out displays; Schupp et al., 2004; Stefanics et al., 2012) and involving the right hemisphere more than the left hemisphere (Palermo and Rhodes, 2007). Some studies with positive (happy) and negative (angry, sad, or fearful) faces as stimuli have reported longer latencies (Astikainen and Hietanen, 2009) but larger amplitudes for the negative face difference wave (Zhao and Li, 2006; but see also Xu et al., 2013 who found remarkable gender differences in the brain responses to schematic faces). Any emotion can be characterized by its valence and intensity (i.e., arousal value; Russell, 1980; Posner et al., 2005). 
These two categories tend to be temporally separated in their effects on ERPs (Olofsson et al., 2008). Olofsson et al. (2008) conclude in their review that the valence of stimuli is related to short latency (100–200 ms) ERPs, and arousal to longer-latency ERPs (200–300 ms). Thus, our brain differentiates between good and bad very quickly. However, sometimes happy face advantage is reported (e.g., Juth et al., 2005; Becker et al., 2011) attributing the effect to the communicative importance (Becker et al., 2011) or relative ease of perceptual processing of the happy face (Juth et al., 2005).

(2) The second reason for using schematic emotional faces as stimuli is their simplicity and controllability. It has been shown that simple schematic faces that differ in only a few features (i.e., direction of the eyebrows and the mouth) and have no identity (i.e., are unlikely to resemble any real person) are rated as relatively natural and as signaling different emotions (Horstmann, 2009). Horstmann's article shows that, with respect to the ability to signal threat and to appear natural, the best set of schematic faces is the one that Öhman et al. (2001) established. In a recent paper, Becker et al. (2011) also point out that a stimulus set may often contain confounds (like white teeth in a happy face) that explain effects mistakenly attributed to an emotion; such confounds can certainly be avoided with good schematic face stimuli.

(3) Relevance refers to the fact that schematic faces have already been used as stimuli in vMMN research (Chang et al., 2010; Li et al., 2012). These studies, together with the ones with photographic faces (e.g., Astikainen and Hietanen, 2009; Stefanics et al., 2012) and those involving face-processing ERPs (i.e., the N170), provide the temporal frame of reference used later in the present study. Typical latencies for the vMMN are 200–320 ms (Li et al., 2012), 100–350 ms (Chang et al., 2010), 150–180 ms and 280–320 ms (Astikainen and Hietanen, 2009), 170–360 ms (Stefanics et al., 2012), and 110–360 ms for happy and 120–430 ms for sad faces (Zhao and Li, 2006). Another aspect of relevance comes from a recent study in which the preferential processing of an angry face was present with both schematic and photographic stimuli (Lipp et al., 2009).

In the current study, we measured vMMN for schematic faces (happy, angry and neutral) in two different designs: the traditional oddball and a variant of the optimal (or "optimum"; these two terms will be used interchangeably here) multi-feature paradigm. We expect to (1) measure vMMN for schematic faces at posterior and probably also frontocentral sites; (2) find angry-face superiority (i.e., an earlier or stronger response) in eliciting vMMN; and (3) demonstrate that the multi-feature optimum paradigm can also successfully replace the traditional oddball paradigm in the visual domain. We also look at relations between subjective ratings of the stimuli and vMMN, but this analysis remains rather descriptive.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Participants were eleven student volunteers (mean age 23.1 years, SD = 3.7 years, six women). They all had normal or corrected-to-normal vision and were right-handed. The study was approved by the Research Ethics Committee of the University of Tartu, and the participants gave written consent.

### **STIMULI AND PROCEDURE**

Recording took place in an electrically shielded, semi-darkened chamber. The presentation screen (*Mitsubishi Diamond Pro 2070SB*, 22"; 60 Hz) was viewed through a window at a distance of 114 cm. Stimuli were presented under the control of a Matlab program (MathWorks, Inc., Natick, MA, USA) for 249 ms with a 448 ms offset-to-onset interval (i.e., ISI = 448 ms) in the center of the computer screen (see **Figure 1** upper panel). The relatively long presentation time was chosen according to previous literature with comparable intervals (Astikainen and Hietanen, 2009; Stefanics et al., 2012; Kimura and Takeda, 2013) and pretesting, in which shorter presentation times tended to distress participants. During the ISI, the screen remained white (same as the stimulus background, 112 cd/m<sup>2</sup>). Stimuli were black-line schematic faces (674 × 789 pixels, i.e., 10.5° × 13.5°) on a white background (luminance 112 cd/m<sup>2</sup>): neutral, angry, and happy, plus a non-face object with scrambled face-like elements (adapted from Öhman et al., 2001 by Kukk, 2010; see **Figure 1** lower panel). The large size of the stimuli ensured that when observers looked at the center of the screen (as instructed), important parts (i.e., mouth and eyebrows) of the to-be-ignored schematic faces fell outside the fully attended foveal area. At the same time, the foveal part of the target stimulus (T) was optimal for its detection, and no extra eye movements or search were needed.

**Figure 1** (caption partially recovered): In the oddball experiment only upright stimuli were used. In the optimum experiment, the stimuli were also rotated by 180°. Stimuli are labeled from left to right: angry, happy, neutral, and target.
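As a rough plausibility check on the stimulus geometry described above, the physical extent of a stimulus can be related to its visual angle at the reported viewing distance of 114 cm. The following Python sketch is illustrative only: the function names are hypothetical and are not part of the authors' Matlab presentation program.

```python
import math

VIEWING_DISTANCE_CM = 114.0  # viewing distance reported in the text


def size_from_angle(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def angle_from_size(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (degrees) subtended by an extent of size_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Stimulus extent reported in the text: 10.5 x 13.5 degrees of visual angle.
width_cm = size_from_angle(10.5)
height_cm = size_from_angle(13.5)
print(f"{width_cm:.2f} cm x {height_cm:.2f} cm")
```

Under these assumptions the stimuli span roughly 21 cm by 27 cm on the screen, which is consistent with faces large enough for the mouth and eyebrows to fall outside the fovea during central fixation.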

All participants took part in two experiments – one with an oddball design and the other with a variant of optimum design. The sequence of experiments was pseudo-random and there was typically about 1.5 years between the measurements.

# *ODDBALL DESIGN*

There were four different conditions in the experiment with an oddball design. For calculating the vMMN, the conditions consisted of the following standard (S) and deviant (D) combinations: (1) angry D – neutral S; (2) happy D – neutral S; (3) neutral D – angry S; and (4) neutral D – happy S. In addition, we presented a non-face object as an attention-capturing target stimulus in each condition. All stimuli were presented in an upright position (as illustrated in **Figure 1** lower panel). In all four conditions, stimuli were arranged into 30 blocks, each consisting of 37 stimulus presentations. The first five presentations were always standard stimuli, and thereafter standard, deviant, and target stimuli appeared pseudo-randomly. The pseudo-random sequence in the oddball experiment followed some simple rules: the overall proportion of deviants and of targets was 12.5% each, and the minimum number of consecutive standards was two. This resulted in 120 deviant stimulus presentations per condition.
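The block structure above can be sketched in code. The Python example below is a minimal illustration, not the authors' Matlab implementation. It assumes that the 12.5% proportions apply to the 32 presentations following the five initial standards (four deviants and four targets per block, which reproduces the 120 deviants per condition), and that "minimum number of consecutive standards" means at least two standards between any two non-standard stimuli. The function name and labels are hypothetical.

```python
import random


def oddball_block(n_total=37, n_lead=5, n_dev=4, n_tgt=4, min_gap=2, rng=None):
    """Return one block as a list of 'S' (standard), 'D' (deviant), 'T' (target)."""
    if rng is None:
        rng = random.Random()
    rare = ["D"] * n_dev + ["T"] * n_tgt
    rng.shuffle(rare)
    n_std = n_total - n_lead - len(rare)
    # Standards before the first rare item, between rare items, and after the last.
    gaps = [0] + [min_gap] * (len(rare) - 1) + [0]
    slack = n_std - sum(gaps)
    if slack < 0:
        raise ValueError("not enough standards to satisfy the minimum gap")
    # Distribute the remaining standards randomly over the gaps.
    for _ in range(slack):
        gaps[rng.randrange(len(gaps))] += 1
    seq = ["S"] * n_lead
    for gap, item in zip(gaps, rare):
        seq += ["S"] * gap + [item]
    seq += ["S"] * gaps[-1]
    return seq


rng = random.Random(1)
blocks = [oddball_block(rng=rng) for _ in range(30)]
n_deviants = sum(block.count("D") for block in blocks)
```

With 30 blocks per condition this yields 30 × 4 = 120 deviants, matching the number given in the text.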

# *OPTIMUM DESIGN*

There were three different conditions in the experiment. In all of them, stimuli were arranged into 40 blocks that consisted of 37 stimulus presentations (as in the oddball experiment). The first five presentations were always standard stimuli, and thereafter standards alternated with deviant or target stimuli (meaning that every second stimulus was a standard). In our variant of the optimum design, one of the schematic faces (either neutral, angry, or happy, see **Figure 1** lower panel) was the standard stimulus in each of the three conditions. Standard stimuli were always presented in an upright position. The deviant stimuli were, depending on the condition, the two remaining schematic faces and their inverted versions (180° rotated, not illustrated in **Figure 1**, and not analyzed in the current article), the inverted version of the standard stimulus, and the standard stimulus presented in the position of a deviant (both not analyzed here). Altogether, the proportion of the six different deviants (including the standard presented as a deviant) was 37.5%, which resulted in 80 presentations of each deviant per condition.
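The counts in this design can be checked with a short sketch. The Python example below is illustrative and makes several assumptions that the text does not state: the proportions apply to the 32 presentations after the five initial standards (two presentations of each of the six deviants and four targets per block), the non-standard stimulus comes first in each alternating pair, and there is no further constraint on the order of deviants and targets. The deviant labels are hypothetical.

```python
import random

# Six deviant types per condition (labels are placeholders, not the authors' names).
DEVIANT_TYPES = ["D1", "D2", "D3", "D4", "D5", "D6"]


def optimum_block(n_total=37, n_lead=5, per_deviant=2, n_targets=4, rng=None):
    """Return one block in which every second stimulus after the lead-in is a standard."""
    if rng is None:
        rng = random.Random()
    rare = DEVIANT_TYPES * per_deviant + ["T"] * n_targets
    rng.shuffle(rare)
    if 2 * len(rare) != n_total - n_lead:
        raise ValueError("alternation needs as many standards as non-standards")
    seq = ["S"] * n_lead
    for item in rare:
        # Assumption: the deviant or target precedes the standard in each pair.
        seq += [item, "S"]
    return seq


rng = random.Random(0)
blocks = [optimum_block(rng=rng) for _ in range(40)]
per_type = {d: sum(block.count(d) for block in blocks) for d in DEVIANT_TYPES}
```

Under these assumptions, 12 of the 32 post-lead-in presentations are deviants (37.5%) and 4 are targets (12.5%), and 40 blocks give 40 × 2 = 80 presentations of each deviant, as reported.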

As in the oddball experiment, the scrambled non-face object was always presented as a target (here again, either upright or inverted) with a proportion of 12.5%. Participants were, again, instructed to ignore all the other stimuli and press the mouse key with their right hand as quickly as possible whenever the target appeared on the screen. As noted above, targets were easily detectable and there was no obvious reason to attend to standard and deviant stimuli. The onset of the blocks consisting of 37 presentations was self-initiated by the participant in both experiments. The idea behind the block-wise setup was to help participants follow the instruction to avoid blinking and body movements during the recording and to compensate for the effort during these self-terminated breaks.

### **EEG MEASUREMENT AND DATA ANALYSES**

The electroencephalogram (EEG) was recorded using a system of 32 active electrodes (Active Two, BioSemi, Amsterdam, Netherlands). In accordance with the 10/20 system, the recording sites were FP1, AF3, F7, F3, FC1, FC5, T7, C3, CP1, CP5, P7, P3, Pz, PO3, O1, Oz, O2, PO4, P4, P8, CP6, CP2, C4, T8, FC6, FC2, F4, F8, AF4, FP2, Fz, and Cz. Two additional electrodes were placed behind the earlobes of the participant and their signal was used offline as a reference. Additional electrodes were placed above and below the left eye (to record vertical eye movements and blinks) and at the outer canthi of the eyes to monitor horizontal eye movements. The EEG was registered online with a sampling rate of 1024 Hz and a band-pass filter of 0.16–100 Hz.

EEG data were analyzed offline with Brain Vision Analyzer 1.05 (Brain Products GmbH, Munich, Germany). A moderate 1–30 Hz filter (24 dB/oct) was applied to the data. Data were segmented into 950 ms pieces around the stimulus presentation (from −150 ms pre-stimulus to 800 ms post-stimulus) and a 100 ms pre-stimulus period was used as a baseline correction interval. A built-in ocular correction was also applied (Gratton et al., 1983). Artifact-rejection criteria for any segment were applied as follows: (1) maximal amplitude difference between consecutive data points over 50 μV; (2) maximal allowed amplitude difference of 100 μV; (3) amplitudes over 100 μV and below −100 μV; and (4) no more than 100 ms of low activity (0.5 μV). For the subsequent data analysis, electrodes were pooled together: O1, O2, and Oz for the occipital area (O); P4, P8, PO4, and Pz for the right parietal (RP); P3, P7, PO3, and Pz for the left parietal (LP); P3, P4, P7, P8, PO3, PO4, and Pz for the parietal activity (P); Cz, FC1, and FC2 for the midfrontal (MF); AF3, F3, and Fz for the left frontal (LF); AF4, F4, and Fz for the right frontal (RF); and AF3, AF4, F3, F4, and Fz for the frontal (F) activity.
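The baseline correction and artifact-rejection steps described above can be illustrated with a simplified sketch. The Python code below is not the Brain Vision Analyzer implementation; it assumes a single-channel segment stored as a list of amplitudes in μV, sampled at 1024 Hz from −150 to 800 ms, and it interprets criterion (4) as rejecting a segment if any 100 ms window has a peak-to-peak range below 0.5 μV. All function names are hypothetical.

```python
from statistics import mean

FS = 1024        # sampling rate in Hz (from the text)
T_MIN_MS = -150  # segment start relative to stimulus onset (from the text)


def ms_to_index(t_ms, fs=FS, t_min_ms=T_MIN_MS):
    """Index of the sample closest to t_ms within a segment."""
    return round((t_ms - t_min_ms) * fs / 1000)


def baseline_correct(segment, base_ms=(-100, 0)):
    """Subtract the mean of the pre-stimulus baseline from every sample."""
    start, stop = ms_to_index(base_ms[0]), ms_to_index(base_ms[1])
    base = mean(segment[start:stop])
    return [v - base for v in segment]


def is_artifact(segment, max_step=50.0, max_range=100.0, max_abs=100.0,
                low_act=0.5, low_ms=100, fs=FS):
    """Apply the four rejection criteria to one segment (amplitudes in microvolts)."""
    # (1) voltage step between consecutive samples
    if any(abs(b - a) > max_step for a, b in zip(segment, segment[1:])):
        return True
    # (2) peak-to-peak difference within the segment
    if max(segment) - min(segment) > max_range:
        return True
    # (3) absolute amplitude
    if any(abs(v) > max_abs for v in segment):
        return True
    # (4) low activity: any window of low_ms with a range below low_act (assumed reading)
    win = round(low_ms * fs / 1000)
    for i in range(len(segment) - win + 1):
        window = segment[i:i + win]
        if max(window) - min(window) < low_act:
            return True
    return False
```

In practice the same checks would be run on every channel of a segment, and a segment would be dropped if any channel failed; that policy is an assumption here, since the text does not specify it.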

To equalize the number of trials under comparison between standard and deviant stimuli, the respective proportion of trials was randomly selected out of each set of standards. It was visually checked that the selection did not influence the general shape of the averaged standard stimulus waveform. To analyze the individual data, standard, deviant and difference (vMMN) waveforms for each observer (*N* = 11), design (oddball and optimum) and condition (four different conditions in an oddball and three in an optimum experiment, see Stimuli and Procedure) were exported in ASCII format. Statistical comparisons were conducted in Statistica 8.0 (StatSoft Inc., Tulsa, OK, USA). Repeated measures analysis of variance (ANOVA) with a Greenhouse-Geisser correction of degrees of freedom (applied when needed for electrophysiological comparisons) and Tukey HSD or Bonferroni test for post-hoc comparisons was conducted.

We characterized the vMMN by a maximal negative peak (determined as the mean value of three points: the peak and its neighbors) and by a mean amplitude within five predefined intervals (based on the literature and visual inspection of results). These intervals (100–140, 140–180, 180–260, 260–340, and 340–500 ms) are rough estimations of the latency of the vMMN. The mean linear product-moment correlation between the highest negative peak and the average activity for the angry or the happy face difference waves (over all observers, conditions, pooled electrode sites and five intervals) was 0.961 (ranging from 0.943 to 0.971). Thus, these two estimates of the vMMN are very similar to each other. Therefore, we used only the average amplitude within the interval in further analyses.
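The two amplitude measures described above can be sketched as follows. This Python example is illustrative only: it assumes averaged waveforms stored as lists sampled at 1024 Hz from −150 to 800 ms, and the function names are hypothetical rather than taken from the authors' analysis pipeline.

```python
from statistics import mean

FS = 1024        # sampling rate in Hz (from the text)
T_MIN_MS = -150  # segment start relative to stimulus onset (from the text)
INTERVALS_MS = [(100, 140), (140, 180), (180, 260), (260, 340), (340, 500)]


def to_index(t_ms):
    """Index of the sample closest to t_ms."""
    return round((t_ms - T_MIN_MS) * FS / 1000)


def difference_wave(deviant, standard):
    """Deviant-minus-standard difference wave."""
    return [d - s for d, s in zip(deviant, standard)]


def mean_amplitude(wave, t0_ms, t1_ms):
    """Mean amplitude of the wave within [t0_ms, t1_ms)."""
    return mean(wave[to_index(t0_ms):to_index(t1_ms)])


def negative_peak(wave, t0_ms, t1_ms):
    """Mean of the most negative sample in the interval and its two neighbors."""
    i0, i1 = to_index(t0_ms), to_index(t1_ms)
    k = min(range(i0, i1), key=lambda i: wave[i])
    lo, hi = max(k - 1, 0), min(k + 1, len(wave) - 1)
    return mean(wave[lo:hi + 1])
```

Calling `mean_amplitude(diff, *interval)` for each entry of `INTERVALS_MS` would give the five interval values per difference wave that enter the ANOVAs.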

# **BEHAVIORAL RECORDINGS AND DATA ANALYSES**

The participants' manual reactions (as indicated by mouse key presses) were recorded online in milliseconds. Data were analyzed offline in Statistica 8.0 (StatSoft Inc., Tulsa, OK, USA) to calculate target detection probabilities and mean reaction times (RTs) for detections, and to compare the optimum and oddball designs using paired t-tests.

# **POST-EXPERIMENTAL QUESTIONNAIRE AND DATA ANALYSES**

After the experiment, a short inventory was administered asking participants to rate on a nine-point scale how (1) positive (1) or negative (9); (2) calming (1) or arousing (9); and (3) attention-attracting (1 – ignored or unnoticed, 9 – irresistible) each of the stimuli had subjectively felt. Participants were also asked to label the stimuli and to describe the strategy (i.e., verbal or figurative, not analyzed here) they used in the experiment. Subjects' ratings for each question were coded into values from one to nine, and mean values for each category (valence, arousal and attention-attracting) were calculated. Repeated measures analysis of variance (ANOVA) was used to test mean differences in participants' subjective ratings across stimuli and categories. Also, ratings given after the experiment with the optimum design were compared to ratings given after the experiment with the oddball design (this was done using ten participants' data, because one of the participants completed both experiments on the same day and filled in the questionnaire once). During the experimental session, labeling of the stimuli was deliberately avoided by showing drawings of the stimuli and calling them "it" or "this" when instructions were given.

# **RESULTS**

# **BEHAVIORAL AND POST-EXPERIMENTAL QUESTIONNAIRE'S RESULTS**

Overall, target detection was very good: the detection probability was 96.5% for the oddball and 92.3% for the optimum sessions. RTs for detections, 395.0 ms (SD = 45.59 ms) and 405.0 ms (SD = 41.7 ms) for the oddball and the optimum experiment, respectively, were also similar in both experiments (*t*(10) = 1.41, *p* = 0.188).

One may ask whether such simple stimuli carry any emotion-related meaning at all. We first tested whether the ratings of the stimuli differed depending on whether they were given after the oddball or the optimum experiment. A 2 (condition: oddball and optimum) × 4 (stimulus: angry, happy, neutral, and face-like object, i.e., target) repeated measures ANOVA showed that the subjective ratings of valence, arousal and attention did not differ between the two conditions [*F*(1,9) = 4.494, *p* = 0.063 for valence, *F*(1,9) = 0.429, *p* = 0.529 for arousal and *F*(1,9) = 0.243, *p* = 0.634 for attention]. This allows us to use all questionnaire results together in further analyses. **Table 1** shows the mean ratings of the stimuli according to two intrinsic dimensions of emotions (valence and arousal) as well as how much the stimuli had caught the attention of participants.

As can be seen, the target and the neutral face were indeed perceived as neutral, whereas the angry and happy faces were perceived as negative and positive, respectively. The arousal values of the stimuli did not differ significantly from each other. With respect to attention, targets were more attention-capturing than the happy and the neutral stimuli, but the angry stimulus was perceived as equally attention-catching as the target. This speaks for the subjective superiority of the angry (as a social threat-carrying) stimulus as a perceptual object. It is of interest that the automatic attention allocated to the target tended to relate negatively to the attention paid to the angry face (correlation being −0.618, *p* < 0.01).

#### **TOPOGRAPHY OF THE ERP DIFFERENCE (VMMN)**

To see whether there were lateralisation effects, we first compared responses at left and right frontal, parietal and occipital locations (LF, RF, LP, RP, O1, O2). This was done by ANOVA with the mean amplitude of the difference waves in the five intervals as the repeated variable and with lateralisation (left or right), position (frontal, parietal, and occipital), condition (oddball, optimum) and stimulus (angry, happy) as factors. We found no main effect of lateralisation [*F*(1, 240) = 1.185, *p* = 0.2774] nor any interaction with these factors, indicating basic symmetry in the brain responses. The mean amplitude of the differential response did differ between 100 and 500 ms, as there was a main effect of interval [*F*(2.42, 581.63) = 39.111, *p* < 0.00001, η<sup>2</sup> = 0.140, ε = 0.606]. Instead of negativity, the latest interval (340–500 ms) showed pervasive positivity (0.229 μV), and its activity differed from the mean activity of all the other intervals (−0.765, −1.058, −1.013 and −0.781 μV, respectively, starting from the earliest; all the differences confirmed by the Tukey post-hoc test). A similar analysis did not indicate any difference between frontal and midfrontal pooled sites. Thus, in further analyses we compare only the central pooled positions (F, P, and O) to each other (if not specified otherwise).

First, the responses to the deviant stimulus, either the angry or the happy one, were compared to the responses to the same stimulus presented as a standard in another session (i.e., Angry-Deviant-minus-Angry-Standard and Happy-Deviant-minus-Happy-Standard; see also Stefanics et al., 2012 for comparison of physically identical stimuli). This was done for the oddball and optimal experiments. The resulting difference waveforms as well as standard and deviant waveforms are presented in **Figure 2**.
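The deviant-minus-standard subtraction and the interval-wise mean amplitudes used throughout the analyses can be illustrated with a minimal sketch. The function names, the toy waveforms and the 4 ms sampling step are assumptions made for illustration only; they are not taken from the authors' analysis pipeline.

```python
# Illustrative sketch only: helper names, toy data and the 4 ms sampling
# step (250 Hz) are assumptions, not the authors' actual processing.
INTERVALS_MS = [(100, 140), (140, 180), (180, 260), (260, 340), (340, 500)]


def difference_wave(deviant, standard):
    """Deviant-minus-standard difference wave, sample by sample."""
    if len(deviant) != len(standard):
        raise ValueError("ERPs must have the same number of samples")
    return [d - s for d, s in zip(deviant, standard)]


def mean_amplitudes(wave, times_ms, intervals=INTERVALS_MS):
    """Mean amplitude of `wave` within each [start, end) interval (ms)."""
    means = []
    for start, end in intervals:
        samples = [v for v, t in zip(wave, times_ms) if start <= t < end]
        means.append(sum(samples) / len(samples))
    return means


# Toy ERPs sampled every 4 ms from 0 to 496 ms.
times_ms = list(range(0, 500, 4))
standard = [0.0 for _ in times_ms]
# The toy deviant is 1 microvolt more negative between 140 and 340 ms.
deviant = [-1.0 if 140 <= t < 340 else 0.0 for t in times_ms]

vmmn = difference_wave(deviant, standard)
interval_means = mean_amplitudes(vmmn, times_ms)
print(interval_means)
```

With these toy data the three mid-latency intervals carry the negativity, while the earliest and latest intervals stay at zero.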

Processing of the same stimulus as deviant and standard was subjected to an unpaired point-by-point *t*-test (Vision Analyzer 1.05, Brain Vision) with a rather conservative criterion, *t* < −5 or *t* > 5. Significant *t*-values are marked on the waveforms in **Figure 2** with two colors indicating how much time within the interval the processing of these two stimuli differed from each other. There are reliable differences between deviant and standard waveforms at 140–340 ms posteriorly, most likely representing the vMMN response. This is another reason, together with the ANOVA results reported above, to confine most of our analyses to the mid-latency intervals (140–180, 180–260, and 260–340 ms) where the vMMN is most likely elicited.
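A point-by-point comparison of this kind can be sketched as follows. This is a hedged illustration, not a reproduction of the Vision Analyzer routine: it assumes a pooled-variance Student's *t* for independent samples, and the function names and toy amplitudes are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Student's t for two independent samples (pooled variance; an assumption)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def pointwise_t(deviant_obs, standard_obs, criterion=5.0):
    """Return (t_values, flags), one entry per time point.

    Each argument is a list of observations; every observation is a list
    of amplitudes with one value per time point. A point is flagged when
    t < -criterion or t > criterion.
    """
    n_points = len(deviant_obs[0])
    t_values, flags = [], []
    for i in range(n_points):
        dev = [obs[i] for obs in deviant_obs]
        std = [obs[i] for obs in standard_obs]
        t = unpaired_t(dev, std)
        t_values.append(t)
        flags.append(abs(t) > criterion)
    return t_values, flags


# Hypothetical toy data: 4 observations per condition, 3 time points.
deviant_obs = [[0.1, -2.0, 0.0], [-0.1, -2.2, 0.2],
               [0.0, -1.8, -0.2], [0.2, -2.1, 0.1]]
standard_obs = [[0.0, 0.1, 0.1], [0.1, -0.1, -0.1],
                [-0.1, 0.0, 0.0], [0.2, 0.2, 0.2]]

t_values, flags = pointwise_t(deviant_obs, standard_obs)
print(flags)
```

Only the middle time point, where the toy deviant is clearly more negative than the standard, exceeds the criterion.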

Although the pattern is not fully clear, it can be seen that the negativity presumably representing the vMMN is most consistently present at occipital sites. This processing negativity is rather extended in time, possibly also comprising attention-related processing. Processing of the happy stimulus shows a remarkable similarity between experiments. Another common feature is the widespread positivity emerging after 340 ms. Activity in this late time interval may reflect attended or conscious processing of the stimuli. At the same time, there is also a frontal vMMN, as a frontal negative deflection appears in the 260–340 ms interval in the oddball paradigm for both stimuli, and also for the angry stimulus in the optimum paradigm. This explains why we included the frontal pooled site in further analyses.

Topographical illustrations of visual MMNs are plotted in **Figure 3** for three time-intervals where more prominent significant differences between deviant and standard processing were shown (**Figure 2**).


*Notes. N = 11. SD is in parentheses.* A, H, N *and* T *refer to differences (Bonferroni post hoc test, p < 0.05) in the angry, happy, neutral, and target stimulus, respectively. The scales were: valence, 1 – positive and 9 – negative; arousal, 1 – calming and 9 – exciting; attention, 1 – unattended, ignored and 9 – fully attended, irresistible.*

[Figure caption fragments: … representing "Angry-Deviant-minus-Angry-Standard" and "Happy-Deviant-minus-Happy-Standard" activity for the oddball and optimum paradigms. … for the deviant and the solid line is the difference wave (vMMN, deviant – standard). Graphs represent results from three pooled sites (frontal, … 100–140 ms, 2 – 140–180 ms, 3 – 180–260 ms, 4 – 260–340 ms, and 5 – 340–500 ms.]

# **ODDBALL VS. OPTIMUM**

One of the central aims of the study was to compare vMMNs to the schematic faces elicited in the two different MMN paradigms, the classical oddball and a variant of the optimum, with each other. Next, differential processing of the deviant stimulus, either the angry or the happy one (i.e., Angry-Deviant-minus-Angry-Standard and Happy-Deviant-minus-Happy-Standard), was inspected more closely. The occipital, parietal and frontal sites were selected for the graphical presentation because they showed, also at the individual level, the most prominent negative amplitudes (see **Figure 2**). Such a selection was supported statistically, too.

A repeated measures 3 (electrode sites: F, P, and O) × 3 (temporal intervals: 140–180, 180–260, and 260–340 ms) × 2 (angry vs. happy stimuli) × 2 (condition: optimal and oddball) ANOVA revealed a main effect of pooled electrode site [*F*(1.15, 11.53) = 15.17, *p* < 0.01, η<sup>2</sup> = 0.603, ε = 0.577]. At the same time, neither the experimental paradigm (oddball vs. optimal), the stimulus (angry vs. happy schematic face) nor the time interval had a significant main effect [*F*(1, 10) = 0.147, *p* = 0.710, η<sup>2</sup> = 0.015; *F*(1, 10) = 3.023, *p* = 0.113, η<sup>2</sup> = 0.232; and *F*(2, 20) = 0.492, *p* = 0.618, η<sup>2</sup> = 0.047, respectively]. The Tukey post-hoc test confirmed that the posterior sites (P, O) had more negative amplitudes than the frontal one (F). No interaction between these factors was significant. vMMNs obtained at the occipital, parietal and frontal sites are plotted in **Figure 4**. Although visually somewhat different, the mean amplitudes of these four vMMN curves do not differ from each other.

This again confirms that at mid-latency the oddball and the variant of the optimal paradigm give relatively similar estimates of the MMN.
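The fractional degrees of freedom and effect sizes reported above can be related to each other with a short calculation. The sketch below rests on two assumptions that the text does not state explicitly: that ε is a Greenhouse–Geisser-type correction multiplying the nominal degrees of freedom, and that η<sup>2</sup> is the partial eta squared, F·df1 / (F·df1 + df2). The function names are hypothetical, and the nominal df of (2, 20) for the electrode-site effect are inferred from three sites and 11 participants.

```python
def corrected_df(df_effect, df_error, epsilon):
    """Scale nominal degrees of freedom by the sphericity correction epsilon."""
    return epsilon * df_effect, epsilon * df_error


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Electrode-site effect: nominal df (2, 20), reported epsilon = 0.577.
df1, df2 = corrected_df(2, 20, 0.577)
eta_site = partial_eta_squared(15.17, df1, df2)

# Stimulus effect (angry vs. happy): F(1, 10) = 3.023, no correction needed.
eta_stimulus = partial_eta_squared(3.023, 1, 10)

print(round(df1, 2), round(eta_site, 3), round(eta_stimulus, 3))
```

Under these assumptions the computed values agree with the reported statistics for both effects (η<sup>2</sup> = 0.603 and 0.232). Because ε scales both degrees of freedom by the same factor, it does not change the effect size itself.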

#### **CONTROL FOR FEATURE DIFFERENCES AND REFRACTORY EFFECTS**

In addition to the between-series difference waveforms analyzed so far (Angry-Deviant-minus-Angry-Standard and Happy-Deviant-minus-Happy-Standard), we also computed classical, within-series difference waves (Angry-Deviant-minus-Neutral-Standard and Happy-Deviant-minus-Neutral-Standard). Again, a 2 (type of standard: physically the same as or different from the deviant) × 3 (recording site: F, P, O) × 2 (stimulus: angry or happy face) × 3 (temporal intervals) × 2 (oddball and optimum) ANOVA indicated that the type of standard did not matter for the generation of difference waves [*F*(1, 10) = 2.419, *p* = 0.1509, η<sup>2</sup> = 0.195]. Thus, it did not make any difference whether deviants were compared to physically identical or different standards. As the vMMN is also thought to represent activity from fresh units encoding new input (i.e., the refractoriness issue; see Kimura, 2012), this result suggests that the contamination of the vMMN by refractory effects is not critical. However, an interaction between the stimulus (angry or happy) and the comparison (with the same or a different standard) [*F*(1, 10) = 6.4371, *p* = 0.0295, η<sup>2</sup> = 0.392] indicated that the stimuli may differ in this respect. We found that processing of the happy stimulus was sensitive to the similarity of the standard stimulus, showing only about half of the amplitude in the same-standard condition as compared to the neutral-standard condition (−0.76 vs. −1.35 μV).

In MMN research, there is always the question of what lies behind the difference waveforms. The very first candidate is a physical difference between standard and deviant stimuli that would result in a larger amplitude of the N1 component and an earlier MMN (see Kimura, 2012). Next, we analyzed data from two inverse conditions, i.e., Angry-Deviant-minus-Neutral-Standard was compared to Neutral-Deviant-minus-Angry-Standard, and Happy-Deviant-minus-Neutral-Standard was compared to Neutral-Deviant-minus-Happy-Standard. This is typically (in the case of equal-sized MMNs) considered to help control for exogenous effects in the MMN. For this we conducted a 2 (type of standard: emotional or neutral schematic face) × 3 (localization site: F, P, O) × 2 (stimulus: angry or happy face) × 3 (temporal intervals) repeated measures ANOVA (with experimental design as the grouping factor). The mere direction of comparison did not have a significant effect on average activity in the intervals [*F*(1, 10) = 3.898, *p* = 0.077, η<sup>2</sup> = 0.281]. However, the results show an interaction between pooled electrodes and comparison direction (i.e., whether the emotional stimulus is compared to the neutral one or vice versa) [*F*(1.54, 15.44) = 13.165, *p* = 0.001, η<sup>2</sup> = 0.568, ε = 0.772]. A Tukey post-hoc test confirmed that emotional deviants elicited more negativity at occipital and parietal recording sites than the respective neutral deviants. Thus, it appears that a standard-deviant inverse procedure carries the built-in risk that such a comparison does not work, and it does not even have to work. We believe that our current case belongs to the latter category: the emotional deviant stimulus simply gets extra processing because of its evolutionary significance. A similar pattern for an angry face was found in a search task using direct and averted gaze direction: a face with direct gaze, indicating more threat, was found more efficiently among angry faces with averted gaze than vice versa (von Grünau and Anston, 1995).

The rare emotional stimulus (either angry or happy deviant) among neutral standards was obviously more salient and attracted more automatic processing resources than the neutral deviant among emotional standards. Actually, this was what we implicitly expected when choosing emotional to-be-ignored stimuli! Thus, we failed to test the feature equality between stimuli but, at the same time, found some proof that emotional deviants attract automatic attention. In the following analyses we abandon the reversed neutral deviant conditions.

#### **IS THERE ANY ANGRY ADVANTAGE?**

According to our analyses, the quick answer to this question appears to be "no". Still, **Figures 2**, **3**, and **4** show at least some differences between the two deviant emotional stimuli with opposite valences. Also, the previous analyses indicate some advantages in the processing of the angry stimulus as compared to the happy one. For example, the findings that (1) the angry stimulus got subjectively more attention than the other non-targets (**Table 1**); (2) for the angry stimulus the processing negativity started earlier than for the happy deviant (**Figure 2**); and (3) processing of the angry face did not depend on the standard stimulus all indicate some superiority of the angry face for perception.

Furthermore, it may be logical to ask whether the theoretically plausible superiority of the angry face that seems to be present in **Figure 4** survives statistical testing. A main-effects ANOVA (2 stimuli × 2 designs × 3 electrodes) for the mean activity in the interval of the most prominent vMMN (140–180 ms) shows again the already reported main effect of electrode [*F*(2, 127) = 23.11, *p* < 0.00001, η<sup>2</sup> = 0.267] and also a main effect of stimulus [*F*(1, 127) = 4.494, *p* = 0.028, η<sup>2</sup> = 0.038]. Generally, within this interval (and also in the next interval), the angry stimulus produces vMMNs with a higher average amplitude, but this does not depend on the experimental design.

Altogether, although we were unable to discover a broad and striking angry superiority effect at the level of deviance detection in the brain, there are some indications of it.

#### **VMMN'S RELATION TO SUBJECTIVE RATINGS OF STIMULI**

Finally, we examined whether individual ratings of each stimulus's valence, arousal and power to attract attention were related to the mean amplitude of the difference wave (vMMN) within the five intervals (100–140, 140–180, 180–260, 260–340, and 340–500 ms) and an extended list of pooled electrode positions (F, LF, RF, MF, P, LP, RP, O). We decided to use a wider range of positions and temporal intervals here because it may be meaningful for emotional attention issues. Instead of single correlations we ran multiple regression analyses (forward stepwise method) to predict subjective ratings from the mean amplitudes of the difference waves (vMMNs) in all five intervals. For valence and arousal ratings, and for the attention subjectively allocated to the happy stimulus or to the target, the models either did not converge or did not reduce the set of predictors effectively enough. For the angry stimulus there appeared to be a set of independent predictors accounting for 71% of the attention subjectively paid to it. A significant model was achieved [*F*(11, 9) = 5.46, *p* < 0.00836] with an adjusted *R*<sup>2</sup> = 0.710 and the following significant predictors (standardized regression coefficients in parentheses): for 100–140 ms at LF (0.463); for 140–180 ms at RP (−0.670), MF (1.061), RF (6.930), F (−6.344); for 260–340 ms at LP (0.793), RP (−0.200), MF (0.981), RF (−3.340), F (2.152); and for 340–500 ms at MF (0.336). A closer look at these 11 predictors reveals some patterns: (1) most of them are located frontally (MF, F, RF); (2) there are only two typical predictor intervals, 140–180 ms and 260–340 ms; (3) two out of three posterior predictors (LP, RP) are in 260–340 ms; and (4) more predictors lie on the right than on the left.
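The reported model statistics can be cross-checked against each other. The sketch below assumes an ordinary least-squares model with an intercept, so that the *F*(11, 9) test implies 11 predictors, 9 residual degrees of freedom and 11 + 9 + 1 = 21 observations; this reading of the degrees of freedom, and the function names, are assumptions rather than details given in the text.

```python
def r_squared_from_f(f_value, n_predictors, df_residual):
    """Recover R^2 from the overall F test of a regression model."""
    return (f_value * n_predictors) / (f_value * n_predictors + df_residual)


def adjusted_r_squared(r2, n_predictors, df_residual):
    """Adjusted R^2 for a model with an intercept (assumed)."""
    n_obs = n_predictors + df_residual + 1
    return 1 - (1 - r2) * (n_obs - 1) / df_residual


r2 = r_squared_from_f(5.46, 11, 9)
adj_r2 = adjusted_r_squared(r2, 11, 9)
print(round(r2, 3), round(adj_r2, 3))
```

Under these assumptions the unadjusted *R*<sup>2</sup> is about 0.87, and the adjustment for 11 predictors brings it down to the reported 0.710, which illustrates how strongly the adjusted value depends on the number of predictors relative to the number of observations.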

# **DISCUSSION**

Our results show that (1) in the occipital and parietal area, the oddball and the optimum designs elicit vMMN equally in automatic deviance detection; (2) emotional faces are more efficient in eliciting vMMN in the brain than the neutral schematic face; (3) automatic visual change detection is the most powerful during 140–260 ms after stimulus onset and at the posterior (P, O) sites; (4) although participants were asked to ignore it, the angry stimulus catches as much subjective attention as the target (**Table 1**, **Figure 2**); and (5) despite the differences in subjective ratings of valence and attention-catching, the angry and the happy deviant stimuli do not differ much from each other, but both differ from the neutral stimulus in processing at the brain level; (6) allocation of attention to the angry stimulus was hard to avoid.

#### **DID WE REGISTER THE VMMN?**

First, we should clarify whether we dealt with the vMMN at all. The general shape of the vMMN tends to vary somewhat with stimulus and experiment type. Our stimuli – the sequence of schematic or realistic faces – resembled the ones used in several previous studies (Zhao and Li, 2006; Astikainen and Hietanen, 2009; Chang et al., 2010; Stefanics et al., 2012). The shapes of the deviant-minus-standard difference waves obtained, and their prominently posterior location, were comparable as well. Our relatively early posterior vMMN also includes at least some N1 and refractory activity (e.g., Kimura et al., 2010, 2011; Kimura, 2012; Kimura and Takeda, 2013). Inspired by the adaptation vs. memory trace debate on the MMN (e.g., Näätänen et al., 2005), we argue that the adaptation part (i.e., the difference in N1) is neither the most decisive nor the only factor here because: (a) the observed negative differential posterior activity lasts for about 200 ms (140–340 ms), which is too long and too late for pure early sensory activity (indicated by P1 and N170, see **Figure 2**); (b) the afferent activity should depend on the physical difference between deviant and standard (which could have been either the neutral stimulus of the same session or the angry/happy face of another session in our study), but the posterior vMMNs we found with these two types of standards differed only for the happy, not for the angry, deviant. This discrepancy may be related to the amount of automatic attention the angry stimulus inevitably catches.

Due to the nature of the stimuli it was very difficult to avoid automatic attention being allocated to them. Our stimuli were presented in the center of the visual field for a relatively long time (250 ms), allowing attentional processes to operate. At the same time, the stimuli extended over a relatively wide area (13.5° × 10.5°), so that their informative parts (mouth, eyebrows) were certainly not presented foveally but rather processed automatically. Attention was allocated more to the target and the angry face than to the other stimuli (**Table 1**). Nevertheless, the extra involuntary attention to the angry stimulus did not yield any considerable differences in ERPs in the vMMN interval up to 260 ms. Of course, **Figure 2** shows that the angry face tends to elicit a difference between deviant and standard processing also around the N1 range (100–140 ms), which points to the role of attention in detecting them. After the posterior vMMN there was some frontal vMMN in the processing of the angry face (see **Figures 2** and **3**, 260–340 ms), probably reflecting an automatic attention switch (Giard et al., 1990). A probable attention reorienting was supported by the multiple regression analysis showing that the vMMN in the EPN range (Schupp et al., 2006; Olofsson et al., 2008) was related to the amount of attention allocated to the target. However, only a few positions (LP and RP, both 260–340 ms) were actually posterior. This activity was also lateralised, being more in the right than in the left hemisphere (also observed by Stefanics et al., 2012). These earlier posterior and later frontal difference curves did not differ between conditions and stimuli, probably due to the modest sample size. According to Kimura et al. (2010) and Kimura and Takeda (2013), the later and more anterior vMMN is an even more genuine marker of automatic difference processing than the vMMN recorded from the more sensory areas.

#### **ODDBALL VS. OPTIMUM PARADIGM**

The most important practical result of the study is the essential equivalence of the oddball and the optimal multi-feature design in eliciting the posterior vMMN. Experimental design did not have a main effect in any of the comparisons we performed. But let us take an intuitive look at **Figure 4**, representing the four vMMNs in the occipital, parietal and frontal areas. Intuition tells us that in the optimum design, the processing of the angry deviant would elicit a higher-amplitude vMMN than the happy deviant. This is close to the truth, as in the restricted intervals the angry stimulus was processed with higher mean activity, but this was a rather pervasive tendency at posterior sites, not specific to the oddball or optimum paradigm. It may be asked whether the inability to discriminate between deviant stimuli is an advantage or a disadvantage of the oddball and optimum designs. Probably the stimuli we used were not strong enough to produce such differences. This really deserves further investigation, but the encouraging fact is that the designs were rather equal and can be used interchangeably, depending on specific needs.

However, it should be mentioned that the presentation probability for a single deviant in the oddball experiment was about twice as high as in the optimum experiment. Thus, some amount of the vMMN in the optimum paradigm could have originated from its lesser refractory state (see Kimura and Takeda, 2013). In future research, refractoriness in the optimum paradigm should be studied systematically.

Further studies should contrast these two designs with equally salient, more neutral stimuli, which would also make it possible to test endogeneity. A good candidate for such a feature is visual motion (see Kuldkepp et al., 2013) differing in direction, velocity and duration, for example. We consider the future development of the visual optimal paradigm for vMMN measurement truly promising, as this would considerably facilitate its clinical implementation. Clinically applicable and standardized multi-feature vMMN experiments would be very welcome for diagnostic and treatment-monitoring purposes, for example in the case of Alzheimer's disease and mild cognitive impairment (e.g., Tanaka et al., 2001; Tales et al., 2002, 2008; Tales and Butler, 2006), schizophrenia (Urban et al., 2008) and alcohol intoxication (Kenemans et al., 2010).

#### **ANGRY VS. HAPPY FACES**

The specific nature of the deviants – either angry or happy – did not explain the obtained vMMN waveforms. The most meaningful result was the generally higher mean activity for the angry than the happy deviant, but this was only seldom statistically significant. As other vMMNs for these two stimuli did not differ significantly, the registration of the vMMN was, indeed, relatively attention-free. Generally, the subjective state of the participant was expected to relate to vMMN amplitude (evidence reviewed in Näätänen et al., 2011). The fact that the valence of stimuli did not relate to the vMMN may be connected to the relatively late temporal window under close investigation. For the face, positive–negative categorization may take place even earlier than 100 ms (Palermo and Rhodes, 2007). The averaged vMMNs (**Figure 2**, leftmost panel) are not too encouraging in this respect. The face-specific component N170 was found at around 160 ms (see **Figure 2**) and it did not differ between angry and happy stimuli. Neither did arousal (indicated by the subjective ratings). Our expected angry stimulus advantage (Öhman et al., 2001) or negativity bias (Stefanics et al., 2012) has been shown to have considerable gender differences (Bradley and Lang, 2007) that should be taken into account in future research (Xu et al., 2013).

# **LIMITATIONS AND STRENGTHS**

Our study is quite exploratory, but we have raised several important issues that, in order to be addressed, need a different study with finer spatial and temporal resolution and probably also a larger sample. A larger sample would enable us to take a closer look at the gender differences in lateralization that have recently been reported (Xu et al., 2013).

On the other hand, our study has the strength of using a within-subjects design, giving us the certainty that differences between conditions and stimuli are not produced by different groups. Another aspect is the use of a repeated measures design with at least satisfactory quality of each individual data set. To conclude, we have taken an important and rather successful step toward establishing the optimum multi-feature registration procedure for the vMMN.

# **ACKNOWLEDGMENTS**

We thank Andres Kreegipuu for technical help, Delaney Skerrett for language advice, Piia Astikainen for great editorial work and Kertu Saar for help with the figures. This research was supported by the Estonian Science Foundation (grant #8332), the Estonian Ministry of Education and Research (SF0180029s08 and IUT02-13) and Primus grant (#3-8.2/60) from the European Social Fund to Anu Realo.

#### **REFERENCES**

Becker, D. V., Anderson, U. S., Mortensen, C. R., Neufeld, S. L., and Neel, R. (2011). The face in the crowd effect unconfounded: happy faces, not angry faces, are more efficiently detected in single- and multiple-target visual search tasks. *J. Exp. Psychol. Gen.* 140, 637–659. doi: 10.1037/a0024060

… detection as reflected by the mismatch negativity (MMN) – evidence for a memory-based process. *Neurosci. Res.* 65, 107–112. doi: 10.1016/j.neures.2009.06.005

… disease and ageing. *Neuroreport* 13, 2557–2560. doi: 10.1097/00001756-200212200-00035

… A stare-in-the-crowd effect. *Perception* 24, 1297–1313. doi: 10.1068/p241297

… of facial expressions: an ERP Study. *Brain Topogr.* 26, 488–500. doi: 10.1007/s10548-013-0275-0

Zhao, L., and Li, J. (2006). Visual mismatch negativity elicited by facial expressions under non-attentional condition. *Neurosci. Lett.* 410, 126–131. doi: 10.1016/j.neulet.2006.09.081

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 Apr 2013; accepted: 08 Oct 2013; published online: 28 October 2013.*

*Citation: Kreegipuu K, Kuldkepp N, Sibolt O, Toom M, Allik J and Näätänen R (2013) vMMN for schematic faces: automatic detection of change in emotional expression. Front. Hum. Neurosci. 7:714. doi: 10.3389/fnhum.2013.00714*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Kreegipuu, Kuldkepp, Sibolt, Toom, Allik and Näätänen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Investigating developmental changes in sensory processing: visual mismatch response in healthy children

#### *Katherine M. Cleary<sup>1</sup>\*, Franc C. L. Donkers<sup>1,2</sup>, Anna M. Evans<sup>1</sup> and Aysenil Belger<sup>1</sup>*

*<sup>1</sup> Department of Psychiatry, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA*

*<sup>2</sup> Department of Psychology, Tilburg University, Tilburg, Netherlands*

#### *Edited by:*

*Gabor Stefanics, University of Zurich and ETH Zurich, Switzerland*

#### *Reviewed by:*

*Paavo H. T. Leppänen, University of Jyväskylä, Finland Nicole Wetzel, University of Leipzig, Germany*

#### *\*Correspondence:*

*Katherine M. Cleary, Department of Psychiatry, School of Medicine, University of North Carolina at Chapel Hill, 101 Manning Dr., CB #7160, Chapel Hill, NC 27599, USA e-mail: kmcleary@ad.unc.edu*

The ability to detect small changes in one's visual environment is important for effective adaptation to and interaction with a wide variety of external stimuli. Much research has studied the auditory mismatch negativity (MMN), or the brain's automatic response to rare changes in a series of repetitive auditory stimuli. But recent studies indicate that a visual homolog to this component of the event-related potential (ERP) can also be measured. While most visual mismatch response (vMMR) studies have focused on adult populations, few studies have investigated this response in healthy children, and little is known about the developmental nature of this phenomenon. We recorded EEG data in 22 healthy children (ages 8–12) and 20 healthy adults (ages 18–42). Participants were presented with two types of task-irrelevant background images of black and gray gratings while performing a visual target detection task. Spatial frequency of the background gratings was varied, with 85% of the gratings being of high spatial frequency (HSF; i.e., standard background stimulus) and 15% of the images being of low spatial frequency (LSF; i.e., deviant background stimulus). Results in the adult group showed a robust mismatch response to deviant (non-target) background stimuli at around 150 ms post-stimulus at occipital electrode locations. In the children, two negativities around 150 and 230 ms post-stimulus at occipital electrode locations and a positivity around 250 ms post-stimulus at fronto-central electrode locations were observed. In addition, larger amplitudes of P1 and longer latencies of P1 and N1 to deviant background stimuli were observed in children vs. adults. These results suggest that processing of deviant stimuli presented outside the focus of attention in 8–12-year-old children differs from that in adults, which is in agreement with previous research. They also suggest that the vMMR may change across the lifespan in accordance with other components of the visual ERP.

**Keywords: mismatch negativity, visual mismatch response, vMMN, children, developmental psychology, spatial frequency processing, ERP, EEG**

# **INTRODUCTION**

The human brain is constantly responding to changes in sensory stimuli, even if these changes do not pass into conscious awareness. Mismatch negativity (MMN), or the brain's response to infrequent changes in a series of repetitive stimuli (Näätänen and Escera, 2000), is an element of the Event-Related Potential (ERP) that allows for the investigation of the neural correlates of (automatic) change detection in the environment. MMN is typically measured when the subject's attention is directed away from the stimulus, and manifests as a difference wave computed by subtracting the ERP to a frequently-occurring standard stimulus from the ERP to a rarely-occurring deviant stimulus. The MMN can be measured relatively early in development and is generally viewed as the outcome of a mechanism that compares the current sensory input to memory traces formed by previous repetitive inputs, and signals a mismatch between them (e.g., Näätänen et al., 2005, 2007).

MMN has mainly been investigated in the auditory modality, but recent studies have characterized this difference wave in the visual modality as well (see Pazo-Alvarez et al., 2003; Czigler, 2007, for reviews). Recent research (reviewed by Pazo-Alvarez et al., 2003 and Czigler, 2007) has provided convincing evidence that the brain can unconsciously detect small changes in the visual environment. Visual MMN (vMMN) is an occipital-parietal negativity computed by subtracting the ERP to a frequently-occurring standard stimulus from the ERP to a rarely-occurring deviant stimulus in the visual modality. vMMN usually occurs around 100–250 ms post-stimulus presentation and to date has been primarily studied in typically-developing adults. Visual MMN has been observed in response to unattended changes in color (Czigler et al., 2002, 2006; Berti, 2009), line orientation (Kimura et al., 2009; Czigler and Sulykos, 2010), stimulus position in the visual field (Berti, 2009; Muller et al., 2012), emotional faces (Chang et al., 2010; Gayle et al., 2012; Stefanics et al., 2012), and spatial frequency (Fu et al., 2003; Heslenfeld, 2003).

Like the more frequently studied auditory MMN, differences in the specific paradigms employed and, in some cases, differences in populations studied, may yield different patterns of vMMN. In early vMMN studies, there has been some debate as to whether this negativity represents the refractory effect of the visual stimulus itself or a true detection of change based on building up of a memory trace for the repeated stimulus and a "comparison" of the deviant stimulus features against this trace. Kimura et al. (2009) addressed this question by presenting healthy subjects with two paradigms, the equiprobable (all types of stimuli presented at equal frequencies) and the oddball (standard stimuli 80% of presentations and deviant stimuli 20%). In the equiprobable paradigm, bar stimuli in five different types of orientations were presented; a control bar stimulus was presented twenty percent of the time, equally as likely to be viewed as any of the other four orientations. In the equiprobable paradigm change-specific neuronal populations should not be activated. In the oddball sequence, two bar stimuli with the two closest line orientations were presented: the deviant stimulus twenty percent of the time, and the standard stimulus eighty percent of the time. The authors compared deviant/standard, deviant/control, and control/standard pairings, and found two negativities when comparing deviant stimuli to standard stimuli; one at 100–150 ms and another at 200–250 ms. However, when they compared deviant stimuli to control stimuli, only the later negativity was elicited. The authors concluded that the early negativity is related to the refractory effect while the later one is related to the memory component of stimulus change detection. Similarly, Czigler et al. (2006) found two occipital/centroparietal negativities in healthy adults viewing a set order of color grids that was periodically displaced. 
One negativity occurred at 100–140 ms post-stimulus and another at 210–280 ms post-stimulus. The purpose of the set pattern of alternating colors was to determine if the vMMN was related to change in stimuli themselves or a detection of deviance from a pre-established pattern of change in stimuli. Only the second, later negativity at 210–280 ms was elicited when the pattern of color grids was violated, indicating that this later waveform reflects comparison to an established memory trace for the stimulus pattern and not stimulus change *per se*. These findings indicate that, in the visual modality, change detection may involve a 2-step process: a first "sensory" change detection, occurring earlier, and possibly processed at a more "local" level in primary sensory cortices; and a second, occurring slightly later, and possibly depending upon the contrasting of the current stimulus with an established "contextual memory trace" through interactions between visual sensory and higher order associated cortical regions.

Despite a growing number of visual mismatch response (vMMR) studies in adults, there is comparatively little research on the vMMR in children. A recent study by Clery et al. (2012) used dynamic deformations in a circle slowly becoming an ellipse to examine vMMR in healthy adults, as well as in healthy children aged 8–14. While in adults the vMMR was observed as an occipital-parietal negativity occurring around 210 ms post-stimulus, in children, three successive negativities originating over fronto-central electrode positions were observed between 150 and 330 ms. In addition, a larger late mismatch *positive* response was observed in children around 450 ms post-stimulus. The authors conclude that not only is the vMMR immature in children up to 14 years of age, but the successive negative potentials may reflect a sequential visual processing of deviancy that is not present in the mature brain. Processing of visual deviancy during development may require several distinct steps that are not necessary for adults, and may be related to immature selective attention processes or underdeveloped connectivity across cortical regions. Scalp topography maps suggested equal temporal recruitment of the dorsal and ventral pathways in adults, but the involvement of right parietal areas in the late positive potential observed in children may indicate that the dorsal pathway is engaged later in stimulus change detection processing in children. It is worth noting, however, that the stimuli used in the Clery et al. study featured changes in both form and motion, and the authors hypothesize that these two stimulus properties may be processed separately in children, with maturation of the visual system leading to better integration of multiple stimulus properties. Currently no studies have investigated the vMMR in children treating changes in stimulus form and motion as separate deviant events.
Studies using static stimuli that probe changes in physical form or dynamic stimuli with constant physical properties would help confirm this theory. Also worth noting is that the age range investigated in the Clery et al. study comprised a good portion of late childhood and adolescence. Since many important neurophysiological changes occur during adolescence, vMMR may be different in the younger portion of their sample compared to the older portion of their participants. The authors also note that developmental changes in vMMR appear more drastic than those in the auditory modality. Other studies have also reported latency decreases in vMMR with age up to approximately age 16 (Tomio et al., 2012). This latency difference may indicate improved cognitive processing until the late teenage years, possibly associated with improved connectivity resulting from brain maturation. In particular, Tomio et al. conclude that increasing age affords increasing ability to discriminate stimulus properties pre-attentively, and hypothesize that difficulty of stimulus property discrimination may affect latency differences. These differences are seen in other studies that have investigated vMMR across development using different stimuli, such as color differences (Horimoto et al., 2002), which appear to be developmentally mature at 7–13 years of age and can even be observed in mentally retarded (MR) children. Therefore, color modality may be easier to discriminate than the black and white stimulus pattern used by Tomio et al., and may require less advanced stimulus discrimination ability.

While a small number of recent studies, described above, have investigated vMMR in children, specific differences in the mismatch response at various stages in development and across different paradigms are still unclear. In addition, understanding of the neurobiology of developmental differences in vMMR is still in its infancy. In the current study, we aim to further characterize the vMMR in a sample of 8–12-year-old children. This age range is comparable to the age range used in the Clery et al. (2012) study, but we chose to limit the upper age range to twelve in order to examine a more narrowly defined age group. We compared the vMMR to deviant task-irrelevant background stimuli in children to the vMMR of adults while both groups were occupied performing a simple target detection task. We hypothesized that a vMMR would be observed to changes in background stimuli in both groups. Because our stimuli deviated only in form, rather than in form and motion as in a previous study with children in this age range (Clery et al., 2012), we hypothesized the appearance of one negative occipital deflection in the difference wave for both groups.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

We collected EEG data from 20 healthy adults between the ages of 18 and 42 [mean age = 26.6 (*SD* = 5.65); 10 females; 78% right-handed] and 22 healthy children between the ages of 8 and 12 [mean age = 10.4 (*SD* = 1.43); 13 females; 85% right-handed]. All participants reported no current, past, or family history of substance abuse, no neurological/neuropsychiatric disorders, no seizure disorder with evidence of seizure activity within the past 12 months, no significant physical impairments or limitations, no history of head trauma or loss of consciousness, and were not currently taking any antipsychotic medications. Participants reported normal or corrected-to-normal vision. One child was excluded from further analysis due to excessive sleepiness during recording, resulting in noisy data.

Participants were recruited from multiple venues, including a university-based mass email system and local community and parent groups. Participants received \$30 for taking part in the study and a certificate with a graphical image of their brain waves to take home. Adult participants gave informed consent, and minor participants provided written assent while their parents provided parental permission as approved by the University of North Carolina Institutional Review Board.

#### **EXPERIMENTAL PROCEDURE**

Visual MMN Paradigm: Continuous EEG data were recorded while participants were presented with target (15% probability) and non-target images (85% probability) displayed at fixation in front of two types of task-irrelevant background images of gratings. The target (2 × 2 cm) was a blue star presented in the center of a black and gray grating, while the non-target was a blue crosshair (2 × 2 cm) in the same location (visual angle of star and crosshair < 2 × 2°). Four different stimulus conditions were created (see **Figure 1**): high spatial frequency (HSF) background with target image placed in the center (12.5%), low spatial frequency (LSF) background with target image placed in the center (2.5%), HSF background with non-target image placed in the center (72.5%) and LSF background with non-target image placed in the center (12.5%). All images were 960 × 720 pixels and consisted of gray and black bars in a repeating pattern. LSF images consisted of four cycles of gray and black bars while HSF images consisted of 10 cycles. LSF images (15% probability) served as deviant background stimuli while HSF images (85% probability) served as standard background stimuli. Our primary events of interest were the standard non-target HSF images with a blue crosshair in the center (HFNT), and the deviant non-target LSF images with a blue crosshair in the center (LFNT). Participants were told that they would view a series of pictures and that their task was to ignore the background gratings and press a button each time an image of a star appeared at the center of the screen. Target events were omitted from analysis and were only included in the experiment in order to make sure that participants paid attention to the screen. No training blocks were provided. Stimuli were presented in a pseudorandom order (i.e., no deviant non-target stimulus was followed by another deviant non-target stimulus). Five runs of 5 min each were presented, with 160 images per run and 800 images total.
The total session (including electrode preparation, breaks, and cleanup) lasted no more than 90 min. Images were presented for 750 ms duration, with an interstimulus interval of 1000 ms (offset to onset).
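The trial structure described above can be illustrated with a short sketch. It assumes that the reported probabilities were applied within each run of 160 images (the text gives only overall percentages), and the function and label names are hypothetical. Deviant non-targets are dropped into distinct gaps between the remaining trials, which is one simple way to guarantee that no two of them are consecutive; it is not necessarily how the original sequences were built.

```python
import random
from collections import Counter

# Per-run counts implied by the reported probabilities and 160 images per run
# (assumption: probabilities were fixed within each run).
RUN_COUNTS = {
    "HSF_target": 20,      # 12.5% of 160
    "LSF_target": 4,       # 2.5% of 160
    "HSF_nontarget": 116,  # 72.5% of 160 (standard non-target)
    "LSF_nontarget": 20,   # 12.5% of 160 (deviant non-target)
}


def make_run(counts=RUN_COUNTS, constrained="LSF_nontarget", seed=None):
    """Return one pseudorandom run with no two `constrained` trials adjacent."""
    rng = random.Random(seed)
    others = [name for name, n in counts.items() if name != constrained for _ in range(n)]
    rng.shuffle(others)
    # Choose distinct gaps (before, between, or after the other trials); one
    # constrained trial per gap guarantees they are never consecutive.
    gaps = set(rng.sample(range(len(others) + 1), counts[constrained]))
    run = []
    for i in range(len(others) + 1):
        if i in gaps:
            run.append(constrained)
        if i < len(others):
            run.append(others[i])
    return run


run = make_run(seed=1)
# 750 ms image plus 1000 ms interstimulus interval per trial.
run_seconds = len(run) * (0.750 + 1.000)
```

With these timings a run of 160 images takes 280 s, consistent with the reported run length of about 5 min.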

#### **ELECTROPHYSIOLOGICAL RECORDING**

Participants were seated comfortably in a sound-attenuated, dimly lit booth and were instructed to avoid excessive movement, tension of facial muscles, horizontal eye movements, or speaking. Images were displayed on a 19-inch Dell flat panel monitor with a 60 Hz refresh rate. Participants were seated 100 cm away from the stimulus monitor adjusted to be at eye level. Stimulus presentation was controlled by CIGAL software, version 17.2 (Voyvodic, 1999). Continuous EEG data were collected using an elastic cap containing 18 electrodes, with only 13 electrodes used to collect data: at frontal (F3, Fz, F4), central (T7, C3, Cz, C4, T8), parietal (P3, Pz, P4), and occipital (O1, O2) scalp locations. The right mastoid served as the reference electrode and AFz as the ground. Bipolar recordings of the vertical and horizontal electrooculogram (EOG) were obtained by electrodes placed above and below the right eye and on the outer canthus of each eye, respectively. EEG and EOG data were sampled at a rate of 500 Hz and bandpass filtered online between 0.05 and 100 Hz, with a narrow 60 Hz notch filter used to reduce mains power frequency interference. Continuous data were analyzed off-line using NeuroScan 4.4 software (Neurosoft, Inc., Sterling, VA, USA).
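As a quick check on the stimulus geometry, the visual angle of the 2 × 2 cm fixation images at the 100 cm viewing distance can be computed with the standard formula 2·atan(size / (2·distance)). The function name below is hypothetical, and the calculation assumes the image is centered and viewed head-on.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object of a given size viewed head-on."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# 2 cm star or crosshair viewed from 100 cm.
angle = visual_angle_deg(2.0, 100.0)
```

The result is roughly 1.15° per side, in line with the reported visual angle of less than 2 × 2°.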

#### **DATA PROCESSING**

Response latencies and percentage of correct responses to target stimuli were calculated for each subject. All incorrect trials or trials containing responses less than 200 ms or greater than 1000 ms from onset of the target were excluded from further analyses. Continuous EEG data were filtered offline with a 30 Hz (24 dB/octave) zero phase shift Butterworth low-pass filter and visually inspected for movement artifacts. EEG data sets from each participant were corrected for eye-movements using regression analysis as implemented in Neuroscan Edit 4.4 (Semlitsch et al., 1986). Continuous EEG data from all channels were epoched using a 200 ms pre-stimulus baseline period and a 1000 ms post-stimulus period. Individual epochs were passed through an automatic artifact detection algorithm to remove epochs with EEG activity exceeding ±100 µV. After pre-processing, the mean number of remaining trials for the main stimulus conditions of interest was as follows: standard non-target, 534.05 (range 439–565) for adults vs. 427.48 (range 209–558) for children [*F*(1, 39) = 19.58, *p* < 0.001]; deviant non-target, 93.65 (range 83–100) for adults vs. 74.81 (range 34–97) for children [*F*(1, 39) = 21.42, *p* < 0.001]. ERPs were obtained by averaging the baseline corrected EEG epochs for each stimulus category and for each participant.
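The epoching, artifact rejection, and averaging steps can be sketched for a single channel as follows. This is a minimal illustration, not the NeuroScan implementation: the function names are hypothetical, onsets are given as sample indices, and the ±100 µV criterion is applied after baseline correction, which is an assumption because the text does not state the order of these two steps.

```python
def extract_epochs(signal_uv, onsets, sfreq=500, pre_ms=200, post_ms=1000, reject_uv=100.0):
    """Cut baseline-corrected epochs around stimulus onsets (sample indices).

    Epochs that run past the recording edges, or that contain any sample
    beyond +/- reject_uv after baseline correction, are dropped.
    """
    pre = int(round(pre_ms * sfreq / 1000))    # 100 samples at 500 Hz
    post = int(round(post_ms * sfreq / 1000))  # 500 samples at 500 Hz
    kept = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal_uv):
            continue
        epoch = signal_uv[start:stop]
        baseline = sum(epoch[:pre]) / pre
        epoch = [v - baseline for v in epoch]
        if max(abs(v) for v in epoch) > reject_uv:
            continue
        kept.append(epoch)
    return kept


def average_epochs(epochs):
    """Point-by-point mean across epochs, i.e. the ERP for one condition."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


# Synthetic example: a flat recording with one large artifact.
signal = [0.0] * 3000
signal[1500] = 250.0  # falls inside the second epoch, which is rejected
epochs = extract_epochs(signal, onsets=[200, 1400, 2400])
erp = average_epochs(epochs)
```

In this synthetic example two of the three epochs survive, and each retained epoch holds 600 samples (1200 ms at 500 Hz).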

The P1 and N1 were identified by an automatic peak detection procedure, defined as the most positive and negative peak (as appropriate) within a specified window after stimulus onset. For P1 and N1, peak windows were determined based on the relevant peak in a visual inspection of grand averages at electrode positions O1, O2, F3, Fz, F4, C3, Cz, and C4. For the occipital channels, P1 peak detection windows for children and adults were defined as 80–150 ms for standard non-target stimuli and deviant non-target stimuli. N1 windows for children and adults were defined as 150–230 ms for standard non-target and deviant non-target stimuli. Since in both groups a negativity at around 150 ms and a positivity at around 230 ms were also clearly visible at frontal electrode positions, these peaks were also assessed. For both the child and adult group, the peak detection window for the frontal negativity was defined as 120–200 ms for standard and deviant non-target stimuli. The window for the frontal positivity was defined as 200–260 ms for standard and deviant non-target stimuli.
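A window-based peak picker of the kind described here can be sketched as below. The function name and the synthetic ERP are hypothetical, and the sketch simply returns the extreme sample inside the window, which is a simplification: a stricter procedure might additionally require a local maximum or minimum rather than a value at the window edge.

```python
def find_peak(erp_uv, window_ms, polarity, sfreq=500, pre_ms=200):
    """Return (latency_ms, amplitude_uv) of the extreme sample in a window.

    polarity is "pos" for the most positive sample (e.g. P1) or "neg" for
    the most negative one (e.g. N1). Latencies are relative to stimulus onset.
    """
    step_ms = 1000 / sfreq
    times = [i * step_ms - pre_ms for i in range(len(erp_uv))]
    in_window = [(t, v) for t, v in zip(times, erp_uv) if window_ms[0] <= t <= window_ms[1]]
    pick = max if polarity == "pos" else min
    return pick(in_window, key=lambda tv: tv[1])


# Occipital windows reported in the text (identical for both groups).
OCCIPITAL_WINDOWS = {"P1": ((80, 150), "pos"), "N1": ((150, 230), "neg")}

# Synthetic ERP: 200 ms baseline plus 1000 ms post-stimulus at 500 Hz.
erp = [0.0] * 600
erp[155] = 4.0    # sample 155 corresponds to 110 ms
erp[190] = -3.0   # sample 190 corresponds to 180 ms
p1 = find_peak(erp, *OCCIPITAL_WINDOWS["P1"])
n1 = find_peak(erp, *OCCIPITAL_WINDOWS["N1"])
```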

The vMMN was computed by subtracting the ERP to the standard non-target stimulus (HFNT) from the ERP to the deviant non-target stimulus (LFNT). Visual inspection of electrode positions O1 and O2 for both group-averaged difference waves and individual subject data indicated that adult subjects displayed a single negative peak around 150 ms post-stimulus, whereas children displayed two negative peaks. In children, the first negativity occurred at around 150 ms and the second one at around 230 ms. Therefore, in the adult group we detected the vMMN as the most negative peak within a 130–200 ms post-stimulus window, while in the child group we detected the first peak as the most negative peak within 130–200 ms post-stimulus, and the second peak as the most negative peak within 200–275 ms post-stimulus. Since a clear positive peak at around 250 ms was also visible in the children's difference wave at frontal (and central) electrode positions, we also assessed this positive peak within 200–275 ms post-stimulus.
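The subtraction and the group-specific detection windows can be sketched as follows, again for a single channel. Function names and the synthetic waveforms are hypothetical; the window limits are those given in the text, and the 500 Hz sampling rate and 200 ms baseline are taken from the recording and epoching descriptions.

```python
import math


def difference_wave(deviant_erp, standard_erp):
    """Deviant-minus-standard difference wave, sample by sample."""
    if len(deviant_erp) != len(standard_erp):
        raise ValueError("ERPs must have the same number of samples")
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


def most_negative(wave_uv, window_ms, sfreq=500, pre_ms=200):
    """(latency_ms, amplitude_uv) of the most negative sample in window_ms."""
    step_ms = 1000 / sfreq
    first = math.ceil((window_ms[0] + pre_ms) / step_ms)
    last = math.floor((window_ms[1] + pre_ms) / step_ms)
    idx = min(range(first, last + 1), key=lambda i: wave_uv[i])
    return idx * step_ms - pre_ms, wave_uv[idx]


# Difference-wave windows from the text.
VMMN_WINDOWS = {"adult": [(130, 200)], "child": [(130, 200), (200, 275)]}

# Synthetic child-like example with two negative deflections.
standard = [0.0] * 600
deviant = [0.0] * 600
deviant[175] = -2.0   # 150 ms post-stimulus
deviant[215] = -1.5   # 230 ms post-stimulus
diff = difference_wave(deviant, standard)
child_peaks = [most_negative(diff, w) for w in VMMN_WINDOWS["child"]]
```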

#### **STATISTICAL ANALYSES**

All statistical analyses were performed using SPSS 19 (IBM Corp., Armonk, NY, USA). For behavioral analyses, independent samples *t*-tests were performed. For between-groups comparisons of ERP peaks, repeated measures mixed model ANOVAs were used, with between subject factor Group (child vs. adult) and within-subject factors Stimulus (standard vs. deviant non-target), and Electrode position (O1 vs. O2; or F3 vs. F4). If Stimulus effects or interactions with Group were significant, follow-up repeated measures ANOVAs were fit for each group separately.
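For readers who want to reproduce the behavioral comparison outside SPSS, the independent samples *t* statistic can be computed as sketched below. This is the equal-variance (Student) form; whether the reported values came from the equal-variance or the unequal-variance output in SPSS is not stated, so that choice is an assumption, and the function name and example values are hypothetical.

```python
import math
from statistics import mean, variance


def independent_t(a, b):
    """Student's t and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Made-up values, for illustration only.
t, df = independent_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

With 20 adults and 21 children the degrees of freedom are 39, the same error term that appears in the between-group *F* tests.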

# **RESULTS**

#### **BEHAVIORAL DATA**

Response accuracy (percentage of correct responses) and reaction times for target conditions are indicated in **Table 1**. There was a significant difference in the response accuracy between the children and the adults for the deviant background target condition (*t* = 2.38, *p* = 0.022), whereas response accuracy for the standard background target condition did not differ (*p* > 0.08). All individuals across both groups performed the task with at least 95% accuracy. There were no significant differences in mean reaction time between the children and the adults (*p* > 0.1 for both conditions). These results show that both children and adults performed the task with high accuracy and were focusing their attention onto the center of the monitor.

**Table 1 | Behavioral data for target stimuli in adult (***N* **= 20) and child (***N* **= 21) groups.**


*Percentage of correct responses (accuracy) and reaction times in ms [as well as standard deviations (SD)] are indicated for both standard background (Std) and deviant background (Dev) target stimulus conditions.*

#### **ELECTROPHYSIOLOGICAL DATA**

ERPs to standard and deviant non-target background stimuli as well as the difference wave (deviant-standard) for adults are shown in **Figure 2** (for occipital electrode positions) and **Figure 3** (for frontal, central, and parietal electrode positions). For children this is shown in **Figures 4**, **5**. ERPs to standard and deviant non-target background stimuli overlaid for both adults and children are shown in **Figure 6** (for occipital electrode positions) and **Figure 7** (for frontal, central, and parietal electrode positions). Mean amplitudes and standard deviations are listed in the Appendix (**Table A1**).

#### *P1: amplitude*

The repeated measures mixed ANOVA for P1 amplitude at the occipital electrode positions demonstrated a main effect of Group [*F*(1, 39) = 65.49, *p* < 0.001] in the absence of an interaction effect of Stimulus × Group [*F*(1, 39) = 0.37, *p* = 0.684] or an effect of Stimulus [*F*(1, 39) = 0.267, *p* = 0.608], indicating that the P1 amplitude to both standard and deviant non-target stimuli was larger in children than in adults. Furthermore, a significant Electrode position × Group interaction [*F*(1, 39) = 6.34, *p* = 0.016] was observed, indicating that for the children only, the P1 amplitude to standard and deviant non-target stimuli was larger at electrode position O2 than at electrode position O1. No other effects for P1 amplitude were observed.

#### *P1: latency*

The repeated measures mixed ANOVA for P1 peak latency at occipital electrode positions demonstrated a significant effect of Group [*F*(1, 39) = 16.04, *p* < 0.001] and a significant main effect of Stimulus [*F*(1, 39) = 29.88, *p* < 0.001] in the absence of a Stimulus × Group interaction [*F*(1, 39) = 0.21, *p* = 0.648]. No other effects for P1 peak latency were observed. Since a significant main effect for Stimulus was observed, we performed follow-up exploratory within group analyses to test whether the P1 peak latency effect of Stimulus held up for both groups separately.

A significant effect for Stimulus, in the absence of any other effect, was observed for both the adult [*F*(1, 19) = 10.24, *p* = 0.005] and child [*F*(1, 20) = 35.81, *p* < 0.001] group, indicating that for both groups the P1 to deviant stimuli peaked earlier than the P1 to the standard stimuli.

**FIGURE 6 | ERPs for both deviant non-target and standard non-target stimulus conditions in children (***N* **= 21) and adults (***N* **= 20) at electrode positions O1 and O2.**

**positions F3, Fz, F4, C3, Cz, C4, P3, Pz, and P4.**

#### *Frontal negativity: amplitude*

Statistical tests for the negativity occurring at around 150 ms (and for the positivity occurring at around 230 ms) at the frontocentral electrode positions were performed taking only frontal electrode positions into account, since responses were generally largest at those electrode positions. Furthermore, to limit the number of tests and to make comparison to the occipital electrode tests easier to interpret, only electrode positions F3 and F4 were included into the factor "Electrode position."

The repeated measures mixed ANOVA for the first negativity at around 150 ms at the frontal electrode positions demonstrated a main effect of Group [*F*(1, 39) = 61.19, *p* < 0.001] in the absence of an interaction effect of Stimulus × Group [*F*(1, 39) = 0.006, *p* = 0.937] or an effect of Stimulus [*F*(1, 39) = 1.18, *p* = 0.284]. No other effects were observed. This pattern of results indicates that the amplitude of this peak to both standard and deviant non-target stimuli was larger in children than in adults.

#### *Frontal negativity: latency*

The repeated measures mixed ANOVA for peak latency of the negativity at around 150 ms at the frontal electrode positions demonstrated no main effect of Group [*F*(1, 39) = 0.19, *p* = 0.684], but did show a main effect of Stimulus [*F*(1, 39) = 7.68, *p* = 0.009] and a main effect of Electrode position [*F*(1, 39) = 9.76, *p* = 0.003]. No other effects were observed. Since a significant main effect of Stimulus was observed, we performed follow-up exploratory within group analyses to test whether the effect of Stimulus held up for both groups separately.

In the adult group, a significant effect of Stimulus was observed [*F*(1, 19) = 5.27, *p* = 0.033] in the absence of any other effects, indicating that this negativity peaked earlier in the deviant stimulus condition than in the standard stimulus condition.

In the child group, no significant effect of Stimulus was observed [*F*(1, 20) = 3.02, *p* = 0.098], but a significant effect of Electrode position [*F*(1, 20) = 8.98, *p* = 0.007] was observed. These results indicate that the latency of the negativity peak did not differ enough between standard and deviant non-target stimuli to reach significance, whereas it did peak earlier at electrode channel F3 than at electrode channel F4.

#### *N1: amplitude*

The repeated measures mixed ANOVA for N1 amplitude at the occipital electrode positions did not show a significant effect of Group [*F*(1, 39) = 0.007, *p* = 0.94]. However, a significant main effect of Stimulus [*F*(1, 39) = 9.59, *p* = 0.004] and a trend for a Stimulus × Group interaction [*F*(1, 39) = 3.22, *p* = 0.08] were observed. No other effects for N1 amplitude were observed. Since a significant main effect for Stimulus and a trend for a Stimulus × Group interaction were observed, we performed follow-up exploratory within group analyses to test whether the N1 effect of Stimulus held up for both groups separately.

In the adult group, significant effects of Stimulus [*F*(1, 19) = 27.88, *p* < 0.001] and Electrode position [*F*(1, 19) = 8.36, *p* = 0.009] were observed, indicating that the N1 amplitude to deviant non-target stimuli was larger than the amplitude to standard non-target stimuli and that the amplitude at electrode position O2 was larger than the amplitude at electrode position O1.

In the child group, no significant effects were observed, indicating that N1 amplitudes did not differ enough between standard and deviant non-target stimuli and between occipital electrode positions to reach significance.

#### *N1: latency*

The repeated measures mixed ANOVA for N1 latency at the occipital electrode positions demonstrated a main effect of Group [*F*(1, 39) = 61.80, *p* < 0.001] in the absence of any other effect, indicating that the N1 to both standard and deviant non-target stimuli peaked later in the children than in the adults.

#### *Frontal positivity: amplitude*

The repeated measures mixed ANOVA for the positivity at around 230 ms at the frontal electrode positions demonstrated a main effect of Group [*F*(1, 39) = 4.12, *p* = 0.049] and a significant effect of Stimulus [*F*(1, 39) = 18.77, *p* < 0.001]. No other effects were observed. This pattern of results indicates that the peak amplitude at around 230 ms to both standard and deviant non-target stimuli was larger in children than in adults. Since a significant main effect of Stimulus was observed, we performed follow-up exploratory within group analyses to test whether the effect of Stimulus held up for both groups separately.

In the adult group, a significant effect of Stimulus [*F*(1, 19) = 6.46, *p* = 0.020] was observed, in the absence of any other effects, indicating that the amplitude of the positivity at around 230 ms to deviant non-target stimuli was larger than the amplitude to standard non-target stimuli.

In the child group, a significant effect of Stimulus [*F*(1, 20) = 12.72, *p* = 0.002] was observed, in the absence of any other effects, indicating that the amplitude of the positivity at around 230 ms to deviant non-target stimuli was larger than the amplitude to standard non-target stimuli.

#### *Frontal positivity: latency*

The repeated measures mixed ANOVA for latency of the positivity at around 230 ms at the frontal electrode positions demonstrated a main effect of Group [*F*(1, 39) = 6.25, *p* = 0.017] in the absence of any other effect, indicating that the positivity at around 230 ms to both standard and deviant non-target stimuli peaked later in the children than in the adults.

#### *Difference waves*

Difference waves (deviant non-target stimuli – standard non-target stimuli) for adults and children are shown in **Figure 8** (for occipital electrode positions) and **Figure 9** (for frontal, central, and parietal electrode positions).

We first compared the single occipital negativity occurring in the difference wave of the adult group, the two occipital negativities occurring in the difference wave of the child group, and the frontal positivities occurring in the difference waves of both groups against the average amplitude of the baseline period (−200 to 0 ms) to find out whether these peaks significantly differed from "0." To this end, a repeated measures ANOVA with factors Stimulus (difference wave response vs. average baseline response), and Electrode position (O1 vs. O2; or F3 vs. F4) was used.
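Because the Stimulus factor in this comparison has only two levels, its main effect in a fully within-subject ANOVA is equivalent to a paired *t*-test on each participant's electrode-averaged values, with *F* = *t*². The sketch below illustrates that equivalence; it does not reproduce the SPSS procedure, covers only the Stimulus main effect (not the Electrode position effect or the interaction), and uses a hypothetical function name and made-up amplitudes.

```python
import math
from statistics import mean, stdev


def stimulus_main_effect(peak_by_electrode, baseline_by_electrode):
    """F and degrees of freedom for peak vs. baseline, collapsed over electrodes.

    Each argument holds one tuple per participant, e.g. (O1, O2) amplitudes.
    """
    diffs = [mean(p) - mean(b) for p, b in zip(peak_by_electrode, baseline_by_electrode)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))  # paired t statistic
    return t ** 2, (1, n - 1)


# Made-up amplitudes (µV) for five participants, for illustration only.
peaks = [(-1.0, -1.0), (-2.0, -2.0), (-3.0, -3.0), (-4.0, -4.0), (-5.0, -5.0)]
baseline = [(0.0, 0.0)] * 5
F, df = stimulus_main_effect(peaks, baseline)
```

With 20 adults or 21 children the degrees of freedom are (1, 19) and (1, 20), matching the within-group tests reported here.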

The repeated measures within (adult) group ANOVA for the amplitude of the negativity occurring at around 150 ms against the average baseline activity at occipital electrode positions demonstrated a main effect of Stimulus [*F*(1, 19) = 41.87, *p* < 0.001] in the absence of any other effects.

The repeated measures within (child) group ANOVA for the amplitude of the first negativity against the average baseline activity at occipital electrode positions demonstrated a main effect of Stimulus [*F*(1, 20) = 13.93, *p* = 0.001] in the absence of any other effects. This was also the case for the second negativity [main effect of Stimulus: *F*(1, 20) = 20.08, *p* < 0.001].

**(***N* **= 20) at electrode positions O1 and O2.**

The repeated measures within (adult) group ANOVA for the amplitude of the positivity at around 250 ms against the average baseline activity at frontal electrode positions demonstrated a main effect of Stimulus [*F*(1, 19) = 40.85, *p* ≤ 0.001] in the absence of any other effects.

The repeated measures within (child) group ANOVA for the amplitude of the positivity at around 250 ms against the average baseline activity at frontal electrode positions demonstrated a main effect of Stimulus [*F*(1, 20) = 30.55, *p* ≤ 0.001] in the absence of any other effects.

These results indicate that, for both groups, responses apparent in the difference wave over occipital electrode positions as well as over frontal electrode positions significantly differed from the average baseline amplitude.

Finally, we directly compared the difference wave responses that occurred at around the same point in time in both groups. To this end, a repeated measures mixed ANOVA with between subject factor Group (child vs. adult) and within subject factor Electrode (O1 vs. O2 or F3 vs. F4) was used.

Although the (first) negative difference wave<sup>1</sup> occurring at the occipital electrode positions appeared to be larger in the adult group than in the child group, the repeated measures mixed ANOVA indicated that the amplitude difference was not statistically significant [*F*(1, 39) = 5.24, *p* = 0.043] between groups. No other effects for amplitude or latency were observed.

The repeated measures mixed ANOVA for the amplitude of the positivity occurring in the difference wave<sup>1</sup> at around 250 ms at the frontal electrode positions demonstrated a main effect of Group [*F*(1, 39) = 4.98, *p* = 0.031], in the absence of any other effects, indicating that the amplitude of this positivity at around 250 ms was larger in the child group than in the adult group. No other effects for amplitude or latency were observed.

# **DISCUSSION**

This study investigated the vMMR in healthy children as compared to healthy adults using a simple visual target detection task, during which task-irrelevant gratings of high and low spatial frequencies were presented in the background. We found a robust vMMN in the difference wave (deviant non-target stimuli – standard non-target stimuli) occurring around 150 ms post-stimulus over occipital electrode positions in the adult group, and two occipital negativities in the children, the first one occurring at around 150 ms and a second one at around 230 ms. We also observed a positivity at frontal and central electrode positions at around 250 ms in both groups. This study confirms previous research investigating vMMN in healthy adults and is one of the first to investigate this difference wave in children aged 8–12 years. The results indicate that both children and adults respond to the occurrence of rare task-irrelevant visually deviant stimuli, although this response is still developing in healthy children aged eight to twelve and may be quite different in this age group in terms of morphology (amplitude, latency) and topography (occipital negativities, fronto-central positivity) compared to typically-developing adults.

Our results differ from previous work by Clery et al. (2012), in which changes in form and motion resulted in three sequential negative responses and one positive response in 8–14 year old children while only one negative response was observed in adults. We observed two negativities and one positivity in the difference waves in our study. Clery et al. argue that multiple peaks may be due to a sequential visual processing of deviancy necessary in the developing brain but not in the mature brain. Our results generally support this hypothesis; however, the inconsistent findings concerning the number of negativities may indicate that these peaks are more dependent on individual differences, or are undergoing developmental changes in this age range. The differences between our results and those of Clery et al. (2012) could also be due to the different nature of the stimuli used and the properties each investigates: Clery et al. point out that it is difficult to determine whether their results were driven by changes in form, motion, or both. Perhaps less dynamic stimuli such as the ones used in our study impose reduced processing demands, insufficient to activate the third waveform observed by Clery et al. It would be interesting to determine if multiple peaks can be elicited with static stimuli of increasing complexity, or if this is due to the dynamism of a stimulus alone.

A limitation of this study is that stimulus effects from the use of low frequency gratings as deviant stimuli may account for the vMMN seen here. Spatial frequency deviance has been previously studied by Heslenfeld (2003), where differences in ERPs were indeed observed based on different spatial frequencies. Some behavioral differences were also observed: e.g., task-irrelevant stimuli of low spatial frequencies were more likely to interfere with performance than HSF stimuli, but only in difficult tasks. However, our task was not demanding and all subjects performed it easily and accurately, including the youngest children. In the previous study by Heslenfeld (2003), ERP effects were observed in different components of the ERP and at different electrode sites than are studied here, such as a larger early C1 component (60–100 ms) for HSF vs. LSF gratings, as well as larger responses at frontal and central scalp sites at 120–180 ms for LSF vs. HSF stimuli. Heslenfeld concluded that this deviance was due to stimulus effects and was congruent with previous literature, which found higher response-interference and attention-capturing properties of low spatial frequencies. However, the effects at occipital sites (120–200 ms) were independent of task load or spatial frequency, showing that this response was not related to individual stimulus properties or refractoriness. Hence, this negativity is likely the true visual analog of the auditory MMN because it is not related to stimulus features or task difficulty. Our results in the adult group show a negativity at comparable electrode locations and latency. Similar effects have been observed in other studies using the equiprobable paradigm (Czigler et al., 2006; Kimura et al., 2009), where two negativities were found but only one was attributed to stimulus-independent visual deviance.
We believe that the mismatch effects observed in the current study are not solely related to refractoriness or spatial frequency effects although our study design did not allow for excluding this possibility. In the child group, two occipital negativities were observed. The second occipital peak co-occurs with the frontal positivity observed at around 250 ms. This may suggest recruitment of higher-order cognitive processes with a more frontally located brain source. However, more research is needed to confirm this hypothesis. We should also point out the fact that we examined the process of automatic visual deviance detection while participants were engaged in a visual target detection task. Hence all task stimuli were presented in the same modality. However, in a typical auditory MMN paradigm the participants' attention is usually directed toward another (e.g., visual) modality. Participants are asked to read a book or watch a movie for instance. Keeping attention focused within the same modality as opposed to dividing attention between the auditory and the visual domain may differentially impact the vMMR. Future studies could examine the possible effect of this on the vMMN.

<sup>1</sup>Note that the cognitive process underlying this difference wave response may not be identical in adults and children.

There were also differences in other ERP components between adults and children: as seen previously in the literature, the amplitude of early components, particularly the P1, was larger in children, and both the P1 and N1 peaked later in children. Batty and Taylor (2002) also noted this effect in a simple visual categorization task, finding that the amplitude of P1 seemed to decrease with age throughout adolescence. In our study, the amplitude of the P1 was also larger and the peak broader, resulting in a much later N1 in children vs. adults. It could be that underlying neural mechanisms are underdeveloped in children and/or that they may employ fewer response strategies when performing this particular task (i.e., concerns about speed, accuracy, and impulsivity management, and attention devoted to the task's purpose). Behavioral reports on subjects' experience of the task following the ERP experiment might help to answer this question.

This study adds to the limited pool of studies investigating vMMR in children. Due to the preliminary nature of this study, and aware of the developing cognitive system and accompanying changes in ERPs that tend to occur across the lifespan, we chose a limited age range to determine initial differences between children and adults. However, future research should examine other and even narrower age ranges in order to better map the development of vMMR. Our stimuli also probed only one aspect of automatic visual deviancy detection (spatial frequency), and future work should investigate other stimulus properties such as color, luminance, and size, to further understand development of the visual deviance response.

Considerations for future studies should also include investigating abnormal development of vMMR. Individuals with schizophrenia have been found to exhibit reduced amplitudes of vMMN when compared to healthy controls (Urban et al., 2008). Furthermore, reduced vMMN amplitude was found to be associated with lower levels of functioning in schizophrenia, as well as with higher levels of medication dosage. In another study, Qiu et al. (2011) found decreased vMMN amplitudes in individuals with major depressive disorder, although this difference did not correlate with depression severity.

Although the above research has demonstrated the usefulness of vMMN as a potential clinical tool, few studies have investigated altered vMMR in disorders affecting children. To our knowledge there have only been two other studies of vMMR in children with neurodevelopmental disorders (Horimoto et al., 2002; Clery et al., 2013). Visual MMR could be useful to probe visual information processing deficits in children with neurodevelopmental disabilities, and future work should investigate what differences in vMMR, if any, might occur in atypical neurodevelopment.

#### **ACKNOWLEDGMENTS**

We thank the adult participants, children, and their families who took part in this study as well as Erin King for assistance with data analysis, Alana Campbell for assistance in data interpretation, and Lora Maroney for editorial help with preparing the manuscript. We thank the two reviewers for their helpful comments on earlier versions of this manuscript.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 February 2013; accepted: 17 December 2013; published online: 30 December 2013.*

*Citation: Cleary KM, Donkers FCL, Evans AM and Belger A (2013) Investigating developmental changes in sensory processing: visual mismatch response in healthy children. Front. Hum. Neurosci. 7:922. doi: 10.3389/fnhum.2013.00922*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Cleary, Donkers, Evans and Belger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **APPENDIX**

#### **Table A1 | Peak amplitude values and latencies for ERPs and difference waves for electrode positions O1, O2, F3, and F4.**


*ERP peak amplitude values and latencies as well as their respective standard deviations (in parentheses) for standard non-target (Std) and deviant non-target (Dev) background stimulus conditions in adult (N = 20) and child (N = 21) groups. Neg, Negativity; Pos, Positivity; Diffwave, Difference wave; MMN, Mismatch Negativity; <sup>+</sup>, significantly different from child group with p < 0.05; \*, significantly differs from average baseline activity with p < 0.05 (within group); n/a, not applicable.*

# Visual mismatch negativity: a predictive coding view

# *Gábor Stefanics <sup>1,2</sup>\*, Jan Kremláček <sup>3</sup> and István Czigler <sup>4</sup>*

*<sup>1</sup> Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich, ETH Zurich, Zurich, Switzerland*

*<sup>2</sup> Laboratory for Social and Neural Systems Research, Department of Economics, University of Zurich, Zurich, Switzerland*

*<sup>3</sup> Department of Pathological Physiology, Faculty of Medicine in Hradec Králové, Charles University in Prague, Hradec Králové, Czech Republic*

*<sup>4</sup> Research Center for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary*

#### *Edited by:*

*Lynne E. Bernstein, George Washington University, USA*

#### *Reviewed by:*

*Erich Schröger, University of Leipzig, Germany Floris P. De Lange, Radboud University Nijmegen, Netherlands*

#### *\*Correspondence:*

*Gábor Stefanics, Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich, ETH Zurich, Wilfriedstrasse 6, Zürich, CH-8032, Switzerland e-mail: stefanics@biomed.ee.ethz.ch*

An increasing number of studies investigate the visual mismatch negativity (vMMN) or use the vMMN as a tool to probe various aspects of human cognition. This paper reviews the theoretical underpinnings of vMMN in the light of methodological considerations and provides recommendations for measuring and interpreting the vMMN. The following key issues are discussed from the experimentalist's point of view in a predictive coding framework: (1) experimental protocols and procedures to control "refractoriness" effects; (2) methods to control attention; (3) vMMN and veridical perception.

**Keywords: EEG, ERP, perceptual learning, predictive coding, prediction error, repetition suppression, stimulus specific adaptation, visual mismatch negativity**

# **INTRODUCTION—WHAT IS VISUAL MMN AND WHAT IS IT GOOD FOR?**

Current theories of visual change detection emphasize the importance of focal attention to detect changes in the visual environment (Rensink, 2002; Simons and Rensink, 2005). However, an increasing body of studies shows that the human brain is capable of detecting even small visual changes, especially if such changes violate automatic (non-conscious) expectations (based on repeating experiences). In other words, our brain automatically represents statistical regularities of the environment and registers "surprising" events. Since the discovery of the mismatch negativity ERP component, the majority of research in the field has focused on auditory deviance detection, operating outside the focus of active attention. Historically, change detection indexed by the MMN was thought to be primarily an auditory phenomenon (Näätänen et al., 2001), hearing being a "temporal" sensory modality. However, substantial evidence has accumulated suggesting that automatic mechanisms of change detection operate in the visual modality too.

The system generating the auditory MMN has been referred to as a "primitive system of intelligence" by the discoverer of the MMN response (Näätänen et al., 2001). This system organizes the auditory input by extracting the common invariant patterns shared by a number of acoustically varying sounds, anticipates the events of the immediate future in the absence of attention, and even manifests simple concept formation. In a general framework of human cognition, Kahneman (2011) postulated two general systems underlying information processing. System 1 is automatic and fast, and works without effort or voluntary control, whereas System 2 uses attention to carry out effortful mental activities<sup>1</sup>. He describes System 1 as "effortlessly originating impressions and feelings that are the main sources of the explicit beliefs and deliberate choices of System 2" and identifies automatic change detection ("Orient to the source of a sudden sound") as an automatic activity of System 1 which is capable of generating complex patterns of ideas by extracting regularities from the environment. In Kahneman's framework, the main function of System 1 is to maintain and update our model of the world, which represents what is normal in it, i.e., what is predictable based on past events. The visual MMN can be described as the electrophysiological correlate of the automatic detection of *unpredicted* changes in our visual environment carried out by System 1.

In MMN paradigms short term predictive representations of environmental regularities are thought to be formed based on the observed likelihood of frequently repeating events (standard). Implicitly learned statistical regularities serve as a basis to automatically detect rare events (deviant) which do not match predictions. Recent modeling studies (Lieder et al., 2013a,b) suggest that the (auditory) MMN reflects approximate Bayesian learning of sensory regularities, and that the MMN-generating process adjusts a probabilistic model of the environment according to mismatch responses (MMRs) (prediction errors). The *MMN response is widely considered as a perceptual prediction error signal* (Friston, 2005; Garrido et al., 2008, 2009; den Ouden et al., 2012; Stefanics and Czigler, 2012)—a member of a family of prediction errors, which include perceptual, higher cognitive, and motivational prediction errors.
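The link between learned event probabilities and the size of a mismatch signal can be illustrated with a toy observer. The following Python sketch is not the model of Lieder et al. (2013a,b); it assumes a much simpler Beta-Bernoulli learner with a uniform prior, and the function name `surprise_trace`, its parameters, and the example sequence are hypothetical choices made for illustration only.

```python
import math

def surprise_trace(sequence, prior_a=1.0, prior_b=1.0):
    """Shannon surprise (in bits) of each event under a Beta-Bernoulli observer.

    sequence: iterable of 0 (standard) and 1 (deviant).
    prior_a, prior_b: pseudo-counts for deviants and standards
    (a uniform prior is assumed here).
    """
    a, b = prior_a, prior_b
    trace = []
    for event in sequence:
        p_deviant = a / (a + b)            # predicted probability of a deviant
        p_event = p_deviant if event == 1 else 1.0 - p_deviant
        trace.append(-math.log2(p_event))  # surprise = -log2 p(event)
        # The counts are updated only after the prediction has been scored.
        if event == 1:
            a += 1.0
        else:
            b += 1.0
    return trace

# Nine standards followed by a single deviant (illustrative sequence).
sequence = [0] * 9 + [1]
trace = surprise_trace(sequence)
print([round(s, 2) for s in trace])
```

In this sketch the surprise assigned to each repeated standard shrinks as the regularity is learned, while the final deviant receives the largest value. This is only a qualitative analogy to repetition effects and mismatch responses, not a fit to ERP data.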

<sup>1</sup>Note that Kahneman is only using the distinction of System 1 and 2 as a metaphor of two agents to illuminate different aspects of human cognition.

The notion that automatic change detection in the visual modality does not operate only at the level of simple sensory features such as color (Czigler et al., 2002, 2004, 2006a; Horimoto et al., 2002; Mazza et al., 2005; Kimura et al., 2006b; Liu and Shi, 2008; Grimm et al., 2009; Thierry et al., 2009; Czigler and Sulykos, 2010; Müller et al., 2010; Mo et al., 2011; Stefanics et al., 2011), line orientation (Astikainen et al., 2004, 2008; Czigler and Pató, 2009; Flynn et al., 2009; Kimura et al., 2009, 2010a, 2006b; Czigler and Sulykos, 2010; Sulykos and Czigler, 2011), or spatial frequency (Heslenfeld, 2003; Kenemans et al., 2003, 2010; Maekawa et al., 2005, 2009; Sulykos and Czigler, 2011), but also at higher cognitive levels, has been supported by several visual MMN studies. Recent studies demonstrated that object-based irregularities are automatically detected by the visual system (Müller et al., 2013), as well as irregular lexical information (Shtyrov et al., 2013). Another recent study showed that visual mismatch negativity (vMMN) can be elicited both by real and illusory brightness changes (Sulykos and Czigler, 2014). vMMN was also evoked by changes in abstract attributes (if..then conditional probability) of simple geometric patterns (Stefanics et al., 2011), but also by changes in attributes of complex natural stimuli such as laterality of body parts (Stefanics and Czigler, 2012), or socially more relevant stimuli such as facial emotions (Susac et al., 2004, 2010; Zhao and Li, 2006; Astikainen and Hietanen, 2009; Chang et al., 2010; Kimura et al., 2012; Stefanics et al., 2012; Fujimura and Okanoya, 2013), and facial gender (Kecskés-Kovács et al., 2013b). 
These observations are well in line with theories of generative models (for reviews, see Kimura et al., 2011; Winkler and Czigler, 2012; Clark, 2013) which posit that unpredicted stimulus attributes evoke mismatch signals (prediction errors) which in turn modify predictions pertaining to the given attributes.

Although studying both visual and auditory mismatch processes rests on the common principle that extraction of statistical regularities in characteristics of many environmental events can be probed indirectly by recording the MMN response to events which violate such regularities, there are also important methodological differences between visual and auditory mismatch paradigms. For example, to minimize attentional components in ERPs evoked by events in auditory MMN experiments, often a separate visual task is used to engage the attention of participants, thus MMN-evoking stimuli are task-independent and assumed to be unattended. Due to the relative dominance of vision over hearing, primary visual tasks are useful in auditory studies. However, visual MMN studies should also use visual tasks instead of auditory tasks to effectively minimize attentional effects in processing of MMN-evoking stimuli. Here we provide a brief summary of some of the important methodological approaches and their rationale which we believe should be taken into account when one designs a visual MMN protocol and interprets its results.

MMN is often elicited by rare events embedded in a series of frequently repeating events. It is important to emphasize that labeling an event as "surprising," "unexpected," or "improbable" can be based on probabilities learned over shorter or longer time scales. Regularities (i.e., probability structure of events) established in MMN/vMMN paradigms exist over relatively short time scales, in the range of 4–15 s in the auditory modality (Mäntysalo and Näätänen, 1987; Cowan et al., 1993; Ulanovsky et al., 2004), and probably less in vision (Astikainen et al., 2008) and have been suggested to be supported by short-term synaptic plasticity (Garrido et al., 2009; Kujala and Näätänen, 2010). The possibility of multiple short-term mechanisms has led to a rather long but not particularly productive debate on the processes underlying the MMN, usually labeled as the "refractoriness" issue. The contribution of the repetition effect to the differential activity evoked by the rare stimulus, i.e., the "refractoriness" issue, will be discussed in Section Memory mismatch and refractoriness.

MMN is usually observed when a "surprising," "unexpected," "unpredicted," or "infrequent" event occurs. It is important to point out that in the context of (v)MMN research, none of these terms refers to processes requiring attention. Registration of the change in likelihood of task-irrelevant environmental events happens in the absence of attention or without conscious effort (Näätänen et al., 2001, 2007, 2010). One prerequisite for such a "surprise" is that the neural populations which generate the MMN have extracted a statistical regularity from the sequence of environmental events, so that they have become able to detect events which deviate from the regular. Surprise can thus only occur if some kind of a prediction has been formed a priori. Although most MMN experiments employ sequential regularities, recent evidence indicates that the human perceptual system implicitly encodes non-sequential stochastic regularities too and keeps track of the *uncertainty* induced by apparently random distributions of sensory events (Garrido et al., 2013). vMMN paradigms usually employ attention-demanding primary tasks to ensure that activity of conscious attentional mechanisms is not superimposed on mismatch activity. A variety of primary tasks have been used in different studies, which will be discussed in Section Visual MMN and attention.

At the outset of vMMN research, studies focused on individual features (color, spatial frequency, orientation, movement direction, etc.); later, vMMN was investigated for feature conjunctions, object-related deviances and the violation of sequential regularities. Furthermore, an increasing number of studies show that vMMN is also sensitive to higher-order deviances and correlates with behavioral measures. Importantly, the features defining the contents of automatic expectations can be not only simple physical, but more abstract properties too, even socially relevant signals such as facial emotions. Thus, mechanisms underlying the vMMN are able to support flexible categorization processes (Czigler, 2013). The relationship between visual mismatch and behavior is discussed in Section The link between vMMN, veridical perception, and behavior.

According to the hierarchical predictive coding framework veridical perception is supported by neural processes optimizing probabilistic representations of the causes of sensory inputs (Friston, 2010). The continuous interaction between top-down flow of predictions and bottom-up flow of prediction errors keeps our internal model of reality up-to-date. Here we argue that the visual MMN response is a "special case" of the ubiquitous prediction error signals that support our internal model of reality, where the incoming input is highly improbable (deviant) based on the probability of the frequent events (standard). That is, the function of the "vMMN-generating system" is to update our predictive model of the world by means of prediction errors and infer the likely causes of the sensory inputs. *We interpret the vMMN as a prediction error signal to visual input that does not match probabilistic representations of the predicted (external causes of) input.* Unpredicted events carry a lot of information and can be important to survival. Thus, a further role that has been attributed to the mismatch signal is a trigger function for attention allocation (Nyman et al., 1990; Deouell, 2007). Attention is thought to increase precision of sensory signals (e.g., Feldman and Friston, 2010; Kok et al., 2012; Adams et al., 2013) and can deploy decision making and executive mechanisms.

The logic of the MMN studies rests on the usually hidden and rarely-studied process during which repetition of an event leads to the formation of a prediction pertaining to the probability of a given "feature" or "event" to occur. Such predictions in the MMN research are often referred to as "regularities" extracted from the stimulus stream (Winkler, 2007) and their presence is usually demonstrated indirectly by showing that stimuli that deviate from the frequent stimuli evoke a differential (mismatch) response. Most studies emphasize only one obviously beneficial aspect of automatic mismatch processes, namely the automatic registration of unpredicted changes in the environment, which has been suggested to trigger an attention orienting response (e.g., Kimura et al., 2008b). However, the other side of this coin is perhaps just as important, namely the extraction and representation of the regular features, i.e., the formation of predictions (for a similar notion in auditory stream segregation see Schröger et al., 2014).

The extraction of the "common denominator" across repeating events leads to the representation of their invariant feature, which is the regularity itself. From this point of view the automatic build-up of a prediction corresponds to the process of *implicit category formation*, in a sense that a common feature which characterizes successive events has become active as an *ad-hoc* automatic "perceptual filter." Thus, visual MMN seems to be suitable for studying whether a given visual "feature" is represented as an implicit category which serves as a basis for automatic discrimination processes and enables detection of remarkable/significant changes based on statistical characteristics of the environment. In summary, the vMMN is a universal tool which can be used to study automatic sensory discrimination and implicit (category) learning, i.e., a wide range of cognitive functions relying on visual information.

#### **MEMORY MISMATCH AND REFRACTORINESS**

Repetition of events leads to a response attenuation, a phenomenon often referred to as repetition suppression, stimulus-specific adaptation (SSA), habituation, refractoriness, or neural fatigue (Grill-Spector et al., 2006). Traditionally, amplitude decrease of ERP components over repetitions has been attributed to the decreased responsiveness of neurons for repeated input (Näätänen and Picton, 1987; May and Tiitinen, 2010). According to the "refractoriness" or "fatigue" model, in oddball sequences, neurons responding to the specific characteristics of the standard stimulus might acquire the refractory state, while the deviant stimulates "fresh" neural populations. Consequently, the amplitude of the exogenous (or obligatory sensory) ERP components evoked by the deviant will be larger than that of the standard. Such amplitude difference can be considered as a basic physiological phenomenon, without any cognitive functional significance. Alternatively, decreased activity can be considered as a manifestation of an active memory representation, established by the previous stimulation. The "predictive coding" account went a step further, suggesting that repetition suppression depends on the probability structure of the environment (see e.g., Summerfield et al., 2011) and involves an active process which generates models of the causes of the sensory input. These generative models can be thought of as hierarchical memory representations of stimulus characteristics, equivalent to predictive perceptual object representations (Winkler and Czigler, 2012). A stimulus that does not match this representation elicits a "mismatch process." This process is manifested as an ERP component (MMN/vMMN).
It is worth noting here that unpredicted omissions of attended (Bullock et al., 1994) and unattended (Czigler et al., 2006b) visual stimuli also evoke distinct ERP components which are difficult to account for based on the "fatigue" model, since there is no physical stimulus presented to activate "fresh" neural populations, although it is not known to what extent these components can be attributed to violated predictions and attentional effects. However, after more than three decades of research on MMN, the relationship of the "fatigue model" and "memory mismatch" (including the predictive coding account) has remained an unsettled issue (e.g., Näätänen et al., 2005; Garrido et al., 2009; May and Tiitinen, 2010; Wacongne et al., 2011, 2012; Todorovic and de Lange, 2012).
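The "fatigue" account can be made concrete with a toy simulation. The sketch below assumes stimulus-specific populations whose responsiveness drops with each presentation and partially recovers between trials. The function `simulate_adaptation` and the `gain` and `recovery` parameters are hypothetical and are not taken from any of the cited models.

```python
def simulate_adaptation(sequence, gain=0.5, recovery=0.8):
    """Responses of stimulus-specific populations under passive adaptation.

    sequence: list of stimulus labels, one per trial.
    gain: fraction of the remaining responsiveness lost per presentation (assumed).
    recovery: factor by which adaptation decays between trials (assumed).
    Returns one response per trial, where 1.0 is a fully "fresh" population.
    """
    adaptation = {}   # adaptation level per stimulus-specific population
    responses = []
    for stim in sequence:
        # All populations recover a little before the next stimulus arrives.
        for key in adaptation:
            adaptation[key] *= recovery
        level = adaptation.get(stim, 0.0)
        responses.append(1.0 - level)
        # Only the stimulated population adapts further.
        adaptation[stim] = level + gain * (1.0 - level)
    return responses

# Eight standards ("A"), one deviant ("B"), then a standard again.
sequence = ["A"] * 8 + ["B", "A"]
responses = simulate_adaptation(sequence)
print([round(r, 3) for r in responses])
```

In this sketch the deviant evokes a larger response than the preceding standards purely because its population is unadapted; no prediction is involved. That is why deviant-minus-standard differences in a plain oddball sequence cannot by themselves distinguish the two accounts.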

MMN/vMMN (or MMR) can be defined in at least two ways. In a broader and functional sense it is the ERP correlate of an automatic comparative process where the observed stimulus is different from perceptual memory representations of environmental regularities activated by recent external events. According to this definition, stimulus-specific response decrements to repeating events can be considered as a mechanism of memory match, and increased ERP amplitude to rare deviant events as a correlate of memory update. This is in line with the hierarchical predictive coding framework, where updating memory happens via a mismatch process, i.e., prediction error responses update the models about external causes of the observed input (see **Figure 1**). The other definition is more restricted: "genuine" MMN/vMMN is the deviant-minus-standard differential activity, unless the difference is due to modulation by attention or refractoriness (passive amplitude reduction) of a negative ERP component, i.e., N1 (for the visual modality see e.g., Kimura, 2012). Separating "genuine" mismatch from activity due to passive amplitude reduction is important. If there is more than one process underlying stimulus-specific response decrements to repeating events, it is important to isolate these different kinds of activity and identify their potentially different contributions or functional roles.

The neurophysiological processes underlying regularity extraction, i.e., the formation of a predictive representation of stimulus features, are not fully understood yet. A modeling study of the auditory MMN showed that experience-dependent plasticity can be explained by changes in the synaptic efficacy of extrinsic and intrinsic connections of sources generating the MMN (Garrido et al., 2008). Perceptual learning, caused by stimulus repetition, has been suggested to be brought about by changes in intrinsic and extrinsic neural connectivity corresponding to adaptation and prediction updating (model adjustment) processes, respectively (Garrido et al., 2009). Thus, reduction in response amplitude to repeated events is thought to be brought about by fast changes in synaptic connections (Baldeweg, 2006, 2007; Garrido et al., 2009) within and between hierarchical levels of neural elements which represent predictions based on previous events and generate MMRs (prediction errors) when deviation from prediction occurs (Friston, 2005). **Figure 1** shows a simplified diagram of connections through which information flows between different layers of cortical columns at different hierarchical levels based on known functional anatomy (Zeki and Shipp, 1988; Douglas and Martin, 2004; Bastos et al., 2012).

**FIGURE 1 | Simplified scheme of the hierarchical predictive coding framework (Friston, 2005, 2008, 2010).** The figure shows message passing between two putative neuronal populations: error units (E) and representation units (R). In this framework, bottom-up forward connections convey prediction errors (MMN or mismatch response) and top-down backward connections carry predictions, which explain away prediction errors (repetition suppression). Representation units residing in deep layers of cortical columns are thought to code the causes of sensory inputs. Representation units receive input from error coding units (E) in superficial layers in the same level (dotted lines) and lower hierarchical levels, and also from lateral connections at the same level (not shown). Lateral interactions between R and E units are proposed to select and sharpen R units, which in turn encode the causes of a given sensory input. Error units residing in superficial layers of cortical columns receive input from representation units in the same level and the level above. Inhibitory intrinsic connections are depicted by means of black arrows above and below E and R units, respectively. Perception depends upon a set of prior expectations, i.e., regularities extracted from earlier sensory events. Environmental statistical regularities are transformed into predictions about current sensory signals via the interaction of E and R populations. In MMN experiments using scalp EEG recordings the deviant ERP is contrasted to the standard ERP and components of their difference are commonly interpreted as manifestation of a prediction error signal. On the other hand, electrophysiological studies involving repetition suppression, i.e., the decrease in response amplitude over multiple presentations, provide only indirect evidence for the existence of putative representation units. That said, a recent functional magnetic resonance imaging (fMRI) study (de Gardelle et al., 2013) provides initial evidence for units coding perceptual predictions. Nevertheless, the hierarchical predictive coding framework elegantly accommodates the "fatigue model" and "memory mismatch" account of the visual and auditory mismatch negativity.

According to this view (Friston, 2005, 2008, 2010), prediction errors flow bottom-up and update predictions at higher levels, whereas top-down modulations mediate predictions by "explaining away" (reduce) prediction errors at lower levels, forming hierarchical non-linear loops. Predictive coding theories of perception postulate that our internal model of probable causes of sensory events (i.e., reality) consists of a set of such loops (Winkler and Czigler, 2012) being supported by the complex hierarchical organization of brain networks (Kiebel et al., 2008; Wang, 2010; Arnal and Giraud, 2012).
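The interplay of bottom-up errors and top-down predictions can be sketched with a deliberately reduced example. The code below assumes a single error/representation pair updated with a linear delta rule. It is not the hierarchical non-linear scheme of Friston (2005, 2008, 2010), and the function name `run_predictive_loop`, the `learning_rate` parameter, and the input values are hypothetical.

```python
def run_predictive_loop(inputs, learning_rate=0.3):
    """Prediction errors of one error/representation unit pair (toy delta rule)."""
    prediction = 0.0   # top-down prediction held by the representation unit
    errors = []
    for x in inputs:
        error = x - prediction               # bottom-up prediction error
        errors.append(error)
        prediction += learning_rate * error  # the error updates the prediction
    return errors

# Ten repetitions of a "standard" input value, then one "deviant" value.
inputs = [1.0] * 10 + [2.0]
errors = run_predictive_loop(inputs)
print([round(e, 3) for e in errors])
```

With repetition the error is progressively "explained away", which is analogous to repetition suppression, and the deviant input produces a renewed error, which is analogous to a mismatch response. A hierarchical implementation would stack several such loops and weight the errors by their precision, neither of which is modeled here.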

# *Commonly used experimental protocols and procedures to elicit vMMN*

There are mainly two kinds of protocols used to study vMMN. Since the vMMN is elicited by events which violate a probability-based regularity, these protocols systematically vary the probability of different stimulus types. Frequently used is the "oddball" paradigm, where the same type of stimulus is presented frequently, interspersed with a rare different stimulus which is sometimes referred to as "oddball." There are essentially two types of oddball paradigm. In the "active oddball," where the rare stimulus is usually task-relevant and attended, the rare stimulus is termed "target," and is used to elicit P3b/P300 and other attention-related components. In vMMN experiments the "passive oddball" is used (**Figure 2**), where the stimulus stream which is used to build up automatic predictions is unattended, the rare stimulus (or stimulus feature) is task-irrelevant and is termed "deviant," emphasizing its difference from the frequent "standard."
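A passive oddball sequence of the kind described above can be generated in a few lines. The following Python sketch assumes a 10% deviant probability and a minimum of two standards before every deviant. Both values, as well as the function name `make_oddball_sequence`, are illustrative assumptions rather than recommendations from the literature.

```python
import random

def make_oddball_sequence(n_trials, p_deviant=0.1, min_gap=2, seed=0):
    """Pseudo-random oddball sequence with at least `min_gap` standards
    before every deviant (all parameter defaults are assumed values)."""
    rng = random.Random(seed)
    n_dev = round(n_trials * p_deviant)
    n_std = n_trials - n_dev
    spare = n_std - n_dev * min_gap   # standards left after enforcing the gaps
    if spare < 0:
        raise ValueError("too many deviants for the requested minimum gap")
    # Scatter the spare standards over the slots before each deviant
    # and the slot after the last deviant.
    extra = [0] * (n_dev + 1)
    for _ in range(spare):
        extra[rng.randrange(n_dev + 1)] += 1
    sequence = []
    for i in range(n_dev):
        sequence.extend(["standard"] * (min_gap + extra[i]))
        sequence.append("deviant")
    sequence.extend(["standard"] * extra[n_dev])
    return sequence

sequence = make_oddball_sequence(200, p_deviant=0.1, min_gap=2, seed=1)
print(sequence[:15])
```

Fixing the seed makes the sequence reproducible across participants. Whether the same or a different sequence should be used for each participant is a design decision that this sketch does not address.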

Stimuli in every sensory modality elicit exogenous (obligatory) ERP components. The amplitude and latency of these components depend on the physical characteristics of the stimuli (e.g., luminance, contrast, or spatial frequency) and stimulus conditions (e.g., the time interval between successive stimuli). If deviant and standard stimulus categories are not equated appropriately, then the deviant minus standard difference wave is a summated activity of mismatch-related processes and brain electric activity in response to other different stimulus characteristics. This latter activity is not elicited by the violation of the probability-based rule established by the pattern of the stimulus sequence, and it might confound the vMMN. It is not known how variability of stimulus features—other than on which the probability-based rule rests—affects the mismatch generation process. Therefore, in experiments where vMMN is used as a tool to address a specific question of automatic information processing in the brain, it is advisable to make sure that *different stimulus types differ only in that feature which carries the distinctive information*<sup>2</sup>, i.e., which defines the standard vs. deviant stimulus categories.

<sup>2</sup>For example, the free Matlab-based SHINE (Spectrum, Histogram, and Intensity Normalization and Equalization) toolbox offers functions to control low-level image properties (Willenbockel et al., 2010).

**Figure 2** illustrates the oddball paradigm. In the traditional passive oddball paradigm the standard and deviant stimuli differ in their (i) physical properties and (ii) probability. Correspondingly, (i) different (although potentially overlapping) neuronal pools will respond to the standard and deviant and (ii) their level of adaptation will differ. A frequently applied solution to control for potential ERP differences arising due to differences in physical stimulus properties involves changing the probabilities of standard and deviant stimuli across experimental blocks (e.g., Stefanics and Czigler, 2012; Stefanics et al., 2012; Csukly et al., 2013). Running a "reverse block" generates data that allows comparison of ERPs to physically identical stimuli which served both as standard and deviant in different experimental blocks and thus eliminates one of the potential confounds inherent in the design of an oddball paradigm. Although reverse blocks for oddball series offer stimulus conditions which allow control for physical differences between standard and deviant, they do not control for repetition effects arising from the difference in the presentation rate between standard and deviant.
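
The reverse-block comparison can be sketched as follows. This is a minimal illustration in plain Python; the function names are hypothetical and the epoch values are made-up toy numbers (not recorded data), chosen only so that the arithmetic is easy to follow.

```python
def mean_waveform(epochs):
    """Average a list of equal-length epochs (lists of amplitudes) point by point."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def difference_wave(deviant_epochs, standard_epochs):
    """Deviant-minus-standard difference of the two condition averages."""
    deviant_erp = mean_waveform(deviant_epochs)
    standard_erp = mean_waveform(standard_epochs)
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


# Toy, invented epochs (microvolts) for ONE physical stimulus, "A":
# recorded as deviant in the original block and as standard in the reverse block.
a_as_deviant = [[0.0, -2.0, -1.0], [0.0, -4.0, -1.0]]
a_as_standard = [[0.0, -1.0, -1.0], [0.0, -1.0, -1.0]]

print(difference_wave(a_as_deviant, a_as_standard))  # [0.0, -2.0, 0.0]
```

Because both averages come from the physically identical stimulus, the difference no longer contains activity due to physical stimulus differences; as noted above, it can still contain repetition effects.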

The less frequently used protocol to elicit vMMN is the "roving standard" paradigm (**Figure 3**), where the first stimulus in the train can be considered as "deviant" which over several repetitions becomes the "standard." An advantage of the "roving standard" compared to the oddball paradigm is that it allows studying repetition effects following stimulus change, i.e., the time course of response decrement over repetitions. The roving paradigm has been used in only a few vMMN studies so far (Czigler and Pató, 2009; Sulykos et al., 2013), and these studies did not take advantage of the roving protocol to study repetition effects. In terms of experiment duration, running a roving standard paradigm should take less time than running an oddball sequence and its "reverse" control condition, provided that the deviant/standard ratio is the same in both paradigms. Thus, the roving paradigm is less demanding for participants, which might be particularly important in the case of children and patient populations.
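
A roving standard sequence can be sketched in the same spirit. The helper below is hypothetical, and the train lengths (4 to 8 repetitions) are assumptions chosen for illustration; the position within each train is stored so that responses can later be sorted by repetition number.

```python
import random


def make_roving_sequence(stimuli, n_trains, min_reps=4, max_reps=8, seed=0):
    """Return (stimulus, position_in_train) pairs for a roving standard block.

    Position 1 is the first stimulus after a change (the "deviant");
    larger positions correspond to an increasingly established "standard".
    Train lengths are illustrative assumptions. Requires at least two stimuli.
    """
    if len(stimuli) < 2:
        raise ValueError("need at least two different stimuli")
    rng = random.Random(seed)
    sequence = []
    current = rng.choice(stimuli)
    for _ in range(n_trains):
        # Switch to a stimulus that differs from the one just repeated.
        current = rng.choice([s for s in stimuli if s != current])
        train_length = rng.randint(min_reps, max_reps)
        for position in range(1, train_length + 1):
            sequence.append((current, position))
    return sequence


roving = make_roving_sequence(["red", "green", "blue"], n_trains=3, seed=1)
```

The very first stimulus of a block has no predecessor, so in practice it would probably be excluded from the "deviant" average. Since every stimulus serves as both deviant and standard within one block, no separate reverse block is needed.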

#### *Exogenous ERP components in the vMMN range*

When the interest is in mismatch-related processes beyond the stimulus-specific refractoriness, it is important to separate the probability effects on the exogenous components and the putative additional activity. ERP components are often classified as exo- or endogenous (Donchin et al., 1978; Näätänen, 1992; Koelsch, 2012). External stimuli are necessary and sufficient to elicit exogenous components and they are the main determinants of the characteristics of exogenous components, whereas external stimuli are not necessary to elicit endogenous components, which are dependent on factors such as attention and intention. Compared with visual ERPs, the succession and scalp distribution of exogenous auditory components (N1, P1, and N2) is remarkably stable. Most importantly, in the present context, a reliable auditory N1 emerges over the anterior scalp within the 70–160 ms latency range. The auditory N1 consists of several sub-components with different latencies, scalp distributions and refractoriness characteristics (Budd et al., 1998). However, the N1 is treated as a single component in the majority of MMN studies. To claim that at least a part of the deviant-related negativity in vision is due to refractoriness, it is necessary to identify functionally similar exogenous component(s). In fact, the N1 visual component is present in many visual ERP studies, and traditionally, this component is treated as the analog of the auditory N1. However, the component structure of exogenous visual potentials is highly variable. Furthermore, the set of exogenous components in vision is more complex. The onset of visual stimuli might elicit luminance and pattern-specific ERP components. The latency and polarity of these components depend on the stimulated part of the visual field (Jeffreys and Axford, 1972; Di Russo et al., 2002). The interaction of the luminance and pattern-related activity adds further variability to the scalp-recorded waveform.
The polarity and amplitude of scalp-recorded ERPs depend on the spatial orientation of their underlying (dipolar) sources (Di Russo et al., 2002, 2003), which is in turn defined by the particular folding structure of the cortical generator area and its relative position to the active and reference electrodes. Taking into account the spatial extent of visual brain areas and their complex folding structure, it is easily conceivable that some deviant minus standard difference waves will show not only deviant-related negativity but also deviant-related positivity at some posterior sites. Accordingly, although several vMMN studies indicate that in the vMMN latency range the event-related activity is dominantly negative over the posterior locations (over the visual brain areas), in other studies, no characteristic negativities have been recorded.

In the auditory modality, the repetition-related N1 decrement within a stimulus sequence occurs mostly between the first and second stimulus presentation, with hardly any decrement with further stimulus repetitions (Budd et al., 1998), suggesting that refractoriness is the main reason underlying the N1 amplitude decrement. In that study, ERP amplitudes to the first and subsequent stimuli were investigated after a long silent period. Such a decrement results from the combined effect of non-specific factors and factors specific to the repetition of particular stimulus features (stimulus-specific refractoriness). In a recent electrocorticography (ECoG) study using an auditory paired stimulus paradigm, numerous cortical regions were found to generate remarkable N1 responses, and about half of them, including frontal, orbito-frontal, cingular, parietal, and temporal areas, exhibited significant repetition suppression effects (Boutros et al., 2011). This finding suggests that N1 amplitude suppression might result mainly from active processes, and not only from passive refractoriness. Importantly, the difference in the topography of the initial response and the repetition effect suggests that these two functions are supported by distinct neural circuitries. Refractoriness changes as a function of the duration of stimulus onset asynchrony (SOA); therefore, using longer intervals between consecutive stimuli, a smaller amplitude difference is expected. As for the visual modality, according to recent studies, the SOA effects on posterior visual ERP components are not particularly large. Coch et al. (2005) observed no amplitude increase in the N1 range between 450 and 650 ms SOA, while the preceding positivity was larger at the longer SOA value.

In some studies, the application of repeated stimuli after a stimulus change (AAABAA) did not elicit decreased exogenous activity. In an oddball sequence, Kimura et al. (2010d) compared the ERPs of the first and second standard after a deviant in a task with orientation deviancy. The orientation of stimulus bars was task-irrelevant; participants had to respond if the bars had round but not square edges. In this study, the peak of the negative component of the first and second standard after a deviant at ∼150 ms was not different; the ERPs of the first and second deviants diverged somewhat later, at ∼170 ms. Furthermore, there was no difference between the ERPs of the second standard and the average of the standard-related ERPs. Czigler et al. (2006a) presented colored checkerboard stimuli in a regular AABBAABB order (A and B corresponding to red and green), with 350 ms SOA, where the deviant was an unpredicted repetition of a color (e.g., BBAAA). According to the "refractoriness" model, the repeating predicted stimulus (e.g., AA) is expected to elicit smaller exogenous activity. However, such stimuli elicited larger posterior negativity than the regular change (e.g., AB). Moreover, Stefanics et al. (2011) recorded ERPs in a sequence of paired stimuli with equal probability of within-pair color change or color differences. The between- and within-pair SOA was 800 and 300 ms, respectively. In this study, the stimulus change and stimulus repetition elicited almost identical ERPs. Findings of a recent fMRI study might resolve these seemingly controversial results. de Gardelle et al. (2013) presented subjects with repeating face stimuli and found that distinct patches of the face-responsive extrastriate region simultaneously showed repetition enhancement and suppression responses to repetitions.
This finding is consistent with the predictive coding account, which posits that representation (prediction) coding units enhance their activity and error coding units show decreased activity over repetitions.

To demonstrate the relationship between exogenous activity and the deviant minus standard difference potentials, here we survey studies which used deviant stimulus orientation to elicit vMMN. This type of deviant has been applied in several studies in various laboratories, and it was also used in studies that attempted to eliminate refractoriness effects using the so-called "equal probability control" condition. Kimura et al. (2009) presented single gray bars in the center of a dark screen (stimulus with luminance increase). The stimuli elicited a posterior positivity with ∼100 ms latency (P1), followed by negativity with ∼150 ms latency (N1). Astikainen et al. (2008) presented a single dark bar in the center of a gray background (stimulus with luminance decrease). In this study, a large posterior positivity emerged with ∼140 ms latency, and the subsequent negativity with ∼210 ms peak latency. Kimura et al. (2009) showed that the deviant minus standard difference emerged as a parieto-occipital negativity in the 100–250 ms range, while Astikainen et al. (2008) showed negativity in the 185–205 ms range. Czigler and Sulykos (2010) presented a texture of colored oblique lines in a dark field. The latency of the posterior negativity was ∼130 ms, followed by positivity with ∼250 ms latency. Deviant-related negativity appeared in the 130–190 ms interval, with peak latency of ∼160 ms, i.e., the difference potential peaked later than the exogenous negativity. Sulykos and Czigler (2011) presented a set of gray-scaled Gabor-patches in a dark stimulus field, either to the lower or upper half of the visual field. The lower half-field stimulation elicited a posterior positive-negative-positive sequence of potentials with ∼100, ∼150, and ∼240 ms peak latencies, respectively, whereas the polarity of the components was reversed in the upper half-field stimulation (∼100, ∼170, and ∼260 ms peak latencies, respectively).
The deviant minus standard difference potential also showed polarity reversal depending on which hemifield was stimulated, and its peak latency was 130 ms at lower half-field stimulation and 132 ms at upper half-field stimulation, i.e., the deviant-related activity appeared earlier than components in the "N1" or "inverted N1" range. Takács et al. (2013) presented a set of task-irrelevant Gabor-patches with deviant and standard orientations to the whole visual field while participants performed a tracking task presented in the center of the visual field. Over the occipital scalp, Gabor-patches elicited a positive-negative-positive complex with ∼90, ∼110, and 240 ms peak latencies, respectively. At occipito-temporal locations, a further negativity emerged with 170 ms peak latency. Deviant-related negativities emerged in the 120–140 and ∼200–230 ms intervals, i.e., outside the ranges of the exogenous negativities.
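
The latency comparisons surveyed above amount to locating the negative peak of the exogenous response and of the difference potential within a chosen window. A minimal sketch of that step follows; the function name is hypothetical and the waveforms are invented toy values, not data from any of the cited studies.

```python
def negative_peak_latency(waveform, times_ms, window_ms):
    """Latency (ms) of the most negative sample inside window_ms = (start, end).

    Ties are resolved in favor of the earliest sample.
    """
    start, end = window_ms
    in_window = [(amp, t) for amp, t in zip(waveform, times_ms) if start <= t <= end]
    if not in_window:
        raise ValueError("no samples fall inside the window")
    _, latency = min(in_window)
    return latency


# Invented toy waveforms (microvolts), sampled every 20 ms.
times_ms = [100, 120, 140, 160, 180, 200]
standard_erp = [1.0, -1.0, -3.0, -1.5, 0.0, 0.5]      # exogenous negativity
difference_wave = [0.0, -0.2, -0.8, -1.5, -0.6, 0.0]  # deviant minus standard

print(negative_peak_latency(standard_erp, times_ms, (100, 200)))     # 140
print(negative_peak_latency(difference_wave, times_ms, (100, 200)))  # 160
```

In this toy case the difference potential peaks after the exogenous negativity; with real data the outcome depends on the stimuli, the electrode sites, and the window chosen.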

Some of the above studies (Astikainen et al., 2008; Czigler and Sulykos, 2010) showed that the posterior negative difference potential appeared in the range of a positive ERP component. Similar examples were observed in studies with other deviant features (see e.g., Czigler et al., 2002; Liu and Shi, 2008; and Stefanics et al., 2011 for color; Kremláček et al., 2006 and Pazo-Alvarez et al., 2004b for motion direction; Maekawa et al., 2005 for shape/spatial frequency). However, none of these studies reported "mismatch positivity" at posterior sites, i.e., a potentially refractoriness-related effect appearing as a positive difference potential. To our knowledge, no argument has been presented for the exclusive sensitivity to refractoriness of posterior negative ERP components and the lack of refractoriness in the case of positive components. Nevertheless, it is worth noting that positive components of the deviant minus standard waveforms have been observed at central (Stefanics et al., 2012; Csukly et al., 2013) and frontal (Stefanics and Czigler, 2012) sites, evoked by deviant facial emotions and hand laterality, respectively, which correlated with behavioral measures.

#### *The equal probability control for repetition effects*

Schröger and Wolff (1996) and Jacobsen and Schröger (2001) suggested the elegant method of equal probability control to deal with repetition effects due to refractoriness assumed to be present in the deviant minus standard activity obtained in oddball paradigms. This method allows comparison of ERPs elicited by the deviant of the oddball sequence to the ERPs elicited by physically identical stimuli from a sequence without one particular frequent (standard) stimulus. In the equal probability control condition (**Figure 4**), stimuli with a structured set of parameters are presented where the mean difference between consecutive stimuli is equal to or larger than the difference between the deviant and standard used in the oddball sequence; furthermore, stimuli identical to the oddball deviants have the same probability as the deviants. Activity considered as "genuine" MMN (i.e., MMN without stimulus-specific refractoriness effects superimposed) emerges when the oddball deviant elicits larger negativity than the control stimuli. It should be noted that the equiprobable control can be considered as a sequence of deviants where each stimulus violates the expectation based on the previous stimulus, i.e., that a given stimulus would repeat. Therefore, the ERP to the equiprobable control stimulus probably contains weaker prediction error responses than those to oddball deviants, since there is less sensory evidence available for every external event in the equiprobable control condition due to the lack of sequential stimulus repetitions. From a probabilistic point of view, the "genuine" vMMN to the oddball deviant reflects a prediction error to events which violate expectations based on stronger sensory evidence provided by frequent standard stimuli.
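
The two constraints of the equal probability control (matched probability and a sufficiently large mean difference between consecutive stimuli) can be sketched for bar orientations as follows. The set of five orientations, the helper names, and the rule of drawing uniformly from all stimuli except the previous one are assumptions made for this illustration; the published studies may have constructed their sequences differently.

```python
import random


def make_equiprobable_control(stimuli, n_trials, seed=0):
    """Random sequence without immediate repetitions.

    Each stimulus is drawn uniformly from all stimuli except the previous
    one, so each occurs with a long-run probability of 1 / len(stimuli).
    """
    rng = random.Random(seed)
    sequence = []
    previous = None
    for _ in range(n_trials):
        previous = rng.choice([s for s in stimuli if s != previous])
        sequence.append(previous)
    return sequence


def orientation_step(a_deg, b_deg):
    """Smallest angular difference between two bar orientations (period 180 deg)."""
    d = abs(a_deg - b_deg) % 180
    return min(d, 180 - d)


def mean_orientation_step(sequence):
    """Mean orientation difference between consecutive stimuli."""
    steps = [orientation_step(a, b) for a, b in zip(sequence, sequence[1:])]
    return sum(steps) / len(steps)


# Hypothetical stimulus set: five orientations, so each has p = 0.2,
# matching an (assumed) oddball deviant probability of 0.2.
orientations_deg = [0, 36, 72, 108, 144]
control = make_equiprobable_control(orientations_deg, n_trials=500, seed=1)
```

With this stimulus set every consecutive step is at least 36 degrees, so the control satisfies the mean-difference requirement for any oddball in which deviant and standard differ by 36 degrees or less.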

Studies employing changes in line orientation have illustrated the relationships between deviant-related negativity and exogenous components using the equal probability control. These studies (Astikainen et al., 2008; Kimura et al., 2009) were discussed above in the context of the relationship between the exogenous and deviant-related negativities. Kimura et al. (2009) showed that the equal probability control efficiently removed the early part of the deviant-related negativity of oddball sequences. As a result, "genuine vMMN" appeared in the 200–250 ms range. Astikainen et al. (2008) showed that the deviant minus equal probability control difference resulted in a less broad distribution of the difference potential over posterior locations, but the latencies (185–205 ms) were identical in the deviant minus standard and deviant minus control differences. In a recent study, Kimura and Takeda (2013) presented a set of gray bars on a dark field and recorded exogenous activity at parieto-occipital locations with ∼180 ms peak latency for the deviants and controls, whereas the standard elicited no N1-related negativity. The deviant minus standard difference potential resulted in long-lasting bilateral negativity within the 120–250 ms range. The amplitude of the deviant minus control difference ("genuine vMMN") was much smaller and restricted to the right side, indicating that the equal probability control dissociated the effects of exogenous components and an additional posterior negativity.

Schröger (1997) and Ruhnau et al. (2012) argued that the equal probability control overestimated the effect of refractoriness. This is because oddball is a regular sequence, whereas the equal probability control is an irregular one. Therefore, an "irregularity effect" might add to the lack of stimulus repetition. They proposed a sequence called cascadic control. In this sequence stimuli with various characteristics are ordered in upward-downward sub-sequences, preserving regularity and stimulus variability (and avoiding stimulus repetition). In this study the random equal probability control elicited larger N1 than the oddball deviant and cascadic equal probability control, suggesting that the random equal probability control might overestimate frequency-specific repetition effects<sup>3</sup>. File et al. (in preparation) compared vMMN of the traditional oddball paradigm, the equal probability control and the cascadic control. The deviant set of bar patterns had a different orientation than the standard. Both the equal probability and the cascadic control eliminated the deviant-related effect in the 120–160 ms interval.
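
The upward-downward ordering of a cascadic sequence can be sketched as a simple walk that reverses direction at the ends of an ordered stimulus set. The function name and the five-level orientation set are hypothetical, and the sketch makes no claim to reproduce the exact sequences used in the cited studies.

```python
def make_cascadic_sequence(levels, n_trials, start_index=0):
    """Walk up and down through the ordered `levels`, reversing at both ends.

    The sequence is fully regular (each step is predictable) and contains
    no immediate stimulus repetitions.
    """
    if len(levels) < 2:
        raise ValueError("need at least two levels")
    if not 0 <= start_index < len(levels):
        raise ValueError("start_index out of range")
    sequence = []
    index, step = start_index, 1
    for _ in range(n_trials):
        sequence.append(levels[index])
        # Reverse direction when the next step would leave the set.
        if not 0 <= index + step < len(levels):
            step = -step
        index += step
    return sequence


print(make_cascadic_sequence([0, 36, 72, 108, 144], n_trials=10))
# [0, 36, 72, 108, 144, 108, 72, 36, 0, 36]
```

Note that in this simple walk the two end levels occur half as often as the intermediate ones, so matching the probability of the control stimulus to that of the oddball deviant requires choosing the number of levels and the position of the control stimulus accordingly.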

<sup>3</sup>The cascadic control can also be viewed as a "roving standard" paradigm with predictable changes in pitch in two directions alternating in short, regular sequences. Strictly speaking, in the oddball sequence pitch change has a low probability, whereas in the cascadic control a certain change in a given direction has a high probability. One might argue that difference between the response to the oddball deviant and its cascadic control might not only reflect differences in prediction errors but also activity related to fulfilled predictions.

In addition to studies on orientation deviancy, equal probability control was introduced in three other studies. Czigler et al. (2002) investigated color-related deviance and obtained similar posterior negativity in the deviant minus control and deviant minus standard difference potentials. In this study, the average distance between the various colors of the control condition was not necessarily larger than the distance between the standard and deviant; therefore, the control condition did not guarantee non-refractory ERPs. However, in this study, the latency of the exogenous posterior negativity was 100 ms, whereas deviant-related activities emerged later, in the 128–142 ms range, where the exogenous activity was positive. In this study, the standard elicited the largest exogenous negativity. Li et al. (2012) used equal probability control to study emotion-related vMMN. Facial emotions are categorically different; therefore the magnitude of the distance within the oddball and control sequences is meaningless. In the oddball condition, the standard face was neutral and the deviant face was sad, whereas in the control conditions, three additional emotions were added to the sequence. Both the deviant minus standard and the deviant minus control difference potentials were negative within a long range (100–400 ms) over the occipito-temporal regions. In the latency range of the exogenous negative component, the negative difference was smaller (but present) in the deviant minus control difference potential, suggesting the contribution of refractoriness for the standard face of the oddball sequence. Recently, Astikainen et al. (2013) also used equal probability control to study emotion-related vMMN. In the oddball sequence rare fearful and happy faces were presented among frequent neutral faces, whereas in the equal probability condition all three expressions were presented with the same probability. The independent component analysis showed that the deviant minus standard differential negativity at ∼130 ms was larger at right posterior sites than the deviant minus control difference potential, indicating that a portion of the deviant minus standard negativity could be explained by repetition effects.

In summary, the results of equal probability control suggest that stimulus-specific repetition effects might contribute to the increased negativity to the deviant stimulus. Whether these effects reflect basic neurophysiological processes without functional significance in perceptual learning is still an open issue, although it is unlikely to be the case (cf. predictive coding theories). However, the majority of the studies indicated the emergence of a posterior negativity, which cannot be attributed to the refractoriness of the endogenous components. Furthermore, considering the results of these studies and the results showing that deviant-related negativity might precede or follow negative exogenous components, there is no unequivocal evidence that the additional negativity (genuine vMMN) emerges later than exogenous activity. Applying equal probability control in future studies to obtain results allowing generalization to other features than line orientation is recommended.

#### *Other methods to control repetition effects (refractoriness)*

To investigate the effects of repetition, it is possible to compare the ERPs of the deviant of the oddball sequence to the ERPs elicited by identical stimuli from sequences without the standard stimuli ("lonely deviant"). If memory representation of the standard is necessary for the emergence of the deviant-related activity, an additional negativity is expected in the deviant minus standard difference potential. Without such additional activity, the similarity of the negative ERP component (similar latency and scalp distribution but larger amplitude for the lonely deviant) supports a refractoriness effect. Kenemans et al. (2003), using changes in spatial frequency of grating stimuli, found a posterior negativity with similar latency and scalp distribution for the "lonely deviant" and in the deviant minus standard difference potential, supporting the refractoriness account. Due to the larger interval between the stimuli (decreased non-specific refractoriness), the negativity to the lonely deviant was larger. However, in a similar study, Astikainen et al. (2004) did not obtain a similar increased negativity using tilted bars as the standard deviant and "lonely deviant." Berti and Schröger (2006) investigated the distracting effects of task-irrelevant stimuli on duration discrimination tasks. In an oddball condition in the standard trials, the stimuli (triangles) were presented to the center of a screen, but infrequent stimuli were presented at either of two eccentric positions. In a control condition, the probability of stimulation in the three possible positions was equal, and in another control condition, the probability of the central position was equal to the sum of the probabilities of the eccentric position. Accordingly, in the oddball condition, the standard acquired a probability-based regularity, whereas no such regularity was present in the equal probability, 50% standard, and 25–25% deviant conditions.
Deviant-related posterior negativity of ∼220 ms latency (N2p according to the authors' terminology) appeared only in the oddball condition. This negativity might be associated with the vMMN, and the results show that rareness itself is not enough to elicit this component.

Indirect support for "refractoriness" in the N1 latency range was provided by Kimura et al. (2008a, 2010b). Higher stimulus intensity is expected to increase response amplitude, i.e., a deviant with higher luminance should elicit larger N1 due to the additional exogenous activity, which in turn might contribute to the deviant minus standard difference. In these studies larger negativity appeared for deviants with higher luminance, but not for deviants with lower luminance. Stagg et al. (2004) also compared the effects of brighter and darker deviants. In their study vertical bars were presented to the upper and lower half of the visual field, and both luminance and the deviancy-related effects appeared after the N1 negativity. While both the bright and dark bars elicited similar deviant-related negativity in the 210–400 ms range (comparison between identical stimuli as deviant vs. standard), the bright stimuli elicited larger negativity (comparison between the bright and dark stimuli). Therefore, in this study, the effects of physical difference and deviant-related activity were additive.

In summary, deviant-related negativity cannot be fully explained on the basis of stimulus-specific refractoriness. At the same time, the contribution of repetition effects and stimulus-specific refractoriness cannot be ruled out.

#### *Stimulus-specific adaptation and refractoriness*

The effect of SSA of the oddball sequences can be viewed in the context of adaptation studies, where the adaptor stimulus is presented first, sometimes for a longer time, followed by a probe stimulus. The effect of an adaptor is stimulus-specific, both at the level of behavioral performance and ERP activity (e.g., Webster and MacLin, 1999; Eimer et al., 2010; Kloth et al., 2010; Eimer, 2011; Zimmer and Kovács, 2011). The adaptation effect is widely considered as an index of an acquired specific memory representation. There is apparently a discrepancy in the interpretation of repetition effects between fields using the adaptation method and the oddball task, as in the former field repetition-related changes are thought to reflect memory formation (e.g., Desimone, 1996; Ringo, 1996), whereas in the latter field a decrease in response amplitude is often considered as an irrelevant neurophysiological effect reflecting neuronal "fatigue" or "refractoriness" (e.g., Maess et al., 2007).

In functional MRI, using adaptation effects (repetition suppression) is a standard mapping tool to identify brain regions associated with different stages of stimulus-processing and to investigate memory representation (e.g., Henson, 2003; Grill-Spector et al., 2006; Kovács et al., 2013), even though the relationship between repetition suppression and repetition enhancement is a more complex issue (Segaert et al., 2013). For example, Park et al. (2007) observed decreased activity in brain areas sensitive to visual scenes if a scene was preceded by a similar scene, but from a narrower view. This difference was attributed to an effect called boundary extension (Czigler et al., 2013), and interpreted as a proof of the illusory memory representation of scenes represented together with a broader background.

Mismatch negativity has a potential analog in the stimulus repetition effects measured with single-cell recording in a variety of species including mice, cats, rats, owls and primates. SSA is the known single-neuron phenomenon closest to MMN (for reviews see Nelken and Ulanovsky, 2007; Escera and Malmierca, 2014). SSA is a non-trivial effect, since use dependence (refractoriness or fatigue) cannot account for SSA (Nelken and Ulanovsky, 2007). SSA and the auditory MMN show remarkable similarities. The magnitudes of SSA and MMN are both negatively correlated with the probability of the deviants but positively correlated with the difference between standard and deviant. However, an important difference is the earlier timing of SSA relative to MMN, which led Nelken and Ulanovsky (2007) to suggest that SSA is a correlate of change detection in the primary auditory cortex upstream of MMN, and that MMN itself is a compound response of primary and higher-level cortical areas with longer response latencies. Besides cortical neurons, SSA has also been observed in subcortical structures, such as the superior colliculus and thalamus, supporting the notion of a hierarchically organized change detection system (Grimm and Escera, 2012; Escera and Malmierca, 2014), which is in line with the hierarchical predictive coding framework.

Although the exact mechanisms and neurophysiological effects of stimulus-specific adaptation in the visual system are not fully understood yet, at least three mechanisms have been identified, including somatic afterhyperpolarization, synaptic (network) mechanisms, and synaptic depression due to the depletion of vesicles from the presynaptic terminal (for a review, see Kohn, 2007). It is important to note that only one of the three contributing mechanisms of adaptation, namely depletion of neurotransmitter vesicles, is in line with the interpretation of repetition effects according to the passive "refractoriness" model. Furthermore, SSA has more complex properties than is usually assumed from neural "refractoriness" in human electrophysiology (Nelken, 2012; Nelken et al., 2013). However, it is relatively unknown whether mechanisms underlying the repetition-related amplitude reduction and the increased response to "unpredicted" events interact. In cognitive terms the processes supported by these mechanisms correspond to the build-up of predictions (internal model of the environment), and change detection (model update). Human ECoG recordings indicate that not every brain site that responds to repeated tones shows repetition suppression (Boutros et al., 2011); thus it is plausible that the initial response to the unpredicted stimuli and repetition suppression are two linked, but separate, functions.

At this stage, two outstanding issues can be pointed out. First, is there a relationship or interaction between the processes underlying the repetition-related amplitude decrement for the standard (adaptation, refractoriness, or repetition suppression) and the increased activity to the deviant ("genuine" mismatch negativity)? According to the hierarchical predictive coding framework (Friston, 2005, 2008, 2010) the two processes are mutually linked and influence each other. Neurophysiological (Ulanovsky et al., 2003) and ERP findings (Boutros et al., 2011) as well as empirically based models (Garrido et al., 2009) argue for the contribution of "refractoriness" to the mismatch process. However, there is no direct empirical evidence in the vMMN literature for a link between the change of the exogenous components (as memory representation of the standard) and the detection of changing stimulation. Second, are there any other correlates (ERP or other) associated with vMMN-related memory representation (i.e., to-be-mismatched memory)? In the auditory modality, Haenschel et al. (2005) described a positive ERP component for stimulus repetition (repetition positivity, see also Costa-Faidella et al., 2011). Until recently, no visual analog of this component has been reported, although a recent fMRI study by de Gardelle et al. (2013) presented subjects with repeating face stimuli and found that distinct patches of face-responsive extrastriate region showed concurrent repetition enhancement and suppression to repeated stimuli. As previously mentioned, some studies have shown that vMMN was apparently independent of the "refractoriness" of exogenous activity. In these cases, we have no data concerning the memory acquisition and retention processes, and it is possible that these processes are different from those underlying the decreased amplitude of the exogenous components or repetition positivity.

#### **VISUAL MMN AND ATTENTION**

vMMN is thought to be a neural correlate of automatic perceptual processes. To identify components of the deviant minus standard difference potential as a vMMN, it is necessary to ensure that the eliciting stimuli remain outside the focus of attention. It is important to recognize that this issue has both theoretical and methodological significance. In the hierarchical predictive coding framework the major task of the perceptual system is to predict future events as precisely as possible (Muckli, 2010). Attention is thought to modulate the precision of prediction errors by altering the gain of error-units (Friston, 2005, 2010). Higher precision means less uncertainty of prediction errors. According to this hypothesis, attention increases the weight of error units processing certain features or events and controls their relative influence at different levels (cf. Bowman et al., 2013). The momentary strength of top-down and bottom-up interactions is dynamic, with attentional processes being able to modulate the weight of prediction errors (Clark, 2013). Accordingly, recent functional MRI findings support such a predictive coding model where top-down predictions attenuate sensory signals while attention can reverse such effects (Kok et al., 2012). Apart from theoretical considerations, from a methodological point of view, task-relevant or otherwise attended stimuli elicit posterior negativities in comparable latencies (e.g., Harter and Guido, 1980; Czigler and Csibra, 1990; Kenemans et al., 1993; Torriente et al., 1999); that is, attentional effects might easily confound MMRs. Therefore, careful control of attentional processes is necessary for the identification of posterior negativities as vMMN.

In the majority of auditory MMN studies, attention to the MMN-related stimuli is reduced by visual tasks. Experimental protocols often involve watching a silent movie or reading a book, and due to lack of behavioral indicators of attentional involvement, it is difficult to gauge to what extent attention might be involved in those studies. Nevertheless, the claim that auditory MMN can be elicited independent of attention is supported by studies showing an MMR in sleeping newborns (Stefanics et al., 2007, 2009; Háden et al., 2009), sleeping adults (Nashida et al., 2000; Atienza and Cantero, 2001; Sculthorpe et al., 2009), and comatose patients (Kane et al., 1993, 1996; Fischer et al., 1999). In the majority of vMMN studies, the concurrent tasks are also visual, because in the absence of other relevant visual events it is difficult to withdraw attention from visual stimuli. Vision is usually considered the dominant sensory modality, at least at the pre-response level, where visual distractors cause more interference to auditory processing than vice versa (Chen and Zhou, 2013). Several different protocols have been used to keep the participants' attention engaged and away from the mismatch-evoking stimuli. **Table 1** summarizes different approaches that have been used to reduce attention to the vMMN-related sequences. As a prototypical example, Winkler et al. (2005) instructed participants to detect infrequent stimulus changes of a central fixation cross, while mismatch-evoking stimuli were presented in the background. From time to time, the cross became wider or longer, which participants had to indicate with a button press. After the experiment participants were debriefed about the vMMN-related stimuli and the stimulus changes. According to their reports, they did not notice the regularity within the sequences. Czigler and Pató (2009) used a similar central task arrangement and debriefed participants in a detailed interview about their experiences. According to the answers, they were unaware of the changes within roving standard sequences. In spite of the lack of awareness, changes elicited posterior negativities. After an instruction that brought the changes within the sequence to the attention of participants, both scalp distribution and latencies of the negativities were markedly different.

Using an attentional blink paradigm, Berti (2011) investigated the potential involvement of attention in mismatch generation more directly. In this elegant experiment, irrelevant deviant stimuli (stimuli in deviant location) followed the target events at various lags in rapid serial visual presentation (RSVP) sequences. A robust result of attentional blink studies is that if the target is followed by another target stimulus within an interval of ∼100–500 ms, the probability of detecting the second target decreases (e.g., Dux and Marois, 2009). Berti (2011) observed vMMN in the attentional blink interval, indicating that no attentional processing is needed for the emergence of this ERP component.

*Different approaches are listed according to their putative efficiency to engage the participant's attention in tasks that are irrelevant to the vMMN-evoking stimuli.*

Continuous tasks, such as tracking and RSVP of sequences together with para-foveal or peripheral stimulation, seem to be the most stringent controls. A somewhat less strict method is the introduction of detection tasks at the fixation point together with presentation of the vMMN-related stimuli outside the fixation field. As for the ecological validity of this spatial arrangement of the stimuli, in everyday situations unattended but important events first occur outside the center of our visual field. In a perhaps more effective variant, the onset time of task-related (target) stimuli is independent of the appearance of vMMN-related stimuli; in the other version, the onset of the task-relevant stimuli coincides with that of the vMMN-related stimuli (and usually of the standards). Furthermore, reduction of attention to the vMMN-related stimuli is presumably weaker if the target stimuli are members of a sequence of vMMN-related events. This arrangement is similar to the three-stimulus oddball paradigm (Katayama and Polich, 1998, 1999). In some studies, vMMN-relevant stimulus features are present also in the task-relevant objects. A problem with this design is that studies on object-related attention have shown that irrelevant features of task-related stimuli cannot avoid attentional processing (e.g., Duncan, 1984).

A set of studies attempted to translate auditory MMN protocols by presenting the task-irrelevant visual stimuli together with the task-relevant auditory stimuli. To reduce the saliency of the visual stimuli, some studies have combined the auditory task with visual target stimuli. Finally, there have been attempts to record vMMN without any concurrent task and vMMN has been investigated using task-related stimuli. On one hand, it is important to note that even if the level of attentional control in vMMN studies is highly variable, the results of the various studies have been remarkably similar, since their overwhelming majority has reported negative-going deviant minus standard ERP components with posterior scalp distribution in the ∼100–400 ms range. Nevertheless, it does not mean that strictly controlling attention is not required in future studies, since attentional effects might overlap with and confound components related to automatic mismatch processes. On the other hand, as the results of some studies show, vMMN is not independent of the characteristics of the ongoing task, but in this respect, the results are not unequivocal.

By varying the difficulty of a tracking task, Heslenfeld (2003) obtained identical vMMNs, but the amplitude of an anterior positivity decreased as a function of tracking difficulty. In an fMRI study, Yucel et al. (2007) reported reduced deviant-related posterior activity during a more difficult tracking task. Kimura et al. (2010b) investigated sequential regularity effects on vMMN and observed that vMMN-related activity to the rare stimuli of the regular patterns was absent in a condition where participants attended to the regularity. Kimura and Takeda (2013) presented a set of bars in a passive oddball sequence, and varied the difficulty of a size discrimination task, where from time to time the fixation circle became smaller. Using an equal probability control the authors eliminated the earlier effect of deviant-related negativity. As a function of task difficulty the latency of the deviant-related negativity (vMMN) became longer (186, 195, and 226 ms, respectively). It seems that the difficulty of a task-set had a moderate effect on the speed of deviant processing. Task difficulty had no effect on vMMN amplitude. Kuldkepp et al. (2013) utilized motion direction stimuli and instructed participants to ignore or attend motion stimuli presented in the background. The authors found two distinguishable posterior vMMN components in the ignore condition, whereas in the attended condition a differential response was only observed in the later interval at frontal location. Kremláček et al. (2013) systematically varied attentional load (no-load, easy, and difficult) using a central number detection task while presenting oddball sequences of visual motion direction stimuli. They found no effect of attentional load manipulation on vMMN amplitude.

In an MEG study by Kogai et al. (2011), vMMN responses elicited by undetected (masked) stimuli were recorded. The standard and deviant stimuli were gratings with various spatial frequencies. The authors obtained stronger responses to deviants in the 143–154 ms range even when deviant detection was below 6%. Perhaps it is safe to conclude that vMMN is a correlate of automatic processes, but these processes are not fully independent of the load and specificity of the ongoing task.

Another aspect of automaticity, namely processing capacity, was assessed by Czigler and Sulykos (2010). In this study reduced orientation-related vMMNs to peripherally presented bar-patterns were observed when the central task required orientation detection, and color-related vMMNs were also reduced if the central task required color detection. It seems plausible that sharing processing resources of structures involved in the primary attention task may have reduced the activity of the mechanisms underlying vMMN. In the field of visual attention research similar results were obtained within the framework of the dimensional weighting theory (Müller et al., 1995). The feature-specific effect implies a limit of the vMMN automaticity, and that *both overt attention and automatic change detection (predictive) processes might rely on the same or overlapping neural resources.* If the processing of task-relevant and irrelevant stimuli share certain common structures, and the former has a selective effect on the latter, then processes underlying vMMN are not fully autonomous. Importantly, in the study by Czigler and Sulykos (2010) the effect of shared capacity was due to the influence of a task-set (attend to orientation or attend to color), instead of the necessity of simultaneous stimulus processing. Thus, the influence on vMMN must have originated from control processes.

The relationship between the stimuli regulating the ongoing behavior and the processing of irrelevant changes requires further investigations. This is because phenomena of visual attention, like contingent capture (e.g., Folk et al., 1992) may predict the facilitation of task-related dimensions, instead of the diminished activity within such dimensions. In summary, it is recommended to control for attentional effects as efficiently as possible, but taking into account also that highly demanding tasks may exhaust participants faster.

# **THE LINK BETWEEN vMMN, VERIDICAL PERCEPTION, AND BEHAVIOR**

Automaticity is a key characteristic of the MMN response. Perceptual learning and the generation of perceptual prediction error responses have been demonstrated to occur in the absence of focused attention. Since behavior is usually linked to performance on the processing of task-relevant items, and vMMN stimuli are task-irrelevant, the issue of a relationship between vMMN and behavior is seldom investigated. However, just because information processing mechanisms operate independently of attention, it does not mean that they do not influence behavior. In fact, most of the information carried by the light entering the retina is processed "automatically" without conscious effort or reliance on attentional resources (Velmans, 1991). The question arises whether vMMN mechanisms play a functional role in such automatic processes. As mentioned in Section Memory mismatch and refractoriness, the main function of System 1 in Kahneman's framework is to maintain and update our predictive model of the world (Kahneman, 2011) and MMN is the neural correlate of the automatic detection of *unpredicted* changes in our visual environment carried out by System 1. That is, processes underlying the auditory and visual MMN seem to have a key role in veridical perception. But how does veridical perception affect everyday behavior?

The auditory and visual MMN response is thought to reflect the important cognitive process of automatic stimulus discrimination (for reviews, see Kujala et al., 2007; Czigler and Pató, 2009; Näätänen et al., 2007, 2011; Kujala and Näätänen, 2010). A relationship between auditory MMN and behavioral measures of discrimination ability has been reported in several studies (Lang et al., 1990; Näätänen et al., 1993; Baldeweg et al., 1999; Desjardins et al., 1999; Amenedo and Escera, 2000; Kujala et al., 2001; Novitski et al., 2004; De Sanctis et al., 2009). It is generally accepted in the auditory MMN field that perceptual discrimination performance is strongly associated with MMN characteristics (amplitude and/or latency), e.g., increasing stimulus deviance increases MMN amplitude which correlates with higher discrimination rate. From a predictive point of view, perception involves inference about the causes of sensory input received by the brain. The fact that magnitude of prediction error response evoked by improbable events exhibits a relationship with behavioral measures of discrimination performance indicates that the efficiency of perceptual categorization may depend on the ability of the brain to infer upon the causes of sensory input. Automatic sensory discrimination reflected by auditory MMN is also associated with psychosocial functioning in healthy adults (Light et al., 2007) and has been suggested to serve as a gateway to higher order cognitive operations (Rissling et al., 2013). Similarly in the visual domain, vMMN has been argued to show automatic categorization processes based on fairly complex stimulus representation (Czigler, 2013).

It is uncommon in vMMN studies to collect behavioral data relevant to the processing of the vMMN-evoking stimuli. One reason is that usually a distractor task is employed in vMMN paradigms (as discussed in the previous section), where participants behaviorally respond, usually by pressing a button, to task-relevant stimuli. The distractor task serves the important purpose of eliminating potential effects of attention on ERPs to task-irrelevant standard and deviant stimuli. Applying a distractor task allows the experimenter to focus exclusively on effects of "surprise" or "deviance," since brain responses to unattended and task-irrelevant stimuli are supposed to be uncontaminated by attentional and behavioral response-related activities. Thus, the standard and deviant stimuli in vMMN paradigms are usually task-irrelevant; consequently, no behavioral data are collected during the recordings that could demonstrate the relevance of the processes underlying vMMN generation to behavioral functions. Another possible reason is that often low-level visual features are used to establish regularities in vMMN experiments (e.g., line orientation, spatial frequency) without obvious links to higher-level cognitive functions that are usually probed by behavioral tasks. Thus, the behavioral significance of vMMN responses, or the relationship between the vMMN response and behavioral measures, is seldom demonstrated.

How can we obtain behavioral measures relevant to perceptual (cognitive) processes putatively related to vMMN processes, when vMMN is evoked by unattended and task-irrelevant events? The behavioral advantages brought about by automatic deviance detection systems ("primitive intelligence," Näätänen et al., 2001) should be demonstrated in vMMN studies. To this end, one should show that there is a link between a vMMN property (e.g., amplitude, latency) and a behavioral index of performance in the cognitive domain where a regularity was used in a given experiment. To gain insight into how visual prediction error responses support veridical perception, we suggest that future studies should investigate the *relationship between visual mismatch responses and relevant behavioral measures*. Obtaining behavioral data (psychophysics, questionnaires, etc.) in separate protocols that assess functions putatively related to the vMMN-generating system is recommended.
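One simple way to quantify such a link is a correlation across participants between a vMMN measure and a behavioral score. The following minimal sketch assumes one vMMN amplitude and one behavioral score per participant; the function name `pearson_r` and the list-based input layout are illustrative assumptions, and no real data are shown.

```python
import math


def pearson_r(vmmn_values, behavior_scores):
    """Pearson correlation between per-participant vMMN values and scores.

    Both inputs are equal-length sequences of numbers, one entry per
    participant (a hypothetical layout chosen for this sketch).
    """
    n = len(vmmn_values)
    if n != len(behavior_scores) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 entries")
    mean_x = sum(vmmn_values) / n
    mean_y = sum(behavior_scores) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(vmmn_values, behavior_scores))
    sxx = sum((x - mean_x) ** 2 for x in vmmn_values)
    syy = sum((y - mean_y) ** 2 for y in behavior_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

A coefficient of this kind only describes the strength of a linear association; in an actual study it would be accompanied by an appropriate inferential test and, where several vMMN measures are examined, a correction for multiple comparisons.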

Until now, only a few studies investigated the relationship between vMMN and behavior. In a study by Stefanics and Czigler (2012) laterality of hands was used to establish a regularity in the stimuli (e.g., pictures of right hands were repeated frequently (standard) with occasional pictures of left hands (deviant) interspersed in the stimulus stream). Preference of participants to use one hand over the other was measured by the Edinburgh handedness questionnaire. They found a significant relationship between handedness score and visual mismatch amplitude at the left fronto-temporal region for right-hand deviants, indicating that hand preference and MMRs to hands with unexpected laterality are related; however, the exact nature of the relationship is not yet clear. In a recent study by Gayle et al. (2012) happy and sad faces were used to elicit vMMN in healthy individuals and autism spectrum personality traits were measured by the Adult Autism Spectrum Quotient (AQ). Smaller vMMN amplitudes to happy faces were associated with higher AQ score, and the authors suggested that vMMN evoked by unexpected emotional expressions may be a useful indicator of affective reactivity. Another recent study (Csukly et al., 2013) using emotional faces reported a correlation between vMMN amplitude to happy faces and emotion recognition performance as measured by the Ekman-test (Ekman and Friesen, 1976), both in healthy subjects and patients with schizophrenia.

The importance of auditory MMN-generating processes in supporting cognition and everyday behavior by veridical perception is highlighted in neurodevelopmental and psychiatric disorders where cognitive impairments are often accompanied by MMN deficits (for a review, see Näätänen et al., 2011). Numerous studies on developmental dyslexia used auditory MMN as an objective index of deficits in auditory information processing (Kujala and Näätänen, 2001). Furthermore, audiovisual training has been shown to enhance auditory cortical discrimination accuracy, as indexed by MMN, and concurrently improve reading skills in children with dyslexia (Kujala et al., 2001).

In schizophrenia research, one of the most replicable electrophysiological abnormalities is the reduced auditory MMN response (Umbricht and Krljes, 2005; Todd et al., 2012). MMN deficits are one of the features in schizophrenia that indicate severe abnormalities in fundamental brain processes of prediction and inference (Stephan et al., 2006). This is further corroborated by parallel evidence for a key role of NMDA receptors in auditory MMN generation and in the pathophysiology of schizophrenia (Umbricht and Krljes, 2005; Coyle, 2006; Javitt, 2009). Visual MMN studies with clinical samples are relatively rare (for a review, see Maekawa et al., 2012; Kremláček et al., in preparation) but they provide hints to a relationship between vMMN and various deficits. Urban et al. (2008) used deviant motion-direction and found attenuated vMMN in patients with schizophrenia, which was associated with medication dose, level of functioning and the presence of a deficit syndrome. A study by Maekawa et al. (2013) found attenuated vMMN to deviant windmill pattern stimuli with high spatial frequency in patients with bipolar disorders. Another recent vMMN study by Csukly et al. (2013) used deviant emotional expressions and found attenuated vMMN in schizophrenia patients which correlated strongly with decreased emotion recognition. These studies indicate a relationship between insufficient automatic processing of both lower-level (motion, spatial frequency) and higher-level (emotion) deviant characteristics and symptoms. A study by Wang et al. (2010) used vMMN to study orthographic processing skills in Chinese children with developmental dyslexia. They found reduced vMMN to moving gratings with deviant direction in the dyslexia group suggesting impaired visual discrimination processes, which might be related to reading deficits. Cléry et al. (2013b) used vMMN elicited by dynamic stimuli to study automatic sensory discrimination in children with autism spectrum disorder (ASD). They found an earlier visual MMR in children with ASD which the authors interpreted as a sign of hypersensitivity to visual deviancy. Although there are relatively few clinical vMMN studies yet, taken together, they suggest that impaired automatic visual discrimination might underlie or contribute to deficits in a variety of developmental and psychiatric syndromes.

The above examples illustrate that vMMN deficits are present in psychiatric and developmental disorders and that a correlative relationship between vMMN and specific behavioral indices has already been demonstrated in a handful of studies. The visual MMN seems to predict some aspects of behavior (such as personality traits, handedness, and emotion recognition skills) thus it might be a *potential biomarker in populations with deficits in specific cognitive domains*.

### **CONCLUSIONS**

Visual MMN, similarly to auditory MMN, is a promising basic and clinical research tool. Several studies confirmed that vMMN can be elicited by infrequent changes in lower- and higher-level attributes of simple and more complex stimuli. vMMN reflects automatic perceptual prediction error responses to events violating statistical regularities, and is a correlate of model update processes which likely operate through short term synaptic plasticity involving stimulus specific adaptation. In general, we recommend that future vMMN studies should take into account the issues regarding repetition suppression (refractoriness). We recommend using effective primary tasks to avoid attentional confounds. Finally, to show that vMMN obtained by violation of a regularity in a particular cognitive domain is not only an intriguing epiphenomenon, we recommend investigating the relationship between vMMN attributes and discrimination performance in the cognitive domain relevant to the particular regularity.

#### **ACKNOWLEDGMENTS**

The authors thank Jakob Heinzle and Justin Chumbley for their insightful comments on an earlier version of this paper.

### **REFERENCES**


the mismatch negativity (MMN). *Brain Res.* 901, 151–160. doi: 10.1016/S0006-8993(01)02340-X


mismatch negativity event-related brain potential. *Front. Hum. Neurosci.* 6:334. doi: 10.3389/fnhum.2012.00334


to pitch change. *Neuroimage* 37, 561–571. doi: 10.1016/j.neuroimage.2007.05.040


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 May 2014; accepted: 11 August 2014; published online: 16 September 2014.*

*Citation: Stefanics G, Kremláček J and Czigler I (2014) Visual mismatch negativity: a predictive coding view. Front. Hum. Neurosci. 8:666. doi: 10.3389/fnhum.2014.00666 This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2014 Stefanics, Kremláček and Czigler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Visual mismatch negativity (vMMN): a prediction error signal in the visual modality

# *Gábor Stefanics<sup>1,2</sup>\*, Piia Astikainen<sup>3</sup> and István Czigler<sup>4,5</sup>*

*<sup>1</sup> Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich and ETH Zürich, Zurich, Switzerland*


*<sup>5</sup> Department of Cognitive Psychology, Institute of Psychology, Eötvös Loránd University, Budapest, Hungary*

*\*Correspondence: stefanics@biomed.ee.ethz.ch*

#### *Edited and reviewed by:*

*Hauke R. Heekeren, Freie Universität Berlin, Germany*

**Keywords: EEG, ERP, perceptual learning, predictive coding, prediction error, repetition suppression, stimulus specific adaptation, visual mismatch negativity**

Our visual field contains much more information at every moment than we can attend to and consciously process. How is the multitude of unattended events processed in the brain and selected for further attentive evaluation? Current theories of visual change detection emphasize the importance of conscious attention to detect changes in the visual environment. However, an increasing body of studies shows that the human brain is capable of detecting even small visual changes if such changes violate non-conscious probabilistic expectations based on prior experiences. In other words, our brain automatically represents environmental statistical regularities.

Since the discovery of the auditory mismatch negativity (MMN) event-related potential (ERP) component, the majority of research in the field has focused on auditory deviance detection. Such automatic change detection mechanisms operate in the visual modality too, as indicated by the visual mismatch negativity (vMMN) brain potential to rare changes. vMMN is typically elicited by stimuli with infrequent (deviant) features embedded in a stream of frequent (standard) stimuli, outside the focus of attention. Information about both simple and more complex characteristics of stimuli is rapidly processed and stored by the brain in the absence of conscious attention.

In this research topic we aim to present vMMN as a prediction error signal and put it in context of the hierarchical predictive coding framework. Predictive coding theories account for phenomena such as MMN and repetition suppression, and place them in a broader context of a general theory of cortical responses (Friston, 2005, 2010). Each paper in this Research Topic is a valuable contribution to the field of automatic visual change detection and deepens our understanding of the short term plasticity underlying predictive processes of visual perceptual learning.

A wide range of vMMN studies has been presented in seventeen articles in this Research Topic. Twelve articles address roughly four general sub-themes including attention, language, face processing, and psychiatric disorders. Additionally, four articles focused on particular subjects such as the oblique effect, object formation, and development and time-frequency analysis of vMMN. Furthermore, a review paper presented vMMN in a hierarchical predictive coding framework.

Four articles investigated the relationship between attention and vMMN. Kremláček et al. (2013) presented subjects with radial motion stimuli in the periphery of the visual field using an oddball paradigm and manipulated the attentional load by varying the difficulty of a central distractor task. They aimed to manipulate the amount of available attentional resources that might have been involuntarily captured by the vMMN-evoking stimuli presented in the periphery outside of the attentional focus. The distractor task had three difficulty levels: (1) a central fixation (easy), and a target number detection task with (2) one target number (moderate), and (3) three target numbers (difficult). Analysis of deviant minus standard differential waveforms revealed a significant posterior negativity in the ∼140–200 ms interval, which was unaffected by the difficulty of the central task, indicating that the automatic processes underlying registration of changes in motion are independent of attentional resources used to detect target numbers.
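The deviant minus standard analysis described above can be sketched as follows, assuming averaged ERPs for one electrode that are sampled at common time points. The function names and the list-based data layout are hypothetical and chosen only for illustration; they do not reproduce the analysis pipeline of any study in this Research Topic.

```python
def difference_wave(deviant_erp, standard_erp):
    """Sample-by-sample deviant minus standard difference (same units as input)."""
    if len(deviant_erp) != len(standard_erp):
        raise ValueError("ERPs must have the same number of samples")
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean amplitude of a waveform within an inclusive latency window."""
    if len(wave) != len(times_ms):
        raise ValueError("waveform and time axis must have the same length")
    # Keep only samples whose latency falls inside the analysis window.
    in_window = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    if not in_window:
        raise ValueError("no samples fall inside the requested window")
    return sum(in_window) / len(in_window)
```

With these helpers, a call such as `mean_amplitude(difference_wave(deviant, standard), times_ms, 140, 200)` would return the mean difference amplitude in the ∼140–200 ms window; a negative value at posterior sites would be the pattern expected for a vMMN.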

Kimura and Takeda (2013) investigated whether characteristics of vMMN depended on the difficulty of an attended primary task, i.e., they tested the level of automaticity of the vMMN. Task difficulty was manipulated as the magnitude of change of a circle at fixation, and vMMN was elicited by deviant orientation of bar patterns. An equal probability control condition was also used. The difference potential between the deviant-related ERP and the ERP elicited by identical orientation pattern in the control condition appeared to be influenced by the difficulty of the attentive task. As a function of task difficulty, the latency of the difference potential (i.e., the vMMN) increased, indicating that processes underlying vMMN to orientation changes are not fully independent of the attention demands of the ongoing tasks.

Kuldkepp et al. (2013) used rare changes in direction of peripheral motion to evoke vMMN applying a novel continuous whole-display stimulus configuration. The demanding distractor task involved motion onset detection and was presented in the center of the visual field. The level of attention to the vMMN-evoking stimuli was varied by manipulating their task-relevance using "Ignore" and "Attend" conditions. Deviant minus standard waveforms in the "Ignore" condition showed significant vMMN in the 100–200, 250–300, and 235–375 ms intervals, whereas in the "Attend" condition only in the later 250–400 ms interval a reliable vMMN was observed, indicating that task-relevance eliminates the difference observable in early components to task irrelevant deviant and standard stimuli.

van Rhijn et al. (2013) used a binocular rivalry situation to investigate whether vMMN is generated by low levels of the visual system at which simple features are processed, or by higher levels. Attention to the vMMN-evoking stimuli was manipulated by their task-relevance: in the "reduced-attention" condition participants performed a two-back task presented at fixation, whereas in the "attend-to-rivalry" condition they recorded their experiences of rivalry by button presses. Oddball series emerged only if the stimuli from the two eyes were treated separately, i.e., combining the stimuli from the eyes produced no deviant-standard separation, but two equiprobable orientations of a grating pattern. vMMN emerged in the 130–160 and 196–226 ms intervals both when the stimulus stream was task-irrelevant and when it was attended. The results indicate that vMMN may emerge in visual structures before the level of binocular integration.

Two studies in this Topic demonstrate the potential of vMMN in studying language-related phenomena. Shtyrov et al. (2013) used vMMN to investigate early automatic lexical effects in the visual modality. They presented participants with word and pseudo-word stimuli perifoveally using an oddball design, while participants engaged in a centrally presented task. Significant vMMN responses were observed to words at the 100–120 and 240–260 ms latency ranges and the authors concluded that early processing of orthographic stimuli can take place automatically outside the focus of attention.

Files et al. (2013) presented consonant-vowel syllables visually using videos of talking faces. Within the oddball sequences task-irrelevant deviant and standard syllables were presented, as well as task-relevant target syllables. The syllables were phonetically near (e.g., "zha" vs. "ta"), or far (e.g., "zha" vs. "fa"). The main interest of the study was the lateralization of the vMMN. In the left posterior temporal area only the phonetically far deviant elicited vMMN. However, sound difference *per se* elicited vMMN in the right temporal areas, independent of the phonetic distance. The results also show the influence of speech-related processing on visual change detection.

The auditory MMN has proved extremely useful in studying cognitive deficits in neuropsychiatric and neurological diseases. There is hope that studies using visual MMN can further our understanding of a variety of disorders, too. Three studies in this Topic used vMMN to investigate clinically relevant issues. Maekawa et al. (2013) applied a three-stimulus oddball paradigm in patients with bipolar disorder. Rare changes in spatial frequency of windmill pattern stimuli were used to elicit vMMN while subjects performed a task which involved detection of rare white discs (target) and simultaneously listened to an acoustically presented story. The vMMN component was smaller in patients than in control participants, indicating impairment in automatic visual predictive mechanisms in bipolar disorder.

Cléry et al. (2013) investigated predictive visual processing in patients with autism spectrum disorder (ASD). Adult ASD patients and control participants were compared in a three-stimulus passive oddball paradigm. The participants' task was to detect the disappearance of the fixation point. The standard and deviant stimuli were frequent horizontal and rare vertical deformations of a circle into an ellipse, respectively, while deformation into another shape served as a novel stimulus. Deviant deformation elicited smaller vMMN in the patient group. In the control group a subsequent ERP component, the orientation-related P3a, emerged only to the novel stimuli. However, in the ASD group the deviant stimulus also elicited the P3a, indicating altered change-detection and orientation processes in the ASD group.

Four studies used vMMN to investigate face processing. Gayle et al. (2012) investigated whether vMMN evoked by rare changes in unattended facial expressions can be used to predict autism spectrum personality traits as measured by the Adult Autism Spectrum Quotient in healthy adults. Emotionally neutral faces served as frequent standard stimuli, whereas rare happy and sad faces served as deviant stimuli in an oddball paradigm while participants engaged in a separate task. Deviant emotions elicited a posterior vMMN response at 150–425 ms, which correlated with the autism quotient. The authors concluded that vMMN might be useful as an objective index of affective reactivity in ASD.

Detection of changes in facial expressions was also explored by Astikainen et al. (2013). ERPs were recorded to pictures of neutral, fearful, and happy faces using oddball and equiprobable stimulus conditions, presenting rare emotional faces among neutral ones or all three expressions with equal probability, respectively. Independent component analysis applied to the emotional minus neutral differential responses revealed two prominent components in both stimulus conditions in the 100–200 ms interval. A component peaking at 130 ms showed a difference in scalp topography between oddball and equiprobable conditions. This bilateral component at 130 ms in the oddball condition conformed to vMMN. Moreover, it was distinct from the face-sensitive N170, which was modulated by the emotional expression only. The results suggest that future vMMN studies should take into account possible confounding effects caused by the differential processing of the emotional expressions as such.
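The differential response mentioned above (emotional minus neutral, or more generally deviant minus standard) can be illustrated with a minimal sketch. It assumes single-trial epochs stored as plain lists of equal length; the function names, the toy amplitudes, and the 100–200 ms window are hypothetical choices for illustration and are not taken from any of the reviewed studies.

```python
from statistics import mean

def average_erp(trials):
    """Average single-trial epochs (lists of equal length) sample by sample."""
    return [mean(samples) for samples in zip(*trials)]

def difference_wave(deviant_trials, standard_trials):
    """Deviant-minus-standard ERP difference, sample by sample."""
    deviant_erp = average_erp(deviant_trials)
    standard_erp = average_erp(standard_trials)
    return [d - s for d, s in zip(deviant_erp, standard_erp)]

def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean amplitude of a waveform within the window [start_ms, end_ms]."""
    window = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return mean(window)

# Toy data (hypothetical values in microvolts), one sample every 50 ms.
times_ms = [0, 50, 100, 150, 200, 250]
neutral_trials = [[0.0, 1.0, 2.0, 2.0, 1.0, 0.0],
                  [0.0, 1.0, 2.0, 2.0, 1.0, 0.0]]
emotional_trials = [[0.0, 1.0, 1.0, 0.0, 1.0, 0.0],
                    [0.0, 1.0, 0.0, 1.0, 0.0, 0.0]]

# Negative values indicate a more negative response to the rare stimulus.
diff = difference_wave(emotional_trials, neutral_trials)
vmmn_amplitude = mean_amplitude(diff, times_ms, 100, 200)
```

A negative `vmmn_amplitude` in the chosen window is what a mismatch negativity would look like in this toy setting. As the paragraph above notes, a plain oddball subtraction of this kind cannot by itself separate deviance detection from differences in how the two stimulus types are processed, which is why an equiprobable control condition is informative.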

Kreegipuu et al. (2013) presented participants with schematic faces of neutral, happy and angry expressions while they were attending to scrambled faces presented in the same series. Two stimulus presentation conditions were compared, an oddball and an optimum paradigm, the latter involving several different deviant facial emotions. VMMN was elicited similarly in both conditions at posterior sites. Angry deviant faces elicited larger vMMN responses than happy deviant faces irrespective of the paradigm type. The results encourage using a multi-feature "optimum" paradigm to study predictive processes related to different facial emotions.

Processing of gender information from faces was investigated by Kecskés-Kovács et al. (2013). Female and male faces were used as standard and deviant stimuli in an oddball design with two different stimulus-onset asynchronies (SOAs). Faces with different identities were presented, without repetition of the same identity in consecutive pictures. Male and female deviant faces elicited similar vMMN in both SOA conditions at around 200–500 ms latency. The results suggest that vMMN is a reliable index of regularity violations in facial gender categories.

Takács et al. (2013) investigated electrophysiological correlates of the oblique effect under unattended and attended conditions using an oddball design. The oblique effect refers to the finding that the perceptual system is more sensitive to cardinal (vertical and horizontal) than to oblique line orientations. Task-irrelevant Gábor patches elicited no vMMN when a moderate 50° change in orientation was presented. However, in the subsequent attentive condition it was found that changes as small as 10° in the cardinal direction and 17° in the oblique direction were behaviorally detectable. When a 90° change was applied in the following vMMN experiment, deviations from the cardinal angles elicited larger and more sustained vMMN than those from oblique angles. A sufficiently large magnitude of change thus elicited the typical oblique effect as indexed by vMMN.

Müller et al. (2013) used vMMN to study whether object formation happens in the absence of attention. In an elegant design, participants were presented with two symmetrically arranged ellipses, and two discs of either lower or higher luminance. In separate blocks, the discs were either frequently enclosed in one ellipse or in both ellipses (standard). Occasionally, the frequent disc-to-ellipse assignment was randomly changed (deviant), allowing the investigation of ERPs to unexpected configurational changes in the arrangement of discs and ellipses. Task-irrelevant changes in disc-to-ellipse assignment resulted in increased reaction times, indicating that an unpredicted change in a task-irrelevant feature (assignment to ellipse) of the otherwise attended discs captured some of the attentional resources available for the processing of the behaviorally relevant feature (luminance). VMMN emerged in the 246–280 ms interval at posterior sites, and was localized to the inferior temporal gyrus. These results indicate that the visual system automatically registers the probability of different features occurring together in spatial proximity, i.e., forming objects.

Developmental studies on vMMN are rare. Cleary et al. (2013) investigated vMMN in 8–12-year-old children and 18–42-year-old adults in an oddball paradigm using low spatial frequency grating stimuli as a deviant while participants performed a centrally presented target detection task. VMMN components were observed in the 130–200 and 200–275 ms intervals in children, and in the 130–200 ms interval in the adult group at posterior electrodes. The results confirm that vMMN can be observed in 8–12-year-old children, and the authors suggest it as a potential tool to study visual information processing deficits in children with neurodevelopmental disabilities.

Little attention has been devoted to studying the contribution of phase reorganization vs. evoked activity, and the contribution of evoked vs. induced activity, to vMMN generation. Stothart and Kazanina (2013) investigated differences in phase-locking and induced activity in ERPs to rare unattended changes in spatial frequency of vertical bar stimuli in an oddball paradigm. Participants engaged in a central primary task. The results of time-frequency analyses show that vMMN, similarly to auditory MMN, is associated with an increase in phase-locking at ∼100–250 ms in the theta range, which was followed by a decrease in induced power in the ∼380–580 ms interval in the higher alpha range. The authors conclude that the increase in theta phase-locking may reflect the higher functional coupling between cortical areas involved in the vMMN response.
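The distinction between phase-locking and induced power can be made concrete with a small sketch. It uses common textbook definitions, which are an assumption here and not necessarily the exact pipeline of the study summarized above: phase-locking is taken as the length of the mean unit phase vector across trials, and induced power as the mean power remaining after the across-trial average (the evoked part) is subtracted from each trial. All function names and example values are hypothetical.

```python
import cmath
import math

def phase_locking(phases):
    """Inter-trial phase coherence: length of the mean unit phase vector (0 to 1)."""
    vectors = [cmath.exp(1j * p) for p in phases]
    return abs(sum(vectors) / len(vectors))

def induced_power(trial_values):
    """Mean power across trials after removing the trial average (evoked part).

    `trial_values` holds one complex time-frequency coefficient per trial
    for a single time-frequency point.
    """
    evoked = sum(trial_values) / len(trial_values)
    return sum(abs(v - evoked) ** 2 for v in trial_values) / len(trial_values)

# Identical phases on every trial give perfect phase-locking.
aligned = phase_locking([0.1, 0.1, 0.1, 0.1])

# Phases spread evenly around the circle give no phase-locking.
spread = phase_locking([0.0, math.pi / 2, math.pi, 3 * math.pi / 2])

# Equal amplitude but inconsistent phase: nothing survives averaging,
# so all of the power counts as induced.
induced_only = induced_power([1, 1j, -1, -1j])

# Identical coefficients on every trial: fully evoked, no induced power.
evoked_only = induced_power([2 + 0j, 2 + 0j, 2 + 0j])
```

In practice the per-trial phases and coefficients would come from a time-frequency decomposition (for example a wavelet transform) of each epoch; that step is omitted here to keep the sketch self-contained.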

Finally, in a review paper Stefanics et al. (2014) argue that the vMMN brain potential is a perceptual prediction error response, i.e., it represents the difference between the expected and the observed stimulus. Besides placing the vMMN in the hierarchical predictive coding framework, the paper discusses the issues of neural refractoriness, methods to control attention, and the link between veridical perception and vMMN.

In summary, the variety of studies presented here shows that similarly to its auditory counterpart, visual MMN is a useful tool to investigate a wide range of aspects of predictive perceptual processes, including automatic stimulus discrimination, change detection, and attention-related effects. Besides indicating that vMMN has the potential to become a widely used tool in basic research, the Topic also highlighted the clinical relevance of vMMN. Regarding future directions, we believe that the field in general would greatly benefit from bridging the gap between visual MMN and research on repetition suppression or stimulus-specific adaptation, and that future vMMN studies should recognize the relevance of predictive coding theories.

# **REFERENCES**


van Rhijn, M., Roeber, U., and O'Shea, P. (2013). Can eye of origin serve as a deviant? Visual mismatch negativity from binocular rivalry. *Front. Hum. Neurosci.* 7:190. doi: 10.3389/fnhum.2013.00190

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 September 2014; accepted: 29 December 2014; published online: 22 January 2015.*

*Citation: Stefanics G, Astikainen P and Czigler I (2015) Visual mismatch negativity (vMMN): a prediction error signal in the visual modality. Front. Hum. Neurosci. 8:1074. doi: 10.3389/fnhum.2014.01074*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2015 Stefanics, Astikainen and Czigler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
