# **EYE MOVEMENT-RELATED BRAIN ACTIVITY DURING PERCEPTUAL AND COGNITIVE PROCESSING**

**Topic Editors Andrey R. Nikolaev, Sebastian Pannasch, Junji Ito and Artem Belopolsky**

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2014 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

**ISSN** 1664-8714 **ISBN** 978-2-88919-273-1 **DOI** 10.3389/978-2-88919-273-1



Topic Editors: **Andrey R. Nikolaev,** KU Leuven, Belgium **Sebastian Pannasch,** Technische Universität Dresden, Germany **Junji Ito,** Forschungszentrum Jülich, Germany **Artem Belopolsky,** Vrije Universiteit Amsterdam, Netherlands

A trace of eye movements performed by a capuchin monkey during free viewing of a natural scene image (Left). Gaze behavior is depicted as a continuous trace including fixations (knots) and their start times relative to the image onset. Time-frequency plots (Right) of the saccade-triggered power modulation (upper panel) and the inter-saccade phase consistency (lower panel) of the local field potentials measured from the primary visual cortex of the freely viewing monkey. 0 ms is the saccade onset. Figure taken from: Ito J, Maldonado P and Grün S (2013) Cross-frequency interaction of the eye-movement related LFP signals in V1 of freely viewing monkeys. *Front. Syst. Neurosci*. 7:1. doi: 10.3389/fnsys.2013.00001

The recording and analysis of electrical brain activity associated with eye movements has a history of several decades. While the early attempts were primarily focused on uncovering the brain mechanisms of eye movements, more recent approaches use eye movements as markers of the ongoing brain activity to investigate perceptual and cognitive processes.

This recent approach of segmenting brain activity based on eye movement behavior has several important advantages. First, the eye movement system is closely related to cognitive functions such as perception, attention and memory. This is not surprising since eye movements provide the easiest and the most accurate way to extract information from our visual environment and the eye movement system largely determines what information is selected for further processing. The eye movement-based segmentation offers a great way to study brain activity in relation to these processes. Second, on the methodological level, eye movements constitute a natural marker to segment the ongoing brain activity. This overcomes the problem of introducing artificial markers such as ones for stimulus presentation or response execution that are typical for a lab-based research. This opens possibilities to study brain activity during self-paced perceptual and cognitive behavior under naturalistic conditions such as free exploration of scenes. Third, by relating eye movement behavior to the ongoing brain activity it is possible to see how perceptual and cognitive processes unfold in time, being able to predict how brain activity eventually leads to behavior.

This research topic illustrates advantages of the combined recording and analysis of eye movements and neural signals such as EEG, local field potentials and fMRI for investigation of the brain processes in humans and animals. The contributions include research papers, methodology papers and reviews demonstrating conceptual and methodological achievements in this rapidly developing field.

# Table of Contents

*Eye Movement-Related Brain Activity During Perceptual and Cognitive Processing*

Andrey R. Nikolaev, Sebastian Pannasch, Junji Ito and Artem V. Belopolsky

*Attentional Dynamics During Free Picture Viewing: Evidence From Oculomotor Behavior and Electrocortical Activity*

Thomas Fischer, Sven-Thomas Graupner, Boris M. Velichkovsky and Sebastian Pannasch

*Decision-Making in Information Seeking on Texts: An Eye-Fixation-Related Potentials Investigation*

Aline Frey, Gelu Ionescu, Benoit Lemaire, Francisco López-Orozco, Thierry Baccino and Anne Guérin-Dugué

*Cognitive Processes Involved in Smooth Pursuit Eye Movements: Behavioral Evidence, Neural Substrate and Clinical Correlation*

Kikuro Fukushima, Junko Fukushima, Tateo Warabi and Graham R. Barnes

*Co-Registration of Eye Movements and Event-Related Potentials in Connected-Text Paragraph Reading*

John M. Henderson, Steven G. Luke, Joseph Schmidt and John E. Richards

*Saccades During Visual Exploration Align Hippocampal 3–8 Hz Rhythms in Human and Non-Human Primates*

Kari L. Hoffman, Michelle C. Dragan, Tim K. Leonard, Cristiano Micheli, Rodrigo Montefusco-Siegmund and Taufik A. Valiante

*Parafoveal X-Masks Interfere With Foveal Word Recognition: Evidence From Fixation-Related Brain Potentials*

Florian Hutzler, Isabella Fuchs, Benjamin Gagl, Sarah Schuster, Fabio Richlan, Mario Braun and Stefan Hawelka

*Cross-Frequency Interaction of the Eye-Movement Related LFP Signals in V1 of Freely Viewing Monkeys*

Junji Ito, Pedro Maldonado and Sonja Grün


Chie Nakatani, Mojtaba Chehelcheraghi, Behnaz Jarrahi, Hironori Nakatani and Cees van Leeuwen

*Antecedent Occipital Alpha Band Activity Predicts the Impact of Oculomotor Events in Perceptual Switching*

Hironori Nakatani and Cees van Leeuwen

*Visual Encoding and Fixation Target Selection in Free Viewing: Presaccadic Brain Potentials*

Andrey R. Nikolaev, Peter Jurica, Chie Nakatani, Gijs Plomp and Cees van Leeuwen


Fabio Richlan, Benjamin Gagl, Sarah Schuster, Stefan Hawelka, Josef Humenberger and Florian Hutzler

*Eye Movement Related Brain Responses to Emotional Scenes During Free Viewing*

Jaana Simola, Jari Torniainen, Mona Moisala, Markus Kivikangas and Christina M. Krause

## Eye movement-related brain activity during perceptual and cognitive processing

#### *Andrey R. Nikolaev<sup>1</sup>\*, Sebastian Pannasch<sup>2</sup>, Junji Ito<sup>3</sup> and Artem V. Belopolsky<sup>4</sup>*

*<sup>1</sup> Research Group Experimental Psychology, KU Leuven, Leuven, Belgium*

*<sup>2</sup> Department of Psychology, Technische Universität Dresden, Dresden, Germany*

*<sup>3</sup> Research Center Juelich, Institute of Neuroscience and Medicine (INM-6), Juelich, Germany*

*<sup>4</sup> Department of Cognitive Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands*

*\*Correspondence: andrey.nikolaev@ppw.kuleuven.be*

#### *Edited and reviewed by:*

*Maria V. Sanchez-Vives, ICREA-IDIBAPS, Spain*

**Keywords: eye movements, saccade, smooth pursuit, eye tracking, free viewing, EEG, local field potentials, fMRI**

For several decades researchers have been recording electrical brain activity associated with eye movements in an attempt to understand their neural mechanisms. More recently, advances in eye-tracking technology have allowed researchers to use eye movements as a means of segmenting ongoing brain activity into episodes relevant to cognitive processes in scene perception, reading, and visual search. This has opened the door to uncovering the active and dynamic neural mechanisms underlying perception, attention and memory in naturalistic conditions. The present eBook contains a representative collection of studies from various fields of visual neuroscience that use this cutting-edge approach of combining eye movements and neural activity.

The majority of the articles in the eBook combine the measurement of eye movements with the recording of the electroencephalogram (EEG) in human subjects performing various psychological tasks. The most common methodological approach is to examine EEG activity time-aligned to certain eye movement events, such as the onset of a fixation or the start of a saccade (Fischer et al., 2013; Frey et al., 2013; Henderson et al., 2013; Hutzler et al., 2013; Nikolaev et al., 2013; Richards, 2013; Simola et al., 2013). Several works employ time-frequency and synchrony analysis (Fischer et al., 2013; Hoffman et al., 2013; Ito et al., 2013; Nakatani and Van Leeuwen, 2013; Nakatani et al., 2013).

The advantage of simultaneous EEG and eye movement recording is most evident in investigation of perceptual and cognitive processes during free eye movement behavior. Saccades in free viewing are guided by the bottom-up and top-down attentional mechanisms. To study interactions between these mechanisms Fischer et al. (2013) explored the eye fixation-related potentials (EFRP) and EEG power during extended picture viewing. The difference between the mechanisms was reflected in the EFRP components and in the power of the frontal beta- and theta activity. Nakatani and Van Leeuwen (2013) recorded EEG and eye movements during free viewing of the Necker cube. They found that saccades and blinks facilitate perceptual switches. Moreover, the amplitude of alpha activity preceding these eye events predicted whether a blink or a saccade results in the switch. Nikolaev et al. (2013) examined the pre-saccadic EEG activity during free visual exploration of a natural scene in anticipation of a memory test. Their findings illustrate how pre-saccadic activity differentiates encoding of visual information and selection of a target for the next fixation. Simola et al. (2013) investigated attention and emotion processes by analyzing EFRPs in free viewing. They found that emotional processing depends on the overt attentional resources. Ito et al. (2013) observed interaction between low and high frequency components in the local field potentials (LFP) recorded in the visual cortex of monkeys performing voluntary saccades during natural scene viewing. They concluded that the cross-frequency interaction is a manifestation of the mechanism which coordinates oculomotor behavior and sensory processing. Hoffman et al. (2013) recorded the fixation-related neural activity in the human and macaque hippocampus during unrestricted visual search. They found in both species that the fixation-related phase alignment of the hippocampal low-frequency oscillations depends on the visual task.

Not only EEG and LFP recordings but also fMRI can be related to eye movements in free viewing of scenes (Marsman et al., 2013). The authors investigated the neural correlates of ambient and focal processing using fixation-based event-related (FIBER) fMRI in combination with independent component analysis. They reported eye movement-related activity in the ventromedial and ventrolateral visual cortices.

As shown in this eBook, reading studies also benefit from the combination of EEG and eye movement recordings. Henderson et al. (2013) devised an advanced procedure to correct EFRPs for eye movement artifacts. After correction, the early EFRP components differed between the reading and pseudo-reading (control) conditions. To investigate parafoveal pre-processing in reading, Hutzler et al. (2013) applied an X-mask in a new/old word judgment task. The time course of the EFRP indicated dynamic interference of the parafoveal mask with foveal word recognition. Frey et al. (2013) studied decision making during reading using EFRPs. They compared EFRPs elicited when participants fixated words that were related or unrelated to the predefined decision conditions. The late EFRP components appeared to be indicative of the semantic decision.

Furthermore, two studies investigated brain activity during performance of particular saccade tasks. Nakatani et al. (2013) measured cross-frequency phase coupling in peri-fixation brain activity during semantic judgment in the controlled saccade conditions. They concluded that the cross-frequency phase synchrony constitutes a plausible mechanism for tagging of fixation information. Richards (2013) used prosaccade and antisaccade tasks to examine the cortical sources of the related EEG activity. He demonstrated the value of this approach to differentiate the brain activities associated with general preparatory processes and with eye movement execution.

The methodological aspects of co-registration of brain activity with eye movements are addressed in the work which introduces a video projection system for high-speed, gaze-contingent stimulus presentation (Richlan et al., 2013).

Besides the chapters that describe brain activity related to saccadic eye movements, the eBook also features a review of the cognitive processes involved in smooth pursuit eye movements in humans and animals (Fukushima et al., 2013). The authors discuss the involvement of cognitive processes in smooth pursuit, including neural substrates, behavioral evidence and clinical correlations.

Taken together, the present eBook demonstrates that relating brain activity and eye movements is a fruitful way of studying a wide range of psychological processes without imposing an artificial task structure. This approach is particularly useful for demonstrating how the brain dynamics underlying perceptual and cognitive processes unfold over time in naturalistic conditions.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 February 2014; accepted: 04 April 2014; published online: 24 April 2014.*

*Citation: Nikolaev AR, Pannasch S, Ito J and Belopolsky AV (2014) Eye movement-related brain activity during perceptual and cognitive processing. Front. Syst. Neurosci. 8:62. doi: 10.3389/fnsys.2014.00062*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2014 Nikolaev, Pannasch, Ito and Belopolsky. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Attentional dynamics during free picture viewing: Evidence from oculomotor behavior and electrocortical activity

*Thomas Fischer<sup>1</sup>\*, Sven-Thomas Graupner<sup>1</sup>, Boris M. Velichkovsky<sup>1,2</sup> and Sebastian Pannasch<sup>1,3</sup>*

*<sup>1</sup> Engineering Psychology and Applied Cognitive Research, Department of Psychology, Technische Universitaet Dresden, Germany*

*<sup>2</sup> Department of Cognitive Studies, Kurchatov Institute, Moscow, Russia*

*<sup>3</sup> Brain Research Unit and MEG Core, O.V. Lounasmaa Laboratory, School of Science, Aalto University, Espoo, Finland*

#### *Edited by:*

*Andrey R. Nikolaev, KU Leuven, Belgium*

#### *Reviewed by:*

*José M. Delgado-García, University Pablo de Olavide, Spain*

*Anne Guérin-Dugué, GIPSA-Lab, France*

#### *\*Correspondence:*

*Thomas Fischer, Applied Cognitive Research/Psychology III, Dresden University of Technology, Helmholtzstrasse 10, 01069 Dresden, Germany e-mail: thomas.fischer@tu-dresden.de*

Most empirical evidence on attentional control is based on brief presentations of rather abstract stimuli. These studies point to a dynamic interplay between bottom-up and top-down attentional mechanisms. Here we used a more naturalistic task to examine temporal signatures of attentional mechanisms on fine and coarse time scales. Subjects inspected digitized copies of 60 paintings, each shown for 40 s. We simultaneously measured oculomotor behavior and electrophysiological correlates of brain activity to compare early and late intervals (1) of the inspection time of each picture (picture viewing) and (2) of the full experiment (time on task). For picture viewing, we found an increase in fixation duration and a decrease in saccadic amplitude, whereas these parameters did not change with time on task. Furthermore, early in picture viewing we observed higher spatial and temporal similarity of gaze behavior. Analysis of electrical brain activity revealed changes in three components (C1, N1 and P2) of the eye fixation-related potential (EFRP) during picture viewing; no variation was obtained for power in the frontal beta and theta activity. Time on task analyses demonstrated no effects on the EFRP amplitudes but an increase of power in the frontal theta and beta band activity. Thus, behavioral and electrophysiological measures similarly show characteristic changes during picture viewing, indicating a shifting balance of its underlying (bottom-up and top-down) attentional mechanisms. Time on task also modulated top-down attention but probably represents a different attentional mechanism.

**Keywords: eye fixation-related potentials, saccadic eye movements, top-down attention, bottom-up attention, sustained attention, EEG**

### **INTRODUCTION**

When exploring our visual environment, the sampling of information is based on sequences of single eye fixations guided by visual attention. The concept of visual attention describes how the attentional focus moves (e.g., Peelen and Kastner, 2011) and how the focused information is processed (e.g., Hillyard et al., 1998). A well-established approach to the control of attention characterizes two distinct modes of information selection (James, 1890; Kinchla, 1992): In the *bottom-up* mode (stimulus-driven or exogenous control), information selection is guided by low-level visual features such as physical and biological saliencies (Itti and Koch, 2001; Ohman et al., 2001) or is captured by transient changes such as stimulus onset or motion (Egeth and Yantis, 1997; Peters et al., 2005). In the *top-down* mode (goal-driven or endogenous control), information selection is guided by internal goals, knowledge, or task instructions (Egeth and Yantis, 1997). While there is agreement on the existence of these two attentional modes, there is a lack of consensus on the interaction between them, particularly about the relative timing and the neural mechanisms of their activity (Chun et al., 2011).

Although theoretical concepts often propose simultaneous activity of both modes of attentional control (Egeth and Yantis, 1997; Itti and Koch, 2001; Corbetta et al., 2008), empirical findings often reveal differences in the engagement of the two mechanisms, at least within short time periods: Immediately after the onset of a new stimulus, bottom-up control dominates before top-down control becomes more influential over time (Van der Stigchel et al., 2009; Hickey et al., 2010). Other authors reported an immediate influence of top-down factors, such as task demands (Einhäuser et al., 2008). Furthermore, it is unclear whether, over the time course of attentional deployment, the influence of bottom-up control decreases (Parkhurst et al., 2002) or remains stable while additional top-down regulation comes into play (Tatler et al., 2005). The analysis of psychophysiological indicators of the temporal interaction has so far mainly been conducted on the scale of milliseconds and seconds (Theeuwes, 2010). Examining behavioral and psychophysiological indicators of attention during more natural tasks would allow previous results to be generalized.

At the behavioral level, indications have been found that attention changes over longer time intervals during naturalistic viewing: Eye movement analyses revealed that regions of high saliency, i.e., objects that clearly stand out from the background, are fixated earlier than less salient objects if no particular instruction is provided (Underwood and Foulsham, 2006; Underwood et al., 2006). This has been interpreted as an early dominance of bottom-up processing, in which gaze is captured by low-level features of high saliency, and was confirmed by higher interindividual consistency of gaze locations early in scene inspection (Parkhurst et al., 2002; Tatler et al., 2005; Masciocchi et al., 2009). In contrast, the interindividual consistency decreased later during inspection, which was attributed to a stronger influence of top-down regulation on the viewing behavior due to the individually gathered knowledge (Velichkovsky, 2002; Henderson, 2003). According to Tatler et al. (2011), there are problems with this interpretation. On the one hand, natural scenes often have a small but reliable bias for highly salient objects to be located in the center; on the other hand, central regions of an image tend to be fixated more often early in scene inspection. Such a "central fixation bias" may reflect a general tendency for observers to fixate near the center of scenes, irrespective of saliency (Tatler et al., 2005, p. 650) and thus may be unrelated to bottom-up control of attention.

Apart from gaze locations, changes in fixation durations and saccadic amplitudes during longer inspection times were reported for naturalistic viewing. Within 2 s after the image onset, fixations were shorter and saccades were larger compared to later stages of scene exploration (Unema et al., 2005; Pannasch et al., 2008). Recently, it was found that when top-down guidance was disrupted by scrambling the picture content, subsequent fixations became shorter while the saccadic amplitudes increased (Foulsham et al., 2011).

In contrast to the observations of gaze behavior, less is known about the dynamics of psychophysiological indicators (e.g., EEG) during longer intervals (>2 s) of naturalistic viewing. One reason is probably that conventional analyses cannot adequately take into account the occurrence of sequential eye movements. Here, the analysis of EEG epochs time-locked to the onsets of eye fixations (i.e., eye fixation-related potentials, EFRP) is necessary. This method has yielded results similar to those of more traditional experiments in which cortical responses are locked to a sudden stimulus change (e.g., Yagi, 1979; Graupner et al., 2007, 2011; Rama and Baccino, 2010). The neuronal sources of EFRPs in scene viewing are mainly distributed across occipital and parietal regions and are primarily characterized by the components P1, N1, and P2. Recent evidence also suggests the existence of an early C1 component in the EFRP during picture perception (see Figure 3 in Graupner et al., 2011).
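The time-locking step behind an EFRP can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study: the function name, the single-channel input, the epoch window of −100 to 400 ms and the pre-fixation baseline correction are all assumptions chosen for illustration.

```python
from statistics import fmean


def fixation_locked_average(eeg, fixation_onsets, fs, tmin=-0.1, tmax=0.4):
    """Average single-channel EEG epochs time-locked to fixation onsets.

    eeg: list of samples from one channel.
    fixation_onsets: sample indices of fixation onsets.
    fs: sampling rate in Hz.
    tmin, tmax: epoch window in seconds (illustrative values, not from the study).
    """
    start = round(tmin * fs)   # negative offset: samples before the onset
    stop = round(tmax * fs)    # samples after the onset (exclusive)
    n_base = max(-start, 0)    # number of pre-fixation baseline samples
    epochs = []
    for onset in fixation_onsets:
        lo, hi = onset + start, onset + stop
        if lo < 0 or hi > len(eeg):
            continue           # skip epochs that run off the recording
        epoch = eeg[lo:hi]
        baseline = fmean(epoch[:n_base]) if n_base else 0.0
        epochs.append([v - baseline for v in epoch])
    if not epochs:
        raise ValueError("no complete epochs")
    # Average across epochs, sample by sample
    return [fmean(column) for column in zip(*epochs)]


# Synthetic demo: a flat signal with a deflection 100 ms after each fixation onset
fs = 500
eeg = [0.0] * 5000
onsets = [10, 1000, 2000, 3000]      # the first onset is too early and is skipped
for onset in onsets[1:]:
    eeg[onset + 50] = 5.0            # 50 samples = 100 ms at 500 Hz
efrp = fixation_locked_average(eeg, onsets, fs)
```

In the resulting average, the deflection appears 100 ms after fixation onset, while the incomplete first epoch is dropped rather than padded.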

Early components such as C1, P1, N1, and P2 are usually assumed to be controlled by physical stimulus properties (Hopfinger and Ries, 2005). In contrast, later components such as N2, P3, and N4 are thought to reflect top-down processing (see e.g., Donchin et al., 1978). While this distinction between the components seems appealing in the context of describing attentional mechanisms, it is presumably too simple. Top-down regulation, for instance, has also been found to influence C1, P1, N1, and P2 (Johannes et al., 1995; Freunberger et al., 2007; Rauss et al., 2009, 2011; Wykowska and Schubo, 2010). Specifically, for the N1, influences of working memory (WM) load were found. During a visual selection paradigm the N1 was smaller when WM demands were high (Rose et al., 2005). Similar influences were also found in WM paradigms with auditory evoked N1 components (Conley et al., 1999; Golob and Starr, 2004) and in a spatial WM paradigm (Rader et al., 2008). Furthermore, de Fockert et al. (2001) found a strong connection between WM and visual selective attention, demonstrating that WM can reduce visual distraction due to the prioritization of relevant information. The few investigations that analyzed the functional aspects of the P2 component demonstrated its association with visual selective attention and WM (Freunberger et al., 2007). When irrelevant stimuli were presented before target presentation, the P2 increased as a function of distraction (Vierck and Miller, 2009).

Even for C1, the earliest component of the EFRP complex, results suggest a susceptibility to top-down modulation (Rauss et al., 2011). Nevertheless, most of the evidence points to bottom-up influences on C1 (Khoe et al., 2005; Stolarova et al., 2006), in particular effects of saliency (Zhang et al., 2012). Therefore, we expect that large C1 amplitudes during naturalistic viewing should be associated with stronger bottom-up control. The C1 amplitude should become smaller when bottom-up influences are less important (i.e., later during inspection). With increasing inspection time, we not only postulate a diminishing impact of bottom-up attention but also a shift toward a stronger top-down controlled mode of attention. Such stronger top-down regulation could be triggered, for instance, by increased demands on WM and selective attention that might result in decreased N1 and increased P2 amplitudes.

Another important function of top-down control is the ability to maintain an adequate level of internal arousal to fulfill the demands of an ongoing task over longer periods. This ability is associated with the concept of sustained attention and is characterized as the effort to compensate for the negative outcomes of decreasing arousal, well documented as increasing subjective sleepiness and fatigue with time on task (Parasuraman et al., 1998; Lorist et al., 2000).

Demands on sustained attention have been found to correlate with the amount of power in the frontal theta and beta frequency bands of the EEG (Arruda et al., 1999; Sauseng et al., 2007). Therefore, we expect increased power in the theta and beta frequency bands during later phases of the experiment. So far it is not known to what extent sustained attention is required to maintain performance in shorter tasks (<1 min). We expect to contribute to this question by comparing frequency power between early and late periods of image inspection.
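As a rough illustration of what "power in a frequency band" means, the following Python sketch estimates mean periodogram power within a band using a plain discrete Fourier transform. It is only a sketch under stated assumptions: the band limits (theta 4–8 Hz, beta 13–30 Hz) are common conventions rather than values given here, the test signal is synthetic, and a real analysis would typically use windowed or wavelet-based estimates.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Mean periodogram power of `signal` between f_lo and f_hi (Hz, inclusive)."""
    n = len(signal)
    mean = sum(signal) / n
    x = [v - mean for v in signal]               # remove the DC offset
    powers = []
    for k in range(1, n // 2 + 1):
        freq = k * fs / n                        # frequency of DFT bin k
        if f_lo <= freq <= f_hi:
            coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                        for t, v in enumerate(x))
            powers.append(abs(coeff) ** 2 / n)
    if not powers:
        raise ValueError("no frequency bins fall inside the band")
    return sum(powers) / len(powers)


# Synthetic 2 s signal: a strong 6 Hz (theta) and a weaker 20 Hz (beta) component
fs = 250
signal = [2.0 * math.sin(2 * math.pi * 6 * t / fs)
          + 1.0 * math.sin(2 * math.pi * 20 * t / fs)
          for t in range(2 * fs)]

theta = band_power(signal, fs, 4.0, 8.0)     # assumed theta band
beta = band_power(signal, fs, 13.0, 30.0)    # assumed beta band
```

Comparing such band estimates between early and late segments of a recording is the kind of contrast described above; here the synthetic theta component simply dominates the beta component.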

To examine attentional mechanisms on a larger time scale, our subjects freely explored paintings for a period of 40 s. Paintings are considered as "maximal memory stores" (Leyton, 2006, p. 2). Their inspection requires active exploration in combination with time-consuming accumulation of knowledge, which corresponds well with demands on attention in everyday activities. For our experiment we predict changes at two different time scales. First, we expect changes throughout the 40 s of inspection of each picture (henceforth picture viewing), indicating variations in the balance of bottom-up and top-down attention. Second, we presume changes throughout the time course of the whole experiment (henceforth time on task). Such variation should indicate varying demands on sustained attention. To the best of our knowledge, behavioral and psychophysiological correlates of bottom-up, top-down and sustained attention have never been investigated using such a naturalistic task.

### **MATERIALS AND METHODS**

#### **SUBJECTS**

Twenty-seven healthy volunteers (5 males, mean age 23.5, age range 18–35) participated in the experiment. All subjects had normal or corrected-to-normal vision and received either course credit or monetary reward for their participation. The study was conducted in conformity with the Declaration of Helsinki and approved by the Ethics Committee of the Technische Universitaet Dresden. Written informed consent was obtained from all participants.

#### **APPARATUS**

Participants were seated in a dimly illuminated, sound-attenuated room. Eye movements were recorded monocularly at 500 Hz using the EyeLink 1000 infrared eye tracking system (SR Research, Ontario, Canada), operated in the remote mode. The system allows continued eye movement recordings with a spatial resolution below. 0.01◦ and a spatial accuracy of better than 0.5◦. The distance between the eye-tracking device and the subjects' eye was always about 60 cm. The eye tracker and the experimental procedure were controlled using the Experiment Builder software (SR Research, Ontario, Canada). Saccades and fixations were defined using the saccade detection algorithm supplied by SR Research: Saccades were identified by deflections in eye position in excess of 0.1◦, with a minimum velocity of 30◦ s <sup>−</sup><sup>1</sup> and a minimum acceleration of 8000◦ s <sup>−</sup>2, maintained for at least 4 ms.
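The thresholds quoted above can be made concrete with a simplified Python sketch of threshold-based saccade detection on a one-dimensional gaze trace. This is not the SR Research algorithm, whose internals are not described here. The function name, the sample-to-sample differentiation and the way the four criteria are combined are assumptions made for illustration only.

```python
def detect_saccades(pos_deg, fs=500.0, vel_thresh=30.0, acc_thresh=8000.0,
                    min_amp=0.1, min_dur_s=0.004):
    """Return (start, end) sample index pairs of candidate saccades.

    pos_deg: 1-D gaze position in degrees of visual angle.
    Thresholds default to the values quoted in the text (deg/s, deg/s^2, deg, s).
    """
    dt = 1.0 / fs
    # Sample-to-sample velocity (deg/s) and acceleration (deg/s^2)
    vel = [(b - a) / dt for a, b in zip(pos_deg, pos_deg[1:])]
    acc = [(b - a) / dt for a, b in zip(vel, vel[1:])]
    min_samples = max(1, round(min_dur_s * fs))

    saccades = []
    i = 0
    while i < len(vel):
        if abs(vel[i]) < vel_thresh:
            i += 1
            continue
        # Extend the run while velocity stays at or above threshold
        j = i
        while j < len(vel) and abs(vel[j]) >= vel_thresh:
            j += 1
        # vel[k] spans samples k..k+1, so the run i..j-1 covers positions i..j
        amplitude = abs(pos_deg[j] - pos_deg[i])
        peak_acc = max((abs(a) for a in acc[max(i - 1, 0):j]), default=0.0)
        if (j - i >= min_samples and amplitude > min_amp
                and peak_acc >= acc_thresh):
            saccades.append((i, j))
        i = j
    return saccades


# Synthetic trace: fixation at 0 deg, a 5 deg saccade over 20 ms, fixation at 5 deg
trace = [0.0] * 100 + [0.5 * k for k in range(1, 11)] + [5.0] * 100
found = detect_saccades(trace)

# Slow drift of 5 deg/s stays below the velocity threshold
drift = [0.01 * k for k in range(200)]
none_found = detect_saccades(drift)
```

The synthetic saccade is detected as a single event, whereas the slow drift produces no detections because it never reaches the velocity threshold.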

EEG activity was recorded using a Brain Amp DC-amplifier. Sixty-four electrodes were placed according to the standard 10/10 system. Data were collected in a shielded room with 500 Hz sampling rate and a high pass filter at 0.1 Hz. Both mastoids were used as reference and earlobes served as ground. All electrode impedances were kept below 5 kΩ.

We furthermore employed the Short Questionnaire for Current Strain (KAB; Mueller and Basler, 1992) to measure current subjective strain. The KAB is a self-report questionnaire including eight pairs of adjectives on a 6-point Likert-type rating scale describing opposite endpoints of different strain dimensions (e.g., stressed vs. relaxed; languid vs. fresh). The Stanford Sleepiness Scale (SSS; Herscovitch and Broughton, 1981) quantifies sleepiness based on seven bipolar items and was used to record changes in fatigue over the course of the experiment.

#### **STIMULI AND PROCEDURE**

Sixty digitized copies of representational paintings by different 16th and 17th century European artists were presented in random order. As there was variation in the format of the original paintings, they were proportionately rescaled to fit either the width or height of the display device resolution (1024 × 768 pixels). Stimuli were presented using a JVC DLA G11 video projector at a refresh rate of 60 Hz. The size of the projection screen was about 110 by 80 cm; viewed from a distance of 180 cm, the screen subtended a visual angle of 33° horizontally and 25° vertically. Before signing the consent form, participants were informed that the purpose of the study was to investigate eye movement behavior and brain activity in the perception of art. They were asked to freely inspect and enjoy the images as they would do in an art gallery. An initial 9-point calibration and validation was performed before the start of the first trial and after the break; calibration was checked prior to each trial. A trial started with an 8 s presentation of a random pixel image (created from the subsequently shown image), followed by a central white fixation cross shown for 1.5–3 s. Participants had to fixate the cross for as long as it was present; the actual image was then shown for an inspection time of 40 s. After half of the trials, subjects were given a short break of 5 min. The total duration of the experiment was approximately 1 h. Before and after the experimental session, subjects completed both questionnaires, the KAB and the SSS.

#### **DATA ANALYSIS**

We employed two different analysis strategies to examine the behavioral and psychophysiological data. Possible short-term changes during picture viewing were examined by dividing the 40-s viewing period into time intervals (for details see below). For the time on task investigation (i.e., examining changes on a larger time scale), we distinguished between early (first 20 images) and late (last 20 images) parts of the experiment.

#### *Behavioral data*

Gaze behavior was analyzed in terms of fixation duration, saccade amplitude and viewing similarity. We excluded fixations preceded or followed by blinks, fixations shorter than 120 ms, and those fixations during which the image onset or offset took place. To examine effects of 40 s of picture viewing, the eye movement data were segmented into four 10-s bins per image.
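The binning step can be sketched as follows. The function name `median_duration_per_bin` and the input layout (fixation onset in seconds relative to image onset, duration in milliseconds) are assumptions for illustration; only the 120 ms duration criterion is modelled, while the exclusion of fixations adjacent to blinks or overlapping image onset or offset is assumed to have been applied beforehand.

```python
from statistics import median


def median_duration_per_bin(fixations, bin_s=10.0, n_bins=4, min_dur_ms=120.0):
    """Median fixation duration for each viewing-time bin of one image.

    fixations: iterable of (onset_s, duration_ms) tuples (hypothetical layout).
    Returns a list with one median per bin, or None for an empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for onset_s, dur_ms in fixations:
        if dur_ms < min_dur_ms:
            continue  # too short to count as a fixation in this analysis
        idx = int(onset_s // bin_s)
        if 0 <= idx < n_bins:
            bins[idx].append(dur_ms)
    return [median(b) if b else None for b in bins]


example = [(0.5, 200), (3.0, 100), (4.0, 300), (12.0, 250),
           (25.0, 400), (39.0, 350), (39.5, 450)]
print(median_duration_per_bin(example))
```

The same routine applies to saccade amplitudes if the second tuple element holds the amplitude and the duration criterion is disabled (`min_dur_ms=0`).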

For the analysis of fixation duration and saccade amplitude, we calculated the median value per subject for the respective time interval. Examination of viewing similarity is based on the chronological order of fixation locations and fixation durations. The analysis of viewing similarity employed the ScanMatch method (Cristino et al., 2010), using an 8 × 8 substitution matrix, dividing the screen into 64 sectors of 128 × 96 pixels. We used a gap penalty of "0" as it "benefits the global alignment of the sequences" (Cristino et al., 2010). For temporal binning, we applied a value of 325, since the median of all fixation durations was 326 ms. Thus, in the sequence a fixation of 325 ms was counted only once while a fixation of 650 ms was counted two times.
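How spatial and temporal binning turn a list of fixations into a sequence can be sketched as below. This is not the ScanMatch toolbox code; the function name `encode_scanpath`, the integer sector labels, and the rounding rule for durations that are not exact multiples of 325 ms (nearest multiple, at least one repetition) are assumptions. Fixations are assumed to lie on the screen.

```python
def encode_scanpath(fixations, width=1024, height=768,
                    n_cols=8, n_rows=8, bin_ms=325):
    """Encode fixations as a sequence of sector labels (0..63).

    fixations: iterable of (x_px, y_px, duration_ms) tuples (hypothetical layout).
    Each fixation is repeated according to its duration, so that longer
    fixations occupy more positions in the sequence.
    """
    cell_w = width // n_cols    # 128 pixels
    cell_h = height // n_rows   # 96 pixels
    sequence = []
    for x, y, dur in fixations:
        col = min(int(x) // cell_w, n_cols - 1)
        row = min(int(y) // cell_h, n_rows - 1)
        label = row * n_cols + col
        # Assumed rounding rule: nearest multiple of bin_ms, never fewer than one.
        repeats = max(1, round(dur / bin_ms))
        sequence.extend([label] * repeats)
    return sequence


# A 325 ms fixation in the top-left sector, a 650 ms one in the bottom-right.
print(encode_scanpath([(0, 0, 325), (1023, 767, 650)]))
```

Two such sequences would then be aligned with the substitution matrix and gap penalty described above to obtain a similarity score.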

#### *Psychophysiological data*

To analyze the effect of picture viewing time on fixation-related activity in the EEG, we compared EFRPs from the first 10 s (early) to those from the remaining 30 s (late). Early and late EFRPs were matched by selecting fixations with durations of >300 ms and preceding saccade amplitudes of >3° from early and late time intervals. For each of the early fixations a gaze event from the late interval was selected based on two criteria: (1) the preceding saccade length belonged to the same quartile and (2) fixations were located at the same image region within a range of 3°. The same matching procedure was applied to study time on task effects, except for the gaze position criterion, since congruency of the low-level visual features can hardly be achieved between the different stimuli of the first and last 20 pictures. Hence, different sets of EFRPs were used for comparing early and late stages during picture viewing and for the analysis of time on task influences across the whole experiment.
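The matching logic can be sketched as follows, assuming fixations have already been filtered by duration and saccade amplitude. The names `quartile_index` and `match_fixations`, the dictionary layout, and the greedy first-come selection among eligible late fixations are assumptions; the text does not specify how one candidate was chosen when several qualified.

```python
def quartile_index(value, cutpoints):
    """Return 0..3 for a value given three ascending quartile cut points."""
    return sum(value > c for c in cutpoints)


def match_fixations(early, late, cutpoints, max_dist_deg=3.0, use_position=True):
    """Greedy one-to-one matching of early to late fixations.

    early, late: lists of dicts with keys 'amp' (preceding saccade amplitude,
    deg) and 'x', 'y' (gaze position, deg). Hypothetical data layout.
    Returns a list of (early_index, late_index) pairs.
    """
    used = set()
    pairs = []
    for i, e in enumerate(early):
        q_e = quartile_index(e["amp"], cutpoints)
        for j, l in enumerate(late):
            if j in used or quartile_index(l["amp"], cutpoints) != q_e:
                continue
            if use_position:
                dist = ((e["x"] - l["x"]) ** 2 + (e["y"] - l["y"]) ** 2) ** 0.5
                if dist > max_dist_deg:
                    continue
            used.add(j)          # each late fixation is used at most once
            pairs.append((i, j))
            break
    return pairs


early = [{"amp": 4.0, "x": 10, "y": 10}, {"amp": 9.0, "x": 20, "y": 5}]
late = [{"amp": 9.5, "x": 21, "y": 5}, {"amp": 4.5, "x": 11, "y": 11}]
print(match_fixations(early, late, cutpoints=(5, 7, 10)))
```

Setting `use_position=False` corresponds to the time on task comparison, where the position criterion was dropped.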

For artifact rejection of the EEG, data were picture-wise epoched into 40-s segments. A blind source analysis (SOBI) was computed using the EEGLAB Matlab toolbox (Delorme and Makeig, 2004). The resulting components were visually inspected in order to manually reject those components that were related to muscle or eye-ball activity. After artifact rejection, the onsets of the selected fixations were used to create EFRP segments. Subsequently, the EEG was segmented into epochs ranging from 200 ms before fixation onset to 500 ms afterwards. The −200 to −50 ms interval prior to fixation onset served for baseline correction. After preprocessing, an average of 99 (*SD* = 37.7) pairs of EFRPs per subject remained for the within picture comparison and an average of 248 (*SD* = 68.3) pairs of EFRPs remained for the across picture comparison.
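The epoching and baseline correction described above amount to the following single-channel sketch. The analysis itself was carried out in EEGLAB; the function name `extract_epoch` and the list-based signal representation are assumptions used only to make the time windows explicit at the 500 Hz sampling rate.

```python
def extract_epoch(signal, onset_idx, fs=500.0, tmin=-0.2, tmax=0.5,
                  baseline=(-0.2, -0.05)):
    """Cut one fixation-locked epoch and subtract the pre-fixation baseline.

    signal: list of samples for one channel; onset_idx: fixation onset sample.
    Returns the baseline-corrected epoch, or None if it exceeds the recording.
    """
    start = onset_idx + int(round(tmin * fs))   # 100 samples before onset
    stop = onset_idx + int(round(tmax * fs))    # 250 samples after onset
    if start < 0 or stop > len(signal):
        return None
    epoch = signal[start:stop]
    # Baseline window expressed as indices within the epoch.
    b0 = int(round((baseline[0] - tmin) * fs))
    b1 = int(round((baseline[1] - tmin) * fs))
    base_mean = sum(epoch[b0:b1]) / (b1 - b0)
    return [v - base_mean for v in epoch]


# Step signal: 2 uV before the fixation onset at sample 500, 7 uV afterwards.
sig = [2.0] * 500 + [7.0] * 500
ep = extract_epoch(sig, 500)
print(len(ep), ep[0], ep[100])
```

Ending the baseline 50 ms before fixation onset keeps it away from the final part of the incoming saccade.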

A parieto-occipital cluster, including the electrode positions PO3, POz, and PO4, was chosen to evaluate activity of the EFRP components. To define the EFRP components, we used the mean activity subsequent to the fixation onset with the following temporal boundaries: C1: 30–60 ms, P1: 90–120 ms, N1: 130–170 ms, and P2: 180–250 ms. For the analysis of activity in the frequency domain of the EFRPs, we calculated mean power of the theta (5–8 Hz) and beta (13–18 Hz) band for a fronto-central cluster, including Fpz, F3, Fz, F4, and FCz electrode sites. Multivariate analyses of variance were performed to separately evaluate the effects of picture viewing and time on task on the EFRP components (C1, P1, N1, P2) and on the frequency band power. Univariate statistics were performed to disentangle the specific effects. All steps of the EEG data processing were carried out using the Matlab toolbox EEGLAB (version 10) and all statistical analyses were performed with the SPSS 17.0 software package.
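The component measures can be illustrated with a short sketch that averages an epoch within each of the four time windows. The name `component_means`, the inclusion of both window boundaries, and the assumption that the epoch has already been averaged over the PO3, POz, and PO4 cluster are illustrative choices rather than details given in the text.

```python
WINDOWS_MS = {"C1": (30, 60), "P1": (90, 120), "N1": (130, 170), "P2": (180, 250)}


def component_means(epoch, fs=500.0, tmin_ms=-200.0, windows=WINDOWS_MS):
    """Mean amplitude per component window for one baseline-corrected epoch.

    epoch: list of samples starting at tmin_ms relative to fixation onset.
    """
    means = {}
    for name, (lo, hi) in windows.items():
        i0 = int(round((lo - tmin_ms) * fs / 1000.0))
        i1 = int(round((hi - tmin_ms) * fs / 1000.0)) + 1  # include upper boundary
        segment = epoch[i0:i1]
        means[name] = sum(segment) / len(segment)
    return means


# Test epoch whose value at each sample equals its latency in ms,
# so each window mean equals the window midpoint.
latencies = [-200.0 + 2.0 * i for i in range(350)]
print(component_means(latencies))
```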

### **RESULTS**

#### **SUBJECTIVE DATA**

Analysis of the SSS revealed increased sleepiness over time, *F*(1, 24) = 23.7, *p* < 0.001. Self-reported sleepiness was significantly lower before (*M* = 2.08, *SD* = 0.76) than after the experiment (*M* = 3.04, *SD* = 0.94). Similarly, subjective strain as indicated by KAB values increased significantly, *F*(1, 23) = 24.4, *p* < 0.001, from the start (*M* = 16.8, *SD* = 4.35) to the end of the experiment (*M* = 21.8, *SD* = 6.71).

#### **BEHAVIORAL DATA**

Median fixation durations and saccade amplitudes were entered into two two-factorial repeated measures ANOVAs with picture viewing (0–10, 10–20, 20–30, 30–40 s) and time on task (first vs. last 20 pictures) serving as within-subjects factors. For fixation durations, we found a significant main effect for picture viewing, *F*(3, 78) = 23.9, *p* < 0.001. This effect was consistent across the whole experiment, as no influence of time on task and no interaction effect were observed, both *F* < 1.86. **Figure 1A** illustrates the asymptotic increase of fixation duration across the four bins of viewing time. Bonferroni corrected pairwise comparisons revealed a significant increase in fixation duration from the first to the second and from the second to the third time bin.

For saccade amplitude we also obtained a significant main effect for picture viewing, *F*(3, 78) = 49.7, *p* < 0.001, but no influence of time on task and no interaction, both *F* < 1. As shown in **Figure 1B**, saccadic amplitude decreased in an asymptotic fashion. Pairwise comparisons of viewing time confirmed the decrease only from the first to the second and from the second to the third bin.

Fixation locations and durations along the time course of exploration were used to examine viewing similarity imagewise and subjectwise. For the imagewise analysis, viewing sequences of all subjects for a particular painting were pairwise compared for each respective time bin. Each comparison produced a ScanMatch score (normalized between 0 and 1), indicating the similarity magnitude as distance from 0. The obtained ScanMatch scores for an image were averaged, resulting in one similarity index per painting. Equally, for the subjectwise analysis, viewing sequences of one subject for all paintings were pairwise compared and subsequently averaged. To test for statistical differences, ScanMatch scores were entered into a two-factorial ANOVA for repeated measures, with type of contrast (imagewise, subjectwise) and time bin (0–10, 10–20, 20–30, 30–40 s) as within-subject factors. Because this comparison involved ScanMatch scores of 27 participants but 60 paintings, we performed 1,000 ANOVAs, each with a random selection of 27 out of the 60 paintings. We found no reliable differences for type of contrast, since 60% of the tests revealed *p* > 0.05, but highly significant differences for time bin, *F*(3, 78) = 64, *p* < 0.001. Furthermore, there was a significant interaction of type of contrast × time bin, *F*(3, 78) > 0.9, with 87% of the tests revealing *p* < 0.05 (**Figure 1C**). The significant main effect for time bin was based on the larger ScanMatch scores of the first time bin compared to the subsequent time bins, indicating highest viewing similarity within the first 10 s. The interaction was qualified by larger ScanMatch scores for the imagewise analysis in the first time bin, while no differences were found for the subsequent time bins. Thus, the synchrony of spatial and temporal gaze behavior was highest across participants within the same painting but only during the first 10 s. The strongest drop in similarity can be found from the first to the second time bin, revealing that the most pronounced change in viewing behavior takes place within the first 20 s.
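The repeated random subsampling of paintings can be sketched as a simple loop. The function `resample_significance` and its argument `p_value_fn` are hypothetical: `p_value_fn` stands in for whatever routine runs the ANOVA on the selected paintings and returns the *p* value of the effect of interest; it is not part of any particular statistics package, and the fixed seed is an assumption made for reproducibility.

```python
import random


def resample_significance(painting_ids, p_value_fn, n_iter=1000, k=27,
                          alpha=0.05, seed=0):
    """Proportion of random painting subsets for which p_value_fn gives p < alpha.

    painting_ids: list of all painting identifiers (60 in the experiment).
    p_value_fn: hypothetical callable taking a list of k painting ids and
                returning a p value.
    """
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_iter):
        subset = rng.sample(painting_ids, k)  # k paintings without replacement
        if p_value_fn(subset) < alpha:
            significant += 1
    return significant / n_iter


# Placeholder test function that always reports p = 0.01.
print(resample_significance(list(range(60)), lambda subset: 0.01))
```

The returned proportion corresponds to statements such as "87% of the tests revealing *p* < 0.05".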

Finally, comparing similarity in scanpaths between the first and last 20 pictures per subject, i.e., examining influences of time on task, revealed no reliable difference, *F*(1, 26) < 1.

#### **PSYCHOPHYSIOLOGICAL DATA**

The multivariate analysis, testing for EFRP differences between the early and late time bin during picture viewing, revealed a significant main effect, *F*(4, 23) = 4.73, *p* < 0.01. The univariate tests showed significant differences between the early and late time bin for the C1, N1, and P2 components. As illustrated in **Figure 2A** and listed in **Table 1**, C1 and N1 amplitudes were more negative during the first 10 s. The reverse pattern was observed for the P2: the amplitude was larger in the late time bin. No difference was found for the P1 component.

Furthermore, we compared power in the beta and theta frequency band for electrodes from a frontal ROI between the early and late time bin. Multivariate testing revealed no differences in band power as a function of viewing time within a picture, *F*(2, 25) = 3.01, *p* = 0.07.

For the analysis of time on task effects, EFRPs of the first 20 and last 20 pictures in the experiment were matched (**Figure 2B**). The multivariate analysis revealed no time on task effect on the EFRP components, *F*(4, 23) = 1.33, *p* = 0.29.

The topography of spectral beta and theta power over the scalp for EFRPs from the first and last 20 pictures of the experiment is illustrated in **Figures 3A,B**. The difference maps in **Figure 3** indicate stronger beta and theta power over frontal regions during the last 20 images. Statistical testing (multivariate analysis) of the band power for the a priori defined frontal ROI revealed a significant difference between the first and last 20 pictures, *F*(2, 25) = 12.24, *p* < 0.001. Univariate testing demonstrated higher beta activity (early: *M* = 42.9, *SD* = 3.46; late: *M* = 42.5, *SD* = 3.22), *F*(1, 26) = 15.4, *p* < 0.001, as well as higher theta power (early: *M* = 48.0, *SD* = 3.4; late: *M* = 48.4, *SD* = 3.34), *F*(1, 26) = 11.1, *p* < 0.001, for the last compared to the first 20 pictures.

**Table 1 | Mean activity of EFRP components from early and late phases during picture viewing and the univariate test statistics.**


*\*p < 0.05; \*\*\*p < 0.001; n.s. = p > 0.10.*

### **DISCUSSION**

We investigated behavioral and psychophysiological parameters during the free exploration of representational paintings in order to obtain further insights into the temporal dynamics of attentional control mechanisms. Electronic copies of paintings were shown for 40 s while eye movements and EEG were recorded simultaneously. We analyzed parameters of gaze behavior and fixation-related EEG-activity by comparing the initial 10 viewing seconds with the subsequent 30 s of each picture. We contrasted the same parameters in search for time on task effects by comparing gaze behavior and brain activity between the first and last 20 pictures of the experiment.

Analyses of gaze behavior revealed shortest fixation durations and largest saccade amplitudes during the first 10 s. Furthermore, the examination of viewing similarity indicated highest interindividual congruency during the initial 10 seconds of picture inspection. In contrast, comparing these parameters across the first 20 and last 20 pictures of the experiment revealed no changes.

The psychophysiological indicators also revealed particular differences. The EFRP components C1, N1, and P2 varied only during the 40 s of picture viewing but not between the first and last 20 pictures of the experiment. Larger negative amplitudes in C1 and N1 components were found during the initial 10 s compared to the subsequent exploration. In contrast, for P2, amplitudes were initially smaller. The analyses in the frequency domain of the EFRPs demonstrated changes only on the larger time scale. The frontal theta and frontal beta band power increased with time on task but remained stable throughout picture viewing.

During the initial 10 s of picture viewing, we observed shortest fixation duration and largest saccade amplitudes. This initial gaze behavior has already been reported (Antes, 1974; Unema et al., 2005) and has even been suggested to be an expression of bottom-up processing (Pannasch et al., 2008). Eye movement recordings have often been used to investigate influences of the given task (Yarbus, 1967), as well as saliency-driven bottom-up guidance (Underwood and Foulsham, 2006; Underwood et al., 2006). Massaro et al. (2012) explicitly investigated the relationship between bottom-up and top-down processes comparing task requirements and image features such as content and color. The most pronounced indicator for bottom-up influences was found for naturalistic paintings, evidenced by shorter and more widespread fixations. Since about two-thirds of our stimulus material corresponds to the naturalistic category by Massaro et al. (2012), the initial short fixations and long saccades are likely to indicate bottom-up processing also in the present work. This seems furthermore supported by the fact that similarity is largest during the initial 10 s and drops subsequently. While this might be a valid interpretation at first glance, it seems rather contradictory considering the fact that similarity was also highest when comparing the similarity subjectwise across images. Since all paintings are different, this early correspondence in spatiotemporal viewing behavior might rather be an expression of the central fixation bias (Tatler, 2007; Tatler et al., 2011). This interpretation is supported by the fact that a central fixation cross was shown before the image onset, i.e., each exploration started from the image center. How can we integrate an initial stronger bottom-up influence and the central fixation bias? It is known that in art, main figurative elements often appear in a central position (Locher et al., 2007; Tyler, 2007), thereby inducing intense scanning of these regions. The correspondence of viewing strategies was largest for the early exploration of the same picture by different participants. Under these circumstances, visual attention is similarly allocated, which is best accounted for by bottom-up guidance to regions of highest saliency.

Such an interpretation is further supported by the modulation of the earliest EFRP activity. The amplitude of the C1 component was larger for the first 10-s time bin. It has already been shown that the C1 arises from neural generators in the primary visual cortex (Di Russo et al., 2002). This brain region has also been proposed to create a saliency map via intracortical interactions (Li, 1999, 2002). Recently, by employing a masking design to analyze ERP and BOLD signal, Zhang et al. (2012) observed a relatively pure saliency signal. The authors observed that C1 amplitudes increased with saliency. In further support of this line of argumentation, C1 was found not to be modulated by high vs. low attentional load (Fu et al., 2010). However, care has to be taken in transferring these results to the present work. Although we carefully selected the EFRPs for the two distinct phases, further influencing factors are possible in our free viewing experiment (for a recent discussion on C1, see Fu et al., 2012).

In agreement with numerous other studies, we observed a fixation duration increase as a function of the viewing time (Antes, 1974; Unema et al., 2005; Pannasch et al., 2008; Mills et al., 2011). Longer fixation duration has been related to more elaborated and detailed processing of fixational content (Loftus and Mackworth, 1978). It thus may be feasible to assume that the processing of information changes with inspection time toward a mode of deeper processing, possibly facilitated by knowledge acquired during the initial seconds of exploration. Functions of WM may play an essential role in enabling such elaborated processing. Yet, the order in which information is selected depends strongly on individual characteristics, such as motivation, intention, goals, and previous experience. These individual factors may strongly contribute to the decreasing consistency in eye movement patterns between subjects during late phases of image inspection.

Recent research has advocated the view that WM and selective attention are tightly interconnected phenomena (Awh and Jonides, 2001; Pratt et al., 2011; Gazzaley and Nobre, 2012). Electrophysiological research on this topic may thus help to understand the results obtained in our study. One finding in this domain is that the amplitude of the N1 component seems to correlate with the ability to direct selective attention and to react fast and appropriately to targets especially when WM load is high due to a secondary task (Rose et al., 2005). It was found that N1 amplitude decreased and distractibility increased as function of WM load. A similar explanation may be applied to our findings, where the N1 amplitude decreased as a function of 40 s of scene exploration. This may reflect an increase in demands on WM during inspection. Low WM load can be assumed after picture onset since new information is presented. With the ongoing inspection information about the scene, its objects and specific relations accumulates in WM. These pieces of information have to be stored but also compared and integrated with the prior knowledge from long term memory. Following this argumentation, N1 variation may be correlated with the changing WM demands during image exploration.

The P2 amplitude of the visual evoked potential has also been associated with states of selective attention. It was proposed that this component may express enhanced cognitive processing demands or processes of active inhibition, particularly in situations when expected targets and irrelevant stimuli appear simultaneously (Kotchoubey, 2006; Freunberger et al., 2007). An increase in P2 might thus reflect either stronger focusing on targets or higher demands to suppress irrelevant information, both of which are necessary during a state of focused attention. This inhibitory aspect is particularly apparent in experimental paradigms using distractor stimuli (Hickey et al., 2009). Since top-down control serves as a common neural mechanism for selective attention and WM (Gazzaley and Nobre, 2012), we assume that our findings for N1 and P2 illustrate a general bias toward top-down modulations across inspection time.

While the parameters of gaze behavior as well as the components of the EFRPs remained stable from the first to last 20 pictures of the experiment, we observed a pronounced increase in the frontal beta and theta power over that time. Along with this variation our subjects reported increased sleepiness and subjective strain with time on task. Similar results of increased frontal beta activity and subjective strain were previously reported for low bottom-up stimulation when sustained attention was required for an appropriate completion of the experimental task (Smit et al., 2004; Barbato et al., 2007; Fischer et al., 2008). Increased frontal theta activity was previously related to WM load (Gevins et al., 1997; Jensen and Tesche, 2002) and to sustained attention (Sauseng et al., 2007). According to Sauseng et al. (2007) it is possible to differentiate between the two effects: while sustained attention is expressed by higher frontal theta activity, memory processing can be identified by increased connectivity in theta activation between frontal and parietal regions. Considering this interpretation, our results of increased theta and beta activity together with the larger self-reported strain and sleepiness demonstrate indications of higher demands on sustained attention later during the experiment.

Taken together, our study revealed systematic variation in parameters of behavioral and psychophysiological measures which seems to indicate a general adaptation of attentional mechanisms in the time course of naturalistic image exploration. Early during inspection, we found a pattern that suggests a stronger influence of bottom-up control on attentional selection and processing. This early period is followed by a change that suggests an increasing impact of top-down controlled attentional processes. This, however, is a rather coarse interpretation of the current observations since dynamics and competition between these two attentional mechanisms may be much more vital on a finer time scale. While our findings reveal a shifting balance between bottom-up and top-down attentional guidance, it remains open which of the two mechanisms plays the dominating role in directing attention and controlling eye movements. Furthermore, it cannot be clarified how exactly the interplay between the attentional processes changes. The present results suggest that during visual exploration bottom-up activity decreases while at the same time the top-down influence increases. However, other interactions between both mechanisms are conceivable: bottom-up activity may remain stable while only top-down influences increase, or vice versa. Further research should answer this question by explicitly testing these hypotheses.

We attempted for the first time to explore aspects of the dynamic interaction between different attentional mechanisms and their neuronal correlates under relatively naturalistic conditions. Although we found a dynamic interaction between the discussed attentional mechanisms, understanding the precise nature of the interaction needs further investigation. Furthermore, because our approach was grounded in the concepts of bottom-up, top-down, and sustained attention, alternative approaches for the explanation of naturalistic viewing should also be considered in further studies (Hochstein and Ahissar, 2002). Finally, more clarification is needed on how WM load can influence the EFRP components during free exploration.

### **ACKNOWLEDGMENTS**

This research was supported by the EU NEST-Pathfinder project PERCEPT (043261), by the Bundesministerium für Bildung und Forschung (RUS09/005), the Russian Foundation for Basic Research (Interdisciplinary Oriented Research 09-06-12003) to Boris M. Velichkovsky; and the Deutsche Forschungsgemeinschaft (PA 1232/1) and the European Commission (FP7-PEOPLE-2009-IEF, EyeLevel 254638) to Sebastian Pannasch. Thanks are due to Franziska Schrammel and Susen Döbelt for their support in data acquisition.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 March 2013; accepted: 06 May 2013; published online: 04 June 2013.*

*Citation: Fischer T, Graupner S-T, Velichkovsky BM and Pannasch S (2013) Attentional dynamics during free picture viewing: Evidence from oculomotor behavior and electrocortical activity. Front. Syst. Neurosci. 7:17. doi: 10.3389/fnsys.2013.00017*

*Copyright © 2013 Fischer, Graupner, Velichkovsky and Pannasch. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Decision-making in information seeking on texts: an eye-fixation-related potentials investigation

#### *Aline Frey<sup>1</sup>, Gelu Ionescu<sup>2</sup>, Benoit Lemaire<sup>3</sup>, Francisco López-Orozco<sup>3</sup>, Thierry Baccino<sup>1</sup> and Anne Guérin-Dugué<sup>2</sup> \**

*<sup>1</sup> Chart-Lutin, Université Paris 8, Saint-Denis, France*

*<sup>2</sup> GIPSA-lab, University of Grenoble Alpes, Grenoble, France*

*<sup>3</sup> Laboratoire de Psychologie et NeuroCognition, University of Grenoble Alpes, Grenoble, France*

#### *Edited by:*

*Andrey R. Nikolaev, KU Leuven, Belgium*

#### *Reviewed by:*

*Florian Hutzler, University of Salzburg, Austria*

*Sergei L. Shishkin, "National Research Centre Kurchatov Institute", Russia*

#### *\*Correspondence:*

*Anne Guérin-Dugué, GIPSA-lab, University of Grenoble Alpes, 11 rue des mathématiques, Domaine Universitaire BP 46, F-38040, Grenoble, France e-mail: anne.guerin@gipsa-lab.grenoble-inp.fr*

Reading on a web page is known to be nonlinear, and readers need to make fast decisions about whether or not to stop reading. In this context, reading and decision-making processes are intertwined, and this experiment attempts to separate them through electrophysiological patterns provided by the Eye-Fixation-Related Potentials (EFRPs) technique. We conducted an experiment in which EFRPs were recorded while participants read blocks of text that were semantically highly related, moderately related, and unrelated to a given goal. Participants had to decide as fast as possible whether the text was related or not to the semantic goal given at a prior stage. Decision making (stopping information search) may occur when the paragraph is highly related to the goal (positive decision) or when it is unrelated to the goal (negative decision). EFRPs were analyzed on and around typical eye fixations: either on words belonging to the goal (target), subjected to a high rate of positive decisions, or on low frequency unrelated words (incongruent), subjected to a high rate of negative decisions. In both cases, we found specific EFRP patterns (amplitude peaking between 51 and 120 ms after fixation onset) spreading out to the words following the goal word and to the second fixation after an incongruent word, in parietal and occipital areas. We interpreted these results as delayed late components (P3b and N400), reflecting the decision to stop information searching. Indeed, we observed a clear spill-over effect: the effect on word N spread to words N + 1 and N + 2.

**Keywords: information seeking, eye-fixation-related potentials, semantic processing, decision-making, EEG, eye movements**

#### **INTRODUCTION**

Seeking information in a newspaper or on a web page, both composed of multiple blocks of text, demands that rapid decisions be made about whether to stop reading the current block and switch to another one. Quite often, the time constraint is strong and texts are not entirely processed. People are able to judge from a couple of words whether a text is relevant or not. This involves two concurrent cognitive processes: (1) the word-by-word collection of the necessary and relevant information and (2) the decision to leave the current block once enough information has been gathered. *Reading and decision-making* are therefore processes which intertwine during the search for information. For instance, it has been demonstrated that in situations where they need to solve problems, people decide to stop seeking information by estimating the cost of information with regard to the environment in which the task is performed. The general behavior is found to be sensitive to even the smallest changes in information-seeking costs. However, in the case of reading, costs are more difficult to determine since the goal is ill-defined and mostly based on semantic processing. To develop a cognitive model adapted to the search for information, it is necessary to feed it with human variables sensitive to *semantic processing* and *continuously available* throughout the progression on the page.

On the other hand, the search for information on textual web pages constantly requires that the reader switch between different *strategies* (reading, searching, stopping rereading), alternating from deep reading to word searching. Carver (1990) identified five reading strategies based on the reader's goal: memorizing, learning, rauding, skimming, and scanning. He assumed that these strategies might be clustered by reading rate (in words per minute, Wpm). Hence, the reading strategy (called *rauding*) is performed at an average of 300 Wpm, while scanning is performed at 600 Wpm and used when readers are looking for a particular word. Our task is situated in between these strategies and corresponds to what Carver called *skimming* (450 Wpm). Recent simulations using scanpaths as human metrics have been developed for an automatic identification of some of these strategies, and they show the moment-by-moment orientation of attention, but once again this metric provided no completely reliable information on semantic processing.

Consequently, the main issue of this paper is to distinguish between semantic and decision-making processes in information-seeking tasks, through the joint analysis of eye-tracking and EEG data (the so-called Eye-Fixation-Related Potentials, EFRPs) (Hutzler et al., 2007; Baccino, 2011; Dimigen et al., 2011; Kliegl et al., 2012). We used this technique because neither metric (eye movements or EEG) is able to reveal on its own what is really happening during the search for information. Eye-tracking data provide highly valuable information on the sequence of words that have been fixated by the reader before he/she decides to stop reading. Fixation durations are also available, but they are a weak indicator of what happens during the reading process, because several factors influence the fixation duration (for example, word frequency or word predictability) even if no decision-making is involved. In addition, there is no one-to-one temporal mapping between a fixation on a word and the cognitive processes associated with that word: word processing may continue after the reader's eyes have left the word. For instance, this well-known *spillover effect* is demonstrated by the fact that a low-frequency word results in extra processing time, not only for that word but also for the next one (Rayner and Duffy, 1986). Knowing when a word has been fixated is therefore insufficient to know exactly when this word is processed (Rayner, 1978). Finally, most often the decision to stop searching does not occur within the last milliseconds before the participant leaves the current text, for some time elapses between these two events. People need extra time before they move their eyes away from the current text. Therefore, decision-making is not necessarily associated with the last fixation.

Similarly, EEG data do not provide enough information on their own either. The reason is threefold: (1) it is impossible to know exactly which words have been fixated during a real reading task, (2) consequently, in EEG experiments on reading, words have to be presented one at a time on the screen at a speed of about one word per second, and (3) the EEG signal is "contaminated" by saccades.

However, the EEG technique has made it possible to identify stereotyped electrophysiological responses to specific cognitive events. In particular, it has been shown that an element that is unexpected in its context elicits a larger negative waveform distributed over the centro-parietal areas and occurring about 400 ms after the stimulus onset. This so-called N400 component was first identified by Kutas and Hillyard (1980) and is usually associated with tasks of visual and auditory comprehension of sentences, in which the amplitude of the N400 is correlated with the degree of incongruence between the sentence and its final word (Key et al., 2005). More specifically, incongruous words elicit a larger N400 amplitude than congruous ones (e.g., *The man liked his coffee with dog* elicits a larger N400 amplitude than *The man liked his coffee with sugar*). The N400 amplitude also inversely correlates with cloze probability, defined for an item as the percentage of people who will continue a sentence fragment with that particular item (Gonzalez-Marquez et al., 2007). Therefore, it was proposed that the N400 reflects processes of semantic priming or activation (Kutas and Hillyard, 1984), that is, the ease of integrating a word into a context, or the extent to which the context pre-activates the word, reflecting either the system's expectation for a content word or an index of processing difficulty. Later, meaningful stimuli other than words, such as faces (Barrett and Rugg, 1989), pictorial stimuli (Barrett and Rugg, 1990; Praterelli, 1994), objects (Ganis and Kutas, 2003), or music (Koelsch et al., 2004) were found to elicit N400-like potentials. In brief, the N400 seems to reflect the degree of contextual facilitation and semantic context integration (DeLong et al., 2011). Lastly, some recent consideration has been given to the possibility that the N400 potential indexes not only activation processes but also inhibition processes (see Debruille, 2007 for a review).

Another component that has been extensively studied is the P300 component. P300 is a positive component that develops over parietal areas when a subject detects an infrequent stimulus, expected yet unforeseeable, within a series of stimuli (Sutton et al., 1965). The experimental protocol that revealed this component, i.e., the "oddball paradigm," consists in successively presenting two types of stimuli that differ in one of their physical parameters (e.g., for auditory stimuli, two sounds with different pitches) with different probabilities of occurrence: one is frequent (e.g., 80% of trials), the other is rare (e.g., 20% of trials). Note that it is not the parameters of the physical stimulation that determine the appearance of P300, but their status, that is to say their probability of occurrence. In this respect, a larger P300 is elicited by the events representing the low-probability category (Donchin, 1981). To elicit P300, the participant must be actively involved in the task, either by counting or by pressing a button in response to the rare stimuli. The amplitude of P300 can therefore serve as a covert measure of attention that arises independently of behavioral responding (Gray et al., 2004). The amplitude of P300 is proportional to the amount of attentional resources engaged in processing a given stimulus (Johnson, 1988) and is not influenced by factors relating to the selection or execution of a response (Crites et al., 1995). In addition to low probability, stimulus properties that heighten the amplitude of P300 are relevance to the subject's task (Squires et al., 1977) and qualitative deviance (Nasman and Rosenfeld, 1990). In brief, the amplitude of P300 is affected by attention, stimulus probability, stimulus relevance, and the amount of available processing resources (Key et al., 2005). The latency of P300 is assumed to reflect the duration of the stimulus evaluation.
The functional interpretation of P300 consists of memory updating, active stimulus discrimination, and categorization, as well as response preparation (Donchin and Coles, 1988).

P300 has been decomposed into two components, P3a and P3b. P3a, with a shorter latency (Knight, 1996) and a more frontal distribution, occurs even when the participant passively receives stimuli and is not required to actively respond to the targets (Timsit-Berthier and Gerono, 1998). P3a may be interpreted as an attentional shift in response to an unexpected disruption in the environment (Yamaguchi and Knight, 1991), the physiological correlate of an orienting reaction toward novelty, the reflection of involuntary attention. Unlike for P3a, the unpredictability of a stimulus is insufficient to elicit P3b; the participant must also pay attention and respond to the stimulation. According to Hansen and Hillyard (1983), P3b is associated with the final decision made on the status of the stimulus with regard to the required task. In general, P3b has been associated with discrimination, categorization, selection, matching processes, and decision-making (see Picton, 1992; Hruby and Marsalek, 2003 for review). P3b does not seem to directly reflect processes relating to stimulus memorization but rather the process of evaluating the stimulus that induces decision-making.

EEG and eye-tracking data are therefore complementary in the study of the reading activity.

The aim of our experiment is to investigate the EFRPs during the search for information in a text. Participants had to make binary decisions about the semantic relatedness of a text to a goal. People would decide to stop reading for two different reasons. Firstly, they would realize that the current text was related to the goal. They would not need to read further after they found what they were looking for. We call it a *positive decision*. Secondly, they would find that the text had nothing to do with the goal. We call it a *negative decision*. In both cases, people would stop reading. To simplify the EFRP analyses, we have defined two kinds of words that are likely to trigger these decisions. We assume that positive decisions should be triggered by *target words* (i.e., words whose verbal form belongs to the search goal). For instance, if the goal is "presidential campaign," the word "president" may trigger a positive decision. In the same way, our hypothesis is that negative decisions result from the presence of so-called *incongruent words* (i.e., words that are specific to a domain other than that of the goal). For instance, "basketball" is a word that is specific to a particular domain, which has nothing to do with "presidential campaign." Therefore, this incongruent word is a good "candidate" for the occurrence of a negative decision.

### **METHODS**

#### **PARTICIPANTS**

This experiment lasted about 1 h and 40 min and involved a total of 21 participants, all French native speakers. For technical reasons, four participants could not be recorded or were excluded from the final data analysis. The seventeen remaining participants (7 women and 10 men, 16 right-handed and 1 left-handed, aged 19–43 years, mean age 27 years, *SD* = 8 years) had normal or corrected-to-normal visual acuity, and had no known neurological disorder. The purpose of the study remained unknown to them. They all gave written consent and were paid €20 for their participation in the experiment. The whole experiment was reviewed and approved by the ethics committee of the CHU ("*Centre Hospitalier Universitaire*") of Grenoble (RCB: *n*° 2011-A00845-36).

#### **TEXTUAL MATERIAL**

First, a set of thirty goals (topics) was created. Each goal is expressed as a nominal phrase in French such as "observation des planètes" (planet observation), "réhabilitation des logements" (housing renovation), "associations humanitaires" (humanitarian associations), etc. For each goal, six texts were created in French, two of which were highly related to the goal (HR), two moderately related (MR), and two unrelated (UR). Participants read all six texts for each goal. In total, they had to read 180 texts: (2HR + 2MR + 2UR) × 30 goals.

This task requires a binary decision for each text. We assumed, although we did not verify this a posteriori, that if we presented only two kinds of texts (HR and UR), the task would be too easy and the participants could answer very rapidly after very few words, even without reading linearly. Our goal was not to design a reflex task. We wanted to design a task where ideally reading and decision-making were intertwined and this intertwining depended on both the structure of the text and the participant. Therefore, our assumption was that, without MR texts, reading and decision-making would not be intertwined.

In order to control the semantic relatedness of the texts to the goals, a method called Latent Semantic Analysis (LSA; Landauer et al., 2007) was used, which consists in computing semantic similarities between texts. LSA was trained on a 24-million-word French corpus composed of all the articles published in the newspaper "Le Monde" in 1999. A 300-dimensional space was generated based on the corpus, by means of a singular value decomposition of the word × text occurrence matrix [see Martin and Berry (2007) for more details]. Since each word of the corpus is represented by a 300-dimensional vector, new texts can also be represented by a vector through the simple sum of their word vectors. A cosine function was used to compute the similarity between vectors. The higher the cosine value, the more similar the two texts are. For highly related texts, cosine with the goal was above 0.40, for moderately related texts, cosine was between 0.15 and 0.30, and for unrelated texts, cosine was below 0.10. **Figure 1** shows the English translation of three examples of texts (HR, MR, and UR) for the goal "observation des planètes," as well as an example of how they were visually presented to the participants. For the analyses, the whole material was organized into three sets of 30 × 2 = 60 texts each, containing respectively all the highly related, moderately related and unrelated texts.
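As an illustration only, the similarity computation described above (text vector as the sum of word vectors, cosine between vectors, and the three cosine bands) can be sketched as follows. The three-dimensional `toy_vectors`, the function names and the handling of cosines that fall between the bands are assumptions made for this example; the actual space had 300 dimensions and was derived from the "Le Monde" corpus.

```python
import math

def text_vector(words, word_vectors):
    """Sum the vectors of the known words of a text (unknown words are skipped)."""
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    for word in words:
        vec = word_vectors.get(word)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
    return total

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def relatedness(cos):
    """Map a cosine onto the bands given in the text; None for values in the gaps."""
    if cos > 0.40:
        return "HR"
    if 0.15 <= cos <= 0.30:
        return "MR"
    if cos < 0.10:
        return "UR"
    return None

# Made-up 3-dimensional vectors, for illustration only.
toy_vectors = {
    "planet": [1.0, 0.0, 0.0],
    "observation": [0.6, 0.8, 0.0],
    "telescope": [0.8, 0.6, 0.0],
    "basketball": [0.0, 0.0, 1.0],
}

goal = text_vector(["planet", "observation"], toy_vectors)
related_text = text_vector(["telescope", "planet"], toy_vectors)
unrelated_text = text_vector(["basketball"], toy_vectors)

print(relatedness(cosine(goal, related_text)))    # HR
print(relatedness(cosine(goal, unrelated_text)))  # UR
```

Returning `None` for cosines between 0.10 and 0.15 or between 0.30 and 0.40 simply reflects that the text gives no label for those ranges.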

The texts were written in "DejaVu" font. The letters were black on a medium gray background. The texts were composed of an average of 5.18 sentences (*SD* = 0.7) and 30.1 words (*SD* = 2.9). Each word was composed of an average of 5.34 characters (*SD* = 3.24). The average number of lines was 5.18 (*SD* = 0.68). On average, the text was displayed with 40.1 (*SD* = 5.4) characters per line.

#### **EXPERIMENTAL PROCEDURE**

Goals were randomly presented to participants. For each goal, the presentation order of the six texts, along with their different relatedness to the goal, was set at random, and participants did not know the distribution of the relatedness to the goal beforehand.

The objective was to make a press review on given goals, by deciding as fast as possible whether the presented text had to be kept or rejected. Every participant made 180 decisions, i.e., 180 trials (30 goals × 6 texts). To ensure the correct understanding of the instructions, practice trials using one new goal and six texts were performed at the beginning of the experiment. **Figure 2** describes the exact sequence of stimuli. A series of six trials began with the presentation of the goal, and then a fixation cross was displayed on the left of the first character of the first word for gaze stabilization. The duration of this period was random (mean = 800 ms, *SD* = 40 ms) to avoid the risk of saccade anticipation before the text presentation. The texts were displayed after this period of gaze stabilization. The mouse cursor remained invisible throughout the reading. Participants had to mouse-click as quickly as they could once they had made a decision (to keep or reject the text). The next screen, with the visible mouse cursor, was then displayed to collect the participant's decision. Participants had to left-click on a green symbol to keep the text, or right-click on a red symbol to reject it. The trial was repeated six times for the same goal, and the whole procedure was repeated thirty times for the thirty different goals. In between goals, participants were given the opportunity to rest when necessary. They were also informed of the number of goals remaining until the end. A screen indicated that a new goal would be presented as well as the remaining goals count (**Figure 2**, last screen). The program running the whole experiment was written in the Matlab environment, using Psychophysics Toolbox (Brainard, 1997) and SR Research's Eyelink library.

#### **EEG AND EYE TRACKING ACQUISITION**

Throughout the experiment, participants were comfortably seated in an adjustable chair while their EEG activity was being recorded. Thirty-one active electrodes (Brain Products GmbH) were mounted on an EEG cap (BrainCap™) placed on their scalp in compliance with the International 10–20 system (Jasper, 1958). One electrode was affixed under the right eye to record vertical electro-oculographic activity (EOGv) with FP2 on the scalp. The information relating to the bipolar horizontal electro-oculographic (EOGh) signal was obtained through the eye tracking system. To ensure electrical contact and increase the signal-to-noise ratio, we used contact gel (SuperVisc Gel, Brain Products, Inc.) and adjusted individual sensors until impedances were below 5 kΩ.

Electrodes were referenced to FCz (ground: AFz), and EEG data were amplified through the BrainAmp™ system (Brain Products, Inc.), sampled at 1000 Hz, and then filtered with a 250 Hz low-pass filter.

For the sake of compatibility with this EEG acquisition, we used the remote binocular infrared eye tracker EyeLink 1000 (SR Research) to track the gaze of each eye while the observer was looking at the screen. The EyeLink system was used in the Pupil-Corneal Reflection tracking mode, sampling at 1000 Hz. For eye tracking acquisition purposes, the position of the head was stabilized with a chin rest and a fixed bar for the forehead. Participants were seated 68 cm in front of a 24-inch monitor (42 × 21° of visual field) with a screen resolution of 1024 by 768 pixels. The text was displayed at the centre of the screen (21 × 11° of visual field). Since the text was displayed with 40.1 characters per line on average, each character covered 0.52° of horizontal visual angle, corresponding to about 3.8 characters in the fovea.
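The character size quoted above follows from simple arithmetic, reproduced here as a small check. The 21° text width and the 40.1 characters per line come from the text; the 2° foveal extent (`FOVEA_DEG`) is an assumption introduced for this example, since the text does not state the value that was used.

```python
TEXT_WIDTH_DEG = 21.0   # horizontal extent of the text area (from the text)
CHARS_PER_LINE = 40.1   # average number of characters per line (from the text)
FOVEA_DEG = 2.0         # ASSUMPTION: foveal vision spans about 2 degrees

# Horizontal visual angle covered by one character.
deg_per_char = TEXT_WIDTH_DEG / CHARS_PER_LINE

# Number of characters that fit within the assumed foveal extent.
chars_in_fovea = FOVEA_DEG / deg_per_char

print(round(deg_per_char, 2), round(chars_in_fovea, 1))  # 0.52 3.8
```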

At the beginning of the experiment, a 9-point calibration was performed. A drift correction was performed before every trial, and a 9-point calibration was automatically carried out again if the timeout on the initial fixation cross elapsed or if the experimenter decided to run it, in case an error above 0.5° was detected.

#### **TECHNICAL IMPLEMENTATION**

After acquisition, data (EEG raw data, hardware triggers, eye tracker raw data) had to be collected, synchronized and enriched with additional information on the read words, through a joint analysis of the scanpaths and associated texts (section Data enrichment: from fixations to words).

Raw data for both EEG (32 channels) and eye tracker (right and left eye positions) were sampled at 1 kHz. Besides, hardware triggers were automatically generated during the experiment to identify the different sequences (fixation cross, text/goal presentation ...) in each trial. These marks were found both in the EEG file and the eye tracker file containing the raw data.

Even though both the EEG and eye tracker data were sampled at the same rate, the data had to be synchronized to compensate for clock drift, jitter and other sources of time distortion or information loss. This task was carried out using the known sequence of hardware triggers automatically generated during the experiment. After that, the raw data consisted of thirty-seven channels sampled at 1000 Hz: 32 for EEG data, 4 for the eye tracker and one logical signal for the blinks detected by the eye tracker. After this global synchronization, eye tracker events (start and end of fixations and saccades) were added to the synchronized data file. The thresholds for saccade detection were a minimum velocity of 30°/s, a minimum acceleration of 8000°/s² and a minimum motion of 0.1°. These detections provided 12 eye tracker events (beginning and end of fixations, saccades and blink periods for both eyes).
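One common way to align two recordings from shared triggers is to fit a linear map between the trigger timestamps seen by each device. The sketch below illustrates that idea; it is an assumption about how such a synchronization could be done, not a description of the procedure actually used here, and the function names and timestamps are hypothetical.

```python
def fit_clock_map(src_times, dst_times):
    """Least-squares line dst = slope * src + offset through matched trigger times."""
    n = len(src_times)
    if n != len(dst_times) or n < 2:
        raise ValueError("need at least two matched triggers")
    mean_s = sum(src_times) / n
    mean_d = sum(dst_times) / n
    var_s = sum((s - mean_s) ** 2 for s in src_times)
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(src_times, dst_times))
    slope = cov / var_s           # relative clock rate (captures drift)
    offset = mean_d - slope * mean_s  # constant lag between the two clocks
    return slope, offset

def to_eeg_time(tracker_time, slope, offset):
    """Convert an eye tracker timestamp into the EEG time base."""
    return slope * tracker_time + offset

# Made-up trigger timestamps (ms): the tracker clock runs 0.01% fast
# and the two recordings start 250 ms apart.
tracker_triggers = [1000.0, 61000.0, 121000.0, 181000.0]
eeg_triggers = [t / 1.0001 + 250.0 for t in tracker_triggers]

slope, offset = fit_clock_map(tracker_triggers, eeg_triggers)
fixation_onset_eeg = to_eeg_time(90500.0, slope, offset)
```

A single linear fit only corrects a constant offset and a constant drift rate; handling jitter or lost samples would require a piecewise alignment between consecutive triggers.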

#### **DATA PREPROCESSING**

EEG data preprocessing and EFRP analyses were carried out with the Brain Analyzer 2 software (Brain Products GmbH, Version 2.0.2). Continuous EEG was first segmented over the whole duration of each text. A low cutoff filter was applied (2 Hz, time constant 0.0796 s, 12 dB/oct.) as well as a notch filter at 50 Hz (symmetrical 5 Hz bandwidth around the notch frequency, i.e., 50 ± 2.5 Hz; 24 dB/oct.) to eliminate interference from the electricity network. Independent Component Analysis (ICA) was used to remove blink and saccadic movement artifacts from the EEG data (Makeig et al., 1997; Jung et al., 2000). Visual inspection of the FP2 channel, on which blinks are supposed to be maximal, in comparison with channels on which our effects were maximal (e.g., P4), showed that EOG artifacts were sufficiently removed from the data and that there was no significant impact from EOG artifacts on our results. Afterwards, epochs of 1200 ms were defined, starting 200 ms before the fixation onset. EFRP analyses excluded the first fixations because they were related to the text onset (cf. Dimigen et al., 2011), as well as the last fixations, which are known to be longer than intermediate position fixations (Just and Carpenter, 1980; Rayner et al., 2000) and to elicit specific EEG patterns (Hagoort, 2003). The baseline was defined between 200 and 100 ms before the onset of the fixation and then subtracted, because the artifact of the previous saccade was restricted to the interval from 100 to 0 ms before the fixation onset (Dimigen et al., 2011; Kamienkowski et al., 2012). Segments containing artifacts (bad gradients or excessive max-min differences) were rejected using a semi-automatic artifact rejection procedure. **Table 4** (cf. Section Methodological aspects of EFRP analysis) shows, for each fixation of interest, the mean, standard deviation and minimum, across participants, of the remaining fixations after artifact rejection.
Average EFRPs were then generated for every participant, electrode, and fixation of interest across trials. Among all available fixations, fixations of interest were selected based on semantic information taken from the different texts, as will be explained in the next section.
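The epoching and baseline steps described above can be sketched for a single channel as follows, assuming one sample per millisecond (1000 Hz). The function names and the toy signal are hypothetical; the actual processing was done in the Brain Products software, not with this code.

```python
def extract_epoch(signal, onset, pre=200, post=1000, base_start=-200, base_end=-100):
    """Cut a 1200-sample epoch around a fixation onset and subtract its baseline.

    The epoch runs from `pre` samples before the onset to `post` samples after it.
    The baseline is the mean over [base_start, base_end) relative to the onset,
    i.e., from 200 to 100 ms before the fixation onset.
    Returns None when the epoch would run past the edges of the recording.
    """
    start, stop = onset - pre, onset + post
    if start < 0 or stop > len(signal):
        return None
    epoch = signal[start:stop]
    b0, b1 = base_start + pre, base_end + pre  # baseline indices inside the epoch
    baseline = sum(epoch[b0:b1]) / (b1 - b0)
    return [x - baseline for x in epoch]

def average_epochs(epochs):
    """Sample-by-sample mean across epochs (one EFRP for one electrode)."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]

# Toy single-channel signal: constant 5 µV offset plus a 3 µV deflection
# between 300 and 400 ms after a fixation starting at sample 1000.
signal = [5.0] * 3000
for i in range(1300, 1400):
    signal[i] += 3.0

epoch = extract_epoch(signal, onset=1000)
```

In this toy case the constant offset is removed by the baseline subtraction, so the epoch is zero everywhere except during the deflection.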

#### **DATA ENRICHMENT: FROM FIXATIONS TO WORDS**

Before data analysis, fixations whose duration was shorter than 80 ms or longer than 600 ms were excluded from the data (0.1% of all fixations). Fixation-related events were added *a posteriori* in order to characterize fixations according to both the fixation duration and some semantic properties of the words read at each fixation.

We had to predict which words were actually processed by participants at each fixation, in order to study EEG components that could have been induced by specific words. It is known that the area from which information can be extracted during a single fixation extends from about 3–4 characters to the left to 14–15 characters to the right of the fixation (Rayner, 1998). This area is asymmetric to the right and corresponds to the global perceptual span. Therefore, more than one word may be processed during a given fixation, and it is hard to identify which words may have been processed at each fixation. As we mentioned earlier, fixating a word does not necessarily imply that it is processed. It could be processed when the eyes are still on the previous fixation, because it is highly predictable and/or partly distinguishable even from the parafoveal area. It could also be processed when the eyes are on the subsequent fixations, in case it was a low-frequency word whose processing spilled over onto the next word. EZ-Reader (Reichle et al., 1998, 2003) is a computational model that describes this complex phenomenon well by considering that eye movements and attention are decoupled, although processing is supposed to be performed one word at a time. It is therefore quite hard to guess exactly when each word is processed. Things get even more complicated under the SWIFT model (Engbert et al., 2005), because it assumes that parallel processing can occur, i.e., several words could be processed at the same time. If there had been a consensus on a model, a solution could have been to run this model on our data. Since this is not yet the case, we settled on the following method. We used a window that was sized according to Rayner's assumptions. He has shown that the area from which a word can be identified extends no more than 4 characters to the left and 7–8 characters to the right of fixation, which corresponds to the word identification span. Moreover, Pollatsek et al. (1993) have shown that even if information taken from the next line was processed during a reading task, participants were not able to retrieve any semantic information from it. Therefore, the width of our window was 4 characters to the left plus 8 characters to the right of the fixation point. Since initial fixations on the beginning of a word make it easier to recognize than initial fixations on the end of the word (Farid and Grainger, 1996), we considered that a word is processed if at least the first third or the last two-thirds of that word are inside the window.
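A minimal sketch of this windowing rule is given below for a single line of text, with the fixation expressed as a character index. The rounding of "one third" and "two thirds" with `math.ceil`, the inclusive window bounds and all function names are assumptions made for this example; the text does not specify these details.

```python
import math

LEFT_SPAN, RIGHT_SPAN = 4, 8  # characters left and right of the fixated character

def word_spans(line):
    """Return (word, start, end) for each whitespace-separated word, end exclusive."""
    spans, pos = [], 0
    for word in line.split():
        start = line.index(word, pos)
        end = start + len(word)
        spans.append((word, start, end))
        pos = end
    return spans

def is_processed(start, end, fix_char):
    """True if the first third or the last two-thirds of the word lies in the window."""
    lo, hi = fix_char - LEFT_SPAN, fix_char + RIGHT_SPAN  # inclusive bounds
    length = end - start
    first_third_end = start + math.ceil(length / 3)          # exclusive
    last_two_thirds_start = end - math.ceil(2 * length / 3)
    first_ok = lo <= start and first_third_end - 1 <= hi
    last_ok = lo <= last_two_thirds_start and end - 1 <= hi
    return first_ok or last_ok

def processed_words(line, fix_char):
    """Words of the line considered processed for a fixation at character fix_char."""
    return [w for w, s, e in word_spans(line) if is_processed(s, e, fix_char)]

line = "the presidential campaign started"
print(processed_words(line, 6))   # ['presidential']
print(processed_words(line, 12))  # ['presidential', 'campaign']
```

The second call shows that a single fixation can tag two words, which is consistent with the remark above that more than one word may be processed at a given fixation.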

Let us remember that we were interested in two particular decision-making situations:

#### *Positive decision*

This decision is made by participants once they find a semantic relationship between the goal and the current text. Our goal is to identify the premises of decision-making in the EEG signals, but we had to decide which part of the signals to examine, because the decision may occur some time before the mouse click. Several kinds of words were of interest as part of our investigation. Words with a high semantic relationship with the goal were good candidates, but it would have been difficult to define a threshold for the relatedness to the goal. Words from the goal were also good candidates for two reasons: they were likely to be related to the goal and had been previously seen by the participant, which could also trigger the decision. For these reasons, we selected those target words as potential markers of a positive decision. We accepted all the words deriving from any word of the goal. For example, if the search goal was "croissance de l'économie," target words could be: "croissance," "croissante," "économie," "économique," "économiste," etc.

#### *Negative decision*

This decision is made by participants once they are sure that there is no semantic relationship between the goal and the current text. Just like in the previous scenario, several kinds of words were of particular interest, but we settled on incongruent words, i.e., words that have nothing to do with the goal and yet are specific enough to another domain. Therefore, their frequency is low and they are in no way related to the goal. We empirically adjusted the thresholds used to classify a word as incongruent, in order to meet two objectives. The first of these objectives was to have about the same number of target words read in highly related texts and of incongruent words read in unrelated texts. The second one concerned the participants' ability to effectively stop reading after they had read those words. For each participant, the number of remaining fixations after reading those words was computed in these two situations. Then, we adjusted the thresholds to obtain similar distributions among participants. See section Results on eye tracker data for more details. These two constraints were combined and the two thresholds were set to 0.06 for the LSA semantic similarity and 1.5 per million words for the word frequency. So, the fixated words whose cosine with the goal was below 0.06 and whose frequency was below 1.5 per million were tagged as incongruent words.
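The tagging rule stated above reduces to a conjunction of two threshold tests, sketched below. The two threshold values come from the text; the function name, the tuple layout and the cosine and frequency values of the toy words are made up for illustration.

```python
COSINE_MAX = 0.06   # LSA cosine with the goal must be below this value
FREQ_MAX = 1.5      # word frequency (per million words) must be below this value

def is_incongruent(cosine_with_goal, freq_per_million):
    """A fixated word is incongruent if it is both unrelated to the goal and rare."""
    return cosine_with_goal < COSINE_MAX and freq_per_million < FREQ_MAX

# (word, cosine with the goal, frequency per million): hypothetical values.
fixated_words = [
    ("basketball", 0.01, 0.9),   # unrelated and rare
    ("campaign", 0.52, 35.0),    # related to the goal
    ("dribble", 0.03, 0.4),      # unrelated and rare
    ("the", 0.02, 50000.0),      # unrelated but very frequent
]

incongruent = [w for w, cos, freq in fixated_words if is_incongruent(cos, freq)]
print(incongruent)  # ['basketball', 'dribble']
```

The frequency criterion is what keeps function words such as "the" out of the incongruent set even though their cosine with the goal is low.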

Through the windowing mechanism explained above, both the fixations on incongruent words and those on target words were tagged after scanning the complete eye fixation dataset for all the subjects. These tagged fixations, labeled "Fixations of interest," were specific events for EFRP analyses, allowing the correct fixation selection before epoching. **Figures 3A,B** illustrate the temporal sequence of all events and provide a glossary of these event names. This glossary will be completed in section Fixation selection for global analysis. We will keep referring to this glossary throughout the presentation of the methodology (sections Fixation selection for rank analysis and Fixation selection for global analysis), results and discussion.

#### **STATISTICS ON WORDS**

We have computed some statistics about target and incongruent words and compared them to all other words.

The average length of target words is 7.8 characters (*SD* = 2.7) while the average length of incongruent words is 9.1 characters (*SD* = 3.4). Those words are longer than all other words (5.4 characters, *SD* = 3.2). However, target words are about the same length as nouns (7.5 characters, *SD* = 2.6). Incongruent words are longer because they were selected from low-frequency words, and it is known that the length of a word tends to bear an inverse relationship to its relative frequency.

The average frequency of target words is 47.3 per million (*SD* = 52.4) while the average frequency of incongruent words is 0.4 per million (*SD* = 0.6). The frequency of target words lies in the central part of the distribution, whereas incongruent words are by definition located in the left tail of the distribution.

#### **RESULTS ON EYE TRACKER DATA**

Out of the 17 participants, two were excluded from the EFRP analysis because of their behavioral data. One of them had read all the texts up to the end, so we considered that his decisions were not made during the reading process. Consequently, the average number of fixations for this participant was too high (more than one standard deviation above the average of all participants). A second participant was excluded for the opposite reason: the average number of fixations for this participant was too low (more than one standard deviation below the average of all participants). For the fifteen remaining participants, the average number of fixations and standard deviation for the three types of text (HR, MR, and UR) are indicated in **Table 1**.

HR texts are likely to induce positive decisions (keep the text), whereas UR texts should lead to negative decisions (reject the text). As regards the decision-making task performed by the participant, moderately related texts were created to introduce a continuum of relatedness to the goal, from unrelated to highly related, in order to maintain both the difficulty of the task and the necessity to read a significant part of the text before making any decision. In order to verify these assumptions, ratios for the kept and rejected texts were computed for each type of text. The ratio of correct responses (keep the text for highly related texts, and reject the text for unrelated texts) was very high (see **Table 2**, lines 2, 4, and 5). As regards moderately related texts, participants decided to keep the text in 47.2% of cases (about chance). These results confirm the role of neutrality for the MR texts in the experiment and their intermediate situation between UR and HR texts, which are our texts of interest.

**Table 1 | Average number of fixations and standard deviation for all participants (15), according to the three kinds of text.**

**Table 2 | Ratio of decisions for all the participants according to the different kinds of texts.**

Let us consider trials with target words and incongruent words, and also a temporal perspective on decision-making after those particular fixated words. Two issues were addressed. As we explained earlier, the first one was to set the selection thresholds for incongruent words. The second one was to analyze whether fixations on those words were linked to a possible speed-up in decision-making. In other words, do these specific words induce decision-making?

To empirically define the selection thresholds for incongruent words, two conditions were defined. On one hand, we wanted to have about the same number of trials with fixated target words and incongruent words. There were 32 target words in the 14 highly related texts, providing 207 trials where participants fixated target word(s) before making any decision. By setting the thresholds to 0.06 for the semantic link with the goal and 1.5 per million words for the word frequency, we have obtained 33 incongruent words used in 17 unrelated texts. We have observed 210 trials in which participants fixated incongruent words before making a decision. The number of trials with target words was about the same. On the other hand, we have aimed at an even distribution among participants of the median values of the number of remaining fixations after an incongruent word, but also after a target word. We have checked that these two patterns of median values were similar (minimum of mean squared error) as we changed the thresholds. Finally, the average for all participants of the median values of remaining fixations was 6.53 after a target word (*SD* = 2.20) and 5.76 (*SD* = 3.59) after an incongruent word. Then with the proposed thresholds, these two constraints were met.

The second calculation was intended to show that target and incongruent words are more likely to induce decision-making than other words. In trials with target words (**Table 2**, line 3), the percentage of correct responses was high (90.8%) and not different from that of all trials on highly related texts (92.9%). In trials with incongruent words (**Table 2**, line 6), the percentage of correct responses was high (94.8%) and not different from that of all trials on unrelated texts (95%). Thus, fixation on a target or an incongruent word did not improve performance.

To analyze decision-making after a target word or an incongruent word, we compared the remaining number of fixations after those particular words (target or incongruent) with the remaining number of fixations after words which were fixated at the same rank although they were NOT target or incongruent words. To do so, we considered two pairs of conditions: texts with fixations on target words (TW) vs. texts without fixations on target words (NTW), and texts with fixations on incongruent words (IW) vs. texts without fixations on incongruent words (NIW). In conditions TW and IW, for every participant and every text, we computed the number of remaining fixations after target or incongruent words. For conditions NTW and NIW, for each participant, all fixation ranks of target and incongruent words were randomly matched with texts without target words or incongruent words, in order to compute the remaining number of fixations. For example, suppose a participant made 12 fixations on a text, and fixated a target word at rank 9. The number of remaining fixations is 3. This case was randomly matched with another text without any target word. Suppose the reading of this text was abandoned after 14 fixations. The number of remaining fixations after 9 fixations is 5. This value of 5 will be compared with the value of 3 previously obtained. These matches were repeated 100 times.
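The rank-matching procedure can be sketched as follows, using the worked example from the text (rank 9, 12 vs. 14 fixations). The function names, the seeded random generator and the rule of drawing only control trials with at least `rank` fixations are assumptions; the text does not say how control trials shorter than the rank were handled.

```python
import random
from statistics import mean

def remaining_after(rank, n_fixations):
    """Number of fixations made after the fixation of the given rank."""
    return n_fixations - rank

def matched_baseline(ranks, control_lengths, repetitions=100, seed=0):
    """Average remaining fixations when each rank is paired with a random control trial.

    ranks: fixation ranks of the words of interest (one per trial).
    control_lengths: total number of fixations of each control trial.
    ASSUMPTION: a control trial is eligible only if it has at least `rank` fixations.
    """
    rng = random.Random(seed)
    repetition_means = []
    for _ in range(repetitions):
        values = []
        for rank in ranks:
            eligible = [n for n in control_lengths if n >= rank]
            if eligible:
                values.append(remaining_after(rank, rng.choice(eligible)))
        if values:
            repetition_means.append(mean(values))
    return mean(repetition_means)

# Worked example from the text.
observed = remaining_after(rank=9, n_fixations=12)        # 3
control = matched_baseline(ranks=[9], control_lengths=[14])  # 5
print(observed, control)
```

With real data, `ranks` would hold one entry per TW (or IW) trial and `control_lengths` the fixation counts of the NTW (or NIW) trials of the same participant.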

In all the trials with target words, the average number of remaining fixations after a target word was 7.33. Using the same fixation ranks on HR texts without target words, and considering 100 repetitions of the matching, the average number of remaining fixations was 9.46 (*SE* = 0.14). In all the trials with incongruent words, the average number of remaining fixations after an incongruent word was 7.49. Using the same fixation ranks on UR texts without incongruent words, again with 100 repetitions of the matching, the average number of remaining fixations was 8.19 (*SE* = 0.11). In both cases, the number of remaining fixations after a target word or an incongruent word at a given rank was lower than the number of remaining fixations after a word of the same rank that did not have those properties. These differences were highly significant (*p* < 0.01). From these behavioral data, we concluded that fixations on target words or on incongruent words affected decision-making, since the decision to stop reading seemed to occur with a reduced latency after the fixation on those words, but did not affect the performance level.

Regarding fixation durations, the average and standard deviation of the fixation and saccade durations for both HR texts with target word(s) and UR texts with incongruent word(s) are given in **Table 3**. Fixations shorter than 80 ms or longer than 600 ms were excluded from the analyses (0.1% of all fixations).

**Table 3 | Average number of fixations, average and standard deviation for fixation durations and saccade durations for all the participants**

**FIGURE 4 | Distribution of the ISI *t*<sub>−1</sub>, *t*<sub>1</sub>, and *t*<sub>2</sub> when fixations of interest are fixations on (A) target words or (B) incongruent words.**

To carry out EFRP analyses, it is important to know the distribution of the intervals between fixations, which constitute the inter-stimulus interval (ISI) for ERP extraction. The duration of the ISI corresponds to the sum of the durations of one fixation and one saccade (**Table 3**). Given these statistics, for any time interval considered in an EFRP analysis we observe a continuum of events coming from previous and subsequent fixations. To illustrate this, let *t*<sub>1</sub> and *t*<sub>2</sub> denote the temporal intervals between a given fixation of interest and the first and second subsequent fixations, respectively, and *t*<sub>−1</sub> the temporal interval to the first previous fixation. The distributions of *t*<sub>−1</sub>, *t*<sub>1</sub>, and *t*<sub>2</sub> are plotted in **Figure 4**, where the fixations of interest are all the fixations on target words (**Figure 4A**) or all the fixations on incongruent words (**Figure 4B**), for all the subjects. We will discuss these statistical properties of the ISI when interpreting the results of our EFRP analyses (section Discussion, Conclusion). Consequently, if we consider a time interval of around 300 ms to extract a component, the resulting signal pattern will reflect the superposition of the EEG activity elicited at the onset of both the current fixation and the subsequent fixation, since on average the duration of one fixation plus one saccade (185 + 45 ms = 230 ms) is shorter than 300 ms. The same argument applies to a time interval of around 500 ms: the result of the EFRP analysis will reflect the superposition of the EEG activity elicited at the onset of the current fixation and the two subsequent fixations, since on average the duration of two fixations plus two saccades [2 × (185 + 45 ms) = 460 ms] is shorter than 500 ms. We will discuss these situations in sections EFRP analysis for late components and results, and Discussion, conclusion.
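The overlap arithmetic above can be checked with a few lines of code. This sketch uses only the mean durations quoted in the text (185 ms and 45 ms) and treats every ISI as equal to the mean, which is a simplification: the real ISI is jittered. The function names are illustrative.

```python
MEAN_FIXATION_MS = 185  # mean fixation duration quoted in the text
MEAN_SACCADE_MS = 45    # mean saccade duration quoted in the text


def mean_isi(fixation_ms=MEAN_FIXATION_MS, saccade_ms=MEAN_SACCADE_MS):
    """Mean inter-stimulus interval: one fixation plus one saccade."""
    return fixation_ms + saccade_ms


def overlapping_onsets(window_ms, isi_ms):
    """Number of subsequent fixation onsets that fall strictly inside
    an analysis window of `window_ms`, assuming a constant ISI."""
    return (window_ms - 1) // isi_ms


isi = mean_isi()
for window in (300, 500):
    print(window, "ms window:", overlapping_onsets(window, isi), "subsequent onset(s)")
```

With a 230 ms mean ISI, a 300 ms window contains one subsequent fixation onset and a 500 ms window contains two, which is the situation described for early and late components.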

### **METHODOLOGICAL ASPECTS OF EFRP ANALYSIS**

In this study, our objective was to analyze the neural activities of decision-making during the reading of a text on the time scale of the eye fixations. The methodology for EFRP analyses therefore combined the selection of particular fixations in the texts with the temporal evolution of neural activities before and after these events. In addition, because brain signals overlap when the average inter-stimulus interval (230 ms, including the durations of the fixation and the saccade) is much shorter than the usual latency of late components, EFRPs were analyzed according to a three-step strategy.

Firstly, EFRPs were studied at the level of fixation ranks. Brain signals were averaged for each fixation event preceding or following a fixation on a target word or an incongruent word. More precisely, we first averaged brain signals at the onset of fixations on the target word. Then, we averaged brain signals on fixations occurring just before the target word, but also on fixations Target−2, Target−3, etc. We did the same on fixations following the target word: Target+1, Target+2, etc. We used the same principle for the incongruent words. Rank analyses were conducted on temporally linked conditions, in order to study the time course of neural activities, fixation after fixation, around particular events. The goal was to look for fixation ranks that could be associated with a specific brain component that would be a marker of a decision.
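The rank-by-rank averaging described above can be sketched as follows. This is a simplified, single-channel illustration under stated assumptions: the data layout (one list of samples per trial, fixation onsets given as sample indices) and all names are hypothetical, and steps such as baseline correction and artifact rejection are left out.

```python
def average_by_rank(trials, epoch_len, ranks=(-1, 0, 1, 2, 3)):
    """Average fixation-locked epochs separately for each rank around a key word.

    trials: iterable of (signal, onsets, key_idx) where
        signal  - list of voltage samples for one trial (one channel),
        onsets  - sample indices of the fixation onsets in that trial,
        key_idx - position in `onsets` of the fixation on the key word.
    Rank 0 is the fixation on the key word, -1 the one just before, etc.
    Returns {rank: averaged epoch}; ranks with no valid epoch are omitted.
    """
    sums = {r: [0.0] * epoch_len for r in ranks}
    counts = {r: 0 for r in ranks}
    for signal, onsets, key_idx in trials:
        for r in ranks:
            i = key_idx + r
            if not 0 <= i < len(onsets):
                continue  # this rank does not exist in the trial
            epoch = signal[onsets[i]:onsets[i] + epoch_len]
            if len(epoch) < epoch_len:
                continue  # epoch runs past the end of the recording
            sums[r] = [s + v for s, v in zip(sums[r], epoch)]
            counts[r] += 1
    return {r: [s / counts[r] for s in sums[r]] for r in ranks if counts[r]}


# Toy data: two trials, three fixations each, key word on the second fixation.
toy_trials = [
    (list(range(10)), [0, 3, 6], 1),
    (list(range(10, 20)), [0, 3, 6], 1),
]
print(average_by_rank(toy_trials, epoch_len=2))
```

Each rank is averaged independently, so the resulting waveforms can be compared fixation after fixation around the key word.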

Secondly, we focused on those fixation ranks showing a specific brain signal pattern that was different from the other surrounding ranks, in order to carry out a more global analysis. To study whether a particular signal pattern we had found was really specific to a given rank, we compared it with every signal that appeared at both previous and subsequent fixation ranks. In practice, we selected a subset of those events so as to guarantee an even distribution of fixation durations before and after the central event.

Finally, we investigated late components synchronized with these particular events, in spite of the fact that after about 230 ms the signal overlaps with the one associated with the *N* + 1 fixation and, even worse, at about twice that time, the signal overlaps with that of the *N* + 2 fixation. For these analyses, the expected latencies were deduced from the previous results. **Table 4** shows the mean, standard deviation and minimum for all the fixations of interest mentioned earlier.

For all these analyses and for each participant, mean voltages were computed in different latency windows for all fixations of interest and Regions of Interest (ROI, see **Figure 5**), each grouping three electrodes, and were subjected to ANOVAs. ROI 1, 2, 3 and 4 were selected to allow left (ROI 1, ROI 3) vs. right (ROI 2, ROI 4) and anterior (ROI 1, ROI 2) vs. posterior (ROI 3, ROI 4) comparisons. ROI 5 includes the midline electrodes (Fz, Cz, Pz) except Oz because, given our task, we wanted to group all the occipital electrodes together (ROI 6: O1, Oz, O2). *P*-values are reported after the Greenhouse-Geisser correction for nonsphericity, and Tukey tests were used for *post hoc* comparisons.
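As an illustration of the dependent measure entered into the ANOVAs, the sketch below computes the mean voltage of one ROI in one latency window. Only ROI 5 and ROI 6 are included because their electrodes are named in the text; the sampling rate, the data layout (sample 0 at fixation onset) and the function name are assumptions.

```python
# Electrode groupings stated in the text; ROI 1-4 are omitted because
# their electrodes are not listed here.
ROIS = {
    "ROI 5": ("Fz", "Cz", "Pz"),
    "ROI 6": ("O1", "Oz", "O2"),
}


def roi_window_mean(efrp, roi, window_ms, srate_hz):
    """Mean voltage over the electrodes of `roi` within `window_ms`.

    efrp: dict mapping channel name -> list of voltages, with sample 0
          at fixation onset (assumed layout).
    window_ms: (start, end) in ms, both bounds inclusive.
    srate_hz: sampling rate; an assumed parameter.
    """
    start = round(window_ms[0] * srate_hz / 1000)
    stop = round(window_ms[1] * srate_hz / 1000) + 1
    values = [v for ch in ROIS[roi] for v in efrp[ch][start:stop]]
    return sum(values) / len(values)


# Toy EFRP sampled at 1000 Hz: the voltage simply equals the sample index.
toy = {ch: [float(i) for i in range(200)] for ch in ROIS["ROI 6"]}
print(roi_window_mean(toy, "ROI 6", (51, 90), srate_hz=1000))
```

One such value per participant, fixation of interest, ROI and latency window would form the table on which the repeated-measures ANOVA is run.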

### **RANK ANALYSIS AND RESULTS**

#### **FIXATION SELECTION FOR RANK ANALYSIS**

According to our assumptions, the aim of the fixation-by-fixation EFRP analysis, called *rank analysis*, was to search for EEG components related to decision-making following fixations on target words or incongruent words. **Figure 3A** illustrates the temporal position of all the events related to the fixations on, before and after target words. For instance, the fixation-related event that is situated two fixations after the fixation on the target word is called "Rank2AfterTarget." **Figure 3B** illustrates the temporal position of all the events related to the fixations on, before and after incongruent words. Using all these events, we first conducted different EFRP analyses, rank by rank, centered on "TargetWord" and "IncongruentWord" events. All these *rank analyses* are presented in section Results of rank analysis on target words for the events related to target words, and in section Results of rank analysis on incongruent words for the events related to incongruent words.

**Table 4 | Mean, Standard deviation (***SD***), and Minimum (Min) across participants for all fixations of interest, for target and incongruent words, after preprocessing.**

These rank analyses thus allowed us to detect EEG components that could potentially be related to decision-making, but also to characterize these neural patterns as a transient response. The result of these *rank analyses* was the selection of key events on which a more *global analysis* was carried out to characterize the elicited EEG components with more accuracy.

#### **RESULTS OF RANK ANALYSIS ON TARGET WORDS**

After inspection of the EFRP data and review of the relevant literature (Key et al., 2005; Polich, 2010), we selected the 0–50, 51–90, and 91–200 ms latency windows for analysis. Indeed, as shown in the following results, an effect was observed on the first positive component. Given our task, we interpreted this early effect as a late effect of previous fixations (cf. section EFRP analysis for late components and results), specifically, in this case, as an effect on the P300 component. The fixations of interest included in the ANOVAs were: JustBeforeTarget, TargetWord, JustAfterTarget, Rank2AfterTarget and Rank3AfterTarget.

#### *0–50 ms latency window*

No significant effect was found.

#### *51–90 ms latency window*

The Fixation by ROI interaction was significant between JustBeforeTarget, JustAfterTarget and Rank2AfterTarget fixations [*F*(10, 140) = 2.27, *p* = 0.05]. At ROI 4, JustAfterTarget events elicited a larger positivity than both JustBeforeTarget (*p* = 0.0001) and Rank2AfterTarget events (*p* = 0.01). At ROI 6, JustAfterTarget events elicited a larger positivity than JustBeforeTarget ones (*p* = 0.043).

No significant difference was found between TargetWord and JustAfterTarget fixations or between TargetWord and JustBeforeTarget fixations [*F*(2, 28) = 1.48, *p* = 0.24]. Likewise, no significant difference was found between Rank2AfterTarget and Rank3AfterTarget fixations [*F*(1, 14) = 0.30, *p* = 0.59]. See **Figure 6A** for mean values and standard errors.

#### *91–200 ms latency window*

No significant effect was found.

These results (**Figure 6B**) showed that the first positive component observed after the onset of each fixation was larger for the fixation just after that on the target word. This effect was located in the right centro-parietal and occipital areas. We will call this JustAfterTarget event the *key event* throughout the rest of the paper.

#### **RESULTS OF RANK ANALYSIS ON INCONGRUENT WORDS**

We selected the 0–50, 51–120 and 121–200 ms latency windows for the analysis, after visual inspection of the traces and in light of the literature (Camblin et al., 2007; Kutas and Federmeier, 2011). Indeed, data inspection showed an early negative effect that was interpreted as an N400 modulation of previous fixations (cf. section EFRP analysis for late components and results). The fixations of interest included in the ANOVAs were: JustBeforeIncongruent, IncongruentWord, JustAfterIncongruent, Rank2AfterIncongruent, Rank3AfterIncongruent, and Rank4AfterIncongruent.

#### *0–50 ms latency window*

No significant effect was found.

#### *51–120 ms latency window*

Rank2AfterIncongruent events (−0.28 µV) elicited a lower positivity than JustBeforeIncongruent events [0.81 µV; main effect of Fixation: *F*(1, 14) = 7.12, *p* = 0.018], specifically at ROI 3 (*p* = 0.0018) and ROI 6 [*p* = 0.00012; Fixation by ROI interaction: *F*(5, 70) = 2.90, *p* = 0.043].

The Rank2AfterIncongruent event also elicited a lower positivity than the IncongruentWord event [0.82 µV; main effect of Fixation: *F*(1, 14) = 4.25, *p* = 0.05].

Finally, Rank2AfterIncongruent fixations were less positive than Rank3AfterIncongruent fixations [1.16 µV; *p* < 0.05; main effect of Fixation: *F*(2, 28) = 3.39, *p* = 0.05], specifically at ROI 3, ROI 4 and ROI 6 [marginally significant Fixation by ROI interaction: *F*(10, 140) = 2.25, *p* = 0.07], while no significant difference was observed with JustAfterIncongruent fixations. See **Figure 7A** for mean amplitude and standard error values.

No significant difference was observed between JustBeforeIncongruent, IncongruentWord and JustAfterIncongruent fixations [*F*(2, 28) = 0.086, *p* = 0.87], or between Rank3AfterIncongruent and Rank4AfterIncongruent fixations [*F*(1, 14) = 0.83, *p* = 0.38; Fixation by ROI interaction: *F*(5, 70) = 1.28, *p* = 0.29].

#### *121–200 ms latency window*

No significant effect was found.

These results showed (**Figure 7B**) that the first positive component elicited by the second fixation after that on the incongruent word was less positive. This effect was specifically located in centro-parietal and occipital areas. This Rank2AfterIncongruent event will be called the *key event* throughout the rest of the paper (underlined in **Figure 7**).

### **GLOBAL ANALYSIS AND RESULTS**

#### **FIXATION SELECTION FOR GLOBAL ANALYSIS**

The objective of the global analysis was to compare the EEG components at the key event with those at the previous and subsequent fixations, considering more than one specific fixation, contrary to the rank analysis. To clarify, let us explain the global analysis using the example of fixations on target words. The principle is the same for the incongruent words, and we will note only the specific differences between these two cases.

Among other things, the rank analysis on target words (section Results of rank analysis on target words) allowed us to detect a neural cue on the fixation just after the target words (JustAfterTarget event). In this case, this event is our *key event*. For the incongruent words, the *key event* is the Rank2AfterIncongruent one (section Results of rank analysis on incongruent words).

Based on these key events, we defined more global events, gathering previous and subsequent events. Subsequent events consisted of the fixations after the key event (i.e., for the target words: Rank2AfterTarget, Rank3AfterTarget, ...) until the penultimate fixation before the participant decided to stop reading. Likewise, previous events included the fixations before the target word: from the JustBeforeTarget fixation back to the second fixation on the text (the first fixation at the onset of the text presentation was excluded). Finally, in order to minimize the effect of the variability in fixation duration between the three previously defined populations, namely the key event (i.e., JustAfterTarget or Rank2AfterIncongruent), the events before the key event (i.e., AllBeforeTarget or AllBeforeIncongruent) and the events after the key event (i.e., AllRank2AfterTarget or AllRank3AfterIncongruent), we selected fixations (Nikolaev et al., 2011) so that fixations before the key event and fixations after the key event followed the same distribution of durations as the key event (see **Figure 8**).

To do so, the distribution of fixation durations for the key event was computed (in red, in **Figure 8B**), as well as histograms of fixation durations for the surrounding events (in red, in **Figure 8C**). For each bin, the number of selected fixations was computed so as to be proportional to the distribution on the key event while retaining the maximum possible number of fixations (in blue, in **Figure 8C**). Within each bin, fixations were selected by uniform random sampling. The selected set of fixations composed a new event, called the SelectBeforeTarget event. The same procedure was applied to generate the set of fixations related to the SelectRank2AfterTarget event. This method was replicated for the selection of fixations related to the previous and subsequent events in the case of incongruent words (SelectBeforeIncongruent and SelectRank3AfterIncongruent, to match the distribution of fixation durations on the key event Rank2AfterIncongruent).
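The duration-matching selection can be sketched as follows. This is one possible reading of the procedure, not the authors' implementation: the bin width, the lower bin edge, the seed and the function name are assumptions. The sketch finds the largest scaling of the key-event histogram that the pool of surrounding fixations can supply in every bin, then samples uniformly at random within each bin.

```python
import random
from fractions import Fraction


def match_duration_distribution(key_durations, pool_durations,
                                bin_width_ms=40, lo_ms=80, seed=0):
    """Subsample `pool_durations` so that its histogram is proportional
    to the histogram of `key_durations`, keeping as many fixations as possible.

    bin_width_ms and lo_ms are hypothetical binning parameters.
    """
    rng = random.Random(seed)

    def bin_of(duration_ms):
        return int((duration_ms - lo_ms) // bin_width_ms)

    key_counts = {}
    for d in key_durations:
        key_counts[bin_of(d)] = key_counts.get(bin_of(d), 0) + 1

    pool_bins = {}
    for d in pool_durations:
        pool_bins.setdefault(bin_of(d), []).append(d)

    # Largest scale factor such that no bin asks for more fixations than
    # the pool contains; Fraction keeps the arithmetic exact.
    scale = min(Fraction(len(pool_bins.get(b, [])), n)
                for b, n in key_counts.items())

    selected = []
    for b, n in key_counts.items():
        k = int(scale * n)  # number of fixations to draw from this bin
        selected.extend(rng.sample(pool_bins.get(b, []), k))
    return selected


key = [100, 110, 200]                       # toy key-event durations (ms)
pool = [90, 95, 100, 105, 115, 210, 215]    # toy surrounding-event durations
print(sorted(match_duration_distribution(key, pool)))
```

If a bin occupied by the key event is empty in the pool, the scale factor is zero and nothing is selected; a real analysis would need a rule for that case, such as wider bins.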

The global analysis focused on the contrast between the key event and both the selected previous events and selected subsequent events.


### **RESULTS OF THE GLOBAL ANALYSIS ON TARGET WORDS**

The selected latency windows were the same as those of the rank analysis (0–50, 51–90, and 91–200 ms). The fixations of interest included in the ANOVAs were: SelectBeforeTarget, JustAfterTarget and SelectRank2AfterTarget.

#### *0–50 ms latency window*

No significant effect was found.

#### *51–90 ms latency window*

The Fixation by ROI interaction was significant between SelectBeforeTarget, JustAfterTarget and SelectRank2AfterTarget [*F*(10, 140) = 2.83, *p* = 0.026]. At ROI 4, JustAfterTarget elicited a larger positivity than both SelectBeforeTarget and SelectRank2AfterTarget fixations (*p* = 0.00041 and *p* = 0.00007, respectively), which did not differ from each other. At ROI 6, the JustAfterTarget event elicited a larger positivity than the SelectRank2AfterTarget event (*p* = 0.0065). See **Figure 9A** for mean amplitude and standard error values.

#### *91–200 ms latency window*

No significant effect was found.

These results showed that the first positive component elicited by the fixation just after that on the target word (i.e., JustAfterTarget) was more positive than that of both the selected previous and subsequent fixations (i.e., SelectBeforeTarget and SelectRank2AfterTarget) in the right centro-parietal areas, and than that of the selected subsequent fixations (i.e., SelectRank2AfterTarget) in the occipital areas (**Figure 9B**).


#### **RESULTS OF THE GLOBAL ANALYSIS ON INCONGRUENT WORDS**

The selected latency windows were the same as those of the rank analysis (0–50, 51–120, and 121–200 ms). The fixations of interest included in the ANOVAs were: SelectBeforeIncongruent, Rank2AfterIncongruent and SelectRank3AfterIncongruent.

#### *0–50 ms latency window*

No significant effect was found.

#### *51–120 ms latency window*

Rank2AfterIncongruent was less positive (−0.28 µV) than SelectBeforeIncongruent (1.05 µV) and SelectRank3AfterIncongruent [0.80 µV; main effect of Fixation: *F*(2, 28) = 4.38, *p* = 0.023], which did not differ from one another. Specifically, Rank2AfterIncongruent was less positive than SelectBeforeIncongruent at ROI 3 (*p* = 0.00013), ROI 4 (*p* = 0.00017) and ROI 6 [*p* = 0.00012; Fixation by ROI interaction: *F*(5, 70) = 2.89, *p* = 0.049]. See **Figure 10A** for mean amplitude and standard error values.

#### *121–200 ms latency window*

No significant effect was found.


These results showed that the first positive component elicited by the Rank2AfterIncongruent fixations was less positive than that of the selected previous fixations (i.e., SelectBeforeIncongruent) in centro-parietal and occipital areas (**Figure 10B**).

### **EFRP ANALYSIS FOR LATE COMPONENTS AND RESULTS**

#### **EFRP ANALYSIS FOR LATE COMPONENTS**

The previous analyses revealed a more negative component in the right centro-parietal and occipital areas on fixation N+2 after the incongruent word than on the previous fixations. The negativity occurred quite early, starting 50 ms after the onset of the fixation. Since a negative decision made on an incongruent word is likely to be based on a complex semantic process, we suspected that this negativity could be due to a late component of the incongruent word processing. This component would be strong enough to remain visible after it is merged with components from the two subsequent fixations.

The same argument applies to the target words. We found a specific higher positivity right after the onset of the fixation following the target word. It could also be due to a late component of the target word processing.

As we mentioned earlier, the mean fixation duration is about 185 ms and the saccade duration about 45 ms. So there are about 230 ms from fixation N to N+1 and 460 ms from fixation N to N+2. Since we found a specific component that started around 50 ms after the beginning of the fixation on the Target+1 event as well as on the Incongruent+2 event, we have looked for late components after these durations, that is after 230 + 50 ms for fixations on target words and after 460 + 50 ms for fixations on incongruent words.
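The latency shift described above amounts to adding one or two mean ISIs to the early window. The helper below is a hypothetical illustration of that arithmetic, using the early windows from the rank analyses (about 50–90 ms at Target+1 and about 50–120 ms at Incongruent+2) and the 230 ms mean ISI.

```python
MEAN_ISI_MS = 185 + 45  # mean fixation plus mean saccade duration, from the text


def window_relative_to_word(window_ms, n_fixations_back, isi_ms=MEAN_ISI_MS):
    """Shift a latency window measured at fixation N+k so that it is
    expressed relative to the onset of fixation N (the key word)."""
    start, end = window_ms
    shift = n_fixations_back * isi_ms
    return (start + shift, end + shift)


# Early effect at Target+1 and at Incongruent+2, re-expressed relative
# to the fixation on the word itself.
print(window_relative_to_word((50, 90), 1))
print(window_relative_to_word((50, 120), 2))
```

These are nominal values based on mean durations only. The windows actually analyzed for late components were also set by visual inspection of the EFRPs and by the literature, so they need not coincide with the output of this calculation.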

Late components were analyzed in the 260–320 ms latency window for target words and in the 500–530 ms latency window for incongruent words. Those latency windows were chosen after visual inspection of the EFRP data, review of the relevant literature (Key et al., 2005; Polich, 2010 for the target word analysis; Camblin et al., 2007; Kutas and Federmeier, 2011 for the incongruent word analysis) and, for their beginning, calculation of the average fixation duration (cf. above). The fixations of interest were: JustBeforeTarget, TargetWord and JustAfterTarget for the target word analysis, and JustBeforeIncongruent, IncongruentWord and JustAfterIncongruent for the incongruent word analysis.

#### **LATE COMPONENTS ON TARGET WORDS**

As shown in **Figure 11B**, a late positive component linked to the fixations on target words (red line) seemed to arise in the 260–320 ms latency window. However, no significant effect was found between JustBeforeTarget (0.14 µV), TargetWord (0.36 µV) and JustAfterTarget (−0.17 µV) in this latency window [*F*(2, 28) = 0.74, *p* = 0.48 for the main effect of Fixation, and *F*(10, 140) = 0.94, *p* = 0.45 for the Fixation by ROI interaction]. See **Figure 11A** for mean amplitude and standard error values for ROI 3, 4, and 6. This late positivity elicited by target words could be interpreted as a P300 component.

#### **LATE COMPONENTS ON INCONGRUENT WORDS**

#### *In the 500–530 ms latency window*

IncongruentWord elicited a larger negativity (−0.93 µV) than JustBeforeIncongruent [0.37 µV; main effect of Fixation: *F*(2, 28) = 3.17, *p* = 0.05], specifically at ROI 3 (*p* = 0.042), ROI 4 (*p* = 0.005) and ROI 6 [*p* = 0.00016; Fixation by ROI interaction: *F*(5, 70) = 3.13, *p* = 0.043]. See **Figure 12A** for mean amplitude and standard error values. No significant difference was observed between IncongruentWord and JustAfterIncongruent (−0.063 µV, *p* = 0.24).

These results showed that fixations on incongruent words elicited a larger negativity than fixations just before incongruent words in centro-parietal and occipital areas (**Figure 12B**). The latency window (500–530 ms) and scalp distribution of this negative component suggest that it can be identified as an N400 component.

### **DISCUSSION, CONCLUSION**

Decision-making is a fundamental activity in the search for information in texts. The aim of our study was to highlight specific EFRP patterns linked to decision-making, in a task where participants had to decide as quickly as possible whether the text currently read was semantically related to a given goal.

The task we designed is much closer to the natural search for information than those usually described in the literature. Reading is often temporally or spatially constrained in order to facilitate the alignment of EEG signals: words are studied separately or slowly presented one at a time. In our case, words were displayed all at once on the screen, covering multiple lines, exactly like the texts we are used to reading every day.

The downside of this ecological approach is that signal patterns are much harder to analyze because of overlapping processes. The reason is that as early as 230 ms (a fixation of about 185 ms on average plus a saccade of about 45 ms) after the onset of a fixation, a new fixation occurs, which produces a new signal pattern scrambling the previous one.

To reduce that noise, we could have tried to control the material very precisely. In fact, we controlled the overall relatedness of the text to the goal, but it was not easy to control all the factors that affect reading in a full text and whose control would have facilitated the analyses: word frequency, word predictability, and the distribution of word-goal relatedness over the text. Therefore, we selected a large number of real texts, more or less associated with a large variety of goals. The idea was to have a large text variability so that the peculiarities of individual brain signals would compensate for each other. For instance, suppose we are interested in the brain waves induced by fixations F. There are two ways to solve the issue of the overlapping F and F+1 fixations. The first one is to guarantee that all F+1 fixations elicit the same signal, which could then be easily subtracted from the main signal (Woldorff, 1993). This is almost impossible to do with textual material, because of the high number of factors involved. The second way is therefore to select a large variety of texts, expecting that F+1 fixations elicit signals that counterbalance each other and, as a whole, do not affect the main signal too much, considering that temporal jitter on previous and subsequent events acts as a filter attenuating high frequencies. The more distant the fixation, the larger the jitter. The overlapping effect can therefore be reduced for late components.
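The smoothing effect of temporal jitter can be illustrated with a toy simulation. Everything here is hypothetical: the evoked response is a unit-amplitude Gaussian bump, the ISI is drawn from a normal distribution, and the 60 ms standard deviation is an arbitrary illustrative value rather than one taken from the data. The sketch averages, time-locked to fixation F, the response evoked by fixation F+1.

```python
import math
import random


def response(t_ms, peak_ms=100.0, width_ms=20.0):
    """Toy evoked response: a unit-amplitude Gaussian bump (illustrative only)."""
    return math.exp(-0.5 * ((t_ms - peak_ms) / width_ms) ** 2)


def mean_next_fixation_response(isi_mean_ms, isi_sd_ms, n_trials=300,
                                t_max_ms=600, seed=0):
    """Average, time-locked to fixation F, of the response evoked by F+1.

    Each trial's ISI is drawn from a normal distribution; the SD passed in
    is a hypothetical value, not one reported in the source.
    """
    rng = random.Random(seed)
    total = [0.0] * t_max_ms
    for _ in range(n_trials):
        isi = rng.gauss(isi_mean_ms, isi_sd_ms)
        for t in range(t_max_ms):
            total[t] += response(t - isi)
    return [v / n_trials for v in total]


no_jitter = mean_next_fixation_response(230, 0)
jittered = mean_next_fixation_response(230, 60)
print(f"peak of averaged F+1 response without jitter: {max(no_jitter):.2f}")
print(f"peak of averaged F+1 response with jitter:    {max(jittered):.2f}")
```

With jitter, the averaged contribution of F+1 is spread out in time and its peak is reduced, which is the sense in which jitter attenuates the overlapping response. If successive ISIs vary independently, the jitter accumulates for F+2 and beyond, so more distant fixations are smeared even further.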

#### **THE PROPOSED METHODOLOGY FOR EFRP ANALYSIS**

To overcome this difficulty, more sophisticated methods for uncovering specific EEG patterns during reading and before the final decision were developed. The task is complex since at least two different processes are intertwined (reading and decision-making) and this intertwining depends on both the structure of the text and the participant. Consequently, rank analyses, either forward from the first fixation or backward from the last one, are inappropriate. Moreover, extracting EFRPs from words that have particular properties (frequency, predictability, ...) independently of the fixation rank (Kliegl et al., 2012) was not our purpose, since in this case we explicitly asked the participants to make decisions as they were reading. Our methodology is therefore a hybrid of these two cases. We first identified choice words likely to elicit specific components. We then performed an analysis at the level of fixations, around these words, without any assumption on the latency. By resorting to the EFRP technique, which allows for a fixation-by-fixation analysis of the EEG signal, we identified two response patterns, related respectively to the first fixation just after a target word and the second one after an incongruent word: the fixation just after a target word elicited a larger first positive component in the right parieto-occipital areas, and the second fixation after an incongruent word elicited a lower first positive component in the parieto-occipital areas (bilateral scalp distribution). These effects were very early (50–90 and 50–120 ms after the fixation onset) and can hardly be interpreted as the reflection of a decision made in relation to a semantic process. This is why we combined EFRP analyses at two different scales: at the scale of one ISI duration to extract early components, and at the scale of 2 or 3 times the ISI duration to extract late components.
In the first case, the early extracted components cannot be interpreted as the reflection of a semantic process beginning at the onset of the current fixation. However, they reveal the overlapping of current activity and previous activities started one or two fixations earlier. In the second case, the late components observed with a latency of about 2 or 3 times the ISI duration are also contaminated. In some situations, thanks to the natural ISI jitter, these late components can be observed (depending on the relation between the jitter range and the temporal frequency band of the component). Thus, since the extracted components are contaminated in both cases, the results of these two analyses must be congruent to allow for an interpretation of the extracted component (**Figures 13A,B**). Let us consider typical ERP components in the centro-parietal areas, which are related to word processing during reading (Sereno et al., 1998). Such a signal elicited at each fixation is illustrated in **Figures 13A,B**. Our hypothesis is that late components, such as P300 or N400, could be superimposed on one or two additional components corresponding to the first and second subsequent fixations.

#### **POSITIVE DECISION (TARGET) INDEXED BY A P3B?**

We have defined target words as words that must have the two following characteristics: on the one hand, a morphological form that corresponds to the goal of the text, and on the other hand, a high semantic association with the goal. In the light of these two preceding criteria, we had hypothesized that these words would be likely to induce a (positive) decision to leave the text. Our behavioral data support this hypothesis and show that the remaining number of fixations after a fixation on a target word was significantly reduced when compared with the remaining number of fixations after any other word that was not a target word while being at the same rank of fixations. Thus, decision-making involves accumulation of information, and in our case, with the same preceding amount of information, target words apparently provide a large enough gap to induce a decision.

In terms of electrophysiological data, these two characteristics of the target words (i.e., same verbal form as the goal and high semantic link with the goal) may result in different components. As regards the morphological form, a consensus seems to emerge in the literature showing early electrophysiological effects related to its processing. The first one, the P150, is a relatively focal positive component elicited in the occipital sites (especially in the right hemisphere), which is more positive when a target word is not related to a prime word, in comparison with a full or partial (one letter changed) repetition (Holcomb and Grainger, 2006). This component may not be language-specific and has also been observed in experiments with single letters (Petit et al., 2006) and pictures of objects (Eddy et al., 2006), with the same latency and scalp distribution, suggesting that it would reflect an early process related to surface features and the mapping of visual features onto higher level representations (Chauncey et al., 2008). Note that the latency of P150 overlaps that of N1, hence reducing this latter component.

The second one, the N250, has a broader scalp distribution, somewhat larger over anterior sites than over posterior ones, and takes the form of a larger negativity to target words that were unrelated to their preceding prime words than to target words that shared letters with their primes (Chauncey et al., 2008). Contrary to the P150, this component is not elicited by individual letters or objects and is interpreted as the reflection of the processing of letter combinations (e.g., bigrams, trigrams) and the mapping of prelexical form representations onto whole-word form representations (Grainger and Holcomb, 2009).

We suspect that several reasons could explain why we did not observe these two components in our results. First of all, our task is quite different from those previously reported, in the sense that it consists in searching for information in real reading conditions, and a lot of time and many words usually separate the goal of the text from a target word. In the above-mentioned tasks in which the P150 and N250 were reported, there were a few hundred milliseconds between a prime word and a target word. Therefore, we can assume that the form of the prime word is much more present in memory than the forms of the goal words in our experiment. Moreover, two words established the goal in our experiment, which perhaps reduces the role of their form in solving the task and gives a "supra" goal that corresponds to a semantic combination of the two words' meanings. Therefore, we assume that our task requires semantic processes above all, especially since morphologically related word forms are not necessarily associated with the same meaning. A measure of morpho-semantic coherence (Ford et al., 2003) captures the difference between semantically transparent and semantically opaque morphologically related words (Hauk et al., 2006). For example, "government" and "govern" have a cosine value (LSA) of 0.68, whereas that of "department" and "depart" is 0.04. We have some reason to believe that the processes involved in our task are not related to the perception of the word form, but are rather linked to deeper processes reflected by later components.
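The cosine value mentioned above is the cosine of the angle between the two words' vectors in the LSA semantic space. The sketch below shows the computation on made-up three-dimensional vectors; real LSA vectors have a few hundred dimensions and come from a trained semantic space, so the numbers here do not reproduce the 0.68 and 0.04 values quoted in the text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, for illustration only.
word_a = [0.9, 0.1, 0.3]
word_b = [0.8, 0.2, 0.4]   # points in a similar direction to word_a
word_c = [-0.1, 0.9, 0.0]  # nearly orthogonal to word_a

print(round(cosine(word_a, word_b), 2))
print(round(cosine(word_a, word_c), 2))
```

A value near 1 indicates that two words occur in similar contexts (semantically transparent pairs), while a value near 0 indicates unrelated usage (semantically opaque pairs).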

At the key event JustAfterTarget, we found a first positive component synchronized with this event that is larger than the previous and subsequent ones. The effect is significant in the 50–90 ms window after the onset of the JustAfterTarget event. Referenced to the onset of the TargetWord event (when the subject is supposed to read the target word), this effect appears on average in the 280–320 ms window (obtained by adding the mean durations of one fixation and one saccade). Moreover, as explained before, the EFRP extracted at the key event JustAfterTarget is the sum of multiple contributions (**Figure 13A**): the expected response (blue line) is mixed with the previous responses, mainly from the TargetWord (red line) and JustBeforeTarget (black line) events. More specifically, we suppose that the increase in positivity is due to the summation of a more positive P300 component elicited by the TargetWord fixation (illustrated by the continuous red line as opposed to the red dotted line) with the early positive component P1 from the JustAfterTarget event (blue line).

As a reminder, a positive decision means that the target word has been found, the content has been retrieved and the decision to leave the text can be made. As noted in the introduction, the P300 component can be divided into two subcomponents, P3a and P3b (Polich, 2007). While the former originates from stimulus-driven frontal attention, the latter originates from temporal-parietal activity associated with attention. Our P300 may be interpreted as a P3b component reflecting the decision to stop the search for information.

Indeed, many reports suggest that the amplitude of the P3b varies according to the role that the eliciting stimulus plays in the participant's task. For instance, it has been shown that the P300 is enhanced when the stimulus "resolves uncertainty," is made "task relevant," or is novel or unexpected. A main hypothesis regarding the functional significance of the P3b is that it is related to decision-making (Verleger et al., 2005; Verleger, 2008), which generally results in an enhancement of the P3b amplitude when people are required to make decisions based on stimuli (Acosta and Nasman, 1992). Specifically, there is a relationship between the amplitude of the P3b and the confidence in a decision, with a higher amplitude being associated with greater confidence (Andreassi, 2006). The P3b has been regarded as a sign of memory access processes, evoked by the evaluation of stimuli in tasks that require some form of action (Kok, 2001). Decision-making refers to processes responsible for the identification of the presence or identity of task-relevant stimuli and the mapping of these onto appropriate responses (Nieuwenhuis et al., 2005), which is exactly the case for our target words. For instance, when equated for frequency of occurrence, target stimuli (i.e., stimuli requiring a response) typically elicit higher P3b amplitudes than non-target stimuli. Furthermore, the P3b component appears to index controlled processing resources allocated to decision-making or to the updating of memory after a decision is made (Donchin and Coles, 1988). This is typically what our participants have to do after reaching the target word: decide whether to continue reading. Unfortunately, when synchronized with the TargetWord event, this effect in the 280–320 ms window is not significant. As an index of decision-making, the P3b seems to be spread out over the following fixations. The absence of any effect on the target word may be explained by the fact that (1) the jitter of the early components (± 65 ms, cf. **Table 3**, see **Figure 13A**) is not large enough to reduce this overlap (blue line) and (2) the overlap with the late component elicited by the JustBeforeTarget event is not that disruptive, as its amplitude is weaker and its waveform is of low frequency. We plan to continue this work and use the Adjar algorithm (Woldorff, 1993) to evaluate the overlap before extracting more reliable EFRP components. However, at the current stage of our research, it is not clear whether this increase in positivity, interpreted as a possible P300, is really linked to decision-making processes or, more simply, to the processing of the significance of the word in the context of the task. In other words, it is possible that a participant decided to leave the text not specifically when reading a target word, but slightly later. In this case, decision-making could be a more global process of integrating information from the context (i.e., the previous words) and from specific words (i.e., target words). To answer this question, one could extend this research by replicating the experiment with modified instructions (e.g., asking participants to read the whole text and make the decision at the end of the reading), in order to separate the decision related to the perception of the relevance of a word in the context of the task from the decision to interrupt the reading of the text.

Another possible explanation of this increase in positivity might be an increase in the amplitude of the P325 component. This component, whose posterior distribution is lateralized toward the right hemisphere, is consistent with our results in that it is more positive to targets that overlapped their primes in all letter positions and more negative to unrelated and partially overlapping prime-target pairs (Chauncey et al., 2008). At the moment, we do not have enough evidence to decide between these two explanations, namely an increase in amplitude of a P300 and/or a P325, even if, as mentioned above, the kind of task in which a P325 has been reported is quite different from ours. In any case, our future experiments should draw a clear distinction between target words that are highly semantically related to the goal while having different verbal forms, and target words sharing the same verbal form that are semantically highly related to the goal. It would be interesting to observe, at both behavioral and electrophysiological levels, which kind of component would be elicited, and whether one or the other of these kinds of targets would result in the greater acceleration in decision-making.

#### **NEGATIVE DECISION (INCONGRUENT) INDEXED BY AN N400?**

Incongruent words were defined as words that had a low frequency and were unrelated to the goal. Again, behavioral data showed, as we expected, that the number of fixations remaining after a fixation on an incongruent word was significantly smaller than the number remaining after a fixation on any other word at the same fixation rank. Thus, we can assume that the lack of a semantic link between the incongruent word and the goal was well perceived and contributes to decision-making. From the electrophysiological point of view, this semantic mismatch is reflected by a specific component, the N400.

Our results show that the first positive component elicited by the Rank2AfterIncongruent fixations was less positive than on previous and subsequent fixations. This effect is significant in the 50–120 ms window after the onset of the Rank2AfterIncongruent event. Referenced two fixations earlier, i.e., to the onset of the IncongruentWord event (when the subject is supposed to read the incongruent word), the effect appears on average in the 510–580 ms window (obtained by adding the mean durations of two fixations and two saccades). Our hypothesis is that this reduction of the positive amplitude is due to an increase in the amplitude of the N400 component elicited by the IncongruentWord event occurring two fixations earlier. This increase is illustrated by the component drawn as a continuous red line as opposed to the red dotted line in **Figure 13**. The main disruptive overlap comes from the early components (green line) of the Rank2AfterIncongruent event, but here the jitter is larger (± 125 ms, cf. **Table 3**, see **Figure 13B**), so the overlap is less disruptive and the effect is significant in the 500–530 ms window when the EFRP is extracted at the onset of the IncongruentWord event. This late negative component elicited by the fixation on an incongruent word, with the same scalp distribution (parietocentral and occipital) as the early effect elicited by the second fixation after an incongruent word, was interpreted as an N400 component.
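The re-referencing of latency windows used here, and for the positive-decision effect above, is simple arithmetic and can be sketched as follows. This is only an illustration: the 230 ms offset is not quoted in the text but inferred from the reported windows (50–90 ms becoming 280–320 ms one fixation earlier), and the function name is hypothetical.

```python
# Mean duration of one fixation plus one saccade, in ms.
# Assumption: inferred from the reported windows (280 - 50 = 230), not quoted from the text.
FIXATION_PLUS_SACCADE_MS = 230


def rereference_window(window_ms, fixations_back):
    """Shift a latency window to an event that occurred `fixations_back` fixations earlier."""
    offset = fixations_back * FIXATION_PLUS_SACCADE_MS
    start, end = window_ms
    return (start + offset, end + offset)


# JustAfterTarget window expressed relative to TargetWord onset (one fixation earlier)
print(rereference_window((50, 90), 1))    # (280, 320)
# Rank2AfterIncongruent window expressed relative to IncongruentWord onset (two fixations earlier)
print(rereference_window((50, 120), 2))   # (510, 580)
```

Because the offset is a mean, the re-referenced windows are only approximate for any single trial; the jitter values reported in **Table 3** describe that spread.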

The functional significance of the N400 is linked to meaning processing. More specifically, the N400 is largest for semantic anomalies, with its amplitude inversely related to the degree of semantic relatedness of a stimulus event (for a review, see Kutas and Federmeier, 2000). The N400 amplitude is also highly correlated with an offline measure of the word's expectancy (i.e., cloze probability), that is to say the percentage of individuals who would continue a sentence fragment with that word. Many studies have shown this predictability effect on the N400 component, with low-predictability words eliciting a larger N400 component than high-predictability words (Lee et al., 2012).

Our result is quite original, for very few studies have observed an N400 effect in such a "large" linguistic material, in the sense that incongruent words are specific to a completely different goal and that the goal was disclosed very early in the reading, even before the presentation of the text. This result is in line with recent theories on the role of the N400 component (Kutas and Federmeier, 2011), which interpret it as an integration of the semantic information accessed from the current word with semantic information spread over multiple words (e.g., discourse message-level representations, presumably held in working memory). In this case, the N400 amplitude is more related to higher-level factors than to lower-level ones. The N400 component reflects a process that could pertain to discourse comprehension, in which the reader has to develop a mental representation of the text, which requires a continuous process of integration of the words presented in the text into background knowledge. However, we have to qualify this conclusion slightly, because the incongruent words, as we have defined them, may also be incongruent with respect to the context of the sentence they are taken from. In the future, a distinction between words that are incongruent only in relation to the goal (and not to the preceding context of the sentence) and words that are incongruent in their sentence but not in relation to the goal would be most welcome.

#### **CONCLUSION**

The aim of this experiment was to co-register EEG signals and eye tracking measures while participants had to decide whether or not to stop reading. We found two early effects: a more negative component on the fixation *N* + 2 after an incongruent word and a more positive component on the fixation *N* + 1 after a target word. The first one was interpreted as an N400 component related to the processing of the incongruous word. Further experiments need to be carried out to better explain the increase in positivity. The present paper demonstrates how EFRPs can be a useful tool for identifying the cognitive processes at work during a natural task such as the search for information in texts. Indeed, the major benefit of the EFRP technique used in our experiment is that it allows EEG components to be investigated during free eye movements, in ecological reading conditions. However, the major challenge this technique still faces is to disentangle the overlapping processes occurring during a short fixation. The Adjar deconvolution technique could represent a partial solution to the problem, and we are confident that future research will benefit strongly from cross-linking eye movements and ERPs. We hope that our paper also contributes to this issue by presenting precise methodological points.

#### **ACKNOWLEDGMENTS**

This work was supported by a grant from the ANR Gaze-EEG "Joint synchronous EEG signal and eye tracking processing for the spatio-temporal analysis and modeling of neural activities," and a grant from the "Pôle Grenoble Cognition." Part of the software development was carried out by Ronald Phlypo and Nicolas Tarrin. The authors would like to thank the "Délégation à la Recherche Clinique et à l'Innovation" of the CHU "Centre Hospitalier Universitaire" of Grenoble for their role in the ethics committee, and the reviewers for their comments, which significantly helped to improve the manuscript.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 March 2013; accepted: 19 July 2013; published online: 14 August 2013.*

*Citation: Frey A, Ionescu G, Lemaire B, López-Orozco F, Baccino T and Guérin-Dugué A (2013) Decision-making in information seeking on texts: an eye-fixation-related potentials investigation. Front. Syst. Neurosci. 7:39. doi: 10.3389/fnsys.2013.00039*

*Copyright © 2013 Frey, Ionescu, Lemaire, López-Orozco, Baccino and Guérin-Dugué. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Cognitive processes involved in smooth pursuit eye movements: behavioral evidence, neural substrate and clinical correlation

#### *Kikuro Fukushima<sup>1,2</sup>, Junko Fukushima<sup>3</sup>, Tateo Warabi<sup>1</sup> and Graham R. Barnes<sup>4</sup>\**

*<sup>1</sup> Department of Neurology, Sapporo Yamanoue Hospital, Sapporo, Japan*

*<sup>2</sup> Department of Physiology, Hokkaido University School of Medicine, Sapporo, Japan*

*<sup>3</sup> Faculty of Health Sciences, Hokkaido University, Sapporo, Japan*

*<sup>4</sup> Faculty of Life Sciences, University of Manchester, Manchester, UK*

#### *Edited by:*

*Sebastian Pannasch, Technische Universität Dresden, Germany*

#### *Reviewed by:*

*Uwe Ilg, Hertie-Institute for Clinical Brain Research, Germany*

*Marcus Missal, Université Catholique de Louvain, Belgium*

#### *\*Correspondence:*

*Graham R. Barnes, Faculty of Life Sciences, University of Manchester, Carys Bannister Building, Dover Street, Manchester M13 9PL, UK. e-mail: g.r.barnes@manchester.ac.uk*

Smooth-pursuit eye movements allow primates to track moving objects. Efficient pursuit requires appropriate target selection and predictive compensation for inherent processing delays. Prediction depends on expectation of future object motion, storage of motion information and use of extra-retinal mechanisms in addition to visual feedback. We present behavioral evidence of how cognitive processes are involved in predictive pursuit in normal humans and then describe neuronal responses in monkeys and behavioral responses in patients using a new technique to test these cognitive controls. The new technique examines the neural substrate of working memory and movement preparation for predictive pursuit by using a memory-based task in macaque monkeys trained to pursue (go) or not pursue (no-go) according to a go/no-go cue, in a direction based on memory of a previously presented visual motion display. Single-unit task-related neuronal activity was examined in medial superior temporal cortex (MST), supplementary eye fields (SEF), caudal frontal eye fields (FEF), cerebellar dorsal vermis lobules VI–VII, caudal fastigial nuclei (cFN), and floccular region. Neuronal activity reflecting working memory of visual motion direction and go/no-go selection was found predominantly in SEF, cerebellar dorsal vermis and cFN, whereas movement preparation related signals were found predominantly in caudal FEF and the same cerebellar areas. Chemical inactivation produced effects consistent with differences in signals represented in each area. When applied to patients with Parkinson's disease (PD), the task revealed deficits in movement preparation but not working memory. In contrast, patients with frontal cortical or cerebellar dysfunction had high error rates, suggesting impaired working memory. We show how neuronal activity may be explained by models of retinal and extra-retinal interaction in target selection and predictive control and thus aid understanding of underlying pathophysiology.

**Keywords: smooth pursuit, eye movements, anticipation, efference copy, species comparisons, prediction, computational modeling, pathophysiology**

#### **MAJOR COGNITIVE INFLUENCES ON PURSUIT BEHAVIOR**

#### **BASIC FEATURES OF PURSUIT**

#### *Smooth pursuit initiation*

The simplest way to assess pursuit performance is to examine the response to the sudden onset of an unexpected, constant velocity target motion (a ramp stimulus). **Figure 1A** shows typical human eye displacement responses to ramp stimuli of varying velocity; responses in the monkey are similar (Lisberger and Westbrook, 1985; Lisberger et al., 1987). In humans there is normally a latency of ∼100–130 ms before smooth movements start (Tychsen and Lisberger, 1986; Carl and Gellman, 1987), whereas in the monkey shorter latencies of 80–100 ms are generally observed (Lisberger and Westbrook, 1985). The initial response delay results in a positional error that is corrected by a saccade that normally occurs after ∼240 ms (**Figure 1A**) and realigns the image close to the fovea. Smooth eye displacement prior to the first saccade is often small, but derivation of its velocity shows that the eye accelerates prior to the first saccade. However, after the saccade, eye velocity often jumps to a higher level (Lisberger, 1998), a feature referred to as post-saccadic enhancement. To eliminate the initial saccade or, at least, to ensure that it occurs later in the response, many investigators have used the so-called step-ramp stimulus (Rashbass, 1961), in which the target first jumps to one side, then makes a ramp in the opposite direction and crosses over the starting point in ∼200 ms (**Figure 1B**). Eye movement normally starts somewhat later than for a simple ramp, at ∼130–150 ms after the step in humans (Rashbass, 1961). Once under way, the first 100 ms of the smooth response is effectively in an open-loop phase, since the delay in visual processing dictates that within this time period the retinal velocity error is not changed by the movement of the eye, as confirmed by open-loop studies (Carl and Gellman, 1987). Detailed examination of the step-ramp response has shown two distinct phases. In the initial 20–30 ms eye acceleration shows some increase with target velocity but not with starting position of the target motion (Lisberger and Westbrook, 1985; Tychsen and Lisberger, 1986), whereas in the period 60–80 ms after onset there is a much greater modulation of eye acceleration by target velocity and a strong dependence on eccentricity of starting position. In humans, peak eye velocity is normally attained at a time that typically increases from ∼220 to ∼330 ms after response onset as target velocity increases from 5 to 30°/s (Robinson et al., 1986).

**FIGURE 1 |** […] smooth eye velocity responses made in response to target motion cues during presentation of randomized target velocities. Each cue comprised a digit (1, 2, 3, or 4) representing speed (10, 20, 30, or 40°/s, respectively) and a directional indicator (< or >) and occurred 600 ms before target onset. Anticipatory velocity markers (-) indicate eye velocity 50 ms after target onset (V50), prior to visual feedback. V50 increased as target speed increased (Jarrett and Barnes, 2005) as shown in these examples. In the catch presentation, target motion was unexpectedly delayed by 160 ms, resulting in earlier initiation of anticipatory movement and attainment of higher than normal V50.
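The geometry of the step-ramp stimulus described above can be made concrete with a short sketch. It assumes only what the text states (a step away from the ramp direction, with the target re-crossing the starting point after about 200 ms); the function name and the choice of 0° as the starting position are illustrative assumptions.

```python
def step_ramp_position(t, velocity, crossing_time=0.2):
    """Target position (deg) at time t (s) for a step-ramp stimulus.

    Before t = 0 the target rests at 0 deg. At t = 0 it steps to the side
    opposite to the ramp direction, then moves at `velocity` (deg/s), so that
    it crosses the starting point again at `crossing_time` (s).
    """
    if t < 0:
        return 0.0
    step = -velocity * crossing_time  # step size grows with ramp velocity
    return step + velocity * t


# A 10 deg/s ramp needs a 2 deg step to cross the starting point at 200 ms
for t in (-0.1, 0.0, 0.1, 0.2, 0.3):
    print(t, step_ramp_position(t, velocity=10.0))
```

The step size scales with ramp velocity in this sketch, which is what keeps the crossing time fixed across target speeds.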

#### *Smooth pursuit maintenance*

Following initial response onset eye velocity frequently overshoots target velocity and may oscillate at a frequency of 3–4 Hz in humans (**Figure 1B**) (Robinson et al., 1986). Oscillations normally die away within one or two cycles, although this varies between subjects and the size of the visual stimulus (Wyatt and Pola, 1987). With prolonged stimulation eye velocity settles to an average that is close to target velocity. Gain (the ratio of eye velocity to target velocity) is normally in the range 0.9–1.0 for target velocities <20°/s. Meyer et al. (1985) showed that gain in humans could remain as high as 0.9 up to ∼90°/s, but declines at higher velocity. If gain falls substantially below unity, corrective saccades are made to realign the target image on the fovea.

#### *Smooth pursuit termination*

Ocular pursuit is an example of a negative feedback control system and if it were linear, the response evoked by termination of a ramp stimulus should be the inverse of that at initiation; eye velocity should thus oscillate when reaching zero velocity (i.e., in the transition from pursuit to fixation). However, when target motion ceases unexpectedly, following a latency of ∼100 ms, eye velocity generally decays to zero with a time constant of ∼100 ms (Robinson et al., 1986; Pola and Wyatt, 1997) without evidence of overshoot. This was taken as evidence that fixation does not represent pursuit at zero velocity; rather, the simple decay of eye velocity was thought to represent the disengagement of pursuit (Robinson et al., 1986). As discussed later the response at termination actually depends on the subject's expectation.
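The termination behavior described above (a latency of about 100 ms followed by decay with a time constant of about 100 ms) can be written as a simple exponential. The sketch below is an illustrative simplification: a pure exponential is an assumption about the shape of the decay, and the function name is hypothetical.

```python
import math


def eye_velocity_after_target_stop(t, v0, latency=0.1, tau=0.1):
    """Eye velocity (deg/s) at time t (s) after the target stops unexpectedly.

    v0 is the eye velocity at the moment the target stops. Velocity is held
    for `latency` seconds, then decays exponentially with time constant `tau`,
    so it approaches zero without overshoot.
    """
    if t < latency:
        return v0
    return v0 * math.exp(-(t - latency) / tau)


# One time constant after the latency, velocity has fallen to about 37% of v0
print(eye_velocity_after_target_stop(0.2, v0=10.0))
```

A linear negative-feedback system would instead be expected to oscillate around zero, which is the contrast the paragraph draws.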

#### **THE ROLE OF RETINAL AND EXTRA-RETINAL MECHANISMS**

Models based on control theory have been used very successfully to describe the dynamic characteristics of pursuit (Robinson et al., 1986). The major problem lies in simulating the relatively rapid rise of eye velocity combined with the high levels of closed-loop gain normally attained. These two requirements cannot be met by simple negative feedback without the system exhibiting unstable oscillation because of the time delays associated with visual motion processing; although some oscillation is observed (**Figure 1**), it is generally of small amplitude. The most widely accepted way in which stability is thought to be achieved is through the positive feedback of an efference copy of eye movement, as represented by the *reactive* loop of the model shown in **Figure 2**, a proposal originally made by Yasui and Young (1975). Elaborations of this concept have formed the basis for a number of subsequent models (Robinson et al., 1986; Krauzlis and Lisberger, 1994; Deno et al., 1995; Krauzlis and Miles, 1996).

An important generic feature of these models is that if visual feedback is suddenly cut off, the efference copy feedback loop can sustain the response to some extent (**Figure 1C**). In effect, the loop acts as a simple, but volatile, velocity memory. This fits with an important observation, that during pursuit of a target that unexpectedly disappears, smooth eye movements do not simply stop but can be sustained, albeit at reduced velocity, in both humans (Von Noorden and Mackensen, 1962; Becker and Fuchs, 1985) and monkeys (Newsome et al., 1988). This occlusion paradigm has been used frequently to reveal features of the internal (or extra-retinal) drive mechanisms for pursuit.
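The idea that an efference copy loop behaves as a volatile velocity memory can be illustrated with a toy simulation. The sketch below is a deliberately reduced first-order loop, not the model of **Figure 2** or any published implementation: delayed retinal velocity error and a positive-feedback copy of eye velocity (weight `beta`, a hypothetical parameter) drive a first-order lag. The 90 ms delay and 0.12 s time constant are borrowed from the Figure 2 caption; the target speed, occlusion time, and `beta` values are arbitrary assumptions.

```python
def simulate_pursuit(beta, target_speed=20.0, occlude_at=1.0, t_end=1.5,
                     dt=0.001, delay=0.09, te=0.12):
    """Euler simulation of eye velocity (deg/s) for a constant-velocity target.

    beta is the weight of the positive efference-copy feedback (0 disables it).
    The target starts moving at t = 0 and is extinguished at t = occlude_at.
    Returns eye velocity sampled every dt seconds.
    """
    n_steps = int(round(t_end / dt))
    delay_steps = int(round(delay / dt))
    occlude_step = int(round(occlude_at / dt))
    eye = [0.0] * (n_steps + 1)
    for i in range(n_steps):
        j = i - delay_steps  # sample seen through the visual delay
        if 0 <= j < occlude_step:
            retinal_error = target_speed - eye[j]
        else:
            retinal_error = 0.0  # no visual input yet, or target extinguished
        drive = retinal_error + beta * eye[i]  # visual error plus efference copy
        eye[i + 1] = eye[i] + dt * (drive - eye[i]) / te  # first-order lag
    return eye


with_copy = simulate_pursuit(beta=0.9)
without_copy = simulate_pursuit(beta=0.0)
# Eye velocity 0.5 s after target extinction, with and without the efference copy
print(round(with_copy[-1], 1), round(without_copy[-1], 1))
```

In this toy loop, once the delayed visual signal runs out, eye velocity decays with time constant `te / (1 - beta)`, so a feedback weight close to 1 sustains the response far longer than the bare lag does. The memory is still volatile, because the stored velocity leaks away continuously.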

Recent evidence has called into question the validity of this simple efference copy model (Barnes and Collins, 2008a,b). Although target occlusion experiments lead to a decrease in eye velocity, there is often a recovery of eye velocity (**Figure 1C**) prior to expected target reappearance (Becker and Fuchs, 1985; Bennett and Barnes, 2003) that cannot be easily explained by such models. Moreover, eye velocity can increase above the level attained prior to disappearance if target velocity is expected to increase at the end of the occlusion (Barnes and Schmid, 2002; Bennett and Barnes, 2004). Also, corrective saccades during occlusion tend to align eye position with the expected target trajectory (Bennett and Barnes, 2003), as shown in **Figure 1D** (see section "The Role of Expectation and Mismatch Detection in Predictive Pursuit" for details), suggesting that true velocity has been retained and integrated to estimate future target position despite smooth eye velocity reduction (see also Orban de Xivry et al., 2008, 2009). Related evidence for such positional corrections has been obtained in monkey (Barborica and Ferrera, 2003). This suggests that initial target velocity is sampled and stored in a less volatile form of memory than implied by continuous efference copy feedback.

#### **THE ROLE OF EXPECTATION AND MISMATCH DETECTION IN PREDICTIVE PURSUIT**

One of the problems in assessing the validity of the efference copy idea is that it is difficult to demonstrate the existence of internally driven eye movements in the absence of vision unless there has been some prior visual input (as in **Figure 1C**). In particular, it is difficult to initiate smooth eye movements in the absence of visual input. Early experiments suggested a capacity to evoke only very low velocity smooth pursuit at will (Heywood, 1972; Kowler and Steinman, 1979), but subsequent experiments have revealed that much higher velocities can be evoked as anticipatory movements during repeated stimulation in humans (Becker and Fuchs, 1985; Barnes et al., 1987; Boman and Hotson, 1988) and monkeys (Ilg, 2003; Missal and Heinen, 2004). In addition, Jarrett and Barnes (2001, 2002) have shown that subjects can use symbolic cues that reliably indicate the speed and direction of an upcoming target motion to generate appropriately scaled and directed anticipatory movements, even when target movements are randomized in speed and direction (**Figure 1E**). More surprisingly, smooth movements can be evoked in the absence of any retinal slip, e.g., when following a series of target steps (Barnes et al., 1987; Barnes and Asselman, 1992) or when shifting attention to a more eccentric location on an image that is stabilized on the fovea. The latter generates smooth movement scaled to the eccentricity and in the direction of the shift (Grüsser, 1986; Sheliga et al., 1994; Barnes et al., 1995). These findings suggest a more generalized mechanism for generating smooth pursuit when the target is expected to move from one position to another. In all cases, though, expectation is the critical factor that allows initiation of such internally generated movements (Kowler, 1989; Barnes et al., 2002). Expectation is also a critical factor in the maintenance of eye velocity during occlusion; without expectation of target reappearance, eye velocity rapidly declines toward zero (Mitrani and Dimitrov, 1978; Bennett and Barnes, 2004), even when the subject attempts to continue pursuit, as evidenced by the fairly successful ability to follow the future target movement (**Figure 1D**).

**FIGURE 2 | Model of ocular pursuit.** The basis of the model is a negative feedback loop in which retinal velocity error is processed by internal dynamics $F(s)$ with variable gain $K$ and a delay ($\tau_v$) of ∼80–100 ms. The negative visual feedback is supplemented by extra-retinal input from either a *reactive* or *predictive* loop. The input to both reactive and predictive pathways comes from sampling (for ∼150 ms) and holding a copy of the reconstructed target velocity signal ($T'$) in module S/H. The reactive loop can thus sustain eye velocity even if visual input is withdrawn (i.e., if sw1 is opened). The predictive loop includes a more robust short-term memory (MEM), which can hold velocity information over longer periods and during fixation. Both direct and indirect pathways feed out through an expectation-modulated gain control ($\beta < 1$) and filter $F''(s)$. In a reactive response, S/H output is fed out directly but is also temporarily stored in MEM. In predictive mode, output of MEM is fed out to form an anticipatory response with timing based on external cues or on the detection of direction changes in the reconstructed target velocity signal and held in the predictive timing store. $F''(s) = F'(s) = F(s) = (1 + T_e s)^{-1}$, with $T_e = 0.12$ s. Non-linear gain function approximated by $K = K_0 (1 + e/e_0)^{-0.5}$, where $e$ is the retinal error and, typically, $e_0 = 4$°/s and $K_0 \approx 2.4$. For information on putative brain areas (MT, MST, FEF, SEF, PFC, CER, and BG) see section "Allocation of Model Functions to Specific Brain Areas." Adapted from Barnes and Collins (2011).
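The non-linear gain function quoted in the Figure 2 caption is easy to evaluate numerically. The sketch below uses the caption's typical values for K0 and e0; applying the function to the magnitude of the retinal error (so that it is defined for both directions of motion) is an assumption of this sketch, and the function name is hypothetical.

```python
def visual_gain(retinal_error, k0=2.4, e0=4.0):
    """Gain K = K0 * (1 + |e| / e0) ** -0.5 for retinal velocity error e (deg/s).

    k0 and e0 are the typical values given in the Figure 2 caption.
    Using |e| rather than e is an assumption made here for symmetric behavior.
    """
    return k0 * (1.0 + abs(retinal_error) / e0) ** -0.5


# Gain is highest for small errors and falls as retinal error grows
for e in (0.0, 4.0, 12.0, 32.0):
    print(e, round(visual_gain(e), 3))
```

With these values the gain is 2.4 at zero error and halves to 1.2 at an error of 12°/s, so the visual pathway responds proportionally less to large retinal errors.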

The dependence on expectation is probably associated with the need to detect any mismatch between prediction and sensory feedback. Such a mechanism is essential if false predictions generated by extra-retinal mechanisms are to be rectified. Effects of expectation can be readily revealed by catch trials in which unexpected stimulus changes occur (see example in **Figure 1E**). In general, inappropriate prediction occurs for at least 100 ms after expected target appearance, i.e., the expected latency of visual feedback (Barnes and Asselman, 1991; Barnes et al., 2000). Absence of conflict is probably the factor that allows smooth pursuit to be continued when the image is stabilized on the retina (Cushman et al., 1984; Barnes et al., 1995).

#### **EVIDENCE OF SAMPLING AND STORAGE IN THE INITIAL PURSUIT RESPONSE**

To test the hypothesis that target velocity might be sampled at the onset of the pursuit stimulus, Barnes and Collins (2008a) devised an experiment in which the target was presented for a very brief period (*PD* = 50–200 ms) at the beginning of the ramp and was then extinguished for a period (ED) up to 600 ms. Crucially, the direction, speed, initial presentation duration and time of initiation of the ramp were randomized with the objective of determining whether subjects were able to extract motion information within the brief presentation so as to scale their smooth eye velocity to target velocity during occlusion. Various sources of behavioral (Lisberger, 1998; Bennett et al., 2007) and neurophysiological evidence (Osborne et al., 2004) suggested that 200 ms should be sufficient to fully extract target velocity information. Given the brief duration of presentation, the retinal component of pursuit was expected to be considerably reduced, allowing any extra-retinal component to be clearly identified. As shown in **Figures 3A–C**, there were two distinct phases of the response to this Mid-ramp extinction condition, an initial rapid increase in eye velocity followed by a secondary, more sustained response. The initial component closely followed the response in a control condition (cyan trace, **Figure 3C**) in which the target was continuously visible, but this initial component reached a peak that increased as the duration of target presentation (PD) increased; this represents the visually-driven component of the pursuit response. The secondary component, however, which continued well after target extinction, represents the extra-retinal component of pursuit. For the shortest presentations (50 and 100 ms) the secondary component continued to increase beyond the initial peak whereas for the 200 ms presentation there was a decline from the initial peak toward an asymptotic level which was similar for 150 and 200 ms. Importantly, this asymptotic level increased as target velocity increased (**Figures 3A,B**).

**FIGURE 3 |** […] **(A–C)** and Short Ramp **(D)** tasks. In **(A)** and **(B)** target velocity = 5°/s (orange), 10°/s (blue), 15°/s (green), or 20°/s (black); PD = 50 ms in **(A)**, PD = 200 ms in **(B)**. In **(C)** and **(D)** target velocity = 20°/s; PD = 50 ms (magenta), 100 ms (gray), 150 ms (red), or 200 ms (black). […] lines denote 650 ms after target onset; note that for these examples target extinction occurred at a different time for each data series. PD = initial target exposure duration; ED = duration of target extinction. From Barnes and Collins (2008a).

across all six subjects for PD = 150 ms for each target velocity [5 (red), 10 (green), 15 (magenta), and 20°/s (blue)]. Gray shading indicates period of target extinction. **(C)**. Example response from single subject during first (blue trace) and second presentations (red trace) of the Initial Extinction

from similar Initial Occlusion experiment (Collins and Barnes, 2006, mean of 16 and 24°/s responses) in which response is aligned to audio cue occurring 700 ms before target appearance. Data in **(A–C)** derived from Barnes and Collins (2008a,b).

This experiment also took advantage of the finding that the continuation of the extra-retinal component would be dependent on the expectation of target reappearance by comparing the Mid-ramp extinction condition with a Short Ramp condition in which the target failed to reappear. It was argued that subtraction of the Short-ramp response (**Figure 3D**) from the Mid-ramp response (see **Figure 4A**) should give an indication of the temporal development of the expectation-dependent extra-retinal component. As shown in **Figure 4A**, because the initial visually-driven components of Mid-ramp and Short-ramp responses were very similar, their effect was eliminated, revealing that the difference signal increased with time with much lower acceleration than the visually-driven component. Importantly, eye velocity at the end of occlusion increased with target velocity (**Figure 4B**), thus providing evidence that target velocity had been sampled during the initial presentation and held as a reference level in a form of working memory.
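The subtraction of the Short-ramp response from the Mid-ramp response can be sketched in a few lines. This is only an illustration of the arithmetic: the function name `difference_signal` and the sample values are hypothetical and are not data from Barnes and Collins (2008a); the sketch assumes both traces are averaged eye-velocity records sampled on the same time base.

```python
def difference_signal(mid_ramp, short_ramp):
    """Subtract the Short-ramp trace from the Mid-ramp trace, sample by sample.

    Both inputs are eye-velocity traces (deg/s) on an identical time base.
    Where the shared visually-driven component dominates, the result is near zero;
    what remains is taken as the expectation-dependent component.
    """
    if len(mid_ramp) != len(short_ramp):
        raise ValueError("traces must share the same time base")
    return [m - s for m, s in zip(mid_ramp, short_ramp)]


# Hypothetical eye-velocity samples (deg/s), made up for illustration only.
mid_ramp_trace = [0.0, 4.0, 8.0, 9.0, 10.0]    # target expected to reappear
short_ramp_trace = [0.0, 4.0, 8.0, 6.0, 3.0]   # target does not reappear

expectation_component = difference_signal(mid_ramp_trace, short_ramp_trace)
print(expectation_component)  # [0.0, 0.0, 0.0, 3.0, 7.0]
```

In this made-up example the first three samples cancel because the early, visually-driven part of the two responses is identical, which is the reasoning behind the subtraction.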

#### **SIMILARITY OF EXTRA-RETINAL PURSUIT COMPONENT AND ANTICIPATORY SMOOTH PURSUIT**

In an attempt to determine how the extra-retinal component might develop in the complete absence of initial retinal input a further experiment was devised (Barnes and Collins, 2008b). Subjects initially fixated a stationary target for a randomized period of 500–1000 ms. The target was then extinguished for 600 ms but the subject was informed that target extinction indicated that its unseen motion had started; thus, when it reappeared, the target was already in an eccentric position and in motion. This paradigm was referred to as the Initial Extinction condition. Since target velocity was unknown at the start of motion, the stimulus was presented in blocks of repeated, identical stimuli, but target speed and direction were randomized between blocks. In the first presentation of each block the response started ∼100 ms after target appearance whereas in the second and subsequent presentations anticipatory smooth movements were made during the initial occlusion (**Figure 4C**). These anticipatory responses were initiated with a mean latency of 196 ms after the offset of fixation, i.e., ∼50–60 ms after the onset of the visually-driven response, and once initiated, exhibited a relatively slow build-up of eye velocity during the remaining occlusion. Eye velocity at the end of occlusion increased significantly with target velocity, in line with previous observations relating to anticipatory eye movements (Collins and Barnes, 2006). These anticipatory responses could not be distinguished from the difference signal (**Figure 4B**) described above, for the same subjects and target velocities (Barnes and Collins, 2008b). Furthermore, in an attempt to mimic conditions that are more representative of the underlying processes in the Mid-ramp condition, a modified technique was subsequently developed, in which the initial velocity estimate had to be obtained from a single Short-ramp presentation [i.e., a brief sample (150 ms) of target motion]. 
This was followed by a period of fixation prior to presentation of a single Initial Extinction condition (Ackerley and Barnes, 2011). This method yielded very similar anticipatory responses in the Initial Extinction condition. This study was conducted in both head-fixed and head-free pursuit conditions; it demonstrated that subjects are able to store target motion information in each Short-ramp presentation and use it to initiate appropriately scaled anticipatory movements of both head and gaze in the Initial Extinction condition.

The picture that develops from these findings is that when the subject attempts to follow a randomized ramp stimulus the retinal and extra-retinal components operate as shown in **Figure 4D**. The retinal component has a latency of ∼100–130 ms, but when initiated, has relatively high acceleration and allows eye velocity to reach target velocity in 200–300 ms. The underlying extra-retinal component starts ∼50 ms later and develops more slowly, probably taking around 500–600 ms to reach peak velocity. Evidence suggests that it is a much noisier estimate of target velocity than that provided by visual feedback (Ackerley and Barnes, 2011). As the extra-retinal component develops it gradually takes over from the retinal component, which then diminishes toward zero as a natural consequence of its dependency on the rapidly decreasing retinal velocity error. The extra-retinal component is not a trivial proportion of the total response; it can reach gains *>*0.6 prior to target appearance [see cyan trace in **Figure 4D**; data from Collins and Barnes (2006)]. Importantly, this does not mean that the retinal component is eliminated; it still remains active in most circumstances and can correct for unexpected changes in the stimulus. Hence, when transient target motion probes are used during steady state pursuit the expected reactive response is still evoked (Schwartz and Lisberger, 1994).
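The interplay described above, in which a fast retinal component is gradually replaced by a slower extra-retinal component, can be illustrated with a deliberately simplified simulation. This is a toy sketch and not the model of **Figure 2**: the first-order dynamics, the gain `k_retinal`, the time constants and the neglect of oculomotor plant dynamics are all assumptions chosen for illustration; only the ∼100 ms visual latency and the ∼50 ms later onset of the extra-retinal component are taken from the text.

```python
def simulate_pursuit(target_velocity=10.0, duration_s=3.0, dt=0.001,
                     visual_delay_s=0.1, extra_onset_s=0.15,
                     k_retinal=1.5, tau_retinal_s=0.1,
                     beta=1.0, tau_extra_s=0.2):
    """Toy simulation of eye velocity as retinal + extra-retinal components.

    All parameter values are illustrative assumptions, not fitted values.
    Returns three lists (eye, retinal, extra) of velocities in deg/s.
    """
    n = int(round(duration_s / dt))
    delay = int(round(visual_delay_s / dt))
    extra_onset = int(round(extra_onset_s / dt))
    eye = [0.0] * (n + 1)
    retinal = [0.0] * (n + 1)
    extra = [0.0] * (n + 1)
    for i in range(n):
        if i >= delay:
            # Retinal slip available now is the slip that occurred one delay ago.
            slip = target_velocity - eye[i - delay]
            # Efference copy plus retinal error reconstructs target velocity.
            velocity_estimate = slip + eye[i - delay]
        else:
            slip = 0.0
            velocity_estimate = 0.0
        # Fast component driven by retinal velocity error (Euler step).
        retinal[i + 1] = retinal[i] + dt * (k_retinal * slip - retinal[i]) / tau_retinal_s
        # Slower component charging toward the gated velocity estimate.
        if i >= extra_onset:
            extra[i + 1] = extra[i] + dt * (beta * velocity_estimate - extra[i]) / tau_extra_s
        else:
            extra[i + 1] = extra[i]
        eye[i + 1] = retinal[i + 1] + extra[i + 1]
    return eye, retinal, extra


eye, retinal, extra = simulate_pursuit()
for t_ms in (100, 200, 500, 3000):
    print(t_ms, round(eye[t_ms], 2), round(retinal[t_ms], 2), round(extra[t_ms], 2))
```

With the expectation gain `beta` set to 1, the steady state of this toy system has the extra-retinal component carrying the whole response while the retinal component decays toward zero, because the retinal velocity error that drives it vanishes.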

#### **TARGET SELECTION AND GAIN CONTROL**

When humans are confronted with multiple moving stimuli (e.g., a typical street scene) they must select which particular moving object to pursue. One way to accomplish this would be to enhance the visual feedback of the selected object in relation to other stimuli by increasing the open-loop gain (K in the model, **Figure 2**) associated with that target. Evidence for such gain increases comes from experiments in which clear differences have been shown in the magnitude of responses evoked by active pursuit as opposed to passive stimulation in which the subject simply stares at the moving target (Barnes and Hill, 1984; Barnes and Crombie, 1985; Pola and Wyatt, 1985). It has also been shown that when a high frequency (e.g., 5 Hz) single cycle perturbation is superimposed on constant velocity target motion the eye velocity gain associated with the perturbation increases with target velocity in both monkey (Schwartz and Lisberger, 1994) and man (Churchland and Lisberger, 2002). Once the pursuit target has been selected and the eye moves across the remaining non-selected stimuli, the passive response induced should reduce pursuit velocity. Such interactions can be demonstrated for pursuit against large backgrounds, although the decrease is normally no more than 10–20% (Yee et al., 1983; Collewijn and Tamminga, 1984; Kowler et al., 1984; Barnes and Crombie, 1985; Worfolk and Barnes, 1992; Kasahara et al., 2006). Although this type of interaction explains some behavior in the steady state, there are clearly other mechanisms at play (Keller and Khan, 1986; Kimmig et al., 1992; Mohrmann and Thier, 1995).

Surprisingly, even quite small targets (or distracters) can have a passive influence on smooth pursuit (Cheng and Outerbridge, 1975; Barnes and Hill, 1984). When a simple distracter is presented simultaneously with a pursuit target an attention-modulated selection process occurs before pursuit initiation. Ferrera and Lisberger (1995, 1997) showed that the initial open-loop response is a vector average of the responses that would be made to individual stimuli. After an initial period (∼50 ms) a saccade is made to one of the targets and post-saccadic eye velocity is then in the direction of the selected target. If the distracter moves in the opposite direction or is stationary an increase in latency alone is observed (Lisberger and Ferrera, 1997; Knox and Bekkour, 2004). In general, changes in pursuit gain observed in the presence of backgrounds or distracters result from physical characteristics such as size and peripheral location of competing stimuli and, perhaps most importantly, from the influence of attention (Kerzel et al., 2008), which raises the gain for the selected target and/or suppresses the gain for competing stimuli.
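The vector-averaging account of the initial open-loop response can be made concrete with a small sketch. The function name `vector_average` and the equal weighting of the two stimuli are assumptions made for illustration; the source states only that the response is a vector average of the responses to the individual stimuli.

```python
def vector_average(velocities):
    """Return the equally weighted mean of 2-D velocity vectors (deg/s).

    Each velocity is an (horizontal, vertical) pair. Equal weighting is an
    assumption of this sketch, not a claim about the underlying mechanism.
    """
    n = len(velocities)
    if n == 0:
        raise ValueError("at least one velocity vector is required")
    vx = sum(v[0] for v in velocities) / n
    vy = sum(v[1] for v in velocities) / n
    return (vx, vy)


# Rightward and upward stimuli at 10 deg/s: the average points obliquely.
print(vector_average([(10.0, 0.0), (0.0, 10.0)]))   # (5.0, 5.0)

# Two stimuli moving in opposite directions at the same speed cancel out.
print(vector_average([(10.0, 0.0), (-10.0, 0.0)]))  # (0.0, 0.0)
```

The second case shows why equal and opposite motion yields no net initial smooth response under pure averaging, so that target selection has to resolve the conflict.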

#### **UPDATING THE PURSUIT MODEL**

If, as we propose, the extra-retinal component underlying the maintenance phase is produced by the same mechanism as anticipatory pursuit, it is necessary to suggest how this might be incorporated in a more general model of pursuit. This requires additional features to be added to the efference copy model, notably the inclusion of a second internal loop, the *predictive* pathway (**Figure 2**). Whereas the *reactive* pathway is assumed to function during randomized responses and generates an extra-retinal component scaled to the initial target velocity as shown in **Figure 4**, the *predictive* pathway holds velocity samples captured during prior stimulation in a form of working memory (MEM). This *predictive* pathway enables motion information to be retained during fixation and thereby allows appropriately scaled anticipatory movements to be released in advance of future eye movement, given appropriate expectation of target appearance. The results of numerous experiments (Barnes and Asselman, 1991; Kao and Morrow, 1994; Barnes and Donelan, 1999) have shown that anticipatory movements evoked by repeated motion stimuli have a velocity that is scaled in proportion to target velocity, even when the subject fixates a stationary target and simply views but does not pursue the initial presentation (Barnes et al., 1997, 2000; Burke and Barnes, 2008a). This implies that target speed information can be stored independently of ongoing eye movement, a feature that can be accomplished by assuming that the target velocity estimate is internally reconstructed by summation of efference copy and retinal error independently of the main oculomotor drive, as shown in **Figure 2** (junction C). Results of experiments using complex motion stimuli comprising sequences of ramps with randomized speed and direction (Barnes and Schmid, 2002; Collins and Barnes, 2005; Burke and Barnes, 2007) have shown that multiple levels of velocity may be retained within MEM.
The output of variable levels of stored information from MEM over time may constitute a basis for the dynamic representation of target motion described by Orban de Xivry et al. (2008).

If stored motion information is to be used effectively for prediction it needs to be released at an appropriate time to minimize velocity error. The release of the output from MEM is dependent on timing that can be derived from external cues (Boman and Hotson, 1988; Barnes and Donelan, 1999; Jarrett and Barnes, 2005) or cues derived from the motion itself if it is periodic (Barnes and Asselman, 1991). Timing is of importance not only for response initiation but also for its termination. Even for a simple ramp stimulus of known duration there is a tendency to reduce eye velocity in anticipation of the ramp termination (Robinson et al., 1986; Kowler and McKee, 1987; Boman and Hotson, 1988) as shown by the control examples in **Figure 4**. Krauzlis and Miles (1996) showed that the dynamics of pursuit offset are significantly affected by the subject's experience. When ramp stimuli of identical duration are repeated, timing becomes pre-programmed, so that an unexpected increase in duration results in inappropriate eye velocity reduction for ∼400 ms (Barnes et al., 2005).

An important feature of the model (**Figure 2**) is that output from the *reactive* and *predictive* loops is gated by expectation, which is represented by the variable gain β (≤1). This includes a mechanism for detecting mismatch between the predictive velocity and available visual feedback. This reflects the fact that, in anticipatory mode, the system has changed from being one that relies on visual feedback to one that generates a predictive estimate of the required motor drive and uses feedback to check that this is correct. Importantly, it would not be possible for the *reactive* and *predictive* pathways to operate simultaneously since this would overestimate target velocity, so it must be assumed that activation of the *predictive* pathway automatically leads to inhibition of the *reactive* pathway. This model has been used to provide a very effective simulation of responses in the Mid-ramp, Short ramp, Initial Extinction and Control conditions (Barnes and Collins, 2008b). It should also be noted that, by incorporating two working memory components, one that holds current motion information (S/H) and another that holds prior information (MEM), this model provides the necessary structures for motion perception tasks in which current and prior motion stimuli are compared (Greenlee et al., 1995).

### **NEURAL SUBSTRATE OF WORKING MEMORY AND MOVEMENT PREPARATION FOR SMOOTH-PURSUIT**

#### **MAJOR PATHWAYS RELATED TO SMOOTH-PURSUIT EYE MOVEMENTS**

**Figure 5** depicts major pathways for smooth-pursuit (for reviews, see Lisberger et al., 1987; Robinson and Fuchs, 2001; Leigh and Zee, 2006). Briefly, the medial superior temporal (MST) cortical area is essential. From there, output signals are sent in two directions. One is to the pontine nuclei, primarily the dorso-lateral pontine nuclei (DLPN); from there, signals are sent through the cerebellar floccular region, which includes the flocculus and ventral paraflocculus (Gerrits and Voogd, 1989), to the vestibular nuclei. The other direction is to the frontal cortex, which includes the caudal part of the frontal eye fields (caudal FEF) and supplementary eye fields (SEF). From there, signals are sent to the nucleus reticularis tegmenti pontis (NRTP) and then through the cerebellar dorsal vermis lobules VI–VII (i.e., oculomotor vermis) and its output region (i.e., the caudal fastigial nucleus, see Noda, 1991, for a review) to the vestibular nuclei.

Output signals from the vestibular nuclei are sent directly, and also indirectly through the nucleus prepositus hypoglossi (NPH) or interstitial nucleus of Cajal (INC), to extraocular motoneurons (**Figure 5A**). These indirect pathways are involved in integration of eye velocity signals to eye position, common for all conjugate eye movements that consist of smooth-pursuit, saccades, optokinetic eye movements, and vestibulo-ocular reflex (VOR) (i.e., common neural integrator, Robinson, 1975; for a review, Fukushima et al., 1992). The cerebellar flocculus is also necessary for the neural integrator function (see Leigh and Zee, 2006 for a review). Signals in the cerebellar nuclei and vestibular nuclei are also sent to the cerebral cortex through the thalamus (**Figure 5A**, see Ito, 1984 for a review; also, Kyuhou and Kawaguchi, 1987; Noda et al., 1990; Fukushima, 1997).

Smooth-pursuit is required even when our head and/or whole body moves (for review see Barnes, 1993). Consistent with this requirement, vestibular-related signals induced by whole body rotation/translation, which activates semi-circular canals/otolith organs, are found in wide areas of the cerebral cortex including virtually all brain regions related to smooth-pursuit (**Figure 5A**; for reviews, Fukushima et al., 2011a; Goldberg et al., 2012; also Miyamoto et al., 2007; Schlindwein et al., 2008 for functional magnetic resonance imaging (fMRI) studies using high intensity clicks that selectively stimulate the sacculus).

#### **MEMORY-BASED SMOOTH-PURSUIT**

As noted earlier, efficient pursuit requires selection of the target to be pursued and predictive compensation for inherent delays in responses to target motion to ensure clear vision of the target. Prediction is influenced by various factors such as cues and working memory of stimulus trajectory (e.g., Badler and Heinen, 2006; Barnes and Collins, 2011; see Barnes, 2008, for review). Prediction could occur not only in motor commands to prepare for and maintain ongoing movements but also in the sensory and/or perception pathways (e.g., Barborica and Ferrera, 2003). Such mechanisms use memory (e.g., Assad and Maunsell, 1995; see section "Major Cognitive Influences on Pursuit Behavior"). However, our understanding of neural mechanisms of predictive pursuit is still incomplete.

Prediction-related neuronal discharge during smooth-pursuit was reported in the SEF (Heinen, 1995; Heinen and Liu, 1997; Kim et al., 2005; de Hemptinne et al., 2008) and caudal FEF (e.g., MacAvoy et al., 1991; Fukushima et al., 2002). Prediction-related activation of these areas during smooth-pursuit was also reported by fMRI in humans (e.g., Schmid et al., 2001; Burke and Barnes, 2008b). However, in these studies, activation related to preparation for pursuit eye movements could not be separated from activation related to processing of target motion signals or their working memory. Moreover, in daily life, a specific target must be selected from multiple moving objects, requiring decisions and selection of whether and what to pursue. Although the notion that the caudal FEF issues pursuit commands is well supported (MacAvoy et al., 1991; for a review, Fukushima et al., 2011a), the precise roles of the FEF and SEF in predictive pursuit were largely unknown.

To examine neuronal substrates for predictive pursuit, it is necessary to separate visual motion memory and movement preparation. For this, we employed a memory-based smooth-pursuit task that used two cues and two delay periods (**Figure 6A**; Shichinohe et al., 2009; Fukushima et al., 2011a,b): cue 1 indicated the visual motion-direction and cue 2 instructed whether to prepare to pursue (i.e., *go*) or not to pursue (i.e., *no-go*). Based on the memory of visual motion-direction presented at cue 1 (**Figure 6A2**) and the *go*/*no-go* instruction presented at cue 2 (**Figure 6A4**), monkeys were trained to decide which of two oppositely-directed targets should be selected and whether to pursue or not to pursue (by maintaining fixation of a third stationary spot) during the action period (**A6**, for further task explanation, see legend of **Figure 6**). This task thus invokes most of the mechanisms discussed in the section "Major Cognitive Influences on Pursuit Behavior."

were required to fixate it (**1**. fixation). Cue 1 consisted of a random-dot pattern of 10° diameter. All 150 dots moved along one of eight directions at 10°/s for 0.5 s [**2**. cue 1, 100% correlation of Newsome and Pare (1988)]. Visual motion-direction was randomly presented. The monkeys were required to remember both the color of the dots and their movement direction while fixating the stationary spot. After a delay (**3**. delay 1), a stationary random-dot pattern was presented as the 2nd cue for 0.5 s (**4**. cue 2). If the color of the stationary cue 2 dots was the same as the cue 1 color, it instructed the monkeys to prepare to pursue a spot that would move in the direction instructed by cue 1 (i.e., *go*). If the color of cue 2 differed from cue 1, it

the monkeys were required to execute the correct action by selecting one of three spots and either pursuing the correct spot in the correct direction or maintaining fixation (**6**. action). For this, the stationary spot remained centered, but spawned two identical spots; one that moved in the direction instructed by cue 1 and the other moved in the opposite direction at 10°/s. For correct performance, the monkeys were rewarded. For analysis, all trials were sorted by cue 1, cue 2 direction/instructions. **(B)** eye movement records during early and late training when cue 1 was rightward and cue 2 was *go*. Pos and vel indicate position and velocity. For further explanation, see text. Modified from Fukushima et al. (2008, 2011a) and Shichinohe et al. (2009).

**Figure 6B** shows eye movement records of a representative monkey during early and late training when cue 1 was rightward and cue 2 was *go* (Fukushima et al., 2008, 2011a). Early in their training (typically after 6–8 months), monkeys learned the task basics with error rates of less than 10% for *go* and *no-go* trials. As illustrated in **Figure 6B1**, the monkey initiated the final action by saccades (but not by smooth-pursuit) with latencies typically 260–300 ms (**B1**, upward arrow), and these saccades were followed by smooth-pursuit. The lack of an initial smooth-pursuit component before saccades (**B1**, downward arrow) was consistent with the finding that vector averaging was used to combine visual inputs arising from two moving spots (Lisberger and Ferrera, 1997); in our task, visual motion inputs arising from the two oppositely moving spots with the same speed during the action period (**Figure 6A6**, e.g., leftward vs. rightward) would have been nullified (also Garbutt and Lisberger, 2006). Saccades to the cued direction during early training (**Figure 6B1**) must have enhanced visual motion processing of the pursuit target in that direction so that smooth-pursuit was effectively induced after saccades (i.e., postsaccadic enhancement of pursuit initiation, Lisberger, 1998; Ogawa and Fujita, 1998).

Later (typically after a year of training), saccade latency to spot motion shortened usually to about 220 ms, and preceding the saccades, initial smooth-pursuit appeared with latencies typically of 130–150 ms (**Figure 6B2**, arrow). This indicates that the acquisition of working memory and the appearance of the initial smooth-pursuit before saccades in this task are separate processes (**Figures 6B1**,**B2**; see section "Parkinson's Disease"); the latter required further training for efficient and nearly "automatic" tracking performance. Shortening of initial saccade latencies and appearance of the initial pursuit component in the late training are consistent with the interpretation that these responses were induced by priming effects of cue 1 direction memory and cue 2 *go* instruction (Bichot and Schall, 2002; Garbutt and Lisberger, 2006; see section "Representation of directional visual motion-memory and movement-preparation signals in the frontal cortex").

#### **NEURONAL ACTIVITY IN THE MAJOR PATHWAYS RELATED TO SMOOTH-PURSUIT**

#### *Representation of directional visual motion-memory and movement-preparation signals in the frontal cortex*

Using the memory-based smooth-pursuit task, signals for directional visual motion-memory and movement-preparation have been identified in the SEF and caudal FEF. Three groups of neurons were found; two of them carried these signals separately (visual memory neurons, movement-preparation neurons) and the third carried both signals (visual memory + movement-preparation neurons). Although the two regions carried qualitatively similar signals, consistent with the anatomical studies that show reciprocal connections between the SEF and FEF (Huerta et al., 1987), there were significant quantitative differences in the task-related signals represented in the two areas (see **Figure 7** legend for the definition of task-related neurons). SEF visual memory neurons were unrelated to pursuit, whereas some FEF visual memory neurons were pursuit neurons (Shichinohe et al., 2009; Fukushima et al., 2011b).

*Visual memory neurons.* Visual memory neurons exhibited direction-specific discharge during delay 1. An example SEF neuron (**Figure 7**) responded when rightward (but not leftward) visual motion was presented at cue 1; responses to cue 1 and during delay 1 were similar during *go* and *no-go* trials (**B1**,**B2** vs. **C1**,**C2**). The delay 1 discharge was not significantly influenced by the monkey's preparation of pursuit (**B1** vs. **C1**). This was also seen when the monkey erred (**Figure 7B1**, red trace in eye pos) by performing leftward (instead of rightward) pursuit. Despite this error, discharge similar to that during correct trials was clearly observed during delay 1 (**B1**, red raster). Moreover, it did not exhibit directional responses during delay 2 of *go* (**B3**, blue vs. black) or *no-go* trials (**C3**, blue vs. black). These results suggest that the delay 1 activity of visual memory neurons reflected memory of the visual motion-direction presented by cue 1. Although it exhibited a build-up activity during *go* trials (**Figures 7B1**,**B2**), it is unlikely that the activity was used directly for movement preparation, since it was non-directional (**Figure 7B3**).

Possible neural correlates for the putative priming effects by cues during the action period (**Figure 6B2**, arrow, section "Memory-Based Smooth-Pursuit") are suggested in **Figures 7B**,**C** for this SEF visual memory neuron that had rightward preferred direction to cue 1 visual motion (**B1**, **C1**). Since this neuron was unrelated to pursuit (**Figures 7B1**,**B2**, action), the initial burst during the action period of *go* trials (**B1**, downward arrow) must have reflected visual response to rightward spot motion. Notice the selective burst discharge to the identical visual motion stimuli during the action period, i.e., the clear burst during the action period appeared only in **Figure 7B1** when cue 1 visual motion was rightward and cue 2 instruction was *go* (vs. **B2**, **C1**,**C2**), indicating that the spot motion responses clearly depended on the visual motion-direction memory and *go*/*no-go* instructions. This interpretation is confirmed in **Figure 7D**; discharge to spot motion clearly occurred before the onset of the initial smooth eye velocity (D, red arrow before eye onset vs. other conditions D, E). Similar modulation of spot motion responses during the action period by cues was also observed in visual motion responses of some caudal FEF pursuit neurons (Figures 2F–I of Fukushima et al., 2011b).

*Visual memory* **+** *movement-preparation neurons.* Visual memory + movement-preparation neurons exhibited direction-specific discharge during both delay 1 and delay 2. An example SEF neuron (**Figures 8A1**–**A4**) showed clear discharge during the late period of delay 1 when leftward visual motion was presented at cue 1 during *go* and *no-go* trials (**A1** vs. **A2**, **A3** vs. **A4**). In addition, when cue 2 instructed the monkey to prepare to pursue in the congruent direction (*go*, **A1**), it exhibited robust discharge during the late period of delay 2. **Figure 8B** compares the time course of mean discharge of visual memory neurons (red) and visual memory + movement-preparation neurons (blue) in the SEF during *go* trials in their preferred directions. While the initial response to cue 1 for visual memory neurons (B, red) was larger, the two groups of neurons displayed similar discharge during delay 1 and cue 2. During delay 2, the discharge of the two groups of neurons diverged.

Visual memory + movement-preparation neurons exhibited congruent directionality during delay 1 and delay 2 of *go* trials (**Figures 8A1**,**B**, blue). Our results suggest that the delay 1 information about the visual motion-direction is used for further processing in preparing for pursuit direction in the SEF (Shichinohe et al., 2009). This interpretation was examined in the following experiments. First, to examine how delay 1 and 2 responses were correlated, we let the monkeys choose the pursuit direction and examined how these neurons discharged during these periods. For this, we used the paradigm devised by Newsome and Pare (1988, 0% correlation) that moved each dot randomly in different directions at cue 1. In this condition, cue 1 does not provide the necessary information about the visual motion-direction. If the color of cue 2 was the same as cue 1, it instructed *go* and the monkey followed one of the two moving spots. If the color of cue 2 was different from that of cue 1, it instructed *no-go*, and the monkeys maintained fixation. Each trial was sorted based on the monkeys' choice of either the preferred direction of delay 2 activity or the anti-preferred direction of the neuron (tested by 100% correlation).

**Figure 8C** plots sorted trials during 0% correlation for leftward pursuit (**C1**), rightward pursuit (**C2**) and *no-go* (**C3**) of the same neuron (A). When the monkey made leftward pursuit (i.e., in the preferred direction of this neuron at 100% correlation, **Figure 8A**), discharge during delay 2 was much stronger compared to the trials where the monkey made rightward pursuit (**C1** vs. **C2**), indicating that the delay 2 activity indeed reflected preparation for pursuit. In addition, the stronger discharge during delay 1 in the same trials (**C1** vs. **C2**) suggests that this discharge during delay 1 was also related to the monkey's choice and preparation for the subsequent pursuit direction independent of the cue 1 stimulus itself, which was non-directional.

#### **FIGURE 7 | Continued**

the task **(A2–7)** were associated with modulated neuronal activity, mean discharge rates of individual neurons were measured during the different task periods for the correct response [e.g., **(C1)**, periods 2–7], and were compared with the mean rate (± SD) during the initial fixation [**(C1)**, period 1] for each neuron. Significant differences were defined as those having a *p*-value <0.05 using Student's t test with the Bonferroni correction for multiple comparisons. Neurons that exhibited significant modulation during this task were defined as task-related neurons. (**D** and **E**) de-saccaded and averaged eye velocity and discharge of this neuron 500 ms before and 1000 ms after spot motion onset (vertical straight line) during

Second, to evaluate these results, we calculated choice probability (Britten et al., 1996) and its time course based on whether the monkeys pursued in the preferred direction of the neuron (tested by 100% correlation) or anti-preferred direction. The results for 10 SEF visual memory + movement-preparation neurons are plotted in **Figure 8D**. Mean choice probability values (which were ∼0.5 before cue 1) increased above 0.7 during delay 1 and delay 2. For comparison, the time course of choice probability of the 10 neurons during 100% correlation is plotted in **Figure 8E** (black). Also plotted in green (**Figure 8E**) is the choice probability time course of the same 10 neurons when a stationary pattern (i.e., 0°/s) was presented at cue 1. The 3 curves (**Figures 8D**,**E**) were basically similar, indicating that delay 1 discharge was not a simple holding of visual motion response; the delay 1 response did not require visual motion stimuli, but reflected motion-direction assessment and memory (Fukushima et al., 2011a).
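Choice probability is commonly computed as the area under the ROC curve that separates a neuron's firing rates on trials ending in one choice from those ending in the other. The sketch below uses the equivalent pairwise (Mann-Whitney) formulation; the function name, the tie handling and the example firing rates are assumptions for illustration and are not taken from the analysis code of the cited studies.

```python
def choice_probability(rates_preferred, rates_anti_preferred):
    """Area under the ROC curve for two sets of firing rates (spikes/s).

    Equals the probability that a randomly drawn rate from a preferred-choice
    trial exceeds one from an anti-preferred-choice trial, counting ties as 0.5.
    A value of 0.5 means the discharge carries no information about the choice.
    """
    if not rates_preferred or not rates_anti_preferred:
        raise ValueError("both groups need at least one trial")
    score = 0.0
    for p in rates_preferred:
        for a in rates_anti_preferred:
            if p > a:
                score += 1.0
            elif p == a:
                score += 0.5
    return score / (len(rates_preferred) * len(rates_anti_preferred))


# Hypothetical delay-period firing rates, made up for illustration only.
print(choice_probability([5, 6, 7], [1, 2, 3]))  # 1.0 (fully separated)
print(choice_probability([1, 2], [1, 2]))        # 0.5 (indistinguishable)
```

Applying such a function in successive time bins would give a time course of the kind described for **Figures 8D**,**E**, with values near 0.5 before cue 1.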

The congruent directionality of delay 1 and 2 discharge of visual memory + movement-preparation neurons was also observed when moving two spots stepwise during the action period so that the monkeys made saccades instead of smooth-pursuit (Shichinohe et al., 2009). These results suggest a common mechanism for visual memory and movement preparation for efficient tracking performance that includes both smooth-pursuit and saccades (Krauzlis, 2005).

#### *Similarity and differences of signals represented in the SEF and caudal FEF*

To compare direction-specific discharge modulation during different task periods of *go* trials in the caudal FEF and SEF, **Figure 9A** plots the percent of modulated neurons (out of the total number of task-related neurons in each area) that showed direction-specific modulation in each period (e.g., **Figure 7C1**, periods 2–7). Although qualitatively similar signals were found in both areas, there were quantitatively significant differences between the two areas during delay 1 and action period (**Figure 9A** ∗, Fukushima et al., 2011b); the percent of modulated neurons in the caudal FEF was significantly lower than that of the SEF during delay 1 but higher than that of the SEF during the action period. No significant difference between the two areas was detected in other periods including the delay 2 of *go* trials where movement-preparation is required.

the action period. Smooth-pursuit onset is indicated by a dashed line. Only correct trials were averaged for *go* **(D)** and *no-go* conditions **(E)** as indicated by colors. See text for further explanation. Reproduced and modified from Shichinohe et al. (2009) and Fukushima et al. (2011a).

FEF neurons exhibit visual latencies comparable with those in the middle temporal area (MT) and MST and sometimes even as early as some neurons in V1 (Schmolesky et al., 1998). Comparison of visual latencies of neurons that exhibited directional visual motion responses to cue 1 indicates that neurons with shorter visual latencies were significantly more frequent in the caudal FEF than the SEF (**Figure 9B**, Fukushima et al., 2011b). To examine how the difference between the two areas during delay 1 that signals directional visual motion-memory was reflected in the time course of mean discharge, **Figure 9C** plots discharge of caudal FEF neurons that exhibited directional responses to cue 1 in their preferred (green) and anti-preferred direction (black) during *go* trials. Although caudal FEF neurons exhibited a residual visual motion response to cue 1 that reflected directional visual motion-memory at the beginning of delay 1, the responses returned to control level near the end of delay 1 before cue 2 onset (**Figure 9C**, arrow). This contrasts with the discharge of SEF neurons that exhibited directional responses to cue 1 visual motion; cue 1 discharge was maintained during the whole delay 1 period (**Figure 9E**, arrow).

*No-go neurons. No-go* neurons exhibited *no-go* instruction-specific discharge during delay 2 of *no-go* trials (Shichinohe et al., 2009). The proportion of *no-go* neurons (of the total number of task-related neurons) was significantly higher in the SEF than caudal FEF (50/248 = 24% vs. 16/185 = 9%, Fukushima et al., 2011b). As shown in **Figure 10A**, this example *no-go* neuron in the SEF exhibited discharge during the action period of *go* trials, regardless of the pursuit direction (**A1**). When the cue 2 instruction was *no-go* (**Figure 10A2**), it exhibited a stronger discharge during cue 2 and delay 2. The difference in discharge modulation during these periods is clear in the mean discharge rates during *no-go* and *go* trials (**Figure 10B**, red vs. black). Furthermore, when the monkey erred during the action period of a *no-go* trial by pursuing a leftward moving spot (**A2**, red trace), this *no-go* neuron nearly stopped discharging at cue 2 and during delay 2, suggesting that the discharge during these periods reflected the monkey's decision not to pursue during *go* trials. This interpretation was supported by the analysis of choice probability (Britten et al., 1996; Zaksas and Pasternak, 2006) during delay 2 with respect to the monkeys' choice based on whether they maintained fixation (i.e., *no-go*) or pursued a moving spot, regardless of its direction (**Figures 10A1** vs. **A2**). The choice probability increased to ∼0.8 after cue 2 and decreased during the action period (**Figure 10C**). Latencies of *no-go* discharge relative to cue 2 onset were distributed widely with modal values of 160 ms for SEF and 180 ms for caudal FEF (Shichinohe et al., 2009; Fukushima et al., 2011b).

*No-go* related SEF discharge during delay 2 was also observed when monkeys performed memory-based saccades (**Figures 10D1** vs. **D2**, Shichinohe et al., 2009). Discharge characteristics of *no-go* neurons in the caudal FEF were similar to SEF *no-go* neurons (**Figure 10E**), suggesting that *no-go* signals in SEF and caudal FEF were common during delay 2 that requires *no-go* instruction memory (**Figure 6A5**) for memory-based smooth-pursuit and saccades (**Figures 10A**,**D**,**E**, also Krauzlis, 2005).

leftward pursuit **(C1)** and rightward pursuit **(C2)** during action period. **(C3)** *No-go* trials. (**D** and **E**) plot mean (±SE) choice probability time course of 10 SEF visual memory + movement-preparation neurons during *go* trials based on whether the monkeys pursued in the preferred directions of individual neurons during delay 2 when cue 1 was presented with 0% correlation **(D)** and 100% correlation (**E**, black). Green traces in **(E)** are mean (±SE) choice probability time course of the same 10 neurons when a stationary pattern was presented at cue 1 (0°/s). For further explanation, see text. Reproduced and modified from Shichinohe et al. (2009) and Fukushima et al. (2011a).

**(E)** Mean ± SE discharge of 27 SEF neurons that exhibited directional visual motion response to cue 1 during *go* trials. In (**C** and **E**), green and black traces are discharge modulation in the preferred direction and anti-preferred direction, respectively. (**D** and **F**) Mean ± SE discharge of movement-preparation neurons in the caudal FEF **(D)** and SEF **(F)** during *go* trials. Blue and black traces are discharge modulation in the preferred direction and anti-preferred direction, respectively. (**D** and **F**) Reproduced from Shichinohe et al. (2009). (**G** and **H**) Reproduced from Kurkin et al. (2011). Others, reproduced from Fukushima et al. (2011a,b).

None of the *no-go* neurons tested exhibited discharge modulation during simple pursuit using a single spot (**Figure 10F**), indicating that *no-go* neurons differ from pursuit neurons. *No-go* neurons are also different from fixation neurons in the FEF (Izawa et al., 2009, 2011), since *no-go* neurons in the memory-based pursuit/saccade task exhibited significant discharge only after cue 2 but not before (e.g., during cue 1 or delay 1), even though the monkeys fixated a stationary spot during these periods (**Figures 10A**–**E**). *No-go* neurons were reported in a saccadic *go/no-go* task in the dorsomedial frontal cortex (Mann et al., 1988) and prefrontal cortex and FEF (Hasegawa et al., 2004).

*Movement-preparation neurons.* Movement-preparation neurons exhibited direction-specific discharge during the delay 2 of *go* trials (Shichinohe et al., 2009). **Figures 9D**,**F** compare discharge modulation of movement-preparation neurons in the caudal FEF (D) and SEF (F); their time courses were similar. There was no significant difference in the percent of movement-preparation neurons (**Figure 9A**, delay 2) between the two areas.

exhibited directional visual motion response to cue 1 during *go* trials.

position record (arrow) and arrow in spike raster highlight an error trial. **(B)** Time course of mean (± SE) discharge of the 24 *no-go* SEF neurons during *no-go* (red) and *go* (black) trials. **(C)** Choice probability time course for

single spot that moved sinusoidally. **(A–D)** Reproduced from Shichinohe et al. (2009) and Fukushima et al. (2011a). **(E,F)** Reproduced from Fukushima et al. (2011a,b).

#### *Other cerebral cortical areas*

Our knowledge of where the SEF visual memory signals are generated is still imprecise. The dorsolateral prefrontal cortex has been linked to temporal storage of sensory signals (i.e., working memory, Goldman-Rakic, 1995). Kim and Shadlen (1999) demonstrated that visual motion responses could be maintained during a delay period in prefrontal cortex neurons. However, in their studies, discharge related to the memory of visual motion could not be separated from discharge related to movement-preparation (also Zaksas and Pasternak, 2006).

Another potential site is MST, since this region, especially the dorsomedial MST (MSTd, Desimone and Ungerleider, 1986), sends direct projections to the SEF (Huerta and Kaas, 1990), and MSTd is involved in perception and memory of visual motion (e.g., Celebrini and Newsome, 1994; Britten and van Wezel, 2002; Gu et al., 2007; Liu and Angelaki, 2009; cf. Heuer and Britten, 2004). However, as illustrated in **Figures 9G**,**H**, representative signals in MSTd clearly differed from those in the SEF during memory-based smooth-pursuit; MSTd neurons signaled visual motion accurately, but none of the 108 MSTd neurons that showed directional visual motion response to cue 1 exhibited direction- and/or instruction-specific discharge during delay 1 or delay 2. Although they did show significantly higher discharge rates during the delay periods compared to the control period (**Figures 9G**,**H**, delay 1 and delay 2), their discharge was not directional (Kurkin et al., 2011), which suggests that their activity during the delay periods most probably reflected an effect of attention (e.g., Recanzone and Wurtz, 2000).

By manipulating visual inputs during pursuit eye movements, Newsome et al. (1988) demonstrated that the extraretinal, pursuit response of MSTd neurons begins at least 50 ms after onset of the smooth-pursuit eye movements, consistent with the behavioral findings of Barnes and Collins (2008a,b). They suggested that this response most likely derives from corollary discharge mechanisms and that MSTd plays a role in generating the motor signals responsible for the maintenance of ongoing pursuit. The results showing lack of movement preparation signals and late onset of MSTd neuron modulation during the action period of *go* trials (**Figures 9G**,**H**, Kurkin et al., 2011) are consistent with their observation (Newsome et al., 1988). The exact origin of the possible corollary discharge to MSTd is still unclear, but multiple brain areas including ventrolateral MST (MSTl, Thier and Erickson, 1992) seem to be involved. In particular, pursuit command signals issued from the caudal FEF could be sent directly to MST through corticocortical projections (Stanton et al., 1995) and also indirectly to MST via the descending pathways including the deep cerebellar nuclei and vestibular nuclei through the thalamus (**Figure 5A**, Schlag and Schlag-Rey, 1986; Tanaka, 2005; also Perrone and Krauzlis, 2008). Although we do not exclude possible alternative types of MSTd neurons coding assessment and memory of visual motion-direction (e.g., Ferrera and Lisberger, 1997), it seems more likely that visual motion-direction information sent from MSTd and caudal FEF to the SEF is further processed within the SEF to create assessment and the memory of visual motion-direction (Fukushima et al., 2011a,b; Kurkin et al., 2011).

#### *Comparison of task-related discharge of the cerebellar oculomotor vermis/caudal fastigial nucleus and the floccular region*

Signals similar to those seen in the SEF and caudal FEF were also represented in the oculomotor vermis/caudal fastigial nucleus and the floccular region, although clear differences were also observed (Fukushima et al., 2011c). In the floccular region, simple spike discharge of most task-related Purkinje cells responded only during the action period of *go* trials. None of them tested (41/44) exhibited significant modulation during delay 1 or 2 of *go* or *no-go* trials, suggesting that the floccular pathway is specifically involved in executing smooth-pursuit eye movements *per se* as reported earlier (Robinson and Fuchs, 2001; Leigh and Zee, 2006; Lisberger, 2009).

In contrast, most task-related Purkinje cells (50/76 = 66%) in the oculomotor vermis showed *no-go* instruction-specific discharge during cue 2 and delay 2 (Fukushima et al., 2011c). Their activity was not modulated during sinusoidal pursuit using a single spot, suggesting that it was unrelated to eye movement *per se*.

In our task, some task-related Purkinje cells (10/76) in the oculomotor vermis were pursuit-related during memory-based pursuit. Discharge characteristics of these neurons during pursuit using a single spot were similar to those reported previously (Robinson and Fuchs, 2001; Leigh and Zee, 2006); some of them also carried visual motion responses including memory and movement preparation-related discharge (Fukushima et al., 2011c).

In the caudal fastigial nuclei (cFN), the major response type (46/77 = 60%) was also *no-go* neurons (Fukushima et al., 2011c). Although neurons with discharge related to eye movement *per se* were in the minority in the memory-based pursuit task, some of them carried visual motion-memory and movement preparation signals. *No-go* neurons are different from omni-pause neurons in the brainstem that are active during fixation by suppressing burst neuron activity (see Leigh and Zee, 2006 for a review), since *no-go* neurons in the memory-based pursuit task exhibited significant discharge only after cue 2 but not before (e.g., during cue 1 or delay 1), similar to *no-go* neurons in the SEF and caudal FEF, even though the monkeys fixated a stationary spot during these periods (**Figure 10**).

What do *no-go* neurons in the oculomotor vermis/cFN signal? We believe that *no-go* neurons in these regions are non-motor neurons that receive inputs from SEF/caudal FEF *no-go* neurons (**Figure 5A**) and signal *no-go* (i.e., not to pursue) memory during delay 2 for the following reasons. (1) Discharge characteristics of *no-go* neurons in all these areas were basically similar (**Figures 10B**,**E**), but mean latencies (re cue 2 onset) of *no-go* responses in the oculomotor vermis/cFN were significantly longer (>250 ms, *p* < 0.001) than those of SEF/caudal FEF *no-go* neurons (Fukushima et al., 2011c). (2) None of the *no-go* neurons tested in the 4 areas exhibited directional eye movement-related discharge (e.g., **Figure 10F**). (3) Mean discharge rates of *no-go* neurons during the action period were similar during *go* and *no-go* trials (e.g., **Figure 10B**), consistent with the results showing that *no-go* neurons coded useful information during delay 2 (judged from choice probability with respect to the monkeys' choice for *no-go* or *go*), but choice probability quickly decreased during the action period. (4) During the action period of *no-go* trials, the monkeys occasionally made small saccades with amplitudes 1–6° (e.g., **Figures 6A**,**B** of Fukushima et al., 2011b). Discharge rates of *no-go* neurons during the delay 2 of such conditions were virtually identical to those when the monkeys fixated the stationary spot well without making such saccades (e.g., **Figures 6A2** vs. **B2** of Fukushima et al., 2011b), suggesting that their discharge during delay 2 was unrelated to the appearance/suppression of small saccades during the action period. We suggest that *no-go* neurons in these 4 areas may form part of a cerebro-cerebellar network (Ito, 1984, **Figure 5A**) for *no-go* memory and are thereby involved in target selection.

In previous studies using conventional pursuit or saccade tasks, monkeys were not required to perform a *go/no-go* selection; *no-go* signals could not be identified. Possible involvement of the oculomotor vermis-caudal fastigial nucleus pathway in working memory for *no-go* instructions in monkeys may be a result of training (section "Memory-Based Smooth-Pursuit") and part of cerebellar involvement in memory (see Ito, 2006, 2011, for reviews). Of note, Vastagh et al. (2005) examined the postnatal development of the Purkinje layer in the mouse cerebellum and showed that the oculomotor vermis belongs to the latest developing cerebellar cortical structures. Coffman et al. (2011) showed that the motor-related frontal cortical areas send massive projections to the cerebellar vermis including the oculomotor vermis. These observations indicate a close functional connection between the frontal cortex and the oculomotor vermis.

#### **CHEMICAL INACTIVATION**

#### *Different effects induced by chemical inactivation of the SEF and caudal FEF*

Significant quantitative differences in signals represented in the two areas (sections "Representation of directional visual motion-memory and movement-preparation signals in the frontal cortex," and "Similarity and differences of signals represented in the SEF and caudal FEF") are consistent with the differences in the effects of chemical inactivation (Shichinohe et al., 2009; Fukushima et al., 2011b). Infusion of the GABA agonist muscimol into the SEF resulted in significantly higher direction errors during *go* trials and *go/no-go* selection errors during *no-go* trials. Such errors were not induced by caudal FEF inactivation. Also, consistent with the existence of movement-preparation neurons in both areas (**Figures 9D**,**F**), chemical inactivation of either area impaired an initial smooth-pursuit component before saccades. Furthermore, since both areas contained neurons (visual memory neurons and pursuit neurons) that showed visual motion response enhancement to the cued spot during the action period (e.g., **Figures 7B**,**D**, arrows), loss of their activity may also have contributed to the impaired initial pursuit. In addition, consistent with the significant difference in percent of pursuit neurons in the two areas (**Figure 9A**, action), caudal FEF inactivation significantly decreased pursuit eye velocity during pursuit maintenance, resulting in saccadic tracking, whereas SEF inactivation did not impair pursuit maintenance. In particular, caudal FEF inactivation not only decreased eye velocity gain, but also impaired delay compensation of pursuit eye movements during sinusoidal pursuit of a single spot at frequencies ∼1 Hz, suggesting that the caudal FEF is necessary for response delay compensation during sinusoidal pursuit.
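Eye velocity gain and phase (delay compensation) during sinusoidal pursuit are typically quantified by fitting a sinusoid at the stimulus frequency to de-saccaded eye velocity. The sketch below illustrates one common way to do this with a linear least-squares fit; it is not the analysis used in the cited studies, and the function name, the arguments, and the assumption that target velocity follows `target_peak_vel * sin(2*pi*f*t)` are hypothetical.

```python
import numpy as np

def gain_and_phase(t, eye_vel, freq_hz, target_peak_vel):
    """Fit eye_vel ~ A*sin(w*t + phi) + offset by linear least squares.

    Assumes (hypothetically) that target velocity is
    target_peak_vel * sin(w*t). Returns (gain, phase in degrees);
    a negative phase means the eye lags the target.
    """
    t = np.asarray(t, dtype=float)
    eye_vel = np.asarray(eye_vel, dtype=float)
    w = 2.0 * np.pi * freq_hz
    # A*sin(w*t + phi) = a*sin(w*t) + b*cos(w*t), a = A*cos(phi), b = A*sin(phi)
    design = np.column_stack([np.sin(w * t), np.cos(w * t), np.ones_like(t)])
    (a, b, _offset), *_ = np.linalg.lstsq(design, eye_vel, rcond=None)
    amplitude = np.hypot(a, b)
    phase_deg = np.degrees(np.arctan2(b, a))
    return amplitude / target_peak_vel, phase_deg
```

With this convention, reduced gain appears as a ratio below 1, and impaired delay compensation appears as a more negative phase at a given frequency.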

These results indicate that the SEF is primarily involved in planning smooth-pursuit, whereas the caudal FEF is primarily involved in generating motor commands for pursuit execution. The existence of *no-go* neurons along with impairment in performing *no-go* trials after chemical inactivation suggests that the SEF is necessary for the decision process of whether or not to pursue moving spots, including working memory of *no-go* instructions (Shichinohe et al., 2009; Fukushima et al., 2011a,b).

After inactivation of either area, postsaccadic enhancement of smooth-pursuit (Lisberger, 1998) was still observed (Shichinohe et al., 2009; Fukushima et al., 2011a,b), indicating involvement of different neural mechanisms in generating the initial pursuit component and postsaccadic pursuit enhancement. Mahaffy and Krauzlis (2011) reported that inactivation and stimulation of the frontal pursuit area change pursuit metrics without affecting pursuit target selection, consistent with our muscimol inactivation of the caudal FEF (Fukushima et al., 2011b).

#### *Chemical inactivation of the caudal fastigial nucleus*

Unilateral chemical inactivation of the caudal fastigial nucleus induces well-known impairments in smooth-pursuit and saccades (e.g., dysmetria, for reviews, see Robinson and Fuchs, 2001; Leigh and Zee, 2006). In addition, during memory-based smooth-pursuit, chemical inactivation of the caudal fastigial nucleus induced significantly higher *no-go* errors as well as direction errors (mean 40 vs. <10% before inactivation, Fukushima et al., 2011c), indicating impairments of visual working memory in this task. These results suggest that the oculomotor vermis-caudal fastigial nucleus pathway is involved in planning tracking eye movements that include both smooth-pursuit and saccades, similar to the SEF (Shichinohe et al., 2009).

### **PRELIMINARY RESULTS OF CLINICAL APPLICATION**

#### **PARKINSON'S DISEASE**

Characteristic of Parkinson's disease (PD) are difficulties in initiating volitional movements and, when initiated, slow and hypometric movement (e.g., Warabi et al., 2011). Ocular pursuit is impaired in most patients with PD, though the nature of the impairment is poorly understood (Leigh and Zee, 2006). Working memory impairment during cognitive tasks has been reported (e.g., Possin et al., 2008; Lee et al., 2010). To examine whether working memory of visual motion direction is impaired, we applied the memory-based smooth-pursuit task to patients with PD. None of the PD patients tested exhibited impaired working memory of motion-direction and/or *go/no-go* selection, indicating that these functions were normal in PD patients tested (Fukushima et al., 2011c), consistent with studies showing normal predictive function, including timing function, of most PD patients during smooth-pursuit (e.g., Waterston et al., 1996; Lekwuwa et al., 1999; also Pinkhardt et al., 2009; de Hemptinne et al., 2013).

Clear differences from normal controls were observed during *go* trials. Normal controls exhibited an initial smooth-pursuit component in the cued direction with a mean latency of 155 ms (**Figure 11B1** ∗) followed by corrective saccades (Fukushima et al., 2011a,c; cf. Garbutt and Lisberger, 2006) which were further followed by enhanced smooth-pursuit responses (cf. **Figure 6B2**; Lisberger, 1998). Note that this pattern of tracking eye movements is basically similar to the pattern observed in monkeys after late training of this task (see **Figure 6B2**). In contrast, most PD patients tracked the correct spot with saccades; initial pursuit was rarely induced before the saccades (**Figure 11A1** ∗), and postsaccadic enhancement of smooth-pursuit was rarely observed. Moreover, consistent with many previous reports, peak pursuit eye velocities after saccades were significantly lower (i.e., low gain) in PD patients than those of controls during pursuit maintenance (**Figures 11A1** vs. **B1**, de-saccaded, averaged).

The lack of initial pursuit and deficient postsaccadic enhancement in most PD patients are unlikely to be due to impairments of smooth-pursuit eye movements *per se,* since during simple ramp pursuit of a single spot moving at the same velocity, the same patients clearly exhibited an initial pursuit component before saccades, similar to normal controls (**Figures 11A2** vs. **B2** ∗), and since postsaccadic enhancement of smooth-pursuit was also seen at least for the first saccades after spot motion (**A2** and **B2**, downward arrows).

The appearance of the initial pursuit during the action period of memory-based pursuit in control subjects (**Figure 11B1**) most probably reflects priming effects by cues and depends on normal activity of the SEF and caudal FEF (sections "Representation of directional visual motion-memory and movement-preparation signals in the frontal cortex," "Similarity and differences of signals represented in the SEF and caudal FEF," Fukushima et al., 2011a), since in monkey studies cue 1 direction memory and cue 2 *go* instruction enhance visual motion responses of SEF and caudal FEF neurons in the cued direction (e.g., **Figures 7B**,**D**), and since chemical inactivation of these frontal cortical areas impairs initial pursuit before saccades (Shichinohe et al., 2009; Fukushima et al., 2011a,b).

Conversely, the lack of initial pursuit in patients with PD suggests that they have difficulty in inducing priming effects during memory-based pursuit (**Figures 11A1** vs. **B1** ∗) which required the patients to prepare and execute smooth-pursuit to a selected spot using the cue information (Fukushima et al., 2011a; cf. Ladda et al., 2008).

Cui et al. (2003) reported projection of the FEF pursuit area to the basal ganglia (BG) in monkeys, the output of which projects back to the caudal FEF through the thalamus, forming a possible efference copy loop between the caudal FEF and BG (**Figure 5B**, also see Tian and Lynch, 1996; Lynch and Tian, 2006). Yoshida and Tanaka (2009) suggested that this pursuit loop may contribute to maintaining normal pursuit gain (see also Basso et al., 2005). Our results suggest that, of the two major components of predictive pursuit, the visual motion-direction memory is normal but movement preparation is impaired in PD in addition to impaired movement execution. A common pathophysiology may contribute to low gain pursuit and hypokinesia (Warabi et al., 2012).

In contrast to normal working memory during memory-based pursuit in patients with PD, significantly higher error rates were observed in patients with frontal cortical dysfunction using the identical task; these patients revealed low perfusional volume in the frontal or frontotemporal cortex using single photon emission computed tomography (Ito et al., 2011). Dramatic impairment of prediction due to frontal lobe degeneration has also been reported by Coppe et al. (2012). These results suggest that PD patients with working memory impairment may have frontal cortical dysfunction that includes the SEF (e.g., Possin et al., 2008; Lee et al., 2010).

#### **CEREBELLAR DEGENERATION**

Most cerebellar patients exhibit the well-known failure of eye position holding due to impairment of the neural integrator (section "Major Pathways Related to Smooth-Pursuit Eye Movements," Robinson, 1975; Leigh and Zee, 2006). As illustrated in **Figure 12C1**, a representative patient with cerebellar degeneration tracked a moving target with saccades. But unlike PD patients (e.g., **Figure 11A2**), corrective saccades of the cerebellar patient were followed by centripetal drift due to neural integrator failure, resulting in little pursuit eye velocity (**Figure 12A**; also Westheimer and Blair, 1973, 1974). Moreover, during visually guided saccades, the same patient exhibited a dysmetric saccade (**Figure 12C**, arrow) that was followed by eye position holding failure (**Figure 12C** ∗), suggesting that both the cerebellar floccular region and oculomotor vermis were dysfunctional. In addition, during memory-based pursuit, most cerebellar patients tested exhibited direction errors during the action period (**Figure 12B**), suggesting impaired visual working memory in this task as well (Fukushima et al., 2012). These differences between patients with PD and those with cerebellar degeneration (**Figures 11** vs. **12**) suggest different roles for the BG and cerebellum in smooth-pursuit planning and execution (cf., Allen and Tsukahara, 1974).

#### **FUNCTIONAL CONSIDERATIONS**

#### **COMPARISON OF MEMORY-BASED AND SIMPLE RAMP PURSUIT**

Although smooth pursuit is evoked in both monkeys and humans in the memory-based task, comparison with simple ramp responses reveals clear differences. Memory-based eye acceleration starts slightly later and is considerably less than in the simple ramp, but a transition to higher acceleration occurs 250–300 ms after target onset [**Figures 13A** (monkey), **C** (human)]. These differences probably result from competition between the dual identical targets in the memory pursuit task, which move in opposing directions and are continuously visible throughout the task (Lisberger and Ferrera, 1997). The interactions can be represented by the model in **Figure 14** (adapted from Schweigart et al., 2003) in which the two channels correspond to neuronal structures with directional sensitivities of opposite polarity, similar to those shown in **Figure 7**. Retinal error input from each of the two targets interacts at junction D to create the final motor drive. Note that the extra-retinal pathway components [S/H, MEM, β, and F"(s)] of **Figure 2** have been reduced to a single function β'(s) and the main feedforward pathway has been split into direct (MST-DLPN) and indirect (MST-FEF-NRTP) components, consistent with established pathways from MST to brainstem.

Our hypothesis is that active pursuit of a single target in the simple ramp task is achieved by augmentation of gain for the selected target by increasing open-loop gain (wT1) in the indirect pathway and concomitantly initiating extra-retinal activity in the efference copy loop (i.e., increasing β1). Raising gain in the indirect pathway (wT1) is the primary factor responsible for the initial high acceleration pursuit response, the extra-retinal component giving a lower level of eye acceleration and developing later than the visually driven component (see **Figure 4D**). By contrast, in the memorized pursuit task, priming by the prior display motion presentation (cue 1) facilitates activation of the extra-retinal component (i.e., β1 ≈ 1) in the appropriate channel but does not allow open-loop gain (wT1) to be immediately increased, thus leading to a low initial acceleration. Prior to initiation of the extra-retinal component, weightings wD1 and wD2 are assumed to be equal and thus to cancel each other as a result of vector averaging (Ferrera and Lisberger, 1997); individually they would give a low-gain response of the type induced by passive stimulation (Cheng and Outerbridge, 1975; Barnes and Hill, 1984; Pola and Wyatt, 1985). We suggest that wT1 remains low because of difficulty in discriminating between the two identical, but oppositely directed, targets. When selection does occur, there is an abrupt increase in wT1 leading to a rapid increase in eye acceleration comparable to that seen in the simple ramp task.

during simple ramp pursuit (SR) vs. memory pursuit (MP). **(B)** Simulations of model (thick lines) corresponding to SR and MP responses (thin lines) shown

to memory pursuit with Popout (Pop) in six Controls and seven PD patients. Averages of left- and right-going responses. From Ito et al. (2012).

priming of β

Adapted from Schweigart et al. (2003).

Model simulations of simple ramp and memory-based responses in the monkey are shown in **Figure 13B**. To assist discrimination we investigated the effect of stimulus popout by making the target in the cued direction change color at motion onset. This allowed abrupt acceleration to occur earlier (**Figure 13C**), an effect we attribute to an earlier increase in wT1.
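The gain-switching idea can be illustrated with a deliberately simplified, static toy calculation. It is not the model of **Figure 14** and does not reproduce the simulations in **Figure 13B**: the function name, all parameter values, the lumped `extra_retinal_gain` term, and the step change in the indirect-pathway gain at a hypothetical selection time `t_select` are assumptions made purely for illustration.

```python
import numpy as np

def toy_net_drive(t, v1=10.0, v2=-10.0, w_direct=0.25,
                  w_indirect_high=1.0, extra_retinal_gain=0.25,
                  t_select=0.275):
    """Schematic motor drive (arbitrary units) at time t (s) for two
    oppositely moving targets with velocities v1 (cued) and v2.

    All parameter values are hypothetical illustration values.
    """
    t = np.asarray(t, dtype=float)
    # Equal direct-pathway weights on opposite motions cancel (vector averaging).
    direct = w_direct * v1 + w_direct * v2
    # Indirect-pathway gain for the cued target steps up once selection occurs.
    w_indirect = np.where(t >= t_select, w_indirect_high, 0.0)
    indirect = w_indirect * v1
    # Primed extra-retinal contribution: a low-level drive in the cued direction.
    extra_retinal = extra_retinal_gain * v1
    return direct + indirect + extra_retinal
```

In this toy picture the drive stays low until `t_select` and then rises abruptly; an earlier selection, as suggested for the popout stimulus, corresponds simply to a smaller `t_select`.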

direct (MST-DLPN) and indirect (MST-FEF-NRTP) components consistent with established pathways from MST to brainstem. Open-loop gain

Crucially, PD patients may not be capable of this modification of wT1 since their responses in the memory pursuit task do not show an abrupt increase in acceleration (Ito et al., 2012), even with a popout stimulus (**Figure 13D**). In addition, eye acceleration and peak velocity in the simple ramp task were lower than in Controls, consistent with previous observations of reduced gain in both anticipatory and visually-driven components of pursuit (Lekwuwa et al., 1999; Helmchen et al., 2012).

Notably, the initial low acceleration component of the memory-based response, which we attribute to the extra-retinal component, is absent in early training in the monkey, implying that it takes some time to train the animal to develop and release the extra-retinal response. This may be similar to a process described previously in the development of pursuit in juvenile monkeys (Shichinohe et al., 2011). Juvenile animals initially exhibit considerable instability that gradually disappears with practice. It was suggested that this could be explained by the gradual development of the extra-retinal component of pursuit. It is clear that there is a major species difference in the development of anticipatory movements and the extra-retinal component, since humans need only a few trials to learn how to generate such responses.

<sup>2</sup> or wT2 which remain inactive as indicated by crosses.

#### **ALLOCATION OF MODEL FUNCTIONS TO SPECIFIC BRAIN AREAS**

Given the findings reported here and those of earlier experiments it is possible to tentatively allocate some of the functions of the behavioral models (**Figures 2**, **14**) to specific brain areas. The reconstruction of target velocity at junction C in the models, which forms the basis of the extra-retinal component, is almost certainly carried out in MST/V5. It has long been assumed that MST plays an important role in the integration of retinal error and efference copy signals because of the sustained firing observed during target occlusion and image stabilization (Newsome et al., 1988). However, we have also taken into account the experimental results of Ilg et al. (2004) and the adaptive modeling of Dicke and Thier (1999), providing evidence that MSTl is an area in which not only retinal error and eye velocity, but also head velocity are integrated to provide an estimate of target velocity in world-centered coordinates, consistent with the modeling of results from head-free pursuit experiments (Ackerley and Barnes, 2011). In order to make internal target reconstruction temporally appropriate it is necessary to incorporate a delay in the efference copy feedback, so that if the inputs to junction C (**Figure 2**) are examined when operating in the reactive mode they comprise a retinal error signal and a delayed eye velocity efference copy signal, as observed in neuronal recordings (Newsome et al., 1988). However, if the system is operating in the predictive mode, initiating anticipatory eye movement on the basis of motion information previously stored in MEM, the activity in MST will be phase advanced with respect to that in the reactive mode; some evidence to support this has come from neuronal recordings in the monkey (Ilg, 2003).

Time-advanced neuronal activity has also been observed in FEF and SEF (Fukushima et al., 2002) during predictive pursuit of sinusoidal target motion. It is well-established that MST is connected bi-directionally with FEF (Huerta and Kaas, 1990) and an MST→FEF→MST feedback system might be one plausible way for the efference copy loop to operate, as outlined earlier. However, our results suggest MST may not be a velocity memory site *per se*, since no continued firing was observed during the delay periods of the memory pursuit task (**Figure 9G**). Whilst it is possible that such activity may be found in other parts of MST (e.g., MSTl), sustained firing here may, in fact, be dependent on ongoing eye movement. Given the evidence presented in section "Evidence of Sampling and Storage in the Initial Pursuit Response," that velocity information may be sampled, an intact MST-FEF feedback loop is unlikely to be necessary for memory maintenance. Rather, it is likely that the velocity sample is held in a form of working memory, most probably in dorsolateral PFC (Schmid et al., 2001; Burke and Barnes, 2008b), which is a likely indirect recipient of MT and/or MST output (Kim and Shadlen, 1999; Barborica and Ferrera, 2003; Zaksas and Pasternak, 2006). Such an area may be responsible for holding sampled velocity information (i.e., to be the substrate for S/H and MEM) in a similar way to that for spatial information in remembered saccade tasks (Funahashi et al., 1990). Unlike the remembered target location in the saccade task though, the sample would be held as a magnitude (firing rate) estimate. Some behavioral evidence suggests that magnitude may indeed be stored irrespective of intended direction, since appropriately scaled anticipatory movements can be re-directed even without prior exposure to motion in the new direction (Jarrett and Barnes, 2002).

SEF is probably the area where decisions about the release of the extra-retinal component are controlled and, given the results presented in section "Similarity and differences of signals represented in the SEF and caudal FEF," FEF is also likely to be involved in that process as a result of reciprocal interconnections with SEF (Huerta et al., 1987). The results of the memory-based pursuit task demonstrate first that, in visual memory neurons, there is sustained activity during the delay periods that is specific to the direction of the initial cue. It is likely that this sustained activity emanates from the working memory holding sampled motion information, although whether this information has speed as well as directional content is unknown (see section "Representation of directional visual motion-memory and movement-preparation signals in the frontal cortex"). It is clear from the fact that directional errors occur that the sustained activity in delay 1 is not irrevocably associated with a motor response in that direction or, indeed, with any motor response at all in the no-go condition. The implication is that an erroneous higher-level decision is made to follow the target in the non-primed direction or, in the case of the no-go condition, to suppress all response. A second subset of SEF and FEF neurons exhibit motor preparation activity in the form of steadily increasing firing rate prior to the motor response. This type of activity has been observed before in SEF (Heinen and Liu, 1997) and is known to be increased by increasing stimulus predictability. This preparatory activity appears to be linked to anticipatory smooth pursuit, which is also dependent on stimulus predictability (Heinen et al., 2005; de Hemptinne et al., 2007, 2008). Anticipatory eye movements are enhanced by stimulation in SEF (Missal and Heinen, 2004), probably through augmentation of this preparatory signal. 
At present this function is represented in the model by the modifiable gain component β, although this is a considerable simplification of a complex probability-dependent process.

SEF is also implicated in other decision making processes, notably the timing of response initiation and termination (Heinen and Liu, 1997) and may thus be a component of the timing mechanisms associated with the release of predictive activity (**Figure 2**) for which there is ample behavioral evidence (Barnes and Asselman, 1991; Jarrett and Barnes, 2005; Badler and Heinen, 2006). Storage of timing information is an important aspect of other motor control processes (see Ivry and Spencer, 2004, for review). Notably, SEF contains only a small proportion of neurons whose activity is directly related to the motor response (Fukushima et al., 2004); consistent with this, chemical inactivation of SEF (with intact FEF) does not impair pursuit maintenance (Shichinohe et al., 2009). We suggest, therefore, that the major role of SEF lies not in the direct transmission of motor activity but in the regulation of such activity between visual motion memory sites (S/H and MEM in the model) and FEF, which is the major output center for pursuit. This includes the important ability to control suppression of the motor output in the no-go condition.

FEF is probably the site at which retinal error and internal drive (either *reactive* or *predictive*) signals are summated (junction B in **Figure 2**), since lesions of the FEF are known to impair both predictive and visually guided components of smooth pursuit (Keating, 1991, 1993). As shown in **Figure 9**, many FEF neurons fire continuously throughout the action period in a way that would be expected at the output of this summing junction (see **Figure 4F**). However, another type of FEF output neuron that exhibits temporal characteristics more consistent with the visually driven (retinal error) component has also been identified (Fukushima et al., 2000; Ono and Mustari, 2009). It is possible, therefore, that this summation may take place further downstream in, for example, the vestibular nuclei.

The control of open-loop gain is another function frequently associated with FEF. Tanaka and Lisberger (2001) showed that microstimulation in FEF can enhance the gain of pursuit and Churchland and Lisberger (2005) have suggested that MST may be the site that controls gain via its links to FEF, consistent with the representation in **Figure 2**. Given the reduced gain in PD patients, an FEF→BG→FEF positive feedback loop may carry out this function (see section "Parkinson's Disease").

#### **IMPLICATIONS FOR PERFORMANCE ASSESSMENT IN CLINICAL DISORDERS**

Observation of reduced pursuit performance is common in patients with various neurological conditions, such as cerebral cortical lesions, cerebellar degeneration, PD, and schizophrenia (Leigh and Zee, 2006), so standard pursuit tasks offer little potential for differential diagnosis. What we demonstrate here is that suitably devised tests that take into account a fuller range of facets of pursuit may provide much more information. For example, the effects of chemical inactivation of FEF (Fukushima et al., 2011a,b) suggest an association between timing and velocity of the memory-based pursuit response and the gain and phase error of sinusoidal pursuit. Such effects have been observed before in patients with cortical lesions (Lekwuwa and Barnes, 1996a,b) but localization has proved difficult because the tasks used did not clearly discriminate between factors such as gain control, timing and expectation. By continuing to investigate neuronal activity with more elaborate memory-based pursuit tasks that improve discrimination by adding factors such as storage of speed information, it should be possible to identify more areas that are critical for particular factors.

#### **ACKNOWLEDGMENTS**

Supported by Grant-in-Aid for Scientific Research on Priority Areas (System study on higher-order brain functions) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan (17022001, 18300130).

#### **REFERENCES**


movements. *J. Physiol. (Lond.)* 439, 439–461.


ocular smooth pursuit. *Exp. Brain Res.* 144, 322–335.


location-based inhibition of return. *J. Neurosci.* 22, 4675–4685.


K. (2006). The vestibular-related frontal cortex and its role in smooth-pursuit eye movements and vestibular-pursuit interactions. *J. Vestibular Res.* 16, 1–22.


with idiopathic Parkinson's disease (PD): movement preparation and execution is impaired but not visual motion working memory," in *22nd Annual Meeting. Society for the Neural Control of Movement,* (Venice, Italy), 20.


the dorsolateral prefrontal cortex of the macaque. *Nat. Neurosci.* 2, 176–185.


smooth pursuit. *J. Neurol.* 255, 1071–1078.


the smooth pursuit eye movement system. *Biol. Cybern.* 55, 43–57.


visually-guided smooth pursuit eye movements by frontal cortex. *Nature* 409, 191–194.


effective stimulus to pursuit eye movement system. *Science* 190, 906–908.


pursuit eye movements. *Neuroreport* 20, 121–125.

Zaksas, D., and Pasternak, T. (2006). Directional signals in the prefrontal cortex and in area MT during a working memory for visual motion task. *J. Neurosci.* 26, 11726–11742.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 24 January 2013; paper pending published: 21 February 2013; accepted: 01 March 2013; published online: 19 March 2013.*

*Citation: Fukushima K, Fukushima J, Warabi T and Barnes GR (2013) Cognitive processes involved in smooth pursuit eye movements: behavioral evidence, neural substrate and clinical correlation. Front. Syst. Neurosci. 7:4. doi: 10.3389/fnsys.2013.00004*

*Copyright © 2013 Fukushima, Fukushima, Warabi and Barnes. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

### *John M. Henderson\*, Steven G. Luke, Joseph Schmidt and John E. Richards*

*Department of Psychology, Institute for Mind and Brain, University of South Carolina, Columbia, SC, USA*

#### *Edited by:*

*Artem Belopolsky, Vrije Universiteit Amsterdam, Netherlands*

#### *Reviewed by:*

*Natasha Sigala, University of Sussex, UK*

*Florian Hutzler, University of Salzburg, Austria*

#### *\*Correspondence:*

*John M. Henderson, Institute for Mind and Brain, University of South Carolina, 1800 Gervais Street, Columbia, SC 29201, USA e-mail: john.henderson@sc.edu*

Eyetracking during reading has provided a critical source of on-line behavioral data informing basic theory in language processing. Similarly, event-related potentials (ERPs) have provided an important on-line measure of the neural correlates of language processing. Recently there has been strong interest in co-registering eyetracking and ERPs from simultaneous recording to capitalize on the strengths of both techniques, but a challenge has been devising approaches for controlling artifacts produced by eye movements in the EEG waveform. In this paper we describe our approach to correcting for eye movements in EEG and demonstrate its applicability to reading. The method is based on independent components analysis, and uses three criteria for identifying components tied to saccades: (1) component loadings on the surface of the head are consistent with eye movements; (2) source analysis localizes component activity to the eyes; and (3) the temporal activation of the component occurs at the time of the eye movement and differs for right and left eye movements. We demonstrate this method's applicability to reading by comparing ERPs time-locked to fixation onset in two reading conditions. In the text-reading condition, participants read paragraphs of text. In the pseudo-reading control condition, participants moved their eyes through spatially similar pseudo-text that preserved word locations, word shapes, and paragraph spatial structure, but eliminated meaning. The corrected EEG, time-locked to fixation onsets, showed effects of reading condition in early ERP components. The results indicate that co-registration of eyetracking and EEG in connected-text paragraph reading is possible, and has the potential to become an important tool for investigating the cognitive and neural bases of on-line language processing in reading.

#### **Keywords: eyetracking, event-related potentials (ERPs), reading, eye movements, coregistration, pseudo-reading**

#### **INTRODUCTION**

Natural reading is a highly skilled activity that draws on most of the major perceptual and cognitive faculties of the human brain, including perception, attention, motor control, language processing, and reasoning. Two important techniques for investigating many of the sub-processes of reading have been eyetracking and event-related potentials (ERPs). A major motivation for the development of eyetracking methods historically (e.g., Dodge, 1901; Huey, 1908; see Rayner and Pollatsek, 2013) was the study of skilled reading.

During reading, the eyes move across the page at a rate of about four fixations per second. Most words in a text are fixated, and many words receive more than one fixation. Shorter, higher frequency, and more highly constrained words tend to be skipped more often than longer, lower frequency, and less constrained words. The majority of saccades carry the eyes forward (rightward in English) through the text, though backward or regressive eye movements are not uncommon. The eyes also move right to left during return sweeps, taking them from the end of one line to the beginning of the next. Fixation durations in reading are about 225 ms on average, and average forward saccade amplitudes are about eight character spaces or two degrees for normal text at a typical reading distance, with considerable variability for both measures. The durations of individual fixations on a word as well as cumulative gaze durations are related to the perceptual and cognitive processes associated with that word. For example, the duration of the first fixation on a word is affected by lexical factors (e.g., word length and word frequency), syntactic factors (e.g., syntactic complexity), and discourse factors (e.g., anaphor resolution). Because of its temporal and spatial sensitivity and the fact that it is an "online" measure in the sense that effects of variables of interest show up very rapidly in the eye movement record (e.g., within the fixation on a word of interest), eyetracking has proved to be one of the richest and most important behavioral sources of information about the perceptual, cognitive, and linguistic processes that take place during reading (for reviews, see Henderson, 2013; Rayner and Pollatsek, 2013).

Despite its strengths as a research method for studying reading, one drawback of eyetracking is that it does not provide a direct measure of neural activity. For this reason, investigators have often turned to ERPs in the study of reading and language processing. The vast majority of this work has involved presenting one word at a time in the center of the display while the participant holds fixation (see Kutas and Van Petten, 1994; Kutas and Federmeier, 2011). However, from the perspective of understanding the underlying cognitive and neural processes involved in reading, we would ideally like to be able to combine the spatial and temporal sensitivity of eyetracking with the temporal and neural sensitivity of ERPs (Sereno et al., 1998; Sereno and Rayner, 2003). Moreover, because skilled reading involves sequential motor activity in a series of eye movements, we would like to be able to combine eyetracking and ERPs in connected-text reading. Finally, we would want to precisely co-register these two measures in time with high temporal resolution so that specific ERP components can be linked to specific eye movement activities (like the beginning of a fixation) to generate fixation-based event related potentials.

Until recently there had been little successful work coregistering eyetracking and ERPs. However, in the last few years several groups of researchers have demonstrated that it is possible to control for the EEG activity generated by eye movements in complex tasks and to produce interpretable ERP waveforms (Marton and Szirtes, 1988a,b; Takeda et al., 2001; Graupner et al., 2007; Hutzler et al., 2007; Jagla et al., 2007; Simola et al., 2009; Ossandon et al., 2010; Rama and Baccino, 2010; Dimigen et al., 2011; Kamienkowski et al., 2012; see also Thickbroom and Mastaglia, 1985; Thickbroom et al., 1991). A relatively small number of these studies have examined ERPs in normal (connected-text) reading (Marton and Szirtes, 1988a,b; Dimigen et al., 2011). This work is therefore in its early days, but the technique has enormous potential for increasing our understanding of basic processes related to reading.

The present paper represents our approach to combining eyetracking and ERP in normal reading. We describe our methods for combining data collection using eyetracking and ERP, for coregistering timing across these two methods, and for removing the eye movement artifacts from the EEG data. Several of these techniques are novel in their application to co-registration of eye movements and EEG. Specifically, we base our procedure on independent components analysis combined with source localization, and use three criteria for identifying components tied to saccades: First, component loadings on the surface of the head have to be consistent with eye movements; second, source analysis has to localize component activity to the eyes, and third, the temporal activation of the component has to occur at the time of the eye movement and differ for right and left eye movements.

In addition to a description of our methods, we report results from a new manipulation that allowed us to determine the degree to which effects observed in connected-text reading are due to higher-level cognitive processes related to reading rather than to lower-level processes related to programming and executing sequences of eye movements. We compared ERPs in paragraph text-reading to ERPs in paragraph pseudo-reading. In the pseudo-reading condition, each letter of the text was replaced by a meaningless geometric shape. This manipulation preserved word length and word spacing as well as overall paragraph spatial structure, but removed all meaning. Several studies in the eye-tracking literature have compared eye movements during reading and pseudo-reading (Vitu et al., 1995; Rayner and Fischer, 1996; Nuthmann et al., 2007; Henderson and Luke, 2012; Luke and Henderson, 2013). Eye movements are quite similar in text-reading and pseudo-reading overall, but with some significant differences, most notably in mean fixation duration.

Likewise, ERP studies sometimes compare the electrophysiological responses to words and pseudo-words in order to investigate the time-course of word recognition (e.g., Sereno et al., 1998; Maurer et al., 2005; Segalowitz and Zheng, 2009). For example, Sereno et al. (1998) observed an effect of words versus pseudo-words in early ERP components. The pseudo-words used in these ERP studies are often pronounceable non-words and thus contain more linguistic information than the pseudo-text used in eye-tracking research, but the principle behind both techniques is the same. Thus, our manipulation is common to both eyetracking and ERP research. This makes the reading vs. pseudo-reading manipulation ideal for testing the effectiveness of the co-registration methods described below.

Nine participants took part in two eye movement conditions. In the text-reading condition, participants read a series of paragraphs presented on a computer screen. In the pseudo-reading condition, participants moved their eyes through text-like stimuli that did not carry any meaning. Previous literature has shown that pseudo-reading in which the letters of text are replaced by a single letter such as Z or by geometric shapes produces basic eye movement behavior that in many ways is similar to eye movement behavior in normal reading (Vitu et al., 1995; Rayner and Fischer, 1996; Nuthmann et al., 2007; Henderson and Luke, 2012; Luke and Henderson, 2013). Therefore, pseudo-reading provides a suitable control condition against which to examine influences of text-reading on ERPs. In the reading and pseudo-reading conditions, eye movements and EEG were continuously recorded.

The present study was novel in several important ways. First, unlike previous studies, we presented full paragraphs of text rather than combinations of multiple words (Dimigen et al., 2012) or single lines of text (Dimigen et al., 2011). To our knowledge the present study represents the first report of co-registration of eye movements and ERPs in paragraph reading. Second, we introduced a pseudo-reading condition as a control for text-reading. This study represents the first use of this control condition when combining eyetracking and ERP. Third, we used a novel set of procedures to adjust for the effect of eye movements on the post-fixation EEG waveform, including combining source localization with independent components analysis. Finally, to validate our method, we used a growth-curve analysis to investigate ERP differences across conditions.

### **DATA COLLECTION METHODS**

#### **PARTICIPANTS**

Nine graduate and undergraduate students from the University of South Carolina community gave informed consent and completed the experiment in accordance with the University of South Carolina Institutional Review Board. They were each paid fifty dollars for participating in the study. All had normal or corrected-to-normal vision by self-report.

#### **MATERIALS**

Fifty-eight short paragraphs (40–60 words) were taken from online news articles. The texts were displayed on the screen in Courier New font. Each paragraph was also converted into pseudo-text using a custom font in which each letter was replaced by a geometric shape that preserved word location and word shapes but eliminated meaning (see Henderson and Luke, 2012; Luke and Henderson, 2013). Both fonts were mono-spaced, and all letters, words, and lines of text appeared in exactly the same location regardless of font. Examples of text-reading and pseudo-reading stimuli can be found in **Figure 1**.

#### **APPARATUS**

Stimuli were presented at a screen resolution of 1024 × 768 pixels using a 28-inch LCD monitor. Stimuli were only presented on the middle 2/3 of the screen (approximately 28° × 21°) to accommodate the visual angle of the eye tracker. Eye movements were recorded via an SR Research Eyelink 1000 desktop-mounted eyetracker (spatial resolution of 0.01°) sampling at 1000 Hz. Participants were seated 68 cm from the monitor so that approximately 3.5 characters subtended 1° of visual angle. Head movements were minimized using a chin rest. Viewing was binocular and eye movements were recorded from the right eye. Trials were initiated and terminated with a button box operated by the left hand.
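The viewing geometry above can be checked with a short calculation. The following is a minimal sketch in Python using the standard relation between object size, viewing distance, and visual angle; the function and variable names are ours and purely illustrative, and only the 68 cm distance and the 3.5 characters per degree come from the text.

```python
import math

def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an object of size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

def size_for_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """On-screen size (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

VIEWING_DISTANCE_CM = 68.0  # from the text
CHARS_PER_DEGREE = 3.5      # from the text (approximate)

# Width on the screen covered by one degree of visual angle.
one_degree_cm = size_for_angle_cm(1.0, VIEWING_DISTANCE_CM)

# Implied width of one mono-spaced character (derived here, not reported).
char_width_cm = one_degree_cm / CHARS_PER_DEGREE

print(f"1 degree spans about {one_degree_cm:.2f} cm on the screen")
print(f"implied character width: about {char_width_cm:.2f} cm")
```

Under these assumptions one degree corresponds to roughly 1.2 cm on the screen, so each character would be about a third of a centimeter wide.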

#### **PROCEDURE**

Participants were told that they would be reading short texts on a computer screen while their eye movements were recorded, and that some texts would appear with blocks in place of letters. In the case of the pseudo-texts, participants were instructed to move their eyes "as if they were reading," consistent with standard pseudo-reading instructions (Vitu et al., 1995; Nuthmann et al., 2007; Henderson and Luke, 2012; Luke and Henderson, 2013). Each paragraph was presented in both the text-reading and pseudo-reading conditions, such that each participant completed 116 trials. The experiment was broken up into four blocks of 29 trials, and the text-reading and pseudo-reading versions of each text were always presented at least one block apart. Within each block, stimuli were presented in a random order for each participant.

#### *Eye movement recording*

Eyetracking began with a nine-point calibration routine used to map eye position to screen coordinates. Eyetracker calibration was not accepted until the average error was less than 0.49° and the maximum error was less than 0.99°. Participants were recalibrated at the start of each block and as needed during testing. Trials began with participants fixating a point in the upper left corner of the screen and pressing a button on the button box. In addition to initiating the trial, this served as a "drift check" for the eye tracker to record any shift in gaze position since calibration. The fixation point was then replaced by the text, with the first character in the text appearing approximately three degrees below and to the right of the fixation point. The participant was instructed to read through the text and press a button on the button box to end the current trial and proceed to the next trial. One paragraph was presented per trial.

After recording, the eye movement data were analyzed off-line to identify fixations and saccades using the DataViewer software package (SR Research Ltd, version 1.11.1). These data were used in segmenting the EEG data as described below.

#### *Recording of EEG and segmenting of EEG for ERP*

The EEG was recorded with a 128-channel system (EGI, Inc., Eugene, OR, USA), referenced to vertex during recording and re-referenced algebraically to an average reference, recorded with 20 K amplification, at a sampling rate of 1 kHz, and with impedances below 100 kΩ. A 128-channel Hydrocel GSN SensorNet (Tucker, 1993; Tucker et al., 1994) was used to record the continuous EEG. The segments for the EEG were initially extracted for the entire session and high-pass filtered with a 0.5 Hz filter. The electrooculogram (EOG) was extracted from the electrodes on the outer canthi (nos. 125 and 128). The saccades in the EOG were identified with a third-order differential filter (Matsuoka and Harato, 1983; Matsuoka and Ueda, 1986). We aligned the saccades found in the EOG with the saccade onsets defined by DataViewer to ensure consistency between the EEG recording and the eye tracker. The EEG segments were extracted for 100 ms preceding each saccade, through the saccade, and up to 750 ms following the saccade. The segments were terminated if (1) a blink occurred, (2) a return sweep occurred, carrying the eyes from the end of a line to the beginning of the next line, or (3) the recording interval ended. For the ERP analysis, the electrodes of the 128-channel GSN SensorNet were grouped into 20 "virtual 10–20" electrodes. The five most posterior electrode groups are shown in **Figure 2**. Given that we were primarily interested in early ERP components, these five electrodes were the most relevant for analysis.
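The segment boundaries described above can be illustrated with a small sketch. The Python function below is hypothetical and not the authors' code. It assumes that all times are in milliseconds on a common clock, that the 750 ms limit is measured from saccade onset, and that blinks and return sweeps are supplied as a list of onset times.

```python
from typing import List, Tuple

def segment_window(saccade_onset_ms: float,
                   terminator_onsets_ms: List[float],
                   recording_end_ms: float,
                   pre_ms: float = 100.0,
                   post_ms: float = 750.0) -> Tuple[float, float]:
    """Return (start, end) of the EEG segment for one saccade.

    The segment starts pre_ms before saccade onset and nominally ends
    post_ms after it. It is cut short by the first blink or return sweep
    that follows the saccade onset, or by the end of the recording.
    """
    start = max(0.0, saccade_onset_ms - pre_ms)
    end = min(saccade_onset_ms + post_ms, recording_end_ms)
    later = [t for t in terminator_onsets_ms if t > saccade_onset_ms]
    if later:
        end = min(end, min(later))
    return start, end

# A saccade at 1000 ms with a blink at 1300 ms gives a shortened segment.
print(segment_window(1000.0, [1300.0], 5000.0))
# Without a terminating event the full window is kept.
print(segment_window(1000.0, [], 5000.0))
```

Returning the window rather than slicing the data keeps the sketch independent of how the EEG samples are stored.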

#### *Co-registration of eyetracking and EEG*

The EEG recordings and the eyetracking data were aligned so that saccade onset from the eyetracker could be found for each EEG segment. This was accomplished by controlling the experiment via EPrime. During the experiment, the EPrime program had access to the SR-Eyelink time with ms resolution. These times and the EPrime computer time were sent to the EGI system at the beginning of each paragraph by a dedicated TCP/IP port. Events were simultaneously saved in the continuous EEG stream and in the eye movement record. The EGI Netstation recording program kept the trial onset code and exported the trial times with the EEG data. The time streams from the EPrime computer, the EGI computer, and the SR computer could then be integrated into a single time stream for all event codes from the SR eyetracker (saccades, fixations, blinks) and the EEG data.
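One common way to integrate time streams of this kind is to map one clock onto another using events that were stamped on both. The Python sketch below assumes a linear relation (constant offset plus a small drift) fitted by least squares to the shared trial-onset events. This is our assumption for illustration only; the text does not state which mapping was used, and all names and numbers here are hypothetical.

```python
from typing import List, Tuple

def fit_clock_map(src_ms: List[float], dst_ms: List[float]) -> Tuple[float, float]:
    """Least-squares fit of dst = slope * src + intercept from paired timestamps."""
    n = len(src_ms)
    mean_s = sum(src_ms) / n
    mean_d = sum(dst_ms) / n
    sxx = sum((s - mean_s) ** 2 for s in src_ms)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(src_ms, dst_ms))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_s
    return slope, intercept

def to_eeg_time(t_tracker_ms: float, slope: float, intercept: float) -> float:
    """Convert an eyetracker timestamp to the EEG clock."""
    return slope * t_tracker_ms + intercept

# Hypothetical trial-onset events stamped on both clocks (ms).
tracker_onsets = [1000.0, 61000.0, 121000.0]
eeg_onsets = [250.0, 60250.6, 120251.2]

slope, intercept = fit_clock_map(tracker_onsets, eeg_onsets)
# A saccade detected by the eyetracker at 31000 ms, expressed in EEG time.
saccade_in_eeg_time = to_eeg_time(31000.0, slope, intercept)
print(round(saccade_in_eeg_time, 1))
```

With only an offset and no drift the slope would be exactly 1, in which case the fit reduces to subtracting a constant.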

### **DATA ANALYSIS METHODS: EYE MOVEMENT CORRECTION IN EEG DATA**

#### **FIXATION ACCEPTANCE CRITERIA**

Fixations that met the following criteria were included in the analyses. First, fixations had to be preceded by a rightward saccade. Second, fixations could not be followed within 700 ms by a return sweep. Third, fixations could not include a blink. These criteria were identified in the SR eyetracker data using DataViewer. Fourth, of these fixations, those that were not clearly identifiable in the EOG were excluded. In total this resulted in the inclusion of over 25,000 fixations across all participants; on average each participant contributed approximately 1500 fixations in the text-reading condition and 1300 fixations in the pseudo-reading condition.
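The four acceptance criteria amount to a simple filter over the fixation record. The Python sketch below is illustrative only. The `Fixation` fields are hypothetical names, the interval to the return sweep is assumed to be measured from fixation onset, and an interval of exactly 700 ms is treated as falling within the exclusion window.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Fixation:
    preceded_by_rightward_saccade: bool
    ms_until_return_sweep: Optional[float]  # None if no return sweep follows
    contains_blink: bool
    identifiable_in_eog: bool

def is_accepted(fix: Fixation, return_sweep_window_ms: float = 700.0) -> bool:
    """Apply the four inclusion criteria to one fixation."""
    no_early_sweep = (fix.ms_until_return_sweep is None
                      or fix.ms_until_return_sweep > return_sweep_window_ms)
    return (fix.preceded_by_rightward_saccade
            and no_early_sweep
            and not fix.contains_blink
            and fix.identifiable_in_eog)

fixations: List[Fixation] = [
    Fixation(True, None, False, True),     # accepted
    Fixation(False, None, False, True),    # preceded by a regression
    Fixation(True, 400.0, False, True),    # return sweep too soon
    Fixation(True, 1200.0, True, True),    # contains a blink
    Fixation(True, 1200.0, False, False),  # not identifiable in the EOG
]
accepted = [f for f in fixations if is_accepted(f)]
print(len(accepted))
```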

#### **INDEPENDENT COMPONENTS ANALYSIS OF ERP DATA**

The complete segmented ERP file was analyzed with Independent Components Analysis (ICA). A spatiotemporal ICA was conducted over the raw EEG data following the procedures outlined by Makeig and colleagues (Makeig et al., 1996, 1997; Jung et al., 2001a,b; DeLorme et al., 2002; also see Richards, 2005; Reynolds and Richards, 2009). The spatiotemporal ICA used the channels as variables, and the observations were all EEG segments from a single participant concatenated over the millisecond intervals for which the EEG was sampled. The weights were calculated using the extended-ICA algorithm of Lee et al. (1999), using sphering of the input matrix to aid in convergence, with an initial learning rate of 0.003. The ICAs were carried out separately on each participant's data.

#### **REALISTIC SOURCE ANALYSIS OF EYE-MOVEMENT ICA COMPONENTS USING MRI**

A structural (anatomical) MRI was taken for each participant. The MRI was used to locate electrodes on the head, to develop a realistic finite element method (FEM) head model whose source locations included the eyes, and to estimate the cortical sources of ICA components representing eye movements.

The MRI data were collected at the University of South Carolina McCausland Center for Brain Imaging (USC-MCBI) on a Siemens Medical Systems 3T Trio with an overall duration of about 15 min. A 3D T1-weighted "MPRAGE" RF-spoiled rapid flash scan in the sagittal plane and a T2/PD-weighted multi-slice 2D dual Fast Turbo spin-echo scan in the axial plane were used. The USC-MCBI T1 scans had 1 mm³ resolution and sufficient field of view to cover from the top of the head down to the neck.

The EEG recording was done with a Hydrocel GSN (HGSN). Following the recording, the participants were placed inside a "Geodesic Photogrammetry System" (GPS) dome, where images from eleven uniquely angled cameras were acquired, which together provided a complete map of each participant's head. The electrode locations in each participant's headspace were found by triangulating the positions of the electrodes in the photos with the GPS software package (Russell et al., 2005). The electrodes were registered to the structural MRI by identifying a set of fiducial electrodes on the net and on the MRI volume, and using point-set registration to fit the electrodes to the head (Richards et al., 2013).

The electrode locations and structural MRIs were used to develop a realistic head model using finite element method models (Richards, 2013). The heads were segmented into constituent media (gray matter, white matter, skull, skin, nasal cavity, muscle, and eyes) and a source model was constructed that consisted of gray matter and eyes. Three-dimensional tetrahedral wireframes were computed that contained the location of each corner of the tetrahedron and the type of material making up the tetrahedron using the MR Viewer module of the EMSE computer program (Source Signal, Inc.). The electrode locations, source locations, and head model were used with EMSE's Data Analysis (Source Signal, Inc.) to estimate the forward model, inverse model, and current density reconstruction (CDR; Darvas et al., 2001) using sLORETA (Pascual-Marqui et al., 1994; Pascual-Marqui, 2002). The sLORETA current density reconstruction was applied to each of the 128 ICA component loadings.

### **REMOVAL OF EYE MOVEMENT RELATED EEG ACTIVITY**

Our goal was to analyze the ERP data in the connected paragraph reading conditions with the activity due to eye movements removed from the EEG. To accomplish this, we first removed the electrical activity due to the eye movements from the EEG activity following procedures outlined by Jung et al. (2001a,b). An ICA was performed on all 128 channels using both saccade- and fixation-related ERP segments. Saccade-related segments encompassed 100 ms pre-saccade to 70 ms post-saccade, and fixation-related segments encompassed 20 ms pre-fixation onset to 770 ms post-fixation onset. Note that, because most fixations were shorter than 770 ms, this resulted in many overlapping segments of data. The ICA component activations were examined at the point of an identified eye movement in the EOG record. Components were identified with activations that occurred primarily around the time of the eye movement. Next we examined the scalp topography of the component loadings. **Figure 3** (left panel) shows a scalp potential map with the ICA component loadings of a putative ICA eye movement component. We then used the source analysis based on the realistic head model and the source model (gray matter and eyes) to confirm that the current density in the eyeball location of the source analysis accounted for a substantial portion of the current density across the entire source model. The ratio of the average current density per mm in the eyes, relative to the average current density in the whole head, was computed from the source analysis. We found that components with ratio values greater than approximately 1.5 had sources primarily in the eyeball volume. The right panel of **Figure 3** shows the eyeball current density for the same participant shown in the left panel.
The eye movement components were identified using three criteria: (1) the component loadings on the surface of the head were consistent with an eye movement, (2) source analysis localized the component to the eyes (ratio of current density in eyes and head >1.5), and (3) the temporal activation of the component occurred at the time of the electrooculogram activity in the eye and differed for right and left eye movements. For the nine participants, the average number of ICA components identified as eye movement components was 9.5 (range = 2 to 21, SD = 6.67). The remaining ICA components were used with loadings/activations to project from the ICA component space back into the temporal EEG space. This resulted in EEG data with the effect of the eye movements removed from the post-fixation EEG data.
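The back-projection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the EMSE pipeline used in the study: it assumes the ICA is available as a loading (mixing) matrix and a matrix of component activations, and the function and variable names are hypothetical.

```python
import numpy as np

def remove_components(mixing, activations, ocular_idx):
    """Back-project ICA activations to channel space, omitting ocular components.

    mixing:      (n_channels, n_components) component loadings
    activations: (n_components, n_samples) component time courses
    ocular_idx:  indices of components judged to be eye movement components
    """
    keep = np.ones(mixing.shape[1], dtype=bool)
    keep[np.asarray(ocular_idx, dtype=int)] = False
    return mixing[:, keep] @ activations[keep, :]

# Toy data: 4 channels, 4 components, 200 samples (random values).
rng = np.random.default_rng(0)
mixing = rng.normal(size=(4, 4))
activations = rng.normal(size=(4, 200))
eeg = mixing @ activations

cleaned = remove_components(mixing, activations, ocular_idx=[0])
removed = eeg - cleaned  # the part of the signal carried by component 0 alone
```

Because the reconstruction is a sum over components, dropping a component subtracts exactly its own contribution from every channel and leaves the contributions of the remaining components unchanged.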

**FIGURE 3 | The eye movement related source localization for one participant.** The left panel shows a scalp potential map with the ICA component loadings of a putative ICA eye movement component for one participant. The center panel shows the eyeball sources. The right panel shows the current density from the cortical source analysis for the ICA component that is shown in the left panel.

To first assess the goodness of this correction algorithm we examined the saccade locked activity from the channels surrounding the eyes. **Figure 4** shows the average EEG recording for the participant shown in **Figure 3** from about 100 ms preceding saccades to about 70 ms following saccade onsets for forward-reading eye movements. The top two graphs show the uncorrected EEG recording for the electrodes around the right and left eyes. The EOG-defined saccade onset (sec 0) shows the large EEG changes occurring in these electrodes during the reading eye movement (∼30 µV deflection). The bottom two graphs show the corrected EEG recording for the electrodes around the right and left eyes. In the 70 ms after the saccade, there is a small EEG deflection (<10 µV deflection). Note the activity immediately around saccade onset where some electrodes still show a small deflection in the EEG recording.

The eye-movement corrected data were processed as typical EEG data for subsequent analyses. The data were filtered with a 45 Hz low-pass filter and re-referenced to the average of all electrodes. The EEG was segmented at the start of each fixation defined by the timing from the eyetracker through 700 ms after fixation offset. Channels on individual trials were eliminated if there was a voltage change of greater than 50µV within a segment.
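Two of the preprocessing steps above, average re-referencing and amplitude-based rejection, can be sketched briefly. The sketch assumes that "voltage change" means the peak-to-peak range of a channel within a segment, which is one plausible reading and not something the text specifies; function names and toy values are hypothetical, and the low-pass filter is omitted. For simplicity the rejection check is shown on the toy segment directly.

```python
import numpy as np

def average_reference(eeg):
    """Subtract the across-channel mean at every sample. eeg: (n_channels, n_samples)."""
    return eeg - eeg.mean(axis=0, keepdims=True)

def bad_channels(segment, threshold_uv=50.0):
    """Mask of channels whose peak-to-peak range in the segment exceeds the threshold."""
    return np.ptp(segment, axis=1) > threshold_uv

# Toy segment in microvolts: 3 channels by 4 samples (made-up values).
segment = np.array([
    [0.0, 10.0, -5.0, 3.0],    # range 15
    [0.0, 40.0, -30.0, 5.0],   # range 70, exceeds 50
    [2.0, 4.0, 6.0, 8.0],      # range 6
])

referenced = average_reference(segment)
mask = bad_channels(segment)
```

Here only the second channel would be dropped for this segment; the other channels of the same trial are retained, matching the channel-by-trial rejection described in the text.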

#### **ERP ANALYSIS FOLLOWING EYE MOVEMENT CORRECTION**

**FIGURE 4 |** (caption incomplete in source) … **row:** corrected data, for the left and right eyes (left and right panels). Data … eye, 25, 26, 32, 128).

Corrected EEG data were imported into EEGLAB\ERPLAB for further analysis (version 10.2.5.8b and 3.0.2.1, respectively). Fixation events were matched with the original eye movement data and recoded to include fixation duration and several other eye movement variables. The recoded events were imported using the ERPLAB eventlist function. Across all channels, data from two to six electrodes were averaged together to compute 20 virtual electrodes which approximate standard 10–20 coordinates. Our analyses focused on the five posterior electrodes (see **Figure 2**). The data were epoched, binned, and averaged by text type and by fixation duration.
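The virtual-electrode step amounts to averaging small groups of channels. The sketch below shows the idea; the channel groupings are hypothetical and do not reproduce the montage used in the study.

```python
import numpy as np

def virtual_electrodes(eeg, groups):
    """Average the channels in each group. eeg: (n_channels, n_samples)."""
    names = list(groups)
    data = np.stack([eeg[groups[name], :].mean(axis=0) for name in names])
    return names, data

# Hypothetical groupings (0-based row indices), not the study's actual mapping.
groups = {"O1": [0, 1], "O2": [2, 3, 4]}

# Toy recording: 5 channels by 3 samples.
eeg = np.arange(15, dtype=float).reshape(5, 3)

names, virt = virtual_electrodes(eeg, groups)
```

Each row of `virt` is the sample-by-sample mean of its group, so the output has one row per virtual electrode and the same number of samples as the input.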

### **RESULTS**

#### **EYE MOVEMENT ANALYSIS OF READING CONDITION**

As a manipulation check, we first confirmed an effect of text type in the eye movement data. Replicating previous research (e.g., Vitu et al., 1995; Nuthmann et al., 2007; Henderson and Luke, 2012; Luke and Henderson, 2013), fixation durations were significantly longer in the pseudo-reading (270 ms) than text-reading conditions [220 ms; *F*(1, 8) = 40.1, *MSE* = 373, *p* < 0.001].

### **ERP BY FIXATION DURATION**

In an initial analysis to determine whether our eye movement correction method was effective, we stratified the EEG data by fixation duration to investigate the effect of eye movements on the fixation-based ERP (see **Figure 5**). For this analysis, fixation duration bins were chosen to roughly equate for the number of fixations per bin. Replicating Dimigen et al. (2011), we observed clear P1 and N1 components at most fixation durations. However, shortly after fixation offset, the next fixation elicited another P1, overlapping with later ERP components. The fact that P1 components from subsequent fixations appear earlier in the current fixation when the current fixation is shorter can be problematic for interpreting later components in the current fixation.
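Choosing bins that roughly equate the number of fixations per bin can be done with quantile-based edges. The following is a minimal sketch of that idea, not the procedure the authors used; the durations and the number of bins are made up.

```python
import numpy as np

def equal_count_bins(durations_ms, n_bins):
    """Return quantile-based bin edges and a bin label (0..n_bins-1) per fixation."""
    durations_ms = np.asarray(durations_ms, dtype=float)
    edges = np.quantile(durations_ms, np.linspace(0.0, 1.0, n_bins + 1))
    # Use interior edges only so that the minimum and maximum fall in valid bins.
    labels = np.searchsorted(edges[1:-1], durations_ms, side="right")
    return edges, labels

# Toy fixation durations in ms (made-up values).
durations = np.array([120, 150, 180, 200, 210, 230, 250, 280, 320, 400, 450, 600])

edges, labels = equal_count_bins(durations, n_bins=4)
counts = np.bincount(labels, minlength=4)
```

With many tied durations the counts can become uneven, which is one reason the bins in practice only "roughly" equate the number of fixations.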

#### **TIME-COURSE ANALYSIS OF P1 AND N1; GROWTH-CURVE ANALYSIS**

We next compared the ERP responses relative to fixation onset in the text-reading and pseudo-reading conditions using Growth Curve Analysis (Mirman et al., 2008) in R (R Development Core Team, 2012). This analysis combines the analyses of amplitude and latency into a single analysis, and treats time as a continuous predictor so that no binning is required. We also performed more traditional analyses of peak amplitude and latency, and the results were highly consistent with those we report below. The participant-level overall time course for electrodes T5, O1, O2, and T6 was modeled using the ms-by-ms average ERP activity with a third-order (cubic) orthogonal polynomial and a fixed effect of Text Type (Text-Reading vs. Pseudo-Reading) on all time terms.
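To make the role of the orthogonal polynomial time terms concrete, the sketch below builds linear, quadratic, and cubic terms over the P1 window and fits them to a single toy waveform by ordinary least squares. This is only an illustration of the time-term construction: the analysis in the text was run in R with participant-level data and a Text Type effect, none of which is reproduced here, and the waveform is invented.

```python
import numpy as np

def orthogonal_poly(time, degree):
    """Orthonormal polynomial time terms (columns: linear, quadratic, ...)."""
    t = np.asarray(time, dtype=float)
    vander = np.vander(t - t.mean(), degree + 1, increasing=True)
    q, _ = np.linalg.qr(vander)
    return q[:, 1:]  # drop the constant column

time_ms = np.arange(75, 126)  # 75-125 ms in 1-ms steps
terms = orthogonal_poly(time_ms, degree=3)

# Toy waveform: one positive peak centered at 100 ms (made-up amplitudes).
erp = 4.0 * np.exp(-((time_ms - 100.0) / 15.0) ** 2)

design = np.column_stack([np.ones(time_ms.size), terms])
coefs, *_ = np.linalg.lstsq(design, erp, rcond=None)
intercept, linear, quadratic, cubic = coefs
```

Because the terms are mutually orthogonal and orthogonal to the constant, the intercept equals the mean amplitude in the window, and each coefficient can be read independently of the others. For this symmetric toy peak the linear and cubic coefficients are numerically zero and the shape is carried by the quadratic term; the sign of each column depends on the QR routine, so only relative comparisons between conditions would be meaningful.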

Analyses of sample data in the P1 time window (75–125 ms) revealed significantly higher overall amplitudes in text-reading than in pseudo-reading at all tested electrodes (all *t*s > 3.65, all *p*s < 0.001; see **Figures 6**, **7**). Additionally, text type interacted with the quadratic term in all analyses (all *t*s > 2.63, all *p*s < 0.05), indicating that the positive wave had a steeper slope in the text condition. The absence of any effect or interaction involving the cubic term indicates similar latencies for text-reading and pseudo-reading. **Figure 6** shows the positive activity that occurred during this time window. The positive peak was larger across all tested electrodes for text-reading than for pseudo-reading. The difference wave between text-reading and pseudo-reading ERPs peaked at about 100 ms for all five posterior electrode sites.

For the N1 time window (125–210 ms), amplitudes were significantly more negative for text-reading than pseudo-reading at all tested electrodes (all *t*s > 2.33, all *p*s < 0.05), indicating a larger N1 in that condition (see **Figures 6**, **8**). Text type again interacted with the quadratic term in all analyses (all *t*s > 5.32, all *p*s < 0.001), indicating significantly steeper slopes in text-reading than pseudo-reading. Further, in the analyses of the two left-hemisphere electrodes, the cubic term interacted with text type (both *t*s > 2.33, both *p*s < 0.05) indicating that the cubic was a good fit in the text reading condition but not the pseudo-reading condition. This interaction indicates that the N1 peaked earlier for text-reading than for pseudo-reading at the left hemisphere electrodes.

**Figure 6** shows the negative activity occurring in the N1 time window. The ERP during text-reading showed a steeper slope from the peak of the positive component towards the baseline. Further, the difference wave in this figure shows the onset of the difference between text- and pseudo-reading occurred at about 125 ms on the left electrodes and about 150 ms on the right electrodes; and the corresponding peak of the difference wave also occurred earlier for text-reading than for pseudo-reading.

To further investigate the progression of the text-reading versus pseudo-reading effect across the different electrodes, we conducted a similar growth curve analysis on the difference wave (text-reading minus pseudo-reading) in the N1 time window, with electrode as a fixed effect. This analysis compared the curve of the waveform at electrode T5 to the curve at the other electrodes. The cubic polynomial was significant at T5 (coeff. = −0.8, SE = 0.12, *t* = −6.41, *p* < 0.001). The interaction of electrode and the cubic polynomial term was significant for electrodes O2 and T6 (both *t*s > 2.61, both *p*s < 0.05) but not for electrode O1 (*t* < 0.33), indicating that the cubic term fit the data from the two left hemisphere electrodes (T5 and O1) but not the right hemisphere electrodes (O2 and T6). The fact that the cubic term was the best fit on the left and the quadratic on the right over the same time window indicates that the N1 difference wave peaked earlier on the left than on the right.

**Frontiers in Systems Neuroscience www.frontiersin.org** July 2013 | Volume 7 | Article 28 |

Topographical maps of the ERP activity are shown in **Figure 7** for the P1 interval and **Figure 8** for the N1 interval. These maps illustrate the findings reported above. The positive activity (**Figure 7**) showed a distribution centered on the scalp ipsilateral to the forward reading eye movement, but with larger peak amplitude for text-reading than pseudo-reading. In contrast to this, the negative activity (**Figure 8**) showed a faster onset for the text-reading condition than for the pseudo-reading condition, with a "wave" of activity spreading from the left scalp sites toward the right scalp sites. The text condition also had significantly larger amplitude relative to the pseudo-reading condition. Thus, for the activity in the N1 interval, there were differences in the speed, slope, amplitude, and topography of the response. These findings are in line with prior research in which word reading resulted in significantly larger N1 and P1 amplitudes relative to pseudo-word reading (Sereno et al., 1998). This replication suggests that our source-localized ICA eye movement correction of the EEG signal removed the eye-movement-related artifacts in the data well enough to allow the investigation of ERPs time-locked to fixation onset.

### **DISCUSSION**

Eye movements and ERPs provide two important sources of data for investigating language processing generally and reading in particular. The possibility of co-registering and combining these two measures in connected-text reading has therefore been of wide interest in the psycholinguistics and reading communities, but the technical challenges have been formidable. In the present study our goal was to determine whether meaningful ERPs can be generated from connected-text paragraph reading. We therefore presented a new method for identifying and removing eye movement effects from EEG data, a new control condition for examining language effects in reading, a new paragraph-reading paradigm, and new data regarding early text-based effects in connected reading.

During normal reading, the eyes move through the text in a series of fixations and saccadic eye movements. Because the eye movements themselves generate a great deal of activity in the EEG data, it is difficult to observe subtle effects related to language processing in the ERP waveform. In the present study, we were interested in determining whether we could generate meaningful ERPs in which the event was tied to eye fixations in natural connected-text reading.

To determine whether ERPs related to reading could be distinguished in the resulting ERPs, we also introduced the use of a pseudo-reading control condition. In this pseudo-reading condition, a pseudo-font was used in which letters were replaced by geometric shapes that roughly maintained the look of text while providing no orthographic information. Texts written in this pseudo-font therefore retained letter spacing, word length, word spacing, and overall paragraph spatial structure without providing any meaning. In the experiment, participants were asked to read paragraphs of text and to move their eyes through pseudo-text as though they were reading. Eye movements in these conditions were remarkably similar (though there were also some important subtle differences), so the pseudo-reading condition provides a useful control for eye movements without language processing.

Comparison of the ERPs from the text-reading and pseudo-reading conditions revealed several clear differences in the shapes of the early ERP P1 and N1 components. For example, N1 amplitudes were more negative for text-reading than pseudo-reading at all tested electrodes. The N1 also peaked earlier for text-reading than pseudo-reading. Furthermore, in a comparison of text-reading and pseudo-reading in posterior regions, the difference wave peaked earlier in the left scalp regions than in the right. The N1 has been associated with discrimination processes at fixation (Luck and Vogel, 2000) and with word recognition effects (Sereno et al., 1998; Maurer et al., 2005). For example, Sereno et al. (1998) found differences in N1 for words, pseudo-words, and consonant strings. The differences we observed for text-reading versus pseudo-reading in the present study are consistent with and extend these previous findings to continuous reading, and provide initial validation for the correction methods we have employed. These initial results suggest that it will be worthwhile to refine these methods and pursue them in future studies designed to use co-registered eyetracking and ERPs, particularly with respect to early ERP components.

Eye movement correction algorithms have been used previously, even in free reading tasks (e.g., Dimigen et al., 2011). The procedure used in the current study compares favorably with those on two points. First, the eye movement correction algorithm introduced here was applied to the actual reading data rather than data from a separate eye movement calibration recording as in other studies (e.g., Dimigen et al., 2011; Plochl et al., 2012). It is likely that the types of eye movements made in calibration trials will be dissimilar to the ocular characteristics of eye movements occurring during reading (e.g., moving eyes at 15° in four cardinal directions, Dimigen et al., 2011; eye movements and blinks on a gray screen, Plochl et al., 2012). Separate eye movement calibration trials could result in models of the eye movement effect on EEG that are dissimilar to what happens during reading. Our *in situ* correction specifically models the eye movements occurring during reading for their effect on the EEG. Our strategy based on individual participants should account for individual differences in reading experience or ability. Second, several ocular correction techniques use tools that do not specifically identify the temporal location of a saccadic eye movement occurring during reading. Most techniques attempt to control for blinks, vertical saccades, small eye movements, and horizontal saccades. However, since we know the onset of the saccade that moves the eye from one word to another from both the eye tracker and the EOG, our method limits the correction to the eye movement activity in the EEG specifically related to the saccade that begins the fixation of interest. The use of the temporal sequence of the ICA activation as a criterion for identifying eye movement ICA components limits our correction to eye movements occurring during reading and excludes other eye movements (blinks, vertical saccades) from our correction. In this respect our method compares favorably to that used by Plochl et al. (2012) which used both ICA temporal activation and component loadings during eye-tracking defined saccadic activity (though see the first point above).

We believe our method is superior to other methods in distinguishing ocular-related EEG activity and neural-related EEG activity caused by eye movements. First, our method identifies the eye-movement related activity in the EEG that is related specifically to the movement of the eyes in the orbit, and does not correct all eye movement related EEG activity. Eye movements may affect EEG activity for "ocular" reasons or "neural" reasons. The ocular effects on EEG occur because of electrical activity generated by the eye. This includes muscle activity around the eyes during eye movements, rotation of the corneal dipole, and muscle/eyelid activity occurring during blinks; often these are labeled "eye movement artifacts" and are the target for correction. However, there is also eye movement-related EEG activity from neural sources. For example, the fixation-related-potentials in this and similar studies are due to neural activity temporally linked to the eye movement, but presumably caused by neural processes related to reading. Some presaccadic ERPs (e.g., presaccadic parietal slow wave, presaccadic spike potential; Richards, 2005, 2013) are neural in origin and likely involved in the neural control of eye movements. Using source analysis to identify ICA components whose primary sources are in the eyes distinguishes ocular-generated EEG potential changes and neurally-generated EEG potential changes. An interesting comparison may be made with the Dimigen et al. (2011) study, which used a technique that forms a source dipole model of eye movements based on a calibration trial, and also did not remove the parietal presaccadic spike potential. A small ERP potential also occurred around fixation onset in our study lateralized on the side of the eye movement (**Figure 4**, bottom right panel). This is likely the presaccadic spike potential, which is slightly lateralized over the eye electrodes ipsilateral to the eye movements (Richards, 2013). Unlike both the Dimigen et al. (2011) and the Plochl et al. (2012) correction methods, our method specifically corrects for eye movement activity in the EEG of ocular origin and does not remove eye movement related EEG activity of neural origin. Finally, the use of the temporal activation of the ICA alone is insufficient to identify these eye movement components. The additional criteria of the scalp topography of the component loadings being consistent with an eye movement (e.g., Plochl et al., 2012), and of the source of the EEG activity being localized in the eyes, ensure that the spatial source of the corrected activity is due to ocular activity. It is possible that component loading topography and component activations could be sufficient (e.g., Plochl et al., 2012), or that the ocular source analysis and the component activity could be sufficient. The ocular source analysis loosely reflects a particular topographical mapping on the head consistent with the eye movement in the EEG recording, so that using topographical mapping and the component activation would loosely match our three-pronged criteria.

There are a number of important remaining issues that will have to be resolved before the use of ERPs in connected-text reading can reach its full potential, especially with respect to later ERP components. One challenge is related to the fact that fixations have a relatively short duration in reading (e.g., averaging 220 ms for text-reading in the present study). Because of this, ERP effects arising from the fixation after a critical fixation (critical + 1 fixation) will often overlap with the later components from a critical fixation. This issue is illustrated by **Figure 5** (see also Dimigen et al., 2011, Figure 2b) where the components from the next fixation overlap earlier in time as the duration of the current fixation decreases. Adding to the challenge, language manipulations that are expected to affect late ERP components associated with fixation on a word in text (e.g., N400) are also likely to affect the durations of those fixations. Therefore, ERP components tied to critical fixation events in connected-text reading may be particularly susceptible to overlap of late components generated by the critical fixation and early components generated by the fixation following the critical fixation. And the nature of this overlap will change as the duration of the critical fixation changes (**Figure 5**). Dealing with this issue is not simply a matter of removing EEG effects from the eye movements themselves, but also of de-convolving overlapping waveforms containing the information that is of interest. This issue makes using the later components of ERPs in connected reading particularly challenging. However, we are optimistic that further work in this area will help to fulfill the promise of this method.
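The overlap problem and the idea of de-convolving overlapping waveforms can be illustrated with a toy regression. The sketch below is one generic approach (least-squares estimation with a design matrix of time-lagged event indicators) and is offered as an assumption about how such a correction could look, not as a method used or proposed in this study. All names and values are hypothetical, the data are noiseless, and every event is assumed to evoke the same response.

```python
import numpy as np

def lagged_design(n_samples, onsets, window):
    """Design matrix with one column per post-onset lag (0..window-1)."""
    X = np.zeros((n_samples, window))
    for onset in onsets:
        for lag in range(window):
            if onset + lag < n_samples:
                X[onset + lag, lag] += 1.0
    return X

def deconvolve(signal, onsets, window):
    """Least-squares estimate of the response shared by all events."""
    X = lagged_design(signal.size, onsets, window)
    response, *_ = np.linalg.lstsq(X, signal, rcond=None)
    return response

window = 10
true_response = np.sin(np.linspace(0.0, np.pi, window))  # toy evoked response
onsets = [0, 7, 19, 24, 40, 53]                          # some closer than the window
n_samples = 70

# Simulated recording: overlapping copies of the same response, no noise.
signal = lagged_design(n_samples, onsets, window) @ true_response

estimated = deconvolve(signal, onsets, window)

# Plain averaging of epochs, for comparison; overlap contaminates this estimate.
naive = np.mean([signal[o:o + window] for o in onsets], axis=0)
```

In this idealized case the regression recovers the toy response while the plain epoch average does not, because epochs that start less than one window apart contain activity from their neighbors. Real fixation-related data add noise and responses that vary with fixation duration and condition, which is part of what makes the problem difficult.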

In this study we focused on using the eye movement data to identify the events of interest for ERPs (fixations) and to provide information (saccade onsets) useful for removing the eye movement effects in the EEG. However, the longer-term promise of the co-registration of eye movements and ERPs is to combine the evidence generated simultaneously from both measures. For example, in studies of reading we are typically interested in how a theoretically motivated manipulation affects measures of language processing. Data relevant to assessing that manipulation often derive from both eyetracking and ERPs, but traditionally these data are generated from separate experiments (e.g., Sereno et al., 1998; Dambacher et al., 2006; Dambacher and Kliegl, 2007). It would be far more powerful to be able to provide evidence from both methods when the data has been collected simultaneously, so that data drawn from each method can then be used to constrain inferences drawn from the other. The present method development is ultimately directed toward achieving this goal.

#### **ACKNOWLEDGMENTS**

The research reported here was supported by grants from the National Science Foundation (BCS-1151358) and the National Institutes of Health (NICHD, R37-HD19842).

#### **REFERENCES**


Dambacher, M., et al. (2007). Welcome to the real world: validating fixation-related brain potentials for ecologically valid settings. *Brain Res.* 1172, 124–129. doi: 10.1016/j.brainres.2007.07.025


… activity in the brain. *Int. J. Psychophysiol.* 18, 49–65. doi: 10.1016/0167-8760(84)90014-X

… and event-related potentials. *Trends Cogn. Sci.* 7, 489–493. doi: 10.1016/j.tics.2003.09.010

… *J. Psychophysiol.* 40, 181–186. doi: 10.1016/S0167-8760(00)00185-9


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 March 2013; accepted: 14 June 2013; published online: 10 July 2013.*

*Citation: Henderson JM, Luke SG, Schmidt J and Richards JE (2013) Co-registration of eye movements and event-related potentials in connected-text paragraph reading. Front. Syst. Neurosci. 7:28. doi: 10.3389/fnsys.2013.00028*

*Copyright © 2013 Henderson, Luke, Schmidt and Richards. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Saccades during visual exploration align hippocampal 3–8 Hz rhythms in human and non-human primates

### *Kari L. Hoffman<sup>1,2,3</sup>\*, Michelle C. Dragan<sup>2,3</sup>, Timothy K. Leonard<sup>1,3</sup>, Cristiano Micheli<sup>2,4,5,6</sup>, Rodrigo Montefusco-Siegmund<sup>1</sup> and Taufik A. Valiante<sup>4,5,6</sup>*

*<sup>1</sup> Department of Psychology, Centre for Vision Research, York University, Toronto, ON, Canada*

*<sup>2</sup> Department of Biology, Centre for Vision Research, York University, Toronto, ON, Canada*

*<sup>3</sup> Neuroscience Graduate Diploma Program, York University, Toronto, ON, Canada*

*<sup>4</sup> Division of Fundamental Neurobiology, Toronto Western Hospital Research Institute, Toronto, ON, Canada*

*<sup>5</sup> Krembil Neuroscience Center, Toronto, ON, Canada*

*<sup>6</sup> Division of Neurosurgery, Department of Surgery, University of Toronto, Toronto, ON, Canada*

#### *Edited by:*

*Junji Ito, Research Center Juelich, Germany*

#### *Reviewed by:*

*Keith P. Purpura, Weill Cornell Medical College, USA; Lucia Melloni, Max Planck Institute for Brain Research, Germany; Conrado A. Bosman, University of Amsterdam, Netherlands*

#### *\*Correspondence:*

*Kari L. Hoffman, Department of Psychology, Centre for Vision Research, York University, 4700 Keele St., Lassonde Bldg., Toronto, ON M3J 1P3, Canada e-mail: khoffman@yorku.ca*

Visual exploration in primates depends on saccadic eye movements (SEMs) that cause alternations of neural suppression and enhancement. This modulation extends beyond retinotopic areas, and is thought to facilitate perception; yet saccades may also influence brain regions critical for forming memories of these exploratory episodes. The hippocampus, for example, shows oscillatory activity that is generally associated with encoding of information. Whether or how hippocampal oscillations are influenced by eye movements is unknown. We recorded the neural activity in the human and macaque hippocampus during visual scene search. Across species, SEMs were associated with a time-limited alignment of a low-frequency (3–8 Hz) rhythm. The phase alignment depended on the task and not only on eye movements *per se*, and the frequency band was not a direct consequence of saccade rate. Hippocampal theta-frequency oscillations are produced by other mammals during repetitive exploratory behaviors, including whisking, sniffing, echolocation, and locomotion. The present results may reflect a similar yet distinct primate homologue supporting active perception during exploration.

#### **Keywords: theta, electrocorticography, epilepsy, saccades, phase-locking, macaque, human, foraging**

### **INTRODUCTION**

For most primates, exploration of the environment is primarily visual, and makes use of the specialized mechanism of saccadic eye movements (SEMs): the rapid and repetitive displacement of a high-acuity region of the retina to sample different locations in the visual environment. During SEMs, neural activity is suppressed (Latour, 1962; Burr et al., 1994; Reppas et al., 2002; Thiele et al., 2002; Uematsu et al., 2013) whereas after SEMs, during fixation, neural activity is enhanced (Ibbotson et al., 2007, 2008; Cloherty et al., 2010). This fluctuation is thought to promote efficient processing of new visual information (Melloni et al., 2009; Schroeder et al., 2010).

Although saccades are known to modulate visual perception, the most notable effects of saccadic activity have been observed in early and intermediate visual areas (Bair and O'keefe, 1998; Leopold and Logothetis, 1998; Martinez-Conde et al., 2000; Reppas et al., 2002; Thiele et al., 2002; Ibbotson et al., 2007, 2008; Maldonado et al., 2008; Rajkai et al., 2008; Bremmer et al., 2009; Crowder et al., 2009; Cloherty et al., 2010; Ibbotson and Krekelberg, 2011; Ito et al., 2011). In the temporal lobe, eye movements made in the dark or following simple visual stimuli elicited spiking and/or local field potential (LFP) modulation in subdomains such as the superior temporal polysensory area, the parahippocampal gyrus, inferotemporal cortex, and hippocampus, though the consequences for perception are unclear (Ringo et al., 1994; Sobotka et al., 1997, 2002; Purpura et al., 2003; Bartlett et al., 2011; Jutras et al., 2013).

The interaction between neural responses to saccades and to visual stimulation illustrates one role of eye movements in active vision. Eye movements that are concomitant with the onset of new visual information, as occurs during naturalistic visual search, lead to temporal lobe neuronal activity which is phase-locked in the theta-alpha range (Bartlett et al., 2011). This coupling effect is stronger than that predicted exclusively by the visually-evoked response when decoupled from fixation, and is generally consistent with reports of phase-locking and phase-dependent codes in early visual areas (Montemurro et al., 2008; Rajkai et al., 2008; Bosman et al., 2009; Ito et al., 2011). Precisely timed responses through neural synchronization may facilitate the speed and efficacy of perceptual processing (Masquelier et al., 2009; Panzeri et al., 2010; Turesson et al., 2012; Womelsdorf et al., 2012; Lisman and Jensen, 2013), but such mechanisms are not limited to perception. Precise phase-alignment of spiking activity to oscillations in the rodent hippocampus is posited to be relevant for memory encoding, as well (Tort et al., 2009; Shirvalkar et al., 2010; Lisman and Jensen, 2013).

The hippocampus produces low-frequency theta oscillations (5–10 Hz) during whisking, sniffing, and locomotion in rodents (Grastyán et al., 1959; Vanderwolf, 1969), during echolocation in the bat (Ulanovsky and Moss, 2007), and during passive viewing of images in macaques (Jutras et al., 2013, but see Skaggs et al., 2007). These oscillations are, in turn, associated with encoding of experiences for later recall (for review, see Buzsáki, 2005; Hasselmo, 2005). The role of eye movements in modulating hippocampal oscillations during active search is unknown. Here we asked whether saccades influence hippocampal oscillations during a visual foraging task measured in the human and nonhuman primate.

### **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Six patients (3 males) with medically refractory epilepsy underwent surgical implantation of subdural surface electrodes and depth macroelectrodes to localize epileptogenic regions. Electrode location and type were selected solely on clinical considerations. These experiments occurred between 2 and 10 days post-operatively, with informed consent, and in accordance with protocols approved by the University Health Network Research Ethics Board and the York University Human Participants Review subcommittee. A single adult rhesus macaque (*Macaca mulatta*, female, 10 kg) also participated in these experiments following implantation of chronically indwelling tetrodes. The experiments occurred 2–5 months post-operatively, under protocols approved by the Animal Care Committee at York University, in accordance with the Canadian Council for Animal Care.

#### **EXPERIMENTAL DESIGN**

The basic experimental design has been described previously (Chau et al., 2011). Exceptions are noted in the summary below.

### *Stimuli*

Photographic images were taken from a large collection of natural scenes including landscapes, cityscapes, wildlife, and indoor scenes. One object in a given image was modified in Adobe Photoshop to give the impression that it disappeared, referred to here as the "target." Sizes, locations, and content of targets were varied to reduce the predictability of the target object by prior experience with the task. Images were presented full-screen at 1280 × 1024 pixel resolution.

#### *Behavioral procedure*

Each participant sat approximately 51 cm away from a 38-cm by 30-cm computer monitor. Patients sat in a small testing room under fluorescent lighting with 2–3 experimenters in the room with them. The macaque sat in a private darkened booth inside a room where the experimenters monitored the neural and behavioral activity. All participants first underwent a 13-point calibration of the eye tracker system (patients: iView RED, sampled at 60 Hz; macaque: iView high-speed primate, sampled at 1250 Hz; both from SensoMotoric Instruments, Teltow, Germany). The eye tracker was connected via ethernet cable to the stimulus presentation computer running Presentation (NeuroBehavioral Systems, Albany, CA, USA). During each trial of the main task, an original scene was shown in alternation with the target-modified scene, each lasting 500 ms, with an intervening 50-ms gray screen separating these image pairs. Participants searched for the single changing object (the target) and, upon detection, could elicit the end of the trial by fixating the target for 1000 ms. Regardless of whether the target was found within the time limit (typically 45 s), the trial ended by removing the gray screen gap, revealing the target as the only changing part of the image, thereby removing the "change blindness" illusion. In the case of the macaque, if the target was found in the time limit, this target "giveaway" was accompanied by delivery of a preferred smoothie treat. All participants of the main task viewed trials in blocks of 30, with the number of blocks per analysis varying from 4–10 in the patient participants and spanning 37 blocks for the macaque. With equal probability, the scene pairs in a given trial were either novel or repeated once from a previous trial, and targets were unique to the scenes, i.e., uncued. All but one participant completed this main task.
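The viewing geometry given above implies the approximate visual angle subtended by the display. The sketch below works this out from the stated distance (51 cm) and monitor size (38 cm by 30 cm). It assumes the eye is centered on and perpendicular to the screen and that images fill the whole monitor; the per-pixel figure is an average based on the stated 1280-pixel image width, and the function name is hypothetical.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Full visual angle subtended by an extent centered on the line of sight."""
    return math.degrees(2.0 * math.atan((size_cm / 2.0) / distance_cm))

width_deg = visual_angle_deg(38.0, 51.0)    # horizontal extent of the monitor
height_deg = visual_angle_deg(30.0, 51.0)   # vertical extent of the monitor

# Average horizontal angle per pixel for a 1280-pixel-wide full-screen image.
deg_per_pixel_x = width_deg / 1280
```

Under these assumptions the display spans roughly 41 degrees horizontally and 33 degrees vertically, which gives a sense of the saccade amplitudes available during search.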

The remaining participant, Patient 6, was run on a control variation of the task in which only one image of the pair was presented, and with a presentation duration of up to 6 s, after which the target location was revealed through alternation, as described above. Two successions of 10 unique targets-in-scenes were presented, and each succession was repeated 3 times, for a total of 80 trials. As with the main task, the trial ended before the time limit if the target location was fixated for 1000 ms. This control task provided a constant image during exploratory saccades, similar to other memory-guided search tasks (Ryan and Cohen, 2004; Smith et al., 2006; Chau et al., 2011; Chukoskie et al., 2013), removing the interposition of image onsets with saccades, as occurs during the main task.

The inter-trial interval procedures differed across participants and tasks. In the main task, patients saw a series of screens prompting verbal responses for memory of the scenes and target objects. The macaque was not asked for verbal report but was instead given a 20-s inter-trial interval in which the display was set to black. Similarly, in the control task (Patient 6) the end of the trial proceeded directly to a black display screen lasting 5 s, with no verbal report screen.

#### **NEURAL RECORDINGS**

Electrophysiological recordings in patients were obtained from depth macro electrodes with four electrical contacts used to record hippocampal activity. In addition, patients were implanted with strips of 4–6 subdural platinum-iridium electrodes (3-mm diameter, 10-mm inter-electrode distance; PMT, Chanhassen, MN, USA) targeting anterior temporal, ventral-medial temporal, and posterior temporal locations. A 4-contact subgaleal electrode over the parietal midline and facing away from the brain was used for ground and reference. Signals were filtered at 0.1 Hz–1 kHz, sampled at 5 kHz with a NeuroScan SynAmps2 data acquisition system (Compumedics, Charlotte, NC, USA), and recorded to disk. Electrode localization was verified by co-registering a post-operative CT image with a pre-operative MRI structural image.

In the macaque, quartz platinum tungsten tetrodes (Thomas Recordings, Giessen, Germany) were implanted chronically in a modified 18-drive (Neuralynx Inc, Gray Matter Research, Bozeman, Montana, USA). The guide tube was insulated until the tip, which ended ∼4 mm above the hippocampus and served as a local reference. Each tetrode was independently adjustable in depth up to 1 cm. Signal was split between spiking and LFP channels digitally and sampled at 32 and 2 kHz, respectively, using a Digital Lynx acquisition system (Neuralynx, Inc.). LFP was filtered between 0.5 Hz and 2 kHz. Electrode location was determined functionally with characteristic hippocampal activity and structurally using post-operative MRI.

For both neural acquisition systems, serial-output pulses from the Presentation stimulus-delivery PC were used to synchronize neural and behavioral events.

#### **DATA ANALYSIS**

Eye tracking files were preprocessed with iView X iTools IDF Event Detector, using a dispersion-based algorithm (I-DT) with a minimum fixation duration of 80 ms and maximum dispersion of 100 pixels (Salvucci and Goldberg, 2000).
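The logic of a dispersion-threshold fixation detector can be sketched as follows. This is a minimal illustration in the spirit of Salvucci and Goldberg (2000), not the vendor's Event Detector implementation; the function names, the input format (a list of `(x, y)` gaze samples at a fixed sampling rate), and the definition of dispersion as horizontal range plus vertical range are assumptions made for this example.

```python
def dispersion(points):
    """Dispersion of a set of gaze samples: x-range plus y-range (pixels)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations_idt(samples, sample_rate_hz,
                         min_duration_ms=80.0, max_dispersion_px=100.0):
    """Return fixations as (start_index, end_index_exclusive) pairs.

    samples: list of (x, y) gaze positions in pixels, evenly sampled.
    The default thresholds mirror the values quoted in the text; the
    function itself is a hypothetical helper, not the authors' code.
    """
    # Number of samples that spans the minimum fixation duration.
    min_len = max(2, int(round(min_duration_ms * sample_rate_hz / 1000.0)))
    fixations = []
    start = 0
    n = len(samples)
    while start + min_len <= n:
        end = start + min_len
        if dispersion(samples[start:end]) <= max_dispersion_px:
            # Grow the window until adding a sample would exceed the threshold.
            while end < n and dispersion(samples[start:end + 1]) <= max_dispersion_px:
                end += 1
            fixations.append((start, end))
            start = end
        else:
            # No fixation starts here; slide the window by one sample.
            start += 1
    return fixations
```

Fixation onsets in seconds would then follow as `start_index / sample_rate_hz`.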

Neural and eye movement data files were read into MATLAB (The Mathworks Inc., Natick, MA) and processed with purpose-built code and the FieldTrip toolbox (Oostenveld et al., 2011). LFP signal preprocessing included resampling to 1 kHz, filtering between 1 and 200 Hz, detrending and, for patient data but not macaque data, notch filtering at 30 and 60 Hz to remove line noise and artifacts. Neural data underwent artifact rejection through the FieldTrip toolbox. No interictal spikes were observed in the data presented here. Subsequent to testing in this experiment, some electrodes were identified as recording from epileptogenic regions (see **Table 1**, **Figure 4D**).

Exclusion criteria for eye movements were: (1) any eye movement occurring within the first 1 s of the trial, when strong image-evoked activity occurs; this criterion also allowed pre- and post-fixation windows to reflect similar visual stimulation conditions, i.e., after image onset; and (2) for trials in which the target was found before the time limit, fixations in the target area of interest that led to the "target found" trigger. In this way, we isolated eye movements made during search from those that were part of target fixation and recognition. Eye movements were also collected in the inter-trial intervals of the macaque experiment, in the darkened booth, and in the control task with Patient 6, on the black screen in the lit room. For the analysis of long fixations, we took the subset of fixations that lasted >500 ms and that were immediately preceded by a fixation whose duration was also >500 ms, thereby creating an analysis window that was not contaminated by additional eye movements and that reflected an eye movement rate of <2 Hz, i.e., below the neural frequencies of interest.
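The long-fixation criterion can be illustrated with a short sketch. It assumes that fixation durations for a single trial are available as an ordered list in milliseconds; the function name and input format are hypothetical and are not taken from the analysis code used in the study.

```python
def select_long_fixations(durations_ms, min_ms=500.0):
    """Indices of fixations that last longer than min_ms and are
    immediately preceded by a fixation that also lasted longer than min_ms.

    durations_ms: fixation durations of one trial, in temporal order.
    The first fixation of a trial has no predecessor and is never selected.
    """
    return [i for i in range(1, len(durations_ms))
            if durations_ms[i] > min_ms and durations_ms[i - 1] > min_ms]


# Example: only fixations 1, 4, and 5 satisfy both conditions.
example = [600, 700, 200, 800, 900, 510]
selected = select_long_fixations(example)
```

Because both the selected fixation and its predecessor exceed 500 ms, successive eye movements in this subset are more than 500 ms apart, which corresponds to a rate below 2 Hz.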

In general, statistical significance was determined in two steps. First, a null distribution was created by shuffling the saccade times in each trial, for all trials for that electrode site, calculating the analytical measure of interest for each data point, and repeating this process 1000 times to identify observed values that exceeded the 0.001 threshold of the null distribution for that data point. Second, a Benjamini–Hochberg FDR correction was applied to the time or time-frequency series of interest, to correct for multiple comparisons. Specific details or exceptions are noted below. The within-trial shuffling was selected to allow an identical number of samples per trial, and trials per analysis, and the same across-trial variability as the original data, jittering only the precise timing relative to the onset of a given fixation.
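The two-step procedure can be sketched in a few lines. The sketch below rests on stated assumptions: surrogate event times are drawn uniformly within each trial (the exact shuffling scheme is not specified in the text), the permutation p-value uses the common add-one convention, and all function names are hypothetical. Only the Benjamini–Hochberg step-up rule itself is standard.

```python
import random


def shuffle_event_times(n_events, trial_start_s, trial_end_s, rng):
    """Surrogate event times for one trial: same count, random timing.
    Uniform redrawing within the trial is an assumption of this sketch."""
    return sorted(rng.uniform(trial_start_s, trial_end_s) for _ in range(n_events))


def permutation_p_value(observed, null_values):
    """One-sided p-value: share of null values at least as large as observed
    (add-one convention, so the p-value is never exactly zero)."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)


def benjamini_hochberg(p_values, q=0.001):
    """Benjamini-Hochberg step-up procedure.
    Returns a list of booleans (True = significant at FDR level q),
    in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Largest rank whose p-value lies under the BH line rank/m * q.
        if p_values[idx] <= rank / m * q:
            k_max = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            significant[idx] = True
    return significant
```

In use, the measure of interest would be recomputed on each of the 1000 surrogate data sets to build `null_values` for every time or time-frequency point, and the resulting p-values would be passed to `benjamini_hochberg` as one family.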

Fixation-aligned mean evoked responses (**Figure 2**) were considered significant if the mean observed response exceeded the 0.002 percentile of the mean shuffled-fixation response distribution (a two-sided test of the 1000-element null distribution). Time-frequency plots aligned to fixation onset were calculated with FieldTrip using a Hanning window of 800 ms from −1.2 to 1.2 s (i.e., windows centered from −800 to 800 ms) taken every 10 ms, in 1-Hz increments from 3 to 80 Hz (**Figures 3**, **4**) and from 3 to 20 Hz (**Figure 5**). The Hann (or Hanning) window was selected rather than multiple tapers to maximize temporal precision with minimal spectral leakage for these time-limited, low-frequency events of interest. Peri-fixational changes in power were tested by comparing an observed time-frequency power value to those of its fixation-time shuffled distribution, and FDR-correcting for the number of time-frequency points tested.
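To make the windowing concrete, the following sketch computes the Hann-tapered Fourier coefficient of one data segment at one frequency, from which power and phase follow. It is not FieldTrip code and does not reproduce its normalization; the function names and the direct summation (rather than an FFT) are choices made for this illustration only.

```python
import cmath
import math


def hann_window(n):
    """Symmetric Hann taper of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def hann_fourier_coefficient(segment, freq_hz, sample_rate_hz):
    """Complex Fourier coefficient of a Hann-tapered segment at freq_hz.
    Power is abs(coefficient) ** 2; phase is cmath.phase(coefficient),
    referenced to the first sample of the segment."""
    taper = hann_window(len(segment))
    coeff = 0j
    for k, value in enumerate(segment):
        coeff += taper[k] * value * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
    return coeff


# Example: an 800-ms window at 1 kHz containing a 5-Hz oscillation.
fs = 1000.0
segment = [math.cos(2.0 * math.pi * 5.0 * k / fs + 0.5) for k in range(800)]
coeff = hann_fourier_coefficient(segment, 5.0, fs)
power, phase = abs(coeff) ** 2, cmath.phase(coeff)
```

Sliding such a window in 10-ms steps and repeating the calculation at each frequency of interest yields a time-frequency map of power and phase per fixation.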

*For each participant, the sex, epileptogenic zone, number of recording sites for a given location, number of saccades, median number of saccades per trial, and fast Fourier transform (FFT) trial-averaged theta peak are shown. The number of saccades for M1 varied by recording site (var). Abbreviations are: P, patient; MM, Macaca mulatta; F, female; M, male; L, left; R, right; Hipp, hippocampus; OFC, orbitofrontal cortex; TL, temporal lobe neocortex; TP, temporal pole. Underlined numbers indicate post-experiment seizure activity in that hippocampus. \*Patient 6 ran a control task; data were analyzed separately, see Materials and Methods.*

Phase analysis used the same time and frequency windows as the power analysis. Phase alignment was calculated as the pairwise phase consistency, or PPC (Vinck et al., 2010). Briefly, the PPC, like other measures such as the phase locking value (PLV), is a measure of the consistency of phase across observations for a given frequency. Whereas the PLV calculates phase for each event (here, fixation) and then determines the central tendency of the distribution, the PPC calculates the circular distance between pairs of observations (the cosine of the absolute angular distances), permuted over the population of events, from which central tendency is then measured. The reasoning is that if events are aligned to a common phase, the average absolute angular distance among event pairs (i.e., among relative phases) will be small. We also calculated the PLV measure for comparison and, as previously established analytically, the PPC approximates the square of the PLV.
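The relationship between the two measures can be made explicit with a small sketch. It assumes that one phase value (in radians) per fixation is already available for a given time-frequency point; the function names are hypothetical. The pairwise form follows the verbal definition above, and the closed form uses the identity that the sum of cosines over all pairs equals (|Σ exp(iθ)|² − N)/2, so that PPC = (N·PLV² − 1)/(N − 1).

```python
import cmath
import math


def plv(phases):
    """Phase locking value: length of the mean resultant vector."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def ppc_pairwise(phases):
    """Pairwise phase consistency from its definition:
    mean cosine of the phase difference over all distinct pairs."""
    n = len(phases)
    total = 0.0
    for j in range(n - 1):
        for k in range(j + 1, n):
            total += math.cos(phases[j] - phases[k])
    return 2.0 * total / (n * (n - 1))


def ppc_fast(phases):
    """Same quantity in closed form, without the double loop."""
    n = len(phases)
    resultant_sq = abs(sum(cmath.exp(1j * p) for p in phases)) ** 2
    return (resultant_sq - n) / (n * (n - 1))


# Perfectly aligned phases give PPC = PLV = 1; four evenly spread phases
# give PLV = 0 and a slightly negative PPC of -1/3.
aligned = [0.7] * 6
spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
```

For large numbers of events the closed form converges to PLV², consistent with the approximation noted above, while remaining free of the positive bias that the PLV shows for small samples.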

For the results shown in **Figures 4**, **5**, observed PPC values were compared to a permuted distribution containing the PPC values obtained when fixation onset times were shuffled within trials (over 1000 permutations). Significance masks were set at the 0.001 threshold of this permuted distribution and also had to survive multiple comparison correction using the Benjamini–Hochberg FDR correction of the Rayleigh distribution, set to *q* < 0.001. To test the frequency specificity of peri-fixational phase alignment, windows centered from −50 to 250 ms from fixation onset were selected for each frequency 3–80 Hz and tested against the fixation-time shuffled distribution at the respective frequency. Frequencies with PPCs exceeding the *p* < 0.001 threshold and also surviving the FDR correction are indicated in **Figure 4C**. Temporal selectivity was tested by comparing these peri-fixational PPC values to those obtained from a window centered −800 to −600 ms prior to fixation onset, i.e., "pre," and measured using a Wilcoxon signed rank test "signrank" (**Figure 4D**). PPC data from each individual hippocampus were tested for values exceeding *p* < 0.001 of its fixation-shuffled distribution obtained over the same time window. The individual results are presented in **Figure 4E** as a normalized index: (PPC\_peri − PPC\_pre)/(PPC\_peri + PPC\_pre), allowing an appreciation of the degree of fixation-related phase alignment above that seen prior to the fixation.
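As a worked illustration of the normalized index, the sketch below evaluates it for made-up PPC values; the numbers and the function name are hypothetical and are not results from this study.

```python
def normalized_ppc_index(ppc_peri, ppc_pre):
    """(PPC_peri - PPC_pre) / (PPC_peri + PPC_pre).
    Positive values indicate stronger phase alignment around fixation
    than in the pre-fixation window."""
    denominator = ppc_peri + ppc_pre
    if denominator == 0:
        raise ValueError("index undefined when PPC_peri + PPC_pre is zero")
    return (ppc_peri - ppc_pre) / denominator


# Illustrative values only: peri-fixation PPC three times the pre-fixation PPC.
index = normalized_ppc_index(0.03, 0.01)  # approximately 0.5
```

Because the PPC is an unbiased estimator and can be slightly negative when phases are not aligned, the index is bounded between −1 and 1 only when both PPC values are non-negative.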

**Figure 5** shows the PPC in restricted frequency bands across control conditions. As before, each PPC value is compared to its fixation-shuffled distribution and FDR corrected for its respective plot. Only values exceeding both thresholds survive the mask. For 5A, the original and a subset of data are shown; the subset includes only long fixations before and after the fixation event of interest.

#### **RESULTS**

#### **SEARCH BEHAVIOR**

The seven participants in this study contributed between 1000 and 28000 fixation events during visual scene search (**Table 1**). All participants actively scanned the image in search of the "target" object (**Figures 1A,B** for examples) with median fixation durations between 180 and 350 ms across participants (**Figures 1C–H**).

#### **RECORDING LOCATIONS**

We determined patient recording locations from CT coregistration to the pre-operative MR independently from electrophysiological signal analysis. In addition, the macaque recordings included functional characteristics of hippocampal activity such as sharp wave ripples, and complex spikes, as defined in previous studies (Skaggs et al., 2007). Not all surface electrodes in patients sampled the same regions, with the exception of co-localized temporal pole sampling (e.g., **Figure 2A**, **Table 1**).

#### **EVOKED RESPONSES**

Recordings aligned to fixation onset revealed several patterns of modulation. A fast, transient response just prior to fixation was seen at most sites (green lines, **Figure 2B**, reflect this across-electrode common response). In comparison to the average, the pre-fixation transient was strongest at the temporal pole sites. The hippocampal probes showed a response following fixation that was not common to other sites and that could manifest across multiple HC contacts as opposite-polarity signals, depending on the hippocampal recording site (**Figure 2B**, second plot). Slight fluctuations in frequency and phase of responses across trials can lead to attenuation of the evoked response, so higher frequencies and sustained responses can be difficult to measure. To evaluate the relative strength of oscillations independent of polarity and of evoked oscillations associated with fixations, we calculated the trial-by-trial spectral power.

**FIGURE 1 | Saccadic eye movements during visual search. (A)** Gaze during one example trial from patient 2. The sequence of gaze locations is shown in purple. The target from this trial was a wind chime, which has been outlined with a yellow dashed rectangle for visibility in this figure. The patient spent the full 45 s searching for the target. **(B)** Gaze during the same trial viewed by a macaque. Conventions are as in **(A)**, and the macaque also did not find the target on this trial. **(C–H)** Normalized histogram of fixation durations during the search task for each of the six participants in the main task, respectively. Median search times are listed and indicated with a vertical red line.

**FIGURE 2 |** [Caption only partially recoverable.] ... responses from several electrode locations, aligned to fixation onset. ... across subjects, see **Table 1**.

#### **SPECTRAL POWER**

When mean spectral power was calculated from segments of data aligned to fixation during search, a high-frequency (25–80 Hz) power modulation was seen that was significant in 5/9 hippocampi tested (**Figure 3**). No changes in power under 20 Hz were observed at any recording site. If the theta-band activity is not power-modulated by eye movements, this indicates that either theta occurs independently of eye movements, or its phase is altered without a corresponding change in theta amplitude, as has been observed in other cognitive processes.

#### **POST-FIXATION PHASE ALIGNMENT**

SEMs elicited phase alignment in all hippocampal recording sites from all participants tested (for an example, see **Figure 4A**; for the group mean, see **Figure 4B**). PPC results were qualitatively indistinguishable from those obtained using the PLV measure, so only the PPC values are depicted. The alignment was typically seen within the first 200 ms following fixation, and was restricted to a band that peaked within the 3–8 Hz range (**Figure 4C**; note non-significant harmonics of individual peaks at 10–16 Hz). In addition, 5/9 hippocampal recordings showed a brief PPC closely locked to the saccade event, ranging between 15 and 35 Hz (**Figures 4A,B**), but, presumably due to its transience, it did not survive significance testing in the −50 to 250 ms centered peri-fixation tests (**Figure 4C**). Furthermore, the beta-band but not theta-band phase alignment was seen in other sites that had evoked responses to saccades at that time, such as those in the temporal pole (**Figure 2**).

The duration of phase alignment and the frequency band overlapped with the average rate of saccades (2–6 Hz, **Figure 1B**). To determine whether hippocampal 3–8 Hz phase is passively reflecting the rate of eye movements, we calculated the PPC for the subset of fixations that were protracted in time (>500 ms fixations for both the aligned and the immediately preceding fixation, or <2 Hz). **Figure 5A** shows that even for this subset of long fixations, the 3–8 Hz phase effects seen from the full distribution persist. Phase alignment does not persist for longer than when shorter fixations are included; nevertheless, we see alignment at the same frequencies rather than a decay or drop in frequency corresponding to the new saccade rate.

The effects of eye movements on hippocampal activity could be due to sampling new parts of the image, or they could be extraretinal, reflecting some non-visual correlate of the eye movement itself. To address the relevance of the visual scene search to theta phase alignment, we compared the PPC during search to the PPC following fixations made on the darkened monitor, during the inter-trial intervals. Whereas all recording sites showed phase alignment within the 3–8 Hz band during scene search, no site showed this alignment from fixations that occurred during the ITI (**Figures 5D,F**).

**FIGURE 3 | Time-frequency spectrogram of hippocampal activity aligned to fixation onset during visual search, grand averaged across each patient.** Frequency was measured from 3 to 80 Hz at 1 Hz intervals, time was measured in 800 ms Hanning-tapered windows, shifted every 10 ms. For visualization of each frequency band, power is presented in each band relative to the average seen for that frequency band between ±800 ms around fixation onset. None of the hippocampal recording sites showed theta power modulation, though some exhibited modulation in higher frequency bands.

Finally, the image alternation was occurring throughout visual search in the main task. To determine whether 3–8 Hz phase alignment occurred during visual search independent of the changing stimulus, we tested a patient on a constant-image control task. Theta phase alignment persisted during search on the static image, but was lost during the inter-trial interval, suggesting that image-guided visual search elicits the 3–8 Hz phase alignment.

### **DISCUSSION**

Despite numerous differences across species, both human and monkey share a common mode of visual scene exploration. Capitalizing on the similarities in behavior, we conducted the same task in human and macaque and found a similar response in the hippocampus following fixations during visual search. The response, seen in all participants, was a 3–8 Hz phase alignment to the fixation. The effect was relatively short lived, lasting around or slightly longer than the typical duration of a fixation, amounting to 2 or 3 cycles of the oscillation. The effect was not driven solely by the rate of eye movements, because it persisted when only long-lasting (>500 ms) sequences of fixations were considered. The short time course even for long fixations suggests that the response is rapidly dampened, even without the interruption of a subsequent eye movement. Furthermore, the effect was not observed for eye movements made in between search trials, on a darkened monitor (**Figure 5D**), and even with a darkened environment (**Figure 5F**). Thus, the eye movement *per se* was not sufficient to elicit the observed phase-aligned activity, suggesting that visual input and/or task-related factors contributed to the phase alignment. When a constant, unchanging scene was presented for the duration of search, the phase alignment persisted, suggesting that exogenous changes in visual input do not underlie the 3–8 Hz phase alignment (**Figure 5C**).

**FIGURE 4 | Phase-alignment to fixations during search shows specificity in frequency and time. (A)** Single-subject theta-band phase-locking to fixations. Shown is the pairwise phase consistency (PPC) for each time-frequency bin described in **Figure 3**, with time on the x-axis relative to fixation onset and frequency on the y-axis. Significant PPC values are unmasked (*p* < 0.001 of fixation-time-shuffled distribution and Rayleigh test *p* < 0.001 FDR corrected). **(B)** Group-averaged PPC plots for each hippocampus sampled. Conventions are as in **(A)**, but for the mean PPC values, with no masking. **(C)** Frequency band-limited hippocampal phase alignment to fixations, across subjects. For each frequency indicated on the x-axis, the y-axis shows the mean PPC value from time-frequency points centered −50 to 250 ms from fixation onset, expressed as the proportion of the maximal PPC value of that sample from 3 to 80 Hz and −800 to 800 ms. Each thin line reflects the normalized PPC from a single hippocampus, with significant frequency bands indicated with a thickened gray line; all hippocampi produced significant phase concentration within a 3–8 Hz band; none produced significant phase concentration above 12 Hz. The black bold line is the group-averaged mean (*N* = 9). **(D)** Peri-fixational phase alignment, group averages pre- and peri-fixation. Average PPC values from time bins centered −50 to 250 ms from fixation onset were compared with PPC values −800 to −600 ms prior to each fixation. <sup>∗</sup>*p* < 0.01, Wilcoxon signed rank test. **(E)** Bars reflect the PPC values from each hippocampus recorded, indexed as the difference between the pre- and peri-fixational time windows (see Materials and Methods). Participant number and hemisphere are indicated below each bar.

**FIGURE 5 | Three to eight hertz phase alignment depends on visual search but not on matched-rate (3–6 Hz) saccades.** Left columns show full data and right columns the control condition. Significant PPC values are unmasked (*p* < 0.001 of fixation-time-shuffled distribution and Rayleigh test *p* < 0.001 FDR corrected). **(A)** PPC from a single hippocampal recording site in a patient. **(B)** PPC from the same site, excluding all but fixations lasting >500 ms both before and after the fixation onset of interest. **(C)** Average PPC from Patient 6 during search in the constant-stimulus version task. **(D)** Average PPC from the same sites, but for fixations made on the dark screen during the inter-trial intervals. **(E)** Average PPC from the macaque hippocampus aligned to fixations occurring during the main-task trials, and the inter-trial interval on a black screen **(F)**.

#### **THREE TO EIGHT HERTZ PHASE ALIGNMENT OCCURS WITHOUT INCREASES IN POWER**

Some of the earliest descriptions of theta phase resetting or aligning to an independent event noted that it was not associated with increases in theta power (Buño et al., 1978; Givens, 1996). In the paper by Givens, a continuous conditional discrimination (CCD) task showed phase-resetting where a sensory discrimination task did not. Theta power, however, was unchanged across tasks. In human ECoG recordings, theta phase-resetting was not associated with increases in power, suggesting a reorganization of oscillations rather than an evoked response (Rizzuto, 2003). During a Sternberg item recognition task, resetting was the predominant effect, and over a larger spatial extent. Another account of phase alignment without power changes was shown to result from sclerosis (Mormann, 2007), which would be relevant to the patient population; however, in the present study, individual hippocampi with no known or visible signs of pathology also showed phase-alignment in the absence of power changes (**Figures 4C,D**).

#### **CELLULAR CORRELATES OF THETA RESETTING IN THE HIPPOCAMPUS THAT ACCOUNT FOR NO CHANGE IN POWER**

The lack of power changes is consistent with spiking data (Vinogradova, 1995; Zugaro et al., 2005) in which theta phase reset is associated with no significant increase in spiking. On the contrary, a resetting stimulus first results in complete quiescence within the hippocampus, followed by periodic increases in spiking activity at the theta frequency. The maxima of the peaks, however, are not greater than the spike rates prior to the resetting stimulus. Thus, increases in spike-field coherence rather than overall firing rate are predicted for the post-saccadic activity. This would be consistent with human unit recordings during a visual recognition memory task in which theta phase-locking but not spike-triggered power distinguished later-remembered vs. later-forgotten trials (Rutishauser et al., 2010). Even in rodents known for sustained hippocampal theta, the spiking activity used to decode the spatiotemporal context occurs within one cycle and can, under some circumstances, flip as rapidly as one theta cycle, suggesting that the underlying coding can be accomplished within this time frame (Jezek et al., 2011).

#### **PHASE CLUSTERING AND POWER INCREASES**

A notable difference between our results and other intracranial results looking at hippocampal responses is that we see only phase effects, whereas often both power and phase increases are identified. This suggests that the mechanisms are dissociable in the hippocampus and may be operative under different conditions. A detailed analysis was conducted of the hippocampal potentials during a continuous visual word recognition paradigm in which a button press indicated old or new judgments (Mormann et al., 2005). Phase locking occurred for both hits and correct rejections in the theta and alpha bands. Power increases were greater for hits than for correct rejections; thus, the phase locking they describe appears to be different from the evoked responses in this memory task. In the present study, the lack of phase alignment in the ITI, despite saccade generation, suggests that phase alignment is specific to post-saccadic (retinal) processing rather than an automatic result of the post-saccadic (or extra-retinal) state.

#### **HIGHER-FREQUENCY PHASE AND POWER EFFECTS**

For some participants, hippocampal power above 25 Hz was enhanced in a narrow window around the time of the saccade. In addition, phase alignment was seen in some individual sites at 15–35 Hz, also locked to the saccade event, but none survived the statistical testing in the peri-event window used here. These modulations, especially those also observed in the temporal pole, may be attributable to oculomotor artifact, even in intracranial recordings (Yuval-Greenberg et al., 2008; Kovach et al., 2011; Nagasawa et al., 2011). However, visually-modulated neural responses are also known to occur in early visual areas in these frequency bands (Ito et al., 2013), and gamma oscillations are thought to play an important role in hippocampal computation in rats (Colgin et al., 2009) and primates (Jutras et al., 2009); the higher-frequency responses are therefore not necessarily attributable to eye movement artifacts. In contrast to the higher-frequency responses, the protracted hippocampal response was of lower frequency than that described for oculomotor artifacts, did not modulate power, was not limited to the eye movement duration, showed a polarity reversal across two probes spanning the hippocampus mediolaterally, and was additionally task-dependent: there was greater phase alignment when eye movements were made during the search task than during the inter-trial interval, disambiguating it from other responses.

#### **WHETHER AND WHICH THETA**

The 3–8 Hz response in this task was among the most consistent and robust effects, with all participants showing this band in response to fixation. Like other studies with humans, macaques, and bats, clear theta rhythmicity in this study appears to be shorter-lived than what is seen in the rat. For eye movements, this may fall conveniently within the fixation durations, yet we did not observe a "ringing" or sustained theta commensurate with the sustained fixation windows. One possibility is that the nature of sampling strongly constrains the hippocampal oscillations; the greater the periodicity of movement, the more likely the hippocampus will entrain to it. Alternatively, hippocampal theta in primates may simply not have the same resonance properties as that of the rat hippocampus. Another non-exclusive possibility is that the hippocampus here is matching extra-hippocampal oscillations. Both of these possibilities are broadly supported by the short-lived (500 ms) theta coherence associated with successful memory encoding among hippocampal and other neocortical structures (Burke et al., 2013), theta synchronization associated with visual search (Bosman et al., 2010), and the long-range theta coherence in fronto-parietal networks during planning epochs that are initiated by eye fixations (Phillips et al., 2013). Finally, if eye movements trigger a band-limited response to a single event, the typical rates of repetition of eye movements could nevertheless produce effects at the cellular level that may be indistinguishable from those produced through sustained or intrinsic rhythms. As such, eye movements could still co-opt the functionality associated with these rhythms as they have been observed in rats. In any case, the hippocampus of primates is sensitive to visual exploratory rhythms; how this constrains or facilitates hippocampal function remains to be seen.

#### **ACKNOWLEDGMENTS**

The authors would like to thank Eleanor and Emily Murphy for assistance with the training and recordings in macaques and Martin Vinck for assistance with the phase-analytical measures. We would also like to thank the patients for their willingness and perseverance during recordings. This research was supported by: NSERC CREATE VSA (Rodrigo Montefusco-Siegmund, Timothy K. Leonard), NSERC Discovery Grant (Kari L. Hoffman), CIHR (Cristiano Micheli, Taufik A. Valiante), Alfred P. Sloan Foundation (Kari L. Hoffman), the Krembil Foundation (Kari L. Hoffman), CFI (Kari L. Hoffman), and an Ontario MRI ERA (Kari L. Hoffman).

#### **REFERENCES**

*Brain Res.* 1319, 92–102. doi: 10.1016/j.brainres.2010.01.004


theta rhythm?–Linking behavioral data to phasic properties of field potential and unit recording data. *Hippocampus* 15, 936–949. doi: 10.1002/hipo.20116


Leopold, D. A., and Logothetis, N. K. (1998). Microsaccades differentially modulate neural activity in the striate and extrastriate visual cortex. *Exp. Brain Res.* 123, 341–345.

Lisman, J. E., and Jensen, O. (2013). The theta-gamma neural code. *Neuron* 77, 1002–1016. doi: 10.1016/j.neuron.2013.03.007

Maldonado, P., Babul, C., Singer, W., Rodriguez, E., Berger, D., and Grun, S. (2008). Synchronization of neuronal responses in primary visual cortex of monkeys viewing natural images. *J. Neurophysiol.* 100, 1523–1532. doi: 10.1152/jn.00076.2008


open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. *Comput. Intell. Neurosci.* 2011, 156869. doi: 10.1155/2011/156869


echolocating bats. *Nat. Neurosci.* 10, 224–233. doi: 10.1038/nn1829


intrahippocampal perturbation. *Nat. Neurosci.* 8, 67–71. doi: 10.1038/nn1369

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 May 2013; paper pending published: 28 May 2013; accepted: 02 August 2013; published online: 30 August 2013.*

*Citation: Hoffman KL, Dragan MC, Leonard TK, Micheli C, Montefusco-Siegmund R and Valiante TA (2013) Saccades during visual exploration align hippocampal 3–8 Hz rhythms in human and non-human primates. Front. Syst. Neurosci. 7:43. doi: 10.3389/fnsys.2013.00043*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Hoffman, Dragan, Leonard, Micheli, Montefusco-Siegmund and Valiante. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Parafoveal X-masks interfere with foveal word recognition: evidence from fixation-related brain potentials

*Florian Hutzler<sup>1</sup>\*<sup>†</sup>, Isabella Fuchs<sup>2†</sup>, Benjamin Gagl<sup>1</sup>, Sarah Schuster<sup>1</sup>, Fabio Richlan<sup>1</sup>, Mario Braun<sup>1</sup> and Stefan Hawelka<sup>1†</sup>*

*<sup>1</sup> Department of Psychology, Centre for Neurocognitive Research, University of Salzburg, Salzburg, Austria*

*<sup>2</sup> Department of Basic Psychological Research and Research Methods, Faculty of Psychology, University of Vienna, Vienna, Austria*

#### *Edited by:*

*Andrey R. Nikolaev, KU Leuven, Belgium*

#### *Reviewed by:*

*Bart Machilsen, University of Leuven, Belgium; Gijs Plomp, UNIGE, Switzerland; Timothy Jordan, Economic Research Foundation, Turkey*

#### *\*Correspondence:*

*Florian Hutzler, Department of Psychology, Centre for Neurocognitive Research, University of Salzburg, Hellbrunnerstr. 34, 5020 Salzburg, Austria; e-mail: florian.hutzler@sbg.ac.at*

*†These authors have contributed equally to this work.*

The boundary paradigm, in combination with parafoveal masks, is the main technique for studying parafoveal preprocessing during reading. The rationale is that the masks (e.g., strings of X's) prevent parafoveal preprocessing, but do not interfere with foveal processing. A recent study, however, raised doubts about the neutrality of parafoveal masks. In the present study, we explored this issue by means of fixation-related brain potentials (FRPs). Two FRP conditions presented rows of five words. The task of the participant was to judge whether the final word of a list was a "new" word, or whether it was a repeated (i.e., "old") word. The critical manipulation was that the final word was X-masked during parafoveal preview in one condition, whereas another condition presented a valid preview of the word. In two additional event-related brain potential (ERP) conditions, the words were presented serially with no parafoveal preview available; in one condition, presentation followed a fixed timing, and in the other, it was self-paced by the participants. As expected, the valid-preview FRP condition elicited the shortest processing times. Processing times did not differ between the two ERP conditions, indicating that "cognitive readiness" during self-paced processing can be ruled out as an alternative explanation for differences in processing times between the ERP and the FRP conditions. The longest processing times were found in the X-mask FRP condition, indicating that parafoveal X-masks interfere with foveal word recognition.

**Keywords: visual word recognition, preview benefit, invisible boundary technique, parafoveal masks, eye movements, EEG**

### **INTRODUCTION**

Most of what we know about parafoveal preprocessing is based on eye movement studies which used the invisible boundary technique (Rayner, 1975). The boundary technique makes it possible to experimentally manipulate the characteristics of the upcoming, parafoveal word. To illustrate, an invisible boundary is placed in a sentence before a target word. As long as the reader does not cross the boundary, the preview of the target word is experimentally manipulated (e.g., masked). When the reader's eyes cross the boundary, the preview is replaced by the target word.

Central to the present study is a variant of the boundary paradigm during which the parafoveal preview is masked. In this kind of experimentation, a parafoveal preview is presented which is either valid, that is, identical to the target word, or partially valid (e.g., preview: *vievcn* or *viewXX*; target: *viewer*). The conditions with the valid and the partially valid previews are compared to a "baseline" condition in which the parafoveal preview of the target word is entirely masked. The masks which are used most often are either different letter masks (e.g., preview: *nmovcn*; target: *viewer*) or X-masks (e.g., preview: *XXXXXX*; target: *viewer*; see Rayner, 2009). The rationale is that the mask prevents parafoveal preprocessing. The critical contrast is whether (and to what extent) participants are faster in the subsequent foveal recognition of the target word when they are presented with (partially) valid previews compared to the baseline condition. If the processing times are shorter in the experimental condition than in the baseline condition, then the standard interpretation is that useful information from the parafoveal preview was extracted during parafoveal preprocessing. This information may assist foveal processing of the target word. Put differently, the parafoveal preview facilitated foveal word recognition.

This interpretation, however, is crucially dependent on the "neutrality" of the mask, that is, the parafoveal mask itself must not induce uncalled-for effects during parafoveal preprocessing and must not affect (i.e., interfere with) foveal processing of the target word. If, to the contrary, the mask actually did affect parafoveal preprocessing and, in the most detrimental case, interfered with foveal processing as a consequence, then the interpretation sensu facilitation could be unwarranted. To illustrate, let us assume that a parafoveal mask (e.g., an X-mask) disrupts parafoveal processing and interferes with the subsequent foveal processing of the target word. As a consequence, the parafoveal preview of the mask may lead to a prolongation of foveal word recognition of, say, 30 ms. In such a case, an ostensible benefit of parafoveal preprocessing of, for example, a partially valid preview would be substantially overestimated, if it was derived from the contrast with the "baseline" condition.
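The overestimation described above can be written out as a short worked equation. The symbols are introduced here for illustration only and do not appear in the original: $T_0$ is the foveal recognition time with no parafoveal information at all, $b$ is the true benefit of a (partially) valid preview, and $c$ is the cost inflicted by the mask. The sketch also assumes that benefit and cost combine additively.

$$
\Delta_{\text{observed}} = T_{\text{mask}} - T_{\text{preview}} = (T_0 + c) - (T_0 - b) = b + c
$$

With the mask cost of $c = 30$ ms from the example and a purely hypothetical true benefit of $b = 20$ ms, the contrast with the masked "baseline" would suggest a benefit of 50 ms. Only when $c = 0$, that is, when the mask is neutral, does the observed difference equal the true benefit.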

Whether X-masks or different letter masks are indeed free of uncalled-for effects on foveal word recognition has seldom been explicitly investigated. One exception is an early study by Rayner et al. (1978) whose finding led to a (short-lived) theoretical controversy about the suitability of various types of parafoveal masks (McClelland and O'Regan, 1981a,b; Rayner and Slowiaczek, 1981). To illustrate, Rayner and Slowiaczek (1981) reported that "*the direction of* [*...*] *preview effects is crucially dependent on the choice of the baseline condition*" (p. 645). Thus, "*some kind of neutral preview must be found to assess costs and benefits of information extracted from parafoveal vision*" (McClelland and O'Regan, 1981b, p. 653). More recently, Jordan et al. (2003) pointedly stated that "[*...*] *in the absence of clear unequivocal evidence that a primary experimental manipulation does not produce secondary, unwanted influences, it is prudent for researchers to seek to minimize the potential for these experimental side effects*. [*...*] *When the efficacy of a particular letter pair in word recognition is investigated, merely replacing all other letters in words with xs is unlikely to satisfy this principle of good scientific practice*" (p. 901).

These reservations about the application of parafoveal masks, however, had virtually no impact on the research field. The X-mask and the different letter mask are still the most common choices in eye movement studies on reading which use the boundary paradigm. Only recently was the issue of potential uncalled-for side effects of parafoveal masks taken up again. Kliegl et al. (2013) re-analyzed the data from a published eye movement study (McDonald, 2006) which used the boundary technique and different letter masks. A critical finding was that the masks elicited inflated gaze durations on the target word when the preceding fixation was in close proximity to the target word compared to instances where the preceding fixation was remote from the target word. The rationale of this comparison is that in the case of near fixations the masks are perceived with high visual acuity (i.e., are more salient) compared to remote fixations. The inflated gaze duration for near fixations thus indicates that the masks interfered with foveal word recognition. Kliegl et al. concluded that preview effects, which were up to now subsumed under the umbrella term "preview benefit," could actually be a complex mixture of benefits and costs.

The objective of the present study was to assess the effect of the parafoveal preview of X-masks on the subsequent foveal processing of words. In particular, we were interested in the time course of the effect of the parafoveal mask. To this end, we combined eye movement recording and EEG, two methods which both provide high temporal resolution. By combining these methods one can obtain fixation-related brain potentials (FRPs; Baccino and Manunta, 2005; Hutzler et al., 2007; Dimigen et al., 2011). The technique makes it possible to assess cognitive processes in an experimental setting which permits parafoveal preprocessing (of experimentally manipulated previews). Thus, the technique provides the possibility to measure the temporal dynamics of visual word recognition in a relatively natural (and hence ecologically valid) setting. Monitoring the eye movements allowed the participants to read (more specifically, to parafoveally preview and foveally process) the words at their individual reading speed. The concurrently recorded EEG allowed the assessment of the temporal dynamics of visual word recognition after previewing an X-mask compared to preprocessing a valid preview.

To assess the effect of previewing an X-mask on the subsequent foveal word recognition we relied on an established effect from the event-related brain potential (ERP) literature, that is, the old/new effect in a continuous recognition task (Friedman, 1990) which was successfully used in a previous FRP study (Hutzler et al., 2007). The participants were presented with a list of 5 words and they had to judge whether the 5th word was the same as one of the previous 4 words, or whether it was a new word. In standard ERP conditions, in which the words are serially presented one-by-one, the task reliably elicited more positive waveforms for "old" words than for "new" words about 250 ms after stimulus onset particularly for electrodes at central/parietal sites of the scalp (Friedman, 1990) and it was shown that the effect is more pronounced at electrodes over the right than over the left hemisphere (Hutzler et al., 2007).

In the present study, we administered two FRP conditions. Both conditions presented rows of unrelated words. One condition permitted parafoveal preprocessing by presenting valid previews of the target words. In the other condition, the target words were X-masked until fixation (to be precise, until crossing the invisible boundary before the target word). In addition to the two FRP conditions, we administered two ERP conditions in which the words were presented serially (i.e., in isolation one-by-one). In one of these conditions (i.e., the fixed-pace condition) the words were presented with a fixed, unvarying timing. In the other (the self-paced) condition, the presentation of the words was manually triggered by the participants. **Figure 1** depicts the events of a trial of the X-mask FRP condition and of a trial of the ERP conditions.

The start of a significant divergence of the FRP and ERP curves in response to the experimental conditions, that is, the onset of the old/new effect, is considered as the earliest point in time of differences in processing the target words (henceforth processing time). We expect that the valid-preview FRP condition will elicit the shortest processing times of the target words due to parafoveal preprocessing (i.e., a preview benefit). Theoretically relevant is the comparison of the processing times in the X-mask FRP condition with the processing times in the ERP conditions. If processing times are prolonged in the X-mask FRP condition compared to the ERP conditions, then this would indicate interference of the X-masks with foveal word recognition. Comparing the fixed-pace ERP condition with the self-paced ERP condition serves to assess whether "cognitive readiness" during self-paced processing accounts for differences in processing times between the fixed-pace ERP condition and the two (inherently self-paced) FRP conditions.

### **METHODS**

#### **PARTICIPANTS**

Fifteen native German-speaking right-handed students (11 females) of the University of Salzburg (mean age 24 years) with normal or corrected-to-normal vision participated in the study.

#### **PROCEDURE**

To estimate the time-course of visual word recognition, we used the same marker-effect as in Hutzler et al. (2007), that is, the old/new effect in a continuous recognition task (Friedman, 1990). **Figure 1** schematically depicts the events of a trial from an FRP and a trial from the ERP conditions. In all four settings (fixed-pace ERP, self-paced ERP, valid-preview FRP, and X-masked FRP), five unrelated words were presented and participants had to indicate via button press whether the 5th word (henceforth: target word) was the same as any of the four previously encountered words ("old" trial) or not ("new" trial). Trials in the old-condition consisted of three filler words, one word which was the same as the target word and the target word. The word which was the same as the target word was at the 1st, the 2nd, or the 3rd position of the word list (counterbalanced across trials), but was never at the 4th position (i.e., it never was the pre-target word). The trials in the new-condition consisted of four filler words and a not previously presented word in the 5th position. Each of the four experimental setups presented 100 trials (50 "old" and 50 "new" trials) resulting in a total of 400 trials.


All words were nouns ranging in word length from 3 to 8 letters. As evident from **Figure 1**, the words all had a capitalized first letter, which is the correct form for German nouns. Words were presented in Courier New on a white background. The target words of the four experimental setups and the two conditions (old vs. new) were selected in such a way that they were closely matched on 8 word characteristics across setups and conditions (see **Table A1**). Furthermore, the pre-target words were matched on six characteristics (**Table A2**) in order to hold constant the processing difficulties imposed by these words (i.e., the foveal load; Henderson and Ferreira, 1990). The rigorous matching rules out that differences in findings across experimental setups and conditions are due to differences in the characteristics of the target words or due to spillover effects from the pre-target words.


In all settings, each trial started with the presentation of a string of five hashes (#; with a duration varying between 1500 and 3000 ms to prevent phase locking on trial timing). The hash string signaled to the participants that they were allowed to blink. Thereafter, a blank screen was presented for 2000 ms. The target words (i.e., the 5th word) remained on the screen until the participants indicated with a button press (with their index fingers on a gamepad) whether it was an "old" or a "new" word. The mode of response (*old word: left button*; *new word: right button*) was reversed after the presentation of half of the trials in each condition (to *old word: right button*; *new word: left button*). The participants were required to respond as accurately as possible, but speed was not emphasized. The sequence of the experimental setups was one of the FRP conditions followed by one of the ERP conditions, followed by the other FRP and then the other ERP condition, or vice versa.

#### *ERP settings*

The five words of a trial were presented singly and serially (i.e., word by word) at the center of the screen. The first four words of a trial were presented in black. The target word, in contrast, was dark gray, allowing the participants to identify the 5th word as the target. In the fixed-pace ERP setting, the words were presented for 800 ms, one after another with a 500 ms blank screen in-between. In the self-paced ERP setting, the words remained on the screen until the participants pressed a button (with their thumb) and were then followed by a blank screen (200 ms). The intertrial interval (blank screen) was 2000 ms. Three practice trials preceded the experimental conditions.

#### *FRP settings*

At the beginning of a trial, a fixation cross was presented left of the screen center. Participants were required to fixate the cross and after the eye tracker registered the fixation, a blank screen was presented for 200 ms. (If the eye tracking system did not detect a fixation on the fixation cross within 5 s, the eye tracker was re-calibrated, see below). After the fixation-check, the five words of a trial were presented simultaneously in a row in such a way that the participants now fixated the first letter of the first word of the list. The series of words remained on the screen until response. In the X-mask preview condition, the target word was X-masked until the participants crossed the invisible boundary between the target word and the preceding word. The other condition presented a valid preview of the target word. To dissuade the participants from regressing back from the target word to the preceding words, we again applied the boundary technique. The first fixation on the target word reactivated the boundary between the target word and the 4th word. If the participant made a regression toward the preceding words (and in so doing crossed the boundary) all preceding words were replaced with hash-mark strings. Such trials were omitted from analyses. Ten practice trials preceded the experimental conditions.

#### **APPARATUS**

Multichannel EEG was recorded from 32 Ag/AgCl electrodes mounted with a modular elastic cap (Easy Cap, Falk-Minow Systems, Germany) on standard positions according to the 10–20 system. Scalp electrodes were recorded referentially against linked earlobes (as common reference) with a sampling rate of 1000 Hz. To monitor horizontal and vertical eye movements, EOG was recorded bipolarly from the outer canthus of each eye as well as from below and above the right eye (recorded bipolarly against FC1). Signals were amplified using a 32 channel Brainamp (BrainProducts, Germany) amplifier with a 0.1–1000 Hz band pass and a 50 Hz notch filter. Impedances for scalp electrodes were kept below 5 kΩ. Eye movements were recorded (monocular for the right eye) with an EyeLink CL tower mount eye tracker (SR Research, Ontario, Canada) with a sampling rate of 1000 Hz. Before each of the two FRP conditions the eye tracker was calibrated with a horizontal 3-point calibration routine. The criterion for a successful calibration was an average tracking error of less than 0.5° of visual angle (max = 0.36° and 0.44° for the valid preview and the X-mask preview FRP conditions, respectively; *M* = 0.19° for both conditions). The calibration of the eye tracker was repeated when the fixation control at the beginning of a trial failed (see above).

Participants sat at a viewing distance of 52 cm (held constant by a forehead and a chin rest) from a 21-inch CRT monitor. From this distance, a single letter of the words had a width corresponding to approx. 0.4° of visual angle. The monitor had a resolution of 1024 × 768 pixels and a refresh rate of 120 Hz. Stimulus presentation was controlled by the Experiment Builder software (SR Research Ltd., Canada). In the ERP settings, the point-in-time of the stimulus presentation was registered by the EEG recording equipment via standard communication (i.e., via the parallel ports of the Display PC and the EEG recorder). In the FRP settings, the point-in-time of the start of the first fixation on the target word was registered by the eye tracking system and sent to the EEG recorder. This point-in-time was corrected offline for the latency of the fixation detection algorithm of the eye tracking system. The default latency of the fixation detection algorithm is 36 ms and this value was fairly constant (in 97% of the instances it was either 36 or 37 ms). The value was never greater than 40 ms.
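The relation between viewing distance and visual angle reported above can be checked with a few lines of Python. This is a minimal sketch using the standard visual-angle formula; the function names are chosen here for illustration, and the physical letter width it yields is derived from the reported values rather than stated in the original.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an object of the given size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_for_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical size (in cm) that subtends the given visual angle."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


VIEWING_DISTANCE_CM = 52.0  # reported viewing distance
LETTER_ANGLE_DEG = 0.4      # reported approximate letter width in degrees

# Derived quantity (not reported in the text): on-screen width of one letter.
letter_width_cm = size_for_angle_cm(LETTER_ANGLE_DEG, VIEWING_DISTANCE_CM)

# Round trip: converting the width back should recover the reported angle.
recovered_angle_deg = visual_angle_deg(letter_width_cm, VIEWING_DISTANCE_CM)
```

Under these values a single letter is roughly 0.36 cm wide on the screen, so the longest words (8 letters) would span a little more than 3° of visual angle.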

#### **ANALYSIS**

EEG data was analyzed using the EEG-Lab v6.01b toolbox (Delorme and Makeig, 2004) for MATLAB v7.0 (Mathworks, Natick, MA). For the analysis of ERPs and FRPs, the continuous EEG data was segmented upon the point-in-time at which the theoretically relevant target word (i.e., the 5th word of a trial) appeared (in the ERP settings) or when it was fixated for the first time (in the FRP settings). EEG data was segmented from 100 ms before to 600 ms after these time points. Trials corrupted by eye blinks or EEG artifacts (determined by visual inspection) were excluded from further analysis. The 100 ms interval prior to the appearance of the target words (ERP) or the first fixation on the target words (FRP) was used for baseline correction.
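The segmentation and baseline correction described above can be illustrated with a small, dependency-free Python sketch. It is not the EEG-Lab code used in the study: the function names and the toy signal are hypothetical, and the trigger correction assumes that the online fixation trigger lags the true fixation start by the detection latency (36 ms by default, as reported in the Apparatus section).

```python
SAMPLING_RATE_HZ = 1000      # reported sampling rate: 1 sample = 1 ms
PRE_MS, POST_MS = 100, 600   # reported epoch window around the event
DETECTION_LATENCY_MS = 36    # reported default fixation-detection latency


def ms_to_samples(ms):
    """Convert a duration in milliseconds to a number of samples."""
    return round(ms * SAMPLING_RATE_HZ / 1000)


def correct_fixation_trigger(trigger_sample, latency_ms=DETECTION_LATENCY_MS):
    """Shift an online fixation trigger back to the estimated fixation start.

    Assumption: the trigger arrives `latency_ms` after the true fixation start.
    """
    return trigger_sample - ms_to_samples(latency_ms)


def extract_epoch(signal, event_sample, pre_ms=PRE_MS, post_ms=POST_MS):
    """Cut one epoch from `pre_ms` before to `post_ms` after the event."""
    start = event_sample - ms_to_samples(pre_ms)
    stop = event_sample + ms_to_samples(post_ms)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    return signal[start:stop]


def baseline_correct(epoch, pre_ms=PRE_MS):
    """Subtract the mean of the pre-event interval from every sample."""
    n_baseline = ms_to_samples(pre_ms)
    baseline = sum(epoch[:n_baseline]) / n_baseline
    return [value - baseline for value in epoch]


# Toy single-channel signal: constant offset of 5, step of +2 at sample 1000.
toy_signal = [5.0] * 1000 + [7.0] * 1000
event = correct_fixation_trigger(1036)        # trigger seen at sample 1036
epoch = extract_epoch(toy_signal, event)
corrected = baseline_correct(epoch)
```

After correction the pre-event samples of the toy epoch sit at zero and the post-event samples at 2, which shows that the constant offset has been removed while the event-related step is preserved.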

#### *Artifact correction*

EEG-artifacts due to horizontal eye movements were corrected by means of independent component analysis (ICA; Vigário, 1997). The ICA separates waveforms into components that are maximally independent from each other. The ICA component resembling the typical activity pattern and component map of horizontal eye movements is then removed prior to backprojection (see also Delorme et al., 2007). ICA proved to be useful for the identification and elimination of the horizontal eye movement artifacts typical for reading in previous studies (e.g., Hutzler et al., 2007, 2008; Plöchl et al., 2012). Subsequently, epochs were filtered with a 30-Hz low-pass filter.

### **RESULTS**

Trials with incorrect responses were excluded from analysis (6 and 9% for ERPs and FRPs, respectively). The group means of the median response times of the participants were 887 and 962 ms in the valid and the X-masked FRP condition, and 955 and 1016 ms in the fixed-pace and the self-paced ERP condition, respectively. The analysis of response times by means of a 2 × 4 repeated-measures ANOVA with old vs. new words and condition (valid preview and X-mask FRPs, fixed-pace and self-paced ERPs) as within-subject factors revealed a significant main effect of condition, *F*(3, 42) = 3.78, *p* < 0.05, and a significant interaction between condition and old vs. new words, *F*(3, 42) = 4.91, *p* < 0.01, but no main effect of old vs. new words, *F* < 1. *Post-hoc* comparisons, however, failed to reveal reliable differences between the four conditions. Concerning the old/new effect, response times were around 73 ms slower for old compared to new words in the X-masked FRP condition (*M* = 1002 and 929 ms, respectively; *p* < 0.05), but around 57 ms faster for old compared to new words in the self-paced ERP condition (*M* = 990 and 1047 ms; *p* < 0.01). No reliable differences between RTs in response to old vs. new words were found in the valid preview FRP and fixed-pace ERP conditions.

For the statistical analysis of the old/new effect on the brain potentials we collapsed the data from electrode clusters in the left anterior (F3, FC1, FC5), central (C3, CP1, CP5), and posterior (P3, P7, T7) regions and for the corresponding electrodes of the right hemisphere (i.e., F4, FC2, FC6 and C4, CP2, CP6 and P4, P8, T8, respectively). For a first exploratory analysis, we submitted the averaged data from the regions to point-by-point repeated measures ANOVAs (i.e., for every 1 ms of the data stream) with old vs. new words, region (frontal, central, posterior), and hemisphere (left vs. right) as within-subject factors separately for each experimental condition. The ERPs and FRPs are depicted in **Figure 2**. In the right panel of the Figure we additionally depicted the time-points for which the ANOVA revealed continuously significant effects (*p* < 0.05) of old vs. new words, that is, either a main effect of old vs. new words or interactions of hemisphere and/or region with the factor old vs. new word (short-lived effects prior to 100 ms are not shown). The values beside the arrows indicate the onsets of such a continuously significant effect.

**FIGURE 2 | [...] two rows: event-related) in the four experimental conditions for the left and right hemisphere and the old vs. new words.** A continuously reliable main effect of old vs. new words or the interactions of the effect are depicted below the FRP/ERP curves of the **right panel**. The arrows denote the earliest time point of the onset of the old/new effect (i.e., main effect or interaction).

As evident from **Figure 2**, both FRPs and ERPs were more positive-going in response to old words compared to new words (from, dependent on the experimental setup, about 180 to 280 ms onwards). The effect was most pronounced for the central and anterior regions and more so in the right than in the left hemisphere (replicating previous findings with ERPs; Friedman, 1990; and ERPs and FRPs; Hutzler et al., 2007). For the valid preview FRP condition, the point-by-point ANOVAs revealed that the effect of old vs. new words began to be continuously significant from 176 ms after the start of the first fixation on the target word onward (until approx. 390 ms after the start of the first fixation). In the X-mask preview condition, no long-lasting, continuously reliable difference between old and new words (neither a main effect nor an interaction with region or hemisphere) emerged until 313 ms. Thus, the old/new effect emerged about 130 ms later in case of an X-mask preview compared to the valid preview of the target word. In the ERP conditions, the divergence points of the curves were intermediate and there was virtually no difference in the time points of the emergence of the old/new effect between the fixed-pace and the self-paced condition (i.e., at 229 and 222 ms, respectively). These time-points are about 50 ms later than the divergence point of the old/new effect in the valid preview condition, but they are considerably earlier than in the X-mask FRP condition.

Determining the temporal onset of the old/new effect individually for each participant would have been the prerequisite for a classical, inferential analysis of the differences in the onset of the old/new effect for the four conditions. The low signal-to-noise ratio of the EEG, however, did not allow such single-subject analyses. To assess the significance of the differences in the time courses of the old/new effect, we therefore conducted a statistical analysis based on the *jackknife* procedure introduced by Ulrich and Miller (2001). This procedure makes it possible to assess differences in the time-points of the emergence of an effect by means of a resampling procedure. In essence, the emergence of an effect is repeatedly assessed in subsamples of the original sample by consecutively leaving one subject out of the analyses (resulting, for the present analyses, in 15 subsamples with *n* = 14 for each subsample). The time-points of the subsamples are then submitted to a standard ANOVA. This procedure (to be specific, the use of subsample scores) leads to an underestimation of the error term of the ANOVA and hence the *F*-values associated with an effect (in the present case, the old/new main effect) must be corrected. The correction is administered by dividing the *F*-value(s) from the ANOVA by the squared number of subsamples minus 1 [i.e., $F_C = F/(n - 1)^2$, with $n$ being the number of subsamples]. *Post-hoc* comparisons (with the Scheffé test in the present analysis) can also be carried out after correcting for the deflated error term. For details and the proof of the applicability of the procedure see Ulrich and Miller (2001; also Miller et al., 1998).
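The two computational steps of the jackknife procedure, forming leave-one-out averages and correcting the resulting *F*-value, can be sketched in plain Python. The function names, the toy waveforms, and the uncorrected *F*-value in the example are hypothetical and serve only to illustrate the arithmetic; the sketch does not reproduce the analysis code of the study.

```python
def leave_one_out_averages(waveforms):
    """Return one averaged waveform per subsample.

    `waveforms` holds one waveform (list of amplitudes) per subject.
    Subsample i is the average over all subjects except subject i,
    so n subjects yield n subsamples of size n - 1.
    """
    n = len(waveforms)
    length = len(waveforms[0])
    totals = [sum(w[t] for w in waveforms) for t in range(length)]
    return [
        [(totals[t] - w[t]) / (n - 1) for t in range(length)]
        for w in waveforms
    ]


def corrected_f(f_value, n_subsamples):
    """Correct an F-value computed on jackknife subsample scores.

    Implements F_C = F / (n - 1)^2, where n is the number of subsamples.
    """
    return f_value / (n_subsamples - 1) ** 2


# Toy data: three "subjects" with two sample points each (hypothetical).
toy_waveforms = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
subsample_averages = leave_one_out_averages(toy_waveforms)

# With 15 subsamples the divisor is 14 ** 2 = 196; the F-value of 196.0
# is a made-up input chosen so that the corrected value is easy to read.
example_corrected = corrected_f(196.0, 15)
```

In a full analysis, the onset of the effect would be estimated in each of the leave-one-out averages, and these onset scores would then enter the ANOVA whose *F*-value is corrected as shown.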

The old/new effect is, as aforementioned, most pronounced at central electrodes in the right hemisphere (e.g., Friedman, 1990; Hutzler et al., 2007). Thus, we analysed the significance of the old/new effect for the respective cluster. **Figure 3** depicts the FRPs and ERPs of this cluster. Furthermore, the Figure shows the onset of the old/new main effect provided by the *jackknife* procedure. The onsets represent, for each of the 4 experimental conditions, the mean of the first sample points of a sequence of a minimum of 30 sample points for which a one-sided, paired-sample *t*-test revealed a continuously and significantly higher amplitude of the FRP/ERP-curves in response to old vs. new words in the 15 jackknife-subsamples. The ANOVA with these subsample scores as the dependent measure and the 4 experimental setups as within-subject factor revealed that the old/new main effect differed significantly between the conditions, $F_C(3, 42) = 11.4$, *p* < 0.001. For the *post-hoc* Scheffé tests, the critical difference for the onsets of the old/new effect is 76 ms (corrected for the subsample-based error term and for a significance level of *p* < 0.05). Thus, the difference in the time points of the old/new effect was not reliable for the valid preview FRP condition compared to the two ERP conditions. Critically, the emergence of the old/new effect in the X-masked FRP condition was significantly delayed compared to all other conditions.
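The onset criterion described above, the first sample of a run of at least 30 consecutive significant sample points, can be sketched as follows. The sketch assumes that the point-by-point significance tests have already been computed and are available as a list of booleans (one per 1-ms sample); the function names and the toy data are hypothetical.

```python
def first_sustained_onset(significant, min_run=30):
    """Index of the first sample that starts a run of at least `min_run`
    consecutive True values, or None if no such run exists."""
    run_start = None
    for index, flag in enumerate(significant):
        if flag:
            if run_start is None:
                run_start = index
            if index - run_start + 1 >= min_run:
                return run_start
        else:
            run_start = None
    return None


def sample_to_ms(index, pre_ms=100, sampling_rate_hz=1000):
    """Convert an epoch sample index to time relative to the event."""
    return index * 1000 / sampling_rate_hz - pre_ms


# Toy significance vector for a 700-sample epoch (-100 to 600 ms):
# a short-lived effect of 10 samples is ignored, the sustained run counts.
flags = ([False] * 260 + [True] * 10 + [False] * 30
         + [True] * 60 + [False] * 340)

onset_index = first_sustained_onset(flags)
onset_ms = sample_to_ms(onset_index)
```

Requiring a sustained run guards against declaring an onset on the basis of isolated significant samples, which are expected by chance when a test is repeated for every millisecond of the epoch.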

### **DISCUSSION**

The aim of the present study was to explore whether an X-mask, which is commonly used to mask a target word in the invisible boundary paradigm, interferes with the foveal processing of a target word. We used FRPs, in the standard setting of the invisible boundary paradigm, to determine the relative time-course of word recognition processes at the highest possible temporal resolution. Additional event-related (ERP) setups assessed the time-course of word recognition without parafoveal preprocessing. The shortest processing times (around 180 ms) of the target words were observed when a valid parafoveal preview was available in the FRP setting. As expected, when parafoveal preprocessing was prevented by an X-mask in the FRP setting, processing times were substantially prolonged. In the ERP conditions, the point-in-time of the old/new effect occurred substantially earlier than in the X-mask FRP condition.

The standard (i.e., fixed-pace) ERP condition and the FRP conditions did not only differ in the provision of parafoveal information. Another critical difference of the FRP conditions was that the acquisition of information was controlled by the participants themselves (i.e., internally; they moved their eyes when they were "ready" for processing the next word), whereas in the standard ERP setting the participants had no control of the acquisition of information. Thus, we reasoned that different findings of the ERP condition and the FRP conditions could (partially) reflect this difference, rather than (solely) the difference in the availability of parafoveal information. The self-paced ERP condition was administered to control for this possibility. However, the processing times were similar for the fixed-pace and the self-paced ERP conditions (around 220 ms) and hence we can discount the possibility that self-paced processing accounts for differences between the experimental setups.

The faster processing in case of valid previews compared to invalid previews clearly demonstrates that the marker effect of the present study (i.e., the old/new effect) was a suitable choice for assessing parafoveal preprocessing, in general, and the benefits and costs of the parafoveal previews, in particular. Parafoveal preprocessing is one of the key mechanisms which enable fluent reading. It is, however, not the only mechanism. In case of natural texts, sentence level processing, such as inferring upcoming words from the preceding sentence context (i.e., the effect of word predictability), is also conducive to fluent reading. The objective of the present study, however, was to assess the "pure" (word-level) effect of previewing an X-mask vs. a valid preview. We reasoned that the presentation of lists of unrelated words revealed the pristine effect of the parafoveal X-masks, uninfluenced by effects of word predictability or contextual constraints.

The observed magnitude of the preview benefit in the valid preview FRP condition compared to the X-masked FRP condition of about 130 ms is surprisingly large in absolute terms. This prompts the question, whether this difference solely reflects preview benefits (to be attributed to the valid preview condition), or whether this difference might additionally reflect processing costs (due to interference in the X-mask condition). Comparing the X-mask FRP condition to the ERP conditions (which provided no parafoveal information) reveals that the latter interpretation is warranted. In the X-mask FRP condition, processing is substantially delayed (approx. 60 ms) compared to the ERP conditions.

The present finding suggests that an X-mask is not neutral, but interferes with the processing of the target word. The existence of a parafoveal preview benefit during reading is undisputed. Another question is, however, which particular processes are induced by the application of parafoveal masks. The present findings corroborate Kliegl et al.'s (2013) notion that by the application of parafoveal masks we probably assess a complex mixture of benefits and costs. Kliegl et al.'s study concerned different-letter masks, whereas the present study presented X-masks. The outcomes, however, concur. Both studies indicate that parafoveal masks inflict processing costs on the subsequent recognition of the target word. The requirement for a baseline condition, that is, neutrality with regard to the theoretically relevant effect, is thus not fulfilled.

The implication of the findings is that processing benefits of a (partially) valid preview are overestimated when the estimate is derived from a baseline condition which presented parafoveal masks. Furthermore, it could be that ostensible preview benefits (of small magnitude) do not reflect facilitation at all. To illustrate, Inhoff (1989) investigated whether the final letters of a parafoveal word facilitate its subsequent recognition. The study revealed that preview "benefits" depended on the type of the baseline condition. The application of X-masks indicated a processing benefit for previewing the final letters of an upcoming word. Another condition used different letter masks and did not reveal such a benefit. In the light of the novel findings, which suggest that parafoveal masks interfere with foveal processing, it could be that Inhoff observed processing costs in the X-mask "baseline" condition, and not processing benefits in the valid preview condition. This is an issue which deserves further investigation with a proper baseline condition.

Interpreting the present findings as reflecting processing costs inflicted by a suboptimal baseline condition (i.e., a parafoveal mask) could be countered by the following argument: Processing costs due to invalid information necessarily imply processing benefits due to valid information. Put differently, if masking of, for example, the final letters of an upcoming word (e.g., *viewXX*) is thought to interfere with the subsequent recognition of the word, should not, in turn, the valid preview of the final letters (i.e., *viewer*) facilitate the recognition of the target word? This is not necessarily the case, if the abstraction level of the information which is extracted from the parafovea is taken into account: The interference of parafoveal masks can, in principle, occur at various different levels, from low-level visual information up to orthographic information. To illustrate, a parafoveal mask could interfere with the establishment of the correct visual representation of the target word (once fixated), because some low-level visual representation has to be amended or overwritten. The existence of such a low-level visual representation, however, is not necessarily on an orthographic level. It can be that some low-level visual representation of a parafoveal preview is established, but orthographic (or phonological, morphological, etc.) representations are not yet activated. In the foregoing example of the parafoveal preview *viewXX*, the *XX* might deter establishing a valid visual representation of the correct word, when it is fixated. Such a low-level visual interference, however, does not imply that the orthographic information of these final letters (e.g., abstract letter codes) is processed. In this case, a valid preview of the final letters does not necessarily provide discernible processing benefits, although a parafoveal mask at the same location may result in processing costs.

In the light of the present findings [and those by Kliegl et al. (2013)] it will be a crucial issue for future research to discover a neutral baseline condition. Note that the problem is not limited to the issue of parafoveal processing (during reading), but is a general problem in all domains which use baseline conditions in order to estimate the size (or the direction) of a theoretically relevant effect (e.g., in priming studies; Jonides and Mack, 1984). Thus, a solution for the problem developed in other domains could also be suitable to address the baseline problem in studies on parafoveal preprocessing (and the boundary paradigm). Jacobs et al. (1995) presented an ingenious solution for priming studies (on visual word recognition), that is, the *incremental priming technique*. The technique provides a within-condition baseline which makes it possible to use an experimental condition as a baseline with respect to itself. The logic is as follows: The informational value, that is, the salience of the primes is gradually increased (in steps from low salience toward full salience). In Jacobs et al. (1995) the salience was manipulated by varying the brightness of the primes. The critical aspect then is how the processing times of the targets change in response to the increasing salience of the prime. If increasing salience speeds up target processing, then the prime is facilitatory. If, to the contrary, increasing salience prolongs target processing, then the prime interferes with processing. Thus, the critical advantage of the incremental priming technique is that an experimental condition is sufficient in itself for examining whether a specific type of information facilitates or interferes with processing. It is conceivable that the same logic can be applied for manipulating parafoveal previews (in combination with the boundary paradigm). The salience of a parafoveal preview could be varied in several ways such as varying the brightness/contrast of the parafoveal preview or visually degrading the preview by blurring or replacing pixels in the bitmap of the preview. We are currently testing several of these alternatives and the preliminary findings are promising.
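The logic of the incremental technique reduces to the sign of the slope that relates target processing time to preview salience. The following Python sketch illustrates this under stated assumptions: the salience levels, the fixation durations, and the function names `salience_slope` and `classify` are all hypothetical and serve only as an illustration; they are not data or procedures from any of the cited studies.

```python
from statistics import mean

def salience_slope(salience, durations):
    """Least-squares slope of processing time (ms) against preview salience."""
    mx, my = mean(salience), mean(durations)
    sxx = sum((x - mx) ** 2 for x in salience)
    sxy = sum((x - mx) * (y - my) for x, y in zip(salience, durations))
    return sxy / sxx

def classify(slope, tolerance=0.0):
    """Negative slope: facilitation; positive slope: interference."""
    if slope < -tolerance:
        return "facilitation"
    if slope > tolerance:
        return "interference"
    return "neutral"

# Hypothetical salience steps (0 = invisible, 1 = full salience)
salience = [0.25, 0.5, 0.75, 1.0]
# Made-up mean fixation durations (ms) on the target word
valid_preview = [260.0, 250.0, 240.0, 230.0]  # faster with higher salience
x_mask = [230.0, 240.0, 250.0, 260.0]         # slower with higher salience

print(classify(salience_slope(salience, valid_preview)))
print(classify(salience_slope(salience, x_mask)))
```

In practice, the slope would be estimated with a statistical model that accounts for subject and item variability; the sketch only shows why a single condition can serve as its own baseline.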

To conclude, recent evidence justifies skepticism concerning the adequacy (more specifically, the neutrality) of those parafoveal masks which have been used most frequently in combination with the boundary paradigm. Different-letter masks (Kliegl et al., 2013) as well as X-masks (the present study) inflict processing costs, that is, they interfere with foveal processing of the target words. These findings demonstrate that Rayner and McClelland debated in the early 1980s with good cause over the question of a proper baseline. This debate, however, was a (mostly) theoretical discourse. Rayner and colleagues (back in 1981) already stated that this issue must be resolved empirically. However, the issue has not been approached empirically for more than 30 years. Thus, the jury is still out: On the one hand, on the issue of an adequate baseline condition for investigating preview benefits; on the other hand, on the validity of ostensible preview benefits which were inferred from contrasting valid previews with the previews of masks.

#### **REFERENCES**


*Psychol. Gen*. 140, 552–572. doi: 10.1037/a0023885


How preview space/time translates into preview cost/benefit for fixation durations during reading. *Q. J. Exp. Psychol*. 66, 581–600. doi: 10.1080/17470218.2012.658073


fixations. *J. Exp. Psychol. Hum. Percept. Perform*. 4, 529–544. doi: 10.1037//0096-1523.4.4.529


analysis. *Electroencephalogr. Clin. Neurophysiol*. 103, 395–404. doi: 10.1016/S0013-4694(97)00042-8

Võ, M. L., Jacobs, A. M., and Conrad, M. (2006). Cross-validating the Berlin Affective Word List. *Behav. Res. Methods* 38, 606–609. doi: 10.3758/BF03193892

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 March 2013; accepted: 04 July 2013; published online: 23 July 2013.*

*Citation: Hutzler F, Fuchs I, Gagl B, Schuster S, Richlan F, Braun M and Hawelka S (2013) Parafoveal X-masks interfere with foveal word recognition: evidence from fixation-related brain potentials. Front. Syst. Neurosci. 7:33. doi: 10.3389/fnsys.2013.00033*

*Copyright © 2013 Hutzler, Fuchs, Gagl, Schuster, Richlan, Braun and Hawelka. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

### **APPENDIX**

**Table A1 | Means (standard deviations) of the characteristics of the "old" vs. "new" target words separately for each of the four experimental setups.**


*<sup>a</sup>Frequency measures denote occurrences per million (CELEX; Baayen et al., 1993).*

*<sup>b</sup>Number of words in the CELEX database with the same bigram.*

*<sup>c</sup>Summed frequency of the words with the same bigram.*

*<sup>d</sup>Number of words of the same length which differ only by one letter.*

*<sup>e</sup>Võ et al. (2006). Zero denotes emotional neutrality, and positive values denote positive emotional valence (max: +3); imageability ranges from 1 (low imageability) to 7.*

**Table A2 | Means (standard deviations) of the characteristics of the pretarget words of the four experimental setups.**


*For a description of the measures see Table A1.*

## Cross-frequency interaction of the eye-movement related LFP signals in V1 of freely viewing monkeys

#### *Junji Ito<sup>1</sup>\*, Pedro Maldonado<sup>2</sup> and Sonja Grün<sup>1,3,4</sup>*

*<sup>1</sup> Institute of Neuroscience and Medicine (INM-6), Computational and Systems Neuroscience, Forschungszentrum Jülich, Jülich, Germany*

*<sup>2</sup> BNI, CENEM and Programa de Fisiología y Biofísica, ICBM, Facultad de Medicina, Universidad de Chile, Santiago, Chile*

*<sup>3</sup> Theoretical Systems Neurobiology, RWTH Aachen University, Aachen, Germany*

*<sup>4</sup> RIKEN Brain Science Institute, Wako-Shi, Japan*

#### *Edited by:*

*Andrey R. Nikolaev, KU Leuven, Belgium*

#### *Reviewed by:*

*Alberto Mazzoni, Italian Institute of Technology, Italy*

*Conrado A. Bosman, University of Amsterdam, Netherlands*

#### *\*Correspondence:*

*Junji Ito, Institute of Neuroscience and Medicine (INM-6), Forschungszentrum Jülich, Wilhelm-Johnen-Straße, Jülich 52428, Germany. e-mail: j.ito@fz-juelich.de*

Recent studies have emphasized the functional role of neuronal activity underlying oscillatory local field potential (LFP) signals during visual processing in natural conditions. While functionally relevant components in multiple frequency bands have been reported, little is known about whether and how these components interact with each other across the dominant frequency bands. We examined this phenomenon in LFP signals obtained from the primary visual cortex of monkeys performing voluntary saccadic eye movements (EMs) on still images of natural scenes. We identified saccade-related changes with respect to power and phase in four dominant frequency bands: delta-theta (2–4 Hz), alpha-beta (10–13 Hz), low-gamma (20–40 Hz), and high-gamma (>100 Hz). The phase of the delta-theta band component is found to be entrained to the rhythm of the repetitive saccades, while increments in the power of the alpha-beta and low-gamma bands were locked to the onset of saccades. The degree of the power modulation in these frequency bands is positively correlated with the degree of the phase-locking of the delta-theta oscillations to EMs. These results suggest the presence of cross-frequency interactions in the form of phase-amplitude coupling (PAC) between slow (delta-theta) and faster (alpha-beta and low-gamma) oscillations. As shown previously, spikes evoked by visual fixations during free viewing are phase-locked to the fast oscillations. Thus, signals of different types and at different temporal scales are nested within each other during natural viewing. Such cross-frequency interaction may provide a general mechanism to coordinate sensory processing on a fast time scale and motor behavior on a slower time scale during active sensing.

**Keywords: local field potential, oscillation, saccade, natural vision, cross-frequency coupling**

### **INTRODUCTION**

Living organisms have the ability to actively explore their surroundings using their sensory organs. Such behavior is called active sensing and it includes, for instance, sniffing for odor sensation (Uchida and Mainen, 2003), whisking for touch sensation (Kleinfeld et al., 2006), and eye movements (EMs) for visual sensation (Land, 1999). All these sensing behaviors are performed as rhythmic repetitions of short, discrete sampling actions. Intriguingly, the sampling frequencies in active sensing are similar across different sensory modalities; the frequencies typically lie within the delta-theta frequency band (2–10 Hz) (Schroeder et al., 2010; Cao et al., 2012). This may indicate that this frequency band reflects an optimal time scale for the coordination of the brain activities in the sensory and the motor systems required for active sensing (Kleinfeld et al., 2006; Uchida and Kepecs, 2006; Schroeder et al., 2010).

In natural vision, primates achieve discrete sampling of the visual environment by the combination of visual fixations (with typical durations of 200–400 ms) and ballistic EMs, called saccades, that direct the gaze from one fixation location to the next. While the neural mechanisms underlying the execution of saccades and the neural activities in the visual cortices around the time of saccades have been extensively studied, those related to active visual sensing, i.e., visual exploration with voluntary, successive saccades, have remained largely unknown. Recently, Ito et al. (2011) showed in monkeys freely viewing natural-scene images that the local field potential (LFP) in V1 expresses oscillatory modulations in the beta frequency band (10–25 Hz) that are locked to the onset of saccades. They also found that the onset of the visually evoked spiking activity is phase-locked to these LFP modulations, which was identified as the mechanism for the occurrence of excess spike synchrony found among V1 neurons (Maldonado et al., 2008). These results suggest a functionally relevant role of the beta band LFP oscillations in the coordination of EMs and visual sensory processing during natural viewing behaviors.

Other studies have also demonstrated instrumental roles of oscillatory brain signals in natural vision. Belitski et al. (2008) and Mazzoni et al. (2011) showed that during passive viewing of natural movies while maintaining prolonged fixation on a central spot (i.e., without EMs), the delta-theta frequency band (1–8 Hz) component of the LFP in V1 was modulated coherently with the temporal changes in the contrast level of the movies, on the same time scale as the LFP oscillations. Thus, LFP signals carry information about the slow changes in the visual stimulus, which is not contained in the spiking activities in the same area. Rajkai et al. (2008) reported modulations in current-source density signals in the delta-theta frequency band related to spontaneous EMs in the dark, and Bosman et al. (2009) reported micro-saccade related LFP oscillations in the same frequency band. Thus, converging evidence suggests a distinct functional implication of the oscillatory activity in the delta-theta frequency band in visual processing.

We aimed here at understanding how these various oscillatory activities on different time scales are interrelated, and what their functional role might be. A potential mechanism for cross-frequency interactions could be implemented by phase-amplitude coupling (PAC), i.e., modulation of the amplitude of faster oscillations by the phase of slower oscillations (Jensen and Colgin, 2007; Canolty and Knight, 2010). This type of coupling has recently been found ubiquitously in various brain regions across a number of species (Bragin et al., 1995; Lakatos et al., 2005; Canolty et al., 2006; He et al., 2010), including visual cortices of macaques (Lakatos et al., 2005) and humans (Osipova et al., 2008; Händel and Haarmeier, 2009). Though the relevance of such cross-frequency coupling in brain functioning is yet to be elucidated, one intriguing hypothesis is that it provides a mechanism for the coordination of fast, spike-based computation with slower, external sensory and motor events (Canolty and Knight, 2010; Giraud and Poeppel, 2012). Thus, such a mechanism may be at work in free viewing, which continuously requires such coordination of sensory (visual) inputs with motor action, as occurs during EMs.

With the aim of elucidating the interaction between the EM-related LFP activities on different time scales, we examined the changes in the spectral power and phase of LFP oscillations in monkey V1, in relation to self-initiated saccades during free viewing of natural-scene images. We found that the delta-theta band oscillations of the LFP are phase-locked to the timing of voluntary EMs, and that the degree of this phase-locking is strongly correlated with the degree of LFP power modulation in higher frequency bands, as expected from the hypothesis of PAC.

### **MATERIALS AND METHODS**

#### **EXPERIMENTS**

All experiments followed institutional and NIH guidelines for the care and use of laboratory animals. The experimental procedures are documented in detail in previous papers (Maldonado et al., 2008; Ito et al., 2011). Here, we therefore summarize key aspects of the experiments. Four capuchin monkeys (referred to as D, S, M, and G) participated in the experiments, where they were presented with a series of natural-scene images. Each image was presented for 3–7 s (depending on the monkey), interleaved with a blank screen with a fixation spot. A computer monitor (frame rate: 60 Hz) located 57 cm in front of the animals, subtending 40 × 30° of visual angle, was used for the presentation of natural scenes and visual control stimuli. The monkeys were allowed to make self-initiated EMs during the natural-scene image presentation, and were rewarded with a drop of juice if they maintained their gaze within the edges of the monitor during the presentation. Upon the presentation of the blank screen with a fixation spot, they were required to keep their gaze on the fixation spot for 1 s to be rewarded. This was relevant only for maintaining the attention of the monkeys during the experiment, and hence the data recorded during this period of the experiment were not used in the current analysis. The total number of natural image presentations for each monkey is as follows: 427 presentations for monkey D, 776 presentations for monkey S, 793 presentations for monkey M, and 145 presentations for monkey G.

### **DATA COLLECTION**

#### *Eye-movement recordings*

For the recording of the EMs, we implanted a scleral search coil in one eye of each monkey (Judge et al., 1980). Vertical and horizontal eye positions were monitored with a search coil driver (DNI Instruments, resolution: 1.2 min of arc), and then stored on a hard disk at 2 kHz sampling rate. The onsets of saccades and fixations were extracted based on the following definitions. Saccades were defined as EMs with an angular velocity higher than 100°/s and lasting for at least 5 ms. In addition, saccades were required to exhibit a minimum acceleration of 170°/s². The onset and the offset of saccades were defined as the moments when the velocity threshold was crossed from below and above, respectively. Saccade duration was defined as the interval between the onset and the offset of a saccade. Post-saccade periods were classified as fixations when they lasted at least 100 ms with the eye position maintained within 1° of the gaze location reached at the end of the preceding saccade. Sustained movements with angular velocities ranging from 70 to 150°/s and durations of at least 100 ms were classified as drifts; the neuronal recordings during drifts were not analyzed in the present study. The total numbers of detected saccades for the four monkeys were 2291, 2365, 11,467, and 1448, for monkeys D, S, M, and G, respectively.
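The velocity and duration criteria for saccade detection can be sketched in a few lines of Python. This is a minimal illustration, not the detection code used in the study: the function name `detect_saccades` is hypothetical, velocity is approximated by first differences of the position samples, and the additional acceleration criterion and the fixation and drift classification are omitted.

```python
import math

def detect_saccades(x, y, fs=2000.0, vel_thresh=100.0, min_dur=0.005):
    """Return (onset, offset) sample indices of candidate saccades.

    x, y: horizontal and vertical eye position in degrees, sampled at fs (Hz).
    Onset/offset: samples at which the angular speed crosses vel_thresh (deg/s)
    from below/above. Events shorter than min_dur (s) are discarded.
    An event still in progress at the end of the recording is ignored.
    """
    # Angular speed (deg/s) from first differences of the position samples
    speed = [math.hypot(x[k + 1] - x[k], y[k + 1] - y[k]) * fs
             for k in range(len(x) - 1)]
    saccades = []
    onset = None
    for k, v in enumerate(speed):
        if v > vel_thresh and onset is None:
            onset = k                      # threshold crossed from below
        elif v <= vel_thresh and onset is not None:
            if (k - onset) / fs >= min_dur:
                saccades.append((onset, k))  # threshold crossed from above
            onset = None
    return saccades
```

With real search-coil data, the position signals would typically be smoothed before differentiation, since first differences amplify high-frequency noise.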

#### *LFP recordings*

LFP signals in the primary visual cortex were recorded with an array of eight individually adjustable custom fabricated nichrome tetrodes (1–2 MΩ impedance). The tetrodes were positioned in a circular array, with a center to center distance of ∼400 µm. LFP signals were selected from the first electrode of each tetrode. As reference we utilized a metal screw anchored in the medial line of the occipital area of the skull. The signals were amplified (10 K), band-pass filtered (1–300 Hz) and then stored in an electronic device at 2 kHz sampling rate. A notch filter was applied off-line to the LFP signals in order to remove the 50 Hz noise of the power line. Thus, in one recording session we sampled neuronal signals from 8 recording sites, and we performed about 20–100 recording sessions per monkey. The total number of recording sites for each monkey is as follows: 274 sites for monkey D, 388 sites for monkey S, 822 sites for monkey M, and 120 sites for monkey G.

#### **DATA ANALYSIS**

#### *Inter-saccade interval and saccade frequency*

We denote the series of saccade onset times during one trial, i.e., a 3–7 s image presentation, as $\tau_i$ ($1 \le i \le N$), where $N$ is the total number of the saccades performed during the trial. Inter-saccade intervals (ISIs) are defined as the intervals between successive saccade onsets, i.e., $\tau_{i+1} - \tau_i$ ($1 \le i \le N - 1$). For each of the monkeys, we collected ISIs from all recording sessions and calculated their histogram and median. The inverse of the median ISI was considered as saccade frequency, i.e., an estimate of the repetition rate of saccades.
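As a minimal Python sketch of this definition, the ISIs are computed within each trial, pooled across trials, and the saccade frequency is the inverse of their median. The function names and the onset times below are hypothetical and chosen only for illustration.

```python
from statistics import median

def inter_saccade_intervals(onsets):
    """Intervals between successive saccade onsets within one trial."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]

def saccade_frequency(trials):
    """Inverse of the median ISI, pooled over trials.

    trials: list of per-trial lists of saccade onset times in seconds.
    ISIs are never computed across trial boundaries.
    """
    isis = [isi for onsets in trials for isi in inter_saccade_intervals(onsets)]
    return 1.0 / median(isis)

# Made-up onset times (s) for two trials
trials = [[0.10, 0.35, 0.65, 0.90], [0.20, 0.50, 0.75]]
print(saccade_frequency(trials))  # close to 4 Hz for these values
```

The median is used rather than the mean because the ISI distribution has a long tail toward longer intervals.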

#### *Extraction of power and phase of LFP oscillations*

For the estimation of the instantaneous power and phase of the LFP signals we used a wavelet transform (WT) with Morlet wavelets, defined at frequency $f$ and time $t$ by $\sqrt{f} \cdot \exp\left(i 2\pi f (u - t)\right) \exp\left(-(u - t)^2 / (2\sigma^2)\right)$ (Le Van Quyen et al., 2001). The parameter $\sigma$ was set to $5/(6f)$, so that a wavelet contains about 5 cycles of oscillations. The center frequency $f$ of the wavelet was varied within a range between 1 and 256 Hz, starting from 1 Hz and multiplied by 1.2 until it exceeded 256 Hz. This results in a uniform sampling of frequencies on a log scale. From the obtained WT at time $t$ and frequency $f$, denoted as $S(t, f)$, we extracted the instantaneous power $A(t, f)$ and the phase $\phi(t, f)$ as the squared norm ($A(t, f) = \mathrm{Re}[S(t, f)]^2 + \mathrm{Im}[S(t, f)]^2$) and the argument ($\phi(t, f) = \mathrm{atan2}[\mathrm{Im}\, S(t, f), \mathrm{Re}\, S(t, f)]$) of the WT, respectively.
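The transform can be illustrated with a direct (slow but transparent) Python sketch that evaluates one wavelet coefficient at a time. Several details are assumptions made for this illustration and are not specified in the text: the $\sqrt{f}$ prefactor follows the Morlet form of Le Van Quyen et al. (2001), the signal is correlated with the complex conjugate of the wavelet, the Gaussian envelope is truncated at ±4σ, the frequency grid stops at the last value not exceeding 256 Hz, and all function names are hypothetical.

```python
import cmath
import math

def wavelet_frequencies(f_min=1.0, f_max=256.0, ratio=1.2):
    """Log-spaced center frequencies: f_min, f_min*ratio, ... up to f_max."""
    freqs, f = [], f_min
    while f <= f_max:
        freqs.append(f)
        f *= ratio
    return freqs

def morlet_coefficient(signal, fs, t_index, f):
    """Wavelet coefficient S(t, f) at sample t_index for center frequency f."""
    sigma = 5.0 / (6.0 * f)                  # about 5 cycles per wavelet
    half = int(math.ceil(4.0 * sigma * fs))  # truncation at +/- 4 sigma (assumption)
    t = t_index / fs
    s = 0j
    for k in range(max(0, t_index - half), min(len(signal), t_index + half + 1)):
        u = k / fs
        psi = (math.sqrt(f)
               * cmath.exp(2j * math.pi * f * (u - t))
               * math.exp(-(u - t) ** 2 / (2.0 * sigma ** 2)))
        s += signal[k] * psi.conjugate() / fs  # Riemann sum of the integral
    return s

def power_and_phase(s):
    """Instantaneous power (squared norm) and phase (argument) of S(t, f)."""
    return s.real ** 2 + s.imag ** 2, math.atan2(s.imag, s.real)
```

For a pure cosine at the wavelet's center frequency, the extracted phase follows the phase of the cosine, and the power drops sharply at center frequencies far from the signal frequency. For long recordings, an FFT-based convolution would be the practical choice.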

#### *Saccade-onset triggered average of instantaneous LFP power*

To study the modulations of the instantaneous LFP power in relation to saccadic EMs, we computed the time-resolved average of the instantaneous power triggered by saccade onsets. To do so, we first normalized the LFP power time series to the z-score using the mean and the standard deviation of the LFP power across all trials. Then the segments of the instantaneous LFP power time series, denoted as $A_i(\tau, f) = A(\tau_i + \tau, f)$, where $\tau_i$ is the $i$-th saccade onset time and $\tau$ is taken as −100 to 300 ms around saccade onset, were averaged to yield the averaged power $\bar{A}(\tau, f) = (1/N) \sum_{i=1}^{N} A_i(\tau, f)$. The parameters $\tau$ and $f$ are the variables for the time in relation to saccade onset and the frequency of the WT, respectively, constituting the axes of the time-frequency plots shown in **Figure 3**.
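The normalization and the saccade-triggered averaging can be sketched as follows for a single frequency. This is an illustration under assumptions: the function names are hypothetical, the z-score is taken over one concatenated power series using the population standard deviation, and segments that would extend beyond the recording are skipped. At a 2 kHz sampling rate, the window of −100 to 300 ms corresponds to `pre=200` and `post=600` samples.

```python
from statistics import mean, pstdev

def zscore(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def triggered_average(series, onsets, pre, post):
    """Average of series segments aligned to each onset.

    series: power time series at one frequency (one value per sample).
    onsets: saccade onset sample indices.
    pre, post: number of samples before and after onset to include.
    Returns a list of length pre + post + 1 (index pre is the onset).
    """
    segments = [series[o - pre:o + post + 1] for o in onsets
                if o - pre >= 0 and o + post < len(series)]
    if not segments:
        raise ValueError("no complete segment available")
    n = len(segments)
    return [sum(seg[k] for seg in segments) / n for k in range(pre + post + 1)]
```

Repeating the average for every center frequency of the WT yields one row of the time-frequency profile per frequency.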

#### *Saccade-onset triggered inter-saccade phase consistency of instantaneous phase*

To study the locking of LFP oscillations to the onset of saccadic EMs, we calculated the inter-saccade phase consistency (ISPC) of the oscillation phase, which quantifies how coherent the phases are across saccades, at each time step (in relation to saccade onset) and each frequency, defined as $\mathrm{ISPC}(\tau, f) = \left| (1/N) \sum_{j=1}^{N} \exp\left(i \phi_j(\tau, f)\right) \right|$, with $\phi_j(\tau, f) = \phi(\tau_j + \tau, f)$ ($-100 < \tau < 300$ ms). As for the LFP power, the parameters $\tau$ and $f$ are the time in relation to saccade onset and the frequency of the WT, respectively, and they constitute the axes of the time-frequency plots shown in **Figure 4**.
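A minimal Python sketch of this measure at one time step and one frequency is given below. It assumes that the ISPC is the modulus of the mean unit phasor, so that it ranges from 0 (phases uniformly scattered) to 1 (identical phases across saccades); the function name `ispc` is hypothetical.

```python
import cmath
import math

def ispc(phases):
    """Modulus of the mean unit phasor over saccades (phases in radians)."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))

print(ispc([0.3] * 10))                                    # identical phases
print(ispc([0.0, math.pi / 2, math.pi, 3 * math.pi / 2]))  # evenly spread phases
```

The first call gives a value at the upper bound of the measure, the second a value at the lower bound (up to floating-point error).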

#### *Definition of frequency components*

In the further analyses we focused on four distinct frequency components which were defined separately for each of the monkeys based on their time-frequency profiles of the LFP power and the ISPC. The center frequency of the first component was defined to be the saccade frequency of each monkey. The center frequencies of the second and the third components are defined as the frequencies (rounded to integer values) at which two peaks of the LFP power were located at 100 ms after saccade onset, and thus vary across monkeys. The center frequency of the last component was taken to be 180 Hz for all monkeys. Each component was obtained by the WT of the LFP recordings at its center frequency, with the same mother wavelet and the same definition of the parameter $\sigma$ as described in "Extraction of Power and Phase of LFP Oscillations." We termed these four components the delta-theta, alpha-beta, low-gamma, and high-gamma components, respectively. Their center frequencies are summarized in **Table 1**. The band widths of these components are proportional to their center frequencies $f_c$ and can be derived approximately as $[f_c - 1/(\pi\sigma),\ f_c + 1/(\pi\sigma)]$.
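Because σ is set to 5/(6f) for each center frequency, the half-width of the band is a fixed fraction of the center frequency. As a worked example (rounded values; the 180 Hz case is the high-gamma center frequency stated above):

```latex
\frac{1}{\pi\sigma} = \frac{6 f_c}{5\pi} \approx 0.38\, f_c
\quad\Longrightarrow\quad
\left[\, f_c - \frac{1}{\pi\sigma},\; f_c + \frac{1}{\pi\sigma} \,\right]
\approx \left[\, 0.62\, f_c,\; 1.38\, f_c \,\right],
\qquad
f_c = 180\ \mathrm{Hz}: \quad [\,111\ \mathrm{Hz},\; 249\ \mathrm{Hz}\,].
```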

#### *Correlation between the ISPC and the average LFP power*

We examined whether the variability in the delta-theta band ISPC values across recording sites is related to that of the LFP power values of the other frequency components by cross-correlating these measures. For each recording site, we sampled the mean ISPC values and the mean LFP power values in intervals 0–150 ms (for the delta-theta component), 50–150 ms (for the alpha-beta and the low-gamma components), and 100–200 ms (for the high-gamma component) from saccade onset. These time ranges were determined so that they are centered at the peak timing of the respective measures and include the whole episode of their EM-related modulations. Pearson's correlation coefficients were calculated between the delta-theta band ISPC and the LFP power of all frequency components. Significance of the obtained correlation coefficients deviating from zero was tested by computing the two-sided *p*-values of the coefficients based on the t-distribution with the corresponding degrees of freedom (i.e., sample size − 2).
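The correlation test can be sketched in Python as follows. The function names and the example values are hypothetical. The sketch computes Pearson's r and the associated t statistic with n − 2 degrees of freedom; converting the t value into a two-sided *p*-value requires the cumulative t-distribution, which is left to a statistics library.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson's correlation coefficient between paired samples x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for the null hypothesis r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Made-up per-site values: delta-theta ISPC and alpha-beta power (z-score)
ispc_values = [1.0, 2.0, 3.0, 4.0, 5.0]
power_values = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(ispc_values, power_values)
print(r, t_statistic(r, len(ispc_values)))
```

If SciPy is available, the two-sided *p*-value can be obtained as `2 * scipy.stats.t.sf(abs(t), n - 2)`.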

#### **RESULTS**

#### **EYE MOVEMENTS**

As reported in previous papers (Maldonado et al., 2008; Ito et al., 2011; Berger et al., 2012), the monkeys voluntarily made exploratory EMs when they were presented with natural-scene images. **Figure 1A** shows a typical trace of the EMs performed by the monkeys. The time courses of the eye positions in horizontal and vertical directions are shown in **Figure 1B**, together with the eye velocities derived from them by temporal differentiation. Saccades appear as "spikes" of high velocities (lower panel).

**Table 1 | Center frequencies of the frequency components identified by the time-frequency analysis of the LFP power modulation and the inter-saccade phase consistency for each of the monkeys.**


**FIGURE 1 | (A)** Representative trace of eye movements performed by monkey D during free viewing of a natural-scene image (3 s presentation). The trace (white) is superimposed on the presented image, which is darkened for the purpose of a better visibility of the trace. Fixation points, seen as "knots" along the trace, are marked by their onset times measured from the image onset. **(B)** The top panel shows the time series of the horizontal (red) and the vertical (blue) eye positions during the eye movements shown in **(A)**. The origin, i.e., 0° for the horizontal and the vertical directions, is the center of the image. The bottom panel shows the velocity of the eye movements as derived from the eye positions shown in the top panel. The dotted line indicates the threshold for the detection of saccades (see "Materials and Methods" for the detection criteria).

We recorded EMs from four monkeys while they performed free viewing of natural-scene images. The onsets and offsets of the saccades were detected based on the velocities and accelerations of the EMs (see "Materials and Methods" for the details). Note that in our definition the onset of a saccade is identical to the offset of the previous fixation, and the offset of a saccade is identical to the onset of the following fixation. To characterize the dynamics of the EMs, we computed the distribution of the intervals between the onsets of successive saccades (ISIs). In all four monkeys, the ISIs exhibited a unimodal distribution with a long tail toward longer intervals (**Figure 2**, main plot). The median of the ISIs of the individual monkeys varied in a range between 220 and 400 ms. We also computed the distribution of saccade durations, defined as the time interval between onset and offset of a saccade. The saccade duration distributions were more consistent across the monkeys than the ISI distributions: the median saccade durations were about 30 ms for all four monkeys (**Figure 2**, inset). This means that the differences of the median ISIs are mostly due to differences in the durations of fixations.

As a measure of the time scale of the saccade repetitions performed by the monkeys, we defined *saccade frequency* as the inverse of the median ISI. The saccade frequencies of the individual monkeys all lay within the delta-theta frequency band; specifically, 2.49, 3.40, 3.98, and 4.38 Hz for monkeys S, D, M, and G, respectively.

#### **SACCADE-RELATED CHANGES IN THE OSCILLATORY POWER OF THE LFP**

LFP signals from the primary visual cortex were recorded concurrently with the EMs. To assess the modulation of the LFP activity related to the EMs, we first analyzed the temporal changes in the LFP power in relation to the onset of saccades. The LFP power was estimated using the Morlet WT, normalized (subtraction of the mean and division by the standard deviation) separately for each frequency in the range of 1–256 Hz, and then averaged across individual saccades with the saccade onset as trigger (see "Materials and Methods" for analysis details).

The resulting time-frequency profiles of the LFP power modulations (**Figure 3**) were consistent across the four monkeys. In all the monkeys, noticeable changes in the power were restricted to frequencies above ∼8 Hz. This result is reasonable given the following argument. An estimation of the instantaneous power of an oscillation requires an observation lasting for roughly one cycle of the oscillation. Accordingly, for the waxing and waning of the oscillatory power to become visible within one fixation period (200–400 ms on average), more than two cycles must be contained within this period. This requirement can only be met for oscillations faster than ∼8 Hz.

The most prominent modulation in power is observed in the frequency range between 8 and 64 Hz. Two separate components are present within this range, with frequency bands and their peak frequency varying across the monkeys, but on average covering one band of 8–32 Hz and the other of 16–64 Hz. Both of these components exhibit an increase in power starting at around 50 ms after saccade onset, reaching the peak at around 100 ms, and are then followed by a period of reduced power.

Another component of power modulation is observed in a frequency band above ∼100 Hz. The power of this component starts to increase at around 100 ms, which is considerably later than the components in the lower frequency ranges (8–64 Hz). The degree of the power modulation of this component was much smaller than those of the lower frequency components.

#### **SACCADE-RELATED PHASE-LOCKING OF LFP OSCILLATIONS**

We analyzed the relation of the phase of the oscillation to the timing of EMs. We estimated the phase of the LFP oscillations using the Morlet WT and computed the phase consistency across saccades (ISPC; see "Materials and Methods" for analysis details). ISPC values were computed in a time-resolved manner aligned to saccade onset and separately for each frequency in the same ranges as used for the power modulation analysis.

**FIGURE 2 | Histograms of the inter-saccade intervals (ISI) (main plot, 5 ms bin) and saccade durations (inset, 1 ms bin) for the four monkeys (color code: red, green, blue, and magenta for monkeys D, S, M, and G, respectively).** Vertical lines indicate the medians of the respective histograms.

The time-frequency profile of the ISPC was, similarly to the power modulation, consistent across monkeys (**Figure 4**). All monkeys exhibited high phase consistencies at their respective saccade frequencies. The highest ISPC value in this frequency band was observed at 100–200 ms after saccade onset. In addition, we found another ISPC component in a frequency range between 8 and 32 Hz, which roughly corresponds to the lower one of the two power modulation components in the range of 8–64 Hz. No significant phase consistency was observed in the frequencies above 64 Hz.

Based on the above observations of the time-frequency profiles of the power and the ISPC, we defined four distinct frequency components of EM-related LFP activity (see the "Materials and Methods" for the detailed definition procedure). The lowest component is defined to be centered at the saccade frequency of each monkey. This component is characterized by the strong phase consistency observed in all animals. The other three components are centered at the frequencies where the power modulation is maximal. The center frequencies of the second and the third lowest components are defined as the frequencies of the maximum LFP power at 100 ms after saccade onset, and thus vary across monkeys. The center frequency of the highest component is taken as 180 Hz for all monkeys. We termed these components the delta-theta, alpha-beta, low-gamma, and high-gamma components, from the lowest to the highest frequency. The center frequencies of these components are summarized for each of the monkeys in **Table 1**.

#### **CORRELATION BETWEEN THE POWER AND ISPC VALUES ACROSS FREQUENCIES**

The power and ISPC values computed from recordings in different sessions, i.e., at different electrode positions, exhibited variability around the average values that are shown in **Figures 3** and **4**, and this variability was sometimes very large even across simultaneous recordings in an identical recording session. We examined whether this variability is correlated across different frequency components; in other words, whether there are interactions between the different frequency components. Recent studies on cross-frequency interactions have reported modulations of the amplitude of fast oscillations in relation to the phase of slow oscillations (PAC or nested oscillations; Bragin et al., 1995; Lakatos et al., 2005; Canolty et al., 2006; He et al., 2010; see sketch in **Figure 5A**). If we assume that such an interaction also exists here, in particular between the phase of the delta-theta band oscillations and the amplitudes of the other frequency components, then it is expected that the amplitude of the fast oscillations averaged across saccades would be positively correlated with the ISPC of the slow oscillations. In the case where the ISPC is high (**Figure 5B** left), the positive modulation of the fast amplitude by the slow phase would occur at a consistent timing in relation to the timing of EM, while in the case where the ISPC is low (**Figure 5B** right), such amplitude enhancement would occur at arbitrary timings and hence the average amplitude becomes small compared to the other case. As a consequence, by examining the correlations between the delta-theta ISPC values and the powers in the other frequency components, we can infer which frequency pairs are phase-amplitude-coupled.
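The reasoning above can be illustrated with a small simulation. The Python sketch below is a toy model, not the analysis applied to the recordings: it assumes that the fast-oscillation amplitude envelope at each saccade is `1 + depth * cos(slow phase)`, and it compares the saccade-averaged envelope when the slow phase at saccade onset is nearly constant (high ISPC) with the case where it is uniformly random (low ISPC). All names and parameter values (`slow_hz`, `depth`, the number of saccades) are hypothetical.

```python
import math
import random


def mean_fast_amplitude(phase_offsets, slow_hz=3.0, depth=0.8,
                        n_samples=200, dt=0.001):
    """Average the fast-oscillation amplitude envelope across saccades.

    phase_offsets: slow-oscillation phase at saccade onset, one per saccade.
    The envelope is largest at slow phase 0 (assumed preferred phase).
    Returns the saccade-averaged envelope for 0 .. n_samples*dt seconds.
    """
    n = len(phase_offsets)
    averaged = []
    for i in range(n_samples):
        t = i * dt
        total = sum(1.0 + depth * math.cos(2 * math.pi * slow_hz * t + off)
                    for off in phase_offsets)
        averaged.append(total / n)
    return averaged


rng = random.Random(0)
n_saccades = 500
# High ISPC: slow phase at saccade onset is almost the same at every saccade.
locked_offsets = [rng.gauss(0.0, 0.2) for _ in range(n_saccades)]
# Low ISPC: slow phase at saccade onset is uniformly random.
unlocked_offsets = [rng.uniform(-math.pi, math.pi) for _ in range(n_saccades)]

peak_locked = max(mean_fast_amplitude(locked_offsets))
peak_unlocked = max(mean_fast_amplitude(unlocked_offsets))
print(peak_locked, peak_unlocked)
```

In this toy model the peak of the averaged envelope is clearly larger in the phase-locked case, which is the positive relation between slow-phase ISPC and averaged fast amplitude that the hypothesis predicts.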

To test the outlined hypothesis, we analyzed the variability of the spectral power and the ISPC values across different recording sites. We first computed the mean ISPC values of the delta-theta band component between 0 and 200 ms after saccade onset separately for each of the single recording sites. The distribution of the obtained mean ISPC values (**Figures 6A–D**, top panels) is broad, indicating a large variability of the delta-theta band ISPC across different recording sites. We then computed the same kind of distribution for the LFP power for each frequency component (**Figures 6A–D**, left columns). The distributions for the alpha-beta band and the low-gamma band components are found to be wide, including large positive values and a mean larger than zero, while the distributions for the delta-theta and the high-gamma band components are centered around zero and are narrowly peaked.

We further examined whether the variability in the delta-theta band ISPC is correlated with the variability of the power in other frequency bands. **Figures 6A–D** (right columns) show scatter diagrams of the power and the ISPC values for all pairs of frequency components. Positive correlations are found for all comparisons, with slopes significantly different from zero for most of the cases. The strongest correlations, in terms of both the slope of the linear regression and the *R*-value, are found between the delta-theta ISPC and the alpha-beta or the low-gamma band power, suggesting cross-frequency coupling between the phase of the delta-theta band and the saccade-related power modulation in the alpha-beta or the low-gamma band. In the Discussion section we will provide possible interpretations of these results and discuss a plausible mechanism underlying these observations.
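The slope and *R*-value of such a site-by-site comparison can be obtained with an ordinary least-squares fit. The Python sketch below shows one way to compute them; the function name `linear_fit` and the per-site numbers are invented for illustration and are not taken from the recordings.

```python
import math


def linear_fit(x, y):
    """Least-squares slope, intercept and Pearson R for paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary across samples")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical per-site values: mean delta-theta ISPC and band power change.
ispc_per_site = [0.10, 0.20, 0.30, 0.40, 0.50]
power_per_site = [0.3, 0.5, 0.7, 0.9, 1.1]  # constructed as 2 * ISPC + 0.1

slope, intercept, r = linear_fit(ispc_per_site, power_per_site)
print(slope, intercept, r)
```

Testing whether a slope differs significantly from zero requires an additional statistical test, which is not part of this sketch.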

#### **TIMING RELATIONSHIP OF LFP SIGNALS TO SACCADE- AND FIXATION-ONSETS**

Ito et al. (2011) have recently shown that the EM-related LFP oscillations in the alpha-beta band are locked to saccade onset rather than to fixation onset. Here we examine this timing relationship by computing averages of the LFP power and the LFP phase resolved by saccade duration. To this end, we computed the saccade-onset triggered average of the LFP phase for the delta-theta band component and of the LFP power for the other frequency components separately for different saccade durations (in bins of 2 ms). The results are shown as a function of time and saccade duration in **Figure 7**. We found that the delta-theta phase is largely locked to fixation onset rather than to saccade onset, as indicated by the oblique stripes of iso-phase domains, while the alpha-beta power and the low-gamma power are locked to saccade onset, which is most clearly seen for monkeys M and G (**Figures 7C,D**). This suggests that the delta-theta band LFP oscillations are engaged in different aspects of visual processing than those related to the alpha-beta and the low-gamma band oscillations.
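The logic of the saccade-duration-resolved analysis can be made explicit with a short Python sketch. Because fixation onset equals saccade onset plus saccade duration, a signal feature locked to fixation onset shifts by 1 ms (measured from saccade onset) for every additional 1 ms of saccade duration, which produces oblique stripes, whereas a feature locked to saccade onset does not shift. The sketch quantifies this as the regression slope of feature latency on saccade duration. It is a simplified illustration, not the procedure used for **Figure 7**; `latency_slope` and the latencies are hypothetical.

```python
def latency_slope(durations_ms, latencies_ms):
    """Slope of feature latency (from saccade onset) against saccade duration.

    A slope near 0 indicates locking to saccade onset,
    a slope near 1 indicates locking to fixation onset.
    """
    n = len(durations_ms)
    if n != len(latencies_ms) or n < 2:
        raise ValueError("need paired samples (>= 2)")
    mean_d = sum(durations_ms) / n
    mean_l = sum(latencies_ms) / n
    sxx = sum((d - mean_d) ** 2 for d in durations_ms)
    if sxx == 0:
        raise ValueError("saccade durations must vary")
    sxy = sum((d - mean_d) * (l - mean_l)
              for d, l in zip(durations_ms, latencies_ms))
    return sxy / sxx


# Saccade-duration bins in steps of 2 ms (hypothetical range of 20-60 ms).
durations = list(range(20, 62, 2))
# Hypothetical feature occurring 80 ms after fixation onset.
fixation_locked = [d + 80 for d in durations]
# Hypothetical feature occurring 100 ms after saccade onset.
saccade_locked = [100 for _ in durations]

print(latency_slope(durations, fixation_locked))
print(latency_slope(durations, saccade_locked))
```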

### **DISCUSSION**

We identified four frequency components of EM-related LFP activity based on the time-frequency characteristics of the LFP modulations in the primary visual cortex of monkeys performing voluntary visual exploration of natural-scene images. The center frequencies of the four components were found in the delta-theta band (2–4 Hz), the alpha-beta band (10–13 Hz), the low-gamma band (20–40 Hz), and the high-gamma band (>100 Hz). The strongest changes in the LFP power in response to EMs were observed in the alpha-beta and the low-gamma band components, while the phase-locking to the timing of EMs was strongest in the delta-theta band component. We found positive correlations between the degree of phase-locking in the delta-theta band and the magnitude of the power increase after EMs in the other frequency bands. This correlation was strongest for the alpha-beta and the low-gamma band power.

The strongest phase-locking within the delta-theta band was observed at slightly different frequencies for different monkeys, but these frequencies systematically matched the individual saccade frequencies, i.e., the inverse of the median ISI. This result is consistent with a previous finding on the phase-locking of the delta-theta band LFP oscillations in V1 and V4 to the onset of micro-saccades during prolonged fixation (Bosman et al., 2009). This consistency offers strong support for the view that regular saccades and micro-saccades constitute a functional continuum of ocular movements that influence the visual cortex (Otero-Millan et al., 2008; Hafed et al., 2009). Our results on the delta-theta band phase-locking are also consistent with previous findings on the delta-theta band LFP phase during passive viewing of natural movies (Belitski et al., 2008; Mazzoni et al., 2011). These studies found that the LFP phase in this frequency band contains information about the slow fluctuations of the luminance of a presented movie. The authors argued: "Because movies contain most power at low frequency, it is conceivable that some of the features for which LFPs are selective are characterized by slow fluctuations and thus are reflected in LFPs at low frequency" (Belitski et al., 2008). We assert that, in natural vision, the temporal changes in the afferent input to the visual system on this time scale are caused by voluntary saccadic EMs, even if there are no movements in the external visual world. Thus, the phase-locking of the delta-theta band LFP oscillation in the present study can be considered an extension of the previous finding in passive movie viewing to the condition of active visual exploration.

The degree of the observed phase-locking in the delta-theta band was not homogeneous across different recording sites, but it was highly variable. We could not find any systematic relation between the strength of the phase-locking and (a rough estimate of) the recording depth within the cortical layers. This variability may be explained by the differences in the response properties of the local neuronal populations across V1 and/or the statistical properties of the natural image stimuli we used. An elucidation of this issue would require further experimentation.

Modulation of oscillatory power in relation to EMs was observed in the alpha-beta, the low-gamma, and the high-gamma frequency bands. The increase in power in the alpha-beta band is consistent with our previous results (Ito et al., 2011). The power modulations in the low- and the high-gamma bands, and their temporal relation to that in the alpha-beta band, are novel findings of the present study. The low-gamma component shares a common time course of power changes with the alpha-beta component, while the high-gamma component shows a clearly different time course from the others. The degree of the power changes is larger in the alpha-beta and the low-gamma bands than in the high-gamma band. The phases of the high- and the low-gamma components are not locked to the onset of EMs, which is a signature of induced oscillations, while the power increase in the alpha-beta component is associated with phase-locking, which is a signature of evoked oscillations. All these differences in response properties across different frequency components strongly suggest that the neuronal activity in V1 during active vision is composed of multiple oscillatory components which have different mechanisms of generation.

The peak values of the power in response to EMs showed a large variability across recording sites, as was also found for the ISPC of the delta-theta component. We found that the variability in the ISPC and that in the power were not independent: the delta-theta band ISPC was positively correlated with the power of the other frequency components derived from identical electrodes. This correlation was particularly strong for the alpha-beta and the low-gamma power. A parsimonious interpretation of this result is that the delta-theta, the alpha-beta, and the low-gamma components are just the reflections of an identical physiological process that possesses power in a wide frequency range (for example, evoked oscillatory activity with a non-sinusoidal waveform). However, the results of our saccade-duration resolved analysis argue against this view. We found that the delta-theta phase is locked more to fixation onset than to saccade onset, while the power of the alpha-beta and the low-gamma components is locked more to saccade onset. This strongly suggests that there are at least two separate underlying neuronal processes that are related to the onset of fixations and that of saccades, and that the former is responsible for the generation of the delta-theta component and the latter for the other (the alpha-beta and the low-gamma) components. Furthermore, recent studies have shown that, while the power of the high-gamma broad band activity reflects the amount of the spiking activity of local neuronal pools, the power in lower frequencies is more related to network oscillations (Ray et al., 2008; Ray and Maunsell, 2011). We found that the phase-locking of the delta-theta component is more strongly correlated with the power of the low-gamma component than with the power of the high-gamma component. This suggests that the LFP activity in the delta-theta band is not a mere reflection of the changes in the spiking activity of the local neuronal pool, but is related to network oscillations of neuronal excitability.

As illustrated in **Figure 5**, the observed correlation between the phase-locking in the delta-theta band and the evoked power in the alpha-beta and the low-gamma bands can be interpreted as cross-frequency interactions occurring via the mechanism of PAC (Jensen and Colgin, 2007; Canolty and Knight, 2010). Our observations of the ISPC-power correlation can be explained as a reflection of PAC between the slow LFP oscillation at the saccade frequency and the fast evoked LFP oscillations in the alpha-beta or the low-gamma frequency band. Under the assumption of such PAC, a recording with high delta-theta ISPC would be associated with strong alpha-beta or low-gamma power, since the enhancement of the alpha-beta or low-gamma amplitude at a proper delta-theta phase occurs at a consistent timing in relation to the timing of EM, which results in a high evoked power on average across EMs (**Figure 5B** left). On the other hand, if the delta-theta phase is not locked to the timing of EMs, such amplitude enhancement occurs at arbitrary timing and hence the average evoked power becomes smaller compared to the case with strong phase-locking (**Figure 5B** right). A possible mechanism underlying such cross-frequency interaction is, as proposed by Mazzoni et al. (2008, 2010), a modulation of the baseline excitability of V1 neurons by unspecific slow cortical activity. In natural vision, such a slow activity could be a top-down, predictive signal entrained to the rhythmic EMs (Lakatos et al., 2008), or could originate from the LGN activity that is rhythmically modulated by a corollary signal derived from the motor commands to the eye muscles (Wurtz et al., 2011). Recent studies have reported evidence that visual attention is temporally modulated at the theta band rhythm (Landau and Fries, 2012) and that such modulation is mediated by cross-frequency interaction between the theta and the gamma band LFP activities (Bosman et al., 2012). Since EMs are tightly related to visual attention (Corbetta et al., 1998), the EM-related cross-frequency interaction between the delta-theta and the higher frequency components identified in the present study could be a candidate mechanism for modulation of attention during natural vision with voluntary EMs.

Previous studies on natural viewing have shown that firing rates of V1 neurons during exposure to complex scenes are characteristically low (Gallant et al., 1998; Vinje and Gallant, 2000; Olshausen and Field, 2005; MacEvoy et al., 2008; Maldonado et al., 2008). For example, Maldonado et al. (2008) reported that the peak firing rate during visual fixations is on average ∼15 Hz. Under such a condition, rate coding, i.e., information coding by spike counts during a certain period, would be unreliable: given the low firing rates, the number of spikes within one fixation would be at most 3–5, and hence any additional spontaneous spiking during a fixation period could considerably alter the information content. This perspective is also supported by theoretical and experimental works that have proposed that information may not be encoded solely in firing rates but also in the precise and coordinated timing of action potentials, such as in the response latency (Gawne et al., 1996; Reich et al., 2001; VanRullen and Thorpe, 2002) or the spike timing in relation to background LFP oscillations (Montemurro et al., 2008; Nadasdy, 2009). We found in our previous studies that spike synchrony between V1 neurons exceeding chance synchrony predicted by the firing rates occurs and increases at around the onset of the rate change in response to visual fixation (Maldonado et al., 2008). In Ito et al. (2011) we additionally found that the first visually evoked spikes during fixations are locked to a specific phase of the LFP oscillations in the beta frequency band. Thus, the beta band LFP oscillations seem to provide a time-reference for spike synchrony among V1 neurons. Our present results suggest that the alpha-beta power is enhanced by phase-locking of the delta-theta oscillations to EMs. Taken together, these results point toward hierarchically organized brain activity in the temporal domain such that slower activities on the behavioral time scale influence the timing of single spikes via multiple levels of interaction between different time scales. Thus, experiments that employ voluntary, exploratory sensing behaviors by the animals provide the context for studying such temporal organization of neuronal activities and reveal the dynamic aspects of the sensory systems of the brain.

#### **AUTHOR CONTRIBUTIONS**

Experiments and data acquisition were performed by Pedro Maldonado. Data analysis was performed by Junji Ito and Sonja Grün. The manuscript was written by Junji Ito, Pedro Maldonado, and Sonja Grün.

### **GRANTS**

Financial support for this study was provided by Iniciativa Cientifica Milenio P10-001-F and P09-015-F (Pedro Maldonado), the Helmholtz Alliance on Systems Biology (Sonja Grün, Junji Ito), and the German-Japanese Joint Computational Neuroscience Program (BMBF grant 01GQ1114) (Sonja Grün, Junji Ito).

### **REFERENCES**


computational principles and operations. *Nat. Neurosci.* 15, 511–517.


saccades. *J. Neurophysiol.* 99, 460–472.


their potential implications in electrocorticography. *J. Neurosci.* 28, 11526–11536.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 November 2012; paper pending published: 07 December 2012; accepted: 28 January 2013; published online: 14 February 2013.*

*Citation: Ito J, Maldonado P and Grün S (2013) Cross-frequency interaction of the eye-movement related LFP signals in V1 of freely viewing monkeys. Front. Syst. Neurosci. 7:1. doi: 10.3389/fnsys.2013.00001*

*Copyright © 2013 Ito, Maldonado and Grün. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Linking cortical visual processing to viewing behavior using fMRI

### *Jan Bernard C. Marsman<sup>1,2</sup>\*, Remco Renken<sup>1</sup>, Koen V. Haak<sup>2,3</sup> and Frans W. Cornelissen<sup>2</sup>*

*<sup>1</sup> NeuroImaging Center, University Medical Center Groningen, Groningen, Netherlands*

*<sup>2</sup> Laboratory for Experimental Ophthalmology, University Medical Center Groningen, Groningen, Netherlands*

*<sup>3</sup> Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands*

#### *Edited by:*

*Sebastian Pannasch, Technische Universität Dresden, Germany*

#### *Reviewed by:*

*Michael A. Silver, University of California, USA Robert N. S. Sachdev, Yale School of Medicine, USA Jens R. Helmert, Technische Universitaet Dresden, Germany*

#### *\*Correspondence:*

*Jan Bernard C. Marsman, BCN NeuroImaging Center, University Medical Center Groningen, Antonius Deusinglaan 2, Groningen, 9713 AW, Netherlands e-mail: j.b.c.marsman@umcg.nl*

One characteristic of natural visual behavior in humans is the frequent shifting of eye position. It has been argued that the characteristics of these eye movements can be used to distinguish between distinct modes of visual processing (Unema et al., 2005). These viewing modes would be distinguishable on the basis of the eye-movement parameters fixation duration and saccade amplitude and have been hypothesized to reflect the differential involvement of dorsal and ventral systems in saccade planning and information processing. According to this hypothesis, in a "pre-attentive" or ambient mode, primarily scanning eye movements are made; in this mode fixations are relatively brief and saccades tend to be relatively large. In the "attentive" or focal mode, by contrast, fixations last longer and saccades are relatively small, resulting in viewing behavior that could be described as detailed inspection. Thus far, no neuroscientific basis exists to support the idea that such distinct viewing modes are indeed linked to processing in distinct cortical regions. Here, we used fixation-based event-related (FIBER) fMRI in combination with independent component analysis (ICA) to investigate the neural correlates of these viewing modes. While we find robust eye-movement-related activations, our results do not support the theory that the above-mentioned viewing modes modulate dorsal and ventral processing. Instead, further analyses revealed that eye-movement characteristics such as saccade amplitude and fixation duration did differentially modulate activity in three clusters in early, ventromedial and ventrolateral visual cortex. In summary, we conclude that evaluating viewing behavior is crucial for unraveling cortical processing in natural vision.

**Keywords: eye movements, fMRI, fixation-based event related fMRI, natural viewing behavior, dorsal stream, ventral stream, independent component analysis, scene perception**

### **INTRODUCTION**

In daily life, we make numerous eye movements. This natural viewing behavior of human observers has been characterized and studied extensively. One of the first and most famous studies is by Alfred Yarbus, who showed that human eye movement behavior depends upon task context and stimulus content (Yarbus, 1967). Since then, numerous studies have confirmed this aspect of human viewing behavior (e.g., Rothkopf et al., 2007).

Unema et al. (2005) reported another aspect of human viewing behavior. Following the presentation of a novel scene, observers initially scan the scene by quickly making a series of relatively large saccadic eye movements. Each of these large-amplitude saccades is followed by a relatively brief fixation, enabling the observer to cover a large image region in the first few seconds of a presentation. Over time, the average duration of the fixations increases, while at the same time the average saccadic amplitude decreases. Such longer fixations in combination with small-amplitude saccadic eye movements allow for a more detailed inspection of scene elements (Antes, 1974; Unema et al., 2005; Over et al., 2007; Pannasch et al., 2008).

This behavior has been interpreted to imply that people build up some sort of spatial map by quickly visiting key elements in the scene for further analysis at a later stage. This hypothesis is in line with findings of studies on scene perception (Fize et al., 2000; Rensink, 2004). In only a few milliseconds, the gist of a scene can be extracted in order to determine salient objects, which are then quickly scanned during an initial series of brief fixations. Over time, fixation duration increases to allow for more detailed inspection of specific elements in the scene.

Many lines of evidence suggest that two separate information streams project from V1 into other brain regions (Ingle et al., 1967; Milner and Goodale, 1993, 2008; Velichkovsky, 2002). One stream, referred to as the ventral or "what" stream, projects toward temporal areas of the brain and is involved in object analysis (Milner and Goodale, 2008). The second stream, referred to as the dorsal or "where" stream, projects to parietal areas and deals with spatial vision. Based on the previously described eye-movement findings, it has been suggested that natural viewing behavior can be categorized into two distinct types of viewing behavior that are associated with processing in the dorsal and the ventral pathways (Velichkovsky, 2002; Unema et al., 2005). Pre-attentive scanning behavior, evident from large saccades combined with short fixations, would reflect dorsal pathway processing. In contrast, attentive inspection behavior, evident from small saccades combined with long-duration fixations, would reflect ventral pathway processing. Whether and how these two different types of viewing behavior indeed imply the involvement of these distinct neural systems is at present not known.

Here, we use combined eyetracking and fMRI to investigate the neural correlates of the different types of viewing behavior. Specifically, we test the hypothesis that short fixations coupled with large saccadic amplitudes—which would be related to the build-up of a spatial map—reflect dorsal stream processing. In contrast, longer fixations coupled with small saccades would show more activity in regions along the ventral visual stream.

### **METHODS**

#### **SUBJECTS**

Sixteen healthy right-handed subjects (three of whom were female) were scanned in a Philips 3 Tesla Intera MRI scanner (Philips, Best, The Netherlands). All subjects had normal, healthy vision. All subjects gave informed consent, and ethical approval was provided by the local medical ethical committee.

#### **STIMULI**

Stimuli were taken from the original eye tracking study by Unema et al. (2005) and consisted of 12 computer generated indoor scenes, each containing eight household objects (hereafter referred to as "normal scenes"). Furthermore, we created two additional sets by manipulating the original images (**Figure 1**). In one set the background was removed, so that only the objects were visible on a solid grey background ("cutout objects"). In the other set the objects were scrambled, leaving the scene's background intact ("scrambled objects"). This scrambling was performed by rasterizing a square patch the size of each object into patches of 5 × 5 pixels, and shuffling these patches across the raster. Images were 800 × 600 pixels and were displayed on a translucent display positioned at the head-end of the fMRI scanner using a video projector (Barco, Kortrijk, Belgium) with a resolution of 1024 × 768 pixels. Participants viewed the screen via a mirror. The distance from the eyes to the screen was 75 cm, and the width and height of the translucent display were 44 and 34 cm, respectively. The entire screen therefore subtended a visual angle of 32 × 25.5°. The stimuli were not presented in full-screen, due to known eye-tracking difficulties in the upper and lower corners of the screen: the corneal reflection would fall behind the lower eyelid when subjects looked entirely upward. Moreover, when looking entirely downward, subjects tended to close their eyes more, which also resulted in loss of eye tracking. Therefore, the stimuli subtended a visual angle of 25 × 20°. Each stimulus was shown for 10 s.
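The reported screen size in degrees can be checked from the viewing geometry. The Python sketch below uses the standard formula for the visual angle of an object of a given size centered at a given viewing distance; the function name `visual_angle_deg` is our own, and the calculation assumes a flat screen viewed head-on.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an extent centered on the line of sight."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# Display dimensions and viewing distance as reported in the text.
screen_width_deg = visual_angle_deg(44, 75)
screen_height_deg = visual_angle_deg(34, 75)
print(round(screen_width_deg, 1), round(screen_height_deg, 1))
```

With these inputs the result comes out close to the reported 32 × 25.5°.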

#### **EXPERIMENTAL PARADIGM**

Participants were instructed to view the stimuli normally during the experiment. After each set of four stimuli a fixation cross was shown for 10 s. Four functional runs and one anatomical scan were recorded (see the "functional imaging" section). During the first and third run the stimuli from "normal scenes" and "cutout objects" were displayed. During the other two runs, subjects were presented with the "normal scenes" and the "scrambled objects" sets. Each participant viewed each stimulus twice during each run in a pseudo-random order: first, the "normal scenes" were randomly mixed with either the "cutout objects" (runs 1 and 3) or the "scrambled objects" (runs 2 and 4) to create a stimulus series. This series was presented twice during a run. All visual stimuli were programmed using the Psychtoolbox (Brainard, 1997), and fed to the projector using an Apple MacBook Pro laptop (2.33 GHz Intel Core 2 Duo processor with 2 GB of RAM).

#### **EYETRACKING**

Eye movements made during the fMRI experiments were recorded using an MR compatible eyetracker (IviewX MRI; SMI, Teltow, Germany) with a temporal resolution of 50 Hz. Before commencing the functional runs, calibration of the eyetracking system took place using a nine-point calibration technique. The nine points were placed on a grid covering the central 800 × 600 pixels of the display where the images were displayed. The calibration was validated, and a recalibration was performed when necessary until a good calibration was achieved.

#### **FUNCTIONAL IMAGING**

Four runs of 157 BOLD volumes (EPI) each were recorded with a Repetition Time (TR) of 2000 ms, Echo Time (TE) of 28 ms and a flip angle of 70°. Each functional volume contained 39 slices with an in-plane resolution of 64 × 64 pixels. The Field of View was set to 224 × 156 × 224 mm (voxel size: 3.5 × 4 × 3.5 mm). This setting was chosen to allow for recording of the whole brain. Furthermore, an anatomical T1 (Fast Field Echo) scan was recorded (160 slices with a resolution of 256 × 256 pixels). Field of View was 224 × 160 × 224 mm (voxel size: 0.8 × 1 × 0.8 mm).

#### **FIELD OF VIEW EXPERIMENT**

We tested the influence of the narrow bore of the MR scanner on eye movement parameters in a separate experiment performed outside the scanning environment. For this experiment, a 17-inch LCD monitor at a resolution of 1024 × 768 was used. Stimuli were presented a total of four times, using two display sizes (full screen and half the size of the screen in both dimensions) and two presentation times (10 and 20 s). Stimuli comprised all the images from the main experiment and were presented twice in random order within a block of identical presentation size and duration. Fifteen new participants (10 of whom were female) with healthy vision performed this experiment, during which eye movements were recorded (monocular, right eye) using an Eyelink 1000 (Desktop mount version) at a temporal resolution of 1000 Hz. The order of conditions was balanced across subjects to limit possible effects due to order of presentation. For stability, participants were asked to place their head on a chin rest. We examined the existence of separate viewing modes in eye behavior by plotting fixation duration vs. saccade amplitude as reported in Unema et al. (2005).

### **ANALYSIS**

Fixations were extracted using IViewX software (SMI, Teltow, Germany) with minimum fixation duration set to 80 ms. All subsequent analyses were performed in Matlab 7.4 (Mathworks, Natick, MA, USA). Because of the limited temporal resolution of the eyetracker (50 Hz), saccadic amplitudes were calculated from the screen positions of successive fixations. Fixation pairs separated by a blink were excluded.

Fixations during stimulus presentation were extracted and their durations were plotted against a binned timeline. In total, participants made 11027 fixations during stimulus presentation. For the initial analysis, fixations and subsequent saccades were classified into one of four categories: short fixations (< 200 ms) followed by small saccadic amplitudes (< 7.8°, i.e., 250 pixels on screen), short fixations followed by large saccadic amplitudes (≥ 7.8°) ("scanning"), long fixations (≥ 200 ms) followed by small saccadic amplitudes ("inspection"), and long fixations followed by large saccadic amplitudes. Cutoff values were data-driven and determined based on the 70th percentile (30% short fixations, 70% long fixations, 30% small saccades, 70% large saccades). The onsets and durations of all fixations in these categories were written to a design file in SPM format. Beforehand, eye movement timings were orthogonalized on the presentation sequence (block design) for each stimulus type (normal scenes, cutout objects, and scrambled objects). This orthogonalization was performed to rule out possible effects due to the type of scene ("normal scenes", "cutout objects" or "scrambled objects").
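The four-way classification described above can be summarized in a short Python sketch. It is an illustration rather than the Matlab code used for the analysis. The cutoffs (200 ms and 7.8°) are those stated in the text; the pixel-to-degree factor assumes that the 800-pixel-wide stimuli span 25° and that the conversion is linear, and the category labels `short_small` and `long_large` are hypothetical names for the two categories that were not given a label.

```python
import math

# Assumption: 800-pixel-wide stimuli span 25 degrees, linear conversion.
DEG_PER_PIXEL = 25.0 / 800.0
FIXATION_CUTOFF_MS = 200.0   # cutoff stated in the text
SACCADE_CUTOFF_DEG = 7.8     # cutoff stated in the text (about 250 pixels)


def saccade_amplitude_deg(fix_xy, next_fix_xy):
    """Saccade amplitude from the screen positions (pixels) of two successive fixations."""
    dx = next_fix_xy[0] - fix_xy[0]
    dy = next_fix_xy[1] - fix_xy[1]
    return math.hypot(dx, dy) * DEG_PER_PIXEL


def classify_event(fixation_ms, amplitude_deg):
    """Assign a fixation and its following saccade to one of four categories."""
    short = fixation_ms < FIXATION_CUTOFF_MS
    small = amplitude_deg < SACCADE_CUTOFF_DEG
    if short and not small:
        return "scanning"
    if not short and small:
        return "inspection"
    return "short_small" if short else "long_large"


# Hypothetical event: a 150 ms fixation followed by a 320-pixel saccade.
amplitude = saccade_amplitude_deg((100, 300), (420, 300))
print(amplitude, classify_event(150, amplitude))
```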

#### **fMRI ANALYSIS**

Preprocessing of the functional imaging data was performed in SPM5<sup>1</sup> in Matlab and consisted of realignment to correct for subject movement, coregistration to align all functional data to the subject's T1 image, and normalization to convert all images to MNI space. Smoothing was applied using a kernel with a full width at half maximum (FWHM) of 8 mm. Statistical parametric maps were generated using the design files with the canonical haemodynamic response function.

First, the overall effects of both the scanning and the inspection type of eye movements were calculated vs. baseline (i.e., the level of brain activity while a white fixation cross was presented on a black screen). A direct comparison of both modes of viewing behavior was constructed using the contrasts "scanning > inspection" and "inspection > scanning".

#### **INDEPENDENT COMPONENT ANALYSIS**

We conducted a spatial Group Independent Component Analysis (ICA) of 30 components using the Group ICA of fMRI toolbox version 1.3g (Calhoun et al., 2001). This number of components was estimated beforehand using the mean value of Minimum Description Length (MDL) across subjects (McKeown et al., 1998; Calhoun et al., 2001). The MDL provides a criterion for the selection of models, regardless of their complexity, without the restrictive assumption that the data form a sample from a "true" distribution. Next, we tested whether any of the components was significantly related to viewing behavior using two main contrasts, (1) short fixation durations > long fixation durations and (2) small saccades > large saccades, and two interaction contrasts, (3) short fixations combined with small saccades > short fixations combined with large saccades and (4) long fixations combined with small saccades > long fixations combined with large saccades. Note that for these particular tests, their reverse is equivalent. A component was considered to be significant on the basis of *p* < 0.05, Bonferroni corrected.

To further explore the effect of both fixation duration and saccade amplitude on the activity in the significant components, we extracted beta weights (i.e., effect sizes) averaged across each component map. For this, event-related statistical parametric models were built with all fixation events in one regressor and with three parametric modulations: one for fixation duration, one for saccade amplitude and one for the interaction term "fixation duration × saccade amplitude". This resulted in a total of four beta weights. Finally, a series of *t*-tests was performed to investigate differences in effect size between each pair of significant components.
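To make the structure of this model concrete, the per-event values behind the four regressors can be sketched as below. This is an illustration under stated assumptions, not the SPM implementation: mean-centering of the modulators and forming the interaction as the product of the centered values are assumptions, the function name is hypothetical, and convolution with the haemodynamic response function as well as any orthogonalization applied by SPM are omitted.

```python
from statistics import fmean


def build_modulators(durations_ms, amplitudes_deg):
    """Return one row per fixation event:
    (event, duration modulator, amplitude modulator, interaction)."""
    dur_mean = fmean(durations_ms)
    amp_mean = fmean(amplitudes_deg)
    rows = []
    for dur, amp in zip(durations_ms, amplitudes_deg):
        dur_c = dur - dur_mean          # centered fixation duration
        amp_c = amp - amp_mean          # centered saccade amplitude
        rows.append((1.0, dur_c, amp_c, dur_c * amp_c))
    return rows


# Made-up durations (ms) and amplitudes (degrees) for illustration.
rows = build_modulators([100.0, 300.0], [2.0, 6.0])
for row in rows:
    print(row)
```

Each column corresponds to one of the four beta weights mentioned above: the fixation event itself, the two parametric modulations and their interaction.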

### **RESULTS**

#### **ANALYSIS OF VIEWING BEHAVIOR**

Results from the eye tracking recordings (**Figure 2**) show that fixation duration increases across the 10 s of stimulus presentation (**Figure 2**, Panel **D**). Initial fixation durations are relatively short. Fixation durations increase rapidly over the first 2 s, and remain relatively constant after that. This behavior is very similar to that reported by Unema et al. (2005). At the same time, saccadic amplitude remains relatively constant over the entire duration of the presentation (**Figure 2**, Panel **E**). This deviates somewhat from the pattern reported by Unema et al. (2005), who described an initially steep decrease in saccade amplitude as a function of the stimulus presentation time. Panels **B** and **C** of **Figure 2** show eye movement behavior after categorizing it in terms of the scanning (Panel **B**) and inspection (Panel **C**) types of behavior. Both types of viewing behavior are encountered approximately equally frequently across the presentation duration of the images. Panel **A** in **Figure 2** provides a scatter plot of one exemplary individual showing fixation duration vs. saccade amplitude.

<sup>1</sup>http://www.fil.ion.ucl.ac.uk/spm/software/spm5/

**FIGURE 2 | Eyetracking results**. Panel **A** indicates the distribution of fixations and following saccades plotted in terms of fixation duration against saccade amplitude. Panels **B** and **C** show the distribution of all viewing events (fixation followed by a saccade) of each viewing mode across the display of the stimulus (red = scanning, blue = inspection). Count is the total number of events for all subjects. Panel **D** shows fixation duration over stimulus presentation time with a running average across 50 ms. Panel **E** shows the saccade amplitude over stimulus presentation time with a running average across 50 ms.

#### **FIELD-OF-VIEW EXPERIMENT**

A possible cause for the difference between our present results and those of Unema et al. (2005) is the relatively small field of view of the display in the MR scanner. To examine the influence of display size, we compared fixation durations and saccade amplitudes for two different fields of view. This experiment was conducted outside the MR scanner with different subjects. One display was comparable in size to that used by Unema et al. (2005) (31 × 26◦) whereas the second one was comparable to that used in the scanner (25 × 20◦).

**Figure 3** shows fixation duration (left) and saccade amplitude (right) plotted as a function of presentation time using a bin size of 500 ms. These results show that the increase of fixation duration with presentation time remains present also for relatively smaller stimuli, but that the initial decrease in saccade amplitude is smaller.

**FIGURE 3 | Results from the Field-of-View Experiment**. Figures show fixation duration (left) and saccade amplitude (right) for four conditions (Full/Half size presentation of a stimulus, presentation duration for 10/20 s).

To test this, we performed a within-subject least squares linear fit across the 10 s stimulus presentation duration on both fixation duration and saccade amplitude. For saccadic amplitude, we found that for the "Half-size, 10 s" condition the average fitted slope was −0.02◦ (standard error of 0.0275◦), whereas for the "Full-size, 10 s" condition the slope was −0.36◦ (standard deviation of 0.14◦). This difference was significant (*p* < 0.05; paired *t*-test).

For fixation duration, for the condition "Half-size, 10 s" the average fitted slope was 0.25 ms (standard error of 0.225 ms) while for the condition "Full-size 10 s" the average fitted slope was 0.2 ms (standard deviation of 0.275 ms). This difference was not significant. Therefore, this experiment indicates that the smaller decreasing trend in saccadic amplitude inside the MR scanner can be attributed to the limited field of view of the display used.
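The slope comparison described above can be sketched in two steps: a least squares slope per subject and condition, followed by a paired *t* statistic across subjects. The code is an illustration only; the function names and all numbers in the usage example are made up and are not data from the experiment. The resulting statistic would be compared against a *t* distribution with *n* − 1 degrees of freedom.

```python
from math import sqrt
from statistics import fmean, stdev


def ls_slope(x, y):
    """Least squares slope of y regressed on x (with intercept)."""
    x_mean, y_mean = fmean(x), fmean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx


def paired_t(a, b):
    """Paired t statistic for two equally long lists of per-subject values."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    return fmean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Made-up example: time bins (s) and mean saccade amplitude (deg) per bin
# for one hypothetical subject in one condition.
time_s = [0.5, 1.5, 2.5, 3.5]
amplitude_deg = [6.0, 5.4, 5.1, 4.9]
print(ls_slope(time_s, amplitude_deg))

# Made-up per-subject slopes for two conditions, compared pairwise.
half_size = [-0.05, 0.01, -0.03, 0.00]
full_size = [-0.40, -0.25, -0.38, -0.30]
print(paired_t(half_size, full_size))
```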

#### **fMRI RESULTS**

**Figure 4** shows the brain activations for the two categories of viewing behavior when compared against the fixation cross (baseline). Scanning behavior, i.e., short fixations followed by large saccades, is correlated with activity that predominates in ventromedial occipital areas. Inspection behavior, i.e., longer fixations followed by small saccades, is correlated with activity in more ventrolateral occipital regions. At first glance, there appears to be little overlap in the regions activated by the two different categories of viewing behavior. **Figure 5** shows the statistical parametric map for the direct comparison of the two viewing modes ("scanning > inspection"). However, this analysis indicates that a statistical differentiation of the two viewing modes in the ventral visual cortex can be demonstrated only at a relaxed threshold (*p* < 0.001, uncorrected). The contrast "inspection > scanning" did not reveal significant results.

Standard GLM, as performed above, informs about the activity of certain areas in certain conditions, but not about the degree to which a particular region can be considered to contribute to a network. ICA, on the other hand, will reveal independent and separate networks that can then be associated with a particular experimental condition or specific behavior. For this reason, we analysed the same dataset again, this time first performing an ICA in order to segregate the brain activity into different components/clusters that can be considered separate networks. This resulted in 30 components.

**FIGURE 4 | fMRI results of each viewing mode vs. baseline for 16 subjects**. Scanning (short fixations followed by large saccades) indicates visual regions near the cuneus. Inspection (long fixations followed by small saccades) indicates brain activity along the ventral stream. *Results display T-maps, thresholded with a value of T > 3.*

**"inspection"**. Results are based on 16 subjects presented at a lenient threshold of *p* < 0.001, uncorrected.

Next, four contrasts were examined to test whether and how activity in each of these components was associated with viewing behavior. Only the interaction term "short fixations and large saccades > long fixations and small saccades" was significant in three of the 30 components. None of the other contrasts reached significance in any of the components. The three components cover distinct regions in visual cortex and are shown in the upper row of **Figure 6**. The first component (displayed in red, **Figure 6**) is located in the ventromedial occipital cortex and covers parahippocampal areas. The second component (displayed in green, **Figure 6**) is located more occipital and ventrolateral and covers the lateral occipital complex. The third component (displayed in blue, **Figure 6**) covers early visual cortex in particular. The significance of the interaction term "short fixations and large saccades > long fixations and small saccades" indicates that scanning behavior resulted in more activity than inspection behavior throughout early and ventral visual cortex.

To further explore the underlying activity patterns, we extracted average effect sizes for a statistical parametric model with fixation events, two parametric modulations (fixation duration and saccade amplitude) and an interaction term. In **Figure 6**, the extracted effect sizes are shown below each of the three clusters found in the ICA (upper row). Note that in itself the directions of these findings should not come as a surprise, as this is anticipated based on the significance of the above mentioned interaction term "short fixations and large saccades > long fixations and small saccades". What is revealed by this analysis though, is the relative magnitude of these effects in the three different clusters.

In the ventromedial cluster (red in **Figure 6**), the effect size for saccade amplitude is positive and relatively large, confirming that more activity is associated with larger than with smaller saccades. As expected, the effect size for fixation duration is negative, indicating more activity for shorter than for longer fixations.

In the ventrolateral cluster (green in **Figure 6**), the effect sizes are much smaller than those in the ventromedial cluster. Post-hoc paired *t*-tests between the magnitude of the effect sizes for the three clusters were performed for all conditions and are displayed in **Table 1**. The effect sizes for fixation event, saccade amplitude and fixation duration differ between the ventromedial and the ventrolateral cluster. For the cluster in early visual cortex (blue in **Figure 6**), the effect size for fixation event differs from that in the ventromedial cluster, while the modulatory effect of fixation duration differs from that in the ventrolateral cluster. Other effect sizes do not differ from those in the other two clusters. In all three clusters, there is a small negative interaction term, indicating that the modulating influence of saccade amplitude is less for longer fixations than for shorter fixations (**Table 2**). The magnitude of this interaction effect does not differ between the clusters.

duration and saccade amplitude models. Out of 30 components these remained significant with the direct comparison of scanning and inspection. Brain maps were thresholded at *Z* > 2.5. Lower graphs

modulation, and (4) the interaction between fixation duration and saccadic amplitude. Error bars denote standard error of the mean over subjects.

We also tested whether the effects found in these components could stem from other, picture-related effects. To do so, we analyzed the following contrasts: "normal scenes > cutout objects", "cutout objects > normal scenes", "cutout objects > scrambled objects", "scrambled objects > cutout objects", "normal scenes > scrambled objects" and "scrambled objects > normal scenes". None of the tests revealed a significant effect for these components. Therefore, we conclude that the differential activity in the clusters is primarily related to the differences in viewing behavior of the observers.

### **DISCUSSION**

We report on a functional magnetic resonance study in which we measured the brain activity of 16 observers during the free viewing of computer-generated images. Observers' eye movements were recorded using an MR-compatible eye tracker. Using a combination of ICA and fixation-based event-related analysis (Marsman et al., 2012), we find that the activity in different regions in the visual cortex is differentially associated with observers' viewing behavior. Below, we discuss the conclusions we draw from our study, as well as the limitations of our present approach.

**Table 1 | Results per condition from all paired** *t***-tests performed between the effect sizes of all three components against all other effect sizes**.


∗*p < 0.05,* ∗∗*p < 0.001*

**Table 2 | Results per condition of student's** *t***-tests for each effect size for each cluster significantly different from 0**.


∗*p < 0.05,* ∗∗*p < 0.001*

#### **PREATTENTIVE AND ATTENTIVE VIEWING MODES DO NOT MODULATE DORSAL VISUAL PROCESSING**

One of the motivations for performing this study came from a behavioral eye-movement study by Unema et al. (2005). According to the theory proposed by these authors, dorsal and ventral processing would be associated with distinctive viewing behavior ("pre-attentive" and "attentive" in nature, respectively). Our results do not corroborate this theory. Neither in the GLM-based approach nor in our ICA-based approach did we find clear evidence of dorsal processing coupled to eye movements. Our main eye-movement–related activations occurred in early visual cortex and the ventral visual cortex.

#### **VIEWING MODES IN EYE TRACKING DATA ARE INFLUENCED BY DISPLAY SIZE**

Unema et al.'s (2005) theory about the existence of distinct modes of visual processing was grounded in findings about how eye movement behavior develops as a function of stimulus presentation time. It is therefore important to establish that the viewing behavior we recorded in the scanner environment conformed to this same pattern. Indeed, the pattern of fixation duration in our eye tracking results (**Figure 2**) was similar to that of Unema et al. (2005), although on average they found shorter initial fixation durations. However, such longer initial fixation durations (approx. 200 ms), as we find now, have also been reported previously (Unema et al., 2005; Hooge et al., 2007). Furthermore, Unema et al. (2005) also reported a decreasing trend for saccadic amplitudes. This initial drop in saccadic amplitude was less clearly visible in our experiment.

To study the origin of this difference, we conducted a separate eye tracking experiment outside of the scanner using different subjects. This experiment indicated that these findings are due to the relatively small size of the stimulus inside the bore of the magnet (see **Figure 3**). During this experiment, pictures as used in the MRI experiment were shown in two sizes and for two presentation durations. For the smaller images, the decreasing trend for saccadic amplitude was much less distinctive. The initial increase of fixation duration across stimulus presentation was present for both large and small presentations of the images. Unema et al. (2005) used a smaller cut-off value to determine ambient and focal viewing modes for saccadic amplitude. We used a data-driven approach in which the 70th percentile of the saccades was defined as the cut-off value (30% small saccades, 70% large saccades). This is the reason that we employed a different cut-off value for our saccadic amplitude in the MR experiment in comparison to Unema et al. (2005).

Based on the significant difference between fitted slopes of the saccadic amplitude curves for the "Half-size" and the "Full-size" conditions, we conclude that despite the smaller display size facilitated by the scanner environment and a smaller number of different stimuli used, our observers' viewing behavior conformed to the patterns described by Unema et al. (2005).

#### **THREE INDEPENDENT COMPONENTS IN VISUAL CORTEX ARE ASSOCIATED WITH VIEWING BEHAVIOR**

We chose to explore the use of blind source separation (ICA), as it has proven to be very suitable for studying natural viewing in fMRI (Bartels and Zeki, 2004; Malinen et al., 2007). Using such blind source separation methods, we find evidence for three separate components that are related to our measures of viewing behavior, of which one component is situated in primary visual cortex and two in the ventral cortex (**Figure 6**).

#### **PRE-ATTENTIVE VIEWING MODULATES ACTIVITY IN EARLY VISUAL AND VENTROMEDIAL CORTICES**

The GLM-based analysis indicates that the main difference between activity associated with different viewing modes could be found in ventromedial cortex. However, the effect was not very strong and could only be retrieved when applying a relatively lenient statistical threshold (**Figure 5**). Nevertheless, the ICA approach corroborated that the ventromedial cluster in particular is modulated by eye movement characteristics. Activity in this cluster (red in **Figure 6**) was significant and positively modulated by saccade amplitude and negatively by fixation duration. The processing in this region therefore appears to be most clearly associated with the "preattentive", "ambient", or "scanning mode" viewing behavior as defined by Unema et al. (2005) (short fixations in combination with large saccades). Activity in the cluster in the ventrolateral visual cortex (green in **Figure 6**) was much less distinctively modulated by any of the eye-movement characteristics considered. In both other clusters, the modulatory influence of fixation duration was larger. The modulatory influence of saccade amplitude was similar to that in the visual cortex cluster, but much smaller than that in the ventromedial cluster.

Previous studies on scene perception suggest that during the early stages of perception, a schematic representation of the scene is captured, which subsequently guides eye movements (Rensink, 2004). This initial representation is commonly referred to as the "gist" of a scene (Torralba et al., 2006). Presently, regions in the ventromedial cortex are assumed to be involved in generating the gist of a scene (Fize et al., 2000). Furthermore, several studies have investigated the nature of this mechanism and propose that it is based on extracting global statistical features (Cant et al., 2009; Cornelissen et al., 2009). In parallel, behavioral studies have shown that the average fixation duration of viewing behavior increases as a function of stimulus presentation time (Antes, 1974; Friedman and Liebelt, 1981; Unema et al., 2005; Hooge et al., 2007). This indicates that early stages of perception involve brief fixations coupled with large saccadic eye movements. Unema et al. (2005) proposed that this early viewing behavior represents a preattentive or "ambient" mode of perception. During this ambient mode, the dorsal pathway was hypothesized to be mostly active, as it deals with the layout of objects in the scene. However, in contrast with this hypothesis, which predicts more parietal activity, we find predominantly ventromedial activity for this type of viewing. This could imply that during such scanning behavior, information is processed at a statistical level, where, in line with findings in the scene perception literature, global features are extracted. In turn, this suggests that the visual system may comprise two types of processing, the activity of which is associated with the eye movements we make.

#### **DOES EYE-MOVEMENT RELATED CORTICAL ACTIVITY REFLECT TOP-DOWN OR BOTTOM-UP PROCESSING?**

Eye movements not only depend on bottom-up components of processing, but will also be associated with top-down processing related to saccade planning and determining currently required task-relevant information (Ballard and Hayhoe, 2009). As such, we believe that it is most likely that our current activity patterns integrate activity of both top-down and bottom-up processing components. For this reason, it is also unlikely that each fixation and saccade would initially have the same neural activity map that only starts to deviate after a particular time. The use of imaging modalities with higher temporal resolutions could perhaps give a more detailed insight into the spreading of activation throughout the visual system following a fixation. In our experiments, participants were performing natural viewing behavior. However, if specific task instructions were given, we would expect to find different patterns of viewing behavior (in line with the earlier results of Yarbus, 1967).

#### **LIMITATIONS OF THE PRESENT STUDY**

In both the "Field-of-View" and the fMRI experiments, we presented each stimulus more than once. This could have influenced both the eye movement patterns as well as perception over time, and, consequently, may have affected the fMRI signal as well. Another limitation in the current paradigm is that participants viewed static computer-generated stimuli for 10 s. Future experiments could therefore improve on the present paradigm by examining viewing behavior in dynamic, natural stimuli.

### **CONCLUSION**

We started the present experiment expecting that activity patterns associated with different types of viewing behavior would reveal dorsal and ventral visual regions in the human brain. We do not find this. Further exploratory analyses revealed that eye movement behavior consisting of short fixations and large saccades ("scanning behavior") in particular is associated with activity in a ventromedial occipital region. This is consistent with the current understanding of the involvement of this region in fast "gist-based" scene perception. Ventrolateral parts of visual cortex, currently understood to be involved in (detailed) shape and object recognition, were much less affected by the specific eye-movement parameters. Eye-movement characteristics thus differentially influence neural processing in different regions in visual cortex. In summary, we conclude that evaluating the modulatory influence of viewing behavior is crucial for unraveling natural cortical visual processing.

#### **ACKNOWLEDGMENTS**

This research is supported as a Pathfinder-project "PERCEPT" by the European Commission within the Measuring the Impossible call as part of the NEST (grant number #043261). The authors would like to thank all partners within this project for their useful comments. In particular we thank Dr. Jens Helmert and Dr. Sebastian Pannasch for their help in performing and analyzing the field of view experiment. Furthermore, we would like to thank Anita Sibeijn-Kuiper and Judith Streurman-Werdekker for their help in data acquisition. Frans W. Cornelissen and Koen V. Haak were additionally supported by European Union grant #043157 (Syntex).

#### **REFERENCES**


Velichkovsky, B. M. (2002). Heterarchy of cognition: the depths and the highs of a framework for memory research. *Memory* 10, 405–419. doi: 10.1080/09658210244000234

Yarbus, A. L. (1967). *Eye Movements and Vision*. New York: Plenum Press.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 18 March 2013; paper pending published: 22 April 2013; accepted: 25 November 2013; published online: 18 December 2013*.

*Citation: Marsman JC, Renken R, Haak KV and Cornelissen FW (2013) Linking cortical visual processing to viewing behavior using fMRI. Front. Syst. Neurosci. 7:109. doi: 10.3389/fnsys.2013.00109*

*This article was submitted to the journal Frontiers in Systems Neuroscience*.

*Copyright © 2013 Marsman, Renken, Haak and Cornelissen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

## Cross-frequency phase synchrony around the saccade period as a correlate of perceiver's internal state

*Chie Nakatani<sup>1</sup>\*, Mojtaba Chehelcheraghi<sup>1</sup>, Behnaz Jarrahi<sup>2</sup>, Hironori Nakatani<sup>3,4</sup> and Cees van Leeuwen<sup>1</sup>*

*<sup>1</sup> Faculty of Psychology and Educational Sciences, Laboratory for Perceptual Dynamics, University of Leuven, Leuven, Belgium*

*<sup>4</sup> Emotional Information Joint Research Laboratory, RIKEN Brain Science Institute, Wako, Japan*

#### *Edited by:*

*Sebastian Pannasch, Technische Universität Dresden, Germany*

#### *Reviewed by:*

*Paul Sauseng, University of Surrey, UK*

*Jordan P. Hamm, University of Georgia, USA*

### *\*Correspondence:*

*Chie Nakatani, Laboratory for Perceptual Dynamics, University of Leuven, Tiensestraat 102, Leuven B-3000, Belgium. e-mail: chie.nakatani@ppw.kuleuven.be*

In active vision, eye movements depend on perceivers' internal state. We investigated peri-fixation brain activity for internal state-specific tagging. Human participants performed a task in which a visual object was presented for identification in the lateral visual field, to which they moved their eyes as soon as possible from a central fixation point. Next, a phrase appeared in the same location; the phrase could either be an easy or hard question about the object, answered by pressing one of two alternative response buttons, or it could be an instruction to simply press one of these two buttons. Depending on whether these messages were blocked or randomly mixed, one of two different internal states was induced: either the task was known in advance or it wasn't. Eye movements and electroencephalogram (EEG) were recorded simultaneously during task performance. Using eye-event-time-locked averaging and independent component analysis, saccade- and fixation-related components were identified. Cross-frequency phase-synchrony was observed between the alpha/beta1 ranges of fixation-related and beta2/gamma1 ranges of saccade-related activity 50 ms prior to fixation onset in the mixed-phrase condition only. We interpreted this result as evidence for internal state-specific tagging.

**Keywords: EEG-eye movement co-registration, efference copy, visual tokens, up-date of visual coordinates, information processing over multiple fixations**

### **INTRODUCTION**

Our visual world appears stable and integrated, and yet it is largely patched together from the information acquired during single eye fixations. The eyes perform a saccadic movement followed by a fixation a few times every second; the resulting number of samples to be assembled is considerable. Are these movements all equally important, or do perceivers prioritize some samples over others?

As seen in the famous eye movement records by Yarbus (1967), the eyes do not move at random, but are directed by visual salience (e.g., Findlay and Walker, 1999; Itti and Koch, 2000; Reichle et al., 2012). On the other hand, a considerable number of fixation samples will be irrelevant and/or redundant to the current goal of visual inspection. For instance, conflicting selections may sometimes send eye movements astray (Trappenberg et al., 2001; Meeter et al., 2010; Nikolaev et al., 2011; Devue et al., 2012). Even in a simple task such as single alpha-numeric character identification, multiple fixations to a target are often made, e.g., on one third of all trials in Nakatani and van Leeuwen (2008). Of these fixations, often the first one is sufficient to perform the task. Being able to distinguish relevant from irrelevant samples would be beneficial to the efficiency of perception.

To achieve this aim, each sample perceivers expect to be relevant may be tagged with a brain signal, marking it out for later processing. Such a mechanism is more flexible than a rigorous early selection through on-line analysis of the sample, which may result in discarding information that may later be found out as relevant. Thus, early availability of a predictive signal is essential for "tagging" an upcoming fixation. Tagging, moreover, should be fast enough to keep pace with the rate of fixations. Electrophysiological markers for fixation tagging should therefore be found around saccade onset or, at the latest, in an early post-fixation period.

The efference copy of the saccade signal meets these temporal criteria—the information is available at saccade onset. The efference copy contains information, such as size, direction, and retinal coordinate of each saccade. To some degree, this information might also be obtained from the sensory feedback of extra-ocular muscle activity. For our present purpose, we cannot distinguish these two; we will address them jointly as "saccade information." The saccade information is stochastically predictive about the upcoming fixation content. Consider the size of saccades in common, every-day settings: larger saccades are more likely between than within objects/areas. Larger saccades will therefore be more relevant for exploration than small ones; small saccades will be more relevant for detailed inspection (Unema et al., 2005; Tatler and Vincent, 2008; Graupner et al., 2011; Mills et al., 2011).

Task context will determine whether and how a predictive tag is used (Mills et al., 2011). Therefore, tagging-related brain activity would be sensitive to a manipulation of task context. In the current study, we investigated the tagging-related brain activity using EEG measurement. We recorded EEG and eye movements simultaneously from healthy human volunteers, who performed simple saccade tasks. In each of these tasks, participants were presented with an object in the lateral visual field, to which they were requested to make a saccade. A task instruction followed shortly after the object presentation. This involved answering an easy or a hard question about the object presented by pressing one of two alternative response buttons, or it could be an instruction to simply press one of these two buttons. Two task conditions were introduced. In the *blocked-phrase* condition, the participant knew the type of task instruction prior to each trial, while in the *mixed-phrase* condition, this was not the case. Thus, in the latter case, the participant was more uncertain about the task than in the former. Task uncertainty was expected to increase the probability of tagging. Due to the uncertainty, more incoming visual information might be tagged as "relevant" for later processing. Tagging-related brain activity would therefore be higher in the mixed- than in the blocked-phrase condition.

*<sup>2</sup> Klinik für Neurologie, Universitätsspital Zürich, Zurich, Switzerland*

*<sup>3</sup> Okanoya Emotional Information Project, Exploratory Research for Advanced Technology, Japan Science and Technology Agency, Wako, Japan*

#### **RESEARCH STRATEGY**

To investigate saccade-tagging, some practical problems need to be addressed. Here, we describe these problems and outline our strategy to handle them.

The first problem is the reduction of electro-oculogram (EOG) artifacts without discarding signals with saccades. EOG-related activity coincides temporally with the purported brain activity relevant for saccade information. Artifact reduction is a longstanding problem in EEG data analysis; thus a variety of solutions have been proposed (see Croft and Barry, 2000, for a review). We chose Independent Component Analysis (ICA). This method has been widely and successfully applied to reduce ocular artifacts in EEG data (Makeig et al., 1996; Jung et al., 2000). In the current data set, we may assume that the origin of the artifacts is different from that of brain signals and that the artifact signal is non-normal, as is required for applying the method (Hyvärinen and Oja, 2000).

The second problem is how to distinguish saccade- and fixation-related processes. After artifact reduction, we constructed classifiers to extract saccade information-related and visual information processing-related brain activity. To this purpose, peri-fixation EEG was averaged. ICA was applied to the averages, so as to obtain templates for saccade-related and visual information processing-related components. Such a method of component identification has previously been successful for peri-fixation EEG signal analysis (e.g., Kamienkowski et al., 2012). The templates were then applied to single-trial EEG, in order to extract saccade information-related and visual information processing-related brain activities.

Third and finally, a measure for tagging-related activity needs to be determined. Saccade-related and visual processing-related information belong to different aspects of the visual system that need not always be coordinated. However, in tagging, coordination between these two must be transiently established. This means that we can detect tagging activity by measuring the transient synchrony between these activities. We applied cross-frequency phase synchrony (Sauseng et al., 2008) to the classified EEG signals. The measure (cfPSI) indicates how reliable the phase relation of two oscillatory signals is over trials at a given time point. When phase synchrony is high, two signals are likely to be connected functionally. For example, Sauseng et al. (2008) reported that cfPSI between gamma (30–50 Hz) and theta (3–7 Hz) was higher for attended than unattended visual targets. They interpreted the synchrony as an indication of successful memory matching between incoming visual information (gamma) and stored information (theta).
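The idea of a phase relation that is reliable over trials can be illustrated with a generic n:m phase-locking index. The sketch below is not necessarily the exact formulation of Sauseng et al. (2008): it assumes that instantaneous phases have already been extracted for each trial at one time point (for example with a wavelet or Hilbert transform), that the two centre frequencies satisfy m·f_low = n·f_high for integers m and n, and the function name is hypothetical.

```python
import cmath
import math


def cross_freq_psi(phases_low, phases_high, m, n):
    """n:m phase synchrony index across trials at one time point.

    phases_low, phases_high: per-trial phases (radians) of the lower- and
    higher-frequency signal. m and n are integers chosen such that
    m * f_low == n * f_high. Returns a value between 0 and 1.
    """
    total = sum(
        cmath.exp(1j * (m * p_low - n * p_high))
        for p_low, p_high in zip(phases_low, phases_high)
    )
    return abs(total) / len(phases_low)


# Made-up phases: a 10 Hz and a 40 Hz signal (m = 4, n = 1) whose
# generalized phase difference is identical on every trial.
low = [0.1, 0.7, 1.3]
high = [4 * p + 0.5 for p in low]
print(cross_freq_psi(low, high, m=4, n=1))
```

The index is the length of the mean unit vector of the generalized phase difference: it approaches 1 when that difference is the same on every trial and approaches 0 when it is scattered uniformly around the circle.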

In the current study, we propose that uncertainty about the task enhances tagging of visual information processing-related activity with saccade information-related activity. When the task context is uncertain, more information is likely to be tagged in order to assure flexibility of later processing. We expect the tagging to be reflected in a transient coupling between the frequencies of the visual information-related and the saccade-related activity components.

For the visual information processing-related activity component, the most relevant frequency band around fixation is likely to be alpha (8–12 Hz): this is the main frequency band of the Lambda complex, which reflects early visual information processing in eye-fixation related potentials (Marton and Szirtes, 1982; Kazai and Yagi, 1999). Phase locking to a fixation of ongoing alpha activity may be crucial for the emergence of the Lambda complex (Ossandon et al., 2010).

The saccade-related activity component, by contrast, may not be restricted to a specific frequency band, since saccade-related potentials typically consist of a spike in the evoked activity (Thickbroom and Mastaglia, 1985). Such a potential will appear in a wide range of frequency bands of the saccade-related component. Thus, tagging would involve coupling with a wide band of the saccade-related activity. Then again, more sustained saccade-related activity has also been observed, which may have its own characteristic frequency (Bellebaum et al., 2005).

### **METHODS**

#### **PARTICIPANTS**

Ten residents of the Tokyo metropolitan area (five men and five women; mean age: 22.60 years) volunteered to participate in the experiment. All were right-handed and had normal or corrected-to-normal vision. Participants received a remuneration of 1000 yen per hour. The research ethics committee of RIKEN had approved the experiment.

#### **STIMULI, TASK, AND DESIGN**

Ninety images of natural and artificial objects were rendered using a 3D object database (500 3D Object images, Volumes 1 and 2, Taschen, Köln). Natural object stimuli consisted of 45 images of animals, fish, and insects, while artificial object stimuli included 45 images of automobiles, airplanes, hand tools, and furniture. All stimuli were rendered with realistic colors and shades. All images were scaled to fit a 5 × 5° area.

The sequence of events in a trial is illustrated in **Figure 1**. A central fixation cross was presented for 100–500 ms, uniformly distributed in steps of 100 ms, immediately followed by an object presented 8.5° to either the right or the left (50/50, in random order). From its onset, a small lateral fixation cross was superimposed on the center of the object as a saccade target. The object was presented for 500 ms, while the lateral fixation cross remained for 1000 ms more. Participants were asked to make a saccade to the lateral fixation cross as soon as it appeared, and to keep fixating after the disappearance of the object until a phrase appeared. The phrase described the task to be performed: basic-level identification (e.g., "Was the image a dog?"), feature-level identification (e.g., "Was the head black?"), or a simple button press (e.g., "Press right button."). For the two identification tasks, participants were instructed to press the right button of a hand-held button box for "yes" and the left button for "no." Participants were instructed to respond as accurately and as fast as possible. No response feedback was given during the trials.

In one condition, the blocked-phrase condition, the participant was informed in advance of the type of question (basic, feature, or button press) for the following task block of 30 trials. The other condition was a mixed-phrase condition, in which all three types of tasks were intermixed in a pseudo-random order within each 30-trial block (10 trials each for basic, feature, and button press).

#### **PROCEDURE**

After electrode attachment, participants were seated in a dimly lit, sound-attenuated experimental chamber, in front of a computer monitor used for stimulus presentation. Half of the participants started with the mixed-phrase condition, while the other half started with the blocked-phrase condition. Prior to each condition, 10 practice trials for each question type were given. Each condition comprised 90 trials. Within the blocked condition, the order of tasks was counterbalanced, and participants were informed about the trial type prior to each block. After completing all task blocks, the participants gave retrospective reports on their performance and were prompted, if needed, to inform us about which condition they experienced as more "difficult."

The task was controlled by a PC using a Visual C++ program which recorded key-press responses and generated marker signals for co-registration of eye movement and EEG recordings.

#### **EYE MOVEMENT AND EEG RECORDINGS**

Binocular eye movements were recorded with a head-mounted eye tracker (EyeLink I, SR Technologies, Ontario) at a sampling rate of 250 Hz. Calibration was performed prior to each task block and repeated before a task when the measurement error exceeded 2°. A nine-point calibration pattern (center, four corners, and the midpoints of the four sides of the display) was used for calibrating eye position. A drift correction procedure was applied before a trial when a 1–2° error was observed. Only right-eye data were analyzed.

EEG was recorded from 14 electrodes (F3, Fz, F4, P3, Pz, P4, PO3, POz, PO4, O1, Oz, O2, HEOGs, Left VEOG) according to the international 10-10 system using differential amplifiers (Nihon Kohden MME-3132). Ag/AgCl electrodes were used for the recording. Prefrontal, central, and temporal loci were not available due to the placement of the headband for the eye tracking system. The left earlobe was used as reference and the right earlobe as ground. Electrode impedance was kept under 5 kOhm. The sampling rate was 500 Hz, and low-cut (0.08 Hz) and high-cut (100 Hz) filters were applied. The data were registered separately from the eye movement and task event data. A marker signal from an independent source was sent to both systems via the parallel port and was used to align the EEG with the eye movement record offline.

#### **EEG DATA PREPROCESSING**

For EOG artifact reduction, we applied an ICA algorithm (Hyvärinen and Oja, 1997, 2000) to the raw EEG and the vertical and horizontal EOG (VEOG and HEOG) signals. For each of the 14 independent components obtained, the correlation with the EOG signals was computed. We identified as EOG components those ICA components that showed a temporal correlation of more than 70% with either of the EOG signals. The 70% criterion was chosen in order to balance EOG reduction against preservation of signals. In each participant, one or two EOG components were identified. These EOG components were removed before signal reconstruction. By visual inspection, we verified that the reconstructed signals showed a reduction in the VEOG and HEOG channels, while the signals in the 12 EEG channels were well preserved (see **Figure A1** of the Appendix).
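The selection and removal step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that ICA has already been run, and the function name `remove_eog_components`, the array layout, and the use of the absolute correlation (to allow for the arbitrary sign of ICA components) are all assumptions introduced here.

```python
import numpy as np

def remove_eog_components(sources, mixing, eog, threshold=0.7):
    """Zero out ICA components that correlate strongly with any EOG channel.

    sources: (n_components, n_samples) component time courses
    mixing:  (n_channels, n_components) forward (mixing) matrix
    eog:     (n_eog, n_samples) reference EOG signals (e.g., VEOG and HEOG)
    Returns the reconstructed data and the indices of the removed components.
    """
    n_comp = sources.shape[0]
    # Correlation of every component with every EOG channel.
    corr = np.corrcoef(np.vstack([sources, eog]))[:n_comp, n_comp:]
    # Assumption: the criterion is applied to the absolute correlation.
    flagged = np.where(np.abs(corr).max(axis=1) > threshold)[0]
    kept = sources.copy()
    kept[flagged] = 0.0  # drop the flagged components
    # Back-project the remaining components to the channels.
    return mixing @ kept, flagged.tolist()
```

With a 70% criterion, a component is dropped only when it tracks an EOG channel closely, which is one way to trade artifact reduction against signal preservation.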

The EOG artifact-reduced EEG signals were segmented into 6-s episodes (from −2500 ms to +3500 ms relative to fixation onset), allowing the segments to overlap. The segments were labeled by fixation type (e.g., single fixation to the stimulus or first of two fixations to the stimulus) and performance (correct response or error). Segments from error trials were excluded from further analyses, together with trials with bad eye movements (e.g., eyes moved before object onset, and blinks) and bad EEG (e.g., baseline drift and EMG/body movements detected by visual inspection). In total, about 21% of all trials were discarded.
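As a rough sketch of the bookkeeping involved, the snippet below maps a fixation onset from the 250 Hz eye-tracker record onto the 500 Hz EEG record via a shared marker, and then computes the sample range of a 6-s peri-fixation segment. The function names and the assumption of a single shared marker sample per stream are hypothetical and only serve to illustrate the arithmetic.

```python
def eye_to_eeg_sample(eye_sample, eye_marker, eeg_marker,
                      eye_rate=250, eeg_rate=500):
    """Convert an eye-tracker sample index to the matching EEG sample index.

    eye_marker and eeg_marker are the sample indices at which the same
    marker signal was registered in the two recordings.
    """
    elapsed = (eye_sample - eye_marker) / eye_rate  # seconds since the marker
    return eeg_marker + round(elapsed * eeg_rate)

def fixation_epoch(onset_sample, n_samples, srate=500,
                   t_min_ms=-2500, t_max_ms=3500):
    """Return (start, stop) EEG sample indices of a peri-fixation segment,
    or None if the segment would run past the edges of the recording."""
    start = onset_sample + t_min_ms * srate // 1000
    stop = onset_sample + t_max_ms * srate // 1000
    if start < 0 or stop > n_samples:
        return None
    return start, stop
```

At 500 Hz, the window from −2500 ms to +3500 ms spans 3000 samples, which matches the segment length used in the component extraction.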

#### **EXTRACTION OF SACCADE- AND VISUAL PROCESSING-RELATED COMPONENTS**

For component identification, single-fixation EEG segments from both the mixed and the blocked conditions were averaged for each participant, and a grand average was computed from these averages. The grand average showed the typical peri-fixation waveform, including the spike potential (SP) and the Lambda complex (see blue traces in **Figure A1** of the Appendix). To separate saccade- and visual processing-related components, InfoMax ICA (Makeig et al., 1996) was applied to the grand average. Unlike the ICA for EOG artifact reduction, only the 12 EEG channels were used. This resulted in a 12-channel × 12-independent-component forward ICA matrix. Of the 12 components, two showed eye-fixation related activity (**Figure A2** of the Appendix and **Figure 2**). The first component (C1) showed a sharp onset corresponding to the primary saccade. This sharp onset was followed by positive activity with two peaks, around 80 and 200 ms, which are characteristic of the P1 and P2 latencies of the Lambda complex (Kazai and Yagi, 2003). The topography of this component, strongest at occipital electrodes, also matches that of the Lambda complex in previous studies. Taken together, C1 may therefore be considered a mixture of saccade-contingent and early visual processing-related neural activity.

The second ICA component (C2) also showed a spike corresponding to the primary saccade. The spike, however, was followed by a slow wave that peaked around 200 ms after fixation onset. The spike showed a wide scalp distribution. Its polarity was positive at occipital, occipito-parietal, and parietal sites, but negative at frontal electrode sites. The scalp distribution and polarity match those of the SP (Thickbroom and Mastaglia, 1985). The subsequent slow wave has the same scalp distribution and polarity as the spike. A component similar to the C2 slow wave was reported in previous studies. Some authors regarded the component as an EOG/eye-muscle artifact (Thickbroom and Mastaglia, 1985; Godlove, 2010); more recently, however, it has been associated with efference copy-based updating of retinal coordinates after a saccade (Bellebaum et al., 2005). For example, patients who had suffered a focal cerebellar lesion that disabled the efference copy showed reduced amplitude in post-saccadic ERPs that are compatible with the C2 slow wave (Peterburs et al., 2013). In other words, the slow wave distinguishes the C2 component from an EOG artifact. Thus, the entire C2 component may be considered saccade-related.

The forward ICA matrix was used as a classifier: the C1 classifier was a 12 × 12 matrix in which all values were zero except for the C1-related coefficients. The matrix was dot-multiplied with the single-trial EEG segments (12 channels by 3000 peri-fixation samples, i.e., −2500 ms to +3500 ms from fixation onset at a sampling rate of 500 Hz), which extracts the C1 contribution from the single-trial EEG segments. The C2 classifier was created, and the single-trial C2 contribution extracted, likewise.
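One possible reading of this classifier can be sketched as below. The sketch assumes that the matrix maps channels to components (one row of coefficients per component), so that keeping only the C1 row and multiplying with a 12 × 3000 segment yields the single-trial C1 time course. The function names are illustrative, and the actual matrix convention used in the analysis may differ.

```python
import numpy as np

def component_classifier(ica_matrix, index):
    """Keep only the coefficients of one component; set all others to zero.

    ica_matrix: (n_components, n_channels) matrix, one row per component
                (assumed convention).
    index:      row of the component to keep (e.g., the row of C1).
    """
    classifier = np.zeros_like(ica_matrix)
    classifier[index] = ica_matrix[index]
    return classifier

def extract_component(classifier, segment):
    """Dot-multiply the classifier (12 x 12) with an EEG segment (12 x n_samples)."""
    return classifier @ segment
```

Because all other rows are zero, the output contains the time course of the selected component in one row and zeros elsewhere, so each trial keeps the same 12 × 3000 shape for later steps.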

In order to verify that the classifiers effectively extracted the C1 and C2 signals from single-trial EEG, the extracted C1 and C2 signals were each averaged over trials. A baseline period was chosen between −500 and −200 ms. As illustrated in **Figure 2B**, the average waveforms were faithful to those of the C1 and C2 templates.

#### **CROSS FREQUENCY PHASE SYNCHRONY**

The extracted single-trial components C1 and C2 were used for cross-frequency phase synchrony analysis. cfPSI values were computed between the C1 and C2 single-trial signals following the procedure of Sauseng et al. (2008). Gabor expansion was applied to each single-trial component between 1 and 45 Hz, using a 1-Hz step between center frequencies (alpha = 0.5). This procedure estimates instantaneous phase and amplitude. The range of analysis was chosen to avoid 50 Hz AC noise and muscular activity. For arbitrary frequency pairs from C1 and C2, *f*<sub>*m*,*c*1</sub> and *f*<sub>*n*,*c*2</sub> (1 ≤ *m*, *n* ≤ 45), the phase difference at time *t* in trial *k* was computed as:

$$
\Delta \Phi_k(f_{m,c1}, f_{n,c2}, t) = \frac{n+m}{2m}\,\Phi_{k,c1}(f_{m,c1}, t) - \frac{n+m}{2n}\,\Phi_{k,c2}(f_{n,c2}, t)
$$

The cfPSI in the frequency pair over the trials is defined as:

$$
\text{cfPSI}(f_{m,c1}, f_{n,c2}, t) = \left| \left\langle \exp\left(j\,\Delta \Phi_k(f_{m,c1}, f_{n,c2}, t)\right) \right\rangle_k \right|, \quad j = \sqrt{-1}.
$$

The combination of 45 by 45 frequency bins yielded 2025 frequency pairs of cfPSI at each time point.
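For a single frequency pair and time point, the computation can be sketched as follows. The sketch assumes the frequency weighting (*n* + *m*)/(2*m*) and (*n* + *m*)/(2*n*) of Sauseng et al. (2008) and takes per-trial phases as given; the function name `cfpsi` and the list-based interface are illustrative choices, not part of the original analysis code.

```python
import cmath

def cfpsi(phases_c1, phases_c2, m, n):
    """Cross-frequency phase synchrony index at one time point.

    phases_c1: per-trial phases (radians) of C1 at frequency m (Hz)
    phases_c2: per-trial phases (radians) of C2 at frequency n (Hz)
    """
    w1 = (n + m) / (2.0 * m)  # weight applied to the C1 phase
    w2 = (n + m) / (2.0 * n)  # weight applied to the C2 phase
    # One unit vector per trial, pointing at the weighted phase difference.
    vectors = [cmath.exp(1j * (w1 * p1 - w2 * p2))
               for p1, p2 in zip(phases_c1, phases_c2)]
    # Length of the mean vector: 1 = identical phase relation in all trials.
    return abs(sum(vectors) / len(vectors))
```

The index approaches 1 when the weighted phase difference is the same in every trial and approaches 0 when it is spread uniformly around the circle. Looping `m` and `n` over 1 to 45 Hz gives the 45 × 45 map for one time point.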

### **RESULTS**

#### **TASK RESULTS**

The percentage of correct responses was, on average, 87% and 88% for the mixed- and blocked-phrase conditions, respectively. The difference between the conditions was not statistically significant, *t* < 1. On the other hand, all participants reported finding the mixed condition more "difficult" and/or "uncertain" than the blocked one. Although the percentage correct did not differ, the subjective reports indicate that the internal states of the participants differed between the two conditions.

#### **EYE MOVEMENT RESULTS**

Saccade and fixation parameters were computed from eye position data, using the saccade detection algorithm that is part of the eye-tracking system. Trials were classified based on the number of saccades/fixations to the object. In single-fixation trials, only one saccade was made during stimulus presentation. Among correctly answered trials, the percentage of single-fixation trials was 41% (*n* = 621), while in 51% a small secondary saccade was observed before target offset, i.e., two-fixation trials (*n* = 866). The ratio of single- vs. two-fixation trials was 0.72. In error trials, the ratio, 0.68, was about the same, *t* < 1; that is, multiple fixations occurred in the same ratio in correct and error trials. The error trials were excluded from further analyses.

In the single-fixation trials, the latency of the first-and-only (1st/1) saccade from the central fixation to the lateral object was 145 ms, saccade size was 8.41°, and duration was 45 ms on average. The eyes stayed on the object for about 300 ms. In two-fixation trials, the latency, size, and duration of the first saccade (1st/2 saccade) were 130 ms from image onset, 7.57°, and 43 ms on average, while those of the second (2nd/2) saccade were 195 ms from 1st/2 saccade offset, 1.33°, and 17 ms on average. The eye movement parameters within a category showed small variance; the standard deviation was 0.68, 0.81, and 0.71° in saccade size, and 3, 4, and 4 ms in saccade duration, for the 1st/1, 1st/2, and 2nd/2 saccades, respectively. These saccade parameters did not differ between the blocked- and mixed-phrase conditions; no paired *t*-test yielded *p* < 0.1. The ratio of single- vs. two-fixation trials was 0.76 and 0.67 for the blocked- and mixed-phrase conditions, respectively. The difference was not significant, *t* < 1. These results show that the task conditions did not affect saccade control.

#### **CROSS FREQUENCY PHASE SYNCHRONY**

cfPSI values for all frequency pairs were computed at all time points. **Figure 3** shows eight time points around fixation onset. The x axis shows C1 frequency and the y axis shows C2 frequency. In the single-fixation trials, the cfPSI increased around saccade onset in the mixed condition. The synchrony was prominent between 10–20 Hz of C1 and 20–35 Hz of C2 activity. In contrast to the mixed condition, cross-frequency phase synchrony was not observed in the blocked condition (**Figure 3A**). To test the difference between conditions, the cfPSI values were averaged over 150 C1 (10–20 Hz) by C2 (20–35 Hz) frequency bin pairs and over 25 time bins (−50 to 0 ms). The difference in the average was tested against a probability distribution generated by bootstrapping. For the bootstrapping, trials from the mixed and blocked conditions were pooled. From this pool, *K* trials were drawn with replacement (*K* being the actual number of trials in the mixed or blocked condition). The cfPSI computation and the averaging procedure were applied to the re-sampled trials as to the original samples. This was repeated 400 times to generate a probability distribution for the average difference under the null hypothesis of zero difference between the mixed and blocked conditions (*H*0). The threshold value was 0.046 (99th percentile value, which is the upper threshold for α = 0.05 in a two-tailed test). The actual difference between conditions, 0.057, exceeded the threshold, i.e., *p*(*H*0) < 0.05.
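The resampling logic can be sketched as below. This is a simplified illustration under stated assumptions: each trial is reduced to a single phase-difference value, and a plain phase-locking index stands in for the cfPSI averaged over frequency and time bins. The function names, the seed, and the percentile indexing are choices made for this sketch only.

```python
import cmath
import random

def phase_locking(phase_diffs):
    """Length of the mean unit vector of per-trial phase differences."""
    return abs(sum(cmath.exp(1j * d) for d in phase_diffs) / len(phase_diffs))

def bootstrap_threshold(mixed, blocked, n_boot=400, percentile=99.0, seed=0):
    """Upper percentile of the mixed-minus-blocked difference under H0."""
    rng = random.Random(seed)
    pool = list(mixed) + list(blocked)  # under H0 the condition labels do not matter
    diffs = []
    for _ in range(n_boot):
        # Draw surrogate conditions of the original sizes, with replacement.
        fake_mixed = rng.choices(pool, k=len(mixed))
        fake_blocked = rng.choices(pool, k=len(blocked))
        diffs.append(phase_locking(fake_mixed) - phase_locking(fake_blocked))
    diffs.sort()
    rank = min(n_boot - 1, int(n_boot * percentile / 100.0))
    return diffs[rank]
```

An observed difference is then compared with the returned threshold; it counts as significant only if it exceeds that value.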

The phase synchrony effect appeared when C1 and C2 showed spike-shaped activity. Frequency decomposition of spike-shaped activity may yield spurious phase locking of oscillatory components, within as well as between C1 and C2. To check whether the current effect was spurious, and following a suggestion by one of our reviewers, we reasoned that, if it were, amplitude correlations should show the same pattern of results. We computed Pearson's correlations for each C1 and C2 frequency pair, following the same procedure as for cfPSI above. That is, for each participant, Pearson's correlations were averaged over 150 C1 (10–20 Hz) by C2 (20–35 Hz) frequency bin pairs and over 25 time bins (−50 to 0 ms). The difference in averaged correlations between conditions was tested against the probability distribution generated by bootstrapping. The difference, 0.062, did not exceed the upper threshold value for α = 0.1, which was 0.075, i.e., *p*(*H*0) > 0.1. This is in accordance with the waveform of the spike, which shows no visible differences in amplitude between conditions. We concluded that the difference in cfPSI was not based on an artifact.

In two-fixation trials, the cross-frequency phase synchrony also appeared higher in the mixed than in the blocked condition (**Figure 3B**). Similar to the 1st/1 trials, cfPSI between C1 10–20 Hz and C2 20–35 Hz was prominent around the 1st/2 but not the 2nd/2 saccade onset. The difference between conditions was tested for the 1st/2 and 2nd/2 samples using the same procedure as for the 1st/1 samples. For the 1st/2 samples, the difference, 0.045, exceeded the upper threshold for α = 0.05, which was 0.032, i.e., *p*(*H*0) < 0.05. For the cross-frequency amplitude correlation coefficient, however, the difference, −0.039, did not exceed the threshold for α = 0.1, i.e., *p*(*H*0) > 0.1. We therefore concluded that the difference in cfPSI in the 1st/2 saccades was not attributable to an artifact. For the 2nd/2 saccades, the difference in cfPSI, 0.014, was not significant, *p*(*H*0) > 0.1. Neither was the difference in the amplitude correlations, 0.039, *p*(*H*0) > 0.1.

Specific to the two-fixation trials, phase synchrony within a band (i.e., *m* = *n*, 1:1 synchrony) was also observed. In **Figure 3C**, the within-band phase synchrony is prominent at 30–45 Hz and 4–8 Hz. The cfPSI of the fifteen pairs between 30 and 45 Hz was evaluated by the bootstrapping test. The difference between the mixed and blocked conditions in the 1st/2 trials was 0.015, *p*(*H*0) < 0.05; but amplitude correlations showed the same pattern, with a difference of 0.204, *p*(*H*0) < 0.05. Likewise, in the 2nd/2 trials the difference in cfPSI was 0.012, *p*(*H*0) < 0.05, but the difference in amplitude correlation was also significant, 0.207, *p*(*H*0) < 0.05. The results for cfPSI within the 30–45 Hz band, therefore, appear to be artifactual. The cfPSI of the four pairs between 4 and 8 Hz showed a difference between conditions in the 1st/2 trials of 0.039, *p*(*H*0) < 0.05. In the amplitude correlations, the difference was −0.118, *p*(*H*0) < 0.05, i.e., the correlations were *lower* in the mixed than in the blocked condition. In the 2nd/2 trials, the difference in cfPSI was 0.051, *p*(*H*0) < 0.05, and the difference in amplitude correlations was, again, opposite: −0.132, *p*(*H*0) < 0.05. We concluded that the differences in cfPSI within the 4–8 Hz band were not artifactual. The opposite effect for the correlations is difficult to interpret. A tentative explanation is provided in the discussion.

#### **DISCUSSION**

Tagging of fixations could help select samples according to the current task context, which renders downstream information processing more efficient. We propose that fixation tagging makes use of saccade information, and investigated its neural correlates. Reducing the EOG artifact in the peri-fixation EEG and extracting single-trial saccade-related and visual information processing-related signals provided data sufficient for testing our hypothesis. Phasic synchronization of the two signals was estimated using a cross-frequency phase synchrony measure (cfPSI). The results show that the synchrony increased around saccade onset in the condition that was designed (and reported) to have an uncertain task context.

A number of studies suggest that the most prominent electrophysiological activity of the period around the saccade is of extraocular muscle origin (Thickbroom and Mastaglia, 1985; Sasaki et al., 2002). The cfPSI pattern, however, is difficult to explain based on synchronization between muscular activities only. First, the saccade profiles were virtually identical in the mixed and blocked conditions. Nevertheless, cfPSI increased only in the mixed condition. Second, the frequency band of extra-ocular muscle activity spans 20–200 Hz (Kovach et al., 2011); however, the C1 band for the synchrony was 10–20 Hz, which spans the alpha and beta1 bands. Third, the cross-phase synchrony was observed before the primary, but not before the secondary fixation on the target. The classifier successfully extracted secondary saccade-contingent activities in C1 and C2, so the absence of a cfPSI effect in secondary fixations is unlikely to be due to a lack of component activity. These considerations suggest that the observed cross-frequency phase synchrony indicates coordination of visual-processing-related (C1) and saccade-related (C2) activities.

The frequency band of the C1 component overlaps in part with the alpha range (8–12 Hz), which is the main frequency of the Lambda complex. The Lambda complex reflects early visual information processing (Marton and Szirtes, 1982; Kazai and Yagi, 1999). The synchronization occurred before the onset of the Lambda complex. A contribution of pre-Lambda/ongoing alpha activity to the Lambda complex itself was reported (Ossandon et al., 2010). The C1 component extends to the beta1-band. Ito et al. (2011) reported that in monkeys, the beta1-band Local Field Potential (LFP) modulated visually induced spiking of V1 neurons. The LFP modulation was time-locked to saccade onset. Moreover, their study considered the origin of the saccade-related LFP modulation as a corollary signal.

The frequency band of the C2 component spanned the beta2 to gamma1 bands. Activity in the beta2 to gamma1 band is known to modulate eye movements, in particular saccadic reaction times (Diederich et al., 2012). Our understanding of C2 as saccade-related is based on the scalp distribution and polarity of the C2 component, which matched those of the Spike Potential (Thickbroom and Mastaglia, 1985), while subsequent C2 activity corresponds to the update of spatial coordinates after a saccade (Bellebaum et al., 2005). The effect of task on cross-frequency synchrony, however, is limited to the SP interval. The fact that it is concentrated in the beta2 to gamma1 band may suggest, in accordance with Diederich et al. (2012), that this activity is specifically relevant to saccade timing.

Based on these interpretations of C1 and C2, it may be possible to conceive of a function of their synchrony for impending visual information processing. We propose that the synchrony reflects the use of efferent saccade information for tagging of fixations. Uncertainty of the task situation in our mixed condition makes fixations more likely to be tagged. According to our tagging hypothesis, efferent information is evaluated in its task context. Evaluation is reserved for a planned saccade, i.e., the primary saccade. This explains why there is no task effect for secondary saccades.

In this study, non-cross-frequency (i.e., *m* = *n*) phase synchrony (in other words, conventional PSI) in the theta band also showed sensitivity to the task manipulation. This effect was observed in the two-fixation trials. It might be considered a mechanism for organizing multiple fixations into a cognitive cluster (Graupner et al., 2011; Nikolaev et al., 2011). This remains open to investigation. Another issue on which our results are inconclusive is to what extent tagging determined the fate of the fixation sample in visual encoding. A comparison between the signals on correct responses and errors could have provided this information. Unfortunately, the low number of errors did not permit such an evaluation of the signals.

#### **CONCLUSION**

Phasic synchronization before fixation onset between saccade-related and visual information-related activity constitutes a highly plausible neural mechanism for tagging of fixation information.

#### **ACKNOWLEDGMENTS**

Chie Nakatani and Cees van Leeuwen are supported by an Odysseus grant from the Flemish Organization for Science FWO.

#### **REFERENCES**

frontal feedback-related potentials in nonhuman primates. *J. Neurosci.* 30, 4187–4189.

intracranial recordings. *Neuroimage* 54, 213–233.

in fixation duration studies of cognitive processes. *J. Eye Mov. Res.* 1, 1–12.

Reader to simulate eye movements in nonreading tasks: a unified framework for understanding the eye-mind link. *Psychol. Rev.* 119, 155–185.

potential: investigation of topography and source. *Brain Res.* 339, 271–280.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 February 2013; accepted: 06 May 2013; published online: 28 May 2013.*

*Citation: Nakatani C, Chehelcheraghi M, Jarrahi B, Nakatani H and van Leeuwen C (2013) Cross-frequency phase synchrony around the saccade period as a correlate of perceiver's internal state. Front. Syst. Neurosci. 7:18. doi: 10.3389/fnsys.2013.00018*

*Copyright © 2013 Nakatani, Chehelcheraghi, Jarrahi, Nakatani and van Leeuwen. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

#### **APPENDIX**

### **EFFECT OF OCULAR ARTIFACT CORRECTION**

**Figure A1** illustrates the effect of the EOG artifact reduction (see Methods for the procedure). Eye-fixation-related potentials (EFRPs) of the 1st/1 fixation trials computed from the original EEG signals (black) and from the signals after artifact correction (blue) are shown. (Time 0 is the fixation onset. In practice, the EFRPs are also time-locked to the saccade onset, since the standard deviation of saccade duration was only 3–4 ms. See Eye Movement Results for more details.) As expected, the artifact correction reduced horizontal and vertical EOG-related signals without distorting the waveforms.

### **TWELVE INDEPENDENT COMPONENTS GENERATED FROM 1st/1 EFRPs**

**Figure A2** lists all (12) independent components identified.

## Antecedent occipital alpha band activity predicts the impact of oculomotor events in perceptual switching

### *Hironori Nakatani<sup>1,2</sup>\* and Cees van Leeuwen<sup>3</sup>*

*<sup>1</sup> Okanoya Emotional Information Project, Exploratory Research for Advanced Technology, Japan Science and Technology Agency, Wako-shi, Japan*

*<sup>2</sup> Emotional Information Joint Research Laboratory, RIKEN Brain Science Institute, Wako-shi, Japan*

*<sup>3</sup> Laboratory for Perceptual Dynamics, Experimental Psychology Unit, Faculty of Psychological and Educational Sciences, KU Leuven, Leuven, Belgium*

#### *Edited by:*

*Artem Belopolsky, Vrije Universiteit Amsterdam, Netherlands*

#### *Reviewed by:*

*Natasha Sigala, University of Sussex, UK*

*Eelke Spaak, Donders Institute for Brain, Cognition, and Behaviour, Netherlands*

#### *\*Correspondence:*

*Hironori Nakatani, Okanoya Emotional Information Project, Exploratory Research for Advanced Technology, Japan Science and Technology Agency, 2-1 Hirosawa, Wako-shi, Saitama 351-0198, Japan. e-mail: hnakatani@brain.riken.jp*

Oculomotor events such as blinks and saccades transiently interrupt the visual input and, even though this mostly goes undetected, these brief interruptions could still influence the percept. In particular, both blinking and saccades facilitate switching in ambiguous figures such as the Necker cube. To investigate the neural state antecedent to these oculomotor events during the perception of an ambiguous figure, we measured the human scalp electroencephalogram (EEG). When blinking led to perceptual switching, antecedent occipital alpha band activity exhibited a transient increase in amplitude. When a saccade led to switching, a series of transient increases and decreases in amplitude was observed in the antecedent occipital alpha band activity. Our results suggest that the state of occipital alpha band activity predicts the impact of oculomotor events on the percept.

**Keywords: ambiguous figures, Necker cube, perception, electroencephalogram (EEG), blinking, saccade**

### **INTRODUCTION**

Oculomotor behaviors such as blinking and saccades are prominent in visual perception. We spontaneously blink our eyes every few seconds. The role of blinking goes beyond merely moistening the eyes; among other things, blinking reflects the deployment of attentional resources. For example, blinking frequency decreases when cognitive demand increases (Veltman and Gaillard, 1998). Also, blinking tends to occur at breakpoints of attention (Nakano et al., 2009; Nakano and Kitazawa, 2010) and may have an active role in the disengagement of attention (Nakano et al., 2013).

Saccadic eye movements occur with a similar intensity; several times per second we spontaneously shift our gaze from one location to the next. Saccades are closely associated with visual attention. For example, the allocation of spatial attention is tightly time-locked to saccade execution (Filali-Sadouk et al., 2010). Neuroimaging studies have shown that attention and saccade planning share common neural substrates in the frontal and parietal areas (Corbetta et al., 1998; Nobre et al., 2000). Attention modulates the content of our percept; when a certain part within the visual input is selectively attended, the corresponding information is enhanced (Chelazzi et al., 2001).

Although both blinking and saccades are tightly associated with attentional processes and cause transient changes in retinal stimulation, we are seldom aware of this. Sensitivity to visual input is normally actively suppressed during blinking and saccades (Volkmann et al., 1980; Burr et al., 1994; Bristow et al., 2005). The blink and saccadic suppression mechanisms, in combination with the constructive abilities of perception (Koenderink et al., 2012), explain why our visual experience remains continuous across transient interruptions of visual input by blinking.

However, recent studies suggest that both blinking and saccades can have an impact on our visual experience. Blinking can, for instance, trigger illusory motion in the Rotating Snakes illusion (Otero-Millan et al., 2012). These authors showed that, besides blinking, microsaccades also led to the perception of illusory motion. Together, these results indicate that oculomotor events sometimes lead to visual transients affecting the percept. Other such effects have been reported in the case of multistable perception. For example, in one of our previous studies, we found that some, but not all, blinks and saccades led to *perceptual switching* in an ambiguous figure (Nakatani et al., 2011). Perceptual switching is the phenomenon in which a percept switches spontaneously between possible interpretations of an ambiguous figure (e.g., Attneave, 1971; Leopold and Logothetis, 1999; Ito et al., 2003; Parker and Krug, 2003; Nakatani and van Leeuwen, 2005, 2006; Nakatani et al., 2011, 2012).

Why is it that some oculomotor events lead to changes in our visual percept, whereas others do not? We propose that the ones that lead to changes are preceded by a shift in visual attention. Attention modulates the content of our percept by selectively enhancing an attended part within the visual input (Chelazzi et al., 2001). We, therefore, expect that attention-related brain signals predict whether oculomotor events have an influence on the percept. With new analyses on previously reported data about oculomotor behavior and the electroencephalogram (EEG) in perceptual switching (Nakatani et al., 2011), we show that preceding occipital alpha band activity predicts the impact on the percept of blinks or saccades.

### **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Six participants (aged 21–34 years) participated in this study. Participants gave their written informed consent to the study. The Research Ethics Committee of the RIKEN had approved our procedures.

#### **EXPERIMENTAL DESIGN**

Since some of the results from the present study have been published earlier, here we report only the main characteristics of the design. For further details, see Nakatani et al. (2011). The experiment consisted of two conditions: a *perceptual switching* condition and a *stimulus initiated* condition (Nakatani et al., 2011). Each lasted 240 s and both conditions were presented in counterbalanced order within a session. In the *perceptual switching* condition, a Necker cube (**Figure 1**) was continuously presented as a white line-drawing, subtending 5° of visual angle, on a black background. The stimulus was shown at eye-height in a soundproof room with reduced ceiling illumination. Since perceptual switching may fail to occur if participants know only one of the possible interpretations of an ambiguous figure (Rock et al., 1994), all participants were advised before the experiment that the Necker cube could be seen in two alternative orientations, referred to here as the "downward" and "upward" orientations; this ensured that they started with equal information. Participants pressed a response button corresponding to the perceived switching direction, i.e., from upward to downward or vice versa. They had been instructed to do so whenever their visual percept of the Necker cube reversed but not when it merely became inconsistent or vague. The *stimulus initiated* condition was a control condition, which is not relevant for the present paper. Three sessions separated by a break were conducted for each participant.

#### **MEASUREMENTS OF OCULOMOTOR EVENTS AND EEG**

Oculomotor events (blinks and saccades) were measured with an SR Research Eyelink system in three participants and an SR Research Eyelink 1000 system in the others. Both are video-based eye-tracking systems. As the SR Research Eyelink system broke down halfway through our study, we used the SR Research Eyelink 1000 system for the remaining participants. We presented the stimulus to both eyes, and measured oculomotor events (in particular, blinks and saccades) from the dominant eye.

Simultaneously with oculomotor events, we also measured EEG. Disk-type Ag/AgCl electrodes were placed on O1, O2, P3, Pz, P4, F3, Fz, and F4 recording sites in accordance with the international 10/20 system (Jasper, 1958). When the SR Research Eyelink 1000 system was used for oculomotor events measurement, it was possible to place additional electrodes on C3, Cz, and C4 recording sites. Reference and ground electrodes were placed on left and right ears of each participant, respectively. Vertical and horizontal electrooculogram (EOG) were also recorded. Sampling frequency was 500 Hz.

### **EEG ANALYSIS**

We used independent component analysis (Hyvärinen and Oja, 2000) to reduce oculomotor artifacts in EEG recordings. Using the FastICA algorithm (Hyvärinen and Oja, 1997), we decomposed the EEG and EOG recordings into independent components. We then reconstructed the EEG recordings after we removed components that had larger correlation with vertical or horizontal EOG than with EEG.
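The component-rejection rule can be illustrated with a minimal Python sketch. It assumes that the independent components have already been estimated (for example with FastICA) and that "correlation with EEG" means the largest absolute Pearson correlation with any EEG channel. This reading of the rule, and the names `pearson` and `is_ocular_component`, are assumptions made for illustration; they are not taken from the original C scripts.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant signal carries no correlation information
    return cov / math.sqrt(var_a * var_b)

def is_ocular_component(component, eog_channels, eeg_channels):
    """True if the component correlates more strongly with EOG than with EEG.

    Assumption: the strongest absolute correlation over the vertical and
    horizontal EOG channels is compared with the strongest over EEG channels.
    """
    max_eog = max(abs(pearson(component, ch)) for ch in eog_channels)
    max_eeg = max(abs(pearson(component, ch)) for ch in eeg_channels)
    return max_eog > max_eeg
```

Components flagged in this way would be set to zero before the EEG is reconstructed from the remaining components.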

In order to analyze EEG in the time-frequency domain, we applied a continuous wavelet transform to EEG. The mother function of the wavelet transform was the complex Gabor function *g*(*t*),

$$g(t) = \frac{1}{2\sqrt{\pi\alpha}} \exp\left(-\frac{t^2}{4\alpha^2}\right) \exp\left(i2\pi t\right),$$

where α = 0.5. The size of the mother function was about 5 cycles. Wavelet coefficients of a signal *x*(*t*), each channel of EEG, were obtained as follows:

$$W(t,f) = \sqrt{f} \int x(\tau) \, g^{*}\left( f \, (\tau - t) \right) d\tau,$$

where *g*<sup>∗</sup>(*t*) is the complex conjugate of the complex Gabor function, and *t* and *f* indicate time and frequency, respectively. We then obtained EEG amplitude |*W*(*t*, *f*)| in the time-frequency domain.
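The transform can be approximated numerically by replacing the integral with a sum over samples. The following Python sketch does this for a single time point and frequency. The truncation of the wavelet to ±2.5 cycles (matching the stated size of about 5 cycles), the Riemann-sum discretization, and the function names are assumptions for illustration, not a description of the original C implementation.

```python
import cmath
import math

ALPHA = 0.5         # Gaussian width parameter given in the text
HALF_CYCLES = 2.5   # assumed truncation: about 5 cycles in total

def gabor(u, alpha=ALPHA):
    """Complex Gabor mother function g(u), with u measured in cycles."""
    envelope = math.exp(-u * u / (4.0 * alpha * alpha))
    scale = 1.0 / (2.0 * math.sqrt(math.pi * alpha))
    return scale * envelope * cmath.exp(2j * math.pi * u)

def wavelet_amplitude(x, fs, f, t_index):
    """Approximate |W(t, f)| at sample t_index of signal x sampled at fs Hz."""
    half_width = int(round(HALF_CYCLES / f * fs))  # samples on each side
    lo = max(0, t_index - half_width)
    hi = min(len(x), t_index + half_width + 1)
    total = 0j
    for k in range(lo, hi):
        u = f * (k - t_index) / fs          # f * (tau - t), in cycles
        total += x[k] * gabor(u).conjugate()
    # sqrt(f) prefactor; dividing by fs turns the sum into an integral
    return abs(math.sqrt(f) * total / fs)
```

For a 10 Hz sinusoid sampled at 500 Hz, this function returns a much larger amplitude at *f* = 10 Hz than at *f* = 20 Hz, as expected for a band-limited signal.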

After the continuous wavelet transform, we calculated average waveforms of EEG amplitude that were aligned with oculomotor events of interest for each participant. We used trial average data per participant for the following statistical analyses.

To detect EEG episodes that were associated with oculomotor events preceding perceptual switching (*pre-switch* oculomotor events), we conducted two types of comparisons. First, we compared average waveforms of EEG amplitude with a baseline amplitude. We defined a separate baseline amplitude for each frequency. The baseline amplitude was the mean amplitude at each frequency calculated from the entire 12 min of EEG recorded from the three experimental sessions. Second, we compared average waveforms of EEG amplitude aligned with *pre-switch* oculomotor events with average waveforms of EEG amplitude aligned with *no-switch* oculomotor events. For statistical comparison, we applied the bootstrap resampling method (see below in this section) to the EEG activity that preceded the oculomotor events. To avoid massive multiple comparisons, we divided the time-frequency domain into relatively large segments. Each segment was 50 ms wide and 1 Hz high (see **Figure 2B**), and the amplitude of each segment was defined as the mean amplitude within the segment. When the amplitude of the average waveform in a certain EEG episode differed statistically from the baseline, we considered that this episode was associated with the oculomotor event of interest. Furthermore, we considered that the episode occurred in relation to a *pre-switch* oculomotor event when its average waveform for *pre-switch* oculomotor events was statistically different from that for *no-switch* oculomotor events.
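The segmentation step can be sketched in Python as follows. At the 500 Hz sampling rate reported above, a 50 ms segment spans 25 samples. The layout of the amplitude grid (one row per frequency step, one column per sample) and the function name `segment_means` are assumptions for illustration.

```python
def segment_means(tf_amplitude, samples_per_segment, freqs_per_segment=1):
    """Average a [frequency][time] amplitude grid over rectangular segments.

    Incomplete segments at the end of the time or frequency axis are dropped.
    """
    n_freq = len(tf_amplitude)
    n_time = len(tf_amplitude[0])
    segments = []
    for f0 in range(0, n_freq - freqs_per_segment + 1, freqs_per_segment):
        row = []
        for t0 in range(0, n_time - samples_per_segment + 1,
                        samples_per_segment):
            values = [tf_amplitude[f][t]
                      for f in range(f0, f0 + freqs_per_segment)
                      for t in range(t0, t0 + samples_per_segment)]
            row.append(sum(values) / len(values))
        segments.append(row)
    return segments
```

With one grid row per 1 Hz, calling `segment_means(grid, 25)` would yield the 50 ms by 1 Hz segments described in the text.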

For graphic representation of average waveforms in the Results section, we used Z-transformed amplitude. We used mean and standard deviation of EEG amplitude calculated from 12 min of whole recordings for Z-transform. As baseline amplitude used for statistical testing was 0 in the Z-transformed amplitude, this makes it easier to observe whether EEG amplitude increased or decreased in relation to oculomotor events of interest. We applied Z-transform to un-averaged EEG amplitude |*W* (*t*, *f*)|. We calculated average waveforms of Z-transformed amplitude that were aligned with oculomotor events of interest for each participant, and then calculated average waveforms across participants. We graphed the average waveforms across participants.
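A minimal Python sketch of the Z-transform and the event-aligned averaging for one frequency is given below. The use of the population standard deviation, the epoch boundaries, and the function names are assumptions for illustration.

```python
import math

def z_transform(amplitudes):
    """Z-score an amplitude series using the mean and SD of the whole series."""
    n = len(amplitudes)
    mean = sum(amplitudes) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in amplitudes) / n)
    return [(a - mean) / sd for a in amplitudes]

def event_aligned_average(series, event_indices, pre, post):
    """Mean of the epochs series[e - pre : e + post] over all events e.

    Events too close to either end of the recording are skipped.
    """
    epochs = [series[e - pre:e + post] for e in event_indices
              if e - pre >= 0 and e + post <= len(series)]
    if not epochs:
        raise ValueError("no complete epoch available")
    return [sum(column) / len(epochs) for column in zip(*epochs)]
```

Because the Z-transform is applied to the whole recording before averaging, a value of zero in the averaged waveform corresponds to the baseline amplitude at that frequency.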

We applied the bootstrap resampling method (Efron, 1979; Efron and Tibshirani, 1986), in order to evaluate the statistical significance of differences in parameters of interest between two data sets (Nakatani et al., 2011). This method is a non-parametric approach and, therefore, there is no need to assume that the parameters of interest follow the normal distribution. When we compare two data sets $x_A^i$ and $x_B^i$, we first calculate the difference between them as follows:

$$x_{\text{diff}}^{i} = x_{A}^{i} - x_{B}^{i},$$

where *i* is an integer from 1 to *n* that identifies individual participants, and *n* is the number of participants (*n* = 6 in this study). Then, we calculate the group average of $x_{\text{diff}}^{i}$. That is,

$$\overline{x}_{\text{diff}} = \frac{1}{n} \sum_{i=1}^{n} x_{\text{diff}}^{i}.$$

**FIGURE 2 | An amplitude increase in occipital alpha band activity preceded blinking, when blinking was followed by perceptual switching. (A)** Average waveform of the occipital recordings (O2) aligned with the *pre-switch* blinks that were followed by perceptual switching. The amplitude was Z-transformed, and the amplitude *zero* indicates baseline amplitude. **(B)** Statistical difference between average waveform aligned with the *pre-switch* blinks and baseline amplitude. The colors *red* and *blue* denote that amplitude was larger in average waveform or baseline amplitude, respectively (*P* < 0.05). **(C)** Average waveform of the occipital recordings (O2) aligned with the *no-switch* blinks that were not

*pre-switch* blinks and average waveform aligned with the *no-switch* blinks. The colors *red* and *blue* denote that amplitude was larger in the *pre-switch* blinking-related average waveform or the *no-switch* blinking-related average waveform, respectively (*P* < 0.05).

The null hypothesis to be tested is that there is no difference between the two data sets. That is,

$$\overline{x}_{\text{diff}} = 0.$$

We generated bootstrap parameters for $\overline{x}_{\text{diff}}$ that satisfy the null hypothesis. The distribution of these bootstrap parameters is used to evaluate the statistical significance of $\overline{x}_{\text{diff}}$. We first generated the bootstrap parameters for each participant,

$$x_{\text{diff}}^{*i} = \left(x_{A}^{i} - \overline{x}_{A}\right) - \left(x_{B}^{i} - \overline{x}_{B}\right).$$

As the group average of $x_{\text{diff}}^{*i}$, $\overline{x}_{\text{diff}}^{*}$, is 0, the bootstrap parameters satisfy the null hypothesis. With the bootstrap resampling method (Efron, 1979; Efron and Tibshirani, 1986), we estimated the distribution of $\overline{x}_{\text{diff}}^{*}$, which can be considered as the distribution of $\overline{x}_{\text{diff}}$ in case it satisfies the null hypothesis, and obtained a Monte Carlo approximation of the *p*-value for $\overline{x}_{\text{diff}}$.
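The procedure can be sketched in Python as follows. The centred per-participant differences are resampled with replacement, and the *p*-value is the share of resampled group averages that are at least as extreme as the observed one. The two-sided criterion, the number of resamples, the fixed seed, and the function name are assumptions for illustration; the text does not specify them.

```python
import random

def bootstrap_p_value(x_a, x_b, n_resamples=10000, seed=0):
    """Monte Carlo p-value for the group-average paired difference."""
    n = len(x_a)
    observed = sum(a - b for a, b in zip(x_a, x_b)) / n
    mean_a = sum(x_a) / n
    mean_b = sum(x_b) / n
    # Centred differences: their group average is zero (null hypothesis).
    null_diffs = [(a - mean_a) - (b - mean_b) for a, b in zip(x_a, x_b)]
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Draw n participants with replacement and average their values.
        resample = [rng.choice(null_diffs) for _ in range(n)]
        if abs(sum(resample) / n) >= abs(observed):
            extreme += 1
    return extreme / n_resamples
```

With only six participants the null distribution is coarse, which is one reason to treat the resulting *p*-values as approximations.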

To deal with the multiple comparison problem, we used a cluster-based permutation test (Bullmore et al., 1999; Maris and Oostenveld, 2007; Groppe et al., 2011). This method is a non-parametric approach, suitable to detect broadly distributed effects (Groppe et al., 2011). We ignored all segments of the time-frequency domain for which the test statistic, $\overline{x}_{\text{diff}}$, did not exceed a pre-determined threshold, corresponding to an uncorrected *p*-value of 1%. The remaining segments were grouped into clusters of adjacent segments in the time-frequency domain. We calculated the mean value of the test statistic for each cluster, in order to define a cluster-level value for the test statistic. The most extreme cluster-level value of the test statistic was used in the permutation procedure, in order to derive a distribution under the null hypothesis. As in the statistical testing with the bootstrap resampling method, the null hypothesis was zero difference between the two data sets. The corrected *p*-value of each cluster was derived from its ranking in the null hypothesis distribution, and each segment of the cluster was then assigned the *p*-value of the entire cluster. We considered significant those segments for which the corrected *p*-value was below the 5% level. From here on, only corrected significance levels will be reported.
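The clustering and ranking steps can be sketched in Python as follows. The sketch assumes that segments count as adjacent when they share an edge in the time-frequency grid, that the sign of the effect is ignored when clusters are formed, and that the null distribution of extreme cluster-level values has already been obtained by permutation. All function names are hypothetical.

```python
def find_clusters(p_values, threshold=0.01):
    """Group supra-threshold segments that share an edge (4-connectivity).

    p_values is a [frequency][time] grid of uncorrected p-values.
    Returns a list of clusters, each a sorted list of (freq, time) indices.
    """
    n_freq, n_time = len(p_values), len(p_values[0])
    seen = [[False] * n_time for _ in range(n_freq)]
    clusters = []
    for f in range(n_freq):
        for t in range(n_time):
            if seen[f][t] or p_values[f][t] >= threshold:
                continue
            stack, cluster = [(f, t)], []
            seen[f][t] = True
            while stack:  # flood fill from the seed segment
                cf, ct = stack.pop()
                cluster.append((cf, ct))
                for nf, nt in ((cf - 1, ct), (cf + 1, ct),
                               (cf, ct - 1), (cf, ct + 1)):
                    if (0 <= nf < n_freq and 0 <= nt < n_time
                            and not seen[nf][nt]
                            and p_values[nf][nt] < threshold):
                        seen[nf][nt] = True
                        stack.append((nf, nt))
            clusters.append(sorted(cluster))
    return clusters

def cluster_level_value(cluster, statistic):
    """Mean of the test statistic over the segments of one cluster."""
    return sum(statistic[f][t] for f, t in cluster) / len(cluster)

def corrected_p_value(cluster_value, null_extremes):
    """Share of null-distribution extremes at least as large in magnitude."""
    hits = sum(1 for v in null_extremes if abs(v) >= abs(cluster_value))
    return hits / len(null_extremes)
```

Every segment in a cluster would then inherit the corrected *p*-value returned for that cluster.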

For our analyses, we used custom scripts written in C (gcc compiler version 4.2.1 on MacOSX 10.6.8).

#### **TEMPORAL DISTRIBUTIONS OF SACCADIC PROBABILITIES**

To investigate the relationship between blinking and saccades, we calculated temporal distributions of saccadic probabilities in alignment to onsets of blinking of interest. With the onset of a blink of interest as the reference (0 ms), saccades of a certain type that are time-locked to these blinks would be revealed by a peak in the aligned temporal distribution of saccade frequency. Likewise, if certain types of saccades are systematically omitted in a time-locked fashion, the distribution would show a dip. We calculated the occurrence probability of saccades within bins of 100 ms width with 50 ms overlap and obtained average probabilities across participants.
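The binning scheme can be sketched in Python for a single participant. The sketch assumes that the occurrence probability in a bin is the proportion of blinks of interest with at least one saccade onset in that bin; this definition and the function name are assumptions for illustration.

```python
def saccade_probability(saccade_times_per_blink, t_start, t_end,
                        bin_width=100, step=50):
    """Probability of at least one saccade per bin, relative to blink onset.

    saccade_times_per_blink holds, for each blink, the saccade onset times
    in ms with blink onset at 0. Bins cover [left, left + bin_width) and
    are shifted by `step`, so neighbouring bins overlap by 50 ms.
    Returns a list of (bin_left_edge, probability) pairs.
    """
    n_blinks = len(saccade_times_per_blink)
    result = []
    left = t_start
    while left + bin_width <= t_end:
        hits = sum(1 for times in saccade_times_per_blink
                   if any(left <= t < left + bin_width for t in times))
        result.append((left, hits / n_blinks))
        left += step
    return result
```

Averaging the per-participant probabilities bin by bin then gives the group distribution.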

### **RESULTS**

In our previous study (Nakatani et al., 2011), about 5% of blinking occurred in a short period in which blinking was temporarily increased, about 1000 ms prior to a switching response. Such *pre-switch* blinks were followed by a transient amplitude increase of theta band activity. The switches showed a larger than average bias to the interpretation of the Necker cube that individual participants preferred. About 150 ms prior to a switching response, ∼5% of saccades occurred in a short period in which saccade frequency was temporarily increased. Such *pre-switch* saccades were preceded by a transient amplitude increase of theta band activity. The direction of the saccades was systematically related to the interpretation of the Necker cube after the switch. For the blinks and saccades occurring in these two specific intervals, we here investigated the EEG activity prior to these events.

For the *pre-switch* blinking, alpha band activity (around 10 Hz) over the occipital area exhibited an increase in amplitude around 250 ms before the onset of the blink (**Figure 2A**). The increase was significant at the *P* < 0.05 level, compared to baseline amplitude (**Figure 2B**). In the same intervals prior to *no-switch* blinks, i.e., blinks that were not followed by perceptual switching, the occipital area did not exhibit such amplitude increase (**Figures 2C**,**D**). We also directly compared the *pre-switch* blinking-related EEG with the *no-switch* blinking-related EEG, and found that the amplitude increase was specific to the *pre-switch* blink-related EEG (**Figure 2E**).

To investigate whether the amplitude increase over occipital areas was due to an eye-movement artifact, we compared saccade probabilities between *pre-switch* and *no-switch* blinks (**Figure 3A**). Around the time when the occipital area exhibited an amplitude increase in relation to *pre-switch* blinks, no difference in saccade probabilities between *pre-switch* and *no-switch* blinks was observed (*P* = 0.740 > 0.05, for the 300–250 ms interval prior to blinking onset). Saccade probabilities were significantly larger, however, in the interval between the occipital alpha band activity and blinking onset in *pre-switch* blinks, compared with *no-switch* blinks (*P* = 0.048 < 0.05 for the 200–150 ms interval prior to blinking).

These *pre-blink* saccades potentially constitute an even earlier oculomotor predictor of an ensuing switch than the subsequent blink. As they preceded *pre-switch* blinking that occurred 1000 ms before switch responses, they clearly differ in their timing from the *pre-switch* saccades currently under investigation. In our previous study (Nakatani et al., 2011), *leftward* (130–240°) *pre-switch* saccades tended to be followed by perceptual switching to a downward interpretation of the Necker cube; switching to an upward interpretation tended to follow *rightward* (300–60°) *pre-switch* saccades (in a polar coordinate system with right = 0°, top = 90°, left = 180°, bottom = 270°). If the present, earlier *pre-blink* saccades reflect the switching process, we may expect that the same relationship holds; the direction of the *pre-blink* saccades was expected to be associated with the preferred interpretation of the Necker cube after a switch, because the *pre-switch* blinks led to perceptual switching to the preferred interpretation (Nakatani et al., 2011). The downward interpretation was the preferred one for five out of six participants and the upward interpretation was the preferred one for the remaining participant (Nakatani et al., 2011).

As shown in **Figure 3B**, indeed this relationship was evident in our data. We may conclude that these *pre-blink* saccades reflect the ensuing switch in a manner similar to the *pre-switch* saccades.

We next analyzed the EEG activity before the *pre-switch* saccades, which occurred 150 ms before switch responses (Nakatani et al., 2011). For the *pre-switch* saccades, the occipital area exhibited an increase in the alpha band (around 10 Hz) 650 ms before and a decrease in the higher alpha band (around 11 Hz) 150 ms before the onset of the saccade, compared to baseline amplitude (**Figures 4A**,**B**). The lower theta band (around 5 Hz) activity 400 ms before saccade onset also exhibited increased amplitude, as we had observed in our previous study (Nakatani et al., 2011). In contrast, for the *no-switch* saccades, the occipital area did not exhibit such amplitude in- and decrease (**Figures 4C**,**D**). We also directly compared the *pre-switch* saccade-related EEG with the *no-switch* saccade-related EEG, and found that the amplitude in- and decrease were specific to the *pre-switch* saccade-related EEG (**Figure 4E**).

The mother wavelet used in the continuous wavelet transform of EEG had a width of 5 cycles. At 10 Hz, EEG within a 500 ms window (250 ms before and 250 ms after) could therefore affect the amplitude at the time of interest in the time-frequency domain. To check whether the amplitude in- or decreases before *pre-switch* oculomotor events were due to some part of the oculomotor events bleeding into the pre-event activity, we applied the same analyses with a mother wavelet of which the width was 3 cycles; we obtained similar results (see **Figures A1**, **A2** of Appendix).
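As a worked check of the window widths involved, and taking the temporal extent of a wavelet simply as its number of cycles divided by the frequency (a back-of-the-envelope approximation):

$$T = \frac{N_{\text{cycles}}}{f}, \qquad T_{5} = \frac{5}{10\ \text{Hz}} = 500\ \text{ms}, \qquad T_{3} = \frac{3}{10\ \text{Hz}} = 300\ \text{ms}.$$

At 10 Hz, a 3-cycle wavelet thus draws on about 150 ms of EEG on either side of the time of interest, compared with about 250 ms for the 5-cycle wavelet.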

#### **DISCUSSION**

We analyzed the properties of EEG episodes preceding the onset of eye-blinks and saccades. Occipital alpha band activity prior to blinks and saccades was predictive of whether these would lead to subsequent switching of perceived orientation in the Necker cube. When the occipital alpha band activity exhibited increased amplitude, blinks led to perceptual switching, and when it exhibited an increase followed by a decrease in amplitude, the saccades led to perceptual switching.

Alpha band activity at occipital sites has been associated with attentional deployment. For example, an anticipatory shift of visual attention to a target decreases the amplitude of alpha band activity in cortical areas tuned to the newly attended location (Sauseng et al., 2005; Yamagishi et al., 2005, 2008; Thut et al., 2006; Rihs et al., 2009), suggesting that a decrease in alpha reflects facilitation of future visual processing. On the other hand, cortical areas that are tuned to unattended locations exhibited increased alpha amplitude (Worden et al., 2000; Sauseng et al., 2005; Kelly et al., 2006; Rihs et al., 2007), suggesting that this reflects active inhibition of task-irrelevant processing (Klimesch et al., 1999, 2007). Thus, a sequence of increased and decreased alpha band activity would contribute to the deployment of visual spatial attention through disengaging and shifting of attention, respectively.

Attentional deployment might facilitate perceptual switching. According to the focal-feature hypothesis (Toppino, 2003), different focal regions within an ambiguous figure favor one perception of an ambiguous figure over another, by selectively enhancing a certain part within the visual input (Chelazzi et al., 2001).

Both blinks and saccades are associated with attentional processes. Blinks tend to occur at breakpoints of attention (Nakano et al., 2009; Nakano and Kitazawa, 2010) and are involved in the process of attentional disengagement (Nakano et al., 2013). Thus, the combination of an amplitude increase of the alpha band activity and blinking would reflect the disengagement of attention to initiate the process of perceptual switching. On the other hand, saccades are associated with the reallocation of spatial attention (Filali-Sadouk et al., 2010). The combination of an amplitude decrease of the alpha band activity and a saccade might reflect the shift of attention that elicits the process of perceptual switching.

Our observations were correlational and therefore we cannot point out a causal relationship between occipital alpha band activity and perceptual switching. It is possible that the alpha band activity appeared time-locked to blinks or saccades with no functional role in the switching process. Based on a number of studies about the alpha band activity and attentional deployment (Worden et al., 2000; Sauseng et al., 2005; Yamagishi et al., 2005, 2008; Kelly et al., 2006; Thut et al., 2006; Rihs et al., 2007, 2009), we suggest, nevertheless, that the presently observed amplitude modulation of the alpha band activity was associated with attentional processes that play an active role in perceptual switching.

**alpha band activity preceded saccade, when saccade was followed by perceptual switching. (A)** Average waveform of the occipital recordings (O2) aligned with the *pre-switch* saccades that were followed by perceptual switching. The amplitude was Z-transformed, and the amplitude *zero* indicates baseline amplitude. **(B)** Statistical difference between average waveform aligned with the *pre-switch* saccades and baseline amplitude. The colors *red* and *blue* denote that amplitude was larger in average waveform or baseline amplitude, respectively (*P* < 0.05). **(C)** Average waveform of the occipital recordings (O2) aligned with the *no-switch* saccades that were not

In our previous study (Nakatani et al., 2011), we discussed possible relationships between *pre-switch* oculomotor events and posterior theta band (around 5 Hz) activity. The *pre-switch* blinks occurred 1000 ms prior to switching responses. The theta band activity appeared 400 ms after the blinks. *Pre-switch* saccades occurred 150 ms prior to switching responses. Here, the theta band activity appeared 400 ms before the saccades. The posterior theta band activity was also observed in the control condition, where presented stimuli were switched from one to the other. As the theta band activity followed changes of presented stimuli, we considered it to reflect the change of percept. Taking the current and previous findings together, we may describe the processes of perceptual switching as follows.

In the case of *pre-switch* blinking, first a disengagement of attention occurs, reflected by the amplitude increase of the alpha band activity. The subsequent blink facilitates the detachment of attention from a part of the Necker cube that corresponds to the current percept. This process is more likely to be observed during the non-preferred interpretation of the Necker cube, as blinks tend to be followed by a switch to the preferred interpretation. Saccades that occur between the amplitude increase of the alpha band activity and the *pre-switch* blinks likewise facilitate detachment of attention by shifting the gaze from an attended location to another location. After attentional disengagement, the process of changing the percept is reflected by the theta band activity. In the case of the *pre-switch* saccade, first a disengagement of attention occurs, reflected by an amplitude increase of the alpha band activity 600 ms before saccades. Second, the process of changing the percept is reflected by the theta band activity. Then, the shift of attention occurs to facilitate the change of percept, reflected by an amplitude decrease of the alpha band activity and a saccade whose direction was associated with the interpretation of the Necker cube after the switch (Nakatani et al., 2011), and it leads to perceptual switching.

In conclusion, we have shown that preceding occipital alpha band activity predicts the impact of oculomotor events on the current percept during perception of an ambiguous figure. Our results suggest that spontaneous oculomotor events dynamically play an active role in perceptual organization.

### **ACKNOWLEDGMENTS**

Cees van Leeuwen is supported by an Odysseus Grant from the Flemish Organization for Science (FWO).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 February 2013; accepted: 08 May 2013; published online: 24 May 2013.*

*Citation: Nakatani H and Van Leeuwen C (2013) Antecedent occipital alpha band activity predicts the impact of oculomotor events in perceptual switching. Front. Syst. Neurosci. 7:19. doi: 10.3389/fnsys.2013.00019*

*Copyright © 2013 Nakatani and Van Leeuwen. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

### **APPENDIX**

The mother wavelet used in the continuous wavelet transform of EEG had a width of 5 cycles. At 10 Hz, oculomotor behavior within a 500 ms window (250 ms before and 250 ms after) could therefore affect the amplitude of EEG at the time of interest. To check whether the amplitude increases or decreases before *pre-switch* oculomotor events were due to some part of the oculomotor event-related activity bleeding into the pre-event activity, we applied the same analyses with a mother wavelet of which the width was 3 cycles. **Figure A1** shows the amplitude of EEG prior to blinks, and **Figure A2** shows the amplitude of EEG prior to saccades. The results are similar to those obtained with the 5-cycle wavelet analysis.

**FIGURE A2 | EEG amplitude preceding a saccade, when a mother wavelet with a size of 3 cycles was used for the continuous wavelet transform. (A)** Average waveform of the occipital recordings (O2) aligned with the *pre-switch* saccades that were followed by perceptual switching. The amplitude was Z-transformed, and the amplitude *zero* indicates baseline amplitude. **(B)** Statistical difference between average waveform aligned with the *pre-switch* saccades and baseline amplitude. The colors *red* and *blue* denote that amplitude was larger in the average waveform or baseline amplitude, respectively (*P* < 0.05). **(C)** Average waveform of the occipital recordings (O2) aligned with the *no-switch* saccades. The

*blue* denote that amplitude was larger in average waveform or baseline amplitude, respectively (*P* < 0.05). **(E)** Statistical difference between the average waveform aligned with the *pre-switch* saccades and the average waveform aligned with the *no-switch* saccades. The colors *red* and *blue* denote that amplitude was larger in the *pre-switch* saccade-related average waveform or the *no-switch* saccade-related average waveform, respectively (*P* < 0.05).

## Visual encoding and fixation target selection in free viewing: presaccadic brain potentials

#### *Andrey R. Nikolaev<sup>1</sup>\*, Peter Jurica<sup>2</sup>, Chie Nakatani<sup>1</sup>, Gijs Plomp<sup>3</sup> and Cees van Leeuwen<sup>1</sup>*

*<sup>1</sup> Laboratory for Perceptual Dynamics, University of Leuven, Leuven, Belgium*

*<sup>2</sup> Laboratory for Advanced Brain Signal Processing, RIKEN Brain Science Institute, Wako-shi, Japan*

*<sup>3</sup> Functional Brain Mapping Laboratory, Université de Genève, Genève, Switzerland*

#### *Edited by:*

*Sebastian Pannasch, Technische Universität Dresden, Germany*

#### *Reviewed by:*

*John E. Richards, University of South Carolina, USA*

*Thierry Baccino, University of Paris 8, France*

#### *\*Correspondence:*

*Andrey R. Nikolaev, Laboratory for Perceptual Dynamics, University of Leuven, Tiensestraat 102, Box 3711, Leuven B-3000, Belgium e-mail: andrey.nikolaev@ppw.kuleuven.be*

In scrutinizing a scene, the eyes alternate between fixations and saccades. During a fixation, two component processes can be distinguished: visual encoding and selection of the next fixation target. We aimed to distinguish the neural correlates of these processes in the electrical brain activity prior to a saccade onset. Participants viewed color photographs of natural scenes, in preparation for a change detection task. Then, for each participant and each scene we computed an image heat map, with temperature representing the duration and density of fixations. The temperature difference between the start and end points of saccades was taken as a measure of the expected task-relevance of the information concentrated in specific regions of a scene. Visual encoding was evaluated according to whether subsequent change was correctly detected. Saccades with larger temperature difference were more likely to be followed by correct detection than ones with smaller temperature differences. The amplitude of presaccadic activity over anterior brain areas was larger for correct detection than for detection failure. This difference was observed for short "scrutinizing" but not for long "explorative" saccades, suggesting that presaccadic activity reflects top-down saccade guidance. Thus, successful encoding requires local scanning of scene regions which are expected to be task-relevant. Next, we evaluated fixation target selection. Saccades "moving up" in temperature were preceded by presaccadic activity of higher amplitude than those "moving down". This finding suggests that presaccadic activity reflects attention deployed to the following fixation location. Our findings illustrate how presaccadic activity can elucidate concurrent brain processes related to the immediate goal of planning the next saccade and the larger-scale goal of constructing a robust representation of the visual scene.

**Keywords: saccades, EEG, presaccadic interval, attention, visual encoding, saccade guidance, change detection, heat maps**

### **INTRODUCTION**

While scrutinizing a visual scene, observers typically make saccadic eye movements from one fixation location to the next. During fixation intervals, two component processes can be distinguished. Visual encoding, the first of these processes, serves the overall goal of building a robust representation of the scene. Visual information extracted from attended and fixated locations is accumulated across eye movements (Melcher, 2001, 2006; Tatler et al., 2003, 2005; Pertzov et al., 2009) and is transferred to visual short- and long-term memory (Hollingworth and Henderson, 2002; Henderson and Hollingworth, 2003; reviewed in Hollingworth, 2004).

The second component process serves a more immediate goal of perception: deciding where to move the eyes next. Selection of the next fixation target involves directing covert attention, which precedes the execution of a saccade to the next target (Hoffman and Subramaniam, 1995; Deubel and Schneider, 1996). The selection is controlled by bottom-up target salience, in combination with top-down relevance of the target (reviewed in Awh et al., 2006).

Visual encoding and next-target selection are likely to share informational resources: information accumulated during the current fixation involves the spatial and semantic properties of a scene that determine what would be an interesting target for the next fixation. We may therefore expect that both encoding and next-target selection draw on the same attentional mechanisms, and that their neural markers overlap in time.

The goal of this study is to pinpoint and analyze in scalp-recorded electrical brain activity the processes of visual encoding and target selection as they evolve during the fixation interval. Our analysis is focused on the interval preceding saccade onsets. This interval has been studied in relation to covert attention shifts to the next fixation target, the initial phase of trans-saccadic remapping of receptive fields, and oculomotor preparation (reviewed in Melcher and Colby, 2008; Mathot and Theeuwes, 2011). Correspondingly, scalp-recorded electrical brain activity in the presaccadic interval reflects directing spatial attention (Wauschkuhn et al., 1998; Krebs et al., 2012), transsaccadic remapping (Parks and Corballis, 2008), and oculomotor preparation (Kurtzberg and Vaughan, 1982; Csibra et al., 1997; Richards, 2003).

The results of these studies throw light on brain processes related to the control of eye movements. Little is known, however, about the presaccadic activity related to the accumulation of visual information during scene viewing. In previous work we found that this activity is predictive of performance in a change-detection task (Nikolaev et al., 2011). Since change detection depends on successful accumulation of scene information (Simons and Rensink, 2005), we can use this as a criterion to study encoding in the presaccadic interval.

We may expect effects of encoding during the presaccadic interval to be modulated by periodic systematic tendencies in scene viewing (Tatler and Vincent, 2008), in other words, by viewing strategies. Viewing strategies are reflected in characteristic sequences of saccades and fixations (Unema et al., 2005; Tatler and Vincent, 2008; Graupner et al., 2011; Mills et al., 2011). For example, global scanning is reflected in large-amplitude saccades and short fixation durations, whereas local scanning is reflected in small-amplitude saccades and long fixation durations (Unema et al., 2005; Tatler and Vincent, 2008). Although patterns of short and long saccades tend to alternate throughout free viewing episodes (Tatler and Vincent, 2008; Mills et al., 2011), long saccades predominate during the first 2 s of free viewing (Unema et al., 2005; Pannasch et al., 2008; Graupner et al., 2011). This suggests that in the course of scrutinizing a scene a shift from global to local scanning strategy occurs. We may posit a corresponding shift from bottom-up to top-down saccade guidance (Findlay and Walker, 1999). Following these previous studies, we will analyze presaccadic potentials related to short, medium and long saccades separately, in order to determine scanning strategy and its influence on encoding.

Saccade size is reflected in eye fixation-related potentials (EFRPs) time-locked to the fixation onset. Graupner et al. (2011) occasionally presented circular distractors at the fixation location, 100 ms after fixation onset. The amplitude of the distractor-evoked EFRP components depended on the size of the preceding and/or following saccades. In contrast to Graupner et al. (2011), our study considers viewing strategies as they are reflected in the electrical brain activity *before* saccade onset.

The scalp-recorded activity in the presaccadic interval is characterized by a slow positive wave over parietal brain areas, which is called the antecedent potential (Becker et al., 1973; Kurtzberg and Vaughan, 1982; Moster and Goldberg, 1990; Csibra et al., 1997; Richards, 2003; Parks and Corballis, 2008), as well as by positive potentials over frontal areas (Richards, 2000; Gutteling et al., 2010).

The parietal and frontal potentials may reflect activity of, respectively, the lateral intraparietal area (LIP) and the frontal eye field (FEF). On the one hand, these areas are strongly interconnected (Andersen et al., 1985; Bullier et al., 1996) and share eye movement control functions between them (Medendorp et al., 2011): they map salient or task-relevant objects (Gottlieb and Balan, 2010) and are involved in guidance of spatial attention (Thompson et al., 1996; Goldberg et al., 2006). On the other hand, these areas are functionally distinct. For example, the frontal area is more closely associated with oculomotor functions than the parietal area (Curtis and D'Esposito, 2006; Connolly et al., 2007); in addition, whereas in top-down attention tasks frontal neurons respond earlier than parietal ones to saccade target location, in bottom-up attention tasks it is the other way around (Buschman and Miller, 2007). We will therefore distinguish the presaccadic activity over frontal and parietal areas in our analyses.

In sum, in our study we will distinguish between processes of visual encoding and target selection, as they are reflected in the electrical brain activity in the presaccadic interval. Since these processes may differ in bottom-up and top-down scanning strategies associated with saccade size, we will consider the presaccadic activity for different saccade sizes separately.

### **MATERIALS AND METHODS**

We re-analyzed data from a previously published study. The details of the experimental procedure can be found elsewhere (Nikolaev et al., 2011). Here we outline the main steps only.

#### **PARTICIPANTS**

Nineteen healthy participants (ages 19–24, median age 20, 5 men) took part in the study. The main analyses were done in seventeen participants who had sufficient numbers of epochs per condition, as described below. All participants gave written informed consent. The study was approved by Institutional Review Board No.2 (Research Ethics Committee) of RIKEN Brain Science Institute (Wako-shi, Japan) where we conducted the experiment.

#### **STIMULI**

We used 48 pairs of color photo images of real-world scenes from the study by Rensink et al. (1997). The images were 28° wide and 22° high. There were three types of differences between images in a pair: color, position, or presence/absence of an object. Color difference involved either an object or part of the background. Position difference referred to displacement of an object by several degrees of visual angle. Presence/absence involved the occurrence or non-occurrence of an object in the display. Stimuli were presented using custom-made software written in Python using Vision Egg interface (Straw, 2008).

#### **PROCEDURE**

Stimuli were presented on a 21-in. CRT Gateway monitor placed at 85 cm from the participant in a dimly lit room. In a practice session participants were familiarized with examples of the three types of difference between images. In the main session, after stable fixation on a central cross was reached, the first image of a pair, the *memorization display*, was presented for 20 s. Then, after a 1-s mask, the second image, the *search display*, was presented until response but no longer than 20 s. We asked participants to memorize the first image and detect a change in the second image. Participants had to respond with two mouse clicks: first, as soon as a change was detected, second, after placing the cursor on the change region. Feedback was immediately given by showing the image with a red ellipse over the changed region. The order of images was counterbalanced across participants, such that half of the time each image was used as a memorization or search display.

#### **EYE MOVEMENT RECORDING**

Eye movements were recorded with a video-based infrared eye-tracking system (EyeLink 1000/Tower, SR Research Ltd., Ontario, Canada). The participant's head was stabilized using a chin and forehead rest. The right eye was tracked with a sampling rate of 500 Hz. The eye-tracker was calibrated using nine points: in the center, four corners and mid-points of the four sides of the screen. Mean difference between calibration and validation measurement was kept below 1.5°.

#### **EEG RECORDING**

EEG was recorded with a Nihon Kohden MEG-6116 amplifier using an ECI electrode cap (Electro-Cap International, Inc., Eaton, USA) with 15 electrodes (F3, F4, C3, C4, P3, P4, O1, O2, T3, T4, T5, T6, Fz, Cz, Pz) placed according to the international 10/20 system with electrode FCz as ground and the linked mastoids as reference. Data were recorded with 0.5 Hz high-pass and 100 Hz low-pass online filters (and a 50 Hz notch filter) and digitized at 500 Hz.

The analog output of the EyeLink system was connected to the EEG amplifier and the eye movement horizontal (X) coordinate was recorded as an additional channel (the *EyeLink channel*) along with the regular EEG channels. To mark the eye movement events in the EEG data, we translated the time stamps of the events from the EyeLink data file. To that end, we correlated the EyeLink channel (recorded by the EEG acquisition computer) and the X coordinate time-series from the EyeLink data file (recorded by the eye movement acquisition computer). In each trial (which consisted of a presentation of two displays, a memorization and a search display), we selected a blink-free segment and determined the lag of maximum correlation between the EyeLink channel and the time-series from the EyeLink data file using normalized cross-correlation. The correlation coefficient for synchronized segments was never below 0.98. Finally, the time stamps of the eye movement events (computed by the EyeLink system) were adjusted for this lag and inserted in the EEG data as markers of saccades and fixations.
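The lag search described above can be illustrated with a minimal sketch. It is not the authors' software: the function names, the brute-force search over a limited range of lags (`max_lag`), and the convention that both recordings are sampled at the same rate are assumptions made for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat segment carries no timing information
    return cov / math.sqrt(var_a * var_b)


def best_lag(eeg_channel, tracker_x, max_lag):
    """Return (lag, r) for the lag with the highest correlation.

    A lag of L samples means that tracker sample k corresponds to
    EEG sample k + L. Both inputs are blink-free segments (assumed).
    """
    best = (0, -1.0)
    for lag in range(-max_lag, max_lag + 1):
        # Indices of tracker samples that have an EEG counterpart at this lag.
        start = max(0, -lag)
        stop = min(len(tracker_x), len(eeg_channel) - lag)
        if stop - start < 2:
            continue
        r = pearson(eeg_channel[start + lag:stop + lag], tracker_x[start:stop])
        if r > best[1]:
            best = (lag, r)
    return best


def to_eeg_sample(tracker_sample, lag):
    """Translate an eye-tracker event time stamp (in samples) to EEG time."""
    return tracker_sample + lag
```

Because the correlation coefficient is invariant to gain and offset, the analog EyeLink channel (in volts) and the X coordinate from the data file (in pixels) can be compared directly. At the 500 Hz sampling rate reported above, one sample of lag corresponds to 2 ms.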

#### **EYE MOVEMENT ANALYSIS**

We analyzed eye movements and EEG from the 20-s presentations of the memorization display.

Onset time, location, and duration of saccades and fixations were determined by EyeLink software. In our eye movement and EEG analyses we defined a fixation-related epoch by the combination of parameters of a fixation and a following saccade: the fixation has to be longer than 200 ms and shorter than 2000 ms, and the following saccade has to be shorter than 60 ms. Presaccadic intervals containing blinks were excluded.
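The selection criteria above amount to a simple filter over fixation/saccade pairs. The sketch below is illustrative only; the dictionary keys (`fix_dur_ms`, `sacc_dur_ms`, `blink`) are hypothetical names, not fields of the EyeLink data file.

```python
def select_epochs(events):
    """Keep fixation/saccade pairs that satisfy the criteria in the text.

    Each event is a dict with hypothetical keys:
      fix_dur_ms  - duration of the fixation (ms)
      sacc_dur_ms - duration of the following saccade (ms)
      blink       - True if the presaccadic interval contains a blink
    """
    return [
        e for e in events
        if 200 < e["fix_dur_ms"] < 2000   # fixation longer than 200 ms, shorter than 2000 ms
        and e["sacc_dur_ms"] < 60         # following saccade shorter than 60 ms
        and not e["blink"]                # blink-free presaccadic interval
    ]
```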

#### **FIXATION HEAT MAPS**

**FIGURE 1 | Temperature map computed as a function of fixation duration and fixation density.** The saccades are superimposed on the image. Green circles designate the starting points of saccades and white circles designate the end points. The direction of the saccade is given a positive or negative sign, depending on the difference between the temperature values of start and end points: positive if the end temperature is higher than the start, negative if vice versa.

For each image and each individual participant, heat maps were computed as a function of fixation duration and fixation density (**Figure 1**). All fixations collected during presentation of a memorization display were accumulated to compute a heat map using visual kernel density estimation (Jones et al., 1996). The contribution of each fixation was represented by a Gaussian kernel with unit amplitude and spread proportional to the fixation duration:

$$t_i(X,Y) = \exp\left[-0.5\left(\frac{(X-x_i)^2 + (Y-y_i)^2}{\sigma^2}\right)\right]$$

where $t_i$ is the temperature contribution of the $i$th fixation, $(X, Y)$ is the pixel coordinate within the stimulus image, $(x_i, y_i)$ is the center of the $i$th fixation, and the spread ($\sigma = \tau_i/k$) is proportional to the duration ($\tau_i$) of the fixation. The linear bandwidth parameter $k = 20$ was determined visually to obtain a smooth unimodal density function above each cluster of fixations. A larger bandwidth covers a wider area and reduces the relative advantage of pixels in near proximity of the center of the fixation. Longer fixations allow more time and space for exploration of the surroundings of the center of fixation using micro-saccades. The final heat map was the average of all contributions, $T = \sum_i t_i$.
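As an illustration of the kernel accumulation defined above, the following sketch builds a small heat map in plain Python. It is not the authors' code. Two points are assumptions: fixation durations are taken in milliseconds with coordinates in pixels (the text does not state the units of the spread), and the map is accumulated as the sum given in the formula; dividing by the number of fixations would yield an average and would only rescale the map.

```python
import math


def heat_map(fixations, width, height, k=20.0):
    """Accumulate Gaussian kernels of unit amplitude, one per fixation.

    fixations: list of (x_i, y_i, tau_i) with the fixation center in
               pixels and duration tau_i (units assumed, see lead-in).
    Returns a height x width nested list T with T[Y][X] = sum_i t_i(X, Y).
    """
    T = [[0.0] * width for _ in range(height)]
    for x_i, y_i, tau_i in fixations:
        sigma = tau_i / k  # spread proportional to fixation duration
        for Y in range(height):
            for X in range(width):
                d2 = (X - x_i) ** 2 + (Y - y_i) ** 2
                T[Y][X] += math.exp(-0.5 * d2 / sigma ** 2)
    return T
```

With `k = 20`, a fixation of duration 100 gives a spread of 5 pixels, so the kernel equals 1 at the fixation center and `exp(-0.5)` at a distance of 5 pixels.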

We superimposed for each saccade the start and end points on the images (**Figure 1**) and extracted their temperatures as the average of all temperatures located within a radius of 0.5° of visual angle. The radius was chosen in accordance with the measurement error in the eye-tracking equipment.

High temperature on the heat maps indicates areas with long fixation durations and small saccade sizes, i.e., regions which were carefully scrutinized using a local scanning strategy. Low temperature areas, by contrast, indicate regions that were only occasionally visited during global scanning or not visited at all.

As an indicator of target selection for the next fixation we used the difference in temperature between the start and end point of a saccade. The difference can be understood as a contrast in anticipated task-relevant information between locations. We considered a saccade as going in the *positive* direction if the temperature at the end point was higher than at the start point, and in the *negative* direction if it was lower.
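The temperature read-out and the sign convention can be sketched as follows. This is an illustration under assumptions: the radius is given in pixels (the conversion from 0.5° of visual angle is not shown), the fallback to the nearest pixel and the label for exactly equal temperatures are hypothetical choices that the text does not specify.

```python
import math


def local_temperature(T, x, y, radius):
    """Mean of the map values within `radius` pixels of the point (x, y)."""
    r = int(math.ceil(radius))
    values = []
    for Y in range(max(0, int(y) - r), min(len(T), int(y) + r + 1)):
        for X in range(max(0, int(x) - r), min(len(T[0]), int(x) + r + 1)):
            if (X - x) ** 2 + (Y - y) ** 2 <= radius ** 2:
                values.append(T[Y][X])
    if not values:
        # Hypothetical fallback: use the nearest pixel inside the map.
        Y = min(max(int(round(y)), 0), len(T) - 1)
        X = min(max(int(round(x)), 0), len(T[0]) - 1)
        return T[Y][X]
    return sum(values) / len(values)


def saccade_direction(T, start, end, radius):
    """Return (label, absolute temperature difference) for one saccade.

    start and end are (x, y) points in pixels.
    """
    diff = local_temperature(T, end[0], end[1], radius) - \
        local_temperature(T, start[0], start[1], radius)
    if diff > 0:
        label = "positive"   # end point warmer than start point
    elif diff < 0:
        label = "negative"   # end point cooler than start point
    else:
        label = "undefined"  # tie handling is an assumption
    return label, abs(diff)
```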

#### **DIVISION IN SACCADE SIZE BINS**

Since we proposed that the corresponding processes depend on viewing strategies as reflected in saccade size, we divided up the data into three bins according to saccade size: short, medium and long. Three bins were used in order to secure a minimal number of epochs per condition (which was set to 50) for a sufficiently large number of participants. We still had to exclude two participants who did not have enough epochs in one of the conditions. Thus, eye movement and EEG analyses were performed in 17 participants.

#### **EYE FIXATION-RELATED POTENTIAL ANALYSIS**

For the EEG analysis we used Brain Vision Analyzer software (Brain Products GmbH, Gilching, Germany). The EEG signal was filtered with a Butterworth zero-phase filter with a high cut-off frequency of 30 Hz, 12 dB/oct. From the 15 recorded channels, two temporal channels (T3 and T4) were excluded because of artifacts resulting from muscle activity.

We analyzed EEG in the fixation intervals preceding saccade onsets. We used only those EEG epochs that corresponded to fixation intervals without blinks. These epochs are by definition free of artifacts caused by eyelid and eyeball movements (Dimigen et al., 2011). However, if a blink had occurred in the preceding fixation interval or if the fixation was short, the tail of the activity evoked by the blink or the saccade would still have had a chance to contaminate the current fixation interval. We used ICA to remove these effects from the EEG data (Jung et al., 2000) as follows. We selected a 300-s interval from 100 to 400 s of continuous EEG recording, which included a number of blinks and saccades. The data from 13 regular EEG channels in this interval constituted the training dataset for computing the unmixing matrix. Then in each participant we identified the ICA components which picked up eye movement artifacts. The time course of these components mirrored blinks or saccades in the "EyeLink channel" recorded together with EEG channels. These components had typical topography with a maximum at the frontal (F3, Fz, F4) sites: the blink-related component had a symmetrical maximum at the frontal electrodes; the saccade-related component had a characteristic asymmetrical topography over the frontal sites reflecting the directionality of the saccades. Finally, the whole duration of the EEG was reconstructed without these components. As expected, ICA artifact correction primarily affected the activity at the frontal sites. Since the unmixing matrix was computed using an EEG interval encompassing all experimental conditions, it is unlikely that ICA selectively or systematically altered the presaccadic activity in a certain condition.

The markers of saccade onset were computed by EyeLink software and were incorporated into the EEG time series. EEG was segmented into epochs from −200 ms to 50 ms relative to saccade onset. Using a semi-automatic artifact rejection procedure, we excluded epochs if the absolute voltage difference exceeded 50 μV between two neighboring sampling points and if the amplitude exceeded +100 or −100 μV. After the artifact rejection the mean number of epochs per condition was 158 (SD: 36; range 109–220).

For each participant, we set the number of epochs to be equal in all conditions. To that end, we first identified the condition where the number of epochs was minimal. Then for all other conditions we randomly selected a number of epochs equal to this minimum.
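The equalization step can be sketched as random subsampling to the size of the smallest condition. The function name, the dictionary layout, and the optional `seed` argument are hypothetical; sampling without replacement is assumed.

```python
import random


def equalize_epochs(epochs_by_condition, seed=None):
    """Randomly subsample every condition to the smallest epoch count.

    epochs_by_condition: dict mapping a condition label to a list of epochs
    for one participant. Returns a new dict with equal-length lists.
    """
    rng = random.Random(seed)
    n_min = min(len(epochs) for epochs in epochs_by_condition.values())
    return {
        condition: rng.sample(list(epochs), n_min)  # without replacement
        for condition, epochs in epochs_by_condition.items()
    }
```

Equal epoch counts keep the signal-to-noise ratio of the averaged potentials comparable across conditions, at the cost of discarding some data in the larger conditions.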

Based on visual inspection of the grand averaged potentials (**Figure 4A**), we selected the interval −100 to 20 ms before the saccade onset for statistical evaluation of the presaccadic activity. We computed the mean amplitude of this interval. In addition, we evaluated the peak-to-peak amplitude of the saccadic spike potential by measuring the difference between the positive and negative peaks of this potential.

We averaged the epochs for each participant and condition separately. As we discussed in the Introduction, one of our goals was to distinguish between activities over the frontal and parietal brain areas. Therefore we preselected anterior (F3, F4, Fz, C3, C4) and posterior (P3, P4, Pz, O1, O2) groups of electrodes and averaged the potentials across electrodes within each group.

For the baseline correction we selected a 20-ms interval in the beginning of the presaccadic epoch (i.e., −200 to 180 ms before the saccade onset).
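A minimal sketch of the amplitude measure, combining electrode grouping, baseline correction, and the mean over the analysis window, is given below. It reads both intervals as lying before saccade onset (baseline from −200 to −180 ms, analysis window from −100 to −20 ms) and treats the window end points as inclusive; these readings, the function names, and the dictionary layout are assumptions.

```python
ANTERIOR = ["F3", "F4", "Fz", "C3", "C4"]
POSTERIOR = ["P3", "P4", "Pz", "O1", "O2"]

EPOCH_START_MS = -200.0  # epochs run from -200 to 50 ms around saccade onset
SFREQ_HZ = 500.0         # sampling rate reported in the text


def time_to_index(t_ms):
    """Sample index of a time point given relative to saccade onset."""
    return int(round((t_ms - EPOCH_START_MS) * SFREQ_HZ / 1000.0))


def window_mean(signal, t0_ms, t1_ms):
    """Mean of the signal between t0 and t1 (inclusive end points)."""
    segment = signal[time_to_index(t0_ms):time_to_index(t1_ms) + 1]
    return sum(segment) / len(segment)


def group_average(erp, group):
    """Average the potentials across the electrodes of one group.

    erp: dict mapping channel name to a list of samples (averaged epoch).
    """
    n_samples = len(erp[group[0]])
    return [sum(erp[ch][i] for ch in group) / len(group) for i in range(n_samples)]


def presaccadic_amplitude(erp, group):
    """Baseline-corrected mean amplitude in the presaccadic window."""
    signal = group_average(erp, group)
    baseline = window_mean(signal, -200.0, -180.0)
    return window_mean(signal, -100.0, -20.0) - baseline
```

Since averaging and baseline subtraction are linear operations, applying them to the averaged potential gives the same mean amplitude as applying them to single epochs before averaging.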

#### **STATISTICAL ANALYSIS**

We considered three main factors: Saccade size (short, medium, long), Correctness (Correct detection vs. failure) and Saccade direction (positive vs. negative). For the analysis of the eye movement measures, unless otherwise stated, we used univariate repeated-measures ANOVA with these factors. As for the event-related potential (ERP) analysis, the effect of scalp locations on amplitude differences is not an additive effect but a multiplicative one; therefore additive ANOVA models cannot unambiguously evaluate topographic differences (McCarthy and Wood, 1985). For this reason, we treated amplitude in the anterior and posterior electrode groups as two dependent variables in a multivariate design (MANOVA). Whenever MANOVA revealed a significant effect, we proceeded to univariate ANOVA follow-up analyses and *post-hoc* tests, in order to identify the specific dependent variables that contributed to the effect. In the univariate ANOVA we applied the Huynh-Feldt correction (ε) of *p*-values associated with more than two degrees of freedom, in order to compensate for violation of sphericity. For *post-hoc* analyses we used Fisher's LSD (Least Significant Difference) test.

For presaccadic activity, the correlation between anterior and posterior signals was *r*(17) = 0.35. This is in accordance with the MANOVA requirement that the dependent variables should be moderately correlated with each other (i.e., 0.20–0.60; Meyers et al., 2006). The mean peak-to-peak amplitude of the saccadic spike potential, however, highly correlated between the anterior and posterior electrode groups [*r*(17) = 0.97]. We therefore ran two separate ANOVAs on the anterior and posterior saccadic spike potentials.

### **RESULTS**

#### **EYE MOVEMENT RESULTS**

First, we tested for changes in viewing strategy. We divided 20-s memorization intervals into five 4-s time bins. In each bin we computed the saccade duration for correct detection and failure. We found that saccade duration decreased after the first bin and then remained unchanged (**Figure 2A**). A repeated-measures ANOVA with factors of Time Bins (5 levels) and Correctness (correct detection vs. failure) revealed an effect of Time Bins [*F*(4, 64) = 8.9, *p* < 0.001, Huynh-Feldt ε = 0.82] and neither an effect of Correctness nor an interaction. *Post-hoc* tests showed that the effect of Time Bins arose because saccades were longer in the first bin than in all other bins (all *p* < 0.001). This finding can be understood as a shift in emphasis from a global to a local viewing strategy in the course of free viewing, consistent with previous reports (Unema et al., 2005; Pannasch et al., 2008; Graupner et al., 2011).
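The time-bin analysis can be sketched as follows. The assignment of a saccade to a bin by its onset time and the input layout (onset in seconds from display onset, duration in milliseconds) are assumptions made for illustration.

```python
def mean_saccade_duration_per_bin(saccades, bin_s=4.0, n_bins=5):
    """Mean saccade duration in consecutive time bins of a 20-s display.

    saccades: list of (onset_s, duration_ms) pairs.
    Returns a list of n_bins means; None marks a bin without saccades.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for onset_s, duration_ms in saccades:
        b = int(onset_s // bin_s)
        if 0 <= b < n_bins:  # ignore saccades outside the display interval
            sums[b] += duration_ms
            counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```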

The shift in viewing strategy may affect the processes of visual encoding and fixation target selection. Since we determined strategy according to saccade size (as reflected in saccade duration), we divided up the data into three bins accordingly: short, medium and long. **Figure 2B** illustrates the ranges of saccade durations after binning. A repeated-measures ANOVA with factors of Saccade size (short, medium, long), Saccade direction (positive vs. negative), Correctness (correct detection vs. failure) revealed that the saccade durations did not differ between negative and positive saccade directions [*F*(1, 16) = 0.09, *p* = 0.77] and between correct detection and failure [*F*(1, 16) = 1.7, *p* = 0.22]. This suggests that the oculomotor component of saccade preparation was similar for all conditions within a saccade size bin.

**FIGURE 2 | Saccade durations (sizes). (A)** Saccade durations in the course of free viewing during 20-s presentations of the memorization display for subsequent correct detection and failure. The 20-s presentation of the memorization display was divided in five 4-s time bins. Saccade durations decrease after the first bin. **(B)** The ranges of saccade durations after division in 3 saccade size bins: short, medium, and long saccades. Saccade sizes are shown for positive ("pos") and negative ("neg") saccade directions, and for correct detection and failure, in order to demonstrate that their values did not differ between conditions. The data points are the means and the error bars represent standard errors across 17 participants.

For the three saccade sizes, we considered the duration of their *preceding* fixations (**Figure 3A**). Saccade size conditions showed a prominent effect on preceding fixation duration [*F*(2, 32) = 10.7, *p* < 0.001, Huynh-Feldt ε = 1.0] and there was an interaction between Saccade size and Saccade direction [*F*(2, 32) = 37.1, *p* < 0.001, Huynh-Feldt ε = 1.0]. These effects indicate shorter fixation durations for positive than for negative direction for medium (*p* < 0.001) and long (*p* < 0.001) saccades. Correctness did not yield an effect [*F*(1, 16) = 0.4, *p* = 0.52]; no further interactions were found.

In contrast to fixation duration, the *temperature difference* between fixation locations appears to be sensitive to correctness of change detection (**Figure 3B**). The absolute value of the temperature difference was higher in correct detection than in failure [*F*(1, 16) = 5.7, *p* = 0.03].

In our study we used the temperature difference as an indicator of fixation target selection since it represents the contrast of task-relevant information between two fixation locations. The absolute temperature difference was larger for positive than negative direction [*F*(1, 16) = 33.7, *p* < 0.001]. To evaluate how target selection depends on saccade size, we compared the absolute temperature differences for short, medium and long saccades. After a short saccade the gaze is likely to land in an image region with similar temperature, whereas after a long saccade the gaze may land in a region with a different temperature. As expected, we found a prominent increase of the temperature difference with saccade size [*F*(2, 32) = 93, *p* < 0.001, Huynh-Feldt ε = 1.0].

In addition, the processes of target selection are different for the short and long saccades. We observed an interaction between Saccade size and Saccade direction [*F*(2, 32) = 4.98, *p* = 0.03, Huynh-Feldt ε = 0.70], which occurred because of the larger temperature difference in the negative than positive direction for the long (*p* < 0.001) and medium (*p* < 0.001) saccades, but not for short (*p* = 0.57) ones (**Figure 3C**).

#### **EEG RESULTS**

**Figure 4A** shows the scalp topography of the spike potentials averaged over 17 participants relative to saccade onset for three saccade sizes.

#### *Saccadic spike potential*

The saccadic spike potentials emerged as a biphasic wave of the same positive polarity in all recording sites. Typically the positive polarity is observed only in the parietal sites and alternates between positive and negative polarities in the frontal sites depending on saccade direction (Csibra et al., 1997). The observed topographical distribution of the polarity was a consequence of the linked-mastoid reference used: re-referencing to the average reference inverted the polarity of the saccadic potentials over the frontal sites, keeping the positive polarity over the parietal sites, as illustrated in **Figure A1** in the Appendix (for a similar effect see Figure 6B in Plochl et al., 2012). However, even for the mastoid reference used, the peak-to-peak amplitude of the saccadic potential was much larger in the posterior than in the anterior electrode group [7.9 μV (SEM 0.52) and 6.7 μV (SEM 0.59), respectively, *t*(17) = 7.8, *p* < 0.001], consistent with a parietal amplitude maximum of the saccadic potential (Csibra et al., 1997; Keren et al., 2010).

**FIGURE 3 | Eye movement results. (A)** Duration of the fixations preceding the saccades used for dividing trials into saccade size bins. **(B)** Absolute temperature difference on the fixation heat maps, correctness effect. **(C)** Absolute temperature difference, saccade direction effect. "pos" indicates positive and "neg" indicates negative saccade direction. The data points are the means and the error bars represent standard errors across 17 participants. The asterisks designate significant differences between correct detection and failure.

The amplitude of the saccadic potential strongly depends on saccade size: the amplitude linearly increased with saccade size for the anterior [*F*(2, 32) = 72, *p* < 0.001, ε = 0.64] and posterior [*F*(2, 32) = 81, *p* < 0.001, ε = 0.7] electrode groups (**Figure 4B**). There were no other effects or interactions, so the saccadic potential did not differ between conditions in any of the preselected saccade size bins (short, medium or long). This result is consistent with the common finding that the saccadic spike potential reflects only saccade sizes and is not sensitive to cognitive influences (e.g., Keren et al., 2010).

#### *Presaccadic activity*

The presaccadic activity is represented by a slow positive wave with a maximum about 100 ms before saccade onset (with the exception of O1 and O2 sites). The amplitude of the presaccadic activity (−100 to 20 ms before the saccade) averaged across all conditions appeared much larger in the anterior than in the posterior electrode group (**Figure 4C**).

#### *Effects of saccade size*

The MANOVA revealed a significant effect of saccade size on the amplitude of presaccadic activity [*F*(4, 13) = 9.9, *p* < 0.001; Wilks' Λ = 0.25]. The effect was significant for both anterior [univariate *F*(2, 32) = 7.9, *p* = 0.002, ε = 1.0] and posterior [univariate *F*(2, 32) = 7.0, *p* = 0.003, ε = 1.0] electrode groups. **Figure 4C** shows smaller amplitude for short than for medium saccades for both electrode groups (both *post-hoc p* < 0.001). For the long saccades the amplitude was smaller than for the medium saccades, for the anterior (*p* = 0.04) but not for the posterior electrode group (*p* = 0.34).

#### *Correctness effect*

The main effect of correctness of change detection on the presaccadic amplitude was not significant; however, the effect of correctness depended on saccade size and electrode location (**Figure 5A**). The MANOVA revealed an interaction between Correctness and Saccade size [*F*(4, 13) = 3.2, *p* = 0.048; Wilks' Λ = 0.50]. The univariate ANOVAs revealed an interaction tendency [*F*(2, 32) = 2.7, *p* = 0.097, ε = 0.84] only for the anterior group, with larger amplitude for correct detection than for failure, for short (*post-hoc p* = 0.049) but not for medium (*p* = 0.3) or long (*p* = 0.18) saccades (**Figure 5B**).

#### *Saccade direction effect*

The presaccadic amplitude appeared larger for saccades in the positive than in the negative saccade direction (**Figure 6**). The MANOVA revealed an effect of Saccade direction [*F*(2, 15) = 8.3, *p* = 0.004; Wilks' Λ = 0.48]. The effect of Saccade direction was significant for both anterior [univariate *F*(1, 16) = 11.6, *p* = 0.003] and posterior [univariate *F*(1, 16) = 15.3, *p* = 0.001] electrode groups.

**FIGURE 4 |** [...] **(B)** The peak-to-peak amplitude of the saccadic spike potential. The amplitude gradually increases with saccade size. **(C)** Amplitude of the presaccadic activity (the mean in the interval −100 to 20 ms prior to saccade onset). [...] "pos" indicates positive and "neg" indicates negative saccade direction. The data points are the means and the error bars represent standard errors across 17 participants.

The effect of Saccade direction on the presaccadic amplitude depends on saccade size and electrode location (**Figure 6A**). The univariate ANOVA revealed an interaction between Saccade direction and Saccade size [*F*(2, 32) = 4.0, *p* = 0.03, ε = 1.0] for the posterior electrode group only. The amplitude was larger for the positive than for the negative direction for the short (*post-hoc p* = 0.04) and medium (*p* < 0.001) but not for the long saccades (*p* = 0.49) (**Figure 6B**).

### **DISCUSSION**

We investigated the brain processes related to visual encoding and selection of the next fixation target when observers are scrutinizing natural scenes. We raised the question whether these processes could be observed in presaccadic electrical brain activity recorded at the scalp. The presaccadic activity was measured in the fixation intervals before saccade onsets while observers were inspecting the first of two scenes, between which a change had to be detected. An image heat map was computed for each scene based on individual fixation durations and densities. The temperature differences between the start and end points of saccades on the map were taken as a measure of the expected task-relevance of the information concentrated in specific regions of a scene. Visual encoding was indicated by correctness of change detection. Selection of fixation target was evaluated by saccade directions on the heat maps. We found that both visual encoding and fixation target selection are reflected in presaccadic activity. Visual encoding was associated with presaccadic activity over anterior brain areas for short saccades. Target selection was associated with presaccadic activity over posterior areas for short and medium saccades. Together, we may conclude that presaccadic activity specifies the role of attention in scrutinizing natural scenes.

#### **FUNCTIONAL SIGNIFICANCE OF THE PRESACCADIC ACTIVITY**

What are the factors most likely affecting the amplitude of the presaccadic activity?

The presaccadic interval includes both a shift in covert attention to a novel target and motor preparation for a saccade to that target. Since our conditions affected neither saccade size (**Figure 2B**) nor the amplitude of the saccadic spike potential (**Figure 4B**), the oculomotor components of the saccades can be assumed to be equal in strength within each size bin. Therefore the amplitude of the presaccadic activity is likely to reflect a covert shift of attention. An attentional explanation of the presaccadic activity is consistent with previous interpretations of the scalp-recorded presaccadic potentials (Wauschkuhn et al., 1998; Gutteling et al., 2010; Krebs et al., 2012).

Furthermore, the amplitude of the antecedent potential indicates trans-saccadic remapping (Parks and Corballis, 2008). This does not contradict the attentional explanation above since only attended visual features are remapped (reviewed in Mathot and Theeuwes, 2011). In our study the amplitude of the presaccadic activity was larger for saccades in the positive than in the negative direction (**Figure 6B**). We defined the positive saccade direction, according to fixation heat maps, by higher temperature at the end than at the start point of a saccade. The temperature difference was higher for saccades in the negative than in the positive direction (**Figure 3C**), because the *preceding* fixation was shorter for saccades in the positive than in the negative direction (**Figure 3A**). The *following* fixation was, correspondingly, longer for saccades in the positive than in the negative direction. Since fixation duration indicates the amount of attention (Henderson, 2007), the presaccadic activity reflects attention deployed to the *following* fixation location. Thus, the larger amplitude of presaccadic activity for saccades in the positive than in the negative direction may reflect remapping of attended information across a saccade.

It is known that visual information collected during the last 100 ms before a saccade (so called "saccadic dead time") does not influence the destination of the current saccade (Becker, 1991), but influences the following saccade (Caspi et al., 2004). This suggests functional dissociation of two processes occurring in this interval: accumulation of information for later use and remapping of previously collected information across a saccade to maintain visual stability (Mathot and Theeuwes, 2011). We propose that both processes are reflected in presaccadic activity, taking place in parallel.

[...] the posterior group of electrodes. 0 ms is saccade onset. **(B)** Amplitude of the presaccadic activity (the mean in the interval −100 to 20 ms prior to saccade onset). [...] The data points are the means and the error bars represent standard errors across 17 participants.

Factors of overall alertness or enhanced voluntary processing are less likely to affect selectively the presaccadic activity, because systematic amplitude changes were revealed for saccades of various sizes and directions occurring within the same memorization display, i.e., short enough to render unlikely any systematic changes of alertness or volition.

#### **VISUAL ENCODING**

On the fixation heat maps, saccades followed by correct change detection had larger temperature differences (in absolute values) than ones followed by detection failure (**Figure 3B**). Since temperature difference reflects the informational contrast between two fixation locations, we conclude that successful encoding depends on entering scene regions that are expected to be task-relevant and that contain a high concentration of information.

Remarkably, even though the computation of temperature considers fixation durations, these are not sensitive to correctness (**Figure 3A**). Since temperature also considers fixation density (the spatial distribution of the fixation locations), this must be the variable most directly relevant for correct detection. This conclusion is consistent with previous observations: the number of regional fixations rather than their duration matters for memorization of localized information (Loftus, 1972). Consequently, the heat maps differ for correct detection and failure: in correct detection the information is concentrated in widespread clusters, whereas in failure it is more randomly scattered. Thus, the scanning strategy in successful encoding consists of thorough scrutiny of scene regions that are expected to be task-relevant.

An effect of correctness on the amplitude of presaccadic activity was found only over anterior brain areas. This effect depended on saccade size and appeared to be associated with short saccades only (**Figure 5B**). Short saccades mainly occur towards the end of exploring a display (**Figure 2A**), when potential targets have already been localized and the visual strategy changes to scrutinizing the local regions (Unema et al., 2005; Tatler and Vincent, 2008; Graupner et al., 2011). The amplitude of presaccadic activity was larger in correct detection than in failure (**Figure 5A**). Since we attributed the presaccadic activity to attention being deployed to the following fixation location, this finding implies that attention is needed for successful encoding. Specifically, successful encoding may depend on scrutiny of the local regions guided by top-down attention rather than global visual exploration of a scene.<sup>1</sup>

#### **SACCADIC TARGET SELECTION**

The absolute values of the temperature difference between the start and end points of a saccade increase with saccade size (**Figures 3B,C**). Small differences are associated with short saccades because after a small shift in fixation the gaze is likely to land in an image region with similar temperature, whereas after a large shift the gaze may land in a region with a distinct temperature.

The absolute temperature difference was larger for the saccades in the negative than in the positive direction; however, this happened for the medium and long, but not for the short saccades (**Figure 3C**). Since temperatures depend on fixation duration, this is a consequence of the shorter duration of fixations preceding the medium and long saccades in the positive than in the negative directions (**Figure 3A**). Temperature indicates how attractive a certain region is as a fixation target. The effect, therefore, may indicate facilitation of fixation target selection if the next fixation target is in an attractive location.

Selection of a saccade target seems to be guided by attractiveness of the next target location only for the medium and long (>30 ms) saccades (**Figure 3C**). This effect may occur because of the predominant role of bottom-up guidance in long saccades. During initial exploration, visual salience determines the attractiveness of scene regions. The initial exploration is accomplished with long saccades (**Figure 2A**) (Unema et al., 2005; Pannasch et al., 2008; Graupner et al., 2011). The bottom-up character of the initial exploration offers an explanation for why we did not observe the difference in the presaccadic activity for long saccades (**Figure 6B**).

Once potential targets have been localized with long saccades, the visual strategy changes to scrutiny of the local regions (Unema et al., 2005; Graupner et al., 2011). This change in strategy is accompanied by a shift from bottom-up to top-down saccade guidance (Findlay and Walker, 1999). The local scanning strategy involves short saccades (Unema et al., 2005; Tatler and Vincent, 2008; Graupner et al., 2011). For the short saccades the informational contrast between current and next target locations is low and, correspondingly, the temperature difference for the positive and negative saccade direction is about equal (**Figure 3C**). Thus, local scanning is not guided by salient informational content at the next fixation locations. Instead, it may be guided by a top-down mechanism directing the saccades to scene regions which are expected to be task-relevant. This is reflected in the larger presaccadic activity in the positive than in the negative saccade direction over posterior areas (**Figure 6B**).

Overall, our findings support the notion of different selection mechanisms for short and long saccades, in line with previous observations (Tatler et al., 2006; Foulsham and Kingstone, 2012).

The medium saccades may take up an intermediate position and be guided by a combination of bottom-up and top-down processes. This is reflected in the difference between positive and negative saccade directions observed in both eye movement and EEG measures (**Figures 3C**, **6B**).

In sum, scalp-recorded electrical brain activity in the presaccadic interval reflects processes related to trans-saccadic perception. We provide evidence for the sensitivity of the presaccadic activity to encoding of visual information and to selection of a target for the next fixation. In these processes the presaccadic activity reflects systematic tendencies in oculomotor behavior which differ in attentional demand.

### **ACKNOWLEDGMENTS**

We thank Ronald Rensink for providing us with the stimulus set, and Hironori Nakatani and Tatiana Tyukina for valuable technical support. Andrey R. Nikolaev, Chie Nakatani and Cees van Leeuwen were supported by an Odysseus grant from the Flemish Science Organization (*Fonds Wetenschappelijk Onderzoek, FWO*).





<sup>1</sup>Note, however, that the change blindness phenomenon indicates that attention alone is not sufficient for successful memorization (Mack and Rock, 1998; Rensink, 2002).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 January 2013; accepted: 08 June 2013; published online: 27 June 2013.*

*Citation: Nikolaev AR, Jurica P, Nakatani C, Plomp G and van Leeuwen C (2013) Visual encoding and fixation target selection in free viewing: presaccadic brain potentials. Front. Syst. Neurosci. 7:26. doi: 10.3389/fnsys.2013.00026*

*Copyright © 2013 Nikolaev, Jurica, Nakatani, Plomp and van Leeuwen. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

### **APPENDIX**

## Cortical sources of ERP in prosaccade and antisaccade eye movements using realistic source models

### *John E. Richards\**

*Department of Psychology, Institute for Mind and Brain, University of South Carolina, Columbia, SC, USA*

#### *Edited by:*

*Sebastian Pannasch, Technische Universität Dresden, Germany*

#### *Reviewed by:*

*Sven Mueller, University of Ghent, Belgium; Ulrich Ettinger, University of Bonn, Germany; Nikolaos Smyrnis, National and Kapodistrian University of Athens, Greece*

#### *\*Correspondence:*

*John E. Richards, Department of Psychology, Institute for Mind and Brain, University of South Carolina, 1800 Gervais Street, Columbia, SC 29028, USA e-mail: richards-john@sc.edu*

The cortical sources of event-related-potentials (ERP) using realistic source models were examined in a prosaccade and antisaccade procedure. College-age participants were presented with a preparatory interval and a target that indicated the direction of the eye movement that was to be made. In some blocks a cue was given in the peripheral location where the target was to be presented and in other blocks no cue was given. In Experiment 1 the prosaccade and antisaccade trials were presented randomly within a block; in Experiment 2 procedures were compared in which either prosaccade and antisaccade trials were mixed in the same block, or trials were presented in separate blocks with only one type of eye movement. There was a central negative slow wave occurring prior to the target, a slow positive wave over the parietal scalp prior to the saccade, and a parietal spike potential immediately prior to saccade onset. Cortical source analysis of these ERP components showed a common set of sources in the ventral anterior cingulate and orbital frontal gyrus for the presaccadic positive slow wave and the spike potential. In Experiment 2 the same cued- and non-cued blocks were used, but prosaccade and antisaccade trials were presented in separate blocks. This resulted in a smaller difference in reaction time between prosaccade and antisaccade trials. Unlike the first experiment, the central negative slow wave was larger on antisaccade than on prosaccade trials, and this effect on the ERP component had its cortical source primarily in the parietal and mid-central cortical areas contralateral to the direction of the eye movement. These results suggest that blocked prosaccade and antisaccade trials result in preparatory or set effects that decrease reaction time, eliminate some cueing effects, and are based on contralateral parietal-central brain areas.

**Keywords: prosaccades, antisaccades, eye movements, cortical source analysis, ERP**

#### **INTRODUCTION**

The prosaccade and antisaccade procedure has been useful in the study of the brain control of eye movements. These eye movements are studied with the presentation of a target in one of two peripheral locations. An eye movement is made either to the target ("prosaccade") or away from the target to the opposite location ("antisaccade"). This procedure has been used in a wide variety of studies to examine visual attention and eye movement control (Everling and Fischer, 1998; Munoz and Everling, 2004) and may be useful for examining neuropsychological status (e.g., schizophrenia, McDowell and Clementz, 2001; ADHD, Klein et al., 2003). Several studies with non-human animals have shown that areas of the frontal cortex, such as the frontal eye fields (FEF), supplementary eye fields (SEF), dorsolateral prefrontal cortex (DPC), and prefrontal cortex are involved in the generation of eye movements and differ for prosaccade and antisaccade eye movements. Neuroimaging studies using PET, block-design fMRI, and event-related fMRI have examined these eye movements in human participants and have found activity in several of these brain areas (see review by McDowell et al., 2008). The human neuroimaging studies lack the temporal resolution available in the non-human studies and therefore may not be able to examine the neural processes that are time-locked to these eye movements. Alternatively, studies have recorded scalp event-related-potentials (ERP) and show several types of presaccadic ERP activity related to prosaccade and antisaccade eye movements (Brickett et al., 1984; Evdokimidis et al., 1996; Everling et al., 1997, 1998). Two studies used cortical source analysis to determine the brain areas responsible for the generation of the ERP linked to these eye movements (Richards, 2003; McDowell et al., 2005). This paper describes a study of college-age participants' ERP activity for prosaccade and antisaccade eye movements. The cortical sources of the ERP activity were studied with realistic source models based on individual MRIs, and the effect of mixed-choice trials and blocked trials on ERP components was studied.

Many neuroimaging studies of prosaccade and antisaccade eye movements in humans have used PET or fMRI and blocked designs. The first studies in this area used PET imaging and a blocked design (Fox et al., 1985; O'Driscoll et al., 1995; Sweeney et al., 1996; Doricchi et al., 1997). Participants were given blocks of prosaccade trials, blocks of antisaccade trials, and perhaps trials with steady fixation, and subtraction techniques were used to identify brain areas more active in eye movements than fixation, or differential activity in prosaccade and antisaccade blocks. Similar studies have been done using fMRI (Connolly et al., 2000; Kimmig et al., 2001; Matsuda et al., 2006; Domagalik et al., 2012). Although the results from these neuroimaging studies are not entirely consistent, several areas of the frontal cortex (FEF, SEF, DPC, ventromedial or ventrolateral PFC) are more active in these eye movements than during fixation, or more active in antisaccade than prosaccade testing blocks. Other brain areas show such activation, such as the superior parietal cortex, intraparietal sulcus, and extrastriate occipital cortex (see McDowell et al., 2008).

Some studies used event-related fMRI in order to do mixed-choice trial presentations and to link the brain areas to specific components for these eye movements (Cornelissen et al., 2002; Curtis and D'Esposito, 2003, 2006; Desouza et al., 2003; Ford et al., 2005; Brown et al., 2007; Dyckman et al., 2007; Ettinger et al., 2008). It is possible with event-related fMRI to use a design in which prosaccade and antisaccade trials are randomly intermixed (mixed-choice trials design). In mixed-choice fMRI experiments, BOLD activation in the fMRI may be the same size for prosaccade and antisaccade trials (Cornelissen et al., 2002; Dyckman et al., 2007). For example, Dyckman et al. (2007) used event-related fMRI and had an antisaccade trial block, a prosaccade trial block, and a mixed antisaccade-trial/prosaccade trial block. They found several brain areas that were more active on the antisaccade trials in the single-type block than the prosaccade trials in the single-type block, but which were not more active on antisaccade trials than prosaccade trials in the mixed-choice trial block. This suggests that some of the additional activation during antisaccade blocks is due to preparatory set psychological processes. Event-related fMRI studies also have used designs to separate brain areas involved in preparatory eye movement planning and eye movement generation (Ford et al., 2005; Brown et al., 2007; Ettinger et al., 2008). For example, Ford et al. (2005) distinguished the early part of a preparatory period (first 6 of 10 s), the latter part of a preparatory period (last 4 of 10 s), and events following the target and saccade (events: 0.5 s; fMRI 5 s; see Figure 1 in Ford et al., 2005). They found that the FEF, SEF, areas in the prefrontal cortex (DPC; anterior cingulate cortex) and posterior cortex (intraparietal sulcus, parietal-occipital sulcus) have preparatory activity that distinguishes prosaccades and antisaccades. Brown et al. (2007) used a mixed-choice trials event-related fMRI design with trials on which a signal indicated a prosaccade or antisaccade was to be made; on some trials the signal was followed by a response, whereas on other trials the response was not made. In this study they found that the no-response trials showed more frontal and parietal brain activity for antisaccade cues than for prosaccade cues even when a response was not made. They suggested that the preparatory set effects occurring in response to the movement type stimulus elicit the larger brain responses for antisaccade trials. This type of event-related design might distinguish between preparatory events and the events surrounding the eye movements.

The study of the brain control of antisaccade and prosaccade eye movements has been aided by using ERP. The ERP activity provides better temporal resolution than PET or fMRI and may be especially important in distinguishing the brain areas controlling eye movements near the generation of the saccades. There is a slow negative ERP component before eye movements that begins up to 1 s prior to saccade onset and has its maximum value over the vertex. In blocked designs (e.g., Everling et al., 1997, 1998) this ERP component has a larger amplitude and more widespread scalp distribution for antisaccade eye movement blocks than for prosaccade eye movement blocks. In mixed-choice trial designs, if the warning stimulus in the preparatory interval is informative about the upcoming saccade and there is a response stimulus indicating the eye movement, this potential may be larger on antisaccade than on prosaccade trials (Klein et al., 2000; Richards, 2003; but cf. Mueller et al., 2009). But if the warning stimulus is uninformative with respect to the eye movement type, and the response stimulus is a target that indicates the eye movement type, this ERP component still occurs but there is no difference between the amplitude of this component on prosaccade and antisaccade trials (Evdokimidis et al., 1996; Richards, 2003). The close link of this component to the preparatory period and its time course (500–1000 ms before saccade onset) suggest it represents the preparatory activity in mixed-choice trial designs, or response set in blocked designs, similar to the preparatory BOLD activity occurring in event-related fMRI studies. There is a slow positive ERP component that occurs about 30–300 ms prior to saccade onset over central and parietal areas (Everling et al., 1997; Richards, 2003). This positive component is more closely linked to the eye movement itself. There is a small positive ERP component occurring about 70 ms prior to the eye movement over frontal pole electrodes contralateral to the eye movement that is larger on antisaccade than on prosaccade eye movements (Richards, 2003). These latter components represent control processes closely tied to eye movement execution. The time course of these components implies that they represent brain activity that cannot be studied in detail with PET or fMRI-BOLD methods, even in event-related fMRI designs.

Two studies used cortical source analysis of the ERP components found in antisaccade and prosaccade eye movements (Richards, 2003; McDowell et al., 2005). Cortical source analysis (Scherg, 1990; Scherg and Picton, 1991; Scherg, 1992; Huizenga and Molenaar, 1994) is a technique for estimating the location and amplitude of cortical areas that generate the EEG. McDowell et al. (2005) used a block design and measured EEG and MEG activity in response to the onset of the cue to move the eyes and activity preceding the eye movement. Activity in response to the imperative stimulus occurred primarily in posterior areas (cuneus, middle occipital gyrus). The activity preceding saccade onset occurred in FEF, SEF, and DPC and was larger on antisaccade than prosaccade trials. They were able to detail activity with ms resolution through the period immediately preceding saccade onset. Richards (2003) used a mixed-choice trial design with experiment events that distinguished preparatory activity, presaccadic activity, and activity in response to the target onset. The slow negative ERP component associated with preparatory target activity had its cortical sources in Brodmann areas 6, 9, and 11 (near FEF, SEF, and DPC). The activity associated with this region did not differ on prosaccade and antisaccade trials. It was concluded that this area is related most closely to target preparatory activity and that the blocked-trials design may be necessary to show a prosaccade-antisaccade difference in this activity in the ERP (cf., Richards, 2003 and Everling et al., 1997; Evdokimidis et al., 1996 or McDowell et al., 2005). Richards (2003) found a small positive ERP component about 70 ms prior to eye movement, contralateral to the eye movement, and larger on antisaccade than prosaccade trials. This ERP component had cortical sources located in Brodmann areas 10 (frontal pole), 11 (orbital-frontal gyrus), and 8. The close link of this ERP component with the eye movement suggests brain areas closer to the frontal pole are closely tied to eye movement execution. Richards' findings suggest that the brain areas associated with eye movement preparatory activity (e.g., parietal, FEF, SEF, DPC) can be separated from brain areas associated with eye movement execution (e.g., frontal pole, orbital-frontal gyrus) with source analysis of ERP.

There were two aims of the current study. The first aim was to examine the brain areas involved in antisaccade and prosaccade eye movements using cortical source analysis of ERP with realistic models based on individual participant MRIs. The high temporal resolution of the ERP might allow the distinction between brain preparatory activities for eye movement in response to stimulus demand and brain activities related to saccade execution that occur immediately prior to saccades. The procedures for eliciting prosaccade and antisaccade eye movements may consist of preparation for the target occurrence, evaluation of the target, and saccade execution. These activities could be distinguished in the time domain of ERPs (e.g., < 1 s preceding saccades) but not in fMRI. The current study used high-density EEG recording (128 channels) in a targeted procedure with a mixed-choice trials design in Experiment 1 (Evdokimidis et al., 1996; Klein et al., 2000; Richards, 2003). College-age participants were tested in a targeted procedure in which a cue signals a 2 s preparatory interval followed by a target that indicates the direction and type of eye movement and is the imperative signal for the eye movement. The cue during the preparatory interval acts as a warning stimulus and may induce preparatory brain activity, which would be expected to be larger on antisaccade than on prosaccade trials. The cortical source analysis used structural MRIs from individual participants to restrict the source solution to the gray matter of each participant. This defined specific anatomical areas tailored to the individual's anatomical space rather than a generic brain or normalized Talairach space (Ha et al., 2003).

The second aim of the current study was to examine the effect of blocked and mixed-choice trial presentations on the ERP responses during antisaccade and prosaccade eye movements. Experiment 2 consisted of a design in which antisaccade trials and prosaccade trials were presented in separate blocks, or presented in the mixed-choice design of Experiment 1. One conclusion from fMRI studies comparing block- and mixed-choice designs is that blocked designs may result in preparatory set effects and lead to larger and more widespread activation of brain areas in antisaccade trials in those brain areas involved in eye movement preparation (Cornelissen et al., 2002; Ford et al., 2005; Brown et al., 2007; Dyckman et al., 2007). The different areas showing activation in the cortical source analysis studies of McDowell et al. (2005) and Richards (2003) may be due to the use of the block design in the former and the mixed-choice design in the latter. The comparison of the mixed-choice trials and blocked-trials may help distinguish the effects often studied in ERP tasks and the block-effects found in fMRI studies. Experiment 2 also used high-density EEG recording and cortical source analysis of the ERP using realistic source models based on individual participant MRIs.

### **EXPERIMENT 1**

### **METHOD**

#### *Participants*

The participants were nineteen adults (9 F). The participants ranged in age from 20 to 41 at time of testing (mean = 25.6, *SD* = 5.35) and consisted of undergraduate and graduate students. All participants were of normal intelligence and had no medical problems. The research was approved by the Institutional Review Board for the Use of Human Subjects and informed consent to participate in the study was obtained from each participant.

### *Apparatus and stimuli*

Each participant sat in a comfortable chair approximately 75 cm from a 29-inch (56 × 42 cm) color video computer monitor (NEC Multisync XM29) displaying at 1280 horizontal and 1024 vertical pixels. The screen had a 2.6° square outline at each of three areas located in the center or 10° to the right or left of center which remained on at all times (**Figure 1**). The pretarget period was indicated by a small solid square in the center square outline. At target onset, the small solid square was removed, and a solid triangle, a checkerboard pattern, or a four-point star replaced one of the peripheral squares. The peripheral spatial cue consisted of a solid blinking square in the right or left target location.

#### *Procedure*

The participant sat in the chair in the viewing area, facing the television monitor. The participant was informed that this was a study of the brain control of eye movements and was given instructions and practice in the procedure. **Figure 1** shows the flow of each trial. The pretarget center square was presented for 2 s, followed by the presentation of the target for 2.5 s, followed by an interstimulus interval varying randomly from 1 to 3 s. The participants were instructed to make an eye movement toward the checkerboard target when it appeared (prosaccade), away from the triangle target to the opposite outline square (antisaccade), or to keep the eyes fixated in the center location when the four-point star appeared (catch trial). The targets were presented randomly and equally often on the left or right peripheral squares. The antisaccade, prosaccade, and catch trials were presented continuously, in random order for 5-trial blocks (2 antisaccade, 2 prosaccade, 1 catch trial). **Figure 1** (middle panel) shows the continuous presentation sequence.
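The trial structure described above can be sketched as a small sequence generator. This is only an illustration under stated assumptions: the function and field names are hypothetical, the interstimulus interval is assumed to be drawn uniformly from 1 to 3 s, and the target side is drawn independently per trial, which makes left and right equally likely but does not enforce exactly equal counts as the actual procedure may have done.

```python
import random

def make_block(rng):
    """Return one 5-trial block: 2 antisaccade, 2 prosaccade, 1 catch trial.

    Hypothetical sketch; timing values follow the text (2 s pretarget,
    2.5 s target, 1 to 3 s interstimulus interval).
    """
    types = ["antisaccade", "antisaccade", "prosaccade", "prosaccade", "catch"]
    rng.shuffle(types)  # random order within the block
    block = []
    for trial_type in types:
        block.append({
            "type": trial_type,
            "side": rng.choice(["left", "right"]),  # assumed independent draw
            "pretarget_s": 2.0,
            "target_s": 2.5,
            "isi_s": rng.uniform(1.0, 3.0),         # assumed uniform distribution
        })
    return block

def make_session(n_blocks, seed=0):
    """Concatenate blocks into one continuous presentation sequence."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_blocks):
        trials.extend(make_block(rng))
    return trials
```

Calling `make_session(4)` would yield 20 trials in which every consecutive group of five contains the 2:2:1 mix of trial types.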

There were two cueing procedures used in the study. The "uncued" procedure consisted of the presentation of the pretarget and target stimuli without a cue (**Figure 1**, top two diagrams). The "cued" procedure consisted of the presentation of the blinking square for the first 500 ms of the pretarget period in the peripheral position where the target would occur (**Figure 1**, bottom two diagrams). Ten of the participants received these two conditions in alternating 5-min blocks, with as many presentations as possible within the blocks. The other 9 participants had multiple sessions, with the other sessions consisting of one of these two cueing conditions and some other conditions (see Richards, 2003 for conditions) but which were not analyzed in this study. There were 33 sessions used in the study.

#### *Recording of EEG and segmenting of EEG for ERP*

The EEG was recorded with a 128-channel EEG system (EGI, Inc.; Tucker, 1993; Tucker et al., 1994), referenced to vertex during recording and re-referenced algebraically to an average reference, recorded with 20 K amplification, at a sampling rate of 250 Hz, and with impedances below 100 kΩ.<sup>1</sup> The segments for the EEG were extracted for 1000 ms preceding target onset through the target onset until the saccade toward the target, and for 100 ms after saccade onset for target trials. The saccade made in response to the target was identified in the electrooculogram (EOG) recording (Matsuoka and Harato, 1983; Matsuoka and Ueda, 1986). Trials with incorrect eye movements or blinks were excluded from the analysis. For the ERP analysis the electrodes were grouped into sets of electrodes from the 128-channel GSN Sensornet that were close to the 10–20 locations into "virtual 10–20" electrodes (**Table 1**; see Supplementary Material). The ERP displays are based on these combined electrodes, and a multivariate approach to repeated measures was used for analysis by analyzing the groups of electrodes as multiple dependent variables and the experimental factors with a general linear models approach. The grouping of the electrodes and the multivariate analysis controlled for inflated error rates due to repeated tests and heterogeneity in the covariance matrix of the electrode effects.
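The re-referencing and electrode grouping described above can be sketched with NumPy. This is a minimal illustration, not the software used in the study: the function names are hypothetical, the channel indices in any grouping are placeholders (the actual groupings are those of Table 1), and the input array is assumed to already contain every channel that should enter the average.

```python
import numpy as np

def to_average_reference(eeg):
    """Re-reference algebraically to the average reference.

    eeg: array of shape (n_channels, n_samples). At each sample the mean
    across channels is subtracted from every channel.
    """
    eeg = np.asarray(eeg, dtype=float)
    return eeg - eeg.mean(axis=0, keepdims=True)

def virtual_electrodes(eeg, groups):
    """Average channels into 'virtual 10-20' electrodes.

    groups: dict mapping a virtual electrode name to a list of channel
    indices (hypothetical indices; see Table 1 for the real grouping).
    """
    eeg = np.asarray(eeg, dtype=float)
    return {name: eeg[list(idx)].mean(axis=0) for name, idx in groups.items()}

def ms_to_samples(ms, fs_hz=250.0):
    """Convert a duration in ms to samples at the 250 Hz sampling rate."""
    return int(round(ms * fs_hz / 1000.0))
```

With the 250 Hz sampling rate, the 1000 ms pretarget window corresponds to `ms_to_samples(1000)` samples and the 100 ms post-saccade window to `ms_to_samples(100)` samples.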

#### *Anatomical MRI, head segmenting, Brodmann locations*

A structural (anatomical) MRI was done for each participant in the study. Three of the MRIs were 3D T1-weighted images done on a 1.5T GE MRI, with 0.859 mm slices and 256 (axial) × 184 (sagittal) × 256 (coronal) resolution (Palmetto Imaging, Columbia, SC). The rest of the MRIs were 3D T1-weighted images done on a 3.0T Philips Intera MRI, rapid FLASH acquisition, 15.0° flip angle, *TE* = 5.7 ms, *TR* = 9.5 ms per FLASH line, effective *T*1 = 800 ms, 1.0 mm slices and 256 × 159 × 256 resolution (Center for Advanced Imaging Research, Medical University of South Carolina, Charleston, SC).

#### **Table 1 | Virtual 10–20 electrodes for GSN.**

The structural MRI of each participant was used to construct realistic head models for the cortical source analysis, so that the source analysis was based on a realistic model of that participant's head. This included four steps (see Appendix). First, an average electrode placement map was generated for the participant. This was done by identifying fiducial electrode locations on the skull in the MRI volume, registering the fiducial locations on the skull to the same locations in the average electrode placement map, and transforming the electrode placement map to fit into the AC-coordinate system for that participant (Richards et al., unpublished). Second, the materials in the head were segmented, including scalp, skull, CSF, white matter, gray matter, nasal cavity, and eyes (Richards, 2005; Richards, unpublished). The segmenting resulted in an MRI volume with each voxel representing a specific material. Third, three-dimensional tetrahedral wireframes were computed that contained the location of each corner of the tetrahedron and the type of material making up the tetrahedron, using the MR Viewer module of the EMSE computer program (Source Signal, Inc.). **Figure 2** (top row) shows the segmented wireframe from an anatomical MRI from one participant. Fourth, individual participant MRIs had anatomical areas defined by an atlas for that individual. The atlas either came from an average MRI template of 20–24 year old adults (Sanchez et al., 2012) that was registered/transformed to the individual participant's head space, or from atlases computed on the individual participants (Phillips et al., unpublished). The atlases were used to define several anatomical areas by identifying common designations from each of the atlases for the ROIs for that participant. There were nine ROIs chosen for which bilateral activity was represented by separate left and right regions, and six ROIs chosen along the midline for which bilateral activity was combined (**Table 2** for midline regions; **Table 3** for bilateral regions).

**Figure 2** (bottom two rows) shows the ROIs for one individual for the frontal pole (BA 10), orbital-frontal gyrus (BA 11), and the cingulate cortex (areas 23, 24, 32, and 25). The cingulate cortex was further divided into three regions: the area posterior to the anterior commissure (posterior cingulate cortex), the dorsal area anterior and superior to the anterior commissure (dorsal anterior cingulate cortex), and the ventral portion of the anterior cingulate (ventral anterior cingulate cortex). **Figure 3** shows the ROI source regions on a 3-D rendered brain.

<sup>1</sup>The choice of 100 kΩ as the maximum impedance value was based on the high input impedance of the EGI amplifiers. These amplifiers have an input impedance of about 200 MΩ compared with traditional EEG amplifier impedances of about 10 MΩ. Given the recommendation of interelectrode impedances being at least 1% of amplifier input impedance (e.g., 10 kΩ for 10 MΩ amplifier; Picton et al., 2000), 100 kΩ is appropriate for this amplifier. Ferree et al. (2001) estimate that for this amplifier system a 50 kΩ preparation would lead to a maximum 0.025% signal loss, and therefore the current levels should lead to no more than a 0.050% signal loss. They found no discernible signal loss with electrode preparations at about 40 kΩ.
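The signal-loss figures quoted in footnote 1 can be reproduced with a simple voltage-divider calculation. The sketch below assumes that the electrode impedance sits in series with the amplifier input impedance; this model and the function name `signal_loss_percent` are illustrative assumptions, not a description of how Ferree et al. (2001) derived their estimate.

```python
def signal_loss_percent(electrode_ohms: float, amplifier_ohms: float) -> float:
    """Percent of the source voltage lost across the electrode impedance.

    Assumes a simple voltage divider: the amplifier records the fraction
    Z_in / (Z_in + Z_e) of the source voltage, so the fraction lost is
    Z_e / (Z_in + Z_e).
    """
    return 100.0 * electrode_ohms / (electrode_ohms + amplifier_ohms)


# About 200 MΩ, the amplifier input impedance quoted in footnote 1.
AMPLIFIER_OHMS = 200e6

for kohm in (40, 50, 100):
    loss = signal_loss_percent(kohm * 1e3, AMPLIFIER_OHMS)
    print(f"{kohm:>3} kΩ electrode: {loss:.3f}% signal loss")
```

Under this assumption, a 50 kΩ preparation loses about 0.025% of the signal and a 100 kΩ preparation about 0.050%, in line with the values given in the footnote.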

#### *Realistic cortical source analysis*

Cortical sources were estimated as the current source density of cortical source locations with current density reconstruction (CDR; Darvas et al., 2001) using sLORETA (Pascual-Marqui et al., 1994; Pascual-Marqui, 2002) as the constraint for the CDR. The electrode locations, source locations, and head model were used with EMSE's Data Analysis (Source Signal, Inc.) to estimate the forward model, inverse model, and current density reconstruction. The realistic cortical source models used a "finite-element method" (FEM) mapping of the electrical conductivity of the head to calculate the forward model. The FEM forward model was calculated offline with the Data Analysis module of the EMSE computer program (Source Signal, Inc.). The forward model and the ERP from the pretarget period were used to estimate a lead-field matrix representing the inverse model using the sLORETA restriction algorithm. This required the estimation of a lead-field matrix based on the realistic locations of the electrodes on the scalp, the source locations defined by the segmented gray matter and the location of the eyes, and the FEM model for the individual participant.

#### **Table 2 | Regions-of-Interest for the individual participant atlases for central (non-lateralized) areas.**

#### **Table 3 | Regions-of-Interest for the individual participant atlases for lateralized areas (separate left, right ROIs).**
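The sLORETA constraint used for the current density reconstruction can be illustrated with a small numerical sketch. It is not the EMSE implementation: it assumes fixed-orientation sources, omits the average-reference centering of the published method, and uses a random toy lead field and a hypothetical regularization value `alpha`. It shows only the core idea, a regularized minimum-norm estimate standardized by the diagonal of the resolution matrix.

```python
import numpy as np


def sloreta(leadfield: np.ndarray, scalp: np.ndarray, alpha: float = 1e-2) -> np.ndarray:
    """Standardized current density power for fixed-orientation sources.

    leadfield: (n_electrodes, n_sources) forward model.
    scalp:     (n_electrodes,) scalp potentials at one time sample.
    alpha:     Tikhonov regularization (hypothetical default).
    """
    n_electrodes = leadfield.shape[0]
    # Regularized sensor-space Gram matrix: L L^T + alpha * I.
    gram = leadfield @ leadfield.T + alpha * np.eye(n_electrodes)
    # Minimum-norm inverse operator T = L^T (L L^T + alpha * I)^-1.
    inverse = leadfield.T @ np.linalg.inv(gram)
    current = inverse @ scalp
    # Diagonal of the resolution matrix T L, used to standardize each source.
    variance = np.einsum("ij,ji->i", inverse, leadfield)
    return current**2 / variance


rng = np.random.default_rng(0)
L = rng.standard_normal((16, 40))   # toy lead field, not a real head model
true_source = 7
phi = 2.5 * L[:, true_source]       # noiseless potential from a single source
power = sloreta(L, phi)
print("estimated source:", int(np.argmax(power)), "true source:", true_source)
```

For a single noiseless source, the standardized estimate peaks at the true source location, which is the property that motivates the standardization step.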

*Phillips et al., unpublished, has list of all atlases and segmented areas.*

temporal lobes which were not included in the other ROIs

The pretarget or presaccadic ERP was used in successive 4-ms segments averaged over the appropriate experimental factors and conditions, and the entire ERP segment was used to estimate the CDR for each ERP slice. This resulted in an MRI volume representing the source volumes at each sampled ERP time. The MRI volumes contain the CDR for each voxel in the source locations. The CDR values were summed over each voxel of the ROI and divided by the total volume of the ROI. This results in an average current per mm<sup>3</sup> value for each ROI. The ROIs for the analysis were anatomical areas determined on theoretical grounds or by reference to past cortical source analysis studies (Richards, 2003; McDowell et al., 2005) and PET or MRI neuroimaging studies (see Supplementary Material).
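The ROI summary value (CDR summed over the voxels of the ROI and divided by the ROI volume) can be sketched as follows. The flattened-list data layout, the function name `roi_mean_density`, and the default voxel volume of 1 mm<sup>3</sup> are assumptions made for illustration only.

```python
def roi_mean_density(cdr, roi_mask, voxel_volume_mm3=1.0):
    """Sum the CDR over the ROI voxels and divide by the ROI volume.

    cdr:              CDR values, one per voxel (flattened volume).
    roi_mask:         booleans of the same length, True inside the ROI.
    voxel_volume_mm3: volume of one voxel (assumed 1 mm^3 by default).
    """
    total = 0.0
    n_voxels = 0
    for value, in_roi in zip(cdr, roi_mask):
        if in_roi:
            total += value
            n_voxels += 1
    if n_voxels == 0:
        raise ValueError("ROI contains no voxels")
    return total / (n_voxels * voxel_volume_mm3)


# Toy volume of four voxels, three of which belong to the ROI.
cdr_values = [1.0, 2.0, 5.0, 3.0]
mask = [True, True, False, True]
print(roi_mean_density(cdr_values, mask))
```

Applying this at every sampled ERP time yields one time course of mean current density per ROI.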

#### **RESULTS**

#### *Saccade error and latency*

The onset of the saccade from the center to the targeted square was analyzed. There were 8082 eye movements in the experiment, distributed approximately equally for antisaccades and prosaccades (4029 and 4053 eye movements, respectively) and across the experimental conditions (from 1952 to 2101 eye movements for each block-type/eye movement type combination). The error rate for the uncued and cued trials was approximately equal (3.24 and 3.29%, respectively). There were slightly more errors on the antisaccade than on the prosaccade trials (4.04 and 2.50%, respectively), but errors on the two trial types (uncued, cued) were similar for prosaccade and antisaccade trials. The latency of the saccade onset from the target onset was analyzed by a repeated measures ANOVA<sup>2</sup> with trial type (cued, uncued) and eye movement type (prosaccade, antisaccade) as factors. There were main effects for trial type, *F*(1, 29) = 28.66, *p* < 0.001, movement type, *F*(1, 29) = 86.09, *p* < 0.001, and an interaction between them, *F*(1, 29) = 12.46, *p* < 0.001. As expected, saccades were faster on cued than uncued trials (*M* = 435.6, *N* = 4029, *SE* = 2.53; *M* = 473.0, *N* = 4053, *SE* = 2.38) and for prosaccade eye movements than antisaccade eye movements (*M* = 422.8, *N* = 4162, *SE* = 2.33; *M* = 487.9, *N* = 3920, *SE* = 2.51). The interaction between trial type and eye movement type occurred because the cue facilitated the reaction time of the prosaccade eye movements by 29 ms whereas it facilitated the antisaccade eye movements by 54 ms.

<sup>2</sup>The ANOVAs for the analyses were done with a general linear models approach using a non-orthogonal design because of the unequal number of eye movements across factors, and because of the different numbers of eye movements across subjects (see Searle, 1971, 1987; Hocking, 1985). In all analyses, the Scheffé method was used to control for inflation of the testwise error rate for post-hoc comparisons. The error mean square for each post-hoc comparison was obtained from the error term for the omnibus interaction for that post-hoc evaluation. The significance level of the post-hoc tests was *p* < 0.05 for all tests, and these individual probabilities are not reported in the text.

#### *Grand average ERP*

There were three ERP components that were examined as a function of the experimental variables. First, grand average ERP and topographical maps were constructed to display the target-locked ERP changes occurring in this task (e.g., Richards, 2003). **Figure 4** (top panel) shows the pretarget ERP from 1 s before target onset through 200 ms after target onset for several frontal-central electrodes (baseline is 1.1–1.0 s before target onset). The pretarget ERP averages showed a positive slow component with maximal amplitude in the prefrontal scalp leads (e.g., FP1, FPz, FP2) that tapered off in the frontal scalp leads (Fz). There was a negative slow ERP component primarily in the central and parietal leads. The negative slow wave that occurred over the central areas of the scalp was analyzed with a multivariate approach, testing the FrontalZ, CentralZ, ParietalZ, and OccipitalZ virtual electrode groups. The extent of the pretarget slow wave component was quantified by computing the mean ERP level in the last 50 ms of the pretarget interval, which ends at target onset, minus the mean ERP level in the first 50 ms of the pretarget interval. This difference was analyzed with a Cue Type (2: uncued, cued) × Movement Type (3: prosaccade, antisaccade, catch) MANOVA. The only significant effect was a Cue Type main effect on the CentralZ virtual electrodes, Wilks' Λ = 0.5676, *F*(6, 23) = 2.92, *p* = 0.0288. The negative slow wave was larger for the uncued trials than for the cued trials (*M*s of CentralZ virtual electrode group = −5.65 and −5.00 µV, respectively). Since the participant did not know the type of eye movement in advance of the target, this indicates that the negative slow wave was attenuated as a result of the spatial cue.
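The slow wave measure (mean of the last 50 ms of the pretarget interval minus mean of the first 50 ms) can be sketched in a few lines. The function name `slow_wave_amplitude`, the 500 Hz sampling rate, and the linear-drift toy signal are hypothetical choices for illustration and are not taken from the recording setup.

```python
def slow_wave_amplitude(erp_uv, sample_rate_hz, window_ms=50.0):
    """Mean of the last window minus mean of the first window, in µV.

    erp_uv:         ERP samples covering the pretarget interval only.
    sample_rate_hz: sampling rate of the ERP (an assumption of the caller).
    window_ms:      window length at each end of the interval.
    """
    n = int(window_ms * sample_rate_hz / 1000.0)
    if n < 1 or len(erp_uv) < 2 * n:
        raise ValueError("interval too short for the requested window")
    first_mean = sum(erp_uv[:n]) / n
    last_mean = sum(erp_uv[-n:]) / n
    return last_mean - first_mean


# Toy pretarget interval: 1 s at 500 Hz drifting linearly from 0 to -6 µV.
SAMPLE_RATE_HZ = 500
toy_erp = [-6.0 * i / 499 for i in range(500)]
print(round(slow_wave_amplitude(toy_erp, SAMPLE_RATE_HZ), 3))
```

A more negative value indicates a larger negative slow wave, so the uncued and cued conditions can be compared directly on this single number per electrode group.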

**Figure 4** shows the presaccade ERP for several parietal electrodes (middle panel). This shows a slow positive component beginning about 150 ms prior to saccade onset (difference from −220 to −200 ms presaccade). The positive presaccadic potential shown in **Figure 4** occurring immediately before saccade onset appears to be the "spike potential."

The positive slow wave occurring primarily over the parietal areas was examined by computing the mean value in the presaccadic interval from about −50 to −20 ms preceding the saccade (**Figure 4**). This was analyzed for the FrontalZ, CentralZ, ParietalZ, and OccipitalZ electrode groups with a Cue Type (2) × Movement Type (2: prosaccade, antisaccade) MANOVA for the trials on which a correct eye movement occurred. The only significant effect was a Movement Type main effect on the ParietalZ virtual electrodes, Wilks' Λ = 0.6131, *F*(5, 25) = 3.15, *p* = 0.0242. The parietal slow wave was larger for the antisaccade trials than for the prosaccade trials (*M*s = 2.33 and 1.89 µV, respectively). This slow wave was larger over the parietal leads contralateral to the eye movement. The sides of the ERP data were switched so that the side of the eye movement was toward the right on each trial, and the contralateral parietal virtual electrodes (i.e., Parietal3) and ipsilateral electrodes (i.e., Parietal4) were analyzed with a Cue Type × Movement Type MANOVA. The ERP for the contralateral parietal electrode group was significantly affected by Movement Type, Wilks' Λ = 0.4935, *F*(4, 26) = 6.67, *p* = 0.0008, with the parietal slow wave being larger before antisaccade eye movements than prosaccade eye movements.
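Switching the sides of the ERP data so that every saccade is treated as rightward amounts to swapping mirror-image electrode groups on leftward trials. The sketch below assumes that the odd- and even-numbered virtual electrode groups named in the text form left/right mirror pairs; the pairing table and the function name `align_to_rightward` are hypothetical.

```python
# Assumed left/right mirror pairs of virtual electrode groups.
MIRROR_PAIRS = {
    "Central3": "Central4",
    "Parietal3": "Parietal4",
    "Occipital1": "Occipital2",
}


def align_to_rightward(trial, saccade_direction):
    """Return the trial's data as if the saccade had been made to the right.

    trial:             dict mapping electrode group name to its ERP samples.
    saccade_direction: "left" or "right".
    """
    if saccade_direction == "right":
        return dict(trial)
    if saccade_direction != "left":
        raise ValueError("saccade_direction must be 'left' or 'right'")
    swap = {}
    for left, right in MIRROR_PAIRS.items():
        swap[left] = right
        swap[right] = left
    # Midline groups (e.g., ParietalZ) are not in the table and stay in place.
    return {swap.get(name, name): data for name, data in trial.items()}


leftward_trial = {"Parietal3": [1.0], "Parietal4": [2.0], "ParietalZ": [3.0]}
aligned = align_to_rightward(leftward_trial, "left")
print(aligned["Parietal3"], aligned["Parietal4"], aligned["ParietalZ"])
```

After this step Parietal3 always holds the data contralateral to the eye movement and Parietal4 the ipsilateral data, so trials with leftward and rightward saccades can be averaged together.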

The presaccadic spike potential was analyzed. The difference between the ERP in the intervals immediately preceding the saccade (−24 to −16 ms) and the ERP occurring at the time of the saccade (−8 through +8 ms) was analyzed for the central electrode groups with a Cue Type (2) × Movement Type (2) MANOVA. The Movement Type factor affected both the CentralZ electrode group, Wilks' Λ = 0.4762, *F*(6, 24) = 4.40, *p* = 0.0039, and the ParietalZ electrode group, Wilks' Λ = 0.5083, *F*(5, 25) = 4.84, *p* = 0.0031. The presaccadic ms-by-ms changes in the ERP of the CentralZ and ParietalZ are shown in **Figure 4**. These figures used the intervals immediately preceding the saccade as the baseline. The presaccadic spike potential was larger on trials on which a prosaccade eye movement occurred than on trials on which an antisaccade eye movement occurred. The difference in this spike potential between eye movement types also occurred over lateral parietal leads (Parietal3, Parietal4) but not over central or occipital center or lateral electrodes (no effects for CentralZ, OccipitalZ, Central3, Central4, Occipital1, Occipital2).

#### *ERP source analysis*

The cortical sources of the ERP were analyzed. I restricted these analyses to examine the presaccadic spike potential and the presaccadic positive slow wave, which were found in the ERP analyses to be significantly affected by the type of eye movement. **Figure 5** shows a 3-D rendering of the current density reconstruction of the ERP occurring at the peak of the spike potential, separately for prosaccades and antisaccades. The primary area showing activity was below the anterior cingulate in the ventral anterior cingulate and the orbital-frontal gyrus. This activity appears to be greater for prosaccade (top panel) than for antisaccade (bottom panel) eye movements. **Figure 6** shows the ms-by-ms mean *nAm* for selected ROIs. The ventral anterior cingulate and the orbital frontal gyrus both showed the largest increase in the positive slow wave and a large spike potential.

The sources of the presaccadic positive slow wave were examined by computing a mean current density value in the interval from about −50 to −20 ms preceding the saccade. This value was analyzed with an ROI (e.g., frontal pole, orbital frontal gyrus, Brodmann areas 6 and 8, dorsal ACC, ventral ACC, pre- and post-central gyri, superior parietal lobe, posterior cingulate, intraparietal sulcus, supramarginal gyrus, angular gyrus, dorsolateral PFC) × Cue Type (2) × Movement Type (2) ANOVA. The positive slow wave current density was significantly affected by the ROI, *F*(15, 375) = 34.33, *p* < 0.0001, and an interaction between ROI and eye movement type, *F*(15, 255) = 3.32, *p* < 0.0001. The interaction reflected a significant effect of movement type for the current density coming from the ventral ACC and orbital frontal gyrus. The current density was larger for antisaccades than for prosaccades (**Figure 6**, for ventral ACC and orbital frontal gyrus). There were smaller (non-significant) effects in the same direction for Brodmann areas 6 and 8 and the dorsolateral PFC.

The cortical sources of the presaccadic spike potential were examined by calculating the difference in current density between the interval immediately preceding the saccade (−24 to −16 ms) and the time of the saccade (−8 through +8 ms). This was analyzed with an ROI × Cue Type × Movement Type ANOVA. There were main effects of the cue type, *F*(1, 17) = 17.78, *p* = 0.0126, ROI type, *F*(15, 375) = 57.83, *p* < 0.0001, and an interaction of cue type and ROI type, *F*(15, 255) = 7.41, *p* < 0.0001. As with the positive slow wave, the cue type effects occurred in the ventral ACC and the orbital frontal cortex, and to a lesser degree, in the dorsal ACC.

**Figure 4 |** Grand average pretarget ERP, with frontal (Fz) electrodes showing a slow positive ERP component simultaneous with central (Cz) and parietal (Pz) electrodes showing a slow negative ERP component (CNV) **(top panel)**. Grand average ERP for presaccade activity **(middle panel)**. Grand average ERP for the presaccadic spike potential on the CentralZ and ParietalZ virtual electrode groups for prosaccade and antisaccade eye movements **(bottom panel)**.

#### **DISCUSSION**

There were three types of ERP activity found in this study that replicated findings from other studies. First, there was a negative potential shift in the ERP that occurred before target onset and was associated with the preparatory interval of the task rather than the saccade itself. Several studies of ERP activity in the prosaccade and antisaccade task report this negative shift in the EEG that begins up to 1 s prior to saccade onset and has its maximum over the vertex (Brickett et al., 1984; Evdokimidis et al., 1996; Everling et al., 1997, 1998; Klein et al., 2000; Richards, 2003; Mueller et al., 2009). Two of the ERP components were closely tied to events surrounding the saccade itself. The second type of ERP activity was the slow positive potential shift in ERP beginning about 100 ms before saccade onset with maximum values over central and parietal leads. This positive slow component over parietal leads is not unique to studies of the antisaccade and prosaccade but is also found in voluntary eye movements. Some studies report no difference in this component between antisaccade and prosaccade trials (Evdokimidis et al., 1996; Everling et al., 1997, 1998; Richards, 2003), but in the current study it was larger for antisaccade than for prosaccade eye movements. The third type of ERP component was the sharp spike in ERP over the central and parietal leads called the "spike potential." In one study this potential was larger for antisaccade trials than prosaccade trials (blocked design, Everling et al., 1997) but generally this potential was the same on prosaccade and antisaccade trials (Evdokimidis et al., 1996; Everling et al., 1997, 1998; Klein et al., 2000; Richards, 2003). In the current study it was larger on the prosaccade eye movement trials than on the antisaccade eye movement trials.

The cortical sources of the presaccadic ERP activity were examined with current density reconstruction using realistic head models. The sources for both presaccadic components were found primarily in the ventral portion of the anterior cingulate cortex and the orbital frontal gyrus. The current density was larger for the antisaccade eye movements than for prosaccade eye movements in the period corresponding to the positive slow wave in the parietal scalp leads. This difference, and the timing of the current density over the presaccadic interval (e.g., **Figure 6**), were similar to those of the presaccadic positive slow wave (**Figure 4**). These findings suggest that this area is the cortical source generating this ERP component on the scalp. The spike potential in the ERP also appears to be localized to the same cortical area. At least, the timing of the spike potential occurring in the ERP was similar to what was found in the ms-by-ms current density reconstruction values in these ROIs. It is interesting that the spatial cue affected the current density in this ROI, whereas the type of eye movement affected the ERP component.

The second experiment was designed to compare the effects of presenting trials in a mixed-choice design with presenting trials in a blocked design. fMRI and ERP studies have shown brain activity during antisaccades that is larger than during prosaccades in blocked trials, and many of these differences disappear for mixed-choice trials (e.g., Dyckman et al., 2007). Similarly, event-related fMRI studies that separate preparatory periods and execution periods find many more areas in which the preparatory brain activity is larger for antisaccade than for prosaccade eye movements. This suggests that the blocked trials result in preparatory psychological processes that do not exist in mixed-choice trials. One advantage of the timing resolution of ERP and the instantaneous response of the electrical changes in the brain is that eye movement preparatory and execution activity in the brain might be distinguished in either blocked or mixed-choice designs. The second experiment therefore compared the mixed-choice trial design with a design in which prosaccade or antisaccade trials were presented in separate blocks.

#### **EXPERIMENT 2**

#### **METHOD**

#### *Participants*

The participants were eleven adults (7F). The participants ranged in age from 19 to 34 at time of testing (mean = 25.4, *SD* = 4.84) and consisted of undergraduate and graduate students.

#### *Procedure*

The procedure differed from Experiment 1. There were six types of presentations. Four presentations were blocked according to the type of eye movement and cueing procedure. This resulted in four blocked conditions: (1) uncued prosaccade trial blocks, (2) uncued antisaccade trial blocks, (3) cued prosaccade trial blocks, and (4) cued antisaccade trial blocks. Two additional mixed-choice trial blocks were given: (5) uncued prosaccade/antisaccade mixed trial blocks, and (6) cued prosaccade/antisaccade mixed trial blocks. **Figure 7** shows a representative set of trials for the uncued prosaccade trial block; **Figure 1** (middle panel) shows the corresponding uncued mixed-choice prosaccade and antisaccade trial blocks. All trial blocks included catch trials, and trials were presented in random order for 5-trial sequences (4 prosaccade and 1 catch trial for prosaccade blocks; 4 antisaccade and 1 catch trial for antisaccade blocks; 2 antisaccade, 2 prosaccade, 1 catch trial for mixed-choice trial blocks). The six presentation types resulted in prosaccade and antisaccade trial data for cued and non-cued presentations, and for blocked and mixed-choice presentations.

Each participant received all six types of blocked presentation, in 5-min blocks, with the order of presentation being randomly chosen without replacement for the six block types.
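The block structure described in this section (six block types, each built from randomly ordered 5-trial sequences, with block order drawn without replacement) can be sketched as a small trial-list generator. The block labels, the function names, and the number of sequences per block are hypothetical; the text specifies only 5-min blocks, not a trial count.

```python
import random

# Cue condition and composition of one 5-trial sequence for each block type.
BLOCK_TYPES = {
    "uncued_prosaccade": ("uncued", ["pro"] * 4 + ["catch"]),
    "uncued_antisaccade": ("uncued", ["anti"] * 4 + ["catch"]),
    "cued_prosaccade": ("cued", ["pro"] * 4 + ["catch"]),
    "cued_antisaccade": ("cued", ["anti"] * 4 + ["catch"]),
    "uncued_mixed": ("uncued", ["pro"] * 2 + ["anti"] * 2 + ["catch"]),
    "cued_mixed": ("cued", ["pro"] * 2 + ["anti"] * 2 + ["catch"]),
}


def make_block(block_name, n_sequences, rng):
    """Build one block as a list of (cue, trial_type) tuples."""
    cue, sequence = BLOCK_TYPES[block_name]
    trials = []
    for _ in range(n_sequences):
        shuffled = list(sequence)
        rng.shuffle(shuffled)  # random order within each 5-trial sequence
        trials.extend((cue, trial_type) for trial_type in shuffled)
    return trials


def make_session(rng, n_sequences=12):
    """Order the six block types randomly, without replacement."""
    order = rng.sample(sorted(BLOCK_TYPES), k=len(BLOCK_TYPES))
    return [(name, make_block(name, n_sequences, rng)) for name in order]


session = make_session(random.Random(2014))
for name, trials in session:
    print(name, len(trials))
```

Shuffling within each 5-trial sequence, rather than across the whole block, keeps the proportion of catch trials constant throughout a block.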

#### *Other methods*

All the MRIs for Experiment 2 were 3D T1-weighted images done on a 3.0T Philips Intera MRI, with 1.0 mm slices and 256 × 159 × 256 resolution (Center for Advanced Imaging Research, Medical University of South Carolina, Charleston, SC).

#### **RESULTS**

#### *Saccade error and latency*

The onset of the saccade from the center to the targeted square was analyzed. There were 4436 eye movements in the experiment, distributed approximately equally for antisaccades and prosaccades (2100 and 2166 eye movements, respectively) and across the experimental conditions (from 512 to 580 eye movements for each block-type/eye movement type combination). The error rate for the uncued and cued trials was approximately equal (2.54 and 2.14%, respectively), as were errors on the antisaccade and prosaccade trials (2.22 and 2.24%, respectively), and there were slightly more errors on mixed-choice trials than on blocked trials (2.80 and 1.91%, respectively). The latency of the saccade onset from the target onset was analyzed by a repeated measures ANOVA with trial type (cued, uncued), eye movement type (prosaccade, antisaccade), and procedure type (blocked, mixed) as factors. There were main effects for trial type, *F*(1, 10) = 112.28, *p* < 0.001, movement type, *F*(1, 10) = 20.74, *p* = 0.0011, procedure type, *F*(1, 10) = 108.28, *p* < 0.001, and an interaction between trial type and procedure type, *F*(1, 10) = 36.39, *p* < 0.001. **Figure 8** shows the RTs as a function of the three experimental factors. As found in Experiment 1, saccades were faster on cued than uncued trials, and faster for prosaccades than antisaccades. The trial type by procedure type interaction was due to a larger cueing effect on mixed-choice trials (98 ms) than on blocked trials (48 ms).

I examined the Trial Type × Movement Type effect separately for the blocked and mixed-choice trials. For the mixed-choice trials, similar to Experiment 1, the interaction between trial type and movement type approached significance (*p* = 0.0530). However, the interaction of trial type and movement type was not significant on the blocked trials done in Experiment 2 (*p* = 0.2177). For the mixed-choice trials, similar to Experiment 1, the cue facilitated the reaction time of the prosaccade eye movements by 41 ms whereas it facilitated the antisaccade eye movements by 107 ms. In contrast, on the blocked trials used in the current experiment, the cue effect was similar for prosaccade (43 ms) and antisaccade (53 ms) trials, and was approximately the same size as the prosaccade cueing effect on the mixed-choice trials in Experiments 1 and 2.

#### *Grand average ERP*

The same ERP components that were examined in Experiment 1 were tested as a function of the cue type (2: uncued, cued), movement type (2: prosaccade, antisaccade), and the procedure type (2: blocked, mixed). There was a negative slow wave primarily in the central and parietal leads. A multivariate analysis of the FrontalZ, CentralZ, ParietalZ, and OccipitalZ electrode groups was tested with a Cue Type (2) × Movement Type (2) × Procedure Type (2) design. The effect of the cue type on the CentralZ virtual electrodes approached statistical significance, Wilks' Λ = 0.1860, *F*(6, 5) = 3.65, *p* = 0.0885. As in Experiment 1, the negative slow wave was larger on the non-cued than on the cued trials. There was a significant interaction of the movement type and procedure type on the CentralZ virtual electrode group, Wilks' Λ = 0.1978, *F*(12, 30) = 3.12, *p* = 0.0056. As in Experiment 1, there was no significant movement type effect on the CentralZ negative slow wave for the mixed-choice procedure. However, in the blocked trials, the negative slow wave was larger for the antisaccade trial block than for the prosaccade trial block (*M*s of CentralZ virtual electrode group for prosaccade and antisaccade trial blocks = −4.06 and −4.60 µV, respectively).

The presaccadic positive slow wave occurring primarily over the parietal areas was examined by computing the mean value in the presaccadic interval from about −50 to −20 ms preceding the saccade (**Figure 4**, middle panel). This was analyzed for the FrontalZ, CentralZ, ParietalZ, and OccipitalZ electrode groups with a Cue Type (2) × Movement Type (2: prosaccade, antisaccade) × Procedure Type (2: blocked, mixed) MANOVA. There were no significant main effects or interactions involving the procedure type effect. This indicates that the positive slow wave occurring over the parietal leads was similar in magnitude for the blocked and mixed-choice trial types. Unlike this ERP component in Experiment 1, there was a main effect of the cue type on the ParietalZ electrode group, Wilks' Λ = 0.0731, *F*(5, 6) = 15.21, *p* = 0.0024. The parietal slow wave was larger for the uncued trials than for the cued trials (*M*s = 3.21 and 2.01 µV, respectively, for uncued and cued trials). Even though the Cue Type × Procedure Type interaction was not significant, because this effect did not occur in Experiment 1 I tested the Cue Type effect for the blocked and mixed-choice procedure types separately with *post-hoc* error control methods. The cue type factor significantly affected the positive slow wave for the blocked trials (*p* < 0.05) but not for the mixed-choice trials.

The presaccadic spike potential was analyzed. The difference between the ERP in the intervals immediately preceding the saccade (−24 to −16 ms) and the ERP occurring at the time of the saccade (−8 through +8 ms) was analyzed for the central electrode groups with a Cue Type (2) × Movement Type (2) × Procedure Type (2) MANOVA. There were no significant main effects or interactions involving the procedure type, indicating that the presaccadic spike potential was not significantly different for the blocked and mixed-choice trials. The effects found in Experiment 1 were also substantially replicated in Experiment 2, i.e., a significantly larger spike potential for the prosaccade eye movements than for the antisaccade eye movements in the CentralZ, ParietalZ, and ipsilateral parietal electrode groups.

#### *ERP source analysis*

There were two effects of the blocked trials on the ERP. First, on the blocked trials the negative slow wave was larger for antisaccade eye movements than for prosaccade eye movements. The sources of the central negative slow wave were examined by computing the difference between the current density values from the last 50 ms of the pretarget interval and the first 50 ms, only for the blocked trials. This was examined with an ROI (e.g., frontal pole, orbital frontal gyrus, Brodmann areas 6 and 8, dorsal ACC, ventral ACC, pre- and post-central gyri, superior parietal lobe, posterior cingulate, intraparietal sulcus, supramarginal gyrus, angular gyrus, dorsolateral PFC) × Movement Type (2) × Side (3: contralateral, central, ipsilateral) ANOVA. There were several main effects and interactions, including the three-way interaction between ROI, movement type, and side, *F*(9, 90) = 3.66, *p* = 0.0005. Three of the ROIs had a significant interaction between movement type and side. This occurred because the current density on the contralateral side was larger for antisaccade trials than prosaccade trials, but the current density on the ipsilateral side was not significantly different for prosaccade and antisaccade eye movements. This occurred for the pre- and post-central gyri, the superior parietal lobe, and the frontal pole.

**Figure 9** shows the ms-by-ms mean *nAm* for the parietal and pre- and post-central gyri ROIs combined, separately for the contralateral and ipsilateral sides and the prosaccade and antisaccade eye movements. The antisaccade trials resulted in larger current density than the prosaccade trials in these parietal and mid-central areas on the contralateral side of the eye movement, but not on the ipsilateral side. **Figure 9** also shows the parietal and pre- and post-central gyri ROI source activation for the prosaccade and antisaccade trials. The difference in the activity between the sides appeared in the most superior position of these ROIs on the contralateral side of the eye movement. **Figure 10** shows similar plots for the frontal pole area, for pro- and anti-saccade eye movements on the blocked trials. The activity was contralateral to the eye movement.

Second, there was a cue type effect for the presaccadic positive slow wave for the blocked trials that did not occur on the mixed-choice trials. The cortical sources of the presaccadic positive slow wave were examined by computing the mean current density value in the presaccadic interval from about −50 to −20 ms preceding the saccade, only for the blocked trials. This was examined with an ROI × Cue Type × Movement Type ANOVA. There were some significant effects replicating the findings of Experiment 1 for this ERP component. However, there were no significant effects or interactions involving the cue type effect or the movement type effect. This indicates that the current density for the cued and uncued trials was not significantly different in the blocked trials even though the ParietalZ component was affected by the trial type.

#### **DISCUSSION**

#### *Pretarget and presaccadic ERP*

There were three types of ERP activity found in this study that replicated findings from other studies. First, there was a negative potential shift in the ERP that occurred before target onset and was associated with the preparatory interval of the trial rather than the saccade itself. Several studies of ERP activity during prosaccade and antisaccade eye movements report this negative shift in the EEG that begins up to 1 s prior to saccade onset and has its maximum over the vertex (Brickett et al., 1984; Evdokimidis et al., 1996; Everling et al., 1997, 1998; Klein et al., 2000; Richards, 2003; Mueller et al., 2009). This potential was similar to the "contingent negative variation" found in tasks with a preparatory interval and an imperative stimulus (e.g., S1-S2; CNV; Walter et al., 1964; Fabiani et al., 2000). In the current study using a block design, in which prosaccade and antisaccade trials are given in different blocks, this ERP component was larger in the antisaccade trial blocks than in the prosaccade trial blocks. In mixed-choice trials when the cue for the preparatory interval is informative for the type of eye movement, most often this negative presaccadic potential does not differ between prosaccade and antisaccade trials (Evdokimidis et al., 1996; Richards, 2003). The close link of this component to the preparatory period and its time course (500–1000 ms before saccade onset) suggests it represents the preparatory activity in mixed-choice trial designs, or response set in blocked designs, similar to the preparatory BOLD activity occurring in event-related fMRI studies. This presaccadic potential occurs for cued catch trials (Richards, 2003) or, in the current study, when the cue was uninformative about the type of eye movement on the trial.

Two of the presaccadic activities were closely tied to events surrounding the saccade itself. The second type of ERP activity was the slow positive potential shift in ERP beginning about 100 ms before saccade onset with maximum values over central and parietal leads. This positive slow component over parietal leads is not unique to studies of the antisaccade and prosaccade but is also found in voluntary eye movements. Most studies report no difference in this component between antisaccade and prosaccade trials (Evdokimidis et al., 1996; Everling et al., 1997, 1998; Richards, 2003), but in the current study it was larger for antisaccade than for prosaccade trials in the mixed-choice procedure of Experiment 1. The third type of ERP component was the sharp spike in ERP over the central and parietal leads called the "spike potential." In one study this potential was larger for antisaccade trials than prosaccade trials (blocked design, Everling et al., 1997) but generally this potential was the same on prosaccade and antisaccade trials (Evdokimidis et al., 1996; Everling et al., 1997, 1998; Klein et al., 2000; Richards, 2003). In the current study this ERP component was slightly larger on the prosaccade trials.

#### *Cortical activation for saccade preparation*

The cortical source analysis in the current study was useful in identifying the brain regions generating the ERP activity so it could be compared with neuroimaging studies using PET or fMRI. The studies of PET or fMRI using the block design find neural activity in this task in nearly every cortical area known to be involved in eye movement control in primates (FEF, SEF, superior parietal lobe, DPC [areas 9, 46], anterior cingulate cortex, anterior medial PFC [areas 8, 9], ventromedial PFC [area 10]; Everling and Fischer, 1998; Munoz and Everling, 2004; McDowell et al., 2008). Studies using event-related fMRI that separate the preparatory interval BOLD activity from response activity often report that the BOLD activity in response to informative preparatory cues is larger on antisaccade than prosaccade trials in the FEF or SEF (SMA) (Curtis and D'Esposito, 2003, 2006; Desouza et al., 2003; Ford et al., 2005). The ERP activity in the current study similar to the preparatory BOLD response was the negative potential that began at the preparatory cue and preceded target onset. This ERP component was not different for prosaccade and antisaccade eye movements in the mixed-choice procedure used in Experiment 1. However, when the eye movement types were presented in separate testing blocks, the pretarget negative slow wave was larger for the antisaccade block. This effect is similar to that found with fMRI studies when the prosaccade and antisaccade trials are given in blocked trials or in mixed-choice trials. In fMRI studies there is an extended network of areas in the parietal and frontal lobes that are involved in the preparation of antisaccade eye movements (Ford et al., 2005; Brown et al., 2007; Ettinger et al., 2008). However, these areas appear to be linked to preparatory activity of the antisaccade and not saccade execution (e.g., Ford et al., 2005; anterior cingulate cortex, FEF, SEF, DPC, intraparietal sulcus, parietal-occipital sulcus). Many of these brain areas are enhanced for eye movements occurring in antisaccade trial blocks over prosaccade trial blocks, but are less widespread for mixed-choice trial blocks (Cornelissen et al., 2002; Dyckman et al., 2007).

The cortical sources of the pretarget negative slow wave were widespread in the current study, but the functional relation between eye movement type and these sources occurred primarily in the contralateral parietal and around the central gyrus (precentral and post-central gyri). Studies using event-related fMRI that separate saccade preparatory activity from saccade execution have reported higher activation for antisaccade eye movements in similar areas (e.g., SMG in Ettinger et al., 2008; parietal-occipital sulcus in Ford et al., 2005). It should be noted that this ERP component did have cortical sources in portions of the anterior cingulate, especially the ventral regions. The sources in the anterior cingulate cortex were similar to several studies showing activity in blocked designs (Brown et al., 2004; Raemaekers et al., 2005; Matsuda et al., 2006) or event-related fMRI activity in the preparatory period (Ford et al., 2005; Brown et al., 2007). In addition to being larger for antisaccades than prosaccades, the area corresponding to the anterior portion of the cingulate gyrus shows a larger BOLD response in the preparatory interval on antisaccade trials that were correct compared to error trials (Ford et al., 2005). Both the anterior cingulate cortex and anterior portions of the cingulate gyrus have been shown to be active after saccades in this task differentially for error and correct antisaccades (Polli et al., 2005). This implies that this area is heavily involved in both saccade planning and eye movement evaluation in this task.

#### *Cortical activation for saccade execution*

The analysis of the responses immediately preceding the saccade and their cortical sources has implications for fMRI analysis. Both the presaccadic positive slow wave and the spike potential occurring at saccade onset appear to have their primary origin in the ventral areas of the prefrontal cortex. This includes the ventral region of the anterior cingulate cortex and the orbital frontal gyrus (**Figure 6**). The presaccadic positive slow wave was larger for antisaccade eye movements than for prosaccade eye movements, and this occurred on both the blocked and mixed-choice tasks. The close temporal relation of these components to the saccade onset is a finding uniquely suited for EEG/ERP work. These saccade-oriented effects cannot be shown in either blocked or event-related fMRI because of the temporal resolution of fMRI. It appears that the main effect of the blocking procedure vis-à-vis the eye movement types was on parietal and central brain areas rather than on prefrontal or anterior cingulate cortex. Alternatively, the responses specifically related to saccade execution occurred in the anterior cingulate and were not affected by the blocking manipulation. This finding suggests that the blocked trials used in typical fMRI (or ERP) studies result in a preparatory set that affects the brain areas controlling eye movement planning, whereas the brain areas more closely related to eye movement execution in the prefrontal cortex are unaffected by such preparatory sets.

One goal of this study was to improve the analysis of the cortical sources of the presaccadic ERP. The current study improved the analysis of the cortical sources of eye movement control over two prior studies (Richards, 2003; McDowell et al., 2005) in several ways. First, the McDowell et al. (2005) study used a "region-of-interest" approach and examined only a few selected cortical areas for the presaccadic ERP (lateral and medial FEF, SEF, and DPC). The extensive involvement in the current study of the anterior cingulate cortex, frontal pole, and orbital-frontal areas in the brain activity immediately preceding the eye movement therefore would have been overlooked. Second, these prior studies used a single MRI for the identification of source locations (average of individual MRIs in McDowell et al., 2005; single MRI in Richards, 2003) and McDowell et al. averaged the source locations to show continuous source areas. The use of an averaged MRI, generic brain, or normalized space leads to an artificial restriction of source locations and apparent localization precision, whereas the averaging of different underlying scalp sources may lead to smearing of the EEG (ERP) potential and smearing of the source locations. Doing source analysis tailored to the individual's anatomical space may be particularly important in the presence of significant differences in head size or the relation of the brain areas to scalp landmarks (Ha et al., 2003). Third, the use of finite element methods (FEM) for the resistance pathways for calculating the forward model (Rosenfeld et al., 1996; Awada et al., 1997; Buchner et al., 1997; Michel et al., 2004; Slotnick, 2004) was an improvement over the three-shell spherical model used by Richards (2003) or the three-compartment boundary element model used by McDowell et al. (2005). 
The boundary element models are geometrically more realistic than the spherical models, but cannot faithfully represent changes in resistance between gray matter, white matter, CSF, and muscle within the central compartment. The FEM models account for non-homogenous tissue within the head, local variations in tissue depth or area, and electrical anisotropies; though in practice, boundary element models and finite element models may give very similar results (Slotnick, 2004).

### **SUMMARY: COMPARING fMRI AND ERP FOR STUDYING EYE MOVEMENTS**

The results of the cortical analysis in the current study compare favorably with neuroimaging studies using PET, block-design fMRI, and event-related fMRI. This study and others (Evdokimidis et al., 1996; Everling et al., 1998; Richards, 2003) showed the negative presaccadic potential was related to preparatory responses to the target and not the events surrounding the saccadic eye movements. Several studies using event-related fMRI showed that activity in the areas of the brain that differentiate antisaccade from prosaccade eye movements (e.g., FEF) have substantial preparatory BOLD activity (Cornelissen et al., 2002; Curtis and D'Esposito, 2003; Desouza et al., 2003; Ford et al., 2005; Ettinger et al., 2008). In these studies when the BOLD activity was linked to the saccade execution period, some of the differentiation for antisaccades and prosaccades was eliminated. In addition to the likelihood that blocked presentations lead to preparatory set enhancement of the antisaccade brain activity, it also may be the case that in mixed-choice trials with informative cues indicating the upcoming eye movement, antisaccade eye movements are preceded by more activity than prosaccade eye movements (e.g., Richards, 2003; Brown et al., 2007, and Experiment 1 cueing effects). Both blocked presentations and instructional cues lead to shorter reaction times at the target onset, and a smaller difference between antisaccade and prosaccade trials (Richards, 2003; and both experiments of this study). Although the event-related fMRI design can separate blocked trials and mixed-choice trials effects and can separate preparatory effects from eye movement execution effects, the temporal resolution of the fMRI neuroimaging technique cannot identify the temporal process of brain activity surrounding the saccade (cf. 4 s preparatory effects with 0.5 s saccade effects, Figure 1 in Ford et al., 2005).
Studies using ERP and cortical source analysis (e.g., current study; Richards, 2003; McDowell et al., 2005) identified brain areas that were associated with the saccadic eye movement and pinpointed brain activity in the 100 ms preceding the saccade. I conclude that ERPs with cortical source analysis are useful in differentiating brain activities associated with general preparatory processes for control and brain activities associated with eye movement execution.

### **ACKNOWLEDGMENTS**

This research was supported by grants from the National Institute of Child Health and Human Development, #R37-HD18942 and a Major Research Instrumentation Award, #BCS-9977198, from the National Science Foundation. I wish to acknowledge Michael Stevens and William Campbell for their aid in testing participants, data editing, and analysis.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Systems\_Neuroscience/10.3389/fnsys.2013.00027/abstract

### **REFERENCES**


task. *Exp. Brain Res.* 118, 27–34. doi: 10.1007/s002210050252


toolkit," in *Paper Presented at the 2nd UK e-Science All-hands Conference* (Nottingham).


379–399. doi: 10.1080/87565641.2012.688900


study of voluntary saccadic eye movements and spatial working memory. *J. Neurophysiol.* 75, 454–468.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 March 2013; accepted: 11 June 2013; published online: 02 July 2013.*

*Citation: Richards JE (2013) Cortical sources of ERP in prosaccade and antisaccade eye movements using realistic source models. Front. Syst. Neurosci. 7:27. doi: 10.3389/fnsys.2013.00027*

*Copyright © 2013 Richards. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any thirdparty graphics etc.*

## A new high-speed visual stimulation method for gaze-contingent eye movement and brain activity studies

#### *Fabio Richlan<sup>1</sup>\*†, Benjamin Gagl<sup>1</sup>\*†, Sarah Schuster<sup>1</sup>, Stefan Hawelka<sup>1</sup>, Josef Humenberger<sup>2</sup> and Florian Hutzler<sup>1</sup>*

*<sup>1</sup> Centre for Neurocognitive Research and Department of Psychology, University of Salzburg, Salzburg, Austria*

*<sup>2</sup> HTBLA Leonding, Linz, Austria*

#### *Edited by:*

*Andrey R. Nikolaev, KU Leuven, Belgium*

#### *Reviewed by:*

*Hironori Nakatani, Japan Science and Technology Agency, Japan*

*David C. Jangraw, Columbia University, USA*

#### *\*Correspondence:*

*Fabio Richlan and Benjamin Gagl, Centre for Neurocognitive Research and Department of Psychology, University of Salzburg, Hellbrunnerstr. 34, 5020 Salzburg, Austria e-mail: fabio.richlan@sbg.ac.at; benjamin.gagl@sbg.ac.at*

*†These authors have contributed equally to this work.*

Approaches using eye movements as markers of ongoing brain activity to investigate perceptual and cognitive processes were able to implement highly sophisticated paradigms driven by eye movement recordings. Crucially, these paradigms involve display changes that have to occur during the time of saccadic blindness, when the subject is unaware of the change. Therefore, a combination of high-speed eye tracking and high-speed visual stimulation is required in these paradigms. For combined eye movement and brain activity studies (e.g., fMRI, EEG, MEG), fast and exact timing of display changes is especially important, because of the high susceptibility of the brain to visual stimulation. Eye tracking systems already achieve sampling rates up to 2000 Hz, but recent LCD technologies for computer screens reduced the temporal resolution to mostly 60 Hz, which is too slow for gaze-contingent display changes. We developed a high-speed video projection system, which is capable of reliably delivering display changes within the time frame of *<* 5 ms. This could not be achieved even with the fastest cathode ray tube (CRT) monitors available (*<* 16 ms). The present video projection system facilitates the realization of cutting-edge eye movement research requiring reliable high-speed visual stimulation (e.g., gaze-contingent display changes, short-time presentation, masked priming). Moreover, this system can be used for fast visual presentation in order to assess brain activity using various methods, such as electroencephalography (EEG) and functional magnetic resonance imaging (fMRI). The latter technique was previously excluded from high-speed visual stimulation, because it is not possible to operate conventional CRT monitors in the strong magnetic field of an MRI scanner. Therefore, the present video projection system offers new possibilities for studying eye movement-related brain activity using a combination of eye tracking and fMRI.

**Keywords: eye fixation-related potential, eye tracking, EEG, ERP, fMRI, projector, gaze-contingent display changes**

### **INTRODUCTION**

The combined recording and analysis of eye movements and brain activity is one of the most promising developments in neuroscience (see Eye Movement-Related Brain Activity During Perceptual and Cognitive Processing). Recent studies used eye movements as markers for brain responses related to perceptual and cognitive processes during reading (e.g., Baccino and Manunta, 2005; Hutzler et al., 2007; Simola et al., 2009; Dimigen et al., 2011, 2012; Richlan et al., 2013), visual search (e.g., Healy and Smeaton, 2011; Kamienkowski et al., 2012), object identification (e.g., Rämä and Baccino, 2010; Marsman et al., 2012), and scene perception (e.g., Graupner et al., 2011; Nikolaev et al., 2011). Eye-movement-based research has particularly benefitted from the implementation of gaze-contingent display change paradigms such as moving window (McConkie and Rayner, 1975), moving mask (Rayner and Bertera, 1979), and invisible boundary paradigm (Rayner, 1975). The core of these paradigms is a display change during the very short time of a saccade. In the present paper, we present a novel high-speed visual stimulation system facilitating gaze-contingent display change paradigms, which replaces currently used but no longer produced cathode ray tube (CRT) monitors. Besides execution of extremely fast display changes for various kinds of visual experiments (e.g., involving gaze-contingent display changes, short-time presentation, masked priming), the main advantage of a projector-based system is its applicability in functional magnetic resonance imaging (fMRI) experiments. This opens up a novel line of research of combined eye tracking and fMRI studies with the above-mentioned experimental paradigms. Our system enables the implementation of visual experiments in the fMRI scanner with the same temporal precision as outside the fMRI environment. This was previously not possible because of the incompatibility of CRT monitors with fMRI and the poor temporal properties of current MR-compatible LCD monitors and projectors.

Gaze-contingent paradigms rely on fast and exactly timed display changes in response to the participant's eye movement behavior. Crucially, the display changes have to occur during a saccade, when visual processing is suppressed and the participant is unaware of the change. The duration of the time window for this process depends on the amplitude of a saccade. To illustrate the time constraints on gaze-contingent display changes in natural reading (i.e., sentence or text reading), the amplitude of a typical saccade is about 6–7 letters (i.e., about 2 visual degrees), resulting in saccade durations of around 20–35 ms (Rayner, 2009; Slattery et al., 2011). Note that in most cases the invisible boundary is roughly in the middle between the starting position and the landing position of the saccade. Consequently, more than half of the saccade duration has already elapsed before the invisible boundary is crossed and this boundary crossing is detected by the eye tracker. Therefore, the display change should be completed within the short period of time of the second half of the saccade, immediately before the next fixation begins. As a consequence, fast timing and low variability are needed in order to guarantee that the majority of display changes take place during a saccade rather than during a subsequent fixation.
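The time constraint described above can be made concrete with a small calculation. The following is a minimal sketch, not part of the original study: the function names and the `boundary_fraction` parameter are hypothetical, and the default of 0.5 encodes the simplifying assumption that the boundary is crossed exactly halfway through the saccade. Because detection by the eye tracker adds further delay, the values it returns should be read as optimistic upper bounds on the available time.

```python
def display_change_budget_ms(saccade_duration_ms, boundary_fraction=0.5):
    """Time remaining in the saccade after the boundary is crossed.

    boundary_fraction is the (assumed) fraction of the saccade duration
    that has elapsed when the eye crosses the invisible boundary.
    """
    if not 0.0 <= boundary_fraction <= 1.0:
        raise ValueError("boundary_fraction must lie between 0 and 1")
    return saccade_duration_ms * (1.0 - boundary_fraction)


def change_completes_in_saccade(saccade_duration_ms, total_latency_ms,
                                boundary_fraction=0.5):
    """True if a display change with the given latency ends before the saccade does."""
    budget = display_change_budget_ms(saccade_duration_ms, boundary_fraction)
    return total_latency_ms <= budget


if __name__ == "__main__":
    # Saccade durations of 20 and 35 ms are the range cited in the text.
    for duration in (20.0, 35.0):
        budget = display_change_budget_ms(duration)
        print(f"saccade {duration:.0f} ms -> at most {budget:.1f} ms for the change")
```

Under these assumptions, a 20 ms saccade leaves at most 10 ms for the display change, which illustrates why both low mean latency and low variability matter.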

Using conditions that resemble existing studies, Slattery et al. (2011) showed that there can be irredeemable artifacts in the eye movement data when some of the display changes are not finalized before a subsequent fixation begins. Specifically, there is an interaction between timing of the display change and the amount and quality of information that is changed between the pre- and post-boundary stimulus. Critically, they found that the slower the display change was, the larger were the differences in the eye movement patterns. Thus, the effects of interest may be affected by slow display changes and may lead to a complex distortion of the experimental effects, which renders interpretation of findings difficult, if not impossible. Hence, for experiments employing gaze-contingent paradigms, it is crucial to deliver fast display changes during the time of saccadic blindness, when visual processing is suppressed. This can only be accomplished via rigorous control over the timing of display changes via a combination of high-speed eye tracking and high-speed visual stimulation. Especially for combined eye movement and brain activity studies, fast and exact timing of display changes is important, because of the high susceptibility of the brain to visual stimulation. Even if a delayed display change is not consciously perceived by the participant, it is likely to affect visual information extraction, which, in turn, may influence measures of brain activity (e.g., event-related potentials or hemodynamic responses).

In order to avoid display change artifacts in gaze-contingent paradigms, we developed a high-speed video projection system based on light-emitting diode (LED) technology. An LED-based tachistoscope was recently shown to provide a powerful means for extremely fast on-and-off switching of a visual display (Thurgood and Whitfield, 2013). This technology was proven useful in enabling minimal stimulus exposure durations for psychological experiments. Here, we extend this approach by presenting a system that should be capable of reliably delivering changes between two different visual displays in a similarly short time. The system is based on two converging projectors, which are toggled exactly at the moment when a boundary crossing is detected. This means that, rather than switching from one display to the next display with a single stimulation device (i.e., a monitor or a projector), we use two stimulation devices (i.e., two projectors), which are switched on and off, respectively, to change the display. Therefore, the display change can be realized independently from the projectors' LCD panel refresh rates (which are limited to 60 Hz). The present paper introduces and describes this high-speed video projection system. Furthermore, we present a hardware-based method for measuring the delay between the time point of the intended display change (i.e., immediately after the boundary crossing is detected by the eye tracker) and the actual display change. This method is based on a combination of a real-time photosensitive diode and an electroencephalography (EEG) amplifier. It can be used to assess the temporal properties of any visual stimulation setup (monitor or projector). For the present paper, we used this measurement circuit in order to assess the temporal properties of our newly developed video projection system in relation to a conventional CRT monitor with two refresh rates of 150 and 200 Hz. 
To do so, we implemented a typical gaze-contingent invisible boundary paradigm in a sentence reading task. We expected markedly faster display changes in our LED-based projector system compared to the CRT monitor for two reasons. First, the display change was controlled by a combination of the fast parallel port (Stewart, 2006) and a fast electronic circuit. Second, the projector setup was realized to bypass the process of building up a new display, which—in this context—is rather time consuming.

The present visual stimulation system should facilitate the realization of cutting-edge eye movement research requiring reliable high-speed visual stimulation in combined eye tracking and brain electrophysiological studies. It is not only applicable to experiments employing gaze-contingent display change paradigms but should also be feasible for short-time presentation and masked priming studies. In this context it replaces conventional CRT monitors. In addition, our system presents the necessary hardware features for a novel line of combined eye tracking and fMRI studies by enabling extremely fast visual presentation and implementation of gaze-contingent display change paradigms in the fMRI scanner. Fast visual presentation in the fMRI environment was previously not possible because of the poor temporal properties of LCD-based MR-compatible monitors and projectors.

### **MATERIALS AND METHODS**

#### **PARTICIPANT**

One participant conducted the sentence reading task in the projector setup measurement and both monitor setup measurements.

#### **MATERIALS AND PROCEDURE**

A horizontal 3-point calibration routine preceded the experiment. Fixating between two vertical lines on the left margin of the screen triggered sentence presentation in such a way that the participant's fixation was at the center of the sentence's first word. One-hundred sentences from a currently conducted study were presented in black letters on a white background by the Experiment Builder software (SR-Research) in mono-spaced font (Courier New; single character width: ∼0.3° of visual angle; see **Figure 1** for an example). In each trial, a display change was initiated by a saccade from a pre-target to a target word, which was realized by the classical boundary paradigm (Rayner, 1975). To ensure fast display changes, we included a prepare sequence before each of our pre-built trials. Before the boundary crossing, all or some letters of the target word and all letters of the following words were degraded (i.e., about 45% of black pixels were displaced). When the eye crossed the invisible boundary, the display changed from the presentation of the degraded stimuli to a presentation without degradation. Sentence presentation was terminated after fixating an "X" in the lower right corner of the screen and recalibration was initiated in case the fixation control at the start of a trial failed. The target words were composed of five letters, were placed right after the invisible boundary, and were not predictable from sentence context. The invisible boundary and corresponding target words were never at the first, second, or last position of a sentence. In addition to the standard sentence presentation, which was vertically centered starting on the left of the screen, a black square (about 30° × 41°) was presented above the vertical center on the right end of the screen. This square triggered the measurement by the photosensitive diode. To specify, when the black square was presented, the photosensitive diode was switched off by the low amount of light that fell on the diode. After the invisible boundary was crossed, a white screen instead of the black square increased the light intensity presented to the photosensitive diode. The increase in light intensity reduced the resistance of the photosensitive diode, which consequently was switched on.

#### **APPARATUS AND SETUP**

For all measurements, an EyeLink CL eye tracker (SR-Research, Canada) was used to record the movements of the right eye (at 2000 Hz). A forehead and chin rest stabilized the participant's head 52 cm in front of the monitor. For presentation of the stimuli, we used a PC with a Pentium 4 processor (2.8 GHz processor speed), 2 GB RAM, a Nvidia GeForce 6200 graphics card, and a Windows XP operating system. In the following, two presentation setups were compared: the novel projector setup, which was realized with two projectors, and a monitor setup, which was realized with one of the fastest CRT monitors available.

#### *Monitor setup*

First, we used the state-of-the-art setup to estimate the latencies of gaze-contingent display changes for refresh rates of 150 and 200 Hz with the Vision Master Pro 454 monitor (Iiyama, Japan). For the 150 Hz refresh rate the display resolution was 1024 × 768 pixels, and for the 200 Hz refresh rate the display resolution was 640 × 480 pixels.
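As a rough orientation for the refresh rates mentioned here, the duration of one refresh cycle follows directly from the refresh rate. The sketch below is illustrative only and uses a hypothetical helper name; it computes the refresh period, which is the longest time a single raster device may have to wait before it can start drawing a new frame. It does not model the measured display change latencies, which also depend on software, graphics hardware, and the time needed to draw the frame.

```python
def refresh_period_ms(refresh_rate_hz):
    """Duration of one refresh cycle in milliseconds."""
    if refresh_rate_hz <= 0:
        raise ValueError("refresh rate must be positive")
    return 1000.0 / refresh_rate_hz


if __name__ == "__main__":
    # 60 Hz: projector LCD units; 150 and 200 Hz: the CRT monitor settings.
    for rate in (60, 150, 200):
        print(f"{rate} Hz -> {refresh_period_ms(rate):.2f} ms per refresh")
```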

#### *Projector setup*

The projector setup is a new approach to present gaze-contingent displays. Two projectors (with a resolution of 1024 × 768 pixels), which were mounted on top of each other, behind a semi-transparent screen, were either switched on or off by an electronic circuit. The switching circuit allows very fast display changes despite the low refresh rate of the projectors' LCD units (60 Hz; display change latency about 45 ms as measured by the electronic circuit described in section Display Change Latency Measurement). In the present experiment, we used an invisible boundary paradigm with one display change, which was realized by switching from one to the other projector. In order to be independent from the refresh rate of the projectors, the same display was presented by both projectors, with the lower part of projector 1 and the upper part of projector 2 converging at the center of a semi-transparent screen (**Figure 1A**). In the present paradigm, several words on the right-hand side of the invisible boundary were degraded prior to the display change. Therefore, projector 1 presented the degraded sentence in the lower part of the display, which was visible to the participant, before the boundary was crossed (**Figure 1B**). After the boundary was crossed, projector 1 was switched off and the un-degraded sentence was presented at the very same position by projector 2 (see **Figure 1C**). Importantly, only the area where the two projectors converged was visible to the participant sitting in front of the screen. In sum, the display change was realized by switching projector 1 off and projector 2 on at the moment the boundary was crossed. This on-and-off switching is independent of the actual refresh rate of the projectors and, therefore, results in faster display change latencies.

In detail, the display change was controlled by the display PC (**Figure 2**). After the eye tracker indicated that the eye crossed the invisible boundary, the display change was initiated by a TTL trigger (latency between boundary cross and TTL trigger: between 1 and 2 ms). This trigger was read out by the electronic circuit and resulted in an immediate switching of the two projectors. The fast switching was possible as both projectors (SP-F10M, Samsung Electronics Co., Ltd., South Korea) used lighting based on LEDs, which were either switched on or off by transistors. As a consequence, the transistors either connected or disconnected the projectors' LEDs from their original power sources.

Technically, the display changes of the projector setup were realized by a small but effective manipulation of the two projectors and a rather simple switching circuit. **Figure 3A** shows the top view of one projector with marked power supply ports of the three LEDs (red, blue, and green). Here, the original power cable connections of both identically constructed projectors were removed from their plugs. In order to control the power supply of the LEDs of the projector, the circuit presented in **Figure 2** was applied in between the original power supply ports and cable connectors of the projectors. The dashed boxes in **Figure 2** indicate the two projectors including a power supply and a LED. The cornerstones of the circuit were the power MOSFET transistors (BUZ 22) that were interposed between each LED of each projector and their original power sources. Note that for both projectors three transistors were used: one for each LED (red, blue, and green) of the projectors. For simplicity, **Figure 2** presents only one power transistor for each projector, but the circuit was identical for all three LEDs. At the moment the eye crossed the invisible boundary, the parallel port of the display PC controlled the power transistors via TTL triggers. This TTL trigger raised a potential from 0 to 5 V at one data pin of the parallel port, which triggered a toggle between the projectors. This toggle was realized by an inverter (see **Figure 2**), which ensured that, before the boundary was crossed and while no TTL signal was present, projector 1 was switched on (transistor *T*<sup>1</sup> connected the power source and the LEDs) and projector 2 was switched off (transistor *T*<sup>2</sup> disconnected the power source of the LEDs of projector 2). After the boundary crossing, the TTL trigger set the data pin to 5 V with the result that the power transistor of projector 1 was switched off by the inverter (i.e., signal inverted to 0 V) and the power transistor of projector 2, which was directly controlled by the parallel port, connected the power source of projector 2 to its LEDs.
This on-and-off switching of the two projectors allowed extremely fast display changes despite the low refresh rate of the projectors (60 Hz). **Figure 3B** shows the two projectors mounted in a wooden box with the electronic circuit (mounted in an aluminum box) on top.
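The toggle logic of the switching circuit can be summarized as a two-row truth table. The following sketch is a purely logical model, assuming ideal components with no switching delay; the function name and dictionary keys are hypothetical and serve only to illustrate that the inverter guarantees exactly one projector is lit for either state of the TTL line.

```python
def projector_states(ttl_high):
    """Logical state of both projectors for a given level of the TTL data pin.

    ttl_high is False before the boundary crossing (0 V) and True after it (5 V).
    """
    t2_gate = bool(ttl_high)      # projector 2 transistor: driven directly by the pin
    t1_gate = not bool(ttl_high)  # projector 1 transistor: driven through the inverter
    return {"projector_1_on": t1_gate, "projector_2_on": t2_gate}


if __name__ == "__main__":
    for level in (False, True):
        print(f"TTL high = {level}: {projector_states(level)}")
```

Because the two gate signals are logical complements, the display change reduces to a single level change on one pin, which is what makes it independent of the projectors' refresh cycle.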

#### **DISPLAY CHANGE LATENCY MEASUREMENT**

Another circuit was used to measure the display change latency after the eye crossed the invisible boundary (see **Figure 4**; for a similar measurement see Dorr, 2004). The cornerstone of this circuit was a photosensitive diode, which was placed either on the black square of the monitor setup or on projector 2. The monitor or projector 2 was used as light sources that were switched on at the boundary cross. After the boundary cross, the light intensity at the photosensitive diode was increased by removing the black square or switching on projector 2. This difference in illumination was measured by the photosensitive diode and allowed assessing the display change latencies of all the setups (projector, monitor 150 Hz, and monitor 200 Hz setups).

**FIGURE 3 | (A)** The power plugs for the original power supplies of the red, blue, and green LEDs of the projector used in the present setup (Samsung SP-F10M). **(B)** The projectors were mounted in a wooden box with the electronic circuit (mounted in an aluminum box) on top.

display change latencies in all setups (projector, monitor 150 Hz, and monitor 200 Hz setups).

In **Figure 4**, the circuit is presented in detail. The display change latencies were measured by the combination of the electronic circuit (consisting of two resistors, a battery, and the photosensitive diode) and an EEG amplifier (BrainAmp MR+; sampling rate of 1000 Hz). Importantly, the photosensitive diode (SFH 203) decreased its resistance when the illumination at the diode increased. As a consequence, the voltage levels, which were measured by the EEG amplifier, increased. In the present experiment, before the invisible boundary was crossed, the diode was not illuminated and, therefore, had a high resistance (black square or projector LEDs were switched off). After the boundary was crossed, the black square was removed in the monitor setup or projector 2 was switched on. At this moment, the amount of light at the photosensitive diode increased, which resulted in a decrease of the resistance of the diode, and as a consequence voltage levels increased. In addition, at the time the boundary was crossed, a TTL trigger was sent to the EEG amplifier, which allowed referencing the signal from the photosensitive diode to the point in time when the boundary was crossed (as initiated by the display PC). Note that the TTL trigger was sent on a different data pin than the one used to toggle the projectors. The voltage measured by the EEG amplifier and the TTL trigger reference allowed estimating the display change latencies.

### **RESULTS**

For the voltage change at the photosensitive diode circuit, the baseline correction was based on the 100 ms prior to the gaze-contingent display change (indicated by the TTL trigger). Furthermore, no signal processing filters were used, and due to absolute differences in voltage levels between the setups (projector setup: maximum of about 7 mV; monitor setup: maximum of about 2 mV), the voltage values were *z*-transformed. The difference between the voltage values was the result of a stronger illumination change in the projector setup. In contrast to the projector setup, where a projector was switched on, in the monitor setup only a black square on the normally illuminated monitor was removed. After the *z*-standardization, the normalized voltage values were once more baseline corrected, based on the same pre-display change interval of 100 ms prior to the crossing of the invisible boundary.
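The three preprocessing steps (baseline correction, *z*-transformation, second baseline correction) can be illustrated as follows. This is a hedged sketch under stated assumptions: the function names are hypothetical, the trace is assumed to start 100 ms before the TTL trigger (100 samples at 1000 Hz), and the *z*-transformation is assumed to be computed per trace, which the text does not specify.

```python
from statistics import mean, pstdev


def baseline_correct(trace, n_baseline):
    """Subtract the mean of the first n_baseline samples from every sample."""
    offset = mean(trace[:n_baseline])
    return [value - offset for value in trace]


def z_transform(trace):
    """Standardize a trace to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(trace), pstdev(trace)
    if sigma == 0:
        raise ValueError("a flat trace cannot be standardized")
    return [(value - mu) / sigma for value in trace]


def preprocess(trace, n_baseline=100):
    """Baseline-correct, z-transform, and baseline-correct again.

    Assumption: the first n_baseline samples precede the TTL trigger.
    """
    corrected = baseline_correct(trace, n_baseline)
    standardized = z_transform(corrected)
    return baseline_correct(standardized, n_baseline)


if __name__ == "__main__":
    # Idealized step responses with the approximate maxima named in the text.
    projector = [0.0] * 100 + [7.0] * 50  # about 7 mV
    monitor = [0.0] * 100 + [2.0] * 50    # about 2 mV
    print(preprocess(projector)[-1], preprocess(monitor)[-1])
```

In this idealized case both traces end at the same standardized value, which shows why standardization makes setups with different absolute voltage levels comparable.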

**Figure 5** shows all single trial voltage changes from the two measurements of the monitor setup (light red for 150 Hz and light orange for 200 Hz) and the projector setup (light green) with the corresponding mean voltage changes in red, orange, and green. The light green lines, which correspond to one display change each in the projector setup, indicate relatively short display change latencies by a fast increase of mean and single trial voltages. Furthermore, the green lines indicate a low variability of the display change latencies in the projector setup. In contrast, in the monitor setup, the display change latencies of both refresh rates were prolonged and the variance of the single trials was markedly increased. Surprisingly, the two refresh rates of the monitor setup did not differ substantially, with the exception that the 150 Hz refresh rate tended to have especially prolonged display change latencies (up to 16 ms).

In the present investigation, we defined display change latency as the point in time at which the standardized voltage reached a threshold of 0.2. This threshold (gray line in **Figure 5**) was selected such that the noise before the display change could not reach it. The box-and-whisker plots at the bottom of **Figure 5** display the median (black vertical bar) and the 95% confidence interval of the display change latencies. The medians indicate that the majority of the display changes in the projector setup started at about 4 ms, whereas in the monitor setup with 200 and 150 Hz the majority of the display changes started at 11 and 13 ms, respectively. In addition, the display change latencies varied much more in both monitor setups than in the projector setup. Note that for the projector setup we measured not only the latency for switching on the second projector but also the latency for switching off the first projector, which was highly comparable, with a median latency of 4 ms.
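The threshold criterion can be sketched in code. This is an illustration under assumptions rather than the analysis script used in the study: the function names and the `trigger_index` parameter (the sample at which the TTL trigger occurred) are hypothetical, and each trace is assumed to be already standardized and baseline corrected.

```python
from statistics import median


def change_latency_ms(trace, trigger_index, threshold=0.2, sampling_rate_hz=1000):
    """Latency in ms from the trigger sample to the first sample >= threshold.

    Returns None if the threshold is never reached after the trigger.
    """
    for i in range(trigger_index, len(trace)):
        if trace[i] >= threshold:
            return (i - trigger_index) * 1000.0 / sampling_rate_hz
    return None


def median_latency_ms(traces, trigger_index, threshold=0.2, sampling_rate_hz=1000):
    """Median latency over all traces in which the threshold was reached."""
    latencies = [
        change_latency_ms(t, trigger_index, threshold, sampling_rate_hz)
        for t in traces
    ]
    valid = [latency for latency in latencies if latency is not None]
    return median(valid) if valid else None
```

At a sampling rate of 1000 Hz each sample corresponds to 1 ms, so the resolution of the latency estimate is 1 ms.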

Furthermore, each of the black horizontal bars right above the time axis of **Figure 5** indicates the start of one fixation of the eye tracking measurement in the 200 Hz monitor condition. These exemplary eye movement data illustrate the importance of fast display changes. If the projector setup had been used to present the sentences, the display change would have been too slow in only about 4 out of 100 trials (4%). In contrast, the display changes were too slow in about 17 (17%) and 22 (22%) trials in the monitor setup with 200 and 150 Hz, respectively.

### **DISCUSSION**

In the present paper, we compared a novel LED-based video projection system to a state-of-the-art CRT monitor in a gaze-contingent invisible boundary paradigm during sentence reading. As expected, the projector setup outperformed the monitor setup with respect to both median and range of display change latencies. Specifically, median latencies for the projector setup were about one-third of the median latencies for the monitor setup (4 ms compared to 11/13 ms). Furthermore, while the monitor setup latencies ranged up to 16 ms, the projector setup latencies did not exceed 5 ms. Thus, the very short display change latencies of the novel projector setup are much more likely to be finalized during a saccade compared to the display changes in the monitor setup. In the following, we will discuss some methodological considerations in gaze-contingent display change paradigms, the implications of our findings, and possible applications as well as practical aspects of the novel visual stimulation system.

Already two decades ago, it was argued that early implementations of gaze-contingent display change paradigms were likely to have suffered from technical problems, leading to the conclusion that results were partly no more than artifacts of the paradigm. Specifically, O'Regan (1990) proposed two kinds of technical problems related to stimulus presentation. First, because of temporal limitations of the eye tracking device, the experiment computer, and the refresh rate of the stimulus screen, display changes that were intended to take place during a saccade actually took place after the saccade ended. In other words, the display change actually took place during the next fixation. The delayed display change resulted in a flicker or contrast change during the subsequent fixation, which may have influenced sensory information extraction and, in turn, affected eye movement behavior (e.g., fixation duration).

Second, older monitors suffered from prolonged persistence of CRT phosphors, leading to afterglow effects. These effects resulted in smearing and reduced contrast of subsequently presented stimuli. The proposed concerns were directly addressed by Inhoff et al. (1998), who measured eye movement behavior in gaze-contingent display change paradigms for four different screen refresh rates. Furthermore, they compared a phosphor-based CRT with an electroluminescent panel, which should not suffer from erosion of contrast. Inhoff et al. (1998) found no evidence that technical artifacts compromise the results of gaze-contingent display change paradigms unless atypically slow refresh rates or relatively slow phosphors are used. Therefore, with high eye tracker sampling rates (1000 Hz and above), fast computers, and fast CRT monitors, technical limitations regarding stimulus presentation should not be a problem nowadays.

Nevertheless, the timing is sometimes too slow, and participants are able to detect gaze-contingent display changes. In an invisible boundary paradigm (involving a gaze-contingent display change), White et al. (2005) directly compared parafoveal preview effects in participants who were not aware of the change to effects in participants who were aware of the change. By using short and distinctive orthographically illegal previews (consonant strings), they increased the proportion of aware participants to one out of three. The results clearly demonstrated a qualitatively different pattern of eye movement data in readers who were aware of the display change compared to readers who were not aware. Thus, whether or not participants detect supposedly "invisible" gaze-contingent display changes has important implications for the interpretation of the findings of such experiments.

As already mentioned in the Introduction, Slattery et al. (2011) showed that detection of gaze-contingent display changes does not only depend on the timing of the display change relative to the end of the saccade but also on the position of the pre-boundary fixation relative to the to-be-changed target and the nature of the experimental manipulation of the target. Specifically, Slattery et al. (2011) used a gaze-contingent invisible boundary paradigm and manipulated the delay of the display change (0, 15, 25 ms) as well as the properties of the parafoveal preview (letter identity or letter case change). They found a complex pattern evidenced by a marked interaction between the timing of the display change and the relationship between the preview and target characteristics. Importantly, even without an artificial delay in the display change, detection sensitivity was influenced by the amount and quality of information that was changed between the pre- and post-boundary stimulus. In addition, proximity of the pre-boundary fixation to the boundary influenced display change detection sensitivity. To put it simply, participants' sensitivity to detect the display changes depended on the when, where, and what of the display changes. This complex interaction can lead to significant artifacts in the eye movement data, which would distort the experimental effects, thereby rendering interpretation of findings extremely difficult.

**FIGURE 5 |** An increase in standardized voltage indicates a display change measured by the photosensitive diode.

Nearly all studies from the last years employing invisible boundary paradigms during reading used conventional CRT monitors with a refresh rate between 150 and 200 Hz. Moreover, they reported mean display change latencies between 5 and 10 ms with a range up to 20 ms but the procedure for measuring these display change latencies was hardly ever described. In many cases, it seems like display change latencies were solely estimated based on the refresh rate of the monitor (e.g., assumed 5 ms latency for a 200 Hz monitor). However, as shown by our diode-based measurements, such assumptions are invalid because they only consider the best-case scenario (i.e., the command to change the display is sent immediately before a screen refresh cycle) and they do not take into account the processing time of hardware (e.g., eye tracker) and software (e.g., Experiment Builder) components of the experimental setup. In our hardware-based measurements, we found a median latency of 11 ms for a 200 Hz monitor. Therefore, it is most likely that the display change latencies were actually longer than reported in the studies, which, in turn, may have affected the results of these studies. Note that the presently identified relatively long latencies in the CRT setup cannot be due to an artifact of our measurement circuit because the combination of a real-time photosensitive diode (with a rise time *<*1 ms) together with 1000 Hz EEG recordings (enabling a 1 ms resolution) led to reasonably short latencies in the projector setup. In the projector setup, in contrast to the monitor setup, we found short latencies with little variability (range 4–5 ms).

As evidenced by the results of our measurements, the LED-based projector system provides a reliable tool for fast display changes. In so doing, it replaces slower and less reliable conventional CRT monitors, which are based on outdated technology. Today, many labs face the practical problem that they run out of reasonable visual stimulation equipment because CRT monitors have almost vanished from the market and widely available LCD monitors are limited to mostly 60 Hz, which are likely to result in much slower gaze-contingent display changes. Even modern projector systems with a refresh rate up to 120 Hz and recently developed gaming monitors with up to 144 Hz might be too slow for precise gaze-contingent display changes as they still require the construction of a new display, which involves several time-consuming processes (e.g., the response time of the monitor). In contrast, by switching between two already constructed displays, our projector system bypasses the construction of a new display. By offering a future-proof, advanced alternative based on LED technology, our system provides a more than feasible solution for precise visual stimulation.
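The refresh rates mentioned in this section translate into frame periods as follows. The sketch below is plain arithmetic with a hypothetical function name; it gives only the duration of one refresh cycle, which is the longest a change command may have to wait for the next refresh, and deliberately ignores the hardware and software processing time that the measurements show to be substantial.

```python
def refresh_period_ms(refresh_rate_hz: float) -> float:
    """Duration of one refresh cycle in milliseconds."""
    return 1000.0 / refresh_rate_hz


if __name__ == "__main__":
    # Refresh rates taken from the text: LCD, modern projector, gaming
    # monitor, and the two CRT settings used in the measurements.
    for rate in (60, 120, 144, 150, 200):
        print(f"{rate:>3} Hz -> one frame every {refresh_period_ms(rate):.1f} ms")
```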

Besides application in conventional eye tracking experiments, our system is particularly suitable for experiments with simultaneous registration of eye movements and brain electrophysiology. Brain electrophysiology (as measured by EEG or MEG) is extremely susceptible to visual stimulation. Therefore, even if a display change that takes place during a fixation is not consciously detected by the participant, it is highly likely that such a display change interferes with visual information extraction and, as a consequence, affects brain activity measures (e.g., event-related potentials). Therefore, rigorous control over exact timing of visual stimulation is absolutely necessary in experiments that combine eye tracking and brain activity methods. Besides application in electrophysiological studies, our system is especially suitable for visual presentation in fMRI studies. Functional MRI was previously excluded from high-speed visual stimulation because it is not possible to operate conventional CRT monitors in the strong magnetic field of an MRI scanner. In addition, as already mentioned, present LCD technology as implemented in MR-compatible monitors and video projectors is limited to a rather slow refresh rate of 60 Hz, which excludes fast and exactly timed visual presentation. Therefore, the present LED-based video projection system offers new possibilities for combined eye tracking and fMRI studies using gaze-contingent display change paradigms and other experiments that require fast and reliable visual presentation.

Possible applications of our projector system not only include invisible boundary experiments but also subtle temporal manipulations in short-time presentation or masked priming studies. Specifically, it should be possible to implement fine-grained variations of visual presentation durations with previously unrivaled precision. Certainly, these benefits are not limited to experiments in the domain of reading research but may also be relevant for other fields like visual object processing, attention, search, and scene perception.

Despite the potential benefits of our system, there are a number of issues that have to be carefully addressed by experimenters. The positioning of the two projectors requires delicate alignment of the respective projection areas. Once perfect pixel-to-pixel alignment is achieved, the projectors should be protected from further movement (e.g., by mounting in a solid box or cage). In addition, there should be a standard routine for checking the alignment of the projectors before data from participants is acquired. A further issue is related to matching of color and brightness of the two projectors. Although the two projectors are identical models from a single manufacturer, there are slight differences in these parameters. The differences have to be accommodated for by manual adjustment via the projectors' built-in settings. In our case, we used black on white presentation, thereby keeping the need to accommodate for color to a minimum. Moreover, brightness of each projector was set to 490 lx (as measured by a light meter). On the positive side, there is no measurable dimming or brightening on the screen when the projectors switch. Furthermore, there are no special demands on the computer's graphics card (a single VGA output is sufficient) and there is no extra software required (the TTL triggers can be sent by the Experiment Builder software). In principle, our system could be used for more than one display change per trial, but the temporal limit for every second change is determined by the projectors' refresh rate (60 Hz in our case).

### **CONCLUSION**

The present video projection system provides a solution for high-speed visual stimulation as required by many psychological and neuroscientific experiments. Because it is based on projectors, it may be used not only for behavioral, eye tracking, and electrophysiological studies but also for fMRI studies. By enabling high-temporal precision of display changes, it facilitates the realization of gaze-contingent paradigms and other time-sensitive visual experiments. In addition, our system offers completely novel possibilities for such experiments in fMRI.

### **ACKNOWLEDGMENTS**

This work was supported by the Austrian Science Fund (FWF P 25799-B23). We would like to thank Tim Kaiser for help with data acquisition and Julia Sophia Crone and Matthias Schurz for a helpful discussion about an earlier version of this manuscript.

### **REFERENCES**

(2011). Coregistration of eye movements and EEG in natural reading: analyses and review. *J. Exp. Psychol. Gen.* 140, 552–572. doi: 10.1037/a0023885

effect. *Int. J. Psychophysiol.* 80, 54–62. doi: 10.1016/j.ijpsycho.2011.01.013

Fixation based event-related fmri analysis: using eye fixations as events in functional magnetic resonance imaging to reveal cortical processing during the free exploration of visual images. *Hum. Brain Mapp.* 33, 307–318. doi: 10.1002/hbm.21211

Rayner, K. (1975). The perceptual span and peripheral cues in reading. *Cogn. Psychol.* 7, 65–81. doi: 10.1016/0010-0285(75)90005-5

Rayner, K. (2009). The Thirty Fifth Sir Frederick Bartlett Lecture: eye movements and attention during reading, scene perception, and visual search. *Q. J. Exp. Psychol.* 62, 1457–1506. doi: 10.1080/17470210902816461

processing: evidence from eye-fixation-related potentials. *Brain Lang.* 111, 101–113. doi: 10.1016/j.bandl.2009.08.004


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 March 2013; accepted: 05 June 2013; published online: 01 July 2013.*

*Citation: Richlan F, Gagl B, Schuster S, Hawelka S, Humenberger J and Hutzler F (2013) A new high-speed visual stimulation method for gaze-contingent eye movement and brain activity studies. Front. Syst. Neurosci. 7:24. doi: 10.3389/fnsys.2013.00024*

*Copyright © 2013 Richlan, Gagl, Schuster, Hawelka, Humenberger and Hutzler. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Eye movement related brain responses to emotional scenes during free viewing

### *Jaana Simola\*, Jari Torniainen, Mona Moisala, Markus Kivikangas and Christina M. Krause*

*Cognitive Science/Cognitive Brain Research Unit, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland*

#### *Edited by:*

*Sebastian Pannasch, Technische Universität Dresden, Germany*

#### *Reviewed by:*

*Preston E. Garraghty, Indiana University, USA; Melissa L.-H. Vo, Harvard Medical School, USA; Lauri Nummenmaa, Aalto University, Finland*

#### *\*Correspondence:*

*Jaana Simola, Cognitive Science/ Cognitive Brain Research Unit (CBRU), Institute of Behavioural Sciences, University of Helsinki, Siltavuorenpenger 1B, FI-00014 Helsinki, Finland e-mail: jaana.simola@helsinki.fi*

Emotional stimuli are preferentially processed over neutral stimuli. Previous studies, however, disagree on whether emotional stimuli capture attention preattentively or whether the processing advantage is dependent on allocation of attention. The present study investigated attention and emotion processes by measuring brain responses related to eye movement events while 11 participants viewed images selected from the International Affective Picture System (IAPS). Brain responses to emotional stimuli were compared between serial and parallel presentation. An "emotional" set included one image with high positive or negative valence among neutral images. A "neutral" set comprised four neutral images. The participants were asked to indicate which picture—if any—was emotional and to rate that picture on valence and arousal. In the serial condition, the event-related potentials (ERPs) were time-locked to the stimulus onset. In the parallel condition, the ERPs were time-locked to the first eye entry on an image. The eye movement results showed facilitated processing of emotional, especially unpleasant information. The EEG results in both presentation conditions showed that the LPP ("late positive potential") amplitudes at 400–500 ms were enlarged for the unpleasant and pleasant pictures as compared to neutral pictures. Moreover, the unpleasant scenes elicited stronger responses than pleasant scenes. The ERP results did not support parafoveal emotional processing, although the eye movement results suggested faster attention capture by emotional stimuli. Our findings, thus, suggested that emotional processing depends on overt attentional resources engaged in the processing of emotional content. The results also indicate that brain responses to emotional images can be analyzed time-locked to eye movement events, although the response amplitudes were larger during serial presentation.

**Keywords: attention, emotion, EEG, eye movements, co-registration, fixation-related potentials, free viewing, LPP**

### **INTRODUCTION**

Real world scene viewing is an active process during which viewers select regions of scenes that will be processed in detail by prioritizing highly salient and unexpected stimuli at the expense of other stimuli and ongoing neural activity. Converging research evidence supports a processing advantage for emotional stimuli, indicating that humans are able to detect emotional content rapidly among other salient stimuli in order to activate motivational resources for approach or avoidance (Crawford and Cacioppo, 2002; Vuilleumier, 2005; Olofsson et al., 2008). Although there is a vast amount of research showing that attention is efficiently drawn toward emotional stimuli, the current theories disagree on the role of attention in emotional processing and the actual time course of attention and emotion processes.

One line of research suggests that emotional stimuli automatically activate brain regions largely independent of attentional control. For example, visual search studies propose that emotional detectors work preattentively by directing attention automatically toward threat without conscious effortful processing. These studies have shown that potentially threatening stimuli are found efficiently among neutral distractors (Öhman et al., 2001; Blanchette, 2006; Fox et al., 2007). EEG studies recording steady-state visual evoked potentials (ssVEPs) have also suggested that non-attended emotional information modulates brain activity independent of the focus of spatial attention. This modulation occurs especially when the emotional content is presented in the left visual field (Keil et al., 2005). A decrease in the amplitudes of the ssVEPs and in target detection rates have also been observed when the primary attentional task (detecting coherent motion of dots) is superimposed over pictures of emotional scenes as compared to neutral scenes (Hindi Attar et al., 2010). Taken together, these studies support the view that affective processing can occur without allocation of attentional resources, and that emotional processing precedes semantic processing (i.e., *the affective primacy hypothesis*).

The assumption that emotional stimuli capture attention automatically has been challenged by studies suggesting that prior to affective analysis, the features of objects must be integrated and the objects must be categorized and identified (reviewed in Cave and Batty, 2006; Storbeck et al., 2006). These studies support the *cognitive primacy hypothesis*, which states that identifying an object is a necessary prerequisite for evaluating its significance. For example, brain responses to emotional facial expressions have been shown to depend on sufficient attention resources being available to process the faces (Pessoa et al., 2002). These results demonstrate that responses to foveally presented emotional expression disappear when attention is directed in detecting the orientation of peripherally presented bars. Moreover, Holmes et al. (2003) have shown an enhanced positivity in event-related potentials (ERPs) as a response to fearful relative to neutral faces only when attention is directed toward the face stimuli, while the emotional expression effect is completely eliminated in trials where faces were unattended. The data by Acunzo and Henderson (2011) also failed to demonstrate any automatic "pop-out" effect of emotional content by showing no differences in latencies of the first fixations to emotional and neutral objects within scenes. However, once the emotional items were fixated, they held attention longer than neutral objects. In sum, these studies argue against the preattentive view of emotional processing. What they posit instead is that the detection and processing of emotional information depends on the current locus of spatial attention.

In addition to the opposing views about the automatic processing of emotional content, other studies support a more flexible view of automaticity (see Moors and De Houwer, 2006). Eye movement studies have shown that encoding of emotional valence can take place even when affective processing is not relevant for the task (i.e., when participants are supposed to report the semantic category of the images) (Calvo and Nummenmaa, 2007). Moreover, emotional pictures are more likely to be fixated earlier than neutral pictures (e.g., Calvo and Lang, 2004; Nummenmaa et al., 2006), even when participants are instructed to fixate the neutral image (Nummenmaa et al., 2006). These results suggest that processing of affective information is facilitated over perceptual and semantic information. However, the facilitation of affective responses by emotionally congruent primes depended on pre-exposure to the primes (Calvo and Nummenmaa, 2007; Calvo and Avero, 2008), suggesting that the degree of awareness of the unattended stimulus valence affects affective priming. Furthermore, a gradual increase in affective priming occurred when the parafoveal primes were pre-exposed foveally as compared to when the primes were pre-exposed parafoveally (Calvo and Nummenmaa, 2007). Studies have also shown that when the primary task is more difficult, automatic orienting to emotional stimuli diminishes (Calvo and Nummenmaa, 2007; Becker and Detweiler-Bedell, 2009). Moreover, the exogenous drive of attention to emotional content disappears when emotional items are embedded in a scene, a condition in which the foveal and perceptual load is high (Acunzo and Henderson, 2011). These findings support a view that emotional processing can be fast, involuntary and performed in parallel with unrelated foveal tasks, but that emotional processing is sensitive to regulatory attentional influences (Vuilleumier, 2005).

Neurophysiology and neuroimaging results demonstrate that selective attention in perception is mediated by enhanced processing in sensory pathways (Vuilleumier, 2005). Studies recording ERPs have shown that in addition to the early sensory components (e.g., N1/P1 and N2/P2), picture emotionality is reflected as an "early posterior negativity" (EPN) difference between emotional and neutral stimuli, and as an enhanced "late positive potential" (LPP) component during processing of affective as compared to neutral stimuli (reviewed in Olofsson et al., 2008). The LPP is a sustained P300-like component that has an onset at around 250 ms post-stimulus and a posterior midline scalp distribution (Hajcak and Olvet, 2008). Similar to the P300, which is larger for attended than unattended stimuli, the enhanced LPP reflects greater attention to emotional stimuli (Cuthbert et al., 2000; Schupp et al., 2000, 2007). Prior research indicates that emotional information is highly salient and therefore also detected in the visual periphery. Parafoveally/peripherally presented emotional stimuli modulate both eye movement (Nummenmaa et al., 2009; Coy and Hutton, 2012) and ERP responses. For example, a modulation of the early and late ERPs by picture emotionality occurred also when pictures were presented in the peripheral vision (up to 8° eccentricity) and with short exposure times that prevent saccadic eye movements (De Cesarei et al., 2009).

In the present study, we investigated the time course and role of attention in emotional processing. In particular, we were interested in how attention is directed to emotional content during a free viewing task. Previous ERP studies have used paradigms that investigate the neural responses to emotional visual stimuli presented in isolation and separated by unnaturally long inter-stimulus intervals. Neuronal activity under free viewing may, however, differ significantly from what is observed under restrictive stimulus conditions. Thus, it is not clear how well the neural responses obtained in constrained experimental conditions could explain the responses under natural oculomotor behavior, because natural visual processing is often motivated by specific goals, or the internal states of the viewer (Maldonado et al., 2009).

There is a rapidly growing interest in the use of co-registration of eye movements and EEG to study brain mechanisms during free viewing (see Baccino, 2011). In the analysis of co-registered data (i.e., the eye-fixation-related potential, EFRP analysis), the EEG signal is segmented based on eye movement events. Previous research using co-registration of eye movements and EEG has reported corresponding ERP data during unconstrained viewing conditions as compared to serial visual presentation (SVP) (Hutzler et al., 2007; Dimigen et al., 2011). Co-registration studies have also shown that parafoveal processing affects the ERP-responses at current fixation in reading (Dimigen et al., 2012) and reading-like tasks (Baccino and Manunta, 2005; Simola et al., 2009). Moreover, an earlier onset of the N400 was observed during natural reading than in SVP, possibly due to the parafoveal preview obtained in natural reading (Dimigen et al., 2011). In scene perception, information around the current fixation can be acquired from a wider region than during reading (see Rayner and Castelhano, 2008). This is especially evident in studies investigating attention to emotional stimuli (De Cesarei et al., 2009; Nummenmaa et al., 2009; Coy and Hutton, 2012). The high saliency of parafoveal information may constrain the use of the co-registration technique in emotional scene perception tasks. Therefore, the second aim of the present study was to validate the co-registration technique when participants were exposed to emotional scenes. Previous studies using co-registration of eye movements and EEG have mainly considered word recognition and reading processes (Baccino and Manunta, 2005; Simola et al., 2009; Dimigen et al., 2011, 2012). To our knowledge, no previous research has used the co-registration technique during free viewing of emotional scenes.
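The segmentation step of an EFRP analysis can be illustrated with a minimal sketch. It is not the pipeline used in the present study: the function names are hypothetical, a single EEG channel is assumed, and fixation onsets are assumed to be already expressed as sample indices on the EEG time base (i.e., the two recordings are synchronized).

```python
def fixation_locked_epochs(signal, fixation_onsets, pre, post):
    """Cut one epoch per fixation onset from a single-channel signal.

    signal: sequence of samples; fixation_onsets: sample indices;
    pre/post: number of samples kept before and after each onset.
    Onsets whose window would exceed the recording are skipped.
    """
    epochs = []
    for onset in fixation_onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal):
            continue
        epochs.append(list(signal[start:stop]))
    return epochs


def average_epochs(epochs):
    """Average equally long epochs sample by sample."""
    if not epochs:
        return []
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]
```

Averaging the epochs yields a fixation-locked waveform in the same way that stimulus-locked epochs yield a conventional ERP; the sketch leaves out artifact correction and the handling of overlapping responses from successive fixations.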

The use of the co-registration technique involves several technical challenges including, for example, (i) the artifacts in EEG recordings caused by eye movements, (ii) accurate hardware synchronization between the eye movement and EEG data sets, (iii) temporal overlap between background EEG and fixation-evoked ERPs as well as the temporal overlap of potentials elicited by successive fixations, and (iv) the phase differences of ERP responses due to systematic differences in eye movement variables. However, previous research suggests that most of these technical problems are solvable (see Dimigen et al., 2011; Kliegl et al., 2012). We will discuss later how these problems were minimized in the present setup. Despite the technical challenges, the co-registration technique provides a valuable tool for understanding the relation between oculomotor and brain electrical signals during cognitive processing. Using eye movements to segment the brain potentials helps to study brain activity under self-paced perceptual and cognitive behavior during free viewing tasks. This is relevant because, even though eye movements can provide indicators of cognitive processing under naturalistic viewing conditions, the eye movement data do not inform us about the time course of the underlying processes that occur within subsequent fixations. Further, the combination of eye movement and EEG methods makes it possible to investigate both spatial and temporal aspects of visual attention simultaneously.

In the present study, attention to emotional stimuli was investigated by recording eye movement related ERP responses while participants performed visual search tasks to determine whether a group of scenes was entirely neutral or whether there was an emotional scene among the neutral scenes. The stimulus material was presented in two conditions. That is, the participants saw sets consisting of four images either serially or in parallel. An "emotional" set included one image with highly pleasant or unpleasant content among neutral images. A "neutral" set consisted of four neutral images. A visual search paradigm was selected because it is a typical setup used in studies of emotional processing (Öhman et al., 2001; Flykt, 2005; Blanchette, 2006; Fox et al., 2007). In contrast to many previous ERP studies investigating parafoveal/peripheral processing of emotional content (e.g., Rigoulot et al., 2008), participants were allowed to move their eyes freely across the stimulus images. This kind of task condition permitted a natural foveal load across fixations (see Acunzo and Henderson, 2011).

Previous studies suggest that differences in tasks and measures may influence the effects of attention to emotional stimuli (e.g., Lipp et al., 2004; Blanchette, 2006). To ensure a fair comparison of the results from different data sets and to allow within-participant comparisons between the two presentation conditions, for each participant we collected different data (i.e., behavioral, eye-tracking and ERP measures) during the same recording session. In the parallel condition, participants' eye movements and EEG were recorded simultaneously. Eye movement recordings allowed a comparison of results to previous eye movement studies of emotional processing (Calvo and Lang, 2004; Nummenmaa et al., 2006, 2009). Importantly, co-registration of eye movements and ERP responses permitted the analysis of brain responses time-locked to eye movement events. In order to validate the co-registration technique, the responses from parallel presentation were compared to the results from the serial condition and previous findings from the SVP studies (reviewed in Olofsson et al., 2008). The expectation was that if co-registration of eye movements and EEG is a valid technique to measure responses to emotional scenes, similar responses would occur in both presentation conditions. That is, we expected the LPP as a response to emotional processing in the serial presentation as well as in the parallel presentation when the ERPs were time-locked to the first entry of the target image.

Further, the emotional information is likely to be processed, at least to a certain degree, before the eyes have landed on the region of the emotional content. In order to examine the time course of emotional processing (i.e., the detection and parafoveal processing of emotional content), the ERP responses in the parallel condition were also time-locked to the stimulus onset. Since covert attention may be allocated to the emotional content when eyes are directed elsewhere on the stimulus (see Calvo and Lang, 2004), peripheral attention to emotional stimuli was expected to become visible in the ERP responses before the eyes move to the target image.

Facilitated attention has been reported in association with both pleasant and unpleasant stimuli (e.g., Nummenmaa et al., 2009; Coy and Hutton, 2012). These studies support "the emotionality hypothesis" by showing that attention is drawn to emotional information regardless of its emotional valence. On the basis of existing studies (e.g., Calvo and Lang, 2004; Nummenmaa et al., 2006, 2009), we expected that in the parallel presentation condition both pleasant and unpleasant pictures would be attended faster and for longer durations than neutral stimuli. In both presentation conditions, attention to emotional stimuli was also expected to elicit increased LPP responses for pleasant and unpleasant as compared to neutral pictures.

In addition to the "emotionality hypothesis," several studies have reported that the valence of the stimulus determines how fast it is likely to capture attention. These studies have found that attention is automatically drawn to negative information more strongly than to positive information (Ito et al., 1998; Crawford and Cacioppo, 2002; Smith et al., 2003). This phenomenon is referred to as "the negativity effect" (or "the negativity hypothesis"). The evaluation of threat (or fear) may be the underlying component of this mechanism, and it may have developed during evolution as a survival mechanism (Öhman et al., 2001; Carretié et al., 2009). Both behavioral and ERP studies have found support for the negativity effect. For example, the results from recognition and recall memory tests suggested that negative stimuli were remembered better than positive or neutral regions of the scenes (Humphrey et al., 2012). Also, a larger and more sustained LPP was elicited by unpleasant than by pleasant stimuli (Ito et al., 1998; Smith et al., 2003; Hajcak and Olvet, 2008). Based on earlier studies, we also expected a negativity effect, reflected in more strongly facilitated eye movement and ERP responses to unpleasant than to pleasant stimuli.

#### **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Eleven volunteers [right-handed, 6 female, mean age: 21.3 ± 1.27 (SD)] with normal or corrected-to-normal vision participated in the experiment. All participants provided written informed consent and were informed about the possibly provocative content of the stimuli prior to the experiment. The participants reported no history of mental illness or neurological injury and were not on medication.

#### **STIMULI**

The stimuli were 160 images selected from the International Affective Picture System (IAPS) (Lang et al., 2008). From the stimulus material, trials consisting of four images were generated. In a *pleasant* trial, one of the four images depicted people experiencing positive affect. In an *unpleasant* trial, one of the images was unpleasant, and presented people suffering serious threat or harm. In a *neutral* trial, four neutral images, showing daily non-emotional activities, were presented. The stimulus groups were selected such that there was no overlap in IAPS normative valence ratings between the categories. Mean valence ratings on 9-point scales were as follows: pleasant: 7.2 ± 2.4 (SD), unpleasant: 2.0 ± 1.5 (SD), neutral: 6.0 ± 1.5 (SD). Mean arousal ratings per stimulus group were the following: pleasant: 6.7 ± 2.0 (SD), unpleasant: 6.5 ± 1.9 (SD), neutral: 3.9 ± 2.3 (SD). Appendix A lists the images used in this study.

Stimulus size was 560 × 420 pixels and the images subtended 15° horizontally and 11.4° vertically. In the serial condition, the images were presented at the center of the screen. In the parallel condition, the image size was identical, and the stimuli were presented symmetrically in the centers of the quadrants of the screen. The corner of each image closest to the screen center was at a distance of 4.32°. Stimuli were presented on a 22-inch screen with a screen resolution of 1680 × 1050 pixels.

Previous research shows that low-level saliency guides our eye movements when inspecting a scene. We calculated the low-level image properties for the stimuli in order to control for the possibility that the effects of emotional valence on eye movements and EEG responses would merely be a result of differences in the low-level visual features between neutral and emotional images (see e.g., Delplangue et al., 2007). The complexity of the images was assessed in terms of the size of each compressed JPEG image in kilobytes (Donderi, 2006). The percentage of the area covered by faces was assessed for each image using ImageJ software, since human faces capture attention especially effectively (Calvo and Lang, 2005). Moreover, the percentage of images containing human faces was calculated per emotional condition. The brightness and saturation levels per pixel were calculated for each image, and the skewness (i.e., the lack of symmetry of the intensity value distributions) and kurtosis (i.e., the pointiness of the distribution) were assessed for each color layer (red, green and blue). The mean scores and standard deviations for the low-level image characteristics are presented in **Table 1**.

**Table 1 | Mean scores (and standard deviations) of the low-level image features for the emotional and neutral stimuli.**


A one-way analysis of variance (ANOVA) showed no differences in image complexity, brightness, skewness or kurtosis between the three image categories (*p* > 0.05). The ANOVA showed that the percentage of images containing faces differed between the emotional conditions [*F*(2, 153) = 21.25, *p* < 0.001]. Follow-up *t*-tests suggested differences between unpleasant and neutral [*t*(112) = 5.43, *p* < 0.001] and between pleasant and neutral conditions [*t*(117) = 5.52, *p* < 0.001], while pleasant and unpleasant conditions did not differ in the occurrence of face images. However, it should be noted that the ANOVA for face area did not show any difference between emotional conditions. This was because the unpleasant and pleasant conditions contained more pictures depicting human faces photographed from long distances, whereas the images that contained faces in the neutral condition were mostly portraits taken from a short distance. The ANOVA for saturation levels showed a slight effect between the stimulus categories [*F*(2, 153) = 3.15, *p* = 0.046], but *post-hoc* comparisons revealed no differences between the single image categories.

We also computed a saliency map for each four-image combination using the Saliency Toolbox (Walther and Koch, 2006) to further control for possible bottom-up saliency effects in the parallel condition. A dyadic Gaussian pyramid was used for subsampling and three iterations were run for normalization. From the resulting saliency map the most salient location was extracted. In 21% of unpleasant trials, the emotional target was the most salient image, and in 23% of the pleasant trials, the target was the most salient image. A one-way ANOVA revealed no differences (*p* > 0.10) in the percentages of the most salient target images between the two conditions. These analyses suggested that low-level saliency could account for the attention effects to emotional targets at less than the chance level of 25% for a four-image display.
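The saliency control described above can be illustrated with a minimal sketch. It assumes that a saliency map is already available as a two-dimensional grid of values covering the whole four-image display (the map itself would come from the Saliency Toolbox, which is not reproduced here). The function names and the quadrant coding are hypothetical and serve illustration only.

```python
# Sketch only: the saliency map is assumed to be given as a list of rows.
# Quadrant coding (an assumption): 0 = upper left, 1 = upper right,
# 2 = lower left, 3 = lower right.

CHANCE_LEVEL = 1 / 4  # four images per display


def most_salient_quadrant(saliency):
    """Return the quadrant that contains the maximum of the saliency map."""
    n_rows, n_cols = len(saliency), len(saliency[0])
    best_r, best_c = max(
        ((r, c) for r in range(n_rows) for c in range(n_cols)),
        key=lambda rc: saliency[rc[0]][rc[1]],
    )
    lower = best_r >= n_rows / 2
    right = best_c >= n_cols / 2
    return 2 * int(lower) + int(right)


def target_most_salient_rate(trials):
    """Proportion of trials in which the target quadrant is the most salient.

    trials: list of (saliency_map, target_quadrant) pairs.
    """
    hits = sum(most_salient_quadrant(m) == q for m, q in trials)
    return hits / len(trials)
```

A proportion below `CHANCE_LEVEL`, as with the 21% and 23% reported above, indicates that the emotional target was not selected by bottom-up saliency more often than any other image would be by chance.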

#### **PROCEDURE**

**Figure 1A** presents the trial structure in the serial condition. Each image was presented for 3 s, followed by a central fixation cross, presented on a gray background for 2–4 s. The task of the participants was to view the images. After each trial, they were asked to indicate by "yes"/"no" responses whether they detected an emotional image among the four images. If they clicked "yes," the same four images were presented simultaneously on the screen, and the participants were asked to indicate the emotional image by a mouse-click. Subsequently, they were asked to rate the selected image on two 9-point scales (**Figure 1C**). The first scale measured the valence of the scenes from very unhappy (1) to very happy (9). The second scale measured the arousal of the scenes from very calm (1) to very excited (9). The emotional image appeared equally often, but randomly, as the first, second, third or fourth image of the trial. The trial length was approximately 20–30 s. In the serial condition, 40 pleasant, 40 unpleasant, and 4 neutral trials were presented. The number of neutral trials was kept intentionally low, because otherwise the experiment would have been unnecessarily long. For the ERP analyses in the serial condition, the images for the neutral condition were selected randomly among images that preceded the emotional targets in the serial trials.

**FIGURE 1 | (A)** An example trial from the serial viewing condition with an unpleasant target (plane crash) as the second image of the sequence. Each image was presented for 3 s followed by a central fixation cross on a gray background for 2–4 s (only one fixation cross is shown in this image). **(B)** A stimulus sequence from the parallel (free viewing) condition with a pleasant target image (people around waterfall) in the upper left corner. Participants had an unconstrained viewing of the stimulus images during which their eye movements were tracked. Before each stimulus set a central fixation cross was presented for 3–5 s. **(C)** The 9-point self-assessment manikin (SAM) scales were used to evaluate the emotional valence (upper panel) and arousal (lower panel) of the selected target image. \*Note that none of the example images are part of the experimental stimulus material.

The parallel condition consisted of 40 pleasant, 40 unpleasant and 40 neutral trials (**Figure 1B**). Participants were instructed to look through all four images in a trial freely and to respond by clicking a mouse when they were ready to continue to the next trial. The trial length in the parallel condition thus varied. Participants' eye movements were recorded only in the parallel condition. Between each four-image set a central fixation cross was presented for 3–5 s. After the presentation of the images, participants were asked to indicate whether they saw an emotional image in the four-image set. Similar to the serial condition, if they answered "yes," the same set was presented again. The participants were then asked to indicate the selected image by a mouse-click and to rate that image using the valence and arousal scales. The serial and parallel conditions were each presented in two blocks. The order of the blocks was counterbalanced across the participants.

#### **DATA ACQUISITION**

During recordings, participants were comfortably seated in an electrically shielded room, and the stimuli were presented on a 22-inch display. The EEG signal was recorded from 64 scalp sites using an elastic cap by BioSemi (BioSemi Inc., Amsterdam, The Netherlands) with the BioSemi ABC position system montage of Ag-AgCl active electrodes. Additionally, three active electrodes were placed at the tip of the nose and at the left and right mastoids. Blinks and eye movements were monitored by two bipolar leads. The electrodes were connected to a BioSemi ActiveTwo EEG amplifier. EEG data were recorded using BioSemi ActiView, and the signal was amplified and digitized at a rate of 2048 Hz.

In the parallel presentation condition, participants' eye movements were recorded concurrently with the EEG recordings, using a remote iView X™ RED250 (SensoMotoric Instruments, SMI, Teltow/Berlin, Germany) eye tracker. Positions of both eyes were sampled at 250 Hz from a viewing distance of 60 cm. Before each block, a 9-point calibration was performed. Eye movement and EEG recordings were synchronized with the stimulus sequence by Presentation™ software (Neurobehavioral Systems Inc., Albany, CA, USA). Accurate hardware synchronization is a basic requirement for the analysis of eye movement related brain potentials, since the latency and location of the gaze data are the basis for segmenting the ERP responses. In the current setup, the Presentation software was programmed to send a shared pulse to both datasets every few seconds. This ensured that the time points in the eye movement data corresponded with the EEG data.
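One common way to use such shared pulses offline is to map each eye-tracker timestamp onto the EEG time axis by interpolating linearly between the two neighboring pulses. The following sketch illustrates that idea under the assumption that the pulse times have already been extracted from both recordings; the function name and data layout are hypothetical, and the text does not specify how the alignment was actually computed in the present setup.

```python
from bisect import bisect_right


def eye_time_to_eeg_sample(t_eye, pulses_eye, pulses_eeg):
    """Map an eye-tracker timestamp to an EEG sample index.

    pulses_eye: sorted eye-tracker timestamps (ms) of the shared pulses.
    pulses_eeg: EEG sample indices at which the same pulses were recorded.
    The mapping interpolates linearly between the two neighboring pulses,
    which also absorbs slow clock drift between the two devices.
    """
    if len(pulses_eye) != len(pulses_eeg) or len(pulses_eye) < 2:
        raise ValueError("need at least two matching pulses in both recordings")
    # Index of the last pulse at or before t_eye, clamped to a valid segment.
    i = bisect_right(pulses_eye, t_eye) - 1
    i = min(max(i, 0), len(pulses_eye) - 2)
    t0, t1 = pulses_eye[i], pulses_eye[i + 1]
    s0, s1 = pulses_eeg[i], pulses_eeg[i + 1]
    return round(s0 + (t_eye - t0) * (s1 - s0) / (t1 - t0))
```

With pulses one second apart and an EEG sampling rate of 2048 Hz, an eye movement event halfway between two pulses is mapped to the sample 1024 samples after the first of them.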

#### **EYE MOVEMENT ANALYSIS**

Fixations and saccades were extracted from the raw eye coordinate data using an adaptive saccade detection algorithm (Nyström and Holmqvist, 2010). The initial parameters given to the algorithm were a velocity threshold of 100°/s, a minimum duration of 10 ms for saccade detection and a minimum duration of 40 ms for fixation detection.
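To make the role of these parameters concrete, the sketch below shows a plain velocity-threshold detector. It is a deliberate simplification and not the adaptive algorithm of Nyström and Holmqvist (2010), which filters the signal and adjusts the threshold to the noise level of each recording; here the threshold stays fixed at its initial value, and the minimum fixation duration is not applied. Gaze positions are assumed to be already converted to degrees of visual angle, and all names are hypothetical.

```python
import math


def detect_saccades(x_deg, y_deg, fs=250.0, vel_thresh=100.0, min_sacc_ms=10.0):
    """Return (start, end) sample index pairs of candidate saccades.

    x_deg, y_deg: gaze position in degrees of visual angle, one value per sample.
    A saccade is a run of consecutive samples whose point-to-point velocity
    exceeds vel_thresh (deg/s) for at least min_sacc_ms milliseconds.
    """
    dt = 1.0 / fs
    # velocity[i] is the angular speed between samples i and i + 1
    velocity = [
        math.hypot(x_deg[i + 1] - x_deg[i], y_deg[i + 1] - y_deg[i]) / dt
        for i in range(len(x_deg) - 1)
    ]
    saccades, start = [], None
    for i, v in enumerate(velocity + [0.0]):  # trailing 0.0 closes an open run
        if v > vel_thresh and start is None:
            start = i
        elif v <= vel_thresh and start is not None:
            if (i - start) * dt * 1000.0 >= min_sacc_ms:
                saccades.append((start, i))
            start = None
    return saccades
```

At a sampling rate of 250 Hz each sample spans 4 ms, so the 10 ms criterion requires at least three consecutive above-threshold velocity samples.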

The eye movement data were analyzed with a 3 × 4 repeated-measures ANOVA with the emotional condition (pleasant, unpleasant, neutral) and the four quadrants in which the image was presented as within-participants factors. In order to preclude the possibility that parafoveal processing of emotional content would affect the processing of neutral images in the parallel condition, the neutral condition consisted of random images selected from the neutral trials. Early orienting of attention to images was measured as the target entry times and as the number of fixations before the first target entry from the stimulus onset. In addition, engagement of attention was measured as the number of fixations and the total dwell times (sum of fixation durations including re-fixations) per image. Moreover, we compared the likelihood of launching the first saccade toward the target image and the initial saccade latency between the emotional conditions.
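The orienting and engagement measures defined above can be derived from the sequence of fixations in a trial. The sketch below assumes that each fixation has already been assigned to one of the four image quadrants (or to none) and that the entry time is taken as the onset of the first fixation on the target; both the data layout and this operationalization of the entry time are assumptions made for illustration.

```python
def target_measures(fixations, target_quadrant):
    """Compute orienting and engagement measures for one trial.

    fixations: list of (onset_ms, duration_ms, quadrant) tuples in temporal
        order, with onsets relative to stimulus onset and quadrant set to
        None for fixations outside all images.
    Returns None if the target was never fixated.
    """
    on_target = [f for f in fixations if f[2] == target_quadrant]
    if not on_target:
        return None
    first_idx = next(
        i for i, f in enumerate(fixations) if f[2] == target_quadrant
    )
    return {
        # orienting of attention
        "entry_time_ms": fixations[first_idx][0],
        "fixations_before_entry": first_idx,
        # engagement of attention (re-fixations included)
        "n_fixations": len(on_target),
        "dwell_time_ms": sum(f[1] for f in on_target),
    }
```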

#### **CONTROL ANALYSIS FOR OCULOMOTOR FACTORS**

A critical factor in the analysis of eye movement related brain potentials is to control for the systematic differences in oculomotor variables (e.g., saccade amplitudes and fixation durations), since systematic differences in eye movement measures are coupled with changes in the phases of the overlapping potentials (Dimigen et al., 2011). Previous research has also shown that saccade kinematics influence the amplitudes and waveforms of the eye fixation related potentials (EFRPs). That is, the amplitude of the spike potential (SP, an electrical eye muscle activity at saccade onset) increases with saccade size (see Keren et al., 2010; Dimigen et al., 2011). Moreover, the amplitudes of incoming saccades have been shown to influence the Lambda response, a response elicited by the afferent information inflow at the beginning of a fixation (Kazai and Yagi, 2003). The SP amplitudes gradually diminish from extra-ocular channels toward posterior sites. However, their scalp topography is strongly modulated by the direction of saccades, with the scalp distributions biased toward the saccade direction. Therefore, the influences caused by differences in eye movement patterns need to be controlled in the EFRP analyses. We calculated the directions, amplitudes, and durations of pre-target saccades across the emotional conditions in the parallel condition (**Table 2**). In the stimulus onset-locked averaging of ERPs (in the serial condition), the effect of the SP is nearly eliminated due to the latency jitter of the biphasic deflections. In order to control for the possible artifacts caused by within-image saccadic eye movements, we also calculated the number of within-image saccades and their amplitudes and directions in the 500 ms time window that was critical for the ERP analysis in the parallel condition (**Table 2**).

Moreover, to further control for possible associations between eye movement variables and the amplitudes of the brain potentials, parafoveal processing of emotional stimuli was examined at a single-trial level. This was done by including pre-target saccade amplitudes and first target-fixation durations as covariates in a (conditional) linear mixed model (Hox, 2002, implemented in SPSS Version 20 for Mac, IBM Corporation, New York, United States) of the single-trial ERP amplitudes selected around the first target entry.

#### **EEG ANALYSES**

EEG data were processed with BESA (Version 5.2; MEGIS Software, Graefelfing, Germany). Amplified voltages originally referenced to the nose were re-referenced offline to linked mastoids, resampled to 512 Hz and filtered offline with a 0.5–40 Hz band pass.
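Re-referencing to linked mastoids amounts to subtracting, at every sample, the mean of the two mastoid signals from each channel. The sketch below illustrates only this step (not the resampling or filtering, and not the BESA implementation); the channel labels `M1` and `M2` for the left and right mastoids and the dictionary layout are assumptions.

```python
def rereference_to_linked_mastoids(data, left="M1", right="M2"):
    """Re-reference nose-referenced EEG to the average of the two mastoids.

    data: dict mapping channel name to a list of voltage samples, all recorded
        against the same original reference and all of equal length.
    Returns a new dict; the input is left unchanged.
    """
    # New reference signal: sample-wise mean of the left and right mastoids.
    reference = [(l + r) / 2.0 for l, r in zip(data[left], data[right])]
    return {
        channel: [v - ref for v, ref in zip(signal, reference)]
        for channel, signal in data.items()
    }
```

Because the original reference contributes equally to every channel, it cancels in the subtraction, so the result does not depend on the nose electrode.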

The fluctuating electrical fields produced by eyelid movements and the rotation of the eyeball's corneoretinal dipole propagate to EEG electrodes and contaminate the recording of brain activity (Berg and Scherg, 1991; Rugg and Coles, 1995; Plöchl et al., 2012). Ocular artifacts make the analysis of eye movement related brain potentials challenging. One way to avoid the ocular artifacts is to restrict the EFRP analyses to the fixation period when the eye is relatively still (Baccino and Manunta, 2005; Simola et al., 2009). When the analysis is restricted only to the fixation period, it is possible to analyze the early sensory ERP components such as the P1/N1 or P2/N2 components (Olofsson et al., 2008). However, because the oculomotor and cognitive systems are partly independent, the eyes can leave the target before processing is completed (see Kliegl et al., 2012) and as a result, some events of interest occur at latencies that exceed the fixation duration. For example, in reading there is a discrepancy between the typical fixation durations (200–250 ms) and the latency of the N400 component, a robust measure of semantic processing that peaks around 400 ms post-stimulus (Kutas and Hillyard, 1980).

**Table 2 | Means (and standard deviations) of the affective ratings and eye movement measures across the emotional conditions.**


Thus, in normal reading situations, the eyes have already left the word when the N400 related to that word peaks. Despite this fact, the N400 has been successfully studied during normal reading by using the EFRP analysis method (Dimigen et al., 2011; Kliegl et al., 2012). Such conditions require careful ocular artifact correction that spares the genuine brain activity. In the present study, corneoretinal eye movement artifacts were corrected using a principal component analysis (PCA)-based spatial filter (Ille et al., 2002). In order not to remove brain activity related to the stimulus processing (see Dimigen et al., 2011), we defined the representative PCA components for eye blink and eye movement artifacts manually outside the experimental trials.<sup>1</sup> Other remaining artifacts were removed automatically with a ±160 μV rejection level.

In the serial condition, the EEG signal was time-locked to the stimulus onset and segmented into epochs extending from −200 to 1500 ms around stimulus onset. The epochs were baseline corrected relative to the 100 ms pre-stimulus interval. In the parallel condition, the EEG data were time-locked to the point at which the eyes first entered an emotional image in pleasant and unpleasant trials or a randomly selected image in the neutral trials. The EEG was segmented into epochs from −200 to 1500 ms that were baseline corrected relative to the −200 to −100 ms interval before the first eye entry to the target image. The baseline interval was placed before the saccade onsets in order to avoid temporal overlap with the saccadic spike potentials. Moreover, to investigate the time course of parafoveal processing of emotional stimuli, the ERP responses in the parallel condition were also time-locked to the stimulus onset and segmented into epochs of −200 to 1500 ms with a 100 ms pre-stimulus baseline. In both conditions, the data were averaged according to the emotional condition: pleasant, unpleasant, or neutral.
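The segmentation and baseline correction described above can be sketched for a single channel as follows. The default values mirror the entry-locked epochs of the parallel condition (−200 to 1500 ms, baseline −200 to −100 ms, 512 Hz); the function name, the list-based signal and the handling of epochs that run past the recording are assumptions for illustration.

```python
def extract_epoch(signal, event_sample, fs=512.0,
                  tmin_ms=-200.0, tmax_ms=1500.0, base_ms=(-200.0, -100.0)):
    """Cut one baseline-corrected epoch around an event from a single channel.

    signal: list of voltage samples; event_sample: index of the time-locking
    event (stimulus onset or first target entry). Returns None if the epoch
    would extend beyond the recording.
    """
    def to_samples(ms):
        return int(round(ms * fs / 1000.0))

    start = event_sample + to_samples(tmin_ms)
    stop = event_sample + to_samples(tmax_ms)
    if start < 0 or stop > len(signal):
        return None
    b0 = event_sample + to_samples(base_ms[0])
    b1 = event_sample + to_samples(base_ms[1])
    # Mean voltage in the baseline interval, subtracted from every sample.
    baseline = sum(signal[b0:b1]) / (b1 - b0)
    return [v - baseline for v in signal[start:stop]]
```

For the stimulus-locked epochs, the same function would be called with `base_ms=(-100.0, 0.0)`. Averaging the returned epochs per emotional condition then yields the ERPs.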

The time windows for the ERP analyses were selected based on visually detected components. In the serial condition, modulation by emotional content was detected at the 80–120 ms and 220–280 ms time windows. In the parallel condition, a positive component at 125–175 ms was observed. In addition, a later sustained positive response, most likely the LPP response, was observed for both presentation conditions. An ANOVA for the LPP peak latencies revealed no differences between the presentation conditions [*F*(1, 10) = 2.81, ns] (serial condition: 387.65 ± 38.12 ms SD; parallel condition: 362.15 ± 41.86 ms SD). Because the response was sustained, the mean amplitudes were calculated in the 400–500 ms time window. In the serial condition, 7% of the trials were excluded based on the automatic artifact rejection (±160 μV) criteria. The mean numbers of trials that entered the ERP analysis by emotional condition were: pleasant: 37.1; unpleasant: 37.0; neutral: 37.5. In the parallel condition, 2% of the trials were excluded because the detection of image entry from the eye movement data failed. An additional 7% of the trials were excluded based on the artifact rejection criteria. The mean numbers of trials accepted in the parallel condition were: pleasant: 37.0; unpleasant: 36.8; neutral: 35.9. The mean amplitudes were calculated for nine electrodes, grouped along the anterior-posterior axis into anterior (F3, Fz, F4), central (C3, Cz, C4) and posterior (P3, Pz, P4) sites, and by laterality into left (F3, C3, P3), midline (Fz, Cz, Pz), and right (F4, C4, P4) sites. The mean amplitudes of the LPPs were subjected to a 2 × 3 × 3 × 3 repeated-measures ANOVA with the factors presentation condition (serial, parallel), emotional condition (pleasant, unpleasant, neutral), anterior-posterior axis (anterior, central, posterior), and laterality (left, midline, right). The mean ERP amplitudes at the 80–120 ms and 220–280 ms time windows in the serial condition and at 125–175 ms in the parallel condition were analyzed with a 3 × 3 × 3 repeated-measures ANOVA with the following factors: emotional condition (pleasant, unpleasant, neutral), anterior-posterior axis (anterior, central, posterior), and laterality (left, midline, right). *Post-hoc* multiple comparisons were Bonferroni corrected, and the *p*-values were corrected according to the Greenhouse-Geisser procedure when the sphericity assumption was violated.
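As an illustration of how the dependent variable for these ANOVAs can be obtained, the sketch below computes the mean amplitude in a time window for each of the nine electrodes and arranges the values by the two topographical factors. The epoch layout (starting at −200 ms, sampled at 512 Hz) follows the description above, whereas the function names and the dictionary-based data structure are assumptions.

```python
# Electrode for each cell of the anterior-posterior x laterality design.
GRID = {
    ("anterior", "left"): "F3", ("anterior", "midline"): "Fz", ("anterior", "right"): "F4",
    ("central", "left"): "C3", ("central", "midline"): "Cz", ("central", "right"): "C4",
    ("posterior", "left"): "P3", ("posterior", "midline"): "Pz", ("posterior", "right"): "P4",
}


def mean_amplitude(erp, fs=512.0, epoch_start_ms=-200.0, window_ms=(400.0, 500.0)):
    """Mean voltage of one averaged waveform within a time window."""
    i0 = int(round((window_ms[0] - epoch_start_ms) * fs / 1000.0))
    i1 = int(round((window_ms[1] - epoch_start_ms) * fs / 1000.0))
    return sum(erp[i0:i1]) / (i1 - i0)


def amplitude_grid(erps, **window_args):
    """Mean amplitude per design cell.

    erps: dict mapping electrode name to the averaged waveform of one
        participant in one emotional condition.
    """
    return {cell: mean_amplitude(erps[ch], **window_args) for cell, ch in GRID.items()}
```

Calling `amplitude_grid` with `window_ms=(80.0, 120.0)`, `(220.0, 280.0)` or `(125.0, 175.0)` gives the corresponding values for the earlier time windows.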

### **RESULTS**

#### **BEHAVIORAL RESULTS**

Affective ratings confirmed the differences between emotional image contents (**Table 2**). Pleasant images were judged as more pleasant than unpleasant images [*F*(1, 10) = 357.44, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.97], and the unpleasant pictures were rated higher on arousal than the pleasant pictures [*F*(1, 10) = 14.54, *p* = 0.003, η<sub>p</sub><sup>2</sup> = 0.59]. In the parallel condition, the quadrant in which the image was presented did not affect the valence or arousal ratings. Moreover, the trial durations in the parallel condition did not differ between emotional conditions or between the quadrants in which the emotional image was presented. The focus of this study was on the effects of emotional valence rather than on emotional arousal. Therefore, the following analyses are only performed for the three different emotional valence categories.

#### **EYE MOVEMENT RESULTS**

#### *Orienting of attention*

The likelihood of launching *the first saccade to target* differed across the conditions [*F*(2, 20) = 13.17, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.579], with a higher likelihood of launching the first saccade toward unpleasant [*t*(9) = 4.32, *p* = 0.005] and pleasant [*t*(9) = 3.41, *p* = 0.019] than toward neutral target images. There were no differences in the likelihood of launching the first saccade to pleasant or unpleasant targets. *The first saccade latencies* did not differ between the emotional conditions. *The target entry times* were affected by the emotional conditions [*F*(2, 18) = 12.91, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.59], indicating that unpleasant [*t*(9) = −4.70, *p* = 0.003] and pleasant [*t*(9) = −3.51, *p* = 0.020] images were entered earlier than neutral images (**Table 2**). However, no differences occurred between the entry times to unpleasant and pleasant images. A main effect of location was also observed [*F*(1.38, 12.43) = 4.37, *p* = 0.048, η<sub>p</sub><sup>2</sup> = 0.33], suggesting that images in the upper right quadrant were entered earlier than images in the lower right quadrant [*t*(9) = 3.76, *p* = 0.027]. The interaction between emotional condition and image location was not significant. *The number of fixations before* the target image was entered for the first time also varied between the emotional conditions [*F*(2, 18) = 18.96, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.68] and between quadrants [*F*(3, 27) = 9.49, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.51]. These results showed that pleasant [*t*(9) = −3.41, *p* = 0.008] and unpleasant [*t*(9) = 5.27, *p* = 0.002] images were fixated earlier than neutral images, and that unpleasant images were fixated earlier than pleasant images [*t*(9) = 3.51, *p* = 0.020]. The images in the lower right quadrant were fixated later than images in the other quadrants.

<sup>1</sup>Other proposed ocular artifact correction techniques for co-registration studies include, for example, independent component analysis (ICA) (Baccino, 2011), or the surrogate MSEC model (Ille et al., 2002; Dimigen et al., 2011; Kliegl et al., 2012). Furthermore, it should be noted that even though some ocular artifacts remain on the frontal channels, this does not necessarily preclude inspection of the central or posterior channels, since the artifact potentials attenuate with increasing distance from the eyes (see Kretzschmar et al., 2009; Picton et al., 2000).

#### *Engagement of attention*

The emotional conditions also differed in *the number of fixations* on an image [*F*(1.18, 10.61) = 32.85, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.79]. That is, unpleasant [*t*(9) = 5.96, *p* = 0.001] and pleasant [*t*(9) = 6.82, *p* < 0.001] images were fixated more often than neutral images, and unpleasant images were fixated more often than pleasant images [*t*(9) = −3.96, *p* = 0.010]. Image location did not affect the number of fixations on an image. *The dwell times* (i.e., the sum of fixation durations) showed differences between the emotional conditions [*F*(1.24, 11.16) = 5.46, *p* = 0.034, η<sub>p</sub><sup>2</sup> = 0.38]. That is, pleasant images were looked at longer than neutral images [*t*(9) = 3.65, *p* = 0.016]. The image location did not affect the dwell times.

#### *Saccade kinematics*

The control analysis of saccade kinematics suggested no differences between emotional conditions in terms of the directions of incoming saccades to the target (**Table 2**). The incoming saccade amplitudes did not differ between emotional conditions, but the incoming saccade durations differed [*F*(2, 18) = 3.93, *p* = 0.041, η<sub>p</sub><sup>2</sup> = 0.30], with shorter saccade durations before unpleasant than before pleasant targets [*t*(9) = 2.96, *p* = 0.048].

In the parallel condition, the ERPs were time-locked to the first target entry and were not restricted to the fixation period. To control for the effects of within-image saccadic eye movements, we calculated the number of within-image saccades and their amplitudes and directions in the 500 ms time window that was critical for the ERP analysis (**Table 2**). This analysis revealed that emotional conditions differed in the within-image saccade amplitudes [*F*(2, 18) = 13.58, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.60], suggesting more widespread saccades in the neutral condition as compared to the unpleasant [*t*(9) = 4.15, *p* = 0.003] and pleasant [*t*(9) = 3.75, *p* = 0.014] conditions. Neither the number of within-image saccades nor their directions differed between the emotional conditions.
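The partial eta squared values reported with each *F* test can be recovered from the *F* ratio and its degrees of freedom as η<sub>p</sub><sup>2</sup> = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The following is a minimal sketch of that check, not part of the original analysis; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom.

    Hypothetical helper for checking reported effect sizes; it is not
    taken from the original analysis pipeline.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Number of fixations before first target entry, emotional condition:
# F(2, 18) = 18.96, reported partial eta squared = 0.68
print(round(partial_eta_squared(18.96, 2, 18), 2))

# Dwell times with corrected degrees of freedom:
# F(1.24, 11.16) = 5.46, reported partial eta squared = 0.38
print(round(partial_eta_squared(5.46, 1.24, 11.16), 2))
```

The same relation holds for tests with corrected (fractional) degrees of freedom, as the second call illustrates.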

#### **EEG RESULTS**

#### *Eye movement related potentials*

In order to control for the possibility that the emotional LPP responses in the parallel condition were affected by earlier differences between the emotional conditions during or after the offset of the saccadic eye movement, we analyzed the ERP amplitudes between −50 and 50 ms around the first target entry (i.e., at the time-locking point). The analysis revealed a three-way interaction of emotional condition × laterality × electrode position [*F*(8, 80) = 2.22, *p* = 0.034, η<sub>p</sub><sup>2</sup> = 0.18]. The *post-hoc* analyses suggested only minor differences in the response topographies between the emotional conditions. The responses for the unpleasant images over the left hemisphere were more negative at anterior [*t*(9) = 3.14, *p* = 0.032] and central [*t*(9) = 3.70, *p* = 0.012] than at posterior electrode sites. Moreover, the responses at posterior sites in the unpleasant condition were more negative over the midline than over the left hemisphere [*t*(9) = 3.41, *p* = 0.020]. In the pleasant condition, the central responses were more negative over the midline than over the right hemisphere [*t*(9) = 3.45, *p* = 0.019]. The differences in emotional conditions around the time-locking point (**Figure 2B**) did not account for the much larger and systematic differences between emotional conditions that were observed at 400–500 ms after target entry.

The EFRPs locked to the first target entry showed a positive response in the time window of 125–175 ms. This component was most pronounced at the central and parietal electrode sites (**Figure 2B**). The analyses showed no differences in peak latencies or amplitudes between the emotional conditions, across laterality, or along the anterior-posterior axis. This component is most likely the Lambda response, which occurs as a response to the afferent information inflow at the beginning of a fixation (Kazai and Yagi, 2003). In the present experiment, the Lambda responses were smeared and peaked relatively late (at 150 ms) because the responses were not time-locked to the fixation onset but to the time point when the eyes crossed the border of the target image. The eyes were, thus, still moving at the time-locking point. Most likely due to differences in saccade durations, there is some jitter in the latencies of the Lambda responses, resulting in longer responses than the ones typically observed in studies using the co-registration of eye movements and EEG.
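The time-locking point described above, the moment the eyes cross the border of the target image, can be illustrated with a short sketch. It assumes gaze samples given as parallel lists of timestamps and screen coordinates and a rectangular target region; the name `first_target_entry`, the coordinate convention, and the sample values are hypothetical and do not come from the original recording software.

```python
from typing import Optional, Sequence, Tuple

# Assumed region format: (left, top, right, bottom) in screen pixels.
Rect = Tuple[float, float, float, float]


def first_target_entry(
    times_ms: Sequence[float],
    xs: Sequence[float],
    ys: Sequence[float],
    target: Rect,
) -> Optional[float]:
    """Return the timestamp of the first gaze sample inside the target region.

    The eyes may still be in flight at this sample, which is why responses
    locked to it are jittered relative to the true fixation onset.
    """
    left, top, right, bottom = target
    for t, x, y in zip(times_ms, xs, ys):
        if left <= x <= right and top <= y <= bottom:
            return t
    return None  # the target was never entered in this trial


# Hypothetical gaze trace moving rightward toward a target image.
times = [0, 4, 8, 12, 16]
gaze_x = [100, 200, 300, 410, 450]
gaze_y = [300, 300, 300, 300, 300]
print(first_target_entry(times, gaze_x, gaze_y, (400, 200, 600, 400)))
```

Locking to fixation onset instead would require a saccade or fixation detection step before this lookup.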

#### *Emotional response*

In order to investigate the brain responses to emotional images, the LPP was analyzed in the time window of 400–500 ms for both presentation conditions (**Figures 2A,B**, **Table 3**). The results showed that responses were larger during serial presentation than during parallel presentation of images [*F*(1, 10) = 8.42, *p* = 0.016, η<sub>p</sub><sup>2</sup> = 0.46]. Further, these analyses showed that the LPPs differed between the emotional conditions [*F*(2, 20) = 63.07, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.86], suggesting stronger responses for unpleasant [*t*(9) = −9.76, *p* < 0.001] and pleasant [*t*(9) = −6.67, *p* < 0.001] images than for the neutral images. Moreover, the responses were stronger for unpleasant than for pleasant images [*t*(9) = 5.60, *p* = 0.001]. The LPPs also differed along the anterior-posterior axis [*F*(2, 20) = 6.33, *p* = 0.021, η<sub>p</sub><sup>2</sup> = 0.39]. Overall, the LPP responses were stronger at the central than at the anterior electrode sites [*t*(9) = −5.97, *p* < 0.001]. The results also showed a main effect of laterality [*F*(2, 20) = 5.96, *p* = 0.012, η<sub>p</sub><sup>2</sup> = 0.37], with stronger responses on the midline than over the right hemisphere [*t*(9) = 3.36, *p* = 0.022]. The main effects were modulated by an interaction between emotional condition × laterality [*F*(4, 40) = 6.23, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.38], suggesting that for unpleasant and pleasant conditions the responses were stronger over the midline than over the left [unpleasant: *t*(9) = −3.36, *p* = 0.022; pleasant: *t*(9) = −2.89, *p* = 0.048] or the right hemisphere [unpleasant: *t*(9) = 4.52, *p* = 0.003; pleasant: *t*(9) = 3.50, *p* = 0.017].

The parallel and serial presentation conditions also differed in response topographies (**Figures 3**, **4**). This was indicated by the interaction between presentation condition × emotional condition × anterior-posterior axis [*F*(4, 40) = 10.03, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.50], which suggested that for all conditions the parietal responses were larger during serial than during parallel presentation [unpleasant: *t*(9) = −4.15, *p* = 0.002; pleasant: *t*(9) = −5.38, *p* < 0.001; neutral: *t*(9) = −3.68, *p* = 0.004] (**Figure 4**). Further, the interaction between presentation condition × laterality × anterior-posterior axis [*F*(4, 40) = 3.91, *p* = 0.009, η<sub>p</sub><sup>2</sup> = 0.28] showed that during serial presentation, the responses were enhanced across all parietal sites [left: *t*(9) = −3.70, *p* = 0.004; midline: *t*(9) = −4.99, *p* = 0.001; right: *t*(9) = −5.39, *p* < 0.001] as compared to the parallel presentation. In the parallel condition, the responses were stronger over the frontal midline than over the frontal left site [*t*(9) = 4.03, *p* = 0.007]. In the serial condition, the frontal responses were stronger over the midline than over the right site [*t*(9) = 3.14, *p* = 0.011]. Moreover, in the parallel condition, the responses were enhanced at frontal [midline: *t*(9) = 4.83, *p* = 0.002; right: *t*(9) = 3.21, *p* = 0.009] and central [midline: *t*(9) = 6.16, *p* < 0.001; right: *t*(9) = 4.86, *p* = 0.001] sites as compared to the parietal sites. In contrast, during serial presentation, the responses were stronger over the central than over the frontal sites [left: *t*(9) = 3.41, *p* = 0.020; right: *t*(9) = 4.02, *p* = 0.007]. Further, the responses in the serial condition were enhanced at parietal as compared to the central sites [left: *t*(9) = 3.06, *p* = 0.036; midline: *t*(9) = 6.72, *p* < 0.001].

**Table 3 | Mean amplitudes and peak latencies of the LPP response (400–500 ms) across the studied electrode sites for the presentation conditions (Serial, Parallel) and for each emotional condition (Unpleasant, Pleasant, Neutral).**

The presentation condition did not affect the LPP peak latencies. However, latencies of the LPP peak responses (within the 200–500 ms time window) differed between emotional conditions [*F*(2, 20) = 19.56, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.66], suggesting that the responses for unpleasant [*t*(9) = 6.44, *p* < 0.001] and pleasant [*t*(9) = 3.07, *p* = 0.036] images peaked later than the responses for neutral images. Also, the unpleasant responses peaked later than the pleasant responses [*t*(9) = 3.18, *p* = 0.030]. Further, the LPP peak response latencies differed along the anterior-posterior axis [*F*(2, 20) = 26.40, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.73], with earlier peak responses at the parietal than at the central [*t*(9) = 6.76, *p* < 0.001] or the frontal [*t*(9) = 6.41, *p* = 0.001] electrode sites. Further, we performed one-sample *t*-tests for all recorded EEG channels to test whether the subtraction curves between emotional and neutral conditions differed from zero (**Figure 3**). These analyses suggested that the responses to emotional (vs. neutral) images started to deviate earlier (around 100 ms) in the parallel than in the serial condition.
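The one-sample *t*-tests on the subtraction curves can be sketched for a single channel as follows. This is an illustration under stated assumptions rather than the original analysis: each participant contributes one emotional-minus-neutral difference wave, the test is run independently at every time sample, and samples are flagged against the two-tailed critical value for 9 degrees of freedom at α = 0.05 (2.262). No correction for multiple comparisons is applied here, and the function names and data are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence

T_CRIT_DF9 = 2.262  # two-tailed critical t, alpha = .05, df = 9


def one_sample_t(values: Sequence[float]) -> float:
    """t statistic for testing whether the mean of `values` differs from zero."""
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n))


def significant_samples(diff_waves: Sequence[Sequence[float]], t_crit: float) -> List[bool]:
    """Flag each time sample where the across-participant mean differs from zero.

    diff_waves holds one difference wave per participant (emotional minus
    neutral), all of the same length.
    """
    n_samples = len(diff_waves[0])
    flags = []
    for i in range(n_samples):
        column = [wave[i] for wave in diff_waves]
        flags.append(abs(one_sample_t(column)) > t_crit)
    return flags


# Hypothetical data: 10 participants, 2 time samples.
# Sample 0 averages to zero; sample 1 is consistently positive.
waves = [[(-1) ** k, 2.0 + 0.1 * k] for k in range(10)]
print(significant_samples(waves, T_CRIT_DF9))
```

The onset of an emotional effect would then be read off as the first sample, or first run of consecutive samples, that is flagged.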

#### *Parafoveal processing of emotional content*

In order to examine the early attentional orienting to emotional scenes, the ERPs in the parallel condition were also time-locked to the stimulus onset (**Figure 5**). The mean amplitudes of these responses were analyzed in 100 ms bins between 0 and 700 ms post-stimulus. In the time window of 0–100 ms, the analyses showed a difference between emotional conditions [*F*(2, 20) = 5.11, *p* = 0.016, η<sub>p</sub><sup>2</sup> = 0.34], suggesting more negative responses for the pleasant as compared to the unpleasant images [*t*(9) = 2.90, *p* = 0.047]. The responses at 100–200, 200–300, 300–400, 400–500, and 500–600 ms did not reveal any differences between the emotional conditions. At 600–700 ms post-stimulus, a main effect of emotional condition occurred [*F*(2, 20) = 3.67, *p* = 0.044, η<sub>p</sub><sup>2</sup> = 0.27], suggesting numerically larger positive deflections for unpleasant than for neutral scenes. The responses to unpleasant scenes were also larger than the responses to pleasant scenes, but these differences did not reach significance in the *post-hoc* multiple comparisons. The results, thus, showed that emotional stimulus content did not systematically modulate the ERP responses until around 600 ms after stimulus onset in the parallel condition.
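The binning of stimulus-locked ERPs into 100 ms windows can be sketched as below. The sketch assumes an averaged waveform stored as parallel lists of sample times (ms relative to stimulus onset) and amplitudes (µV), and it treats bins as half-open intervals; both the bin convention and the name `mean_amplitude_bins` are assumptions for illustration only.

```python
from typing import Dict, Optional, Sequence, Tuple


def mean_amplitude_bins(
    times_ms: Sequence[float],
    amplitudes_uv: Sequence[float],
    start_ms: int = 0,
    stop_ms: int = 700,
    width_ms: int = 100,
) -> Dict[Tuple[int, int], Optional[float]]:
    """Average the waveform within consecutive bins [lo, lo + width_ms).

    Returns a mapping from (lo, hi) to the mean amplitude, or None for a
    bin that contains no samples.
    """
    bins: Dict[Tuple[int, int], Optional[float]] = {}
    for lo in range(start_ms, stop_ms, width_ms):
        hi = lo + width_ms
        samples = [a for t, a in zip(times_ms, amplitudes_uv) if lo <= t < hi]
        bins[(lo, hi)] = sum(samples) / len(samples) if samples else None
    return bins


# Hypothetical waveform sampled every 50 ms with a linearly rising amplitude.
times = list(range(0, 700, 50))
amps = [float(t) for t in times]
result = mean_amplitude_bins(times, amps)
print(result[(0, 100)], result[(600, 700)])
```

Each bin's mean would then enter the repeated-measures analysis as one dependent value per participant, condition, and electrode site.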

To further examine the parafoveal processing of emotional content, the ERP amplitudes in the parallel condition were examined at a single-trial level. This was done with a linear mixed model, which considered the pre-target saccade amplitudes and the first target-fixation durations as covariates for the single-trial ERP amplitudes. The analysis revealed no relationship between the pre-target saccade amplitudes and ERP responses at the 125–175 ms and 400–500 ms time windows, suggesting that the distance from which saccades were launched toward the target images did not affect the ERP amplitudes. Thus, these analyses did not support parafoveal processing of emotional content. Further, the analysis controlled for possible associations between eye movement variables and ERP responses by showing that the ERP amplitudes at 125–175 and 400–500 ms were not modulated by systematic differences in first target-fixation durations.

**FIGURE 5 | Grand average ERPs time-locked to the stimulus onset in the parallel condition.** A 30 Hz filter was used for data plotting.

#### *Early modulation of responses in the serial condition*

Visual inspection of the waveforms in the serial condition revealed a negative (N1) response at 80–120 ms (**Figure 2**). Laterality affected these responses [*F*(2, 20) = 10.32, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.51], suggesting more negative responses at midline than at right electrode sites [*t*(9) = 4.43, *p* = 0.004]. Moreover, the N1 responses differed along the anterior-posterior axis [*F*(2, 20) = 17.46, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.64], suggesting more negative responses at frontal [*t*(9) = 3.86, *p* = 0.010] and central [*t*(9) = 4.86, *p* = 0.002] as compared to parietal sites.

Additionally, the waveforms in the serial condition contained a negative-going wave at 220–280 ms (**Figure 2**). The latency of this response corresponds to the timeline of the EPN response that is often found in studies of emotional processing (Olofsson et al., 2008). The analysis showed that the EPN amplitudes differed along the anterior-posterior axis [*F*(2, 20) = 40.40, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.80], suggesting more negative responses at frontal as compared to the central [*t*(9) = 7.81, *p* < 0.001] and parietal [*t*(9) = 6.50, *p* < 0.001] electrode sites. The analyses also revealed an interaction between laterality × anterior-posterior axis [*F*(4, 40) = 6.30, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.39], suggesting more negative responses at frontal [left: *t*(9) = 5.79, *p* < 0.001; midline: *t*(9) = 5.25, *p* = 0.001; right: *t*(9) = 7.03, *p* < 0.001] and central [left: *t*(9) = 7.21, *p* < 0.001; midline: *t*(9) = 5.84, *p* < 0.001; right: *t*(9) = 8.48, *p* < 0.001] than at parietal sites. At parietal sites, the responses were more negative at midline than over the right hemisphere [*t*(9) = 3.48, *p* = 0.018].

### **DISCUSSION**

#### **ALLOCATION OF ATTENTION TO EMOTIONAL CONTENT DURING FREE VIEWING**

The present study had two aims. The first aim was to investigate the time course of attention and emotion processes during free viewing of emotional scenes. Previous research has reached no consensus on the role of attention in emotional processing. Some studies suggest that attention is automatically directed toward emotional stimuli (Öhman et al., 2001; Blanchette, 2006; Fox et al., 2007), while other researchers propose that emotional processing depends on attentional resources allocated to process the emotional content (Pessoa et al., 2002; Holmes et al., 2003). A third approach suggests a fast and involuntary attention capture by emotional content, which is sensitive to regulatory attentional influences (e.g., Calvo and Nummenmaa, 2007).

In the present study, co-registration of eye movement and EEG data was used to address the time course of attention to emotional stimuli during free viewing. The eye movement data supported previous research (Calvo and Lang, 2004; Nummenmaa et al., 2006, 2009; Coy and Hutton, 2012) in showing that viewers' attention was captured faster by emotional than by neutral content of the stimuli. This was indicated by earlier target entry times, a decreased number of fixations before the target entry, and a higher likelihood of launching the first saccades toward the emotional than toward the neutral scenes. Subsequently, sustained attentional focus on emotional stimuli was indexed by a larger number of fixations and by longer dwell times for emotional than for neutral pictures. These results suggested that attention was engaged for a longer time, possibly in order to more fully process the emotional significance of the stimuli. The eye movement results, thus, showed that emotional images were detected faster in the parafoveal or peripheral visual fields, and were entered earlier with the eyes than neutral pictures. Previous research assumes that shifts of covert visual attention precede eye movements to a location in space (Deubel and Schneider, 1996). The finding that initial fixations occurred earlier to emotional than to neutral images implies that covert attention to emotional content was driving overt attention toward emotional content faster than to neutral content.

The ERP responses time-locked to the first target entry showed enlarged responses to both unpleasant and pleasant stimuli at 400–500 ms post target entry. The latency and topography of these responses correspond to the "late positive potential" (LPP) response. A long-lasting elevated positivity when participants attend to emotional pictures is a well-established finding in emotion research (Olofsson et al., 2008). However, the responses time-locked to the stimulus display onset in the parallel condition suggested no differences between the emotional conditions until around 600 ms from the stimulus onset. This time course corresponds with the eye movement data, indicating that participants made approximately two fixations before they entered the unpleasant image with their eyes. The ERP data, thus, did not support parafoveal processing of emotional stimuli. Furthermore, the single-trial analysis that combined eye movement and ERP measures to examine the effects of pre-target saccade amplitudes on the ERP responses showed no relationship between the eye movement and ERP measures. Although the ERP analysis did not support parafoveal preview effects, there was some indication that the emotional conditions began to differ from each other earlier in the parallel than in the serial viewing condition (**Figures 2**, **3**). The one-sample *t*-tests performed for the difference curves between emotional and neutral conditions showed that the emotional responses occurred approximately 100 ms earlier in the parallel than in the serial condition. This could indicate a parafoveal preview effect (see Dimigen et al., 2011; Kliegl et al., 2012). However, the analysis showed no differences in the peak latencies of the LPP responses between the presentation conditions. Further, with the current setup, the latency differences in emotional responses cannot be dissociated from the temporal difference in baseline periods between the viewing conditions (also 100 ms earlier in the parallel condition).

Our results, thus, support the view that overt spatial attention needs to be directed to emotional content before ERP responses to that content can be observed. Similar findings have previously been reported by Holmes et al. (2003) using ERPs and by Pessoa et al. (2002) using fMRI. These findings suggest an involvement of higher-level processes in the interaction between emotion and attention. Moreover, both eye movement and EEG results demonstrated enhanced attention to emotional as compared to neutral scenes, supporting "the emotionality hypothesis." The ERP and eye movement results further confirmed the "negativity hypothesis" (Ito et al., 1998; Smith et al., 2003; Hajcak and Olvet, 2008) by showing larger LPP responses to unpleasant than to pleasant stimuli and faster attention capture by the unpleasant than by the pleasant scenes in terms of the number of fixations made before the first target entry. The unpleasant scenes also engaged attention more strongly. This was indicated by a larger number of fixations on unpleasant than on pleasant images.

#### **VALIDATION OF THE CO-REGISTRATION TECHNIQUE**

The second aim was to validate the co-registration technique. In the EEG analysis, the emotional effects were first established in the SVP, which provided a foundation for investigating emotional processing during parallel presentation of images. In the parallel condition, the ERP responses were time-locked to the first target entry times. Previous research indicates that emotional scene content can be processed in the parafoveal or peripheral visual fields (e.g., De Cesarei et al., 2009; Nummenmaa et al., 2009; Coy and Hutton, 2012). Therefore, we expected that the processing of emotional content might begin before the eyes landed on the target image. This was expected to confound the analyses of brain responses related to eye movements on the target regions. Contrary to these expectations, our results showed similar LPP responses in both presentation conditions. These findings suggest that co-registration of eye movements and EEG is a valid technique to measure brain responses to emotional visual stimuli during free viewing. However, the co-registration technique faces several technical and data-analytical problems, which are discussed in more detail in the following sections.

#### *Ocular artifact correction*

Eye movements create large artifacts in EEG recordings (Plöchl et al., 2012). Therefore, co-registration of eye movements and EEG depends on efficient tools for ocular artifact correction. In the present study, we applied a principal component analysis (PCA)-based spatial filter (Ille et al., 2002) to correct for corneoretinal eye movement artifacts. In order to spare brain activity related to the stimulus processing, representative PCA components for eye blink and eye movement artifacts were manually defined outside the experimental trials. The artifact correction was run on continuous data, which then allowed flexible segmentation of the corrected EEG to time-locking points around the first target entries. Moreover, to control for the possibility that the emotional differences in the LPP responses recorded in the parallel condition were due to earlier differences caused by eye movement artifacts, the ERP amplitudes were analyzed between −50 and 50 ms around the target entry time. These analyses showed no systematic differences between the emotional conditions around the time-locking point, suggesting that the differences in LPPs were not due to early response deviations that could possibly result from oculomotor artifacts.

#### *Hardware synchronization*

Accurate information about the eye position at a given time is a basic requirement for time-locking the ERP responses with respect to the eye movement events. Because saccades produce large potentials in the electrodes attached close to the eyes (i.e., the electro-oculogram, EOG), these electrodes are suitable for determining the latency of large saccades in the EEG data (Dimigen et al., 2011). However, EOG data do not provide accurate information about the spatial location of the fixations over the stimulus, while co-registration of EEG and video-based eye-tracking data can measure accurate gaze position with reported spatial resolutions up to 0.01° (Holmqvist et al., 2011). We solved the synchronization between EEG and eye movement data with shared pulses that were sent by the stimulus presentation software to both data sets every few seconds. Other possible problems related to simultaneous recording of video-oculography and EEG include, for example, the physical contact between EEG sensors and the eye-tracking device. In the present study, a remote eye tracker was used, which allowed a contact-free recording of eye movements. In order to avoid muscle artifacts resulting from head stabilization, participants were comfortably seated in an armchair and their sitting position was stabilized with cushions. The use of active electrodes prevented the electromagnetic fields produced by the eye tracker from disturbing the EEG data. Co-registration of eye movements and EEG is technically challenging, but as also concluded by other authors, the technical problems of hardware synchronization and ocular artifact correction appear to be solvable (see Dimigen et al., 2011; Kliegl et al., 2012).
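One common way to use such shared pulses is to fit a linear mapping between the two recording clocks and then convert eye-tracker timestamps into EEG time. The sketch below illustrates this idea under stated assumptions: the pulse times are already extracted from both recordings and paired in order, and a linear model (constant offset plus constant drift) is adequate. The function names and the example values are hypothetical; the source does not describe how the pulses were used for alignment.

```python
from typing import Sequence, Tuple


def fit_clock_map(
    tracker_pulse_ms: Sequence[float],
    eeg_pulse_ms: Sequence[float],
) -> Tuple[float, float]:
    """Least-squares fit of eeg_time = slope * tracker_time + offset.

    Both sequences hold the times at which the same shared pulses were
    registered by each device, in matching order.
    """
    n = len(tracker_pulse_ms)
    mean_x = sum(tracker_pulse_ms) / n
    mean_y = sum(eeg_pulse_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in tracker_pulse_ms)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(tracker_pulse_ms, eeg_pulse_ms)
    )
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return slope, offset


def to_eeg_time(tracker_time_ms: float, slope: float, offset: float) -> float:
    """Convert an eye-tracker timestamp into the EEG recording's time base."""
    return slope * tracker_time_ms + offset


# Hypothetical pulses sent every 2 s: the EEG clock starts 150 ms later
# and runs slightly faster than the eye-tracker clock.
tracker_pulses = [0.0, 2000.0, 4000.0, 6000.0]
eeg_pulses = [150.0, 2150.2, 4150.4, 6150.6]
slope, offset = fit_clock_map(tracker_pulses, eeg_pulses)
print(to_eeg_time(3000.0, slope, offset))
```

An eye movement event such as the first target entry can then be expressed in EEG time before the continuous EEG is segmented around it.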

#### *Overlapping potentials*

Temporally overlapping potentials evoked by target fixations and the background EEG activity, as well as the temporal overlap between the potentials elicited by successive fixations, create another challenge for the co-registration technique. Differences in background EEG activity between EFRPs can be avoided, for instance, by excluding fixations in which background activity is likely to differ (e.g., the first fixation of a trial) (see Dimigen et al., 2011). Selection of fixation subsamples has also been proposed as a solution for overlapping potentials between successive fixations (see Dimigen et al., 2011).<sup>2</sup> In the present study, the serial condition allowed full control of the stimuli that were presented at a given time, but required unnaturally long inter-stimulus intervals (2–4 s) to prevent the overlap of subsequent potentials. The problem of overlapping potentials was partly addressed by comparing the ERP responses time-locked to first target entries to the results established in the serial condition. These results indicated that processing of emotional content elicited comparable responses between the serial and parallel presentation conditions. The corresponding response latencies and topographies, thus, suggested that the responses in the parallel condition reflected emotional processing and did not result from overlapping brain potentials from subsequent fixations or from the oculomotor activity caused by the free viewing task. Moreover, the ERP analysis in the parallel condition was restricted to the first target entries. In these situations, the eyes arrive from neutral images, which at least partly ensured that no ongoing emotional processing of previous images contaminated the responses. More generally, the co-registration technique allows investigations of reinspection events to previously viewed parts of the stimulus or investigations of events where processing is distributed over several fixations (e.g., lag/spill-over effects) (Kliegl et al., 2012). However, such events were not analyzed in the present study.

<sup>2</sup>Mathematical techniques that decompose the effects of overlapping potentials also seem promising. For example, the ADJAR technique (Woldorff, 1993) has been proposed as a method to dissociate the effects of temporally overlapping potentials (Baccino, 2011; Dimigen et al., 2011).

#### *Associated effects between eye movement measures and ERP responses*

Previous research suggests that amplitudes of saccades that precede or follow fixations affect the EEG around fixations (Keren et al., 2010). Moreover, differences in fixation durations translate into changes in the phases of overlapping ERP responses (Dimigen et al., 2011). Therefore, several control analyses were performed for the saccade kinematics observed during the parallel condition. First, to control for the possible effects of saccades on ERPs time-locked to the first target entry, we calculated the directions, amplitudes, and durations of incoming saccades to the target images across the emotional conditions. None of these variables accounted for the observed differences in ERPs between emotional conditions.

Moreover, we were interested in a response (LPP) with a timeline that exceeded the fixation duration in most cases. Therefore, it was possible that the ERPs time-locked to the first target entry were contaminated by eye movement artifacts. In order to control for such effects, we calculated the number of within-image saccades and their amplitudes and directions in the 500 ms time window. These analyses revealed more widespread saccades in the neutral than in the unpleasant and pleasant target images. The number of within-image saccades and their directions did not differ between the emotional conditions. These analyses further confirmed that the observed ERP differences in the parallel condition were not due to eye movement artifacts. We hypothesized that larger within-image saccade amplitudes would result in elevated responses for the neutral condition. However, this was not the case. Further, the fact that the ERP responses differed between the pleasant and unpleasant conditions, while there were no differences in within-image saccade amplitudes between these conditions, supported the conclusion that the differences were not due to residual ocular artifacts.

The single-trial analysis that combined eye movement and ERP measures further suggested that systematic differences in eye movement measures did not explain the observed ERP effects. This analysis revealed no effect of the pre-target saccade amplitudes or the first target-fixation durations on ERP responses at the two studied time windows (125–175 and 400–500 ms).

#### *Evaluation of the results and conclusions*

Comparison of the ERP results between the serial and parallel presentation conditions suggested elevated responses in the serial condition. The experimental design prevented us from concluding whether the difference was due to the fact that four pictures competed for attentional resources simultaneously in the parallel condition, while in the serial condition, only one picture was attended at a time. Further, the parallel condition allowed parafoveal preview of the pictures, which may have attenuated the responses when the pictures were fixated for the first time. In future studies, these effects can be dissociated by systematically varying the number of simultaneously presented pictures in the parallel condition.

In the present study, the participants were instructed to look through all four images freely and to respond by clicking a mouse when they were ready to continue to the next trial. Thus, no explicit instruction to respond as fast as possible was given. This may have influenced the results, because the target entry times (over 1 s after the trial onset) were considerably longer than, for example, the saccade latencies reported by Nummenmaa et al. (2009). Another possible reason for the relatively long target entry times is that our experimental setup contained more competing distractor images than the previous studies (e.g., Nummenmaa et al., 2006, 2009; Coy and Hutton, 2012).

Further, it is interesting that pleasant and unpleasant conditions differed in the number of fixations before entering the target image, while the target entry times suggested no negativity effects. The participants possibly made longer fixations prior to entering an unpleasant (vs. pleasant) target image, which would explain the lower number of fixations before the target entry, but no difference in the target entry times. The two measures of attentional engagement also showed discrepant results in terms of the negativity effect. That is, the number of fixations suggested that unpleasant images were fixated more frequently than pleasant images, while the dwell times showed no differences between the unpleasant and pleasant conditions. This is a curious finding since the results of attentional orienting toward unpleasant images suggested that longer fixations were made prior to target entry. On the contrary, the results about engagement of attention suggest that after entering the target image the participants made many but shorter fixations on the unpleasant images, whereas the total dwell times were the longest for pleasant images.

Previous studies suggest that differences in tasks and stimulus materials may confound the results (e.g., Lipp et al., 2004). For example, the long-latency LPP response is strongly influenced by arousal (Olofsson et al., 2008). The observed differences in arousal ratings between unpleasant and pleasant conditions in the present study may partly explain the negativity effect in the ERP results. Furthermore, the heterogeneity in the nature of the emotions depicted in the images may have confounded the results. The present study also adopted a typical paradigm for investigating attention to emotional stimuli. That is, the stimulus displays contained a number of independent images with unrelated contents and locations. Therefore, an independent emotional gist could have been extracted from each image, and the few possible locations where the items could be displayed may have eased the task by increasing the expectation of the emotional stimuli. Thus, it remains unclear whether these effects would remain under more natural conditions where perceptual and foveal load are high, and most importantly where the emotional objects are part of a whole scene. Future studies should therefore examine emotional information processing with emotional items embedded in the scene (see Acunzo and Henderson, 2011).

In summary, rapid processing of emotional stimuli is a critical aspect of emotional responsiveness. The eye movement results of the present study suggested that emotional content was detected in the parafoveal/peripheral visual field, and was therefore attended faster than neutral information. However, corresponding LPP responses to emotional stimuli were recorded across the SVP and free viewing of emotional scenes. Thus, the ERP results did not show any parafoveal processing effects. Our results were consistent with the view that emotional processing depends on overt attentional resources. Further, the present results suggest that recording eye movements and ERPs simultaneously provides complementary information about cognitive processing and allows for direct comparisons between neural activity and oculomotor behavior. According to Olofsson et al. (2008), collecting behavioral measures simultaneously with ERPs makes it possible to validate the theoretical interpretations of the ERP results. That is, the behavioral and eye movement data can provide an index of attention revealing the functional significance of waveform modulations by emotional content. Mapping the correlates between behavioral performance measures and affective ERP changes helps to identify the psychological mechanisms underlying affective changes in neuroelectrical responses.

### **ACKNOWLEDGMENTS**

This study was supported by grants from the Helsingin Sanomat Foundation (project number: 4701609) and the Academy of Finland (project number: 1137511). The authors thank Miika Leminen and Tommi Makkonen for technical support and Jari Lipsanen for help with statistics. We also thank the three reviewers for helpful comments.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 February 2013; accepted: 25 July 2013; published online: 20 August 2013.*

*Citation: Simola J, Torniainen J, Moisala M, Kivikangas M and Krause CM (2013) Eye movement related brain responses to emotional scenes during free viewing. Front. Syst. Neurosci. 7:41. doi: 10.3389/fnsys.2013.00041*

*This article was submitted to the journal Frontiers in Systems Neuroscience.*

*Copyright © 2013 Simola, Torniainen, Moisala, Kivikangas and Krause. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### **APPENDIX A**

Full list of IAPS images used in the study:

pleasant: 1440, 1460, 1710, 1721, 2057, 4220, 4599, 4659, 4660, 4680, 5010, 5201, 5450, 5460, 5470, 5621, 5623, 5629, 5700, 5831, 5870, 5910, 8021, 8030, 8031, 8034, 8080, 8090, 8161, 8170, 8180, 8190, 8200, 8210, 8300, 8370, 8380, 8400, 8470, 8490

unpleasant: 1300, 3000, 3010, 3030, 3060, 3063, 3064, 3071, 3080, 3100, 3110, 3120, 3130, 3140, 3150, 3170, 3400, 3500, 3530, 6230, 6260, 6300, 6312, 6313, 6350, 6370, 6510, 6540, 6560, 9040, 9050, 9250, 9252, 9405, 9410, 9570, 9600, 9810, 9910, 9921

neutral: 1450, 1670, 2010, 2020, 2190, 2200, 2270, 2500, 2630, 2840, 2870, 2880, 2890, 4653, 4658, 5020, 5250, 5390, 5410, 5500, 5510, 5520, 5530, 5531, 5532, 5533, 5534, 5622, 5720, 5731, 5740, 5800, 5900, 6150, 7000, 7002, 7004, 7006, 7009, 7010, 7020, 7030, 7034, 7035, 7040, 7050, 7060, 7080, 7090, 7100, 7140, 7160, 7170, 7175, 7182, 7183, 7185, 7190, 7205, 7207, 7233, 7237, 7238, 7286, 7351, 7352, 7490, 7491, 7500, 7503, 7510, 7550, 7620, 7710, 7820, 7830, 7950, 8311, 8461, 8465