# THEORIES OF VISUAL ATTENTION - LINKING COGNITION, NEUROPSYCHOLOGY, AND NEUROPHYSIOLOGY

EDITED BY: Søren Kyllingsbæk, Signe Allerup Vangkilde and Claus Bundesen

PUBLISHED IN: Frontiers in Psychology

ISSN 1664-8714 ISBN 978-2-88919-637-1 DOI 10.3389/978-2-88919-637-1

Topic Editors:

**Søren Kyllingsbæk,** University of Copenhagen, Copenhagen, Denmark

**Signe Allerup Vangkilde,** University of Copenhagen, Copenhagen, Denmark

**Claus Bundesen,** University of Copenhagen, Copenhagen, Denmark

Possible distribution of visual processing across the human brain according to the Neural Theory of Visual Attention (NTVA).

Figure by S. Kyllingsbæk, S. A. Vangkilde and C. Bundesen

The Neural Theory of Visual Attention of Bundesen, Habekost, and Kyllingsbæk (2005) was proposed as a neural interpretation of Bundesen's (1990) theory of visual attention (TVA). In NTVA, visual attention functions via two mechanisms: by dynamic remapping of receptive fields of cortical cells such that more cells are devoted to behaviorally important objects than to less important ones (filtering) and by multiplicative scaling of the level of activation in cells coding for particular features (pigeonholing). NTVA accounts for a wide range of known attentional effects in human performance and a wide range of effects observed in firing rates of single cells in the primate visual system and thus provides a mathematical framework to unify the two fields of research.

In this Research Topic of Frontiers in Psychology, some of the leading theories of visual attention at the cognitive, neuropsychological, and neurophysiological levels are presented and evaluated. In addition, the Research Topic encompasses application of the framework of NTVA to various patient populations and to neuroimaging as well as genetic and psychopharmacological studies.

**Citation:** Kyllingsbæk, S., Vangkilde, S. A., Bundesen, C., eds. (2015). Theories of Visual Attention - Linking Cognition, Neuropsychology, and Neurophysiology. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-637-1

## Editorial: Theories of visual attention—linking cognition, neuropsychology, and neurophysiology

#### Søren Kyllingsbæk \*, Signe Vangkilde and Claus Bundesen

*Department of Psychology, Center for Visual Cognition, University of Copenhagen, Copenhagen, Denmark*

Keywords: neural, visual, attention, computational, model

The Neural Theory of Visual Attention (NTVA) of Bundesen et al. (2005) was proposed as a neural interpretation of Bundesen's (1990) theory of visual attention (TVA). In NTVA, visual attention operates via two mechanisms: by dynamic remapping of receptive fields of cortical cells such that more cells are devoted to behaviorally important objects than to less important ones (filtering) and by multiplicative scaling of the level of activation in cells coding for particular features (pigeonholing). NTVA accounts for a wide range of known attentional effects in human performance and a wide range of effects observed in firing rates of single cells in the primate visual system and thus provides a mathematical framework to unify the two fields of research.

In this Research Topic of Frontiers in Psychology, a host of new empirical findings from studies employing a theoretical and methodological framework based on NTVA are presented and discussed. The presented articles relate to the cognitive, neuropsychological, and neurocomputational levels of contemporary attention research and employ a variety of methods including behavioral testing, neuroimaging, and computational modeling.

In the first article of the Research Topic, Habekost (2015) offers a review of clinical TVA-based studies, in which the theoretical framework of TVA and NTVA is presented and discussed in relation to its clinical use. The review is followed by an article by Bogon et al. (2014), who present a TVA-based assessment of visual attention functions in developmental dyslexia. Following this, two papers on TVA-based measures of age-related effects and white matter brain microstructures are presented by Espeseth et al. (2014) and Wilms and Nielsen (2014). The fifth paper, by Nielsen and Wilms (2015), is also related to aging, but uses confirmatory factor analyses in Structural Equation Modeling in combination with TVA-based modeling. Next, Bullock and Giesbrecht (2014) explore how acute exercise and aerobic fitness may influence selective attention during visual search. This paper is followed by a TVA-based study by Poth et al. (2014) combining a prospective memory task with traditional whole and partial report paradigms, thus studying effects of monitoring for visual events on distinct components of attention. Then Kyllingsbæk et al. (2014) present a study on automatic attraction of visual attention to supraletter features. In the final paper, Tsotsos and Kruijne (2014) propose an extension of their Selective Tuning model of attention, in which executive control over visual attention is implemented by Cognitive Programs. In the future, NTVA might also be extended with Cognitive Programs.

This collection of articles reflects the strong, continued interest in using a TVA-based framework for investigating and understanding visual attention in both healthy participants and patient groups, and the articles also provide important examples of how this may be done. If the reader should wish to delve into the latest theoretical developments of TVA and NTVA complementing the articles presented here, we further recommend the recent papers by Bundesen et al. (2014, 2015).

Edited and reviewed by: *Bernhard Hommel, Leiden University, Netherlands*

> \*Correspondence: *Søren Kyllingsbæk, sk@psy.ku.dk*

#### Specialty section:

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

Received: *06 May 2015* Accepted: *22 May 2015* Published: *12 June 2015*

#### Citation:

*Kyllingsbæk S, Vangkilde S and Bundesen C (2015) Editorial: Theories of visual attention—linking cognition, neuropsychology, and neurophysiology. Front. Psychol. 6:767. doi: 10.3389/fpsyg.2015.00767*

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Kyllingsbæk, Vangkilde and Bundesen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Clinical TVA-based studies: a general review

### *Thomas Habekost\**

*Department of Psychology, University of Copenhagen, Copenhagen, Denmark*

In combination with whole report and partial report tasks, the theory of visual attention (TVA) can be used to estimate individual differences in five basic attentional parameters: the visual processing speed, the storage capacity of visual short-term memory, the perceptual threshold, the efficiency of top–down selectivity, and the spatial bias of attentional weighting. TVA-based assessment has been used in about 30 studies to investigate attentional deficits in a range of neurological and psychiatric conditions: (a) neglect and simultanagnosia, (b) reading disturbances, (c) aging and neurodegenerative diseases, and most recently (d) neurodevelopmental disorders. The article introduces TVA-based assessment, discusses its methodology and psychometric properties, and reviews the progress made in each of the four research fields. The empirical results demonstrate the general usefulness of TVA-based assessment for many types of clinical neuropsychological research. The method's most important qualities are cognitive specificity and theoretical grounding, but it is also characterized by good reliability and sensitivity to minor deficits. The review concludes by pointing to promising new areas for clinical TVA-based research.

### *Edited by:*

*Bernhard Hommel, Leiden University, Netherlands*

#### *Reviewed by:*

*Celine R. Gillebert, University of Oxford, UK*

*Peter Bublak, Jena University Hospital, Germany*

#### *\*Correspondence:*

*Thomas Habekost, Department of Psychology, University of Copenhagen, Oester Farimagsgade 2A, 1353 Copenhagen, Denmark thomas.habekost@psy.ku.dk*

#### *Specialty section:*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

*Received: 16 October 2014 Accepted: 28 February 2015 Published: 18 March 2015*

#### *Citation:*

*Habekost T (2015) Clinical TVA-based studies: a general review. Front. Psychol. 6:290. doi: 10.3389/fpsyg.2015.00290*

Keywords: ADHD, alexia, assessment, attentional deficit, dyslexia, neglect, neurodegenerative disease

### Introduction

The theory of visual attention (TVA) is a mathematical model of visual attention that describes the process of selecting and encoding visual categorizations into short-term memory (Bundesen, 1990). The model accounts for many classical findings on visual attention from the cognitive and neurophysiological literature (see Bundesen and Habekost, 2008, 2014). Coupled with two experimental tasks, whole and partial report, TVA can also be used to estimate a set of basic attentional parameters in a given individual (Duncan et al., 1999). These parameters include visual processing speed, storage capacity of visual short-term memory, efficiency of attentional control, spatial bias of attention, and the visual perception threshold. In this way, TVA-based assessment of visual attention is grounded in basic research and also provides highly specific measurements of several cognitive abilities.

Since 1999, these testing qualities have motivated widespread use of TVA-based assessment in studies of attentional deficits in neurological and psychiatric patients. The investigations fall under the general heading of clinical TVA-based studies and are the focus of the present review. About 30 such studies have been published to date, which can be grouped into four research areas: (1) neglect and simultanagnosia, (2) reading disturbances following stroke (alexia) or as a developmental disturbance (dyslexia), (3) aging and neurodegenerative diseases, and most recently (4) neuropsychiatric and neuropaediatric disorders. This article reviews the progress made in each of these research areas and discusses general trends in the empirical findings. The article also reviews methodological aspects of TVA-based assessment and looks toward promising new areas for clinical TVA-based research.

Besides clinical studies, it is important to note that TVA-based assessment has also been applied in other types of investigations that target attentional function in young healthy participants. The most important of these research lines deals with physiological or cognitive interventions designed to alter normal attentional function: transcranial magnetic stimulation (TMS; Hung et al., 2005, 2011), transcranial direct current stimulation (tDCS; Moos et al., 2012), pharmacological intervention (Finke et al., 2011; Vangkilde et al., 2012), meditation (Jensen et al., 2012), and video gaming (Wilms et al., 2013). A related line of research seeks to identify neural correlates of the TVA parameters using neuroimaging methods: functional magnetic resonance imaging (fMRI; Gillebert et al., 2012) and electroencephalography (EEG; Wiegand et al., 2014b). Studies of healthy young participants are not covered in the present review, but as suggested in Section "Future directions," it seems promising to combine clinical TVA-based research with these intervention and neuroimaging methods.

### Methodology of TVA-Based Assessment

#### The TVA Model

Before explaining the assessment method, a brief introduction to the TVA model is necessary (see also Bundesen and Habekost, 2008, for a more detailed description). According to TVA, conscious recognition of a visual object corresponds to encoding one or more of the object's properties into a visual short-term memory store. The memory store only has room for very few objects, typically 3 or 4 in young healthy individuals, and its capacity (the *K* parameter in TVA) varies individually. The encoding process that leads up to conscious recognition takes the form of a competitive race between the objects in the visual field. TVA assumes that the visual system processes all objects in the visual field independently and in parallel, but not equally fast. The processing rate of a given object determines its probability of winning the race and becoming encoded into visual short-term memory, given that the store is not already filled up by other objects. The sum of the processing rates for all objects in the visual field equals the total processing speed of the visual system (under the given stimulus conditions) and is represented by parameter *C*. The processing rate for each individual object reflects the proportion of the total processing capacity that has been allocated to the object (its attentional weight). The computation of attentional weights is also modeled in TVA, and comparisons between the weights of different objects provide the basis of two additional parameters. The attentional weight of a target versus a distractor object is a measure of the efficiency of top–down control of attention (parameter α). Making an alternative comparison, the attentional weights of objects in different parts of the visual field (e.g., left vs. right) provide a measure of spatial attentional bias (parameter *w*<sub>index</sub>). The fifth parameter of the TVA model, *t*<sub>0</sub>, represents the time at which the processing race starts and visual objects begin to have an above-zero probability of being recognized. This implies that *t*<sub>0</sub> is a measure of the lower threshold for visual perception.
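The verbal definitions above can be condensed into a few equations. What follows is a sketch under stated assumptions rather than the full model (for which see Bundesen and Habekost, 2008): the symbols *v*<sub>x</sub> (processing rate of object *x*), *w*<sub>x</sub> (its attentional weight), and *S* (the set of objects in the visual field) are notation introduced here for illustration, the exponential form of the encoding probability is the one standardly used in TVA, and the direction of the spatial index (left weight in the numerator) is an assumed convention.

$$
\begin{aligned}
v_x &= C \, \frac{w_x}{\sum_{z \in S} w_z}, \qquad \sum_{x \in S} v_x = C \\
\alpha &= \frac{w_{\text{distractor}}}{w_{\text{target}}} \\
w_{\text{index}} &= \frac{w_{\text{left}}}{w_{\text{left}} + w_{\text{right}}} \\
P(\text{object } x \text{ has finished processing by time } t) &=
\begin{cases}
0, & t \le t_0 \\
1 - e^{-v_x (t - t_0)}, & t > t_0
\end{cases}
\end{aligned}
$$

An object that finishes processing is encoded only if fewer than *K* objects already occupy visual short-term memory.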

The TVA model is formulated mathematically (see Bundesen and Habekost, 2008, for details) and can therefore provide exact predictions for attentional performance under specific experimental conditions. Bundesen (1990) used this quantitative precision to model a wide range of classical findings in the literature on normal visual attention. The empirical account by Bundesen (1990) covered many experimental paradigms such as whole report, partial report, cued detection, single stimulus recognition, and visual search. It has also been shown that the TVA model can explain many of the attentional effects observed at the single-cell processing level, as reflected in firing rates of individual neurons (Bundesen et al., 2005).

#### TVA-Based Assessment: General Method

TVA can account for findings from many different attentional paradigms, but individual values of the model's five parameters can be estimated most directly from performance on two specific tasks, whole report and partial report. Indeed, all studies with TVA-based patient assessment use some version of whole report and/or partial report to estimate some or all of the parameters specified in the TVA model. In whole report, one or more visually simple objects (typically letters) are flashed on a computer screen for a well-defined time period (the exposure time). The stimuli are typically followed by pattern masks, which erase the visual afterimage and precisely control the time the stimuli are available for processing. In other cases the stimuli are followed by a blank screen, which extends the effective exposure duration (the prolongation can be approximated by a constant, parameter μ) and thereby makes the task easier, which is relevant in many clinical contexts. The participant is instructed to report the identity of as many objects as possible, but to refrain from guessing. Reporting must be neither too liberal nor too conservative, which is usually operationalized as an error rate between 10 and 20% and controlled by feedback to the participant during testing. Some participants (e.g., young children or elderly persons) may find it difficult to comply with this well-defined accuracy level; in such cases forced-choice reporting can be employed, coupled with correction for guessing in the data analysis. It is also recommended to precede testing with a short practice period (e.g., 30–40 trials) in which participants can be familiarized with the task. When testing is repeated over a sufficient number of trials, and with a selection of exposure times that avoids floor and ceiling effects, a systematic pattern of whole report performance emerges (see **Figure 1**).

Below an individually variable exposure duration *t*<sub>0</sub>, the number of correctly reported letters is zero. *t*<sub>0</sub> is therefore a measure of the visual perception threshold for the stimuli. As the exposure duration is increased above the threshold the score increases monotonically, though with negative acceleration. The slope of the curve at its steepest point, *t* = *t*<sub>0</sub>, is a measure of the total visual processing speed *C*. At higher exposure durations the performance curve gradually levels off to approach an asymptote, which represents the maximum storage capacity of visual short-term memory, *K*. *K* is classically assumed to represent a weighted average of two maximum values (Shibuya and Bundesen, 1988). For example, storage capacity might alternate between 3 and 4 elements with a probability of occurrence of 40 and 60%, respectively, giving *K* = 3.6. Recent developments of TVA analysis, however, assume a broader distribution of *K* (see Reliability and other Psychometric Properties). If stimuli are simultaneously shown in both visual fields, the spatial balance of attentional weights, *w*<sub>index</sub>, can also be estimated from whole report data. *w*<sub>index</sub> ranges from 0 to 1; symmetrical attentional weighting corresponds to a *w*<sub>index</sub> value of 0.5, and the degree of lateral bias to either the left or right side can be measured by the deviation of *w*<sub>index</sub> from this intermediate value.
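The shape of this performance curve can be illustrated with a minimal Python sketch. It assumes equal attentional weights for all displayed items, exponentially distributed processing times, and a capacity *K* that alternates between two values as in the classical assumption. The function name `expected_whole_report_score` and all numerical values are illustrative and are not taken from any of the reviewed studies.

```python
from math import comb, exp


def expected_whole_report_score(t, n_items, C, t0, k_dist):
    """Expected number of items reported at exposure duration t (seconds).

    C      -- total processing speed in items per second
    t0     -- perceptual threshold in seconds
    k_dist -- mapping {capacity: probability} describing K across trials
    """
    tau = max(t - t0, 0.0)        # effective exposure duration
    v = C / n_items               # assumption: equal attentional weights
    p = 1.0 - exp(-v * tau)       # chance that one item finishes by time t
    score = 0.0
    for k, prob_k in k_dist.items():
        for j in range(n_items + 1):
            # Probability that exactly j of the n items finish processing.
            p_j = comb(n_items, j) * p**j * (1.0 - p) ** (n_items - j)
            # At most k finished items fit into visual short-term memory.
            score += prob_k * min(j, k) * p_j
    return score


# Hypothetical participant: C = 50 items/s, t0 = 20 ms, K alternating 3 / 4.
k_dist = {3: 0.4, 4: 0.6}
for t_ms in (10, 20, 50, 100, 200, 1000):
    print(t_ms, round(expected_whole_report_score(t_ms / 1000, 5, 50.0, 0.02, k_dist), 3))
```

Under these assumptions the curve is zero up to *t*<sub>0</sub>, leaves the threshold with slope *C*, and levels off at the weighted capacity of 3.6 items.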

**Figure 1** (caption fragment): … whole report data are marked out: the perceptual threshold *t*<sub>0</sub>, the visual processing speed *C*, and the storage capacity of visual short-term memory *K*.

In the partial report task, stimuli of two different types (typically distinguished by color) are presented. The task is to report only stimuli of one type (targets) and ignore the other stimuli (distracters). The performance reduction with distracters, compared to the situation when only targets are presented, provides a measure of the efficiency of top–down selectivity α. More specifically, α is defined as the ratio of the attentional weight of a distracter to that of a target. An α value of 0 implies perfect selection, whereas a value of 1 implies no selectivity between targets and distracters.
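How α shapes partial report performance can be sketched as follows. The sketch assumes that every target carries an attentional weight of 1 and every distracter a weight of α, and that the total processing speed *C* is divided among the displayed objects in proportion to their weights. The function name `target_rate` and the example values are hypothetical.

```python
def target_rate(C, n_targets, n_distracters, alpha):
    """Processing rate (items/s) received by each target in a display.

    Assumption: target weight = 1, distracter weight = alpha, and C is
    shared among all objects in proportion to their attentional weights.
    """
    total_weight = n_targets * 1.0 + n_distracters * alpha
    return C * 1.0 / total_weight


C = 40.0  # hypothetical total processing speed in items per second
print(target_rate(C, 2, 0, 0.5))  # targets only: distracters play no role
print(target_rate(C, 2, 2, 0.0))  # perfect selection: same rate as targets only
print(target_rate(C, 2, 2, 1.0))  # no selectivity: capacity split four ways
```

The closer α is to 1, the more processing capacity is diverted to distracters, and the larger the drop in target report relative to displays that contain targets only.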

#### Variants of TVA-Based Assessment

The basic method of TVA-based patient assessment was developed by Duncan et al. (1999), who used it to study patients with neglect following stroke in the right hemisphere (see Neglect and Related Conditions for the main findings of this study). Duncan et al. used a whole report task where five letters are displayed in a vertical column either to the left or right of fixation, shown at three individually adapted exposure times (see **Figure 2A**). In half of the conditions the stimuli are post-masked, in the other half they are followed by a blank screen. This way separate estimates of *C*, *K*, and *t*<sub>0</sub> can be obtained in each hemifield. Duncan et al. also used a partial report experiment, where either one or two stimuli (targets or distracters, defined by color) are shown at four possible locations around fixation, at one individually adapted, post-masked exposure duration (see **Figure 2B**). This partial report paradigm allows for estimation of α and provides an indirect measure of sensory effectiveness by the parameter *A* (in which the effects of *t*<sub>0</sub> and *C* are not separated) for each of the four display positions. An estimate of *w*<sub>index</sub> can also be derived from this paradigm due to the inclusion of bilateral target displays.

**Figure 2** (caption fragment): … (B) Different trial types of the partial-report experiment with targets (marked as "T") and distracters (marked as "D").

The experimental design used by Duncan et al. (1999) has been adopted in many later studies, especially those carried out by German research groups (e.g., on neurodegenerative diseases), and has in this way been established as a standard paradigm for TVA-based assessment. Other studies have altered the details of the experimental design considerably, either to optimize testing for a specific clinical group or to elucidate a particular research hypothesis. For example, Peers et al. (2005) used displays with just one centrally located stimulus to estimate visual processing speed. Habekost and Rostrup (2006, 2007) used circular displays and included a version of partial report with multiple targets and distracters. The stimuli themselves have also been varied, both with regard to letter appearance (color, font, size) and in some cases by including other stimulus types (e.g., faces: Peers et al., 2005; digits: Starrfelt et al., 2009; short words: Habekost et al., 2014a). In recent years a second standard paradigm for TVA-based assessment has emerged, the so-called CombiTVA paradigm developed by Vangkilde et al. (2011). The CombiTVA paradigm uses six display stimuli and intermixes whole and partial report trials to better constrain the total set of parameter estimates (see **Figure 3**). The CombiTVA has already been used quite widely in studies that together number many hundred participants (e.g., Dyrholm et al., 2011; McAvinue et al., 2012b; Habekost et al., 2014b).

#### Developments in Data Analysis

The procedure used by Duncan et al. (1999) to analyze their data was specific to the experimental design in this study. The analysis method was mathematically generalized by Kyllingsbæk (2006), who also provided an easy-to-use software package for automated analysis of many types of whole and partial report data. This fitting program has been used for a large part of the studies described in the present article. Dyrholm et al. (2011) further developed the model fitting procedure. Besides presenting a more analytically efficient procedure for parameter estimation, they documented a significant trial-by-trial variability in a large data set from 347 healthy participants, which was not captured by the standard TVA data modeling. Further, systematic biases were demonstrated in the standard estimation of the *K* and *t*<sub>0</sub> parameters: *K* was typically overestimated by at least half an item, whereas *t*<sub>0</sub> was underestimated by about 2 ms. The modified analysis procedure of Dyrholm et al. enables more robust fitting of the *K* parameter by assuming that the parameter follows a broader distribution (from trial to trial) rather than varying between just two maximum values. In addition, the *t*<sub>0</sub> parameter is allowed to vary from trial to trial following a Gaussian distribution to counter underestimation of the perceptual threshold due to a few "lucky guesses" during testing at short exposure durations. A new fitting parameter, the probability of an attentional "lapse" during testing (i.e., no report despite a long exposure duration), was also proposed by Dyrholm et al. and has been used in a recent study of children with ADHD (McAvinue et al., 2012a).

#### Reliability and other Psychometric Properties

The psychometric properties of TVA-based assessment, especially the reliability of the tests, have been investigated in several ways. In two studies of stroke patients, Habekost and Bundesen (2003) and Habekost and Rostrup (2006) used bootstrap statistics to estimate the measurement error related to each of the TVA parameters (see Neglect and Related Conditions for further descriptions of these studies). In brief, bootstrap analysis produces an estimate of the variability inherent in a set of observed data (e.g., 324 observations in a whole report experiment) and provides a statistical confidence interval for each parameter estimate that is based on these data (Efron and Tibshirani, 1998). The bootstrap analysis showed generally low measurement error for most parameter estimates, especially the *K* parameter. Notably, the estimates of the α parameter were found to be less reliable than the others. The analysis was also useful for testing whether parameter estimates were significantly different from each other in the left or right hemifield, a major focus of these two studies.
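The logic of the bootstrap can be sketched in a few lines of Python. This is a generic percentile bootstrap and not the procedure of the cited studies: in a real TVA analysis the `estimator` would be a full model fit returning, for example, *K* or *C*, whereas here the mean score per trial serves as a simple stand-in. The function name `bootstrap_ci` and the simulated data are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(trials, estimator, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for estimator(trials)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(trials)
    estimates = []
    for _ in range(n_boot):
        # Resample the observed trials with replacement, keeping the sample size.
        resample = [trials[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical data: number of letters reported on each of 324 trials.
data_rng = random.Random(1)
scores = [data_rng.choice([2, 3, 4]) for _ in range(324)]
low, high = bootstrap_ci(scores, mean)
print(f"mean score = {mean(scores):.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

A narrow interval indicates low measurement error for the estimate, and non-overlapping intervals for the left and right hemifield suggest a reliable lateral difference.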

The bootstrap method was used in a different way by Finke et al. (2005), who studied 35 young healthy participants with whole and partial report tasks (using the standard paradigms of Duncan et al., 1999). Finke et al. compared estimates based on the full data set (672 trials) vs. subsets of the data (i.e., the first 384, 288, or 192 trials) for each participant to investigate how test length affected the reliability of the parameter estimates. The bootstrap analysis indicated high internal reliability even for the shortest versions of the data set, with α as the least reliable parameter. This analysis was supplemented by computations of intra-parameter correlations between estimates based on the full or partial versions of the data set. This analysis showed very high stability of *C* and *K* (*r >* 0.90) even after just 192 trials, whereas α and *w*<sub>index</sub> obtained sufficient reliability (*r* ≥ 0.75) only with 288 or more trials. Finke et al. also looked at the intercorrelations between different TVA parameters and found these to be generally small and non-significant, apart from a moderately strong relation between *C* and *K* (*r* = 0.40). This was taken as evidence of good functional specificity in the test's measurement. The specificity of the TVA parameters was further supported by the fact that each of them correlated well with clinical tests addressing corresponding cognitive functions.

The reliability of TVA-based assessment was investigated more generally by Habekost et al. (2014b) in a sample of 68 young healthy participants, who were tested three times (at 1-week intervals) with the CombiTVA paradigm (432 trials in total). The data modeling was carried out in accordance with the modified procedure of Dyrholm et al. (2011). The results were compared to another widely used test of attention, the Attentional Network Test (ANT; Fan et al., 2002), which is theoretically based in Posner and Petersen's (1990) anatomical network model of attention. In line with the previous bootstrap investigations, Habekost et al. (2014b) found high or very high internal reliability (*r* = 0.90 or higher) for all five TVA parameters by a split-half analysis of the data. The very high internal reliability held up for *K* and *w*index in a shortened version of the CombiTVA (216 trials), whereas the other three parameters still showed good, but not high reliability under these circumstances. The study also provided the first systematic investigation of the test–retest reliability of TVA-based assessment. *K* and *w*index showed good retest reliability (*r >* 0.80) throughout the repeated testing, whereas *C*, *t*0, and α showed only moderate reliability (*r* around 0.60) from the first to the second testing session, which, however, increased to *r* = 0.75–0.85 between the second and the third testing. Both the internal and test–retest reliability of the CombiTVA were consistently better than those of the ANT test, also when taking test length into account. Further, the results replicated Finke et al.'s (2005) finding of small and non-significant intercorrelations between the different TVA parameters, except for *K* and *C*, which in this study were highly correlated (*r* = 0.72). TVA parameters were generally not related to the ANT measures either, apart from a moderate correlation between α and the ANT's executive network score. Finally, the study showed significant practice effects over the course of the three testings, especially for the *C* and α parameters. This makes it clear that performance on TVA-based assessment can only be meaningfully compared between participants with equal levels of practice with the task.
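The logic of a split-half reliability analysis can be illustrated with a minimal sketch. It assumes that a parameter (e.g., *C*) has been estimated separately from two halves of each participant's trials, and it applies the Spearman-Brown formula to project the half-test correlation to full test length. Whether the cited studies applied this correction is not stated here, and the function names and example values are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Project a half-test correlation to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(first_half, second_half):
    """Return (half-test correlation, Spearman-Brown corrected reliability).

    Each list holds one parameter estimate per participant, obtained from
    one half of that participant's trials.
    """
    r_half = pearson_r(first_half, second_half)
    return r_half, spearman_brown(r_half)


# Hypothetical per-participant estimates of one parameter (illustration only).
half_a = [50, 42, 61, 38, 55]
half_b = [48, 45, 58, 40, 57]
r_half, r_full = split_half_reliability(half_a, half_b)
```

The same `pearson_r` function, applied to estimates from two testing sessions rather than two halves of one session, gives a test–retest coefficient.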

### Section Summary

A full evaluation of the strengths and weaknesses of TVA-based assessment must include studies that have employed the method in clinical populations. These studies are to be presented in the next four sections, but the studies described in this section still allow for a number of preliminary conclusions. As a test of visual attention in healthy participants TVA-based assessment has a number of important strengths. One of the most important is the method's theoretical grounding, the fact that it is derived from a general model of visual attention that accounts for a large part of the basic research within the field. This implies that the parameters measured are not narrowly bound to particular testing tasks, but represent more general aspects of attentional function. The ANT test of Fan et al. (2002) has a similar status by being grounded in Posner's anatomical network model of attention, which can also explain many findings in the attentional literature. The ANT, however, measures other, more response-related aspects of attention and produces less reliable test results than TVA-based assessment.

Another main strength of TVA-based assessment is the method's specificity: five different aspects of attention are being measured separately. Correlations between the different TVA parameters are generally low, with the exception of *C* and *K*, which may raise some concern about the empirical separability of these two parameters. The specificity of the assessment is also evident in another way: unlike most other tests of attention, whole and partial report tasks do not involve reaction-time measures, which means that motor processes do not influence performance significantly. For this reason the test results specifically reflect the efficiency of attentional processes within the visual system. TVA-based assessment also has good reliability, both as measured within a single test session and across re-testings. Practice effects, however, need to be controlled for when comparing performance between individuals, and estimation of some parameters (e.g., α) is less reliable than others.

From a practical perspective, it is useful that the measurement of all parameters is carried out in one integrated test setup with simple instructions and minor response requirements. The fact that the experiments can be tailored to different theoretical interests and test populations (e.g., by varying the stimulus types and display arrangement) provides flexibility for many types of investigations. The test produces reliable results from a few hundred trial repetitions, which implies that a minimum of about half an hour's testing is necessary. This time frame is compatible with many clinical examinations, but can still be an obstacle in some situations (e.g., bed-side testing). This leads us to the clinical studies.

### Neglect and Simultanagnosia

Neglect is a severe disturbance of attention that often follows unilateral stroke, especially after right-side lesions. Neglect can be defined as the failure to report, respond, or orient to stimuli in the contralesional side, when this failure cannot be attributed to either sensory or motor impairments (Heilman et al., 2003). Neglect is probably the most widely studied attentional syndrome in the neuropsychological literature and provided a natural starting point for clinical TVA-based studies when the field was opened up by Duncan et al. (1999). The primary aim of Duncan et al. was to use the cognitive specificity of TVA-based assessment to characterize different components of the neglect syndrome. This general idea of cognitive deficit analysis has inspired many of the other studies described in this review. Subsequent TVA-based investigations of neglect and similar conditions have followed up on the initial study of Duncan et al. in various ways. For example, studies have used TVA-based assessment to clarify the lesion anatomy underlying different attentional deficits or to characterize subclinical manifestations of the neglect syndrome.

Another classic neuropsychological syndrome, simultanagnosia, has also been investigated by TVA-based testing. Simultanagnosia is an even more severe attentional disturbance than neglect and is characterized by an extreme reduction in the ability to perceive scenes and multiple objects at the same time. Simultanagnosia typically occurs after bilateral parietal lesions (dorsal simultanagnosia), in which case it is often associated with difficulties in visually guided reaching and eye movement control (Balint's syndrome). Also in the studies of simultanagnosia, a main aim has been to specify different functional deficits and thereby address theoretical hypotheses about the core deficits of the syndrome.

### Neglect and Related Conditions

### Duncan et al. (1999): Neglect

Duncan et al. (1999) studied nine patients with visual neglect after strokes in the right hemisphere. The lesions centered on the right parietal cortex, but adjacent structures were also affected to some extent. Compared to an age-matched control group Duncan et al. (1999) found that the patients had general reductions in processing capacity: *K* was abnormally low in both hemifields. *C* was also reduced bilaterally, most pronounced in the left hemifield. Given that neglect is traditionally associated with lateralized deficits, this was an interesting finding, and one that corresponds well with later understandings of the syndrome that emphasize nonlateralized deficits (e.g., Husain and Rorden, 2003). In addition, a significant bias of attentional weights (as measured by *w*index) for stimuli in the right visual field was found, consistent with the defining symptoms of neglect. Surprisingly, top–down control of selectivity was found to be intact, even in the left hemifield: α values were normal in both sides. Although the findings on α would later be questioned on grounds of low reliability for this parameter, Duncan et al. (1999) did provide a first demonstration of how TVA-based assessment can distinguish between impaired and preserved aspects of visual attention, inspiring many later studies.
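The two weighting measures mentioned above can be illustrated with a small sketch. It assumes the definitions conventionally used in the TVA literature, which are not spelled out in this section: *w*index as the share of attentional weight allocated to left-side objects (0.5 indicating balanced weighting), and α as the ratio of distractor weight to target weight (0 indicating perfect selection, 1 indicating no selection). The function names and example weights are hypothetical.

```python
def w_index(w_left, w_right):
    """Spatial bias of attentional weights (assumed convention).

    Returns the proportion of total weight given to left-side objects:
    0.5 means balanced weighting, values below 0.5 a rightward bias.
    """
    total = w_left + w_right
    if total <= 0:
        raise ValueError("attentional weights must sum to a positive value")
    return w_left / total


def alpha(w_distractor, w_target):
    """Top-down selectivity (assumed convention): distractor/target weight.

    Values near 0 mean distractors are efficiently ignored; values near 1
    mean targets and distractors receive equal weight.
    """
    if w_target <= 0:
        raise ValueError("target weight must be positive")
    return w_distractor / w_target


# Hypothetical weights showing a rightward bias with good selectivity.
bias = w_index(w_left=0.2, w_right=0.8)
selectivity = alpha(w_distractor=0.3, w_target=1.0)
```

Under these assumed conventions, the neglect pattern described above would correspond to a *w*index clearly below 0.5 together with an α value in the normal range.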

### Habekost and Bundesen (2003): Subclinical Neglect

Habekost and Bundesen (2003) presented the first TVA-based study of brain damage outside the parietal cortex. The patient investigated in this single case study had damage to the right basal ganglia and overlying frontal cortex, but no neglect or other attentional disturbances as measured by standard clinical testing. However, she did have subjective complaints about poor apprehension of events in the left side. In this sense, the study can be viewed as the first investigation of subclinical attention disturbances by TVA-based testing. The patient's subjective complaints were confirmed by the TVA-based testing: in the partial report experiment, where stimuli were presented very near the perception threshold (40 ms), performance was clearly better in the right hemifield (reflected in both *A* and *w*index values). This pattern differed markedly from the control group. In addition, the patient's *K* parameter was significantly reduced in both sides, and the visual perception thresholds were elevated, particularly in the left side. Given that neither of these deficits was evident from standard testing, the study showed the sensitivity of TVA-based assessment to subtle attentional deficits.

### Habekost and Rostrup (2006, 2007): Right Hemisphere Stroke

Habekost and Rostrup (2006) followed up on this case study by a group investigation of 26 patients with right hemisphere strokes. The strokes varied widely in size, but in most cases centered on the basal ganglia and overlying frontal cortex, with variable additional involvement of the temporal and parietal cortices. Four patients had small lesions that were restricted to the right thalamic area. As in the study of Habekost and Bundesen (2003), most patients had minor or no deficits on standard tests of neglect. Lateralized abnormalities in the TVA-based testing were, however, widespread in the patient group: visual processing speed was markedly lower in the left than the right hemifield for almost all patients in the group. In addition, when presented with bilateral displays, patients with large strokes showed an abnormally biased attentional weighting toward stimuli in the right hemifield (*w*index). This pattern was not found after small lesions, however, with one important exception: a patient with thalamic damage involving the pulvinar nucleus showed a similarly biased attentional weighting. The finding corresponds well with an anatomical interpretation of the TVA model that proposes a central role for the pulvinar in the computation of attentional weights (Bundesen et al., 2005).

In a supplementary analysis of data from the same study, Habekost and Rostrup (2007) found that both visual short-term memory and visual processing speed in the ipsilesional field were normal for most patients, in spite of their large lesions. This implies that lesions in a large region of the right hemisphere, including the putamen, insula, and inferior frontal cortex, do not lead to general deficits in either *C* or *K*. Deficits in *K* did, however, occur in patients with severe leukoaraiosis or lesions extending deep into white matter, suggesting a vital role for white matter connectivity for visual short-term memory function.

### Kraft et al. (2015): Thalamic Stroke

Following up on the case investigation on thalamic damage by Habekost and Rostrup (2006), Kraft et al. (2015) presented a larger group study of patients with thalamic lesions. Sixteen patients with focal damage in different subregions of the thalamus (either left or right side) were tested by the whole and partial report tasks of Duncan et al. (1999). Their lesions were examined by structural magnetic resonance imaging and mapped in standard stereotactic space. Compared to an age-matched control group the patients were on average mildly impaired in terms of *C* and *K* values. Lateral thalamic lesions were related to deficits in visual processing speed, whereas medial thalamic lesions were associated with asymmetrical attentional weighting, as measured by *w*index. This implies that patients with lesions outside the traditionally defined visual areas of the thalamus showed deficits on the TVA parameters. The performance of one patient with pulvinar damage replicated the finding of Habekost and Rostrup (2006), a spatial bias to the ipsilesional field, but this time shown after a left-side lesion.

### Peers et al. (2005): Parietal vs. Frontal Strokes

Peers et al. (2005) also investigated the effects of unilateral stroke. This study included 25 patients with lesions restricted to the parietal cortex (13 patients) or the frontal cortex (12 patients), either in the left or right hemisphere. In one experiment the patients' visual processing speed was tested by means of a single, post-masked stimulus (either a letter or a face). The results showed that patients with parietal, but not frontal, lesions had significantly reduced *C* values for both letters and faces. In a second experiment six target letters were displayed for a relatively long time (200 ms) without masking to obtain a rough estimate of the maximum number of letters that could be perceived (i.e., *K*). Again, patients with parietal lesions were on average impaired, but patients with frontal damage did not differ significantly from the control group. The results from both experiments thus indicated that deficits in processing capacity (*C* or the derived *K* measure) were selectively associated with parietal lesions. Deficits related to attentional weighting, however, showed quite a different pattern: both α and *w*index were related to the volume of lesion, not its location. The finding on *w*index was in line with the results of Habekost and Rostrup (2006), who also found that large lesions were related to unbalanced attentional weighting.
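The rationale for estimating processing speed from a single post-masked stimulus can be sketched with TVA's exponential encoding function. The sketch assumes the standard TVA formulation, in which a single item receives the full processing capacity *C* once the exposure duration exceeds the perception threshold *t*0; the function name, units, and example values are illustrative assumptions rather than the procedure of any specific study.

```python
import math


def p_encoded(exposure_ms, C, t0_ms):
    """Probability that a single masked item is encoded.

    Assumes TVA's exponential race model with the item receiving the
    whole capacity C (items per second) after the threshold t0.
    """
    effective_ms = exposure_ms - t0_ms
    if effective_ms <= 0:
        return 0.0  # below threshold: nothing is encoded
    return 1.0 - math.exp(-C * effective_ms / 1000.0)


# Hypothetical values: C = 20 items/s, t0 = 20 ms, exposure = 70 ms.
p_normal = p_encoded(exposure_ms=70, C=20, t0_ms=20)
# Halving C lowers accuracy at the same exposure duration.
p_slowed = p_encoded(exposure_ms=70, C=10, t0_ms=20)
```

In this formulation a lower *C* flattens the accuracy curve across exposure durations, whereas a higher *t*0 shifts the curve to the right, which is why the two parameters can in principle be estimated separately.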

### Bublak et al. (2005): Parietal vs. Frontal Stroke

Bublak et al. (2005) conducted a similar comparison between the effects of lesions in the parietal and frontal lobe. The study was based on two case investigations, where a patient with a large stroke involving the right inferior parietal lobe was compared to a patient with a circumscribed lesion in the right superior frontal lobe. The standard partial report paradigm of Duncan et al. (1999) was used. The patient with inferior parietal damage showed similar deficits to the patients in the study of Duncan et al.: rightward spatial bias (*w*index), reduced sensory effectiveness in the left visual field (parameter *A*), and preserved top–down control (α; Bublak et al., 2005, however, noted that this parameter could be reliably measured only in the right visual field). Interestingly, the other patient with superior frontal damage showed a complementary deficit in parameter α, but not in the other parameters, forming a double dissociation. A whole report test of the patient with parietal damage showed additional bilateral deficits in *C* and *K*, consistent with the findings of Duncan et al. Because the study only involved two patients, Bublak et al. could not conduct a systematic lesion analysis like Peers et al. (2005). Instead, the main contribution of the study was to demonstrate that short versions of the tests (30–40 min) were sufficient to demonstrate a double dissociation of deficits between two stroke patients, supporting both the specificity and clinical usability of TVA-based assessment.

### Finke et al. (2012): Phasic Alerting of Neglect Patients

Finke et al. (2012) assessed whether phasic alerting can influence the deficits in processing capacity and attentional weighting that are characteristic of neglect patients. The study was inspired by Robertson et al. (1998) who showed that phasic alerting may alleviate neglect symptoms. Six patients with neglect following right temporo-parietal lesions were included in the study. Finke et al. (2012) used a simplified version of the second paradigm of Duncan et al. (1999; see **Figure 2B**) in which only targets were shown (i.e., whole report) and compared performance with or without pre-alerting by a visual cue. The time course of the alerting effect was investigated by varying the SOA between alerting cue and stimulus display (80, 200, or 650 ms). Finke et al. (2012) found that phasic alerting normalized the spatial imbalance of attentional weights (*w*index) that was present in the control (no-cue) condition, but only for the two shortest SOAs: when 650 ms elapsed between the alerting cue and the display, the patients fell back to the usual rightward bias in *w*index. The effect of visual alerting on attentional weighting was thus fast evolving, but also short-lasting. The cue also increased sensory effectiveness (as measured by *A* values), mainly in the right visual field. This effect was more stable over time and evident at all SOAs.

### Simultanagnosia

### Duncan et al. (2003): Dorsal Simultanagnosia

In line with its name, simultanagnosia is classically conceived as a failure to perceive multiple objects at the same time (Kinsbourne and Warrington, 1962). Duncan et al. (2003), however, wanted to test whether the basic deficit lies in visual processing speed rather than in the ability to perceive multiple objects simultaneously. The two types of deficits correspond closely to TVA's distinction between *C* and *K* values (e.g., in a strong version of the classical interpretation, *K* should be only one for simultanagnosic patients). Duncan et al. (2003) tested a patient with dorsal simultanagnosia in two variations of whole report with multiple stimuli and in both cases found very severe reductions of *C*, whereas the deficit in *K* was more modest. In the third experiment only a single letter was shown at fixation. Even with no other objects competing for attention, the patient's performance was still very low. Thus the study pointed to a deficit in processing speed, rather than simultaneous perception, as central to the condition, at least for this patient. A second patient with ventral simultanagnosia (left occipito-temporal lesion) was also tested in this study. However, this condition is quite different from dorsal simultanagnosia and will be reviewed in the section on alexia (see Alexia; see also Gerlach et al., 2005, for an application of TVA-based assessment to rule out simultanagnosia in a neuropsychological case study).

### Finke et al. (2007): Simultanagnosia and Huntington's Disease

Simultanagnosia is traditionally associated with bilateral stroke, but Finke et al. (2007) showed how it can also occur in a neurodegenerative disturbance, Huntington's Disease. The study was a follow-up to the first TVA-based study on Huntington's Disease (Finke et al., 2006), which is described in Section "Neurodegenerative Diseases." Ten Huntington's patients were tested with tasks that required perception of multiple overlapping figures under free viewing conditions, and the number of errors was correlated to performance on a whole report experiment. Finke et al. (2007) found that a deficit in the overlapping figures test was significantly correlated with low visual processing speed (*C* values), but not VSTM capacity (*K* values). Thus the findings paralleled the study of Duncan et al. (2003), who also found that simultanagnosia is primarily related to deficits in visual processing speed.

### Section Summary

TVA-based assessment has proven to be a relevant tool for studying neglect, simultanagnosia, and related conditions. The testing is clearly sensitive to central aspects of these neuropsychological conditions, also for patients with milder deficits, and the findings are consistent across studies. Especially the specificity of the assessment method has proven valuable to disentangle different components of neglect and simultanagnosia. This deficit analysis relates directly to fundamental theoretical discussions about the core characteristics of the two syndromes. The specificity of the assessment method has also proven useful for relating deficit patterns to their underlying lesion anatomy. For example, spatial biases in attention have been found after two very different types of lesions: large unilateral strokes or focal damage to the pulvinar nucleus. Finally, the study of Finke et al. (2012) indicates that the specificity of TVA-based assessment can also be useful to chart the efficiency of rehabilitation procedures, a theme that is revisited in Section "Future Directions."

### Reading Impairments: Alexia and Dyslexia

TVA-based assessment typically includes letters as test stimuli, which makes it natural to use the method for examining reading impairments. The main focus in this line of TVA-based research is on the basic visual efficiency of letter processing, represented by the two capacity parameters *K* and *C*. Two general types of disturbances have been investigated: alexia and dyslexia. Alexia, a selective deficit in reading ability after brain damage, typically occurs after stroke in the posterior left hemisphere (see Leff and Starrfelt, 2014, for a general overview of alexia research). In milder cases the reading disability may simply be explained by a visual field cut (e.g., hemianopia). Brain damage can, however, also affect perception of letters and words at more central processing levels. The main example of this is the classical neuropsychological syndrome of pure alexia. Pure alexia is characterized by a severe inability to read fluently, which is reflected in the so-called letter-by-letter reading pattern. Letter-by-letter reading is evident from naming tasks of single words, where reaction times for patients with pure alexia increase linearly (and strongly) with the length of the presented word. The reading deficit occurs in the absence of other problems in writing or language processes, hence the term "pure." A long-standing discussion concerns the nature of the basic cognitive deficit that produces the letter-by-letter reading pattern (see Alexia).
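The word-length effect that defines letter-by-letter reading can be quantified as the slope of a straight line fitted to naming times as a function of word length. The following minimal sketch uses ordinary least squares on synthetic values generated from an exact line; the numbers are purely illustrative and are not taken from any patient or study, and the function name is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic naming times (ms) built from an assumed 300 ms-per-letter line.
word_lengths = [3, 4, 5, 6, 7]
naming_times_ms = [500 + 300 * length for length in word_lengths]

slope_ms_per_letter, intercept_ms = fit_line(word_lengths, naming_times_ms)
```

A steep positive slope of this kind is what "increase linearly (and strongly) with the length of the presented word" refers to; a fluent reader would be expected to show a much flatter line.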

The other main line of TVA-based studies on reading disturbances focuses on dyslexia. As in alexia, the defining symptom of dyslexia is reading difficulties, but dyslexia is a developmental condition rather than a neurological one and is associated with less severe reading impairments than alexia. The dyslexia studies have also used TVA-based assessment to specify basic deficits in letter processing, in order to address theoretical hypotheses about the underlying causes of the reading problem.

### Alexia

Apart from the study of Habekost and Starrfelt (2006) described at the end of this section, TVA-based studies of reading problems after brain damage have focused on pure alexia. Several theoretical hypotheses about the basic deficit of pure alexia have been proposed, which roughly fall into two categories: (1) domain-specific theories, which suggest that pure alexia is related to a deficit in recognizing visual word or letter forms (Warrington and Shallice, 1980; Cohen et al., 2003), or (2) general visual accounts, which propose that the reading problem is caused by a general deficit in visual perception (Behrmann et al., 1998), for example an impairment in simultaneous perception (Farah, 2004). The different predictions of these models can be tested directly by TVA-based assessment. For example, a letter- or word-specific account would predict severe reductions in visual processing capacity (*C* and *K* values) for orthographic stimuli, but not other stimulus types. On the other hand, a primary deficit in simultaneous perception should lead to reduced *K* values for all types of visual stimuli, whereas visual processing speed for singly presented objects could be normal.

### Duncan et al. (2003): Ventral Simultanagnosia

The first to investigate these issues were Duncan et al. (2003), who studied a single patient with ventral simultanagnosia (i.e., pure alexia). In a whole report task with multiple letters the patient showed severe reductions of visual processing speed for letters, whereas the reduction in *K* was more modest. An attentional bias toward left-side stimuli (reflected in the *w*index) was also found, using the partial report design of Duncan et al. (1999). This initial investigation of pure alexia thus suggested a primary deficit in visual processing speed rather than simultaneous perception, accompanied by an attentional bias toward ipsilesional stimuli, as would be expected following a large unilateral lesion (cf. Neglect and Related Conditions).

### Starrfelt et al. (2009, 2010): Pure Alexia

In five case studies Starrfelt et al. (2009, 2010) investigated the stimulus selectivity of pure alexia by comparing performance with letters versus digits. The testing was conducted in single-stimulus recognition experiments (targeting the visual processing speed for each stimulus type) as well as by whole report of multiple items (testing the hypothesis of a deficit in simultaneous perception). The findings were very consistent across all five patients: both *C* and *K* were markedly reduced, and to the same extent for letters and digits. Starrfelt et al. (2009) concluded that these general deficits in visual "speed and span" provide a plausible explanation for the patients' problems with word reading. The hypothesis of a close relation between perception of individual letters and reading of whole words, however, still awaited direct investigation.

### Habekost et al. (2014a): Pure Alexia

Habekost et al. (2014a) therefore extended the investigations to include single-stimulus recognition of three-letter words and compared this task to performance with individual letters. Vocal naming speed for both types of stimuli was also tested to assess whether the main difficulty was at the level of perceptual or response-related processes. Four patients with pure alexia were investigated, one with a relatively mild reading impairment, the others with more severe symptoms. The healthy control participants showed a clear word superiority effect: consistently better visual recognition of words than letters. In contrast, none of the four patients showed this pattern; two patients had approximately equal recognition of letters and words, whereas the other two showed a significant "word inferiority" pattern. In the vocal naming task the word inferiority pattern was even more evident for all four patients. Besides these intra-individual differences between letters and words, all patients had clearly reduced recognition accuracy for both stimulus types compared to controls, replicating the previous findings of low visual processing speed in patients with pure alexia. There was, however, one interesting exception to this general pattern: patient SH showed normal recognition of single letters, which represented a significant dissociation from his slowed vocal naming of the same stimuli, as well as his highly deficient perception and naming of words. This may be the first clear demonstration of a pure alexia patient with intact letter recognition abilities, a finding of direct relevance to the hypothesis that the basic deficit in pure alexia is in recognition of individual letters (e.g., Behrmann and Shallice, 1995).

### Habekost and Starrfelt (2006): Quadrant-Amblyopia

Habekost and Starrfelt (2006) investigated a more subtle reading problem than pure alexia, in a patient with a condition they labeled "quadrant-amblyopia." Following a stroke in the posterior left hemisphere, the patient's ability to read fluently was reduced, even though other language functions were intact. The reading pattern was similar to the well-known syndrome of hemianopic alexia, but the patient did not show visual field cuts on perimetric assessment. Other known neuropsychological causes for reading impairment were also ruled out. However, TVA-based testing by whole report showed severe deficits in the visual processing speed of letters to the right of fixation (and in the upper right quadrant) whereas processing of letters was normal in other parts of the visual field. This spatially selective amblyopia ("foggy vision") provided a plausible explanation for the patient's mild reading problem.

### Dyslexia

Similar to alexia research, several theories of dyslexia claim that the reading disturbance is caused by impairments in specific visual processes. The suggested hypotheses include a deficit in simultaneous perception (Bosse et al., 2007), a left "mini-neglect" (Hari et al., 2001), or slower processing of individual letters (Dubois et al., 2010). This corresponds to deficits in the *K*, *w*index, and *C* parameters, respectively, and the hypotheses are therefore directly testable by TVA-based assessment.

### Dubois et al. (2010): Childhood Dyslexia

The first TVA-based study of children with dyslexia was reported by Dubois et al. (2010), who studied two 9-year-old children by whole report of 1, 3, or 5 letters. Dubois et al. (2010) found significant deficits in both visual processing speed and storage capacity of visual short-term memory for one of the children, whereas the other child only showed a borderline significant reduction in *C*. In both children a non-significant tendency for an attentional bias (*w*index) toward left-side stimuli was also noted.

### Bogon et al. (2014): Childhood Dyslexia

Bogon et al. (2014) followed up with a group study on 12 dyslexic children (mean age 10 years) using the whole and partial report designs of Duncan et al. (1999). Bogon et al. (2014) found that *K* and *C* were on average significantly reduced, whereas α and *w*index did not differ significantly from the control group. Interestingly, the *K* values of the dyslexic children correlated significantly with reading performance. This, however, contrasts with a study of normally developing children by Lobier et al. (2013), who instead found that differences in visual processing speed predict text reading speed.

### Stenneken et al. (2011): Adult Dyslexia

In a study of 23 high-functioning young adults with dyslexia Stenneken et al. (2011) also found deficits in visual processing speed, but in this study *K* values were on average at normal levels. In addition, Stenneken et al. (2011) found that *w*index values correlated with the severity of the dyslexia, even though this parameter was on average normal in the group.

### Section Summary

Research on two types of reading disturbances has been reviewed in this section: alexia and dyslexia.

TVA-based studies of pure alexia have consistently shown severe reductions of visual processing capacity, reflected in both *C* and *K* values. The reductions have been demonstrated for individual letters as well as digits, indicating a general visual deficit. Further, the processing deficit seems to be exacerbated in the case of word stimuli. In this way, TVA-based testing has proven useful in addressing some of the main theoretical hypotheses within this field. However, the investigations have yet to achieve their primary aim, which is to isolate the basic processing deficit in pure alexia (if one such exists: see Starrfelt and Shallice, 2014, for a discussion of current challenges in alexia research). Instead of one deficit the results point to a set of associated impairments, each of which may cause the reading impairment.

The findings on dyslexia are similar to pure alexia, but reflect the milder nature of this reading disturbance. Results converge on reductions in visual processing speed for letters as the main deficit. In some dyslexic children this impairment is accompanied by reductions in *K* values, which may predict additional reading problems. In high functioning adult dyslexics, only the deficit in *C* seems to remain. Interestingly, some of the results also indicate a relation between dyslexia and *w*index (leftward attentional bias), which is in line with some neuropsychological theories of dyslexia. The dyslexia studies have in several cases correlated TVA parameters with text reading speed and other markers of reading ability, which increases the clinical validity of the investigations. However, it is not clear from these investigations whether *C* or *K* is the best predictor of reading performance.

### Aging and Neurodegenerative Disorders

### Life-Span Development of TVA Parameters

Whereas normal aging is of course not a clinical condition, elderly people typically show substantial reductions of cognitive abilities compared to younger individuals. It is therefore relevant to study cognitive aging from some of the same perspectives as clinical conditions. Indeed the impairments found in studies of pathological aging, described in Section "Neurodegenerative Diseases," can only be fully understood in the context of the normal age development. The present review therefore also includes studies that have addressed the normal life-span development of the TVA parameters.

### McAvinue et al. (2012b): Development from 12 to 75 Years of Age

The first published study that looked into this issue was carried out by McAvinue et al. (2012b), who investigated 113 healthy participants between 12 and 75 years using the CombiTVA task. McAvinue et al. (2012b) found an approximately linear decline in the average values of *C* and *K* for each decade following the teenage years. The decline was strongest for *C*, but both parameters showed very large differences between young and elderly participants, as measured by effect sizes. The age development of *t*<sub>0</sub> and α was weaker and more complex: *t*<sub>0</sub> was approximately stable until the late 50s, but then increased markedly. In contrast, α increased from early adulthood until the age of about 50, but was approximately stable thereafter. The latter finding may, however, be related to the fact that α was close to the ceiling value of 1.0 in the older age groups.

### Habekost et al. (2013): Development from 69 to 87 Years of Age

The study of McAvinue et al. (2012b) did not include participants above age 75. Habekost et al. (2013) followed up by investigating the development of whole report parameters (*C*, *t*<sub>0</sub>, and the visual apprehension span, an indirect measure of *K*) in a sample of 33 non-demented participants between 69 and 87 years. The results showed a marked reduction of visual processing speed with age, which was approximately halved between 70 and 85 years of age. The other two TVA parameters also deteriorated with advanced age, but to a lesser extent.

### Espeseth et al. (2014): Development from 19 to 81 Years of Age

Espeseth et al. (2014) followed up on the study by McAvinue et al. (2012b) and reported on TVA performance in 325 healthy participants between 19 and 81 years. Espeseth et al. (2014) also took the investigation further by studying how the TVA parameters relate to the integrity of the brain's white matter, as measured by Diffusion Tensor Imaging (DTI). Espeseth et al. (2014) found that all TVA parameters except *w*<sub>index</sub> were significantly associated with age decline. As in the study of McAvinue et al. (2012b), both *C* and *K* showed an approximately linear decline from age 20 to 80, though with a weaker age effect in this study. The age effect on *t*<sub>0</sub> was also similar to the previous study: early stability until the 50s, but then a marked increase toward age 80. The age development of the α parameter followed the reverse pattern: an increase in early adulthood, but relative stability in the older years. However, the mean α values were close to 1.0 already at age 40, again indicating a ceiling effect in the results. With regard to white matter integrity, Espeseth et al. (2014) found a significant relation between mean diffusivity scores and *t*<sub>0</sub> values in the elderly participants. The effects were particularly seen in projection fibers such as the internal capsule, sagittal stratum, and corona radiata. This elaborates the preliminary findings on white matter lesions reported by Habekost and Rostrup (2007), and is broadly consistent with a neural interpretation of the TVA model that emphasizes the importance of thalamo-cortical projections for visual attentional computations (Bundesen et al., 2005).

### Wiegand et al. (2014a): EEG Patterns in Young vs. Older Participants

Wiegand et al. (2014a) added another neural aspect to the understanding of life-span changes in attention by showing how two EEG markers of *C* and *K*, the N1 and CDA components (see also Wiegand et al., 2014b), change significantly with aging. The anterior N1 was reduced for elderly participants with relatively low processing speed, suggesting that age-related loss of attentional resources slows encoding. Also, an enhanced right-central positivity was found for those older participants who had relatively high *K* values, pointing to additional neural recruitment for visual short-term memory in these individuals. Wiegand et al. (2014a) hypothesized that such changes in EEG components reflect cognitive processes that attempt to compensate for the capacity reductions that occur with increasing age.

### Wilms and Nielsen (2014): Development from 60 to 75 Years of Age

Wilms and Nielsen (2014) used whole report testing to measure *C*, *K*, and *t*<sub>0</sub> in a sample of 91 healthy individuals aged between 60 and 75 years. The TVA parameter estimates were compared to a range of demographic and lifestyle variables (e.g., gender, employment status, smoking, exercise, and video gaming). A significant aging effect was found only for *C*, which surprisingly showed an improvement with age. However, this result was not significant after controlling for the influence of demographic and lifestyle factors. Instead the results pointed to the importance of background variables like education level, employment status, and video gaming for *C*, but not *K* and *t*<sub>0</sub> (see also Nielsen and Wilms, 2015, for a structural equation modeling analysis of a related data set). The aging effects on TVA parameters found in this study, or rather the lack thereof, differed markedly from other findings in the field. However, as discussed by Wilms and Nielsen (2014), this may relate to biased selection and special demographic characteristics of the participant sample in this study.

### Neurodegenerative Diseases

Neurodegenerative diseases are pathological conditions which entail a progressive and debilitating loss of neurons, often developing over many years. The pathology typically starts in particular anatomical regions (e.g., the striatum or medial temporal lobe) and affects specific cognitive functions at first, but in many cases eventually spreads to other parts of the brain and causes global dementia. Prominent examples include Huntington's and Alzheimer's disease (AD), where TVA-based studies have so far focused, but also conditions like Parkinson's disease and multiple sclerosis, where the first TVA-based studies have yet to appear. In recent years there has been an increasing interest in characterizing early stages of the neurodegenerative processes, as for example seen in Mild Cognitive Impairment (MCI), a precursor of AD. This has motivated an interest in using the sensitivity and specificity of TVA-based assessment to provide biomarkers for early diagnosis and to chart the general progress of the diseases.

#### Finke et al. (2006): Huntington's Disease

The first TVA-based study on neurodegenerative diseases focused on Huntington's disease and was carried out by Finke et al. (2006). Huntington's disease is an autosomal dominant inherited disorder characterized by progressive degeneration of the caudate nucleus and putamen. Clinically visible symptoms of the disease typically appear in middle adulthood, but may be preceded by subtle cognitive impairments for years. Finke et al. (2006) used TVA-based testing to find cognitive biomarkers for the disease, which might aid in early identification. Eighteen patients with Huntington's disease were tested by whole and partial report experiments following the design of Duncan et al. (1999). The partial report experiment revealed a significant leftward bias of attentional weights (*w*<sub>index</sub>) in the patients, which was strongly related to age-of-onset for the first symptoms as well as to the genetic disease load (CAG repeat length). Thus the degree of lateral attentional bias seemed to mark the intensity of pathogenic mechanisms for individual patients. The finding is also consistent with the notion that the pathology in Huntington's disease is more pronounced in the left side of the brain (Rosas et al., 2001). The whole report experiment showed severe bilateral reductions of both *C* and *K*. The reductions did not correlate with age-of-onset or genetic load, but rather with the number of years since disease onset. Finke et al. (2006) therefore interpreted the two capacity parameters as biomarkers for the stage of progression in Huntington's disease.

### Bublak et al. (2006, 2011): Mild Cognitive Impairment and Alzheimer's Disease

Bublak et al. (2006) suggested a generalization of the findings on Huntington's disease to another major neurodegenerative disorder, AD, as well as to a less severe condition, MCI. AD is the most prevalent neurodegenerative disease and accounts for a majority of all age-related dementia cases. Individuals with MCI are characterized by subtle cognitive impairments that lie between the normal aging pattern and dementia; a large portion of these patients develop AD within 5 years (especially those with the so-called amnestic subtype of MCI). Bublak et al. suggested that Huntington's disease, AD, and MCI have in common a mixture of spatial and non-spatial disturbances of visual attention, which can be indexed by TVA-based assessment. This suggestion also represents an interesting parallel to the neglect syndrome, which too entails both spatial and non-spatial attention deficits. Preliminary data from 19 patients with MCI and nine patients with probable AD were mentioned by Bublak et al. (2006) to back up the hypothesis. The data indicated that, as in Huntington's patients, visual processing speed was reduced for both AD and MCI patients. The strong leftward attentional bias of Huntington's patients was also found in AD patients, though not in MCI patients. Thus a similar mixture of non-spatial (general) and lateralized deficits was reported across disease conditions.

The final set of results, however, diverged somewhat from this early report when they were presented by Bublak et al. (2011). In this study 18 AD patients and 18 MCI patients were tested by the whole report paradigm of Duncan et al. (1999). The results showed that deficits in MCI patients were confined to elevated visual thresholds (*t*<sub>0</sub> values), whereas AD patients were also impaired in the main processing capacity parameters, *C* and *K*. The deficits in *C* and *K* also correlated with impairments on other cognitive tasks. Overall, the results showed a staged pattern of decline from MCI to AD, with deficits in visual thresholds as an early indicator of cognitive problems followed by marked capacity reductions in patients who develop AD. Interestingly, AD patients medicated with cholinesterase inhibitors had significantly better *C* values than the other AD patients (for TVA-based studies on pharmacological effects in healthy individuals, see Finke et al., 2010, and Vangkilde et al., 2011).

### Redel et al. (2012): Mild Cognitive Impairment and Alzheimer's Disease

Redel et al. (2012) complemented the study of Bublak et al. (2011) by employing partial report testing of patients with MCI and AD. Redel et al. (2012) used the standard paradigm of Duncan et al. (1999) to test 32 patients with MCI and 16 AD patients. Compared to a matched control group, the MCI patients showed significant impairments of both α and *w*<sub>index</sub> values; further deterioration of the two parameters was observed in AD patients. Another interesting finding of this study was that individuals who were carriers of the apolipoprotein E ε4 allele (ApoE4), a known genetic risk factor for AD, were generally characterized by a leftward spatial bias (*w*<sub>index</sub>). This spatial bias was also related to early disease onset. The study of Redel et al. (2012) thus presented further support for Bublak et al.'s (2006) proposal that deficits in attentional weighting are commonly associated with neurodegeneration, even at early stages of disease development.

### Sorg et al. (2012): Mild Cognitive Impairment and Alzheimer's Disease

Sorg et al. (2012) investigated the neural basis of Redel et al.'s (2012) findings on spatial bias, focusing on hypometabolism in the posterior parietal cortex, which is one of the first signs of AD development. Using a combination of positron emission tomography (PET) scans and TVA-based testing, Sorg et al. (2012) investigated the relation between this cortical hypometabolism and attentional function in seven patients with mild AD and 28 patients with prodromal AD. Prodromal AD is defined as MCI of the amnestic subtype with at least one biological sign of AD development. The participants were tested by whole and partial report following the design of Duncan et al. (1999). The theoretical interest of the study centered on *w*<sub>index</sub>, whereas the other TVA parameters were only used to control for other attentional subprocesses (and not discussed). The main finding of the study was that spatial bias (typically toward left-side stimuli) correlated significantly with the asymmetry of hypometabolism in the parietal cortex. The result again supports the notion that attentional deficits are an early aspect of AD development, and can be detected using TVA-based testing.

### Section Summary

The studies on normal cognitive aging, with the possible exception of Wilms and Nielsen (2014), show that the two attentional capacity parameters *C* and *K* decline approximately linearly for each decade after they peak in young adulthood. The life-span development of the other TVA parameters is more complex. *t*<sub>0</sub> seems to be relatively stable until the 50s, but then increases significantly in the following decades. An inverse pattern has been reported for α: an increase in α values (i.e., declining selectivity) until about age 50 followed by late stability, but the latter finding may be confounded by ceiling effects in the CombiTVA testing. *w*<sub>index</sub>, in contrast, seems to remain stable (i.e., symmetrical on average) across the life-span of neurologically healthy adults. The neural basis for the age-related changes in TVA parameters is also beginning to be explored. Two newly published studies point to the importance of white matter tracts for *t*<sub>0</sub> and to compensatory brain activity related to *K*, respectively.

Pathological conditions associated with increasing age have been investigated in a series of TVA-based studies of neurodegenerative disorders. These studies have produced a number of interesting findings. Most importantly, they have shown how TVA-based assessment can chart the progression of the diseases (from MCI to AD, and across the gradual development of Huntington's disease). These demonstrations are generally supported by correlations between the TVA parameters and clinically relevant measures, which validates the TVA parameters as biomarkers for disease progression. Another theoretically interesting result, which has been replicated in several studies, is the demonstration of spatial asymmetries of attentional weighting in neurodegenerative conditions that are traditionally associated with general (i.e., bilateral) cognitive decline. These findings have important perspectives for the general neuropsychological understanding of these diseases.

### ADHD and other Neurodevelopmental Disorders

In the last few years a fourth research area has emerged for clinical TVA-based investigations: studies of neurodevelopmental disorders. Dyslexia, which has already been treated in the context of reading disturbances in Section "Dyslexia", may be regarded as one such condition. Other examples are preterm birth and congenital brain malformations (see Section "Other Neurodevelopmental Conditions"). At present, however, most research within this field, both published and ongoing, seems to be directed at the syndrome of attention deficit hyperactivity disorder (ADHD).

### Attention Deficit Hyperactivity Disorder

Attention deficit hyperactivity disorder is a highly prevalent psychiatric disorder that is characterized by symptoms of inattention, impulsivity, and hyperactivity. ADHD symptoms emerge in childhood and in many cases persist into adulthood. The core cognitive deficit (or deficits) underlying the behavioral manifestations of ADHD is much debated, with suggestions including deficient working memory, delay aversion, and hypoarousal (Castellanos et al., 2006). Several of these hypotheses are directly translatable into TVA terms: for example, a deficit in working memory may correspond to *K* reduction, whereas hypoarousal can be related to lower *C* values (see Bundesen et al., 2015, for a theoretical account of the relation between visual processing speed and arousal). Based on such hypotheses, TVA-based studies attempt to reveal the core deficits in ADHD.

### Finke et al. (2011): Adult ADHD

The first study was carried out by Finke et al. (2011), who investigated a group of 30 adults with ADHD. The group was carefully screened for comorbidities, which otherwise occur very frequently in persons with ADHD. Finke et al. (2011) tested the participants using the standard whole and partial report paradigms of Duncan et al. (1999). They found a selective deficit in the *K* parameter (with medium to high effect size), whereas *C*, *w*<sub>index</sub>, and α values were not significantly different from the control group. The *K* values did not correlate significantly with age, IQ, income, or other socio-economic measures and thus seemed to represent a relatively pure neurocognitive endophenotype for ADHD. The results thus support theories of deficient working memory as the primary deficit in ADHD.

### McAvinue et al. (2012a): Childhood ADHD

Quite different results were obtained by McAvinue et al. (2012a), who studied 25 children (9–13 years) with ADHD using the CombiTVA paradigm. In contrast to Finke et al. (2011), McAvinue et al. (2012a) found selective deficits in the *C* parameter, whereas *K* and the other standard TVA parameters did not differ significantly from the control group. Notably, McAvinue et al. (2012a) fitted the data by a new algorithm that included estimation of the frequency of attentional lapses (see Developments in Data Analysis). Lapses represent trials in which participants are off-task (i.e., reporting zero items in spite of a relatively long exposure time). The lapse frequencies of children with ADHD were even more different from the control group (i.e., the effect size was larger) than was the case for visual processing speed. Overall, the results of McAvinue et al. (2012a) seem to favor an arousal-related model of ADHD, which can explain the slower processing and the frequent lapses of attention.

### Other Neurodevelopmental Conditions

### Caspersen and Habekost (2013): Spina Bifida Myelomeningocele

The heterogeneity that characterizes many neurodevelopmental conditions was also evident in a study by Caspersen and Habekost (2013). Caspersen and Habekost (2013) studied six children with spina bifida myelomeningocele (SBM), a congenital defect in the neural tube which often leads to cerebral disturbance. This was the first TVA-based study of a neuropaediatric disorder. On a group level, the children with SBM had significantly poorer α values than the control group. However, the most informative results were demonstrated at single case level. Each of the six children deviated significantly from the control group on one or more test variables (e.g., *t*<sub>0</sub>, *C*<sub>left</sub>, or *K*). The individual patterns of deficits were quite different even though all children shared the same general neurological condition. Thus the study showed the usefulness of TVA-based assessment for providing individual deficit profiles, and demonstrated the heterogeneity of cognitive deficits after SBM.

### Finke et al. (2015): Preterm Born Adults

Finke et al. (2015) used whole and partial report experiments to study another population with developmental disturbances: adults who were born preterm. Thirty-three preterm individuals were compared to 32 full-term born participants. Resting state fMRI was also included in the investigation to obtain measures of functional connectivity of intrinsic brain networks relevant for visual attention. Finke et al. (2015) found a selective deficit in the *K* parameter in the preterm group, while the other parameters did not differ from the control group. Among preterm born adults, individual patterns of changed connectivity in occipital and parietal cortices were systematically associated with visual short-term memory function in such a way that the more distinct the connectivity differences, the better the preterm adults' *K* score. Finke et al. (2015) suggested that the changes reflect processes that attempt to compensate for the adverse developmental consequences of prematurity.

### Section Summary

Compared to the other research areas reviewed in this article, the studies on neurodevelopmental conditions have produced rather divergent results. The complexity of the findings is partly related to the fact that quite different clinical conditions have been studied. However, findings within the same patient group have also differed substantially. This is perhaps not surprising, given the heterogeneity of many diagnostic categories in neuropsychiatry and neuropaediatrics (e.g., SBM). Still it is important to consider the theoretical implications of the divergent findings, especially for ADHD, where several well-defined hypotheses have been tested. In particular, it is not presently clear how to reconcile the opposite findings on *K* and *C* from the two studies on children and adults with ADHD. One possibility is that adults with ADHD represent a special subtype ("persisters") of the general population of individuals who develop ADHD. A different, methodological explanation is that the individuals studied by Finke et al. (2011) represent a purer sample of ADHD compared to the sample of McAvinue et al. (2012a), which was not similarly screened for comorbidities or other confounding factors. It is also possible that the divergent findings are related to the use of different variants of TVA-based assessment (CombiTVA vs. the Duncan et al. paradigm), but it is not clear why the former paradigm should be more sensitive to *C* deficits and the latter to deficits in *K*. Hopefully, future studies on ADHD will provide more clarity on this issue.

## General Discussion

See **Table 1** for an overview of the clinical TVA-based studies that have been published until now, categorized under the four research areas presented in this article. Besides the clinical condition and the number of patients investigated, the table lists the central findings of each study in terms of which TVA parameters were affected. As described in Sections "Neglect and Simultanagnosia" to "ADHD and other Neurodevelopmental Disorders", the individual findings are often complex and require a detailed understanding of each study in order to be interpreted. However, one thing is immediately evident from the table: there is a high degree of overlap between the affected TVA parameters across research fields. In particular, reductions in both *C* and *K* are found in such diverse groups as patients with neglect, pure alexia, Huntington's disease, and ADHD. This has important implications for how one should understand the specificity of TVA-based assessment. The assessment is clearly not clinically specific in the sense that deficits in particular parameters are diagnostic of particular conditions. Basic diagnosis will have to be carried out by other means. Instead, the specificity of the assessment method is cognitive: it can point to impairments in well-defined cognitive functions, deficits that may be shared with other clinical conditions, but are functionally distinct from other types of attentional deficits. The clinical value of TVA-based assessment thus lies primarily in its ability to elaborate on the basic diagnosis and provide a detailed characterization of the pattern of attentional difficulties for a given individual or group of patients.

The cognitive specificity of TVA-based assessment seems equally well suited for studying effects of focal brain lesions and clinical conditions with diffusely distributed pathology affecting large-scale brain networks (e.g., neurodegenerative diseases). Especially in the latter type of conditions, the ability to characterize attentional impairments by a well-defined parameter profile goes well beyond conventional clinical assessment approaches. Thus, TVA-based assessment has great potential not only for improving the understanding of generalized brain diseases but also for establishing neuropsychological measures as valid biomarkers.

#### TABLE 1 | Clinical theory of visual attention (TVA)-based studies.

This perspective of cognitive profiling makes it important to understand the wider significance of each TVA parameter, including its relation to other cognitive and neuropsychological constructs. Starting with visual processing speed *C*, this parameter provides a measure of the efficiency of visual form recognition (for a given stimulus type). There are many other measures of processing speed in the neuropsychological literature, but most of them are reaction-time based (e.g., the Alertness task in the TAP battery, Zimmermann and Fimm, 1993) and therefore not specific to processing in the visual system. Other accuracy-based measures of visual processing speed do exist, for example the useful field of view (Ball and Owsley, 1993) and the inspection time (Deary and Stough, 1996). Like TVA-based assessment these tests use very brief visual presentations. However, these testing conditions mean that performance can be strongly influenced by individual differences in perception thresholds as well as in visual processing speed. TVA-based analysis is currently the only way to distinguish between these two factors. Concerning the clinical significance of visual processing speed, the frequent findings of *C* reductions across neuropsychological conditions suggest that the parameter is vulnerable to disturbance in many different brain regions, also outside cortical visual areas (see Habekost and Starrfelt, 2009, for a detailed discussion on the lesion anatomy of *C* and *K*). Besides the parameter's specific relation to visual form recognition abilities, *C* can therefore be seen as a sensitive marker for the general processing efficiency of the brain.
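The confound between perception threshold and processing speed at brief exposures can be illustrated with a minimal sketch. It assumes the exponential encoding function commonly associated with TVA, in which an item processed at rate `v` is encoded with probability `1 - exp(-v * (t - t0))` once the exposure duration `t` exceeds the threshold `t0`. The function name and the parameter values of the two observers are hypothetical and chosen only for illustration; they are not data from any of the reviewed studies.

```python
import math


def encoding_probability(exposure_ms, v_per_s, t0_ms):
    """Probability that a single item is encoded by the end of the exposure.

    Assumes exponential processing at rate v_per_s (items/s) that starts
    only after the perceptual threshold t0_ms has elapsed.
    """
    effective_s = max(0.0, exposure_ms - t0_ms) / 1000.0
    return 1.0 - math.exp(-v_per_s * effective_s)


# Two hypothetical observers (illustrative values only).
fast_but_high_threshold = {"v_per_s": 30.0, "t0_ms": 30.0}
slow_but_low_threshold = {"v_per_s": 15.0, "t0_ms": 10.0}

for exposure_ms in (20, 50, 100, 200):
    p_a = encoding_probability(exposure_ms, **fast_but_high_threshold)
    p_b = encoding_probability(exposure_ms, **slow_but_low_threshold)
    print(f"{exposure_ms:>3} ms: observer A = {p_a:.3f}, observer B = {p_b:.3f}")
```

In this sketch the two observers perform identically at the 50 ms exposure, yet they diverge at shorter and longer exposures. A test based on a single brief exposure therefore cannot tell a raised threshold from slowed processing, whereas sampling several exposure durations allows the two factors to be estimated separately.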

The other main parameter of attentional capacity, *K*, represents the maximum number of visual objects that can be perceived simultaneously. As such, it is related to other measures of visuo-spatial working memory. For example, Finke et al. (2005) found a moderately significant correlation between *K* and the Visual Memory Span from the WMS-R battery (Härting et al., 2000). However, the *K* parameter is arguably a very pure estimate of visual apprehension span, due both to the minor response requirements of the whole report task and to the power of TVA analysis, which controls the *K* estimate for the influence of other visual factors (e.g., processing speed). One objection to this latter argument is that *C* and *K* typically correlate significantly, which may question their separability. However, correlation need not contradict independence; weight and height are also correlated, but clearly represent distinct aspects of bodily structure. *K* and *C* are estimated mathematically independently and, given data of sufficient quality (i.e., several hundred trials and performance spanning from threshold to ceiling), it should be possible to distinguish reliably between the two parameters even though they tend to co-vary in the same individuals. In terms of clinical significance, *K* has been found impaired in many different conditions (e.g., neglect, ADHD, or Huntington's disease). As for *C*, there are many theoretical and empirical reasons to assume that *K* depends on large anatomical networks that involve many parts of the brain including white-matter connectivity (Habekost and Rostrup, 2007; Habekost and Starrfelt, 2009). The *K* parameter can therefore also be taken as an indicator of the brain's general processing efficiency, besides its specific relation to simultaneous visual perception.
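Why data spanning from threshold to ceiling help to separate *C* from *K* can be sketched as follows. The code assumes a simplified fixed-capacity race model: all items in a masked display have equal attentional weight, each item is processed at rate `C / n_items`, and *K* is an integer, so that the number of items finishing within the exposure is binomially distributed and the score is capped at *K*. The function name, the parameter names, and all values are hypothetical and serve only as an illustration.

```python
import math


def expected_score(exposure_ms, C, K, t0_ms, n_items):
    """Expected number of items reported in whole report.

    Simplifying assumptions: equal weights (rate C / n_items per item),
    exponential processing after t0_ms, and an integer storage limit K.
    """
    effective_s = max(0.0, exposure_ms - t0_ms) / 1000.0
    p = 1.0 - math.exp(-(C / n_items) * effective_s)
    total = 0.0
    for finished in range(n_items + 1):
        prob = math.comb(n_items, finished) * p**finished * (1.0 - p) ** (n_items - finished)
        total += min(finished, K) * prob  # at most K items can be stored
    return total


# Hypothetical profiles: a reference observer, one with halved C, one with lower K.
profiles = {
    "reference": {"C": 30.0, "K": 4},
    "reduced C": {"C": 15.0, "K": 4},
    "reduced K": {"C": 30.0, "K": 3},
}
for label, params in profiles.items():
    scores = [
        expected_score(t, params["C"], params["K"], t0_ms=20.0, n_items=6)
        for t in (40, 80, 200, 2000)
    ]
    print(label, [round(s, 2) for s in scores])
```

Under these assumptions a reduction in *C* mainly lowers the score at short exposures, whereas a reduction in *K* mainly lowers the asymptote reached at long exposures, which is why the two parameters can be told apart even when they co-vary across individuals.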

Parameter α represents the effectiveness of top–down control of visual attention. For this parameter, too, analogous cognitive measures exist. For example, Finke et al. (2005) compared α to performance on a Stroop task and found a moderate correlation. A similar correlation was found to the executive control network parameter of the ANT task (Habekost et al., 2014b). The fact that these correlations were only modest is probably due to the fact that both alternative measures of attentional control (as well as most other tests in the literature) also depend strongly on motor-related processes like response inhibition. In contrast, the partial report task specifically assesses perceptual processes in attentional control (i.e., visual filtering). Clinically, findings on α deficits have been surprisingly sparse and are limited to a few studies (Bublak et al., 2005; Peers et al., 2005; Redel et al., 2012). There may be several explanations for this. On the methodological side, estimation of α is generally less reliable than the other TVA parameters, and there may be ceiling effects in the CombiTVA testing of this parameter. However, α still has a reliability level that is comparable to many other neuropsychological measures (Habekost et al., 2014b) and the Duncan et al. paradigm is not characterized by ceiling effects in α. It is therefore likely that top–down selectivity in visual attention is simply more robust to many kinds of brain disturbances than the other TVA parameters, and perhaps mainly vulnerable to very large lesions (Peers et al., 2005) or damage to the superior frontal lobe (Bublak et al., 2005). The latter type of brain disturbance is relatively rare after stroke, but may be more common in AD even at early stages (Redel et al., 2012).

Parameter *w*<sub>index</sub> is a measure of lateral attentional bias in visual perception. In the cognitive literature it is conceptually associated with measures like spatial orienting (although *w*<sub>index</sub> did not correlate with this parameter in the ANT test; Habekost et al., 2014b) and the Visual Scanning subtest of the TAP battery (where Finke et al., 2005, did find a significant correlation). Clinically, *w*<sub>index</sub> is most directly associated with the phenomenon of visual extinction (i.e., competition for conscious perception between bilaterally presented stimuli; Bender, 1952). Compared to such alternative cognitive or clinical measures of spatial bias, the estimation of *w*<sub>index</sub> has the same advantages as the other TVA parameters: it is controlled for the influence of confounding motor and visual factors, particularly side differences in sensory effectiveness. Further, the findings across the clinical TVA literature suggest that *w*<sub>index</sub> is a sensitive indicator of brain asymmetry, whether it be caused by unilateral stroke or asymmetrical neurodegeneration. Besides the typically strong effects of large unilateral brain damage on *w*<sub>index</sub>, a theoretically interesting finding is the relation between *w*<sub>index</sub> and lesions in the thalamic pulvinar nucleus (Habekost and Rostrup, 2006; Kraft et al., 2015), which is consistent with the neural TVA model of Bundesen et al. (2005).
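As a rough illustration of how the two weighting parameters relate to the underlying attentional weights, the sketch below assumes the definitions commonly used in TVA-based partial report: the lateral index is taken as the summed weight of left-side items divided by the summed weight of all items (so that 0.5 means balanced weighting), and α is taken as the ratio of a distractor's weight to a target's weight (so that values near 0 mean efficient selection and 1.0 means no selectivity). These definitions, the function names, and the example weights are assumptions made for illustration and are not quoted from the reviewed studies.

```python
def w_index(left_weights, right_weights):
    """Lateral bias index: share of total attentional weight on the left side.

    Under the assumed convention, 0.5 indicates balanced weighting and
    values above 0.5 indicate a leftward bias.
    """
    left = sum(left_weights)
    right = sum(right_weights)
    return left / (left + right)


def alpha(distractor_weight, target_weight):
    """Top-down selectivity: weight of a distractor relative to a target."""
    return distractor_weight / target_weight


# Hypothetical weights for three left and three right display positions.
print(w_index([1.0, 1.0, 1.0], [1.0, 1.0, 1.0]))
print(w_index([0.9, 0.9, 0.9], [0.3, 0.3, 0.3]))
print(alpha(distractor_weight=0.4, target_weight=1.0))
```

Because both quantities are ratios of weights, they are unaffected by a uniform scaling of all weights, which is one way to see why they can be estimated separately from overall capacity.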

The fifth TVA parameter, *t*<sub>0</sub>, represents the lower temporal threshold for visual perception. The parameter is related to traditional psychophysical measures of perceptual thresholds. *t*<sub>0</sub> has generally received less interest than the other four TVA parameters and is sometimes treated merely as a control variable for valid *C* estimation. However, recent findings indicate that changes in *t*<sub>0</sub> may be an important marker for pathological aging, as seen in MCI (Bublak et al., 2011) or changes in white matter connectivity (Espeseth et al., 2014). Future studies will clarify whether the *t*<sub>0</sub> parameter deserves more attention than it has received hitherto.

### Future Directions

Clinical TVA-based studies can be predicted to go in several directions in the coming years. One development may be a further widening of the clinical scope for the studies. Over the last 15 years TVA-based assessment has been applied to an increasing number of clinical conditions, as documented in this review. However, attentional disturbances are a significant part of most neurological and psychiatric conditions, so there still seems to be room for many novel investigations. For example, currently ongoing studies include patients with multiple sclerosis, Tourette's syndrome, congenital prosopagnosia, and traumatic brain injury. Other relevant topics for TVA-based studies that still await investigation are Parkinson's disease and schizophrenia. Also, given that many questions still remain for the clinical conditions that have been studied previously by TVA methods (e.g., ADHD or dyslexia), studies are likely to continue within the already established research areas.

A second main development in the coming years might be a stronger combination of clinical TVA-based assessment with treatment programs and supplementary clinical and neuroimaging measures. As noted in the introduction, a number of studies have investigated the effects of cognitive or physiological interventions on TVA parameters in healthy participants. Given that several of these interventions (e.g., cognitive training or pharmacological substances) have large relevance for clinical treatment, it seems promising to design studies where the clinical effects of such interventions are monitored by TVA-based assessment. To further strengthen the clinical relevance of such investigations, it would also be natural to include a wider range of clinical and biological measures on the patients. This can be seen as a continuation of the biomarker approach taken in many of the studies on neurodegenerative diseases, but broadening the investigations further to include more information on, for example, genetic properties or neurotransmitter levels. Finally, given Wiegand et al.'s promising findings on EEG markers for *C* and *K*, such measures of brain function are also likely to grow in importance for clinical TVA-based research in the coming years.

### Conclusion

Since the first article was published (Duncan et al., 1999), about 30 studies have used TVA-based assessment to investigate attentional deficits in various neurological and psychiatric conditions. Clinical TVA-based studies have so far focused on four main research areas: (1) neglect and related conditions, (2) reading disturbances, (3) aging and neurodegenerative diseases, and (4) neurodevelopmental disorders. Typically the main aim of these studies has been to use the specificity of TVA-based assessment to address theoretical hypotheses about the core deficits of the disorders. The findings are generally consistent across studies and have often added substantially to the theoretical understanding of the conditions, although decisive results on the core deficits of, for example, pure alexia and ADHD still remain elusive. The clinical validity of the assessment has been supported by some studies, especially on neurodegenerative diseases, which have related the TVA parameters to clinically relevant behavior or biological disease markers. However, in other fields of TVA-based research, the relation to other clinical measures needs to be further established.

The main strength of the TVA-based assessment method lies in its theoretical grounding and cognitive specificity: the ability to measure five theoretically central aspects of visual attention. As additional qualities, the method has also proven sensitive to minor attentional disturbances, has shown good reliability, and can be adapted to many different types of clinical investigations. Looking toward future studies, the list of neuropsychological conditions that can be meaningfully addressed by TVA-based assessment is far from exhausted, and the next in line may be multiple sclerosis, Tourette's syndrome, and traumatic brain injury. Besides studying new patient populations, it also seems promising to combine TVA-based assessment with cognitive and pharmacological interventions, and to include biological disease markers and EEG measures in the investigations.

### Acknowledgments

The article was supported by a grant to the author from the Danish Council for Independent Research under the Sapere Aude program (project no. 11-104180, "Attentive Mind"). Thanks to Signe Vangkilde for help with the illustrations.



**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Habekost. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## TVA-based assessment of visual attentional functions in developmental dyslexia

### *Johanna Bogon<sup>1</sup>\*, Kathrin Finke<sup>2</sup> and Prisca Stenneken<sup>3</sup>*

<sup>1</sup> Department of Experimental Psychology, University of Regensburg, Regensburg, Germany

<sup>2</sup> Department of Psychology, General and Experimental Psychology/Neuro-Cognitive Psychology, Ludwig-Maximilians-Universität München, Munich, Germany

<sup>3</sup> Department of Speech and Language Pathology, University of Cologne, Cologne, Germany

#### *Edited by:*

Søren Kyllingsbæk, University of Copenhagen, Denmark

#### *Reviewed by:*

Randi Starrfelt, University of Copenhagen, Denmark Signe Allerup Vangkilde, University of Copenhagen, Denmark

#### *\*Correspondence:*

Johanna Bogon, Department of Experimental Psychology, University of Regensburg, Universitätsstraße 31, D-93053 Regensburg, Germany e-mail: johanna.bogon@psychologie.uni-regensburg.de

There is an ongoing debate whether an impairment of visual attentional functions constitutes an additional or even an isolated deficit of developmental dyslexia (DD). In particular, performance in tasks that require the processing of multiple visual elements in parallel has been reported to be impaired in DD. We review studies that used parameter-based assessment for identifying and quantifying the impaired aspect(s) of visual attention that underlie this multi-element processing deficit in DD. These studies used the mathematical framework provided by the "theory of visual attention" (Bundesen, 1990) to derive quantitative measures of general attentional resources and attentional weighting aspects on the basis of behavioral performance in whole- and partial-report tasks. Based on parameter estimates in children and adults with DD, the reviewed studies support a slowed perceptual processing speed as an underlying primary deficit in DD. Moreover, a reduction in visual short-term memory storage capacity seems to present a modulating component, contributing to difficulties in written language processing. Furthermore, comparing the spatial distributions of attentional weights in children and adults suggests that having limited reading and writing skills might impair the development of a slight leftward bias that is typical for unimpaired adult readers.

**Keywords: developmental dyslexia, visual attention, processing speed, visual short term memory, spatial bias, top–down control, whole- and partial-report task**

#### **VISUAL ATTENTION AND DEVELOPMENTAL DYSLEXIA (DD)**

Visual attentional functions are currently discussed as being related to developmental dyslexia (DD), a disorder in written language acquisition that cannot be explained by age, visual sensory problems, or inadequate reading instruction (World Health Organization, 2011). Due to increasing evidence for an underachievement of people with DD in many attention-based tasks, it is debated whether an impairment of visual attentional functions constitutes an additional or even an isolated deficit of subgroups of DD (Heim et al., 2008; Menghini et al., 2010).

In particular, performance in tasks that require the processing of multiple visual elements in parallel seems to be impaired in DD (Valdois et al., 2003, 2012; Pammer et al., 2004; Hawelka and Wimmer, 2005; Bosse et al., 2007; Jones et al., 2008; De Luca et al., 2010; Romani et al., 2011; Lobier et al., 2012). This multi-element processing deficit was mainly assessed using tasks based on Averbach and Sperling (1968) where participants have to report as many letters as possible (whole-report) or only precued ones (partial-report) from briefly displayed visual arrays. For example, Bosse et al. (2007) revealed that multi-element processing was significantly impaired in children with DD and that this deficit accounted for a substantial proportion of the variance in reading speed and accuracy. Furthermore, reduced multi-element processing in participants with DD was associated with an increased number of rightward fixations in text reading and of eye movements in word and pseudoword reading (Hawelka and Wimmer, 2005; Prado et al., 2007). Importantly, multi-element processing performance depends on distinct attentional functions. Poor performance could stem from deficient general attentional resources, involving visual processing speed and/or visual short-term memory (VSTM). Furthermore, it could be related to selectivity changes, involving spatial distribution of attentional weights and/or top–down control.

One influential theoretical concept focusses on general resource limitations. It is suggested that people with DD have a reduced visual attentional span, assessed as the number of elements that can be reported from a briefly displayed array (Bosse et al., 2007; Prado et al., 2007; Valdois et al., 2011, 2012; Lobier et al., 2012). Critically, this deficit could be caused by an elevated visual threshold, a reduction in visual processing speed or VSTM storage capacity, or by a combination of such deficits. A delayed start or a slower rate of encoding the elements of a briefly displayed multi-element array should both lead to a reduced number of elements that enter VSTM before the display disappears. Alternatively, a limitation of the maximum number of elements that can be stored in VSTM could also account for a reduced visual attentional span.

Additionally or alternatively, impaired attentional selectivity aspects, i.e., changes in spatial attention and/or inefficiency in filtering information, could also account for deficient multi-element processing. Indeed, people with DD might not show a normal slight leftward attentional bias. It is known that patients with hemi-neglect following right parietal damage show a rightward attentional bias, which is shown, for example, in rightward deviations in line bisection tasks. Normal participants instead deviate slightly but reliably toward the left when bisecting lines (Bowers and Heilman, 1980) and also show a leftward bias in speeded lateralized stimulus detection. This behavior is termed "pseudoneglect." A number of studies indicate that participants with DD do not show pseudoneglect (Facoetti et al., 2000, 2003; Facoetti and Molteni, 2001; Hari et al., 2001; Buchholz and Davies, 2005; Sireteanu et al., 2005; Liddle et al., 2009; Ruffino et al., 2010; Waldie and Hausmann, 2010; Ziegler et al., 2010; Stenneken et al., 2011). Reduced efficiency of top–down controlled selection could also account for poor multi-element processing. People with DD might be especially prone to interference (Sperling et al., 2005; Roach and Hogben, 2007; Moores et al., 2011; Stevens et al., 2013) and such susceptibility to distracting information might reduce the number of relevant elements encoded from briefly displayed multi-element arrays.

In this review, we explore the role of visual attention functions for impaired multi-element processing in DD. A critical methodological challenge is to identify and quantify the impaired function(s). Recently, a number of studies have used a parameter-based account of visual attention assessment (Dubois et al., 2010; Stenneken et al., 2011; Lobier et al., 2013; Bogon et al., 2014). In these studies, the formal framework provided by the "theory of visual attention" (TVA; Bundesen, 1990, 1998) was used to derive quantitative estimates of the individual capabilities of a participant in selecting visual information. This was done by computational modeling of behavioral performance in two simple, psychophysical tasks, i.e., whole- and partial-report of briefly presented letter arrays.

### **IDENTIFICATION AND QUANTIFICATION OF DISTINCT ATTENTIONAL PARAMETERS BY MEANS OF THE TVA**

Within the TVA framework, the efficiency of visual selection performance of a given participant can be described on the basis of a set of mathematically independent, quantitative measures of attentional components (for a comprehensive description of TVA see Bundesen, 1990; for a formal description and TVA equations see Kyllingsbæk, 2006). TVA assumes that objects from a briefly presented array are processed in parallel and compete for selection into a VSTM store. Only objects that reach the store before its storage capacity is exhausted and before iconic memory of the array vanishes can be reported. The resulting race among objects can be biased in such a way that some objects are favored for selection, based either on stimulus-driven, "bottom–up" or on intentional, "top–down" factors. The probability of selection is determined (i) by the participant's individual minimal effective exposure duration, the visual perception threshold *t0*, (ii) by an object's processing rate, which depends on the relative attentional weight it receives, and (iii) by the capacity of the VSTM store *K* (if the store is filled, selection terminates). TVA provides parameters for characterizing the general processing efficiency of the information processing system (minimal effective visual exposure duration, processing speed, and VSTM storage capacity), and for characterizing attentional selectivity (top–down control and spatial distribution of attention).
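The race described above can be illustrated with a minimal simulation sketch. It is not the fitting procedure used in the reviewed studies. It assumes exponentially distributed processing times, a total processing speed *C* divided among objects in proportion to their attentional weights, and a display whose usable duration equals the exposure duration (i.e., iconic memory is ignored). The function name and all parameter values are hypothetical.

```python
import random


def simulate_whole_report_trial(weights, C, K, t0_ms, exposure_ms, rng):
    """Simulate one trial of the race; return indices of objects stored in VSTM.

    Assumptions (illustrative only): each object's rate is C * w_i / sum(w),
    processing times are exponential and start once t0 has elapsed, and the
    display is effectively available only for exposure_ms.
    """
    total_weight = sum(weights)
    finished = []
    for index, weight in enumerate(weights):
        if weight <= 0:
            continue  # an object without attentional weight never finishes
        rate_per_s = C * weight / total_weight
        finish_ms = t0_ms + 1000.0 * rng.expovariate(rate_per_s)
        if finish_ms <= exposure_ms:
            finished.append((finish_ms, index))
    finished.sort()  # earliest finishers win the race
    return [index for _, index in finished[:K]]  # the store holds at most K objects


rng = random.Random(1)
stored = simulate_whole_report_trial(
    weights=[1.0] * 6, C=25.0, K=3, t0_ms=20.0, exposure_ms=200.0, rng=rng
)
print("objects in VSTM:", stored)
```

Raising *t0*, lowering *C*, or lowering *K* in this sketch each reduces the number of reported objects, which is why the three parameters have to be estimated separately.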

In TVA-based assessment, the general information processing efficiency is assessed within a whole-report task, in which subjects are briefly presented with multiple stimuli and have to identify as many as possible. The probability of identification is modeled by an exponential growth function (see **Figure 1**), in which the origin of the function reflects the visual perception threshold (parameter *t0*: minimal effective exposure duration in ms), the growth parameter reflects the rate at which the stimuli (objects) can be processed (parameter processing speed *C*: number of elements/s), and the asymptote indicates the maximum number of objects that can be represented within VSTM (parameter VSTM storage capacity *K*). Thus, estimating these parameters makes it possible to differentiate whether a deficient visual span performance (e.g., Bosse et al., 2007) in DD is caused by deficiencies in visual perception threshold or visual encoding speed, by storage capacity problems, or by a combination of these factors.
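As a rough illustration of how *t0*, *C*, and *K* shape the whole-report curve, the following sketch computes the expected number of reported items as a function of exposure duration. It assumes equal attentional weights (each of *n* items is processed at rate *C*/*n*), independent exponential processing, and an integer *K*. These are simplifications for illustration and do not reproduce the estimation software used in the reviewed studies. The function names and parameter values are hypothetical.

```python
import math


def encoding_probability(exposure_ms, t0_ms, C, n_items):
    """Probability that a single item is encoded by the end of the exposure."""
    effective_s = max(0.0, (exposure_ms - t0_ms) / 1000.0)  # nothing is encoded before t0
    return 1.0 - math.exp(-(C / n_items) * effective_s)


def expected_report_score(exposure_ms, t0_ms, C, K, n_items):
    """Expected number of reported items: at most K of the encoded items are stored."""
    p = encoding_probability(exposure_ms, t0_ms, C, n_items)
    score = 0.0
    for j in range(n_items + 1):
        prob_j = math.comb(n_items, j) * p**j * (1.0 - p) ** (n_items - j)
        score += min(j, K) * prob_j
    return score


for exposure in (10, 50, 100, 200, 500):
    print(exposure, round(expected_report_score(exposure, 20.0, 25.0, 3, 5), 2))
```

In this sketch the curve leaves zero at *t0*, its initial slope is governed by *C*, and it levels off at *K*, mirroring the three parameters named in the text.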

Furthermore, TVA-based assessment allows individual estimation of the selectivity aspects of attention that are of interest here: spatial laterality (parameter spatial distribution of attention *w*<sub>λ</sub>) and efficiency in prioritizing targets over distractors (parameter top–down control α). Parameter *w*<sub>λ</sub> is derived in report tasks that involve trials with stimuli in only one hemifield and trials with stimuli in both hemifields. Based on individual differences in report accuracy between bilateral and unilateral displays, the TVA model produces estimates of attentional weights *w*<sub>i</sub> separately for the left (*w*<sub>L</sub>) and the right hemifield (*w*<sub>R</sub>), and *w*<sub>λ</sub> is then computed as *w*<sub>L</sub>/(*w*<sub>L</sub> + *w*<sub>R</sub>). Hence, a value of *w*<sub>λ</sub> = 0.5 indicates balanced weighting; values of *w*<sub>λ</sub> > 0.5 indicate a leftward and values of *w*<sub>λ</sub> < 0.5 a rightward bias. A slight normal "pseudoneglect" is indicated by a value of *w*<sub>λ</sub> slightly above 0.5, because weights for objects to the left of fixation are slightly higher than those for objects to the right. If participants with DD indeed show a reduced or absent pseudoneglect, this would be indicated by significantly lower *w*<sub>λ</sub> values compared to control participants. Parameter α, representing efficiency of top–down attentional control, is estimated from the performance in partial-report tasks, where participants have to report only target objects, which are prespecified (e.g., with respect to color), whilst ignoring distractors. The parameter indicates the relative attentional weights of distractors compared to targets (*w*<sub>D</sub>/*w*<sub>T</sub>). Impaired top–down control in participants with DD would be indicated by higher α-values compared to control participants, indicating relatively more attentional weight allocated to distractors. In sum, estimating these selectivity parameters permits the exploration of the potential contribution of lateralized deficits or inefficiency of top–down control to impairments found in DD.
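The two selectivity indices are simple ratios of attentional weights, as the following short sketch shows. The function names and the example weights are hypothetical and chosen only to show how the values are read.

```python
def spatial_bias_index(w_left, w_right):
    """Laterality index: weight of the left hemifield relative to the summed weights."""
    return w_left / (w_left + w_right)


def top_down_control(w_distractor, w_target):
    """Alpha: attentional weight of a distractor relative to that of a target."""
    return w_distractor / w_target


# Hypothetical weights, for illustration only.
w_lambda = spatial_bias_index(w_left=0.55, w_right=0.45)
alpha = top_down_control(w_distractor=0.4, w_target=1.0)

print(f"w_lambda = {w_lambda:.2f} (0.5 is balanced; above 0.5 is a leftward bias)")
print(f"alpha = {alpha:.2f} (closer to 0 means more efficient selection of targets)")
```

An α near 1 would mean that distractors receive about as much weight as targets, i.e., no effective top–down selection.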

### **TVA-BASED STUDIES ON DD**

The first TVA-based study that aimed to disentangle the attention deficits underlying impaired multi-element processing was a case study on two children with DD (Dubois et al., 2010). These children with impaired performance in standardized whole- and partial-report tasks (Valdois et al., 2003, 2004) demonstrated reading and writing difficulties, characterized by a high number of reading and spelling errors and a strikingly reduced reading speed whereas their intellectual abilities were in the normal range. The parameters minimal effective exposure duration *t0*, visual processing speed *C*, VSTM storage capacity *K* and laterality of attentional weighting *w*<sup>λ</sup> were estimated by modeling whole-report accuracy for both cases and compared to an age-matched group of nine children with typical reading and writing abilities. This revealed a reduction in visual processing speed for both children with DD and, additionally, a reduced VSTM storage capacity for one of them. The two cases did not differ from controls in minimal effective exposure duration and the spatial distribution of attention.

Further insights into the potential contribution of these attentional parameters to dyslexic impairments at a later developmental stage have been provided by a group study on adults with DD. Stenneken et al. (2011) compared TVA parameter estimates in high-achieving young adults (mostly university students) with persisting DD to an age- and education-matched control group. With regard to general attentional resources, a profound impairment of visual processing speed *C* was found in the group of adults with DD compared to controls. Moreover, with regard to selectivity aspects, as assessed by a partial-report task, the distribution of spatial attentional weights was found to differ from that of controls. The group of normal readers showed the typical, slight leftward bias in spatial attentional weighting (i.e., pseudoneglect; Jewell and McCourt, 2000) that has been documented in unimpaired participants (for TVA-based studies, cf. Bublak et al., 2005; Finke et al., 2005; Habekost and Rostrup, 2006; Matthias et al., 2009, 2010). In contrast, participants with DD did not show this effect; interestingly, the more the spatial lateralization in these participants deviated from that of controls, the more severe was their dyslexia, as assessed by a standardized spelling test.

In order to further specify developmental aspects of DD that reconcile aspects of the previous single case study on children and the group study in adults, Bogon et al. (2014) conducted a TVA-based assessment in a group of children with DD and a group of typically developing children matched according to age, educational level, gender, and general intellectual ability. Group-wise comparisons revealed the general attentional processing resource parameters, visual processing speed and VSTM storage capacity, to be impaired in children with DD compared to controls. Moreover, in the group of children with DD, low VSTM storage capacity was significantly related to impaired reading performance. In contrast, the selectivity aspects of visual attention, spatial distribution of attentional weights and top–down control, were comparable to those of controls.

### **DISCUSSION**

Taken together, all TVA-based studies on DD indicate that a reduced perceptual processing speed is the most profound impairment at the core of DD (Dubois et al., 2010; Stenneken et al., 2011; Bogon et al., 2014). The parameter estimates assessed in the group studies are given in **Figure 2**. In both children with DD examined by Dubois et al. (2010), visual processing speed was severely reduced compared to controls while visual threshold *t0* was normal (a somewhat different paradigm was used, resulting in higher absolute values in *C* and *K* compared to **Figure 2**, but with a similar difference relative to controls). The speed reduction was replicated at group level, to a similar degree in children and adults with DD (Stenneken et al., 2011; Bogon et al., 2014). Again, these studies did not report changes in visual threshold. These findings indicate that when the rate of visual information uptake is abnormally slow, this can hinder the acquisition of normal reading skills. However, the central role of visual processing speed for reading performance seems to go beyond DD pathology: first, in a recent TVA-based study Lobier et al. (2013) showed that, in typically developing children, the individual speed of visual processing predicted that of text reading. Thus, visual processing speed seems to have a central functional role in both pathological and normal reading development. Second, a visual processing speed reduction was documented also in acquired reading disorders in brain-damaged patients with simultanagnosia (Duncan et al., 2003). Therefore, also when reading development is completed, a severe slowness of processing speed might reduce the rate of information uptake below the limit required for normal reading performance. How do reductions in the TVA parameter processing speed relate to the reading difficulties in DD? Two well-established findings are compatible with the notion of reduced processing speed.
One, the so-called "double deficit hypothesis" (Bowers and Wolf, 1993), is related to the results of Lobier et al. (2013). It describes a reduction in naming speed (for verbal or non-verbal material) in combination with a phonological deficit in DD. The second demonstrates a reading speed deficit in DD which is possibly based on slow decoding mechanisms, especially in regular orthographies (for discussion of underlying impairments, see Wimmer, 1993).

Compared to processing speed, findings on VSTM storage capacity are, at first glance, less consistent in the studies reviewed here. In the adult-group study, VSTM storage capacity was comparable between the group with DD and controls (Stenneken et al., 2011). In contrast, a marked reduction in VSTM was revealed in the group of children with DD (Bogon et al., 2014) and in one of the children with DD studied by Dubois et al. (2010). Evidently, low VSTM storage capacity does not represent a shared deficit in all persons with DD. At second glance, these inconsistencies might reflect an influence of academic achievement. The TVA-based group studies on children and on adults with DD differed concerning the academic levels of the participants. While the adults, despite persisting DD, had above-average academic achievement, the children were unselected with respect to their own or their parents' academic achievements. Therefore, one could speculate that, in persons suffering from DD, a normal VSTM storage capacity might facilitate the compensation of DD-induced academic deficits while low VSTM storage capacity might induce a higher probability for academic failure. In support of this assumption, higher VSTM storage capacity in the group of children with DD was related to better reading performance (Bogon et al., 2014).

With regard to selectivity aspects of visual attention, adults with DD differed from controls in spatial weighting (Stenneken et al., 2011). The DD group did not show the typical pseudoneglect bias to the left (Bowers and Heilman, 1980; Jewell and McCourt, 2000) but rather a balanced distribution of weights. Interestingly, in children, such balanced weighting was present in both the DD and the control group (Bogon et al., 2014). Thus, in normal participants, the TVA-based group studies (**Figure 2**), in line with findings from line bisection studies (Hausmann et al., 2003; Sireteanu et al., 2005), suggest that a leftward bias emerges or increases during development. In adults with DD, the absence of the spatial bias might be a primary deficit underlying DD. However, two lines of evidence suggest that it rather reflects a secondary consequence of reduced left-to-right reading experience in impaired readers. First, differences between adult left-to-right and right-to-left readers in line- and string-bisection performance imply that reading experience and cultural reading habits have an influence on pseudoneglect development (Chokron and Imbert, 1993; Chokron and de Agostini, 1995; Zivotofsky, 2004; Kazandjian et al., 2010). Second, the fact that a group of children showed the typical DD symptoms in the absence of spatial weighting differences from a control group (Bogon et al., 2014) indicates that the onset of DD precedes that of spatial attention changes.

Concerning the second selectivity aspect assessed, the studies reviewed above indicate that top–down control is not impaired in DD (Stenneken et al., 2011; Bogon et al., 2014; see **Figure 2**). Thus, the TVA-based analyses do not support the previously suggested assumption that multi-element processing deficits in DD result from an inability to prioritize relevant over irrelevant information (Sperling et al., 2005; Roach and Hogben, 2007; Moores et al., 2011; Stevens et al., 2013).

### **CONCLUSION**

We summarized the results of studies that used TVA-based assessment of visual attentional parameters to examine a potential relevance of deficient visual attentional processing in DD. Taken together, in children and adults with DD (Dubois et al., 2010; Stenneken et al., 2011; Bogon et al., 2014), a marked reduction in visual processing speed seems to be a core deficit. Furthermore, less consistently documented reductions in VSTM storage capacity may have a modulating effect on word processing performance and written language acquisition. In addition, reduced left-to-right reading skills and training in persons with DD might impair the development of a slight leftward attentional bias that is typically observed in unimpaired adult readers. It is unknown whether this absence of pseudoneglect contributes to the persistent reading deficits in adulthood or whether it is an epiphenomenon without functional significance. In sum, findings from all of the recent parameter-based studies of DD point to significant reductions in general information processing efficiency as underlying mechanisms for impaired multi-element processing in DD. Moreover, recent studies of visual attentional span tasks support a visual (rather than an exclusively verbal or phonological) nature of the underlying deficit (Lobier et al., 2012; Valdois et al., 2012). Thus, parameter-based assessment offers new directions in investigating impaired visual attentional functions that seem to constitute an additional or even isolated deficit of DD, as previously suggested in subgroup-accounts of reading disorders (Morris et al., 1998; Heim et al., 2008; Menghini et al., 2010).

#### **ACKNOWLEDGMENT**

This work was supported by the German Research Foundation (DFG) within the funding program Open Access Publishing.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 May 2014; accepted: 26 September 2014; published online: 16 October 2014.*

*Citation: Bogon J, Finke K and Stenneken P (2014) TVA-based assessment of visual attentional functions in developmental dyslexia. Front. Psychol. 5:1172. doi: 10.3389/fpsyg.2014.01172*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Bogon, Finke and Stenneken. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## TVA-based assessment of attentional capacities – associations with age and indices of brain white matter microstructure

### *Thomas Espeseth<sup>1,2</sup>\*, Signe A. Vangkilde<sup>3</sup>, Anders Petersen<sup>3</sup>, Mads Dyrholm<sup>3</sup> and Lars T. Westlye<sup>1,2</sup>*

<sup>1</sup> Department of Psychology, University of Oslo, Oslo, Norway

<sup>2</sup> Norwegian Centre for Mental Disorders Research (NORMENT), and KG Jebsen Centre for Psychosis Research, Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway

<sup>3</sup> Department of Psychology, Center for Visual Cognition, University of Copenhagen, Copenhagen, Denmark

#### *Edited by:*

Bernhard Hommel, Leiden University, Netherlands

#### *Reviewed by:*

Derrick L. Hassert, Trinity Christian College, USA Werner Schneider, Bielefeld University, Germany Christian H. Poth, Bielefeld University, Germany (in collaboration with Werner Schneider)

#### *\*Correspondence:*

Thomas Espeseth, Department of Psychology, University of Oslo, PO Box 1094, Blindern, Forskningsveien 3A, N-0317 Oslo, Norway e-mail: thomas.espeseth@psykologi.uio.no

In this study the primary aims were to characterize the effects of age on basic components of visual attention derived from assessments based on a theory of visual attention (TVA) in 325 healthy volunteers covering the adult lifespan (19–81 years). Furthermore, we aimed to investigate how age-related differences on TVA parameters are associated with white matter (WM) microstructure as indexed by diffusion tensor imaging (DTI). Finally, we explored how TVA parameter estimates were associated with complex, or multicomponent, indices of processing speed (Digit-symbol substitution, DSS) and fluid intelligence (gF). The results indicated that the TVA parameters for visual short-term memory capacity, K, and for attentional selectivity, α, were most strongly associated with age before the age of 50. However, in this age range, it was the parameter for processing speed, C, that was most clearly associated with DTI indices, in this case fractional anisotropy (FA), particularly in the genu and body of the corpus callosum. Furthermore, differences in the C parameter partially mediated differences in DSS within this age range. After the age of 50, the TVA parameter for the perceptual threshold, t<sub>0</sub>, as well as K, were most strongly related to participant age. Both parameters, but t<sub>0</sub> more strongly so than K, were associated with WM diffusivity, particularly in projection fibers such as the internal capsule, the sagittal stratum, and the corona radiata. Within this age range, t<sub>0</sub> partially mediated age-related differences in gF. The results are consistent with, and provide novel empirical support for, the neuroanatomical localization of TVA computations as outlined in the neuronal interpretation of TVA (NTVA). Furthermore, the results indicate that to understand the biological sources of age-related changes in processing speed and fluid cognition, it may be useful to employ methods that allow for computational fractionation of these multicomponent measures.

**Keywords: cognition, processing speed, fluid intelligence, diffusion tensor imaging, fractional anisotropy, diffusivity**

### **INTRODUCTION**

Cognitive abilities, such as perception, attention, memory, and language depend on the precise temporal coordination of a spatially distributed set of neural computations that are organized in brain networks of various sizes and complexities (McClelland et al., 1986; Mesulam, 1990). Efficient communication between distal brain regions is thought to rely on the integrity of the white matter (WM) tracts that connect them (Neubauer and Fink, 2009). Neurological disease that involves disruptions in the structural connectivity within and between brain networks leads to cognitive dysfunctions (Geschwind, 1965a,b; Catani and Ffytche, 2005), suggesting that the integrity of WM tracts may provide a neuroanatomical substrate for individual differences in cognitive abilities (Bressler and Menon, 2010). Consistent with this, variation in network structural connectivity has been linked to cognitive function in younger adults (Gold et al., 2007; Turken et al., 2008) and in age-heterogeneous samples (Kennedy and Raz, 2009; Kochunov et al., 2010; Penke et al., 2010; Madden et al., 2012; Salami et al., 2012). In normal aging, general fluid intelligence (*g*F) declines earlier and more rapidly than general crystallized intelligence (*g*C) (Salthouse, 2004; Craik and Bialystok, 2006). While *g*C is defined by tasks that measure the ability to apply acquired knowledge and learned skills, *g*F refers to the ability to reason in novel situations (Cattell, 1963; Carroll, 1993). Processing speed is a well-known and highly replicated correlate of *g*F that has been hypothesized to mediate the relation between the brain and *g*F (Jensen, 2006) and also to drive age-related cognitive decline (Salthouse, 1996). However, processing speed itself is a complex phenomenon, including perceptual, attentional, memory, and motor components, each of which may have differential relations with WM networks and their biological properties.

Diffusion tensor imaging (DTI) is a powerful tool for quantifying structural integrity of brain networks. DTI is sensitive to the direction and degree of water displacement in biological tissues (Beaulieu, 2002). Diffusion in brain parenchyma is restricted by cytoskeletal axonal elements including plasma membranes, microtubules and myelin sheaths (Beaulieu, 2002). Water displacement occurs faster along than across the axons, and DTI enables visualization and quantification of the local organization of WM pathways via measures of directionality and rate of molecular diffusion as a function of a tensor ellipsoid that is estimated for each voxel. The most commonly reported measure is fractional anisotropy (FA), which indexes the fraction of the restricted (or anisotropic) diffusion relative to the total diffusion within each voxel. The FA range is 0–1 and values closer to 1 indicate increased directionality of diffusion. The average rate of diffusion along all directions is indexed by mean diffusivity (MD), which is orthogonal to FA. MD can be further split into rate of diffusion along the primary (axial diffusivity, AD) and the secondary (radial diffusivity, RD) axes of the diffusion ellipsoid. Whereas the exact neurobiological correlates of the DTI indices are unknown and likely to be complex, animal studies have suggested that AD may be sensitive to axonal changes, whereas RD may be sensitive to myelin changes (Song et al., 2003, 2005; Sun et al., 2006, 2008). Studies investigating effects of age on DTI measures in cross-sectional designs have shown that after the age of about 40 years, FA is negatively correlated with age, whereas MD is positively correlated with age (Westlye et al., 2010), and this general trend has been confirmed in longitudinal studies (Barrick et al., 2010).

Information processing speed is a fundamental capacity across the adult life span that correlates with performance on many different kinds of cognitive tasks (Kail and Salthouse, 1994; Li et al., 2004). Higher *g*F has been shown to be associated with higher FA and lower MD, AD, and RD, and most of this association seems to be mediated by processing speed (Penke et al., 2012a; Haász et al., 2013). However, processing speed as typically measured in behavioral tasks is in itself a complex variable, the components of which may be mediated by distinct brain networks. For example, there is usually a motor component in estimates of processing speed which, when statistically controlled for, removes or significantly alters the pattern of association with DTI indices (Bennett et al., 2012). The influence of the motor component can be alleviated through mathematical modeling of the reaction time distribution (Ratcliff, 1978), or by the use of accuracy data from tasks with briefly presented stimuli, such as the Useful Field Of View (UFOV) task (Ball and Owsley, 1993), or the Inspection Time (IT) task (Deary and Stough, 1996). However, with briefly presented stimuli, a failure to accurately identify an object may be attributed to an elevated threshold of perception, to slower recognition processes, or to both. The UFOV and the IT tasks do not differentiate between these two factors.

Bundesen's (1990) Theory of Visual Attention (TVA) offers an approach to measuring processing speed that is independent of motor speed while dissociating perceptual threshold and processing speed. TVA can be seen as a mathematical formalization of the influential biased competition model of attention (Desimone and Duncan, 1995), according to which objects compete for cortical representation and limited processing resources, and the competition is biased by bottom-up and top-down factors. According to TVA, selection and recognition of stimuli in the visual field occur simultaneously during a process termed *visual categorization* (Bundesen, 1990). A visual categorization entails a decision as to whether "object *x* has feature *i*," or equivalently "object *x* belongs to category *i*." When such a categorization is made, object *x* is recognized as a member of category *i*, and is thereby selected. However, for the categorization to be completed, it needs to be encoded into a visual short-term memory (VSTM) store, which is of limited capacity. In TVA, visual attention is described as a parallel processing race in which stimuli compete for representation in VSTM. The competition is influenced by attentional weights and perceptual biases, with the effect that the probability of representation in VSTM is biased in favor of particular stimuli in the visual field. The allocation of weights and biases is specified in the *rate equation* and the *weight equation* of TVA. According to the *rate equation:*

$$v(x, i) = \eta(x, i)\,\beta_i\,\frac{w_x}{\sum_{z \in S} w_z},$$

the rate of processing *v* (*x*, *i*) (i.e., of the categorization "object *x* has feature *i*") is given by the product of, η(*x*, *i*), the strength of the sensory evidence that object *x* belongs to category *i*, β*i*, the perceptual bias associated with *i,* and the attentional weight of object *x*, *wx*, divided by the summed attentional weights, *wz*, of the set of elements in the visual field, *S*. The attentional weights are given by the *weight equation*:

$$w_x = \sum_{j \in R} \eta(x, j)\,\pi_j,$$

which specifies that the attentional weight of object *x*, *wx*, is given by the sum, across the set of perceptual categories, *R*, of the products of the sensory evidence that "object *x* belongs to category *j*," η(*x*, *j*), and the pertinence value associated with category *j,* π*j*. The pertinence of category *j* is a measure of the momentary importance of attending to objects that belong to category *j* (e.g., the importance of the category red when looking for red objects) (Bundesen and Habekost, 2008). Thus, the attentional weight of object *x* is the sum of all pertinence values, each weighted by the degree of sensory evidence that object *x* is a member of the corresponding category *j*.
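The rate and weight equations can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and all numerical values below are hypothetical choices made for illustration only; they are not taken from the study or from any TVA software.

```python
from typing import Dict, Tuple

Evidence = Dict[str, Dict[str, float]]  # eta[object][category]


def attentional_weights(eta: Evidence, pertinence: Dict[str, float]) -> Dict[str, float]:
    """Weight equation: w_x = sum over categories j of eta(x, j) * pi_j."""
    return {
        x: sum(eta_x.get(j, 0.0) * pi_j for j, pi_j in pertinence.items())
        for x, eta_x in eta.items()
    }


def processing_rates(
    eta: Evidence, beta: Dict[str, float], pertinence: Dict[str, float]
) -> Dict[Tuple[str, str], float]:
    """Rate equation: v(x, i) = eta(x, i) * beta_i * w_x / sum_z w_z."""
    w = attentional_weights(eta, pertinence)
    total = sum(w.values())
    if total <= 0:
        raise ValueError("summed attentional weights must be positive")
    return {
        (x, i): eta_xi * beta.get(i, 0.0) * w[x] / total
        for x, eta_x in eta.items()
        for i, eta_xi in eta_x.items()
    }


# Hypothetical display: one red target letter "A" and one blue distractor "B".
eta = {
    "target": {"red": 1.0, "blue": 0.0, "A": 10.0, "B": 0.0},
    "distractor": {"red": 0.0, "blue": 1.0, "A": 0.0, "B": 10.0},
}
pertinence = {"red": 1.0, "blue": 0.2}  # attend to red objects
beta = {"A": 1.0, "B": 1.0}             # bias toward reporting letter identities

weights = attentional_weights(eta, pertinence)
rates = processing_rates(eta, beta, pertinence)
```

In this toy display the distractor receives a smaller attentional weight than the target, so its letter categorization is processed at a proportionally lower rate.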

In the neural theory of visual attention (NTVA), Bundesen et al. (2005) offer a neurophysiological interpretation of TVA in which the total neural activation representing a visual categorization is proportional to the *number* of neurons representing the categorization, and the *activation level* of these. The number of neurons representing the categorization of object *x* is proportional to *wx* divided by the summed attentional weights of all objects in the visual field, whereas the activation level of these neurons is directly proportional to β*i*. The implementation of visual categorizations is characterized by two processing waves. Visual processing resources (i.e., neurons) are first distributed at random among objects in the visual field, and a parallel matching process unfolds between objects *x* in neuronal receptive fields and long-term memory representations *i*, resulting in sensory evidence values, η(*x*, *j*). Building on this evidence, attentional weights are computed for each object in the visual field, and are used for redistribution of cortical processing capacity across them. In particular, a priority map, representing the attentional weights, controls dynamic remapping of neuronal receptive fields to ensure that the number of cells allocated to a particular object becomes proportional to the attentional weight of the object. Subsequently, different categorizations compete for access to VSTM in a stochastic race process. The capacity of VSTM is limited to *K* elements, typically around 4. The first *K* visual objects to finish processing are stored in VSTM, are accessible to consciousness and can be reported. The remaining categorizations are lost.
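The race for VSTM can be sketched as a small simulation. This is a simplified illustration, not the authors' model code: it assumes exponentially distributed processing times (an assumption not stated in the text above), one categorization per object, and hypothetical parameter values.

```python
import random
from typing import Dict, List


def simulate_trial(
    rates: Dict[str, float],  # processing rate per object, in items per second
    K: int,                   # VSTM capacity
    exposure: float,          # exposure duration in seconds
    t0: float,                # perceptual threshold in seconds
    rng: random.Random,
) -> List[str]:
    """Return the objects encoded into VSTM on one simulated trial."""
    effective = exposure - t0
    if effective <= 0:
        return []  # exposure below threshold: nothing is processed
    finished = []
    for obj, v in rates.items():
        if v <= 0:
            continue
        t = rng.expovariate(v)  # assumed exponential finishing time
        if t <= effective:
            finished.append((t, obj))
    finished.sort()  # earliest finishers first
    return [obj for _, obj in finished[:K]]  # only the first K are stored


rng = random.Random(1)
# Hypothetical six-letter display with equal rates of 10 items/s per letter.
display_rates = {letter: 10.0 for letter in "ABDEFG"}
stored = simulate_trial(display_rates, K=4, exposure=0.2, t0=0.02, rng=rng)
```

Averaging the number of stored letters over many simulated trials at several exposure durations would trace out the kind of accuracy curve to which TVA parameters are fitted.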

TVA-based assessment is typically done by use of simple letter identification tasks such as *whole report* (Sperling, 1960), in which an array of letters is briefly presented and participants are asked to report back the identity of as many letters as they can, or *partial report* (Shibuya and Bundesen, 1988), where participants should report back the identity of only a subset of the letters, for example only those printed in a certain color. The dependent measure in these tasks is accuracy (i.e., number of correctly identified letters). When the rate and weight equations are fitted to accuracy data from a combined whole report and partial report task, values of five distinct mathematical parameters can be estimated for each individual participant (Duncan et al., 1999). TVA defines these as the storage capacity of visual short-term memory, *K*; the perception threshold, *t*0; visual processing speed, *C*; visual distractibility or selectivity, α; and the relative attentional weight of each visual object, *w*. The *w* parameter can be used to compute the relative balance between attentional weights in the left and right visual fields, *w*index. Parameter estimates are strongly dependent on task characteristics (e.g., stimulus types and contrasts), but have proved to be valid and reliable measures, with split-half reliabilities around 90% for all parameters, and test-retest reliabilities ranging from around 60% (*t*0, *C*, α), to around 90% (*K* and *w*index) (Habekost et al., 2014).

Two studies on effects of age on TVA parameters have been published to date (McAvinue et al., 2012; Habekost et al., 2013). McAvinue and colleagues used the *CombiTVA* task (Vangkilde et al., 2011) and studied 113 individuals aged 12–75 years (83 participants in the 20–75 age range). They reported linear functional decline with age for the parameters *K*, *t*0, *C*, and α, with the steepest slopes for *C* and *K*, and weaker, but significant effects for *t*0 and α. Although the regression analyses indicated that all age effects were linear, closer inspection of the locally weighted scatterplot smoothing (LOESS; Cleveland, 1979) fits showed that *t*0 and α might have more complex relationships with age; *t*0 appeared to be relatively stable from the teens to the late 50s, but to increase thereafter (from about 17 to about 23 ms). In contrast, α increased from the teens until the age of about 50, but appeared to be relatively stable thereafter. Habekost et al. (2013) studied an older age cohort (*n* = 33, age range 69–87) in two experiments. The first experiment aimed to disentangle perceptual speed from perceptual threshold and entailed a whole report procedure in which single letters were presented at fixation at different exposure times. The demands in this task are somewhat different from those of the *CombiTVA* and the nominal value of the parameter estimates cannot be directly compared to those of McAvinue and colleagues. However, this study also reported particularly strong age trends for processing speed, *C* (∼50% decrease, *R*<sup>2</sup> ∼23%), as compared to the perceptual threshold, *t*0 (*R*<sup>2</sup> ∼9%). Interestingly, they also reported a negative correlation between *C* and the Fazekas rating of white matter hyperintensities (Fazekas et al., 1987), but this correlation did not withstand controlling for participant age. The *t*0 results showed that a small group of the oldest participants had elevated parameter estimates, and the authors interpreted this finding to indicate potential pathological decline in this subgroup, in line with previous research reporting that individuals with mild cognitive impairment have significantly higher perceptual thresholds than age-matched healthy controls (Bublak et al., 2011). The second experiment assessed the participants' visual span with a whole report task in which five letters were presented peripherally for a fixed exposure time of 200 ms. Visual span was negatively correlated with age, but the association was weaker than for processing speed (*R*<sup>2</sup> ∼9%).

In summary, fluid intelligence declines rapidly in normal aging, and it is believed that this is partly driven by changes in brain structural connectivity. Furthermore, the association between connectivity and fluid intelligence may be mediated by changes in processing speed. However, processing speed consists of several subcomponents, each of which may be subserved by distinct brain networks and biological mechanisms. Thus, in the present study we aim to investigate these issues by (1) describing the effects of age on specific components of processing capacities as defined by TVA and on computationally more complex measures of psychometric processing speed and fluid intelligence, (2) analyzing the effects of age on different DTI indices of WM tract integrity, and (3) exploring the associations between the behavioral measures and DTI indices, including an analysis of potential mediational effects of DTI indices on age-related TVA parameter differences, and of potential mediational effects of TVA parameters on the relation between DTI indices and *DSS* and *g*F. Finally, (4) we explore the regional specificity of these brain-behavior correlations.

### **MATERIALS AND METHODS**

#### **SAMPLE RECRUITMENT AND DEMOGRAPHICS**

Participants were drawn from the Norwegian Cognitive NeuroGenetics (NCNG) sample, which was recruited by advertisements in a local newspaper to take part in a larger community-based study on the genetics of cognition (see Espeseth et al., 2012, for an overview). All participants read an information sheet and signed a statement of informed consent approved by the Regional Committee for Medical and Health Research Ethics (South-East Norway) (Project ID: S-03116). Permission to obtain and store blood samples for genotyping, as well as cognitive and MRI data in a biobank, and to establish a registry with relevant information, was given by the Norwegian Department of Health. The research was carried out in compliance with the Helsinki Declaration. All participants were native speakers of Norwegian. All subjects were interviewed and screened for neurological or psychiatric diseases known to affect the central nervous system, and history of substance abuse. Any person with a history of treatment for any of the above was excluded from further participation. The participants were administered the Vocabulary and Matrix reasoning (MR) subscales of the Wechsler Abbreviated Scale of Intelligence (*WASI*, Wechsler, 2007) to estimate general cognitive abilities (IQ). The behavioral sample consisted of 325 persons (219 females) in the age range 19–81 (Mean = 50.2, *SD* = 17.0). Participants included in the study had on average 14.6 years of education (*SD* = 2.3, range = 9–22) and performed within an estimated full scale IQ range of 88–148 (Mean = 121, *SD* = 10.2). Mini-Mental State Examination (MMSE) data were available for 209 individuals aged > 40 (Mean = 29.1, *SD* = 0.9, range = 26–30). Concurrent TVA and DTI data were available for 229 participants. This subsample consisted of 149 females. Demographic values were similar to those from the total sample. All images were checked for signs of pathology by a certified neuroradiologist.

#### **BEHAVIORAL TASKS**

#### *Neuropsychological test battery and estimation of gF*

All participants completed a battery of neuropsychological tests in addition to the *WASI*, all tests in the official Norwegian language version. This battery included measures of episodic memory (California Verbal Learning Test, second version, *CVLT-II*) (Delis et al., 2004) and measures of processing speed. Processing speed was derived from performance on the Color–Word Interference Test (*CWIT*) (Delis et al., 2005), from an experimental visuospatial attention task involving letter discrimination with location cues of varying validity (*CDT*) (Espeseth et al., 2006), and from the Digit-Symbol Substitution (*DSS*) test (Wechsler, 1981). To construct *g*F factor scores, hierarchical principal component analyses (PCAs) were performed (see Davies et al., 2011). Data from *WASI MR*, *CVLT-II*, *CWIT*, and *CDT* were used as input to the PCA. Raw scores from *CWIT* and *CDT* were inverted to obtain the same ordinal direction as the two other test scores (i.e., higher score = better function). Initially, two separate first-order PCAs were run on *CVLT-II* and inverted *CWIT* scores. Thereafter, a second-order PCA was run on the first, un-rotated component factor scores obtained from the *CVLT-II* and *CWIT* subtests, raw MR scores and the inverted *CDT* scores. From this, the first un-rotated principal component was extracted and used to represent *g*F. Component scores such as *g*F are known to be highly reliable, even when derived from different batteries of cognitive tests (Johnson et al., 2004, 2008). Although no direct comparison between different *g*F estimations, or test-retest reliability estimate, has been conducted in the current sample, the associations of *g*F with age, genetics, and DTI indices have been found to be comparable to such associations in other samples (Davies et al., 2011; Haász et al., 2013; Christoforou et al., 2014).
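The two-stage PCA described above might be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' analysis script: it assumes that "inverting" a score means negating it, that each PCA is run on standardized variables (i.e., on the correlation matrix), and that the arbitrary sign of a component is oriented so that higher scores mean better function. All function and argument names are hypothetical.

```python
import numpy as np


def first_pc_scores(X) -> np.ndarray:
    """Scores on the first un-rotated principal component of standardized columns."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    loadings = vt[0]
    if loadings.sum() < 0:  # component sign is arbitrary; orient it positively
        loadings = -loadings
    return Z @ loadings


def gf_scores(cvlt, cwit_raw, mr, cdt_raw) -> np.ndarray:
    """Second-order component from CVLT-II, inverted CWIT, MR, and inverted CDT.

    cvlt, cwit_raw: (n_subjects, n_subtests) arrays; mr, cdt_raw: (n_subjects,).
    """
    cwit_inv = -np.asarray(cwit_raw, dtype=float)  # assumed inversion: negate
    cdt_inv = -np.asarray(cdt_raw, dtype=float)
    pc_cvlt = first_pc_scores(cvlt)      # first-order PCA on CVLT-II subtests
    pc_cwit = first_pc_scores(cwit_inv)  # first-order PCA on inverted CWIT subtests
    second_order = np.column_stack([pc_cvlt, pc_cwit, np.asarray(mr, dtype=float), cdt_inv])
    return first_pc_scores(second_order)
```

Because component signs are arbitrary, the orientation step matters in practice: without it, a higher score could equally well denote poorer function.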

#### *TVA-based assessment: design and procedure*

The *CombiTVA* paradigm, which was employed in the present study, is described in detail in Vangkilde et al. (2011). Briefly, the *CombiTVA* test took 45 min to complete and comprised 24 practice trials and nine experimental blocks of 36 trials. All trials followed the same basic design outlined in **Figure 1**.

A trial was initiated by a red fixation cross in the middle of a black screen, and then a 100-ms blank screen, before the stimulus display was presented on an imaginary circle (*r* = 7.5 degrees of visual angle) around the fixation cross with six possible stimulus locations. After a variable stimulus duration the letter display was terminated by a 500-ms masking display containing six masks made from red and blue letter fragments. Then the screen turned black, and the participant could type in the letters that he or she had seen. In whole report trials either two or six red target letters were presented, while partial report trials featured two red target letters and four blue distractor letters. Displays with six target letters were shown for each of six stimulus durations (10, 20, 50, 80, 140, or 200 ms) while all other displays were shown for 80 ms. All trial types were intermixed and the stimuli in a given trial were chosen randomly without replacement from a set of 20 capital letters [ABDEFGHJKLMNOPRSTVXZ] written in the font Arial (broad) with a letter point size of 68 corresponding to 2.7 by 2.3 degrees of visual angle. The individual, multi-colored masks were 100 by 100 pixels to completely cover the letters. Participants were instructed to make an unspeeded report of all red letters they were "fairly certain" of having seen, that is, to use all available information but refrain from pure guessing. Participants were informed of the accuracy of their reports (the probability that a reported letter was correct) after each block and they were encouraged to report as many red letters as possible but keep their reports within a specified accuracy range of 80–90% correct. The stimulus displays were presented on a 21-inch EIZO CRT monitor at 100 Hz using the E-prime 1.1 software. All tests were run in a windowless room with standard lighting conditions, with participants seated in front of a monitor with a viewing distance of approximately 60 cm.

#### *Estimation of TVA parameters*

The number of correctly reported letters in each trial constituted the main dependent variable in the *CombiTVA* task. The parameter estimates can be extracted from the behavioral data by a maximum likelihood fitting procedure that is implemented in the publicly available Matlab toolbox *LIBTVA* (Kyllingsbæk, 2006; Dyrholm et al., 2011), and the mathematical interpretations of each of the TVA parameters are described in detail in the relevant publications. Briefly, assuming a particular set of parameters, one can calculate the probability for any possible outcome of a whole and partial report trial, and thereby, for a given individual, one can estimate the set of parameter values that maximizes the probability of observing a given set of data. Storage capacity of visual short-term memory, *K*, was assumed to vary on a trial-by-trial basis and, thus, *K* for a particular trial was drawn from a probability distribution consisting of five free parameters (i.e., the probabilities that *K* = 1, 2, ..., 5), where these probabilities summed to a number between 0 and 1, and the remaining probability up to a value of 1 was accounted for by the probability that *K* = 6. Thus, the *K* value reported here is the expected *K* given a particular probability distribution. The perceptual threshold, *t*0, was assumed to be drawn trial-by-trial from a normal probability distribution with two free parameters (mean and *SD*). The total visual processing speed, *C*, is characterized by only one free parameter (i.e., a constant) defined as the sum of all *v* values across all perceptual categorizations of all elements in the visual field:

$$C = \sum_{x \in S} \sum_{i \in R} v(x, i).$$

Although participants were instructed to distribute attention uniformly among the possible targets in the stimulus display, they might behave differently; so attentional weights (*w* values) were estimated individually for targets at each of the six stimulus locations (5 free parameters, as the sum of the 6 attentional weights was fixed at a value of 1). Furthermore, assuming that the attentional weight of every target is equal to the weight of every other target at a given spatial location, and that the attentional weight of every distractor is equal to the weight of every other distractor at the same spatial location, the visual selectivity parameter, α (one free parameter), can be estimated by the ratio of the distractor and target attentional weights:

$$\alpha = w_{\text{distractor}} / w_{\text{target}}$$

In sum, the specific model used for the fitting procedure had a total of 14 free parameters.
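The parameter bookkeeping above can be made concrete with a short sketch. It tallies the free parameters listed in the text and computes the expected *K* from the five free probabilities; the function name and the example probabilities are hypothetical and are not output from *LIBTVA*.

```python
from typing import Sequence

# Free parameters as enumerated in the text.
FREE_PARAMETERS = {
    "K distribution (P(K=1) ... P(K=5))": 5,
    "t0 (mean and SD)": 2,
    "C": 1,
    "w (six locations, sum fixed at 1)": 5,
    "alpha": 1,
}


def expected_k(p_k1_to_k5: Sequence[float]) -> float:
    """Expected VSTM capacity; P(K = 6) takes up the remaining probability."""
    if len(p_k1_to_k5) != 5:
        raise ValueError("exactly five probabilities (K = 1..5) are required")
    total = sum(p_k1_to_k5)
    if any(p < 0 for p in p_k1_to_k5) or total > 1 + 1e-12:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    probs = list(p_k1_to_k5) + [max(0.0, 1.0 - total)]
    return sum(k * p for k, p in enumerate(probs, start=1))


# Hypothetical distribution concentrated on K = 3 and K = 4.
example_k = expected_k([0.0, 0.1, 0.4, 0.3, 0.1])
n_free = sum(FREE_PARAMETERS.values())
```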

#### *TVA inclusion criteria*

We set the following inclusion criteria: *t*0 estimation error < 30, *C* estimation error < 80, and relative attentional weight < 0.8 on all six positions in the display. Of 347 data sets, 325 satisfied these criteria. The 22 excluded participants typically failed to follow instructions, for example by focusing primarily on only one of the target positions instead of the fixation cross, with the result that the attentional weight for that position would be > 0.8 and the overall *K* estimate would be close to 1. These individuals covered most of the age range, but were predominantly in their 60s and 70s (Mean age = 63, Median age = 68).
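The inclusion rule amounts to a simple filter over the fitted data sets, sketched below. The `TvaFit` container and its field names are hypothetical, and the units of the estimation errors are left as reported in the text, which does not specify them.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TvaFit:
    """Hypothetical container for one participant's fit diagnostics."""
    t0_error: float            # t0 estimation error
    c_error: float             # C estimation error
    weights: Sequence[float]   # relative attentional weights, six positions


def meets_inclusion_criteria(fit: TvaFit) -> bool:
    """True if all three criteria from the text are satisfied."""
    return (
        fit.t0_error < 30
        and fit.c_error < 80
        and all(w < 0.8 for w in fit.weights)
    )


# Hypothetical examples: a balanced fit, and one dominated by a single position.
balanced = TvaFit(t0_error=5.0, c_error=12.0, weights=[1 / 6] * 6)
one_sided = TvaFit(t0_error=5.0, c_error=12.0, weights=[0.85, 0.03, 0.03, 0.03, 0.03, 0.03])
```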

#### **MR ACQUISITION AND ANALYSIS—DIFFUSION TENSOR IMAGING**

Data acquisition and processing followed a previously described scheme (Westlye et al., 2012b). Imaging was performed with a 12-channel head coil on a 1.5-T Siemens Avanto scanner (Siemens Medical Solutions, Erlangen, Germany) at Oslo University Hospital, Rikshospitalet. For diffusion weighted imaging a single-shot twice-refocused spin-echo echo planar imaging sequence with the following parameters was used: repetition time (TR)/echo time (TE) = 8590/87 ms, *b*-value = 1000 s/mm<sup>2</sup>, voxel size = 2.0 × 2.0 × 2.0 mm, and 64 axial slices. The sequence was repeated twice with 10 *b* = 0 and 60 diffusion-weighted volumes per run.

DTI datasets were processed using the FMRIB Software Library (FSL) (Smith et al., 2004). Each volume was affine registered to the first *b* = 0 volume using FMRIB's linear image registration tool (FLIRT) (Jenkinson et al., 2002) to correct for motion and eddy currents. After removal of non-brain tissue, FA (Basser and Pierpaoli, 1996), eigenvector, and eigenvalue maps were computed by linearly fitting a diffusion tensor to the data. We defined RD as the mean of the second and third eigenvalues [(λ<sub>2</sub> + λ<sub>3</sub>)/2] and MD as the mean of all three eigenvalues. FA volumes were skeletonized and transformed into a common space as employed in Tract Based Spatial Statistics (TBSS) (Smith et al., 2006). All volumes were non-linearly warped to the FMRIB58\_FA template by use of local deformation procedures performed by FMRIB's non-linear image registration tool (FNIRT). Next, a mean FA volume of all subjects was generated and thinned to create a mean FA skeleton representing the centers of all common tracts. We thresholded and binarized the mean skeleton at FA > 0.2 to reduce the likelihood of partial voluming in the borders between tissue classes, yielding a mask of 120,770 voxels. Individual FA values were projected onto this mean skeleton mask by searching perpendicular to the skeleton for maximum FA values. Using maximum FA values from the centers of the tracts further minimizes confounding effects due to partial voluming (Smith et al., 2006). Similar warping and analyses were employed on MD, AD, and RD data, yielding skeletons sampled from voxels with FA > 0.20. DTI data were on average acquired 1.1 years (*SD* = 0.28) after TVA-based assessment was performed.
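For a single voxel, the four scalar indices follow directly from the three tensor eigenvalues. The sketch below uses the standard definition of FA together with the RD and MD definitions given above and AD as the primary eigenvalue; it is an illustration of the formulas, not the FSL implementation, and the example eigenvalues are hypothetical.

```python
import math
from typing import Dict


def dti_scalars(l1: float, l2: float, l3: float) -> Dict[str, float]:
    """FA, MD, AD, and RD from the eigenvalues of a diffusion tensor."""
    l1, l2, l3 = sorted((l1, l2, l3), reverse=True)  # l1 is the primary eigenvalue
    md = (l1 + l2 + l3) / 3.0   # mean of all three eigenvalues
    ad = l1                     # diffusivity along the primary axis
    rd = (l2 + l3) / 2.0        # mean of the second and third eigenvalues
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = math.sqrt(0.5 * num / den) if den > 0 else 0.0
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}


# Hypothetical eigenvalues (mm^2/s) for a strongly anisotropic voxel.
voxel = dti_scalars(1.7e-3, 0.3e-3, 0.2e-3)
```

Note that MD equals (AD + 2 × RD)/3 by construction, which is one reason the three diffusivity measures tend to be highly intercorrelated.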

Mean DTI values were calculated within defined anatomical white matter regions and pathways by masking the TBSS skeletons with regions of interest (ROIs) based on digitalized probabilistic white matter and tractography atlases (Mori et al., 2005; Wakana et al., 2007; Hua et al., 2008). The ROIs included the body of corpus callosum (BCC), cingulum (CGC), corona radiata (CR), corticospinal tract (CST), external capsule (EC), fornix (FX), genu of corpus callosum (GCC), internal capsule (IC), inferior longitudinal fasciculus (ILF), posterior thalamic radiation (PTR), splenium of corpus callosum (SCC), superior fronto-occipital fasciculus (SFO), superior longitudinal fasciculus (SLF), sagittal stratum (SS), and the uncinate fasciculus (UNC).

#### **STATISTICAL ANALYSIS**

Relations between behavioral measures were analyzed with bivariate correlation analysis. To assess stability of the correlations across the adult life span (∼20–80), the sample was split into two age groups with similar age spans (<50, ≥50). To describe age-related effects on means and standard deviations for each of the behavioral measures, the sample was split into six age groups of equal size (*n* = 54). For comparison across measures, all measures were transformed to standard scores and compared across the six age groups. Age effects on each of the TVA parameters were further analyzed with independent sequential linear regression models in which the TVA parameters were dependent variables, and age (first) and age<sup>2</sup> (second) were entered as independent variables. The age-relation was considered to be quadratic if the *R*<sup>2</sup> change was significant, and the age-relation was otherwise considered to be linear. The same set of analyses was repeated for the two age groups (<50, ≥50). We compared slopes across age groups by running linear regression analyses including the interaction between age and age group as independent variable. Associations between DTI indices (whole brain averaged values) and behavioral data were analyzed with partial correlation analyses, including first sex and time elapsed between DTI and TVA acquisition as covariates, and subsequently age and age<sup>2</sup>. The regional specificity was assessed in a similar way. Mediation analyses were performed based on the criteria specified by Baron and Kenny (1986), as explained in the Results Section.
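The sequential regression step (age first, then age<sup>2</sup>) can be sketched with ordinary least squares. This is a minimal illustration rather than the authors' statistical code: the function names are hypothetical, age is mean-centered before squaring (an assumption that leaves *R*<sup>2</sup> unchanged), and only the *F* statistic for the *R*<sup>2</sup> change is computed, with 1 and *n* − 3 degrees of freedom.

```python
import numpy as np


def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """R^2 of an ordinary least squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def quadratic_age_test(age, y):
    """Return (R^2 linear, R^2 quadratic, F for the R^2 change)."""
    age = np.asarray(age, dtype=float)
    y = np.asarray(y, dtype=float)
    age_c = age - age.mean()  # centering reduces collinearity with age^2
    r2_lin = r_squared(age_c[:, None], y)
    r2_quad = r_squared(np.column_stack([age_c, age_c ** 2]), y)
    n = len(y)
    f_change = (r2_quad - r2_lin) / ((1.0 - r2_quad) / (n - 3))
    return r2_lin, r2_quad, f_change
```

The resulting *F* value would then be compared against the *F* distribution with (1, *n* − 3) degrees of freedom to decide whether the quadratic term adds significantly to the linear model.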

### **RESULTS**

#### **BEHAVIORAL DATA AND THE EFFECTS OF AGE**

After applying inclusion criteria TVA parameters were close to normally distributed with almost all values within ±4 *SDs* when normalized across the whole sample. **Figure 2** shows that there was a small number of remaining high *t*0, *C*, and α scores.

Results from a bivariate correlation analysis are shown in **Table 1**. *C* and *t*<sup>0</sup> were uncorrelated whereas there was a strong correlation between *C* and *K*. The weight index and α were not strongly correlated with any of the other parameters. *DSS* and *g*F were strongly correlated with each other and had virtually identical associations with the TVA parameters. The pattern of correlation within TVA was generally stable across the age range in the present sample when compared between two age cohorts spanning approximately 30 years each. However, TVA parameter correlations with *DSS* and *g*F varied with age group. Notably, *K* and *C* correlations with *DSS* were substantially weaker in the old group, as was the α correlation with *g*F. In contrast, the *t*0-*g*F correlation was clearly stronger in the old group than in the young group.

The total sample spans about six decades of adult life and **Table 2** shows descriptive statistics for the behavioral data across the entire age range split up in six equally sized age groups covering about one decade each.

Of note, estimated IQ was more than one standard deviation above the age-adjusted population mean for all age groups. The highest IQs were estimated for the two oldest groups. To compare age effects across measures we transformed all TVA parameters and the *DSS* score to standardized scores across all subjects and plotted these for the six age groups (see **Figure 3**).

As evident from **Figure 3**, *K*, the capacity of visual short-term memory, and *C*, the speed of visual processing, seem to decline linearly across the age range, although not as steeply as *DSS* and *g*F. *t*0, the perceptual threshold, and α, the parameter for attentional selectivity, have more complex age trajectories; the α parameter increases in the early part of the age range and is relatively stable thereafter, whereas *t*0 displays an opposite pattern of age effects with early stability and late increase. The weight index seems to be unrelated to age.

**Table 1 | Bivariate correlations between TVA parameters, *DSS*, and *g*F, for the full sample, and split into age groups (<50, ≥50).** \**p* < 0.05, \*\**p* < 0.01.

**Figure 4** shows scatter plots of the TVA parameter data with LOESS fits. The magnitude of the age effect for each measure, and whether the age trends were best characterized by linear or quadratic fits, were assessed with sequential linear regression tests. As evident from **Table 3A**, all parameters were significantly associated with age, but only *t*0 and α had any increase in explained variance for the quadratic model as evidenced by the *R*<sup>2</sup> change in a sequential regression model (*p* = 0.001 and *p* = 0.008, respectively). When the same set of analyses was repeated for the two age groups (<50, ≥50), all age trends were linear (i.e., nonsignificant *R*<sup>2</sup> change, see **Table 3B**). For the young group, the parameters most strongly associated with age were *K* and α. For the old group, *K*, and particularly *t*0, were most strongly related to age, the latter with an *R*<sup>2</sup> value that went beyond the *R*<sup>2</sup> of *DSS* and *g*F. To test whether age slopes were different across age groups, we ran independent linear regression analyses for each of the TVA parameters, including age group, age, and their interaction (i.e., age group × age) as independent variables, and these analyses revealed that the interaction term was significant for *t*0, *t* = 3.15, *p* = 0.002, and for α, *t* = −2.23, *p* = 0.027, but not for *K* (*p* = 0.56), and *C* (*p* = 0.86).

#### **DTI INDICES AND THE EFFECTS OF AGE**

For the DTI indices, there were strong age effects on all parameters, relatively weaker for FA than for the diffusivity measures, which were almost perfectly correlated with each other (all *r*'s ≥ 0.99), even after controlling for age. **Figure 5** displays scatter plots with LOESS fits for each of the DTI parameters. Age effects were non-linear, with steeper age-related effects from around mid-life. As for the behavioral data, age trends were approximately linear when the sample was split at age 50 (data not shown), all significant with the exception of FA in the young group (*p* = 0.08).

Results from linear regressions assessing age effects on DTI values averaged across the WM skeleton and 15 separate ROIs are displayed in **Table 4**.





R<sup>2</sup> change refers to increase in R<sup>2</sup> when a quadratic term is added to the model. F- and p-values presented for the linear model except in case of t<sub>0</sub> and α, for which the R<sup>2</sup> change was significantly improved after adding the quadratic age term. \*p ≤ 0.01, \*\*p ≤ 0.001, \*\*\*p ≤ 0.0001.


R<sup>2</sup> change refers to increase in R<sup>2</sup> when a quadratic term is added to the model. F- and p-values presented for the linear model. Age ranges = 19–49, n = 147, and 50–81, n = 178. \*p ≤ 0.01, \*\*p ≤ 0.001, \*\*\*p ≤ 0.0001.

For FA, age effects were most pronounced in the corpus callosum, the corona radiata, and the fornix. For the diffusivity measures the effects of age were on average larger, and the ROIs with the most pronounced effects were the corpus callosum, the cingulum, the corona radiata, the superior longitudinal fasciculus, and the sagittal stratum.

#### **ASSOCIATIONS BETWEEN BEHAVIORAL DATA AND DTI INDICES**

### *Associations between behavioral data and DTI indices, and effects of controlling for age*

The associations between behavioral measures (TVA parameters, *DSS*, *g*F) and DTI parameters (FA and MD averaged across the skeleton) were assessed with partial correlation analyses, with the effects of sex and of the time elapsed between TVA and DTI acquisition partialled out. RD and AD were not included since these correlated almost perfectly with MD. For the complete sample of 229 participants who had concurrent TVA and DTI data, all behavioral measures except α correlated significantly with both average FA and average MD, with *r*'s ranging from 0.17 (*K*-FA) to −0.53 (*g*F-MD). After controlling for age, no correlation with FA remained significant. However, the correlations between *t*<sub>0</sub> and MD, and between *g*F and MD, were still significant (*r* = 0.18, *p* < 0.01, *df* = 226 for *t*<sub>0</sub>; *r* = −0.18, *p* < 0.01, *df* = 222 for *g*F). Including the quadratic age term as an additional confounder further reduced the *t*<sub>0</sub>-MD correlation (*r* = 0.13, *p* = 0.05, *df* = 225), but somewhat strengthened the *g*F-MD correlation (*r* = −0.20, *p* < 0.01, *df* = 221), and also made the *DSS*-MD correlation nominally significant (*r* = −0.15, *p* < 0.05, *df* = 196).
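As an illustration of how partialling out a covariate such as age can shrink a behavior-DTI correlation, the following minimal sketch applies the textbook first-order partial correlation formula. The input correlations are hypothetical and are not taken from the present data; controlling for several covariates at once (e.g., sex, elapsed time, age, and age<sup>2</sup>) would require applying the formula recursively or correlating regression residuals.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y, controlling for z."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical zero-order correlations (illustrative only):
r_t0_md = 0.40   # behavioral measure with average MD
r_t0_age = 0.45  # behavioral measure with age
r_md_age = 0.70  # average MD with age

r_partial = partial_corr(r_t0_md, r_t0_age, r_md_age)
print(f"zero-order r = {r_t0_md:.2f}, partial r (age removed) = {r_partial:.2f}")
```

When both variables share a strong dependence on age, as in this hypothetical case, the partial correlation is much smaller than the zero-order correlation.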

Given the finding of a mixture of linear and non-linear age effects, and that most effects were linear within the age groups defined by a split at age 50, we reran the partial correlation analyses for each group separately. The results are displayed in **Table 5**.

As can be seen in the table, *C*, *DSS*, and *g*F were correlated with FA, and *DSS* and *g*F with MD for the young participants, but only the *C*-FA and *DSS*-FA correlations remained when controlling for age and age<sup>2</sup> (*r* = 0.29, *p* = 0.005 for *C*-FA; *r* = 0.32, *p* = 0.008 for *DSS*-FA). For the old group, *K*, *t*<sub>0</sub>, *DSS*, and *g*F correlated significantly with MD, but the *K*-MD correlation did not survive controlling for age and age<sup>2</sup>. The remaining correlations were reduced but still significant (*r* = 0.20, *p* = 0.019 for *t*<sub>0</sub>-MD, *r* = −0.20, *p* = 0.022 for *DSS*-MD, and *r* = −0.27, *p* = 0.002 for *g*F-MD). There were no significant correlations with FA in the old group.

### *Are effects of age on t<sub>0</sub> mediated by differences in WM diffusivity?*

In order to test whether WM diffusivity mediates the correlation between *t*<sub>0</sub> and age in the old group, we assessed the patterns of correlations according to Baron and Kenny's (1986) four criteria for mediation. These state that significant associations between (1) the independent variable (i.e., age) and the dependent variable (i.e., *t*<sub>0</sub>), and (2) between the independent variable and the mediator (i.e., average MD) must be established. In the old group, these criteria were satisfied since age was significantly correlated with both *t*<sub>0</sub> and MD. Furthermore, it needs to be established that (3) the mediator is associated with the dependent variable after controlling for the independent variable. This criterion was also satisfied since the *t*<sub>0</sub>-MD correlation was significant even after controlling for age and age<sup>2</sup>. Finally, it needs to be shown that (4) the association between the independent variable and the dependent variable is weakened when controlling for the potential mediator. To test this, we performed partial correlation analyses with age and *t*<sub>0</sub>, controlling first for sex and differences in elapsed time between TVA and DTI acquisition. The zero-order correlation was *r* = 0.33 (*r*<sup>2</sup> = 11.0%), whereas the partial correlation was *r* = 0.30 (*r*<sup>2</sup> = 9.0%). Controlling also for average MD further reduced the partial correlation to *r* = 0.12 (*r*<sup>2</sup> = 1.4%). Thus, for participants aged 50 years and above, controlling for average MD reduced the variance in *t*<sub>0</sub> explained by age by 84.4%.
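The reduction reported for criterion 4 follows from the rounded *r*<sup>2</sup> values given above. Written out as a worked equation (the notation is ours, introduced only for illustration, with the vertical bar listing the variables that were partialled out):

$$
\text{reduction}
= \frac{r^2_{\text{age},\,t_0 \mid \text{sex, time}} - r^2_{\text{age},\,t_0 \mid \text{sex, time, MD}}}{r^2_{\text{age},\,t_0 \mid \text{sex, time}}}
= \frac{9.0\% - 1.4\%}{9.0\%}
\approx 84.4\%
$$

Starting instead from the two-decimal correlations (0.30 and 0.12) gives 1 − 0.0144/0.0900 = 84.0%, so the exact figure depends slightly on where rounding is applied.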

### *Are associations of DTI indices on multicomponent measures mediated by TVA parameters?*

To test whether the relations between DTI parameters, processing speed, and fluid intelligence were mediated by TVA parameters, we assessed the correlations between them according to Baron and Kenny's (1986) four criteria for mediation.

In the young group, we examined whether the correlation between FA and *DSS* was mediated by *C*. FA was correlated with *DSS* and *C*, also after controlling for age and age<sup>2</sup> (criteria 1 and 2). The correlation between *C* and *DSS* was significant after controlling for age, age<sup>2</sup>, and FA (*r* = 0.44, *p* < 0.0005) (criterion 3). To test whether the association between the independent variable (i.e., FA) and the dependent variable (i.e., *DSS*) was weakened when controlling for the potential mediator (i.e., *C*) (criterion 4), we performed partial correlation analyses with FA and *DSS*, controlling for sex, difference in elapsed time between TVA and DTI acquisition, and *C*. The FA correlation with *DSS* was reduced from *r* = 0.35 (*r*<sup>2</sup> = 12%) to *r* = 0.24 (*r*<sup>2</sup> = 6%), and was no longer significant. Thus, controlling for *C* reduced the variance in *DSS* explained by FA by about 50%.

**Table 4 | Based on linear regression model tests with DTI measures as dependent variables, and sex, age, and age<sup>2</sup> as independent variables.**

Degrees of freedom for F = 3, 259 for all analyses. Avg, average parameter value across TBSS skeleton; BCC, body of corpus callosum; CGC, cingulum; CR, corona radiata; CST, corticospinal tract; EC, external capsule; FX, fornix; GCC, genu of corpus callosum; IC, internal capsule; IFO, inferior longitudinal fasciculus; PTR, posterior thalamic radiation; SCC, splenium of corpus callosum; SFO, superior fronto-occipital fasciculus; SLF, superior longitudinal fasciculus; SS, sagittal stratum; UNC, uncinate fasciculus. \*p < 0.01, \*\*p < 0.001, \*\*\*p < 0.0001.

**Table 5 | Based on partial correlations controlling for sex and time elapsed between behavioral testing and DTI scanning.**

\*p ≤ 0.05, \*\*p ≤ 0.01, \*\*\*p ≤ 0.001, #non-significant after controlling for age and age<sup>2</sup>.

In the old group, we examined whether the correlations between MD and *DSS*, and between MD and *g*F, were mediated by *t*<sub>0</sub>. For the MD-*DSS* correlation, criteria 1 and 2 were satisfied because MD was significantly correlated with both *DSS* and *t*<sub>0</sub>, also after controlling for age and age<sup>2</sup>. However, criterion 3 was not satisfied since *t*<sub>0</sub> was not significantly correlated with *DSS* after controlling for MD (*r* = −0.16, *p* = 0.073). The correlation between MD and *DSS* was reduced from *r* = −0.34 (*r*<sup>2</sup> = 12%) to *r* = −0.27 (*r*<sup>2</sup> = 7%) when *t*<sub>0</sub> was partialled out (criterion 4). This corresponds to a reduction in explained variance of about 42%. Thus, only three of the four criteria were satisfied.

For the MD-*g*F correlation, criteria 1 and 2 were satisfied since MD was significantly correlated with both *g*F and *t*<sub>0</sub>. The *t*<sub>0</sub>-*g*F correlation was significant after controlling for age, age<sup>2</sup>, and average MD (criterion 3), and the correlation between MD and *g*F was reduced from *r* = −0.34 (*r*<sup>2</sup> = 12%) to *r* = −0.28 (*r*<sup>2</sup> = 8%) when *t*<sub>0</sub> was partialled out (criterion 4). This corresponds to a reduction of about 33% in the variance in *g*F explained by MD.
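The criterion 4 reductions reported in this section can be checked from the rounded *r*<sup>2</sup> percentages given in the text. The sketch below is only an arithmetic check; the function name and dictionary labels are ours and are not part of the original analysis.

```python
def explained_variance_reduction(r2_before, r2_after):
    """Proportional drop in explained variance after partialling out
    the candidate mediator (Baron and Kenny's criterion 4)."""
    if r2_before <= 0:
        raise ValueError("r2_before must be positive")
    return (r2_before - r2_after) / r2_before


# Rounded r^2 values (in %) as reported in the text: (before, after).
reported = {
    "FA-DSS, controlling for C (young group)": (12, 6),
    "MD-DSS, controlling for t0 (old group)": (12, 7),
    "MD-gF, controlling for t0 (old group)": (12, 8),
}

reductions = {
    label: explained_variance_reduction(before, after)
    for label, (before, after) in reported.items()
}

for label, value in reductions.items():
    print(f"{label}: {value:.0%}")
```

Because the inputs are rounded to whole percentages, the computed reductions (about 50%, 42%, and 33%) are approximate, as the text indicates.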

#### *Analysis of regional specificity of behavior-DTI correlations*

To identify potential ROI-specific associations between behavioral measures and WM tracts that were significant at the whole-skeleton level, we performed partial correlation analyses with FA, *C*, *DSS*, and *g*F in the data from young participants, and with MD, *t*<sub>0</sub>, *DSS*, and *g*F in the data from the old participants with all ROIs, including also ROI subsections (i.e., anterior, posterior, etc.) when available. The results are displayed in **Tables 6**, **7**.

In FA-related correlations, the most prominent ROIs were the corpus callosum and the corona radiata. The power to reveal significant differences between correlations was limited. However, Steiger's *Z* test analyses (Steiger, 1980) revealed that the correlation between *C* and the anterior corona radiata was larger than the correlation between *C* and the CST (*p* < 0.01, one-tailed), and larger than the correlations between *C* and the SLF, *C* and the FX, and *C* and the RLIC (*p* < 0.05, one-tailed), but was not significantly different from the other correlations involving *C*. Results were similar after adding age and age<sup>2</sup> as control variables, and similar across behavioral traits.

**Table 6 | Partial correlations between ROI FA values and behavioral measures, controlling for sex, and time elapsed between TVA and DTI acquisition.**

Age < 50. Green, p ≤ 0.05; yellow, p ≤ 0.01. \*Significant after adding age and age<sup>2</sup> as control variables.

The MD-related correlations tended to be stronger and the sample size was also larger. The most prominent ROIs were the internal capsule, the sagittal stratum, the corona radiata, the superior longitudinal fasciculus, and the corpus callosum. Within the internal capsule, the posterior limb and the retrolenticular part were most strongly associated with behavior. Within the corona radiata, the anterior part had the largest correlations. Steiger's *Z* test analyses (Steiger, 1980) revealed that significant *t*<sub>0</sub>-ROI correlations were not significantly different from each other (*p* > 0.05, one-tailed), but were significantly different from *t*<sub>0</sub>-ROI correlations that were nonsignificant in the primary analysis (*p* < 0.01, one-tailed). The pattern of correlation appeared quite similar for the three behavioral measures included. After adding age and age<sup>2</sup> as control variables, only partial correlations involving CR and IC remained significant for *t*<sub>0</sub>. The *DSS* and *g*F correlations were more resistant to controlling for age, particularly for projection fibers.

**Table 7 | Partial correlations between ROI MD values and behavioral measures, controlling for sex, and time elapsed between TVA and DTI acquisition.**

Age ≥ 50. Green, p ≤ 0.05; yellow, p ≤ 0.01; orange, p ≤ 0.001. \*Significant after adding age and age<sup>2</sup> as control variables.
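The comparison of dependent correlations referred to above can be sketched as follows. This is a minimal illustration of Steiger's (1980) *Z* test for two correlations that share one variable (e.g., one behavioral measure correlated with two ROIs), using the pooled-correlation form of the statistic. All input values are hypothetical, and treating `n` as the raw sample size is an assumption: with partial correlations, the effective sample size would be reduced by the number of covariates.

```python
import math


def steiger_z(r_jk, r_jh, r_kh, n):
    """Steiger's Z for comparing two dependent correlations r_jk and r_jh
    that share variable j; r_kh is the correlation between k and h."""
    z_jk = math.atanh(r_jk)  # Fisher r-to-z transform
    z_jh = math.atanh(r_jh)
    r_bar = (r_jk + r_jh) / 2.0  # pooled estimate under the null hypothesis
    psi = (r_kh * (1.0 - 2.0 * r_bar ** 2)
           - 0.5 * r_bar ** 2 * (1.0 - 2.0 * r_bar ** 2 - r_kh ** 2))
    s_bar = psi / (1.0 - r_bar ** 2) ** 2  # covariance of the two z values
    return (z_jk - z_jh) * math.sqrt(n - 3) / math.sqrt(2.0 - 2.0 * s_bar)


def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical inputs: a behavioral measure correlates 0.35 with ROI k and
# 0.10 with ROI h; the two ROIs correlate 0.50 with each other; n = 147.
z = steiger_z(r_jk=0.35, r_jh=0.10, r_kh=0.50, n=147)
print(f"Z = {z:.2f}, one-tailed p = {one_tailed_p(z):.4f}")
```

Because DTI values from different ROIs tend to be highly intercorrelated, `r_kh` is typically large in practice, which increases the sensitivity of the test relative to a comparison of independent correlations.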

#### **DISCUSSION**

The main purpose of the present study was to characterize effects of age on parameters derived from TVA-based assessment, and on brain structural connectivity as defined by DTI-based indices of WM microstructure. Furthermore, we aimed to investigate the extent to which the age-related effects on TVA parameters were associated with age-related differences in DTI indices, and specifically whether it can be shown that DTI indices mediate effects of age on TVA parameters, and whether TVA parameters can be shown to mediate the relationship between DTI indices and age-related decline in *DSS* and *g*F.

#### **TVA PARAMETERS AND AGE**

The four TVA parameters, *K*, *t*<sub>0</sub>, *C*, and α, were all significantly associated with participant age. *K* was characterized by a linear decline from an average of about 3.7 objects at age 20 to about 2.5 objects in the 80s (∼32%). *C*, the speed of processing, revealed a similar linear age-related trend with a decline from about 65 items/second at the age of 20 to about 40 items/second by the age of 80 (∼38%). The age effect on *t*<sub>0</sub> was different: There was a modest increase with age until the 50s (about 1 ms, or ∼6%), but from here to the age of 80, *t*<sub>0</sub> increased by another 10 ms (∼59%). In line with this observation, *R*<sup>2</sup> was significantly improved when the quadratic age term was included in the model. The age effect on the α parameter was more complex. This parameter started to increase already in the 20s, continued to increase until the early 40s and was relatively stable thereafter, but with a trend toward an improvement for the oldest participants. As for *t*<sub>0</sub>, *R*<sup>2</sup> for the α parameter was significantly improved when the quadratic age term was included in the model. The age trends for *DSS* and *g*F were linear and relatively strong compared to those of the TVA parameters.

The descriptive properties of the TVA parameters as measured in the present study correspond well to those reported from other samples (Vangkilde et al., 2011; McAvinue et al., 2012; Habekost et al., 2014). The effects of participant age were also broadly comparable to those reported in prior studies (McAvinue et al., 2012; Habekost et al., 2013). Comparison with the results of McAvinue et al. (2012) is of particular interest since the age ranges overlap substantially between studies and since the paradigms used were identical. The age effects on *K*, and especially on *C* were weaker in the present study than those reported by McAvinue and colleagues. For *K* this seems to be due to the combination of somewhat lower estimates for the youngest participants (i.e., 20s and 30s) in the present study as compared to the Irish sample (∼3.4 vs. ∼3.5, respectively), and higher scores in the older part (i.e., 60s and 70s) of the sample (∼2.7 vs. ∼2.5, respectively). For *C* it seems that the relatively small age effect is due to high scores for the old participants (∼41 vs. ∼34, respectively). Age effects on *t*<sup>0</sup> and α were of similar magnitude across studies, and interestingly, both studies revealed similar age trajectories also for these parameters: Early increase followed by later stability for α, early stability and later increase for *t*0. This led us to rerun the linear regressions with two age groups split at 50 years of age. This analysis revealed that the age effect on α was relatively larger for participants under the age of 50, whereas *R*<sup>2</sup> was relatively larger for *t*<sup>0</sup> for participants above the age of 50. In fact, for this age range, *t*<sup>0</sup> changes were the dominant pattern, with an effect beyond what was observed for *DSS* and even *g*F. 
Age effects (i.e., slopes) on the *t*<sub>0</sub> and α parameters were significantly different across age groups, as confirmed by linear regression analysis where the age group × age interaction term was included. For *K*, *C*, and *DSS* the effect size was similar for the two age groups. The pattern of bivariate correlations indicates the relative independence of the TVA parameters, with the exception of the *K-C* association replicated in most TVA-based studies (Bundesen and Habekost, 2008; Habekost et al., 2014), and the differential age trajectories provide further support for the notion of parameter independence.

The parameter for attentional selectivity, α, was best described by a quadratic term: the age-related effect was mainly seen early in adult life. This finding may suggest only a partial fit with the frontal aging hypothesis, which posits that cognitive processes that are supported by frontal lobe functions, such as executive attention, should show decline at an earlier age, and that this decline should also be of greater magnitude than for cognitive processes supported by other brain regions (West, 1996; Verhaeghen and Cerella, 2002). Our results support the former prediction, but not the latter. *K* and *t*0, which arguably represent lower level, or more basic visual attention capacity functions, showed age-related effects of greater magnitude. This pattern of results may indicate that more peripheral functional systems are becoming less efficient in the aging brain, while central functions, that may be involved in compensation or scaffolding processes are relatively more preserved (Madden, 2007; Park and Reuter-Lorenz, 2009).

### **DTI PARAMETERS AND AGE**

White matter integrity, as measured by DTI indices, declines in normal aging (Westlye et al., 2010; Madden et al., 2012; Salami et al., 2012), and the present results fit well with prior research in this field by revealing decreasing FA and increasing MD with increasing age. In some studies, it has also been shown that RD increases relatively more with age than AD, suggesting that WM tract disconnection is driven preferentially by myelin changes. The present results did not support this since AD and RD age trajectories were similar and the measures almost perfectly correlated. The high correlation between diffusivity indices does not seem to merely reflect the large age range used in the present study, as correlations were essentially unaffected by partialling out age and were similar for the age groups employed. Thus, the effects may be equally driven by differences in myelin and in axonal integrity. The literature is less clear with regard to the regional pattern of age-related effects. Several organizational schemes have been presented, including the idea that anterior regions deteriorate earlier than posterior regions (Pfefferbaum et al., 2000), and the idea that the regions last to be myelinated are the first to be demyelinated (also known as the retrogenesis hypothesis) (Bartzokis, 2004). In the present results, there were strong effects for some of the anterior tracts (e.g., the GCC) and for tracts that project to and from anterior regions (e.g., the SLF and CGC), but also significant effects in posterior tracts (e.g., SS and IC). The present results are consistent with a relatively widespread WM-based disconnection of brain networks in healthy aging.

#### **MEDIATION OF AGE EFFECTS ON TVA PARAMETERS**

In the present study we aimed to assess the extent to which DTI measures mediated age effects on TVA parameters. According to Baron and Kenny's (1986) four criteria for mediation, significant associations between (1) the independent variable (i.e., age) and the dependent variable (i.e., a TVA parameter), and (2) between the independent variable and the mediator (i.e., a DTI index) must be established. For the complete sample these two criteria were satisfied since age was associated with differences in *K*, *t*0, *C*, and α, and also with averaged FA and MD. Following Baron and Kenny, it needs to be established that (3) the mediator is associated with the dependent variable after controlling for the independent variable, and (4) that the association between the independent variable and the dependent variable is weakened when controlling for the potential mediator. In the full sample, the only significant TVA-DTI association that resisted statistically controlling for age and age<sup>2</sup> was the MD-*t*<sup>0</sup> correlation. However, this correlation was almost entirely due to effects in the old group, and we therefore assessed mediation effects in this group alone. In the old subsample, all four criteria were satisfied for the age-*t*<sup>0</sup> association, since MD was significantly correlated with *t*<sup>0</sup> even after controlling for age, and since controlling for MD clearly reduced the correlation between age and *t*0. It could also be the case that FA mediated the association between age and *C* since

#### **THE NEUROANATOMY OF TVA**

NTVA is a general neurophysiological interpretation of TVA that does not depend on any specific anatomical localization of TVA computations. However, Bundesen et al. (2005) have suggested a thalamic model of NTVA in which attentional capacity depends on the functional interconnection of thalamic nuclei and visual processing units in the occipital and parietal lobes. In particular, it is suggested that η values from the geniculo-striate pathway are integrated with pertinence values from fronto-parietal cortical systems in a priority map that represents the attentional weights of objects in the visual field. Possibly, the priority map is localized in the pulvinar nucleus in the posterior thalamus. The pulvinar nucleus has been implicated in processes related to visual attention in several studies (Petersen et al., 1987; Saalmann et al., 2012; for a review see Saalmann and Kastner, 2011), and the pulvinar is connected with posterior visual regions via WM tracts that pass through the internal capsule, particularly PLIC and RLIC. In NTVA, it is suggested that the end product of the first wave of selection is the attentional weights that are stored in the priority map. In the second wave of selection, this information is multiplied with β-values in the cortex, the product of which is transmitted to a VSTM map, possibly located in the reticular nucleus in the thalamus. Thus, WM tracts that connect the thalamus with cortical regions should be closely related to performance on TVA-based assessment. Projection fibers were strongly related to performance data. In particular, the IC, SS, and CR ROIs were strongly associated with *t*<sub>0</sub>, and this relation was stronger and more specific as age increased, particularly for the oldest participants (70–80 years), for whom *t*<sub>0</sub> and MD had the strongest effect. However, sample size becomes limited if we focus narrowly on the oldest participants and we have chosen not to emphasize such analyses here. 
Controlling for age and age<sup>2</sup> in the regional correlation analysis revealed that only the IC (particularly PLIC) and CR (particularly ACR) remained significant. Furthermore, TVA parameters were associated with fiber tracts that inter-connect frontal, parietal, and occipital cortical regions, including the sagittal stratum and the SLF. Thus, the current data appear to be generally consistent with some of the predictions likely to be made based on NTVA. However, there are a number of qualifications to this interpretation. For example, the associations were not very specific—although some tracts were more strongly associated than others, there was a gradient of correlation strengths, making the associations quantitatively, rather than qualitatively different from each other. Statistical comparison of correlation coefficients could not contribute to specification beyond what is evident in **Tables 6**, **7**. Furthermore, the directionality of connections between brain regions cannot be inferred from DTI images. Finally, there is no way of inferring the sequence of involvement for each region of the brain in correlational studies like this.
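The two waves of selection described for NTVA above can be made concrete with a minimal sketch of the standard TVA weight and rate equations, in which attentional weights are sums of η values weighted by pertinence, and processing rates are η values multiplied by β values and by the relative attentional weight. The object names, feature names, and all numerical values below are hypothetical and chosen only for illustration.

```python
def attentional_weights(eta, pertinence):
    """First wave: w_x = sum over features j of eta[x][j] * pertinence[j]."""
    return {
        obj: sum(eta[obj][feat] * pi for feat, pi in pertinence.items())
        for obj in eta
    }


def processing_rates(eta, beta, weights):
    """Second wave: v(x, i) = eta[x][i] * beta[i] * w_x / sum_z w_z."""
    total = sum(weights.values())
    return {
        (obj, cat): eta[obj][cat] * bias * weights[obj] / total
        for obj in eta
        for cat, bias in beta.items()
    }


# Hypothetical display with one target and one distractor.
eta = {
    "target": {"red": 0.9, "letter_A": 0.8},
    "distractor": {"red": 0.1, "letter_A": 0.6},
}
pertinence = {"red": 1.0}   # selection criterion: attend to red objects
beta = {"letter_A": 0.5}    # perceptual bias for the report category

weights = attentional_weights(eta, pertinence)
rates = processing_rates(eta, beta, weights)

alpha = weights["distractor"] / weights["target"]  # selectivity, as a weight ratio
total_rate = sum(rates.values())                   # summed rate, corresponding to C
print(weights, rates, alpha, total_rate)
```

In this sketch the α parameter corresponds to the ratio of distractor to target weights, and the summed rate corresponds to the processing capacity *C*; neither *K* nor *t*<sub>0</sub> is represented.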

#### **DO THE TVA PARAMETERS MEDIATE THE RELATIONSHIP BETWEEN DTI INDICES, PROCESSING SPEED, AND FLUID INTELLIGENCE?**

To assess the extent to which TVA parameters mediated the relationship between DTI indices, processing speed, and fluid intelligence, we analyzed Baron and Kenny's (1986) four criteria for mediation. Similar to the approach for DTI-mediation of age effects on TVA parameters, we performed the analyses in young and old subsamples. In the young group, FA was correlated with *DSS* and *C*, also after controlling for age and age<sup>2</sup> (criteria 1 and 2). The *C-DSS* correlation was significant after controlling for age, age<sup>2</sup>, and average FA (criterion 3), and the FA-*DSS* correlation was clearly reduced when *C* was controlled for (criterion 4). Thus, according to these criteria, *C* partly mediated the relationship between FA and *DSS* performance in young participants. This correlation seemed to be primarily driven by the anterior parts of the corpus callosum and corona radiata.

In the old group, MD was correlated with *DSS* and *g*F, also after controlling for age and age<sup>2</sup> (criteria 1 and 2). The *t*<sub>0</sub>-*g*F correlation, but not the *t*<sub>0</sub>-*DSS* correlation, was significant after controlling for age, age<sup>2</sup>, and average MD (criterion 3), and the MD-*g*F correlation was clearly reduced when *t*<sub>0</sub> was controlled for (criterion 4). Thus, according to Baron and Kenny's criteria, the relationship between MD and *g*F was partly mediated by *t*<sub>0</sub> in participants above the age of 50. The regional source of this correlation was quite distributed and seemed to be strongest in projection fibers and the SLF.

In the present study we had concurrent psychometric, psychophysical (TVA), and DTI data from a relatively large sample of healthy individuals covering the adult age range. This enabled us to make direct comparisons between multicomponent and subcomponent indices of processing speed with regard to correlations with age and with WM structural connectivity. As expected, age effects were larger on the multicomponent measures than on subcomponents, but association with FA was similar in magnitude, and the relation between FA and *DSS* was partially mediated by *C*. Furthermore, the pattern of association with individual WM tract ROIs was similar for the two measures. This may be taken to indicate that, at least for participants under the age of 50, *DSS* is sensitive to the same neurobiological factors as the computationally more narrowly defined *C* parameter, but that it also captures variance that is unrelated to processing speed *per se*. This can be inferred from the significant correlations with the other TVA parameters, in particular *K* and *t*<sub>0</sub>. It also seems likely that *DSS* captures variance from computations related to, for example, memory, executive functions, and motor planning and execution that were not individually estimated in the current study (Ratcliff, 1978).

The latter point is underscored by comparison with the results in the older cohort, in which the *DSS* correlations with *K* and *C* are clearly reduced, whereas the correlation with *t*<sub>0</sub> is essentially unchanged. Also, whereas correlations with WM integrity estimates were similar for *DSS* and *C* for young participants, for old participants they were similar for *DSS* and *t*<sub>0</sub>. However, since *t*<sub>0</sub> and *DSS* were not significantly correlated when sex, age, age<sup>2</sup>, and average MD were controlled for, we were not able to show that *t*<sub>0</sub> mediated the correlation between *DSS* and average MD.

Processing speed is a well-known correlate of fluid intelligence (Deary, 2000; Jensen, 2006). In the context of cognitive aging, it has been claimed that decline in higher cognitive functions is largely determined by the efficiency with which simple mental operations can be correctly completed (Salthouse, 1996). However, studies cited in support of this idea have typically employed computationally complex paper-and-pencil tests such as the *DSS*, and as argued above, it is not clear that it is the processing speed aspect of *DSS* that is associated with fluid intelligence. By use of tasks in which performance is unrelated to motor components, researchers have tried to connect age-related decline in processing speed more unambiguously to age-related decline in fluid intelligence. For example, Ritchie et al. (2014), who analyzed longitudinal cognitive aging data from a large sample of healthy individuals in their 70s that had been tested three times with an IT task and other psychometric measures, showed that IT and *g*F were significantly correlated at baseline (*r* = 0.46). Interestingly, when looking at performance changes (i.e., slopes) over a six-year period, the correlation was much stronger (*r* = 0.78), suggesting that there really is a functional connection between age-related declines in perceptual processing speed and fluid intelligence. However, as argued above, and elsewhere (McAvinue et al., 2012; Habekost et al., 2013), IT paradigms cannot distinguish between processing speed (the rate of perceptual processing) and perceptual threshold (the delay of processing onset after stimulus presentation). In light of the results in the present study, an alternative interpretation of the results of Ritchie et al. (2014), and indeed of the processing speed theory of cognitive aging (Salthouse, 1996), is that at least part of the correlation between decline in processing speed and decline in higher order cognition is due to an age-related elevation in the perceptual threshold. 
The finding that *t*<sup>0</sup> partially mediated the correlation between WM diffusivity and *g*F, and that the regional pattern of correlations with WM tract integrity indices seemed to be largely similar for *t*<sup>0</sup> and *g*F, with relatively stronger correlations for projection fibers, may be taken to support the view that there is a common biological substrate for these measures.
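The argument that a single-exposure measure cannot separate processing speed from perceptual threshold can be illustrated with a minimal sketch. It assumes the exponential encoding function commonly used in TVA for a single stimulus, in which the probability of encoding grows with the processing rate multiplied by the effective exposure duration (exposure minus *t*<sub>0</sub>). The parameter values are hypothetical and are not estimates from the present study.

```python
import math


def p_encoded(rate_per_s, t0_s, exposure_s):
    """Probability that a single item is encoded, assuming exponential
    processing that starts t0 seconds after stimulus onset."""
    effective_exposure = max(0.0, exposure_s - t0_s)
    return 1.0 - math.exp(-rate_per_s * effective_exposure)


# Two hypothetical observers: one fast with a high threshold,
# one slower with a low threshold.
fast_high_threshold = {"rate_per_s": 50.0, "t0_s": 0.020}
slow_low_threshold = {"rate_per_s": 40.0, "t0_s": 0.010}

# At a single 60 ms exposure the two observers are indistinguishable.
p_fast_60 = p_encoded(exposure_s=0.060, **fast_high_threshold)
p_slow_60 = p_encoded(exposure_s=0.060, **slow_low_threshold)

# At a second exposure duration (100 ms) their predicted accuracy differs.
p_fast_100 = p_encoded(exposure_s=0.100, **fast_high_threshold)
p_slow_100 = p_encoded(exposure_s=0.100, **slow_low_threshold)

print(p_fast_60, p_slow_60, p_fast_100, p_slow_100)
```

Under these assumptions, the two parameter sets produce the same accuracy at one exposure duration but diverge at another, which is why estimating both parameters requires performance data across several exposure durations.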

### **CONCLUSION**

The findings on the neuroanatomy of TVA are generally consistent with the thalamic model in NTVA, and thus provide convergent evidence. Furthermore, the results may give clues to the source, in neuroanatomical and computational/information-processing terms, of age-related decline in multicomponent measures of processing speed and fluid intelligence. Before the age of 50, age effects on *DSS* may be mediated by changes in FA and *C*. After the age of 50, age effects on *g*F may be mediated by changes in WM diffusivity, particularly in projection fibers, and *t*<sub>0</sub>. This set of findings is consistent with the notion that the aging brain is characterized by cognitive slowing due to WM tract disconnectivity, as has been argued previously by several investigators (Madden et al., 2012; Penke et al., 2012a,b; Haász et al., 2013). The current study takes this analysis one step further by showing that the association between DTI indices and processing speed, or between DTI indices and fluid intelligence, is partly mediated by more specific computational processes as defined by *C* and *t*<sub>0</sub>, and that these associations differ across the adult age range.

#### **LIMITATIONS**

There are several limitations associated with the current study. One limitation is connected to the problem of interpreting cognition-brain structure correlations. First of all, the data are correlational, so the presence of a significant association does not imply causation. Correlations may be due to other measured or unmeasured variables. Furthermore, it does not generally follow that a lack of correlation means that there is no functional relation between a certain behavioral measure and the anatomical properties of a structure. It has often proved challenging to reveal behavior-brain structure correlations (Van Petten, 2004). Although a specific cognitive function may be dependent on a particular brain substrate, the relation may not emerge in correlation analysis unless the structure is sufficiently different from normal or until a sufficient amount of damage has accumulated (Westlye et al., 2012a). Moreover, structural connectivity is only part of a story in which functional or synaptic connectivity may be even more important for certain cognitive processes (Friston, 1998). The importance of WM vs. synaptic connectivity may vary between behavioral measures. For TVA parameters, one could speculate whether *K* and α might be more strongly related to synaptic connectivity.

Another limitation is the sample size. Although we are not aware of published studies with larger samples with concurrent TVA-based assessment data and DTI, there are limitations to the ability to split into subgroups for more specific follow-up analyses. Future studies should aim for even larger sample sizes if correlational analyses are to be more decisive. Lastly, acquisition of behavioral data and DTI scanning was done about one year apart. We statistically controlled for differences in the elapsed time between the two types of data sampling, and the effects of this variable did seem to be negligible. However, future studies should aim to do TVA-based assessment and MRI scanning within as short a period of time as possible.

### **ACKNOWLEDGMENTS**

This study was supported by grants (177458/V50 and 231286) from the Research Council of Norway to Thomas Espeseth. Lars T. Westlye was funded by the Research Council of Norway (204966/F20). Anders Petersen and Signe Vangkilde were funded by grants from the University of Copenhagen, Programme of Excellence and the Danish Council for Independent Research (DFF) research career programme Sapere Aude.

### **REFERENCES**


*Neurobiol. Aging* 33, 433.e21–433.e31. doi: 10.1016/j.neurobiolaging.2011.02.001


Geschwind, N. (1965a). Disconnexion syndromes in animals and man. I. *Brain* 88, 237–294. doi: 10.1093/brain/88.2.237


to evolving optic nerve injury in mice with retinal ischemia. *Neuroimage* 32, 1195–1204. doi: 10.1016/j.neuroimage.2006.04.212


correlated with fractional anisotropy, but not cortical thickness, in the medial temporal lobe. *Neuroimage* 63, 507–516. doi: 10.1016/j.neuroimage.2012.06.072


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 July 2014; paper pending published: 24 August 2014; accepted: 28 September 2014; published online: 21 October 2014.*

*Citation: Espeseth T, Vangkilde SA, Petersen A, Dyrholm M and Westlye LT (2014) TVA–based assessment of attentional capacities–associations with age and indices of brain white matter microstructure. Front. Psychol. 5:1177. doi: 10.3389/fpsyg.2014.01177*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Espeseth, Vangkilde, Petersen, Dyrholm and Westlye. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**METHODS ARTICLE** published: 07 October 2014 doi: 10.3389/fpsyg.2014.01137

## Normative perceptual estimates for 91 healthy subjects age 60–75: impact of age, education, employment, physical exercise, alcohol, and video gaming

### *Inge L. Wilms\* and Simon Nielsen*

Brain Research and Advanced Technology Laboratory, Department of Psychology, University of Copenhagen, Copenhagen, Denmark

#### *Edited by:*

Bernhard Hommel, Leiden University, Netherlands

#### *Reviewed by:*

Thomas Espeseth, University of Oslo, Norway Gernot Horstmann, Bielefeld University, Germany

#### *\*Correspondence:*

Inge L. Wilms, Brain Research and Advanced Technology Laboratory, Department of Psychology, University of Copenhagen, Oester Farimagsgade 2A, 1353 Copenhagen K, Denmark e-mail: inge.wilms@psy.ku.dk

Visual perception serves as the basis for much of the higher level cognitive processing as well as human activity in general. Here we present normative estimates for the following components of visual perception: the visual perceptual threshold, the visual short-term memory (VSTM) capacity and the visual perceptual encoding/decoding speed (processing speed) of VSTM based on an assessment of 91 healthy subjects aged 60–75. The estimates were modeled from input from a whole-report assessment based on a theory of visual attention. In addition to the estimates themselves, we present correlational data, and multiple regression analyses between the estimates and self-reported demographic data and lifestyle variables. The regression statistics suggest that education level, video gaming activity, and employment status may significantly impact the encoding/decoding speed of VSTM but not the capacity of VSTM nor the visual perceptual threshold. The estimates will be useful for future studies into the effects of various types of intervention and training on cognition in general and visual attention in particular.

**Keywords: visual perception, normative estimates, processing speed, TVA, gaming, senior citizens, cognitive decline**

### **INTRODUCTION**

The method demonstrated in this chapter is a computational assessment of visual processing capacity in 91 healthy subjects aged 60–75. The output is a set of normative estimates as well as patterns of correlation between this capacity, demographic variables, and lifestyle variables as reference for future studies into the effect of training and intervention. The normative visual processing capacity estimates in this chapter are provided for the total sample of subjects as well as for the critical demographic variables: age, gender, level of education, and employment status. In addition, we provide analyses of the influence of self-reported daily activities such as casual video gaming, alcohol consumption, smoking, physical exercise, and meditation.

#### **BACKGROUND**

Many studies have demonstrated that the processing speed of the brain is susceptible to training throughout life (Takeuchi et al., 2011). This offers hope for prolonging the cognitive quality of life in both healthy and brain injured senior citizens through training intervention. However, cognitive effect studies are notoriously difficult to manage as many different aspects apart from the training itself may influence the cognitive ability being trained. In the recruitment of test subjects, it is therefore important to have normative data that reflects other activities which might influence the cognitive ability both before and during the training period.

As part of a larger study into the effects of cognitive training, we therefore wanted to model the visual capacity in a sample of healthy senior citizens, taking into account the influence of demographic data as well as self-reported lifestyle activities. The idea was to assess which of the most common activities of daily living need to be taken into account when doing cognitive effect studies within the field of visual attention.

The chapter introduces normative data collected from a whole-report paradigm based on a theory of visual attention (TVA; Bundesen, 1990). We chose the whole-report paradigm as it has been used for many years to measure the capacity of visual perception (Sperling, 1960; Rumelhart, 1970; Shiffrin and Gardner, 1972). Based on the whole-report data, TVA estimates the temporal threshold of conscious perception in milliseconds (*t*<sub>0</sub>), the speed of visual processing measured in letters per second (*C*), and visual short-term memory (VSTM) capacity measured in number of letters (*K*) (Duncan et al., 1999). The advantage of using TVA-based assessment is that its measures are unspeeded and accuracy-based, which makes it possible to characterize different aspects of attention while avoiding confounding impact from motor components. This is particularly important when investigating effects of training or specific conditions (e.g., brain injury or neuropsychiatric disorders), which might affect both perceptual and motor functioning. TVA assessment has previously been used to successfully account for a range of behavioral and neurophysiological attentional effects (for a review see Bundesen and Habekost, 2008), and the theory provides a theoretical and empirical framework for investigating and explaining attention in both normal subjects (Finke et al., 2006; Jensen et al., 2012) and patients (Duncan et al., 1999; Bublak et al., 2006; Habekost and Rostrup, 2006, 2007; Habekost and Starrfelt, 2009; Redel et al., 2012; Starrfelt et al., 2013). The retest reliability of the TVA assessment of the *C* and the *K* parameter has also been demonstrated to be robust (Habekost and Bundesen, 2003; Habekost and Rostrup, 2006; Habekost et al., 2014). Different parameters can be modeled based on input from different paradigms.
Other examples of paradigms for TVA assessment include the partial-report paradigm, which also estimates the selectivity of perceiving targets in the presence of distractors (Vangkilde et al., 2011), and a single-letter paradigm (Petersen and Andersen, 2012), which measures psychometrics of object identification in visual processing.

#### **ABOUT ASSESSMENT USING TVA ESTIMATES**

The theory of visual attention (TVA) is a formal computational theory of how the visual attention system selects, amongst incoming visual stimuli, those relevant for the task at hand (for a comprehensive account see Bundesen, 1990). According to TVA, the selection amongst incoming visual stimuli is a parallel processing race in which the attributes of objects in the visual field compete for access to a VSTM with a limited capacity of *K* elements. Only *K* items will, at any time, be selected and encoded into VSTM for later conscious actions. However, in line with the ideas of Desimone and Duncan (1995), the race is seen as a biased competition, in which the chances of winning the race are not equal for all objects and categories. Other aspects of the items in the visual field, such as priming effects, spatial distribution, prior training, noise, and contrast, may influence the encoding speed of certain objects and categories.

Encoding into VSTM is thought to proceed in two stages: in the first stage, attentional weights are computed and assigned to each element in the visual field according to their relevance. In the second stage, the total processing capacity of the visual system is distributed amongst the elements in proportion to their attentional weights. The capacity allocated to a particular element determines how fast this element is processed and how likely it is to become encoded into VSTM.
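The two stages can be illustrated numerically. The following is a minimal sketch, not the fitting procedure used in this study: it assumes hypothetical attentional weights, shares a total capacity *C* in proportion to those weights, and uses the exponential processing-time assumption of TVA to turn a rate into an encoding probability. The VSTM limit *K* is ignored, and all function names and values are illustrative.

```python
import math

def processing_rates(weights, C):
    """Stage 2: share total capacity C (letters/s) in proportion to attentional weights."""
    total = sum(weights)
    return [C * w / total for w in weights]

def encoding_probability(rate, exposure_ms, t0_ms):
    """Probability that an element finishes processing within the effective exposure
    (exposure minus threshold t0), assuming exponentially distributed processing
    times and no VSTM capacity limit."""
    effective_s = max(0.0, exposure_ms - t0_ms) / 1000.0
    return 1.0 - math.exp(-rate * effective_s)

# Hypothetical example: six equally weighted letters, C = 42 letters/s, t0 = 20 ms
weights = [1.0] * 6
rates = processing_rates(weights, C=42.0)           # each letter gets 7 letters/s
p = encoding_probability(rates[0], exposure_ms=80, t0_ms=20)
print(rates[0], p)
```

Exposures at or below the threshold yield a probability of zero in this sketch, which is why short unmasked and masked exposures are informative about *t*<sub>0</sub>.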

### **METHODS**

### **THE WHOLE-REPORT ASSESSMENT**

A whole-report assessment was conducted using software developed in PsychoPy (Peirce, 2007), which captured the responses of the test subjects. The sequence of the whole-report TVA paradigm is outlined in **Figure 1**. Each trial was initiated by pressing the space bar, which allowed the subjects to control the pace of the assessment. Upon the initiation of a trial, the subjects were asked to fixate on a centrally placed cross. After 1100 ms, a stimulus display appeared with six letters distributed on an imaginary circle at 9° eccentricity. Each stimulus subtended a visual angle of 2° at an approximate viewing distance of 65 cm. The stimulus duration was varied pseudo-randomly within blocks between 20, 30, 50, 80, 140, and 200 ms. The exposure time was limited to 200 ms to prevent eye movements from confounding the results. All 30, 50, 80, and 140 ms trials were immediately followed by a 500 ms mesh screen masking the positions of the six letters. The 20 and 200 ms trials were followed either by the same mesh screen or by a blank screen (the so-called unmasked condition). The masked conditions were included to ensure that estimates reflect the actual encoding and not a visual after-image.

The unmasked conditions of the shortest and longest exposure trials were included to improve modeling of the estimates of *t*<sub>0</sub> and *K*.

Subjects were encouraged to report the letters shown as best they could, aiming for an accuracy level between 70 and 90%. The trial procedure was self-paced by the subjects pressing the space bar. The test comprised eight blocks each of 36 trials. The first block was a training block. Thus seven blocks and a total of 252 trials with 28 repetitions per exposure duration were included in the analysis.

All tests were run on 21" ViewSonic G220f 215 CRT displays with a vertical refresh rate of 100 Hz to ensure precise timing of stimuli.
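On a CRT, a stimulus can only be shown for a whole number of screen refreshes, so the choice of refresh rate constrains the usable exposure durations. As a minimal sketch (the function name is illustrative and not part of the study software), the reported durations can be checked against the 100 Hz refresh rate:

```python
REFRESH_HZ = 100                        # vertical refresh rate reported in the text
DURATIONS_MS = [20, 30, 50, 80, 140, 200]

def frames_for(duration_ms, refresh_hz=REFRESH_HZ):
    """Return the number of refreshes needed to show a stimulus for duration_ms.
    Raises ValueError if the duration is not a whole number of frames."""
    frame_ms = 1000 / refresh_hz        # 10 ms per frame at 100 Hz
    frames = duration_ms / frame_ms
    if abs(frames - round(frames)) > 1e-9:
        raise ValueError(f"{duration_ms} ms is not a multiple of {frame_ms} ms")
    return round(frames)

frame_counts = {d: frames_for(d) for d in DURATIONS_MS}
print(frame_counts)
```

All six durations are multiples of the 10 ms frame time, so each can be presented exactly.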

#### **PARTICIPANTS**

A total of 91 healthy subjects were included in the test. The subjects were aged 60–75 (*n* = 91, *M* = 67.7, SD = 4.2), comprising males (*n* = 28, *M* = 68.9, SD = 4.2) and females (*n* = 63, *M* = 67.1, SD = 4.1), with different educational backgrounds and employment status. The subjects were recruited through the Facebook page of the local branch of The DaneAge Association ("Ældresagen"), an interest group for senior citizens in Denmark, and through advertisements in local newspapers. The participating subjects received a courtesy gift of chocolate and wine as well as thanks for participation. After signing a form of consent, potential participants were directed to a website where they were informed about the inclusion criteria. They had to be healthy with no history of brain injury, dementia, or diabetes. They would also be excluded if they were currently under medical treatment for a psychiatric disorder or suffered from color blindness. Eyesight (self-reported) had to be normal or corrected to normal.

A total of 167 potential participants filled out an initial questionnaire about demographic data, activities of daily living and self-reported cognitive functioning. Based on the responses, 55 subjects were excluded either because they did not fulfill the inclusion criteria or because they had personal reasons not to continue. The remaining 112 participants were tested with the TVA whole-report paradigm and other cognitive assessments including the MMSE. Data from a total of 21 subjects were excluded from the analysis: 7 due to modeling error, 11 because they failed to comply with the task instructions, and 3 because their TVA estimates fell beyond ±3 SD of the sample mean (Cousineau and Chartier, 2010). The exclusion of the latter three subjects had no consequence for the primary findings.
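The exclusion steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the example estimates are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev

def within_n_sd(values, n_sd=3.0):
    """Flag each value as inside (True) or outside (False) mean ± n_sd * SD."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1); an assumption, not stated in the text
    return [abs(v - m) <= n_sd * s for v in values]

# Participant flow as reported in the text
n_questionnaire = 167
n_tested = n_questionnaire - 55        # excluded after the questionnaire
n_final = n_tested - (7 + 11 + 3)      # modeling error, non-compliance, outliers

# Hypothetical C estimates with one extreme value
c_estimates = [49, 51] * 10 + [100]
keep = within_n_sd(c_estimates)
print(n_tested, n_final, keep.count(False))
```

Note that a ±3 SD rule can only flag values in reasonably large samples, because a single extreme value also inflates the standard deviation it is compared against.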

The randomness of the initial recruitment and subsequent exclusion of subjects resulted in a difference in the representation at the gender level (63 females, 28 males). This was taken into account in the further processing of data.

#### **THE QUESTIONNAIRE**

As part of the initial recruitment, subjects were asked to fill out a detailed questionnaire regarding their daily activities and lifestyle. We asked about all the information that we anticipated might influence cognitive abilities, such as level of education, state of employment, state of health both physically and mentally, the use of drugs both legal and illegal, smoking habits, alcohol consumption, physical exercise habits, the use of social media, meditation, and video gaming. This was to allow an investigation into the possible influence of these factors on the perceptual system. The inclusion of questions regarding gaming habits in this population may seem odd at first, but Danish society is highly digitized to the extent that e-mail and internet-based services are the regular way of communicating with public and private enterprise. Senior citizens have been encouraged to learn the use of computers and iPads and are regular users of both.

#### **STATISTICAL PROCEDURES AND MODELING**

The whole-report data were fitted with the TVA framework (Bundesen, 1990) using MATLAB and the LibTVA toolbox (Dyrholm et al., 2011) to extract the *t*<sub>0</sub>, *C*, and *K* parameters. The LibTVA toolbox and user guide can be downloaded freely from http://zappa.psy.ku.dk/libtva.

All statistics were produced using IBM SPSS v.20. In addition to the normative statistics in Section "The Normative Estimates (**Table 1**)," we assessed the relation between the TVA parameters (dependent variables) and the demographic and lifestyle variables (independent variables). To this end we used bivariate correlation procedures to estimate Pearson's correlation coefficients, and multiple regression procedures to estimate the individual influence and strength of prediction of the independent variables on each of the TVA estimates.
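For readers who want to reproduce the bivariate step outside SPSS, Pearson's correlation coefficient can be computed directly from its definition. The sketch below is illustrative only: the function name and the example data (gaming hours and *C* values) are hypothetical and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation: covariance of x and y divided by the product of their SDs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: gaming hours per week and processing speed C (letters/s)
gaming_hpw = [0.0, 0.5, 1.0, 2.0, 4.0, 6.0]
c_speed = [38.0, 41.0, 40.0, 47.0, 52.0, 55.0]
r = pearson_r(gaming_hpw, c_speed)
print(round(r, 3))
```

A bivariate coefficient of this kind does not control for other variables, which is why the multiple regression analyses are needed to separate the individual contributions.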

#### **ETHICAL CONSIDERATIONS**

This project was registered at The Regional Danish Ethical Committee for research in Copenhagen and ruled to be a nonclinical trial. Written, informed consent was obtained from all participants.

### **RESULTS**

Normative data for TVA estimates will be presented for the demographic variables Gender, Retired, Age\_Group, Education\_Group, and Gaming habits. Following the normative data, correlation statistics are presented to illustrate the intra-correlations of the independent variables and of the TVA estimates. The relationship and influence of the demographic and lifestyle variables (alcohol, exercise, and casual video-gaming) on the TVA estimates will be analyzed using multiple regression statistics.

Due to the randomness of the initial recruitment and subsequent exclusion of subjects, the gender groups differed substantially in size, yielding a strong bias toward females (63 females, 28 males). To assess the effect of this bias on the TVA estimates, we ran a multivariate analysis of covariance with gender as fixed factor, the TVA estimates as dependent variables, and the demographic and lifestyle variables as covariates, which showed no significant effect of gender (all *p*-values > 0.37). Thus gender is assumed not to influence the TVA estimates for the demographic and lifestyle variables presented here.

#### **NORMATIVE ESTIMATES**

In **Table 1**, distribution measures of TVA estimates are presented for each of the demographic variables Gender, Age\_Group, Retired, Education\_Group, and Gaming.

Six categories of education were originally represented in the data [1 = elementary education (7–9 years), 2 = technical school, 3 = high school, 4 = 2 years of higher education in addition to high school, 5 = bachelor, 6 = master level, or higher]. These categories have been collapsed into four groups due to a sparse representation of subjects at the lower education levels. Group 1 comprises the three basic categories of education (1–3), Group 2 corresponds to category 4, Group 3 to category 5, and Group 4 to category 6.
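The recoding described above amounts to a simple lookup. A minimal sketch, with illustrative names that are not taken from the study's SPSS files:

```python
# Original education category (1-6) -> collapsed Education_Group (1-4), as described in the text
EDUCATION_GROUP = {1: 1, 2: 1, 3: 1, 4: 2, 5: 3, 6: 4}

def education_group(category):
    """Map an original education category to its collapsed group."""
    try:
        return EDUCATION_GROUP[category]
    except KeyError:
        raise ValueError(f"unknown education category: {category!r}") from None

# Hypothetical questionnaire responses
responses = [1, 3, 4, 5, 6, 2]
groups = [education_group(c) for c in responses]
print(groups)
```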

#### **INFLUENCE OF DEMOGRAPHIC AND LIFESTYLE VARIABLES ON TVA ESTIMATES**

**Table 1** indicates that the TVA estimates vary depending on the grouping of data. In **Table 2**, correlational measures (Pearson) between the TVA estimates and the measured variables are presented. This includes the demographic variables (gender, retirement status, age, level of education) and the lifestyle variables: alcohol consumption (A\_DPM; Drinks per month, *M* = 26.4, SD = 23.4), physical exercise (E\_HPM; Hours per month, *M* = 18.1, SD = 16.4), and casual video-gaming (G\_HPW; Hours per week, *M* = 1.5, SD = 2.4).

#### *TVA intra-correlations*

There was a positive correlation between *C* and *K*, which has been reported previously (Habekost et al., 2014). In addition, there was a negative correlation between *t*<sub>0</sub> and *K*, which to our knowledge has not previously been reported. We suggest that the interaction may be causally related to age, which is supported by previous studies reporting an age-related increase in *t*<sub>0</sub> and decrease in *K* (McAvinue et al., 2012; Habekost et al., 2013). However, we were not able to confirm this hypothesis in our data (see Discussion of Effects of Age).

#### **Table 1 | Normative statistics of TVA estimates.**

#### *Variables correlating with TVA estimates*

There was a significant positive correlation (*p* < 0.05) between the processing speed estimate *C* and Age\_Group and G\_HPW (gaming hours per week). Surprisingly, there was a significant but negative correlation (*p* < 0.05) between *C* and E\_HPM (physical exercise hours per month; see Discussion of Effects of Physical Exercise). There was a highly significant correlation (*p* < 0.01) between the processing speed estimate *C* and retirement status (Retired).

While **Table 2** suggests interaction between some of the demographic and lifestyle variables with the TVA estimates, the correlational measures are prone to confounding biases from other factors. So to assess the causal relation and individual contribution of variance of the demographic and lifestyle variables (independent) to the TVA estimates (dependent), multiple regression analyses were run for each of the TVA estimates. The assumptions of linearity, independence of errors, homoscedasticity, unusual points and normality of residuals are all met for the analyses. Furthermore, there were no multicollinearity issues between the independent variables (which can also be verified from **Table 2**: all *r* < 0.7).

The analysis showed that the demographic and lifestyle variables did not predict the variations in visual perceptual threshold (*t*<sub>0</sub>), *F*(7,83) = 1.04, *p* = 0.41, adj. *R*<sup>2</sup> = 0.003, *p\_min* > 0.13. Nor did they predict the capacity of VSTM (*K*), *F*(7,83) = 0.51, *p* = 0.83, adj. *R*<sup>2</sup> = −0.04, *p\_min* > 0.12, where *p\_min* corresponds to the significance level of the most influential independent variable. This means that when controlling for the covariance in the reported measurements, no demographic or lifestyle variables significantly influenced the perceptual threshold *t*<sub>0</sub> or the VSTM capacity *K*. However, the demographic and lifestyle variables did predict the *C* parameter, *F*(7,83) = 3.49, *p* = 0.003, adj. *R*<sup>2</sup> = 0.162, and three of the variables contributed significantly (Retired, Education\_Group and G\_HPW). The results from the multiple regression analysis can be found in **Table 3**.
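The reported *F* values and adjusted *R*<sup>2</sup> values are linked by standard identities, which gives a quick consistency check. The sketch below assumes *n* = 91 subjects and 7 predictors, as implied by the degrees of freedom (7, 83); the function names are illustrative, and small deviations are expected because the published values are rounded.

```python
def r2_from_adjusted(adj_r2, n, p):
    """Invert adj. R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - adj_r2) * (n - p - 1) / (n - 1)

def f_statistic(r2, n, p):
    """Overall F test of a regression with p predictors and n observations."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

N, P = 91, 7  # sample size and number of predictors, giving F(7, 83)

# (reported adj. R^2, reported F) for t0, K and C
reported = {"t0": (0.003, 1.04), "K": (-0.04, 0.51), "C": (0.162, 3.49)}
recomputed = {
    name: f_statistic(r2_from_adjusted(adj, N, P), N, P)
    for name, (adj, _) in reported.items()
}
for name, f_val in recomputed.items():
    print(name, round(f_val, 2), "reported:", reported[name][1])
```

The recomputed *F* values agree with the reported ones to within rounding, which supports the reconstruction of the statistics above.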

### **DISCUSSION**

#### **THE NORMATIVE ESTIMATES (TABLE 1)**

The way the sample was recruited created a potential bias when compared to the performance of the general aging population and requires a word of caution. Firstly, it takes a certain amount of motivation and interest in science and research to act upon a written or electronic request to join a research program. Secondly, the level of education in the sample used is not entirely representative of the Danish population in general. According to information from the Statistics Denmark database, the current general level of education amongst the target population aged 60–69 is as follows: ∼30% of the population has only completed elementary school, 2% has completed high school, 42% has a technical or craftsmanship education, 4% a short education above high school, 16% has a bachelor education and 6% a master's degree or higher.

In our sample, only 5% had completed elementary school alone, almost 40% had a bachelor education and 20% had a master's degree or higher. So it might be a valid concern for the normative estimates that our sample is skewed toward a higher educational level. We chose to collapse the three basic levels of education (7–12 years of education) into one category (1) to improve the power of the measures. When using the estimates, it should be taken into consideration that category 1 includes participants with more than just the basic educational requirement of 9 years.

Significant correlations at the 0.05 level are marked with a \* and highly significant correlations below the 0.01 level are indicated by \*\*.

**Table 3 | The influence of lifestyle and demographic variables on the perceptual processing speed estimate** *C.*

B = unstandardized regression coefficient; SE<sub>B</sub> = standard error of coefficient; β = standardized coefficient. The regression coefficients describe the slope of the linear relation between the specific input variable and the output variable when the other input variables are fixed. Significant predictors at the p < 0.01 level are marked with \*\* and at the p < 0.05 level with \*.

Thirdly, the sample was screened for serious illness and medication that would impact cognitive training before entering the trials.

#### **EFFECTS OF EMPLOYMENT STATUS**

Retirement from work seems to be linked to a reduction in the estimates for perceptual processing speed *C*. Those participants, who were still employed part time or full-time, performed better in the TVA assessment than the retired participants.

This was a surprising finding as we had expected no influence from employment state. A valid argument could be that the retirement group is older than the employed group. However, as can be seen in **Table 3**, retirement status contributed significantly to the variation in perceptual processing speed even when adjusting for age, indicating that those still employed have a higher perceptual processing speed (**Table 1**: *C* = 54.84) than those retired (**Table 1**: *C* = 41.39). There may be several explanations for this difference. It is well known that social engagement stimulates cognitive ability (e.g., Bassuk et al., 1999; Glei et al., 2005). Retirement from work will in many cases result in reduced social interaction and over time perhaps contribute to cognitive decline. Also, mental stimulation has been shown to be a preserver of cognitive functioning in old age (Wilson et al., 2012). The variety and demands of the working environment itself may very well contribute to the stimulation of abilities such as attention and memory. That our sample was strongly biased toward the highly educated may further support this notion, since the cognitive demands imposed by positions held by the highly educated are likely to be greater than those held by people with lower levels of education. Even navigating through traffic to and from work may play a role, particularly for attention. A study by Anguera et al. (2013) demonstrated that there was room for improvement of abilities like cognitive control in 60–85 year olds.

In summary, the findings encourage controlling for employment status when recruiting a homogeneous sample for future cognitive effect studies.

#### **EFFECTS OF EDUCATION**

Level of education seems to have a positive effect on perceptual processing speed. Other studies from France, Mexico, and USA have investigated the relationship between level of education and cognitive preservation on larger sets of the populations (Ardila et al., 2000; Alley et al., 2007; Glymour et al., 2012). The findings were inconclusive at the general cognitive domain of attention. In terms of cognitive assessment and education, a positive relationship has been demonstrated between level of education and cognitive performance (Glymour et al., 2012).

One explanation might be that higher education tends to imply highly developed reading ability. Reading is known to affect visual processing speed (Jackson and McClelland, 1979) and so it may be reading proficiency more than education *per se* that makes a difference. As we did not ask people in detail about their reading habits, this requires further investigation.

#### **EFFECTS OF VIDEO GAMING**

Green and Bavelier (2003) demonstrated that young people playing action video games showed improved visual selective attention. The effect was attainable through training and not a result of some innate superior ability. Since then many studies have demonstrated superior visual ability in video gamers in areas like spatial resolution (Green and Bavelier, 2007), temporal auditory and visual sequencing of external stimuli (Donohue et al., 2010) and task switching (Karle et al., 2010). Further investigations into the specific elements of attention suggest that the improvements are facilitated by faster encoding and decoding to VSTM (Wilms et al., 2013). It therefore seemed prudent to include questions about gaming habits when trying to establish normative estimates for visual perception. In the study, we specifically asked the participating subjects about their video gaming habits, the frequency and duration of playing as well as details about the type of games played, if any.

Forty-two subjects responded positively to playing video games on either iPad or PC. When asked about the type of games played, most subjects answered "Brainteasers" like Sudoku, Tetris, Candy Crush, Angry Birds, and Wordfeud etc. and many supplemented this with online card games like Solitaire, Poker, and Bridge.

Casual video gaming was a significant predictor of *C*, such that the number of hours spent playing video games was positively related to the perceptual processing speed. The results are surprising as many would argue that apps like Candy Crush do not fall into the standard category of action video games.

However, we speculate that in terms of gaming impact, it may not be only the action of the game that places demands on the perceptual system. Many of the relaxation apps have an element of time limitation, which requires the gamer to respond to a challenge within a given time frame. We speculate that this may in fact be the reason why we see an effect of frequent gaming. It certainly raises the question whether other types of modern video games not normally categorized as action games may have a positive impact on cognition.

Since this is not a training study, we have no way of knowing whether the difference in perceptual processing speed between gamers and non-gamers was innate, but previous studies support that gaming influences processing speed, and our measures indicate that gaming needs to be controlled for in studies of intervention.

#### **EFFECTS OF AGE**

It has been the general consensus that perceptual ability declines over time (e.g., Kail and Salthouse, 1994) and that this may be related to a decline in cognitive processing speed (Salthouse, 1996; Finkel et al., 2007). For that reason we had expected to find a general decline on all the perceptual estimates.

The initial correlation analysis (see **Table 2**) seems to indicate a relationship between Age\_Group and *C*. However, when performing multiple regression analyses (**Table 3**), Age\_Group does not contribute significantly to the variance in *C*. This means that within the age range of our sample, age does not influence the *C* value when controlling for the influence of variables like education level, retirement status, and gaming habits. The same is true for *t*<sub>0</sub> and *K*.

This contradicts the findings in previous TVA studies (McAvinue et al., 2012; Habekost et al., 2013), which found a significant effect of age on the capacity of VSTM (*K*), the visual perceptual threshold (*t*<sub>0</sub>), and processing speed (*C*).

In Habekost et al. (2013), a significant decline was found in a slightly older population (aged 70–85, *n* = 33), and McAvinue et al. (2012) found a strong effect of age on processing speed (*C*). We speculate that the difference in findings may be due to the sample sizes in both studies being smaller than in our study. In addition, those studies included older age groups. This would correspond to the findings by Rabbitt et al. (2001) that the decline from age 49–70 is much smaller than from age 70 and upward.

In terms of the result deviating from general observations, we have to point out that our sample represents a highly functional and healthy group of people. General health plays an important role in preservation of cognitive ability (Wilson et al., 2012). Finally, the TVA assessment is unspeeded. Although the trials themselves are temporally limited by the exposure time, the response required from the subject was unspeeded. This may have reduced the cognitive load enough to counter effects of age-related decline normally found in general aging studies (Salthouse, 2000).

#### **EFFECTS OF ALCOHOL**

Many studies have investigated the influence of drinking on cognition in the elderly population (for a review see Peters et al., 2008). The general consensus is that low to moderate consumption may have a positive and protective effect on cognitive functioning as we grow older.

We asked the subjects to estimate how often they drank alcoholic drinks and the amount consumed on each occasion. The data show an increase in alcohol consumption with age but did not demonstrate any significant impact of consumption on the estimates of perceptual processing speed, capacity of VSTM or the visual threshold. This suggests that low to medium consumption of alcohol in our sample did not influence the TVA estimates.

#### **EFFECTS OF PHYSICAL EXERCISE**

Physical exercise is generally considered to benefit mental health in senior citizens (e.g., Clarkson-Smith and Hartley, 1989; Hillman et al., 2008). We asked the subjects to report the frequency and duration of physical training as well as the type of physical training practiced (if any). The data did not support the notion that physical exercise influences the TVA estimates. There was a significant correlation between *C* and exercise (see **Table 2**), but further analyses showed that this was due to a confounder bias arising from a causal relation between Age\_Group and *C* and between Age\_Group and physical exercise. The multiple regression analysis controls for these covarying factors, and the result was no relationship between physical exercise and processing speed (*C*; **Table 3**).

#### **CONCLUDING REMARKS**

We presented the TVA estimates for a sample of 91 healthy subjects aged 60–75 and the influences on these estimates of self-reported physical exercise, alcohol consumption, and video gaming, as well as various demographic categorizations. We did this to provide a set of normative estimates to be used in future studies into effects of training and intervention as well as assessment of visual perception in relation to clinical conditions.

We found a significant effect of retirement status, gaming habits, and education level on the estimates of the perceptual processing speed as it relates to encoding of letters per minute.

Many studies of the capacity of visual attention have been done on younger, healthy student samples, which may not be representative of the population as a whole (e.g., Petersen and Andersen, 2012; Lohmann et al., 2013; Vangkilde et al., 2013; Wilms et al., 2013). It is our hope that the TVA paradigms in time will develop into a standard set of assessment tools for the basic elements of visual attention and that normative data will become available for other samples of the population.

In addition, studies into the effect of cognitive training have been conducted for many years with inconclusive results. The data presented here support the notion that, in order to ascertain the effect of cognitive training, activities of daily living need to be controlled for to avoid confounding the primary measurements. We need to improve and expand the availability of normative data for cognitive abilities susceptible to training, both in healthy and injured populations, through further research.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 03 July 2014; accepted: 18 September 2014; published online: 07 October 2014.*

*Citation: Wilms IL and Nielsen S (2014) Normative perceptual estimates for 91 healthy subjects age 60–75: impact of age, education, employment, physical exercise, alcohol, and video gaming. Front. Psychol. 5:1137. doi: 10.3389/fpsyg.2014.01137*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Wilms and Nielsen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Cognitive aging on latent constructs for visual processing capacity: a novel structural equation modeling framework with causal assumptions based on a theory of visual attention

### *Simon Nielsen\* and Inge L.Wilms*

Brain Rehabilitation Advanced Technology and Learning Laboratory, Department of Psychology, University of Copenhagen, Copenhagen, Denmark

#### *Edited by:*

Brad Wyble, Syracuse University, USA

#### *Reviewed by:*

Brad Wyble, Syracuse University, USA Dietmar Heinke, University of Birmingham, UK Ellis Luise Gootjes-Dreesbach, University of Kent, UK

#### *\*Correspondence:*

Simon Nielsen, Department of Psychology, University of Copenhagen, Øster Farimagsgade 2A, 1353 Copenhagen K, Denmark e-mail: simon.nielsen@psy.ku.dk

We examined the effects of normal aging on visual cognition in a sample of 112 healthy adults aged 60–75. A test battery was designed to capture high-level measures of visual working memory and low-level measures of visuospatial attention and memory. To answer questions of how cognitive aging affects specific aspects of visual processing capacity, we used confirmatory factor analyses in Structural Equation Modeling (SEM; Model 2), informed by functional structures that were modeled with path analyses in SEM (Model 1). The results show that aging effects were selective to measures of visual processing speed compared to visual short-term memory (VSTM) capacity (Model 2). These results are consistent with some studies reporting selective aging effects on processing speed, and inconsistent with other studies reporting aging effects on both processing speed and VSTM capacity. In the discussion we argue that this discrepancy may be mediated by differences in age ranges and demographic variables. The study demonstrates that SEM is a sensitive method to detect cognitive aging effects even within a narrow age range, and a useful approach to structure the relationships between measured variables and the cognitive functional foundation they supposedly represent.

**Keywords: cognitive aging, visual attention, visual short-term memory, structural equation modeling, a theory of visual attention, SEM, TVA**

### **INTRODUCTION**

The increased lifespan in the general population has also increased the risk of cognitive decline. This has emphasized the need for the development of methods to detect, delineate and remedy cognitive decline, which are easy to administer and sensitive to age-related changes. In the current study, we are particularly interested in how age affects different aspects of visual processing capacity, such as the encoding/processing speed into visual short-term memory (VSTM) and the capacity of VSTM. We apply a novel computerized test battery to capture behavioral measures of visuospatial attention and memory in 112 healthy adults between 60 and 75 years of age. The test battery is designed to assess cognition at different levels of functional complexity and specificity, to provide a detailed insight into the cognitive variables affected by age. To analyse the relationship between age and cognition we use structural equation modeling (SEM), which is a powerful approach to model the structure between measured and latent variables. To enforce the integrity of the SEM models in describing actual cognitive constructs, we apply A Theory of Visual Attention (TVA; Bundesen, 1990) to make the causal assumption that the TVA parameters *C* and *K* represent fundamental measures of processing speed (*C*) and VSTM capacity (*K*). This allows us to test whether specific measures affected by age relate mostly to processing speed or VSTM capacity, or conversely whether processing speed or VSTM capacity is mostly affected by age. Two SEM models are presented in the study. Model 1 is a path analysis SEM that tests a hypothetical organization of the measured variables according to functional complexity (level of assessment) and specificity (relative dependency on processing speed vs. VSTM capacity).
Model 2 is a confirmatory factor analysis SEM that examines how age influences latent constructs for processing speed and VSTM capacity, when these are derived from multiple distinct measures informed by Model 1.

#### **BACKGROUND**

#### *Cognitive aging*

Cognitive aging has been related to decline in several higher-order visual working memory (VWM) abilities, such as speed of reading (Connelly et al., 1991; Hartley et al., 1994), mental image manipulation (Berg et al., 1982; Dror and Kosslyn, 1994), and memory recall (Berg et al., 1982; Dror and Kosslyn, 1994; Anderson et al., 1998). Also, there is a general consensus between behavioral and neuroimaging studies that decline in task switching abilities affects VWM performance in old age due to lack of attentional control in the wake of distraction (West, 1999; Clapp et al., 2010; Anguera et al., 2013). This has been proposed to be caused by a general selective attention impairment pertaining to inhibition of task-irrelevant information (Gazzaley et al., 2005; Clapp and Gazzaley, 2012).

In addition, a number of studies have reported age-related decline in more general cognitive mechanisms of processing capacity, such as a reduction in visual short-term memory (VSTM) capacity (Habekost et al., 2012; McAvinue et al., 2012) and in perceptual processing/encoding speed (Salthouse, 1996, 2000; Habekost et al., 2012; McAvinue et al., 2012), which may contribute to the impaired VWM performance in old age (Brown et al., 2012; Franceschini et al., 2012).

The range and diversity of the observed impairments raise the question of whether the decline shares a common ground, being related either to a decline in capacity or to a decline in encoding/decoding into different stages of working memory.

To examine this further, we create two SEM models to determine the dependency between measures and to test a hypothesized hierarchical relation between several behavioral measures that are typically sensitive to age-related changes. We do this using data obtained from a novel test battery that comprises fundamental visuospatial measures (processing speed, perceptual threshold, VSTM capacity) as well as intermediate (delayed recognition, attention span) and compound VWM measures (reading, memory recall, mental image manipulation) according to a proposed hierarchical organization of measures.

#### *Structural equation modeling*

SEM was developed to estimate the direct effect of an independent variable on a dependent variable in the presence of several intra-correlated variables (Wright, 1921). This type of analysis is conceptually analogous to multiple regression models. However, in terms of applications, SEM distinguishes itself notably from these in that the coefficients represent the causal assumptions tested in the model, whereas this is not the case with regression analyses (Myth 2 in Bollen and Pearl, 2013). Another favorable property of SEM is that the method is largely invariant to multicollinearity issues, to which multiple regression models are very sensitive (when independent variables are intra-correlated, individual contributions cannot be distinguished properly). SEM also allows for combined factor analyses to extract co-varying sources of information as latent variables, which we utilize in the current study to derive common factors for processing speed and VSTM capacity. As in the current study, SEM models are typically specified graphically as hypothesized structures in data (a priori models) that are translated by the SEM software (e.g., LISREL, AMOS) into a hypothetical structure (typically) in the covariance matrix. The proposed SEM model is then tested against the actual structure in data via optimization (e.g., minimization) of a likelihood function to generate test statistics (typically maximum likelihood estimators) for statistical evaluation.

One of the strongest and most controversial claims surrounding SEM concerns whether or not causality can be inferred from SEM models. In the current study, we adopt the notion that what SEM does is to provide quantitative causal conclusions and statistical fit measures based on the qualitative causal assumptions (a priori models) and the empirical measures that are fed into SEM. Further, significant model fit statistics (see the Discussions for an elaboration of these) do not prove the causal assumptions, but make them tentatively more plausible (Myth 1 in Bollen and Pearl, 2013). In summary, SEM is useful to test relationships between multiple variables and is especially beneficial in large sample studies with several intra-correlated variables, such as the cognitive aging study presented here and cognitive aging studies in general.

While no previous studies (to our knowledge) have used SEM for the current purpose, related studies have systematically examined the influence of age on visual processing capacity in general. In Verhaeghen and Salthouse (1997), SEM models based on meta-analyses of 91 studies showed that while several cognitive measures shared age-related variance, the strongest effect of age was found on processing speed. In addition, this effect accounted for a large part of the aging effects on more compound/higher-level measures. These findings were recently corroborated in a neuroimaging study using SEM to structure the neural implications of aging effects on processing speed and working memory. Although white matter abnormality was only associated with decline in working memory, decline in processing speed was suggested to significantly impact other cognitive abilities (Charlton et al., 2008; however, see Penke and Deary, 2010 for a critique of their methodological approach).

#### *A theory of visual attention*

To model an assumed hierarchical organization of measures, proper estimates of the fundamental visuospatial functions are required. To this end, we use a whole-report letter paradigm to acquire data, which were subsequently modeled according to A Theory of Visual Attention (Bundesen, 1990) for accurate estimates of visual processing/encoding speed (*C*), the capacity of visual short-term memory (*K*) and the visual perceptual threshold (*t*0). TVA is a formal mathematical theory of the fundamental mechanisms of the visual attention system, and provides a computational framework (Kyllingsbæk, 2006; Dyrholm et al., 2011) implemented as a limited-capacity parallel race model based on principles of biased competition (Desimone and Duncan, 1995). According to TVA, visual representations race in parallel for encoding into VSTM, and both the capacity of VSTM (*K* letters) and the rate at which elements race (*C* letters per second) are limited. The race is biased according to properties of pertinence and relevance to the task, and the probability of winning the race depends directly on these features (e.g., a red letter will have a higher probability of being encoded than a simultaneously presented blue letter if the task is to report red letters), as formulated in the TVA equations.
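The race described above can be illustrated with a small simulation. This is a minimal sketch and not the TVA fitting procedure used in the study: it assumes equal attentional weights (so that each of the `n_items` letters is processed at rate *C*/`n_items`), exponentially distributed encoding times that start at *t*0, and an integer *K*. The function names and the parameter values in the usage note are hypothetical.

```python
import math
import random


def encoding_probability(C, t0, exposure, n_items=6):
    """Probability that one letter finishes processing within the exposure.

    Assumes equal weights (rate C / n_items per letter) and ignores the K limit.
    """
    effective = max(exposure - t0, 0.0)  # no processing before the threshold t0
    return 1.0 - math.exp(-(C / n_items) * effective)


def simulate_whole_report(C, K, t0, exposure, n_items=6, n_trials=10000, seed=0):
    """Mean number of letters encoded into VSTM across simulated trials."""
    rng = random.Random(seed)
    effective = max(exposure - t0, 0.0)
    rate = C / n_items
    total = 0
    for _ in range(n_trials):
        # Each letter races independently; count those finishing in time.
        finished = sum(1 for _ in range(n_items)
                       if rng.expovariate(rate) < effective)
        # VSTM holds at most K letters (K assumed to be an integer here).
        total += min(finished, K)
    return total / n_trials
```

For example, `simulate_whole_report(C=30.0, K=3, t0=0.02, exposure=0.10)` returns the mean score for a 100 ms exposure under these illustrative values; the score is 0 for exposures at or below *t*0 and approaches *K* as the exposure grows.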

TVA modeling has been used in a number of studies, which have justified its empirical relevance. Critical to the purpose of this study, previous findings have established that TVA estimates are sensitive to age, and the encoding/decoding speed to VSTM (processing speed *C*) and VSTM capacity (*K*) decline as we age (Habekost et al., 2012; McAvinue et al., 2012). Similarly, previous studies have shown that TVA estimates are (1) strongly related to commonly accepted neuropsychological measures of matched functions (Finke et al., 2005), and (2) largely unrelated to measures in the Attention Network Task (Posner and Petersen, 1990; Fan et al., 2002), which would be expected (Habekost et al., 2013).

Based on these properties, we employ TVA estimates in the current study and define processing speed by the TVA estimate *C* and VSTM capacity by *K*, and make the causal assumption that the TVA estimates constitute the most fundamental measures in the SEM models.

### **METHODS**

#### **PARTICIPANTS**

A total of 112 healthy adults aged 60–75 (*M* = 67.8, *SD* = 4.0), with different educational backgrounds and employment status, were included in the test. The mean age was 69.0 years for males (*N* = 34, *SD* = 3.9) and 67.3 years for females (*N* = 78, *SD* = 3.9). The subjects were recruited through the Facebook page of the local branch of The DaneAge Association ("Ældresagen"), an interest group for senior citizens in Denmark, and through advertisements in local newspapers and TV shows. The participants received a courtesy gift of chocolate and wine as thanks for participation. After signing a form of consent, potential participants were directed to a website where they were informed about the inclusion criteria. They had to be healthy with no history of brain injury, dementia or diabetes. They would also be excluded if they were currently under medical treatment for a psychiatric disorder or suffered from color blindness. Eyesight (self-reported) had to be normal or corrected to normal.

### **ETHICAL CONSIDERATIONS**

All participants received oral and written information about the project and their tasks prior to the initiation of the trials. They all signed a written consent form and were instructed that they could leave the project at any time without any explanation.

The study was approved by the regional ethical committee (#40118).

### **COGNITIVE TESTS**

Tests were included in the study based on their sensitivity to measures of visual working memory in general, and age-related differences in particular. Furthermore, the test battery was designed with an abstract hierarchical structure of assessment complexity in mind, such that both compound and fundamental measures of visual working memory were included. Top-level measures were included to mimic day-to-day activities and serve as a test of generalization, while lower- and intermediate-level tests were included to assess visuospatial attention and memory from different angles. As an example, a speed-of-reading task was included as a top-level assessment, while a delayed working memory task was included at the intermediate level and a TVA estimate of processing speed at the lower level. Combined, performance in the reading task is likely to depend on individual performance in the intermediate-level task (working memory), which in turn is likely to depend on visual processing speed (TVA estimate *C*) at the lower level (Brown et al., 2012; Franceschini et al., 2012).

**Table 1** lists each of the tests included in the test battery with information about the measures they produce, the cognitive function being measured, and any prior knowledge of sensitivity to age-related changes, including study references. The Memo task is an exception, as it was developed specifically for the current study to provide a generalized measure of a day-to-day memory task. **Table 1** also includes information about the abstraction level of assessment according to the mentioned hierarchy in the tests. In addition to the cognitive tests, a dementia-screening test, the Mini-Mental State Examination (MMSE; Folstein et al., 1975), and a self-developed baseline motor response control test to calculate Fitts parameters (Fitts, 1954) were performed as part of the inclusion criteria for the study.

All cognitive tests were computerized for ease of application, which also provided bias-free scoring of assessment data. The tests are made freely available under the GNU General Public License and can be acquired from the corresponding author.

### **INDIVIDUAL TESTS**

#### *Delayed working memory*

The test was adapted from Clapp et al. (2010) and the stimuli were provided as courtesy of the main author. It was originally used in different versions to assess sensitivity to manipulated distractor interference in aging. In the current study, we employed the interruption condition from the original test in which a distracting stimulus needs to be attended to while remembering a cue stimulus during the delay phase. The test consisted of 66 trials. In 10% of the trials an interrupting stimulus was included, which required an additional (motor) response. These trials were excluded to avoid response biases on the primary task. In addition, the initial 5 trials were practice trials and were also excluded. A total of 55 trials were included for further analysis.

### *Four mountains task*

The test was adapted from Hartley et al. (2007) and computerized with some modifications. In the test, participants were required to encode a detailed target stimulus with a long exposure time and, immediately afterwards, identify the target in a set of 4 test stimuli. However, the target appeared in the test set under manipulated viewing conditions compared to the encoding phase, thus imposing strong demands on working memory functions. A written message on the monitor informed participants that a self-paced "spacebar press" initiated a trial. A landscape cue was presented for 8 s, immediately followed by a 4-sample probe display where the target would always appear but with viewing conditions manipulated. Selection of the target landscape was done using the mouse. The test consisted of 32 trials. The first 5 trials were practice trials where feedback was provided. A total of 27 trials were included for further analysis.

### *Corsispan task*

The test was implemented as a forward Corsispan test (Corsi, 1972) with some modifications, such as a random tile layout across trials. In the test, participants were required to remember sequences of spatial positions that increased in number of elements across the trials. A written message on the monitor informed participants that a self-paced "spacebar" press initiated a trial. On each trial, 10 purple tiles were randomly distributed across the monitor with the constraint that no linear (neither horizontal, diagonal nor vertical) alignments of tiles were formed. During the memory period, individual tiles would light up (turn yellow) for 1 s with a 1 s inter-tile delay to form a sequence. During the memory period, the mouse cursor was removed, and its reintroduction indicated the beginning of the response period. Tiles were selected using the mouse, and a correct response required identification of all previously displayed tiles in the correct sequence.

#### **Table 1 | Overview of the test battery.**


Sequence lengths would span linearly from 2 to 9, with each sequence length being repeated twice. The task terminated when two incorrect reports had been made across all previous trials. Following an incorrect tile selection, the correct sequence of tiles was displayed to participants as feedback. The Corsispan score was the longest sequence that could be correctly reported.
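The stopping and scoring rules described above can be sketched as follows. This is an illustrative reading of the text rather than the code used in the test battery: it assumes that errors are counted cumulatively across trials and that the score is the longest sequence reported correctly before the task terminated. All names are hypothetical.

```python
def trial_schedule(min_len=2, max_len=9, repeats=2):
    """Sequence lengths in presentation order: 2, 2, 3, 3, ..., 9, 9."""
    return [length
            for length in range(min_len, max_len + 1)
            for _ in range(repeats)]


def corsi_span(trial_outcomes, max_errors=2):
    """Score a session given (sequence_length, correct) pairs in order.

    The task stops once `max_errors` incorrect reports have accumulated;
    the score is the longest correctly reported sequence up to that point.
    """
    errors = 0
    span = 0
    for length, correct in trial_outcomes:
        if correct:
            span = max(span, length)
        else:
            errors += 1
            if errors >= max_errors:
                break  # task terminates; later trials are never presented
    return span
```

Under these assumptions, a participant who fails one trial of length 3 and one of length 4 but reports the other length-4 sequence correctly would receive a score of 4.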

#### *Memo task*

The test was a novel memorizing procedure where participants had to locate identical pairs of tiles in a static square-shaped grid. A 6 × 6 grid of square tiles, all with a "?" icon, indicated the beginning of the test. The task was to identify the 18 identical pairs of images of toys hidden behind the 36 tiles. Each tile was "turned" using the mouse, and upon turning the third tile in sequence, the previous two tiles were turned over again and their identity evaluated for a match. If the two tiles were identical, the "?" was replaced with a "√" to indicate a correct pair. The time spent solving the task as well as the number of "misses" were logged at the end and displayed as feedback to the participants.

### *Reading task*

The test was adapted from the standard 9th-grade Danish reading tests. The texts were extracts from the book "Dyret i dit spejl" by Bent Jørgensen, courtesy of the publisher Gyldendals Forlag. In the reading task, participants read approximately one page of highly factual text, and subsequently answered four questions regarding the text to assess comprehension. A written message on the monitor informed participants that a self-paced "spacebar" press would initiate the task. The completion of reading was recorded by a specific button press, which directed the participant to the 4-question, multiple-choice list. Answer selection was made using the mouse, and upon completion of all four answers participants indicated completion of the task by pressing a "finish" button on the monitor.

#### *Whole report task*

The test was adapted from Kyllingsbæk (2006) specifically to adhere to data format requirements for the TVA modeling procedure. Participants were required to identify as many as possible of six briefly displayed letter targets arranged on an imaginary circle, with an unspeeded response (see Wilms and Nielsen, 2014, for a thorough description and illustration of the test). The Whole Report data were used together with the TVA framework to estimate processing speed (*C*), short-term capacity (*K*) and the perceptual threshold (*t*0).

### *Fitts screening task (not included in analyses)*

The test measured visuomotor performance and required participants to click as fast as possible on one of four cued squares on the screen. Halfway through the test, the size of the squares was reduced to measure the difference in performance (Fitts' value).

### *General protocol*

Upon arrival, participants were met and introduced to the place by the test coordinator. Following a brief introduction, participants commenced the first of three test sessions. In the first two test sessions all the cognitive tests were completed, and each of these test sessions lasted approximately 1 h and 15 min including a 5 min break. In the last test session, participants completed the MMSE screening, which lasted approximately 30 min. Each test was initiated with a tutorial of the test to ensure that all subjects were well-informed. All tests but the MMSE were run on Windows PCs at an approximate viewing distance of 65 cm.

In the first two test sessions, participants were tested together in groups of up to seven people in the same experimental room, and in the last session (the MMSE) participants were tested individually in separate rooms with a test assistant. The three test sessions ran in parallel in separate rooms. Between test sessions, participants were guided to the waiting room where refreshments were made available (coffee, tea, fruit, snacks).

Tests were performed under the supervision of psychology students, who were all thoroughly trained in the test procedures, and an experienced MMSE practitioner trained the assistants conducting the MMSE screening (see Acknowledgement Section). All assistants were explicitly instructed to make the participants feel comfortable and to talk to them about any topics that came up, except specifics regarding the tests and analyses. In addition, the corresponding author supervised the progression of the sessions and procedures, and was called upon when needed.

#### **DEMOGRAPHIC DATA**

In addition to age, a range of demographic and lifestyle information was gathered using questionnaires. Although some of this information was previously shown to influence the TVA estimates (Wilms and Nielsen, 2014), we have chosen not to include it here, since the current purpose is to derive information about the structure in data based on SEM models and the regression of age on specific and general levels of cognition.

#### **MODELING**

#### *Structural equation modeling*

SEM was used as the primary statistical method to test the causal assumptions made about the structural relations of the measures. The SEM models were specified as graphical models that were translated into testable structures in the covariance matrices of the measures. The basic components of SEM models are the variables and the logical links connecting them. Variables can either be observed/measured variables or unobserved/latent variables (MacCallum and Austin, 2000). Links can either be unidirectional to imply an assumed causal relation (regression), or bidirectional to imply simple correlation (covariance). In addition, each measured and latent variable may have an associated error/residual variable to account for the (co)variance that cannot be explained by the SEM.
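To illustrate how a graphical model translates into a testable covariance structure, the sketch below applies standard path-tracing rules to a hypothetical three-variable path model with standardized weights: X → M (`a`), M → Y (`b`) and X → Y (`c`). The model, the variable names and the values used in the usage note are assumptions for illustration only and do not correspond to Model 1 or Model 2.

```python
def implied_correlations(a, b, c):
    """Model-implied correlations for X -> M (a), M -> Y (b), X -> Y (c).

    All variables are assumed standardized (unit variance), and residuals
    are assumed to be uncorrelated with each other and with X.
    """
    r_xm = a              # only one path connects X and M
    r_xy = c + a * b      # direct path plus indirect path through M
    r_my = b + a * c      # direct path plus the path that traces back through X
    # Squared multiple correlation of Y: sum of weight * implied correlation.
    smc_y = b * r_my + c * r_xy
    return {
        "r_xm": r_xm,
        "r_xy": r_xy,
        "r_my": r_my,
        "smc_y": smc_y,
        "residual_var_y": 1.0 - smc_y,
    }
```

With `a = 0.5`, `b = 0.4` and `c = 0.2`, the model implies a correlation of 0.4 between X and Y and explains 28% of the variance in Y. Fitting a SEM amounts to finding the weights whose implied structure comes closest to the observed covariance matrix.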

#### *Types of SEM and model specification*

We use two types of SEM models. Model 1 is a path analysis model where only measured variables are included, whereas Model 2 is a factor analysis model in which latent variables are derived as common factors between the measured variables (MacCallum and Austin, 2000). Model specification of both SEM models presented in the current study follows the *model generation* approach, in which an initial a priori model is usually adapted to the measures in the dataset (Jöreskog and Sörbom, 1996). The consequences of model adaptation to data are discussed in the Discussions Section.

#### *SEM statistics*

Standardized regression weights (*β*) indicate the strength of the (linear) relation (unidirectional links) and imply the direct relation between changes in the connected variables. For instance, A→B, *β* = −0.3 means that a change in A causes a change in B in the opposite direction, with a magnitude equal to *β*. Thus, if A increases by 1 standard deviation, then B will decrease by 0.3 standard deviations. The strength of correlations between variables is estimated by the corresponding covariance between measures. Thus, A↔B = −0.3 means that the co-variation between A and B is −0.3.

Both of these statistics are shown in the graphical SEM models next to the links to which they apply.
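The interpretation of a standardized regression weight can be made concrete with a small sketch for the single-predictor case, where the standardized weight equals the raw slope rescaled by the two standard deviations. The function name and the data in the usage note are hypothetical.

```python
import statistics


def standardized_weight(a, b):
    """Standardized weight for the simple regression A -> B.

    Computes the raw least-squares slope and rescales it so that it is
    expressed in standard deviations of B per standard deviation of A.
    """
    mean_a = statistics.fmean(a)
    mean_b = statistics.fmean(b)
    sd_a = statistics.stdev(a)
    sd_b = statistics.stdev(b)
    cov_ab = sum((x - mean_a) * (y - mean_b)
                 for x, y in zip(a, b)) / (len(a) - 1)
    raw_slope = cov_ab / sd_a ** 2
    return raw_slope * sd_a / sd_b
```

Because of the rescaling, the result does not depend on the measurement units: `standardized_weight([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])` and the same call with the second list multiplied by 100 both give −1.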

Regression weights (*β*) are statistically evaluated by simple *t*-tests based on the critical ratio (CR), which is obtained by dividing a *β*-value by its associated standard error (SE). If the distributional assumptions of normality are met, CR has a standard normal distribution under the null hypothesis that the estimate has a population value of 0. Squared multiple correlation (SMC; *R*<sup>2</sup>) statistics describe the proportion of variance in a variable that is explained by the variables connected to it. Thus, SMC = 0.3 means that 30% of the variance was explained, which can also be used to interpret the residual variance that could not be accounted for by the model.
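The critical ratio test can be sketched in a few lines. This is a minimal illustration that assumes the standard normal reference distribution mentioned above and a two-tailed test; the function name and the input values in the usage note are hypothetical and not taken from the study.

```python
import math


def critical_ratio_test(estimate, standard_error):
    """Return the critical ratio and its two-tailed p-value.

    Assumes CR follows a standard normal distribution under the null
    hypothesis that the population value of the estimate is 0.
    """
    cr = estimate / standard_error
    # Two-tailed tail area of the standard normal: 2 * (1 - Phi(|cr|)).
    p_value = math.erfc(abs(cr) / math.sqrt(2.0))
    return cr, p_value
```

For instance, `critical_ratio_test(-0.30, 0.10)` gives a CR of −3 and a p-value below 0.01, whereas a CR close to ±1.96 corresponds to a p-value close to 0.05.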

#### *SEM model fit indexes*

Several fit indexes are available to assess SEM model fit (Bollen and Long, 1993). Model fit indexes are distinguished on whether they are absolute or relative according to an often-used taxonomy (McDonald and Ho, 2002). Absolute fit indexes compare the proposed structure (the SEM model) to the actual one in data (here the covariance matrix) by minimization of a likelihood function to produce maximum likelihood (ML) estimators (Browne, 1984). Relative fit indexes are based on comparison of the ML estimators of the fitted model and a null model with uncorrelated variables (McDonald and Ho, 2002).

In the current study, we report two popular absolute fit indexes to evaluate how well the SEM models fit the data. The chi-square test of the ML estimators assesses the probability that a more complex model fits the data better (McDonald and Ho, 2002). It is based on central distributional assumptions and on the null hypothesis that there is no difference between the proposed structure in the covariance matrix and the actual one. Thus, non-significant chi-square statistics suggest that the data fit the proposed SEM model well, which favors acceptance of the model. We also report the Root Mean Square Error of Approximation (RMSEA; Steiger and Lind, 1980; Browne and Cudeck, 1992), which is currently the most popular model fit index (Kenny et al., 2014). RMSEA is based on the non-centrality parameter (here the chi-square statistic minus the degrees of freedom) and on the assumption that the null hypothesis is false. Thus, the significance level of a statistical RMSEA evaluation implies how well the model fits the data. While criteria in the range [0.01–0.08] have been proposed to indicate excellent to mediocre fits (MacCallum et al., 1996), recent reports criticize the use of cut-off criteria on the basis of a lack of empirical support for them (Chen et al., 2008). However, general praise for RMSEA is mediated by the availability of confidence intervals, which relaxes assumptions about cut-off criteria (Hu and Bentler, 1998; MacCallum and Austin, 2000).
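The RMSEA point estimate can be computed directly from the chi-square statistic, the degrees of freedom and the sample size. The sketch below uses the commonly cited single-group formula with *N* − 1 in the denominator; whether a given SEM package uses exactly this variant is an assumption, and the numbers in the usage note are illustrative rather than results from the study.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA for a single-group model.

    The estimated non-centrality (chi_square - df) is truncated at 0,
    so models with chi_square <= df receive an RMSEA of 0.
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    noncentrality = max(chi_square - df, 0.0)
    return math.sqrt(noncentrality / (df * (n - 1)))
```

For example, with a hypothetical chi-square of 24 on 12 degrees of freedom and *N* = 112, `rmsea(24.0, 12, 112)` is approximately 0.095, whereas any chi-square at or below the degrees of freedom yields 0.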

### *TVA*

The whole-report data were fitted with the TVA framework (Bundesen, 1990) using MATLAB and the LibTVA toolbox (Dyrholm et al., 2011) to extract the *t*<sub>0</sub>, *C* and *K* parameters. The LibTVA toolbox and user guide can be downloaded freely from http://zappa.psy.ku.dk/libtva.

#### **DATA ANALYSIS**

Raw response data were pre-processed in Python according to the principles described below. Statistical analyses were performed using IBM SPSS V.20 and IBM SPSS Amos V22.

#### **OUTLIER DETECTION**

We introduce a formal automated pipeline for detecting outliers to eliminate the tedious manual work of eyeballing data and to reduce the potential for bias. Outliers are detected at the trial level and at the participant level. At both levels, detection is based on the Median Absolute Deviation method (MAD; see Leys et al., 2013 for a recent, relevant application). The MAD is the median of the absolute deviations of the data points from the median. Specific data points are detected as outliers based on a ±2.5 MAD filter. The advantage of using the median is that the central point of evaluation is less influenced by outliers than when the mean is used, which is true both when computing the MAD and when detecting outliers based on the MAD. The outlier pipeline also includes a modified square root transformation (Cousineau and Chartier, 2010) to impose normality on response time (RT) measures that are inherently skewed. This procedure is applied at the trial level only.
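The ±2.5 MAD filter can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the authors' pipeline: the function name `mad_outliers` and the example values are hypothetical, the consistency constant 1.4826 follows the recommendation in Leys et al. (2013) and may differ from what was actually used, and the square root transformation applied to RTs is not shown.

```python
from statistics import median


def mad_outliers(values, threshold=2.5, scale=1.4826):
    """Flag values lying more than `threshold` scaled MADs from the median.

    Returns a list of booleans aligned with `values` (True = outlier).
    The scale constant is an assumption; it makes the MAD comparable to
    a standard deviation under normality.
    """
    values = list(values)
    center = median(values)
    mad = scale * median(abs(v - center) for v in values)
    if mad == 0:
        # More than half of the values equal the median; the filter
        # cannot be scaled, so nothing is flagged in this sketch.
        return [False] * len(values)
    return [abs(v - center) > threshold * mad for v in values]


# Hypothetical response times; the last value is clearly deviant.
rts = [1, 2, 3, 4, 100]
print(mad_outliers(rts))
```

The median and the MAD are barely moved by the deviant value itself, which is why this filter flags it, whereas a filter based on the mean and standard deviation of the same five values would not.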

The outlier pipeline replaces formal guessing-rate criteria that can otherwise be used to control for random responses. For instance, the DWM task has a guessing rate of 0.25, and participants who fall near this threshold should be excluded on account of random responses. In the current dataset, the outlier pipeline detected all of the cases where responses were near the guessing threshold by means of the MAD filter mentioned above. This approach thus resolved the issue of formally evaluating when a data point is statistically different from guessing.

Outlier detection on trial-level RTs was performed for both correct and incorrect response trials, and the subject-level RT scores were computed as the average of all the trials surviving this procedure. This approach differs from the more traditional one used in some studies, where the average RT scores are computed from correct response trials only (e.g., Clapp et al., 2010). The rationale behind using correct response trials only is that incorrect response trials may confound the average RT score by means of inattentiveness to the task. While this argument is valid when response accuracies are near ceiling, it is less so when task difficulty directly influences the response accuracy, such as in the Four Mountains and DWM tasks used here.

Outlier detection at the subject level was performed on each individual test measure, and identification of an outlier on one measure within a test automatically excluded all measures within that test (pair-wise exclusion). This conditional criterion prevents confounds from random and flawed responses. For instance, if participants misunderstood the DWM task instructions and, due to this miscomprehension, entered incorrect responses, it would yield a valid but random RT measure and an invalid accuracy measure (where valid refers to whether the measure was detected by the outlier pipeline). In such cases both DWM measures would be excluded.

#### *Missing values exclusion*

Missing values were handled in SPSS by pairwise deletion, which, contrary to list-wise deletion, preserves the valid variables for a participant. AMOS, on the other hand, is able to compute ML estimators even when values are missing, based on principles by Anderson (1957).
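The difference between the two deletion schemes can be illustrated with a small sketch. The variable names are borrowed from the measures in this study, but the scores are invented and the helper functions `listwise_rows` and `pairwise_rows` are hypothetical; the snippet only counts which participants would enter each bivariate analysis and does not reproduce the SPSS procedure.

```python
from itertools import combinations

# Invented scores for five participants; None marks a missing value.
scores = {
    "TVA_K": [3.1, 3.8, None, 4.0, 3.5],
    "Corsi_Span": [5.0, 6.0, 5.5, None, 6.5],
    "FM_Acc": [0.70, 0.80, 0.60, 0.90, 0.75],
}


def listwise_rows(data):
    """Participants with valid values on every variable."""
    n = len(next(iter(data.values())))
    return [i for i in range(n)
            if all(column[i] is not None for column in data.values())]


def pairwise_rows(data, a, b):
    """Participants with valid values on both variables a and b."""
    return [i for i in range(len(data[a]))
            if data[a][i] is not None and data[b][i] is not None]


n_listwise = len(listwise_rows(scores))
for a, b in combinations(scores, 2):
    n_pairwise = len(pairwise_rows(scores, a, b))
    print(f"{a} x {b}: pairwise N = {n_pairwise}, listwise N = {n_listwise}")
```

Under pairwise deletion each correlation uses every participant who has both measures, so the effective *N* can differ from cell to cell, whereas list-wise deletion applies the smallest common subset to all cells.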

### **RESULTS**

First, normative data is presented to provide an overview of the distributional properties of the cognitive measures in the dataset. Secondly, a set of bivariate correlations is presented to provide an overview of the correlation within the dataset, and to provide a means to test alternative SEM models for the dataset according to the principles described by Holm (1979; see Penke and Deary, 2010 for a more recent application). Finally, the SEM models are presented, which constitute the main results of the study from which we draw inferences and make conclusions.

#### **TEST SCORES AND CORRELATIONAL STATISTICS**

**Table 2** presents the descriptive test score statistics, and **Table 3** presents the correlational statistics (Pearson's *r*). The purpose of the data tables is to give an overview of the dataset, how it is correlated, and how baseline visuomotor differences, as measured by response time on the Fitts test, influence the measures.
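The significance flags in the correlation table are Bonferroni corrected for multiple comparisons. As a minimal sketch of what such a correction does, the snippet below applies the Bonferroni rule to a set of invented *p*-values and, for comparison only, the step-down procedure of Holm (1979). The function names and *p*-values are hypothetical, and nothing here implies that the step-down variant was the procedure applied to the table.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm's step-down procedure.

    P-values are visited from smallest to largest; the k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k), and testing stops
    at the first non-rejection.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break
    return reject


# Invented p-values for four correlations.
p_values = [0.001, 0.02, 0.04, 0.015]
print(bonferroni(p_values))
print(holm(p_values))
```

Both procedures control the family-wise error rate; the step-down version is never more conservative than the plain Bonferroni rule.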

The missing values in **Table 2** (the deviation from *N* = 112) reflect the filtering by the above-mentioned outlier detection pipeline. Outlier detection excluded an average of 6% of trials (*M* = 3.6; *SD* = 2.9) for the DWM\_RT measure, and 3% of the trials (*M* = 0.86; *SD* = 1.2) for the FM\_Time measure. Outlier detection at the subject level excluded TVA measures for 14 participants, Memo measures for 13 participants, Reading measures for 6 participants, and Four Mountains measures for 3 participants, while no outliers were detected in the Corsispan and DWM tasks. TVA measures were also excluded if the relative weight ratio between the left and the right side of the monitor fell outside the range 0.3–0.7, implying a strong bias to either side. This laterality bias could limit the estimate of *K* if participants had focused only on a small subset of the letters (e.g., Duncan et al., 1999). A total of 6 participants (part of the already mentioned 14 participants) were excluded on this criterion, and the average TVA *K* estimate for these participants (*M* = 2.46, *SD* = 0.04), compared with the average for all the included participants (*M* = 3.58, *SD* = 0.67), supports the argument that there was limited information to properly estimate *K*.

**Table 2 | Descriptive statistics for the test scores.**

\*Correlation is significant at the 0.05 level (2-tailed) Bonferroni corrected for multiple comparisons.

\*\*Correlation is significant at the 0.01 level (2-tailed) Bonferroni corrected for multiple comparisons.

No effect of age was evident on any of the measures at the individual test level. That age did not influence the TVA estimates for processing speed (*C*) and for VSTM capacity (*K*) is not consistent with other similar studies (e.g., McAvinue et al., 2012), but is consistent with what we previously found when adjusting for demographic and lifestyle factors (Wilms and Nielsen, 2014).

In **Table 3**, Fitts\_RT correlations assess the influence of baseline visuomotor differences on the cognitive tasks. Inspection of the table indicates that the cognitive measures were independent of the Fitts scores. Thus, the Fitts\_RT measure is omitted from further analyses.

The accuracy of response to questions in the reading test was excluded due to an extreme asymmetrical distribution (kurtosis = 3.45) and a sparse continuity (100% divided in 4 levels), which prevents efficient transformation of the measures (Cousineau and Chartier, 2010). The number of misses in the Memo task was excluded due to its strong dependency on the completion time of the Memo task, which is the primary measure of the task.

#### **STRUCTURAL EQUATION MODELS**

In the SEM models, standardized regression weights and covariance estimates are presented on the corresponding links, and the squared multiple correlation (*R*<sup>2</sup>) estimates on the variables. Unless otherwise specified, the only constraints imposed on the models are those of the residual regression weights (initially set to 1), which is more a convention than a constraint. Model fit statistics for the chi-square and the RMSEA indexes are presented in the models in a standard format (note that LOW and UPPER in the RMSEA statistics correspond to the lower and upper bounds of the 90% confidence interval of the RMSEA index). Model-fit statistics for both SEM models encourage acceptance of the fit of the models to the dataset. In addition, unless otherwise noted, all regression statistics are significant at least at the 0.05 level.

#### *Model 1: Hierarchical and functional dependency structures*

Model 1 is illustrated in **Figure 1**. The purpose of this model is to test an a priori model of how the measures in the dataset are structured in terms of level of assessment complexity and dependency on the two cognitive functions in question. In Model 1, branches originate from the fundamental estimates of TVA, pass through intermediate levels, and terminate at the three top-level VWM measures: completion time in the Reading task (Read\_Time), completion time on the Memo task (Memo\_Time), and response accuracy in the Four Mountains task (FM\_Acc). The leftmost structure in the model (with its apex at Read\_Time) comprises measures most strongly related to processing speed [TVA\_C, DWM\_RT, FM\_Time, Read\_Time], the rightmost one comprises measures of VSTM capacity [TVA\_K, Corsi\_Span, DWM\_Acc, FM\_Acc], while the middle structure indicates measures that depend more equally on both cognitive functions [TVA\_C, DWM\_RT, TVA\_K, TVA\_t0, Corsi\_Span, Memo\_Time]. All but the TVA\_C→DWM\_RT (*p* = 0.07) regression weights were significant. The average significance level for all regression coefficients was *M* = 0.02 (*SD* = 0.02). Covariance statistics suggest a strong correlation between TVA\_K↔TVA\_C (*p* < 0.01) and a modest correlation between TVA\_K↔TVA\_t0 (*p* < 0.05), which we have reported in a previous article (Wilms and Nielsen, 2014), and which are common to TVA estimates in general (e.g., McAvinue et al., 2012). The residual variances for the high-level measures were uncorrelated, although e2↔e3 was marginally significant (*p* = 0.08). In summary, Model 1 shows that measures can be structured according to their level of assessment complexity, which is consistent with previous findings (Brown et al., 2012; Franceschini et al., 2012). Furthermore, the functional organization of measures indicates two major structures of processing speed and VSTM capacity.

#### *Model 2: Aging effects on processing speed and VSTM capacity*

Model 2 is illustrated in **Figure 2**. The purpose of the model is to test whether more general estimates of processing speed and VSTM capacity can be derived from multiple measures and how aging affects these. Model specification was informed by the functional structures in Model 1 to extract common factors based on the subset of data that were most strongly related to either processing speed [Speed→(Read\_Time, DWM\_RT, TVA\_C, Memo\_Time)] or VSTM capacity [Capacity→(Corsi\_Span, TVA\_K, FM\_Acc)]. The factors were modeled as exogenous latent variables that on the one hand influence the dependent measured variables and on the other hand are influenced by Age and their respective residual variables (e8, e9). To enforce the integrity of the factors in representing their corresponding cognitive functions, the TVA measures *C* and *K* were assumed to define each function by constraining the model with the factor loadings TVA\_C→Speed and TVA\_K→Capacity explicitly set to 1. To model the commonly found correlation between the TVA estimates *C* and *K* (see Model 1 and description), the covariance was estimated between the residuals of those variables (e3↔e6). The average significance level for all factor loadings originating from Speed and Capacity was *M* = 0.02 (*SD* = 0.01). Covariance estimates for the residuals pertaining to Speed and Capacity suggest that independent factors were extracted to represent these cognitive functions (*p* = 0.12). Furthermore, the TVA\_K and TVA\_C variables were correlated, as indicated by significant covariance estimates for their respective residuals (e3↔e6, *p* = 0.01).

Age significantly affected Speed (*p* = 0.01), while Capacity was not significantly affected (*p* = 0.14). These findings are consistent with some findings suggesting that processing speed and not capacity is influenced by age (e.g., Brown et al., 2012) but inconsistent with other findings suggesting that both are affected by age (e.g., McAvinue et al., 2012).

In summary, Model 2 shows that common factors can be derived to estimate more general representations of processing speed and VSTM capacity, and while age did not influence the TVA estimates for processing speed (see **Table 3**) as it has been reported elsewhere, we found a more general effect of age on processing speed components (Salthouse, 1996, 2000).

### **DISCUSSIONS**

#### **SUMMARY**

In the current study, we examined the influence of age on behavioral measures of visuospatial attention and memory in a sample of 112 healthy adults aged 60–75. Model 1 provides evidence for a hierarchical dependency structure between measures according to functional complexity, such that higher-level visual working memory (VWM) performance could be predicted by several performance measures of lower-level visuospatial functions. Furthermore, measures were grouped in distinct structures according to functional specificity. Measures relating to processing speed were grouped in a left side structure and measures relating to VSTM capacity in a right side one, while a central structure was indicative of measures depending more equally on both functions. In Model 2, the effect of age was assessed on common factors for processing speed (Speed) and VSTM capacity (Capacity) that were derived from several distinct measures based on Model 1. In summary, our results suggest that for this sample, processing speed more than VSTM capacity is affected by age.

### **ON THE INFLUENCE OF AGE**

The main finding on the effects of aging is that processing speed more than VSTM capacity is affected by age. Moreover, in previous regression analyses on the same dataset we were not able to detect the aging effects on the TVA estimates (Wilms and Nielsen, 2014; see also **Table 3**) that have previously been reported (Habekost et al., 2012; McAvinue et al., 2012; Espeseth et al., 2014). However, here we were able to detect a significant age effect on the latent construct for processing speed (Speed; Model 2) when the TVA estimate for processing speed (*C*) was assumed to depend on Speed. In effect, Model 2 showed that although the TVA *C* parameter was not influenced by age at the individual test level, the processes represented by TVA *C* might in fact be. In conclusion, the SEM approach presented here provides a sensitive method for examining cognitive aging effects, even within a relatively narrow age range.

The finding of selective effects of age on processing speed, compared to VSTM capacity, is consistent with some studies (Salthouse, 2000; Hedden and Gabrieli, 2004; Brown et al., 2012) and inconsistent with others reporting aging effects on both TVA measures for processing speed (*C*) and VSTM capacity (*K*; Habekost et al., 2012; McAvinue et al., 2012; Espeseth et al., 2014). There are several plausible explanations for this inconsistency. In the previous TVA studies, participants belonged to older age groups and broader age ranges than in our study, which may have led to a larger age-related variance in these studies compared to ours. Similarly, differences in demographic and lifestyle variables between the samples may alter the onset age of decline in addition to causing different trajectories for the age effects. A general review by Hedden and Gabrieli (2004) points to the fact that the onset age of decline is highly individual and dependent on many factors, such as education level and social engagement level. Even cognitive engagement has been found to have a substantial influence (Wilson et al., 2012). However, one of the main findings in relation to cognitive memory decline is a remarkable sparing of semantic and short-term memory with age, contrasted by a decline in autobiographical, emotional and implicit memory (Hedden and Gabrieli, 2004). This would support the finding in this study, as well as in our previous study (Wilms and Nielsen, 2014), that the capacity estimates of VSTM are more robust to age-related changes in healthy adults, at least within this narrow age range. Processing speed, on the other hand, has for a long time been considered to decline at a general level from 40 years and onwards (Salthouse, 2000).

#### **ON THE CORRELATION BETWEEN THE TVA C AND K ESTIMATES**

A typical finding in the TVA literature is that the estimates for processing speed, *C*, and VSTM capacity, *K*, are correlated (e.g., Finke et al., 2005; Habekost et al., 2013). We also reported this finding in Wilms and Nielsen (2014); here, however, we were able to model the correlation as the covariance between the residuals for the TVA estimates (Model 2) and to extract independent common factors for processing speed and VSTM capacity on the same dataset. The correlation between TVA estimates does not pose a problem for the theoretical validity of TVA, since there are several plausible explanations as to why individuals with a relatively high VSTM capacity tend to have a comparably high processing speed and vice versa (e.g., common underlying resources). Nevertheless, we find it encouraging to be able to model the correlation at the measurement level and to examine the constructs independently, whatever the source of the correlation may be.

### **ON THE INTERPRETATION AND EVALUATION OF THE SEM MODELS**

A number of limitations arise when making inferences based on individual SEM models/datasets, which relate to generalizability of findings and thus the conclusions that can be drawn from them. Interpretations are frequently criticized, and cautions raised in the SEM literature that one can only strive to propose an adequate model, which fits reasonably well to data. Even when the model fits the data extremely well, this does not strengthen generalizability but merely indicates that the model is plausible (MacCallum and Austin, 2000). To further strengthen the plausibility of *a* SEM model, it should be critically evaluated against alternative SEM models (model comparison), and preferably on different datasets (Bollen and Pearl, 2013).

Evaluation of SEM models is based on model fit indexes, and while a large number of these have been proposed (e.g., Bollen and Long, 1993), there is little consistent advice about which index best assesses a given type of model (MacCallum and Austin, 2000). This lack of a standard may have caused a trend toward merely reporting a large number of these indexes "apparently because we don't know how to use any of them" (McDonald and Ho, 2002). A recent discussion on the topic in a supplementary SEM module to the American Psychological Association's (APA) Journal Article Reporting Standards (JARS; Hoyle and Isherwood, 2013) also reflects this inconsistency. Despite the authority of this institution in conceiving clear recommendations, there are no direct suggestions, and it is merely emphasized that authors should critically review the core literature. To align our approach as closely as possible with this guidance, we reviewed the articles discussed there along with the literature cited in them. Ultimately, we based our approach on the highly influential reviews of MacCallum and Austin (2000) and Hu and Bentler (1998), on specific advice for the psychological literature (McDonald and Ho, 2002), and on sources that generally concern best practice in advanced SEM studies (Mueller and Hancock, 2008; Bollen and Pearl, 2013).

The two model fit indexes reported here (see Methods section for description) are sensitive to sample size to an extent that requires substantial consideration when interpreting them (which goes for goodness-of-fit indexes in general). A relatively large sample size will cause even trivial discrepancies to be statistically significant for the chi-square test of the ML estimate (Hox and Bechger, 1998). Furthermore, since the RMSEA is a non-centrality estimate based on the chi-square test of the ML estimate, by implication the same argument goes for the RMSEA, although with the opposite effect of sample size. Thus, the two fit indexes may be influenced equally by sample size and degrees of freedom (due to their mathematical equality) but with opposite signs in terms of model rejection (due to their opposing statistical characteristics). More importantly, in these boundary conditions, the integrity of these fit indexes is controversial, as they convey little useful information. While there is no specific definition of what constitutes *small* sample sizes or degrees of freedom, it is the authors' interpretation, based on MacCallum and Austin (2000), that fewer than 100 samples might be critical (however, see Barrett, 2007 for an even more critical approach suggesting 200 as a limit). This proposal is quantitatively supported by a recent simulation approach by Kenny et al. (2014) suggesting that a sample size of 50 or below is critical, especially for small degrees of freedom (approximately < 10), whereas a sample size of 100 in combination with 10 or more degrees of freedom is sufficient. In accordance with the above, we suggest that data conditions were proper in the current study and that the SEM models proposed do fit the data well.

The design of the SEM models presented here can be categorized as belonging to the *model generation* strategy (Jöreskog and Sörbom, 1996), in which an initial a priori model is adapted to the measures in the dataset. In a thorough review of 500 publications in 16 psychology journals between 1993 and 1997, MacCallum and Austin (2000) found that this strategy was used in approximately 25% of the studies reviewed, and argued that this number is unfortunately high. According to the authors, the problem with model generation strategies lies in their data-driven approach, which, in combination with the mentioned issues of specificity to the dataset, further challenges the generalizability of SEM models. The proposed optimal strategy for evaluating SEM models is the *alternative models* strategy (Jöreskog and Sörbom, 1996), where several a priori models are tested against each other and conclusions are based strictly on which model was the best predictor of the dataset, more than which model was the most correct one. To acknowledge this procedure and best practice, we relax our assumptions of generalizability accordingly and have reported the changes made from the original a priori models. Modifications were modest for Model 1 and included only the reorganization of 10–15% of links and the introduction of additional covariance links due to a lack of initial technical understanding (more than a conceptual reorganization of the model). For Model 2, the modifications required removal of an unobserved variable for attentional control and reorganization of 20–30% of links over 2–3 iterations, since there was not enough power in the dataset to derive a robust factor relating to attentional control.
Although several of the tests in our study impose strong demands on attentional control (e.g., Corsispan task) we did not intend to assess attentional control explicitly and as such did not include distinct measures of these functions, which is a likely cause as to why no construct could be derived for attentional control (however, see Salthouse et al., 2003 for an alternative explanation).

### **CONCLUSIONS**

We studied aging effects on behavioral measures of visuospatial attention and memory in 112 healthy adults between 60 and 75 years of age. Using Structural Equation Modeling (SEM), we were able to model the hierarchical dependency structures between higher- and lower-level measures according to functional complexity, and distinct structures that grouped measures according to functional specificity (Model 1). Furthermore, based on distinct measures of either function (informed by Model 1), we were able to derive independent latent constructs for processing speed and visual short-term memory (VSTM) capacity, and to examine the effect of age on these (Model 2). The main finding is that processing speed, compared to VSTM capacity, is most strongly influenced by age in this sample. The current study also demonstrated that the proposed SEM framework is a sensitive approach for detecting even subtle cognitive changes within a narrow age range.

### **ACKNOWLEDGMENTS**

This study was funded by a grant from the Danish "Fornyelsesfond" as well as a grant from the Danish Research Council for Culture and Communication. In addition, we wish to thank a number of colleagues for their help with this study: Wesley Clapp for the stimulus material for the Delayed Working Memory task, Randi Starrfelt for advice and directions on the Corsispan and Four Mountains tasks, Signe Vangkilde and Anders Petersen for advice and discussions of TVA specifics in relation to modeling and interpretation, and Iris Wiegand for training the test assistants in conducting the MMSE test.

### **REFERENCES**


Wilms, I., and Nielsen, S. (2014). Normative perceptual estimates for 91 healthy subjects age 60–75. *Front. Psychol.* 5:1137. doi: 10.3389/fpsyg.2014.01137

Wilson, R. S., Segawa, E., Boyle, P. A., and Bennett, D. A. (2012). Influence of late-life cognitive activity on cognitive health. *Neurology* 78, 1123–1129. doi: 10.1212/WNL.0b013e31824f8c03

Wright, S. (1921). Correlation and causation. *J. Agric. Res.* 20, 557–585.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 24 September 2014; accepted: 30 December 2014; published online: 15 January 2015.*

*Citation: Nielsen S and Wilms IL (2015) Cognitive aging on latent constructs for visual processing capacity: a novel structural equation modeling framework with causal assumptions based on a theory of visual attention. Front. Psychol. 5:1596. doi: 10.3389/fpsyg.2014.01596*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2015 Nielsen and Wilms. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Acute exercise and aerobic fitness influence selective attention during visual search

### *Tom Bullock<sup>1,2</sup>\* and Barry Giesbrecht<sup>1,2</sup>*

<sup>1</sup> UCSB Attention Lab, Department of Psychological and Brain Sciences, University of California, Santa Barbara, CA, USA <sup>2</sup> Institute for Collaborative Biotechnologies, University of California, Santa Barbara, CA, USA

#### *Edited by:*

Claus Bundesen, University of Copenhagen, Denmark

#### *Reviewed by:*

Ulrich Ansorge, University of Vienna, Austria

Christian Gaden Jensen, Copenhagen University Hospital, Denmark

#### *\*Correspondence:*

Tom Bullock, UCSB Attention Lab, Department of Psychological and Brain Sciences, University of California, Santa Barbara, CA 93106, USA e-mail: twbullock@googlemail.com

Successful goal-directed behavior relies on a human attention system that is flexible and able to adapt to different conditions of physiological stress. However, the effects of physical activity on multiple aspects of selective attention, and whether these effects are mediated by aerobic capacity, remain unclear. The aim of the present study was to investigate the effects of a prolonged bout of physical activity on visual search performance and perceptual distraction. Two groups of participants completed a hybrid visual search flanker/response competition task in an initial baseline session and then at 17-min intervals over a 2 h 16 min test period. Participants assigned to the exercise group engaged in steady-state aerobic exercise between completing blocks of the visual task, whereas participants assigned to the control group rested in between blocks. The key result was a correlation between individual differences in aerobic capacity and visual search performance, such that those individuals who were more fit performed the search task more quickly. Critically, this relationship only emerged in the exercise group after the physical activity had begun. The relationship was not present in either group at baseline and never emerged in the control group during the test period, suggesting that under these task demands, aerobic capacity may be an important determinant of visual search performance under physical stress. The results enhance current understanding about the relationship between exercise and cognition, and also inform current models of selective attention.

**Keywords: attention, visual search, distraction, physical activity, exercise, fatigue, aerobic fitness, VO2max**

### **INTRODUCTION**

One essential feature of the human attention system is the ability to selectively process goal-relevant visual information while ignoring goal-irrelevant information. Coherent behavior in our complex environment requires a flexible selective attention system that can not only adapt to changes in perceptual and cognitive task demands, but also to additional challenges caused by physical fatigue and stress. Current models of selective attention, such as perceptual load theory (Lavie, 1995; Lavie et al., 2004) and the theory of visual attention (TVA; Bundesen, 1990; Bundesen et al., 2005) have focused on the flexibility of selective attention in response to changes in task demands and goals in tightly controlled laboratory studies carried out under conditions of low stress. What is largely unclear from this work, however, is the extent to which physical fatigue modulates selective attention. The aim of the present work is to gain a more comprehensive understanding of the effects of long bouts of physical activity on the functioning of the attention system during visual search.

Brief, acute bouts of physical activity can influence performance in a range of cognitive tasks (for meta-analytical reviews see Lambourne and Tomporowski, 2010; Chang et al., 2012). Several physiological mechanisms are thought to contribute to the effects of exercise-induced arousal on cognition, including changes in heart rate (e.g., Hillman et al., 2003; Davranche et al., 2005) and levels of brain-derived neurotrophic factor (BDNF; Ferris et al., 2007). A number of studies have investigated exercise effects on specific aspects of selective attention and cognitive control, but the findings have been inconsistent. For example, relative to visual search performed at rest, the time to detect a target object amongst distractors during visual search is faster during an acute bout of exercise at 100% of maximum aerobic capacity (McMorris and Graydon, 1997) and after a 10 min bout of cycling at up to 85% of maximum aerobic capacity (Aks, 1998). In contrast, Bard and Fleury (1978) reported no difference in visual search target detection speed in participants tested before and after a maximal VO2max test. Similarly, in flanker response competition tasks (e.g., Eriksen and Eriksen, 1974) there is also evidence of a response time benefit (i.e., speeding up) during a 20 min bout of exercise at 50% of maximal aerobic workload that is independent of the level of distractor interference (Davranche et al., 2009). However, Hillman et al. (2003) found no effect of a 30 min bout of treadmill running at >80% of maximum heart rate on RTs in a flanker task, although they did find neural evidence from EEG to suggest enhanced stimulus classification speed and a relationship between individual aerobic capacity, neural, and behavioral indices of error monitoring (Themanson and Hillman, 2006). 
This discrepant pattern of results may have many causes, including differences in exercise type, differences in exercise intensity and duration, the design of the behavioral task and the timing of its administration, and individual aerobic capacity and experience with exercise.

Despite the conflicting evidence in the literature, it is clear that relatively brief bouts of acute exercise can influence multiple aspects of selective attention and cognitive control. However, few studies have tested the impact of extended bouts of activity on cognition and, as a result, the effects of these extended bouts are less well understood. Prolonged exercise can lead to hypoglycemia, which can result in increased central fatigue and increased ratings of perceived exertion because the supply of metabolites to the brain is restricted (Nybo and Secher, 2004). Furthermore, acute hypoglycemia induced via insulin infusion can have a detrimental effect on cognitive performance (Strachan et al., 2001; Schachinger, 2003). While it is reasonable to infer that hypoglycemia associated with a long bout of physical activity may also impact cognitive function, there is limited empirical evidence for this in the literature and the evidence that does exist is contradictory. For example, Hogervorst et al. (1996) gave participants a battery of cognitive tasks before and after a 60-min bout of cycling at 75% of maximal work capacity and found exercise-induced improvements in tests of executive function and simple reaction time, although performance at choice reaction time and finger tapping tasks was unaffected. Tomporowski et al. (2007) tested participants cycling at 60% of their VO2max for up to 120 min with and without fluid replacement and found that executive processing speed improved irrespective of the level of dehydration, although this was accompanied by an increase in errors. In contrast, Moore et al. (2012) reported that participants who cycled at 90% of their ventilatory threshold for 60 min performed worse at a complex perceptual discrimination task than participants who rested for an equivalent amount of time. Detection speed in a memory demanding vigilance task was also increased in participants who exercised, although detection sensitivity did not suffer. 
Thus, when considered together the effects of exercise-induced fatigue on cognitive performance in general are unclear, and the specific effects on selective attention and cognitive control remain untested.

The aim of the present study was to test performance on a selective attention task at several stages throughout an extended bout of steady state exercise. Participants cycled for 2 h and 16 min in total, stopping every 17 min to perform a selective attention task. A control group also completed the task the same number of times as the exercise group, but without the exercise. The behavioral task was a hybrid flanker visual search task designed to measure overall search performance as a function of task difficulty and the distraction caused by task-irrelevant stimuli that mapped onto competing responses (e.g., Lavie and Cox, 1997; Lavie and Fox, 2000; Forster and Lavie, 2007). The design of this task meant that visual search difficulty and distractor interference could be manipulated independently, thus we were able to obtain indices of both overall search performance and selectivity.

Based on existing evidence from studies using protocols with brief, acute bouts of exercise (McMorris and Graydon, 1997; Aks, 1998) we predicted overall enhanced visual search performance in the early stages of our study. Previous flanker studies have reported either reduced distraction as a function of an acute bout of exercise (Davranche et al., 2009) or no effects (Hillman et al., 2003; Themanson and Hillman, 2006), so it is possible that exercise may either reduce or have no effect on distractibility in the early stages of activity. Conversely, at later stages of the testing session, there may be a decline in performance as participants become increasingly fatigued (Moore et al., 2012). Furthermore, given that several studies have demonstrated superior performance in high-fit individuals compared with low-fit individuals (Colcombe et al., 2004; Themanson and Hillman, 2006), we predicted there may also be a relationship between fitness level and aspects of performance at this task.

### **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Twenty-eight adult volunteers (14 exercise group, 14 control group) who were students at the University of California, Santa Barbara, took part in the study, either in exchange for course credit or for financial compensation of \$10 per hour. The sample size was determined based on similar studies in the cognition/exercise literature (Themanson and Hillman, 2006; Moore et al., 2012) and previous studies from our lab that have used manipulations of task load and response competition (Giesbrecht et al., 2007; Sy et al., 2013, 2014). All participants read and signed a consent form at the beginning of the session. All procedures were approved by the UCSB Human Subjects Committee and the US Army Human Research Protection Office.

One male was removed from the exercise group as he became exceptionally tired midway through the study and was unable to maintain the required workload. An additional male was excluded from the control group due to a failure of the heart rate monitor. Demographic and fitness data from the remaining 26 participants are reported in **Table 1**, along with independent samples *t*-tests confirming no significant group differences. All participants completed the physical activity readiness questionnaire (PAR-Q; National Academy of Sports Medicine, USA) in order to determine their eligibility to participate in aerobic activity. All participants reported having normal or corrected to normal vision.

#### **VISUAL SEARCH TASK**

The task was designed to measure distraction during visual search, and was based closely on a task developed by Lavie and Cox (1997). All stimuli were presented on an 18-inch monitor with custom scripts that utilized the Psychophysics Toolbox for MATLAB (Brainard, 1997). Participants viewed the screen at a distance of 57 cm. Each trial of the search task consisted of a centrally presented fixation cross (1000 ms ± 125 ms), followed by the search array (100 ms) and then a blank gray screen (31.2 cd/m<sup>2</sup>) which remained on screen until a response was made (**Figure 1A**). Each search array consisted of six black upper-case letters (12.5 cd/m<sup>2</sup>) subtending 0.6° by 0.4°, arranged in a circle subtending 2° from a central fixation point. Participants were instructed to search for a target letter (X or N) among an array of non-target letters and respond by pressing the corresponding key on the keyboard as rapidly and accurately as possible. Task difficulty was manipulated between blocks by requiring participants to search for the target among dissimilarly shaped, curvy letters (SCOGB) in the low load condition, and similarly shaped, angular letters (HKVWZ) in the high load condition. Distraction was also manipulated by presenting a task-irrelevant flanker letter (0.8° by 0.5°) to the right or left of the search array, 1.4° from the nearest non-target letter (**Figure 1B**). This flanker was either compatible (same letter) or incompatible (different letter) with the target. Target and non-target positions were randomized across trials and the distractor was equally likely to appear on the left or right of the search array.
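The stimulus sizes above are given in degrees of visual angle. As a rough illustration of what they mean on screen, the sketch below converts an angular extent to a physical size at the 57 cm viewing distance, using the standard relation size = 2·d·tan(θ/2). The function name and the labels in the dictionary are illustrative assumptions, not part of the original MATLAB scripts.

```python
import math

def visual_angle_to_cm(angle_deg, distance_cm=57.0):
    """Physical extent (cm) of a stimulus centred on the line of sight
    that subtends angle_deg at a viewing distance of distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# Angular extents reported in the text (labels are illustrative only).
extents_deg = {
    "search letter, first dimension": 0.6,
    "search letter, second dimension": 0.4,
    "flanker, first dimension": 0.8,
    "flanker, second dimension": 0.5,
}

one_degree_cm = visual_angle_to_cm(1.0)
sizes_cm = {name: visual_angle_to_cm(deg) for name, deg in extents_deg.items()}

for name, cm in sizes_cm.items():
    print(f"{name}: {cm:.2f} cm")
```

At 57 cm, one degree of visual angle corresponds to very nearly 1 cm on the screen, which is why this viewing distance is a common choice.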

#### **SUBMAXIMAL VO2max TESTING AND EXERCISE INTENSITY CALCULATION**

A measure of estimated maximal oxygen consumption (VO2max) was obtained from each subject by having them mount a stationary bike (CycleOps 400 Pro, Saris Cycling Group, WI, USA) and complete the Åstrand-Ryhming Submaximal Bike Test (Åstrand and Ryhming, 1954). The test involved a 5-min warm-up at a low pedaling resistance producing ∼40 Watts (W) of power, followed by a 6-min test phase at a higher pedaling resistance producing between 80 and 150 W, depending on individual fitness, followed by a 2-min cool-down at 40 W. The goal was to elevate the subject's heart rate to a relatively stable level above 120 BPM in the final 2 min of the test phase. Heart rate was recorded at minutes five and six of the test phase using a CycleOps wireless heart rate monitor, and the mean of these two values, along with the subject's power output, was used to calculate an estimate of absolute VO2max (mL·min<sup>−1</sup>) in accordance with the guidelines in Åstrand and Ryhming (1954). An estimate for relative VO2max (mL·kg<sup>−1</sup>·min<sup>−1</sup>) was then calculated by dividing the value for absolute VO2max by the subject's body mass (kg), in accordance with ACSM guidelines (ACSM, 2007, p. 7).

The goal was to have participants work at ∼50% of their VO2max, so an individual working VO2 value was calculated for each subject by dividing their relative VO2max value by two (ACSM, 2007, p. 29). This VO2 value and the subject's body mass were then used in the ACSM leg ergometer equation (Equation 1; ACSM, 2007, p. 47) to calculate an appropriate, approximate pedaling resistance level (kgm/min). This value was converted to Watts and used throughout the study.

Equation 1: ACSM leg ergometer equation (ACSM, 2007, p. 47)

$$\text{VO2 mL} \cdot \text{kg}^{-1} \cdot \text{min}^{-1} = 1.8 \frac{\text{work rate (kg} \cdot \text{m/min})}{\text{body mass (kg)}} + 7$$
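As a worked illustration of how Equation 1 can be inverted to obtain a pedaling resistance, the sketch below solves for work rate given a target VO2 and body mass, then converts kg·m/min to Watts using standard gravity (1 kg·m/min ≈ 0.163 W). The subject values (relative VO2max of 40 mL·kg<sup>−1</sup>·min<sup>−1</sup>, body mass of 70 kg) are hypothetical and the function names are assumptions; this is not the authors' code.

```python
STANDARD_GRAVITY = 9.80665  # m/s^2, used to convert kg*m/min to Watts

def work_rate_kgm_per_min(target_vo2, body_mass_kg):
    """Invert Equation 1, VO2 = 1.8 * work_rate / body_mass + 7,
    to get the work rate (kg*m/min) that elicits target_vo2."""
    return (target_vo2 - 7.0) * body_mass_kg / 1.8

def kgm_per_min_to_watts(work_rate):
    """1 kg*m/min = 9.80665 J per 60 s."""
    return work_rate * STANDARD_GRAVITY / 60.0

# Hypothetical subject (values assumed for illustration only).
relative_vo2max = 40.0   # mL/kg/min
body_mass = 70.0         # kg

target_vo2 = relative_vo2max / 2.0          # ~50% of VO2max
work_rate = work_rate_kgm_per_min(target_vo2, body_mass)
power_watts = kgm_per_min_to_watts(work_rate)

print(f"work rate: {work_rate:.1f} kg*m/min, power: {power_watts:.1f} W")
```

For this hypothetical subject the resulting power falls within the range of resistances reported for the exercise group, but the numbers are illustrative only.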

#### **PROCEDURE**

#### *Exercise and test sessions*

Participants were briefed on the nature of the experiment and the duration and intensity of exercise that they would be undertaking. After signing a consent form, participants were randomly assigned to either the exercise or control group. All participants arrived with the expectation that they would be engaging in exercise; this was necessary to ensure no differences in exercise anticipation anxiety between the groups. Participants were fitted with a heart rate monitor and then sat still and relaxed while baseline heart rate activity was recorded.

Both groups then completed practice trials of the task followed by one session of the visual search task (64 trials low load, 64 trials high load) to establish baseline search task performance. Participants assigned to the exercise group completed the Åstrand-Ryhming submaximal bike test, then remounted the bike and cycled for 15 min at ∼50% of their VO2max (mean resistance level = 91 W, SD = 24 W), followed by a 2 min cool-down with minimal resistance (40 W). They then dismounted the bike, sat on a chair in front of a computer and completed one session of the visual search task. This exercise > cool-down > search task procedure was then repeated seven more times, so that in total each subject completed the search task nine times interspersed with eight exercise sessions. In total, exercise participants pedaled for 2 h at ∼50% of VO2max, and 16 min at 40 W (cooling down at the end of each session). Participants assigned to the control group completed the same number of sessions at the same times as the exercise group, but rather than exercising for 15 + 2 min between sessions they sat quietly. Control participants were given the Åstrand-Ryhming test at the end of the session.
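The timing of the protocol can be checked with a few lines of arithmetic. The sketch below derives the cumulative cycling time at which each search session took place and the total exercise and cool-down durations from the block structure described above. The variable names are assumptions, and the session labels count pedaling time only (time spent on the search task itself is not included).

```python
EXERCISE_MIN = 15   # cycling at ~50% VO2max per block
COOLDOWN_MIN = 2    # cool-down at 40 W per block
N_BLOCKS = 8        # exercise blocks after the baseline session

block_len = EXERCISE_MIN + COOLDOWN_MIN

# Baseline session at 0 min, then one search session after each block.
session_times = [0] + [block_len * k for k in range(1, N_BLOCKS + 1)]

total_exercise_min = EXERCISE_MIN * N_BLOCKS
total_cooldown_min = COOLDOWN_MIN * N_BLOCKS

print("session times (min):", session_times)
print("exercise:", total_exercise_min, "min; cool-down:", total_cooldown_min, "min")
```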

#### *Ratings of perceived exertion*

At the start of the study participants were familiarized with the ratings of perceived exertion (RPE) scale (Borg, 1970, 1982). Exercise group participants verbally reported ratings of perceived exertion three times during each exercise block (at 4, 9, and 14 min). RPE is a subjective rating of the intensity of physical sensations a person experiences during physical activity, including increased heart rate, respiration rate, muscle fatigue and physical discomfort. When prompted by the experimenter, participants appraised their feeling of exertion by viewing a scale and reporting a number between six (no exertion) and 20 (maximal exertion).

#### *Saliva samples*

Saliva samples were collected from both groups prior to the baseline task block, and then after task sessions two, four, six, and eight. The aim was to obtain measures of salivary cortisol and alpha-amylase as participants progressed through the testing session. Bouts of physical activity are associated with increases in salivary cortisol (e.g., Chicharro et al., 1998) and alpha-amylase (Li and Gleeson, 2004), so these measures would help confirm that our stress manipulation was effective. For each saliva sample, participants passively drooled 2 ml of saliva into a plastic vial via a plastic drinking straw. Samples were immediately frozen for storage at −20°C. To ensure the samples were accurate and high quality, participants were required to not drink within a 10-min period prior to the collection of each saliva sample, as consumption of liquid within this period can dilute the sample. Participants were allowed unrestricted access to water at all other times during the testing session. Consumption of food or any other types of liquid was not permitted. Samples were shipped on dry ice for analysis at the Clinical Endocrinology Laboratory, UC Davis, Davis, CA, USA.

#### **DESIGN**

Three within-participants variables were manipulated. First, visual search performance was sampled nine times throughout the testing session. Second, distractibility was measured by manipulating whether the task-irrelevant letter presented outside of the search array was compatible or incompatible with the target. Comparing speed and accuracy between the different distraction conditions provided an index of distractor interference. Third, search task difficulty was manipulated by presenting non-target items in the search array that were either similarly shaped (high load) or dissimilarly shaped (low load) to the target. A number of studies have demonstrated that distraction is typically more robust under conditions of low visual search load (e.g., Lavie and Cox, 1997; Forster and Lavie, 2007).

#### **ANALYSES**

Data from 26 participants (13 exercise, 13 control) were collapsed from nine sessions (baseline, 17, 34, 51, 68, 85, 102, 119, 136 min) into five sessions (baseline, 34, 68, 102, 136 min) to increase statistical power. The 34, 68, 102, and 136 min sessions will be collectively referred to as the "test" sessions from here onward. *Post hoc* analyses for main effects and interactions of interest were computed using paired or independent samples *t*-tests correcting for multiple comparisons using the false discovery rate method with a threshold of 0.05 (Benjamini and Hochberg, 1995). For completeness, both the uncorrected and FDR adjusted *p*-values (*q*) are reported for the *post hoc* tests.
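For readers unfamiliar with the false discovery rate correction, the following is a minimal sketch of the Benjamini and Hochberg (1995) step-up adjustment that turns a family of uncorrected *p*-values into adjusted *q*-values. The function name and the example *p*-values are hypothetical; how the authors grouped tests into families is not stated here, so the output is not expected to reproduce the *q*-values reported below.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p to the smallest so that each q is the minimum
    # of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical family of four post hoc tests.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
significant = [qi <= 0.05 for qi in example_q]

for p, qv, sig in zip(example_p, example_q, significant):
    print(f"p = {p:.3f}, q = {qv:.3f}, significant at 0.05: {sig}")
```

A test is declared significant when its *q*-value does not exceed the chosen threshold (0.05 in the present study).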

### **RESULTS**

The results are reported in three sections. First, we evaluate the physiological data to confirm the effectiveness of the physical fatigue manipulation. Second, we analyze the behavioral data to determine whether exercise had any impact on search task performance. Third, we examine the relationship between individual aerobic capacity (VO2max) and search task performance.

### **PHYSIOLOGICAL DATA**

#### *Ratings of perceived exertion*

Participants in the exercise group perceived their exertion to be "light/somewhat hard" in the first half of the session, increasing to "somewhat hard" after 102 min and "somewhat hard/heavy" at 136 min (**Figure 2A**). A repeated measures ANOVA on the exercise group (the baseline condition was excluded because participants were not exercising at this stage) with the within-participants factor session (34, 68, 102, 136 min) confirmed a significant increase in mean RPE as a function of time spent cycling [*F*(3,36) = 31.09, *p* < 0.001, η<sup>2</sup> = 0.72]. *Post hoc* paired samples *t*-tests confirmed significant increases between 34 and 68 min [*t*(12) = −2.79, *p* = 0.016, *q* = 0.016], 68 and 102 min [*t*(12) = −4.39, *p* = 0.001, *q* = 0.002], and 102 and 136 min [*t*(12) = −7.45, *p* = 0.001, *q* = 0.002].

#### *Heart rate*

Heart rate increased in the exercise group as a function of time spent cycling, and decreased slightly over the course of the experiment in the control group (**Figure 2B**). A repeated measures ANOVA with session as the within-participants factor (34, 68, 102, 136 min) and group as the between-participants factor (exercise, control) confirmed significant main effects of session [*F*(4,96) = 196.62, *p* = 0.008, η<sup>2</sup> = 0.89], group [*F*(1,24) = 166.98, *p* < 0.001, η<sup>2</sup> = 0.72] and a session × group interaction [*F*(4,96) = 308.60, *p* < 0.001, η<sup>2</sup> = 0.93].

Within-group direct comparisons confirmed significant increases in heart rate in the exercise group from baseline to 34 min [*t*(12) = 19.13, *p* < 0.001, *q* = 0.004], from 68 to 102 min [*t*(12) = 4.70, *p* = 0.001, *q* = 0.004], and from 102 to 136 min [*t*(12) = 3.16, *p* = 0.008, *q* = 0.016]. In contrast, heart rate decreased slightly over the course of the session in the control group, with a significant drop from 34 to 68 min [*t*(12) = 3.26, *p* = 0.007, *q* = 0.016].

#### *Salivary cortisol and alpha-amylase*

Salivary cortisol levels increased in the exercise group relative to the control group over the duration of the test session. Raw and baseline corrected cortisol data are presented in **Figures 2C,D**, respectively. A mixed measures ANOVA with session (34, 68, 102, 136 min) as the within-participants factor, and group (exercise, control) as the between-participants factor, was computed for the baseline corrected salivary cortisol data. The analysis revealed main effects of session [*F*(3,72) = 7.84, *p* < 0.001, η<sup>2</sup> = 0.25], group [*F*(1,24) = 15.59, *p* = 0.001, η<sup>2</sup> = 0.39], and a session by group interaction [*F*(3,72) = 12.73, *p* < 0.001, η<sup>2</sup> = 0.34]. Direct comparisons between the exercise and control groups confirmed that cortisol was significantly higher in the exercise group than the control group at 34 min [*t*(24) = 2.11, *p* = 0.046, *q* = 0.046], 68 min [*t*(24) = 3.11, *p* = 0.01, *q* = 0.007], 102 min [*t*(24) = 4.17, *p* < 0.001, *q* = 0.002], and 136 min [*t*(24) = 4.31, *p* < 0.001, *q* = 0.002]. Raw and baseline corrected alpha-amylase data are presented in **Figures 2E,F**, respectively. Although the alpha-amylase data appear to follow a pattern that is similar to the cortisol data, a mixed measures ANOVA computed for the baseline corrected data with the same factors reported previously found no main effect of session [*F*(3,72) = 1.99, *p* = 0.12, η<sup>2</sup> = 0.08], group [*F*(1,24) = 3.98, *p* = 0.057, η<sup>2</sup> = 0.14], or interaction of session by group [*F*(3,72) = 0.86, *p* = 0.46, η<sup>2</sup> = 0.04].

### **BEHAVIORAL DATA**

#### *Reaction time*

The mean RT data are shown in **Figure 3A**. A 2 (group: exercise, control) × 2 (load: low, high) × 2 (interference: compatible, incompatible) × 5 (session: baseline, 34, 68, 102, 136 min) repeated measures ANOVA was computed for the mean RT data. Overall, there was a reduction in RT in both groups over the duration of the experiment [*F*(4,96) = 19.61, *p* < 0.001, η<sup>2</sup> = 0.45]. There was also a significant session × load interaction [*F*(4,96) = 8.46, *p* < 0.001, η<sup>2</sup> = 0.26]. This interaction was driven by significant reductions in RT under high task load between 34 and 68 min [*t*(25) = 3.04, *p* = 0.006, *q* = 0.016] and between 68 and 102 min [*t*(25) = 3.16, *p* = 0.004, *q* = 0.016], and under low task load between baseline and 34 min [*t*(25) = 4.57, *p* < 0.002, *q* = 0.016]. Both groups also demonstrated increased RT under conditions of high load [*F*(1,24) = 214.35, *p* < 0.001, η<sup>2</sup> = 0.90] and longer RTs when the distractors were incompatible [*F*(1,24) = 27.91, *p* < 0.001, η<sup>2</sup> = 0.54]. Furthermore, there was a load × interference interaction [*F*(1,24) = 26.00, *p* < 0.001, η<sup>2</sup> = 0.52]. *Post hoc* paired samples *t*-tests confirmed that incompatible distractors significantly increased RTs under low load [*t*(26) = −10.69, *p* < 0.001, *q* = 0.002], but not high load [*t*(26) = 0.35, *p* = 0.73, *q* = 0.73; **Figure 4**].

There was also a significant three-way interaction between group, session, and load on search task performance [*F*(4,96) = 2.49, *p* = 0.048, η<sup>2</sup> = 0.09]. In the first half of the session both groups appear to demonstrate a learning trend as they complete more blocks of the task, although *post hoc* paired samples *t*-tests confirm that the only statistically significant decrease in RT is from baseline to 34 min in the control group under low load [*t*(12) = 4.33, *p* = 0.001, *q* = 0.008]. At 68 min, significant differences emerge between the two groups under high task load: there is a significant decrease in RT from 68 to 102 min [*t*(12) = 3.93, *p* = 0.002, *q* = 0.008] in the control group, but no change in the exercise group [*t*(12) = 0.64, *p* = 0.53, *q* = 0.53]. RT did not significantly change between 102 and 136 min under high task load in either the exercise group [*t*(12) = 0.66, *p* = 0.52, *q* = 0.53] or control group [*t*(12) = −1.02, *p* = 0.35, *q* = 0.55]. The groups did not differ at baseline when corrected for multiple comparisons [*t*(24) = 2.1, *p* = 0.04, *q* = 0.53].

#### *Accuracy*

The mean accuracy data are shown in **Figure 3B**. A mixed measures ANOVA (same factors as listed in the previous section) was computed for the accuracy data. There was an overall decline in accuracy in both groups under high load [*F*(1,24) = 108.88, *p* < 0.001, η<sup>2</sup> = 0.82] and high interference [*F*(1,24) = 26.00, *p* < 0.001, η<sup>2</sup> = 0.52]. Accuracy improved in both groups over the duration of the experiment [*F*(4,96) = 2.74, *p* = 0.03, η<sup>2</sup> = 0.10]. There was a session by load interaction [*F*(4,96) = 3.05, *p* = 0.02, η<sup>2</sup> = 0.11]. Follow-up analyses revealed a marginal improvement in performance between baseline and 34 min in the high load condition, but this test did not survive correction for multiple comparisons [*t*(25) = −2.15, *p* = 0.041, *q* = 0.32]. There were no overall group differences [*F*(1,24) = 0.04, *p* = 0.84, η<sup>2</sup> = 0.02].

### **CARDIOVASCULAR FITNESS AND BEHAVIORAL PERFORMANCE**

The relationship between aerobic fitness (VO2max) and search performance (RT) was assessed in two ways. First, to globally assess this relationship we computed the correlation between VO2max and RT in the baseline session and the overall average of the test sessions for experimental and control groups. There was no relationship between VO2max and RT in the baseline condition for either the exercise group [*r*(11) = −0.35, *p* = 0.24] or the control group [*r*(11) = −0.12, *p* = 0.69; **Figure 5A**]. In contrast, there was a significant negative relationship between VO2max and RT averaged across all conditions of the test session for the exercise group [*r*(11) = −0.63, *p* = 0.02], but not the control group [*r*(11) = 0.14, *p* = 0.66]. Direct comparisons of the correlation coefficients in the experimental and control conditions using Fisher's *r*-to-*z* transformations revealed that the correlations were not significantly different in the baseline period (*Z* = −0.55, *p* = 0.59), but were different in the test sessions (*Z* = −1.97, *p* = 0.049). This finding suggests that participants with higher aerobic capacity are faster to respond to search task targets than participants with lower capacity.
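The comparison of two independent correlations via Fisher's *r*-to-*z* transformation can be sketched as follows. The function name is an assumption, and the group size of 13 is inferred from the reported degrees of freedom, *r*(11). With the correlations reported above, this standard formula gives *Z* statistics that agree with the reported values to rounding.

```python
import math

def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference between two correlations
    from independent samples. Returns (Z, two-tailed p)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # 2 * (1 - Phi(|z|))
    return z, p

N_PER_GROUP = 13  # inferred from r(11): df = n - 2

# Exercise vs. control correlations reported in the text.
z_baseline, p_baseline = compare_independent_correlations(-0.35, N_PER_GROUP, -0.12, N_PER_GROUP)
z_test, p_test = compare_independent_correlations(-0.63, N_PER_GROUP, 0.14, N_PER_GROUP)

print(f"baseline: Z = {z_baseline:.2f}, p = {p_baseline:.2f}")
print(f"test sessions: Z = {z_test:.2f}, p = {p_test:.3f}")
```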

Second, to investigate whether the correlation between aerobic fitness and search performance changed over the course of the experiment, separate correlations were calculated within each group at 34, 68, 102, and 136 min correcting for multiple comparisons (*p* < 0.05) within group using the false discovery rate approach (Benjamini and Hochberg, 1995). The results of this analysis are summarized in **Table 2** and **Figure 5B**. There were no significant correlations in the control group at any point during the experiment. In the experimental group, the correlation was significant in all the test sessions.

**FIGURE 5 | Correlations between VO2max and visual search RT at baseline (A) and during the test session (B).** RT data are averaged across search task load and distractor interference conditions. While the p-values indicate the values uncorrected for multiple comparisons, \*indicate the hypothesis tests that are p < 0.05 FDR-corrected for multiple comparisons.

**Table 2 | Pearson correlations show relationship between VO2max and RT for exercise and control groups.**

\*indicates p < 0.05 FDR-corrected for multiple comparisons.

Given the relatively small sample size, several steps were taken to ensure that the results were not contaminated by deviations from normality and/or differences in baseline measurements. First, we constructed confidence intervals for each correlation using bootstrap resampling with 1000 iterations. To maintain consistency with the FDR-corrected hypothesis tests, we constructed the confidence intervals using the false coverage rate (FCR) procedure (i.e., 95% FCR intervals; Benjamini and Yekutieli, 2005). The mean correlation values of the resampled distributions and the confidence intervals are shown in **Figure 6**. Not only were the means of the bootstrapped correlation coefficients very similar to the actual coefficients, but the bootstrapped confidence intervals corroborated the hypothesis tests (i.e., intervals for the tests that were statistically significant did not include zero). While it is still possible that the observed pattern of correlations may change with a larger sample size, the bootstrap analysis confirms that the results are stable under resampling. Second, we confirmed that all data from both groups are within three standard deviations of the condition means. Third, Kolmogorov-Smirnov tests revealed that the majority of the conditions were normally distributed. The only non-normal data were the 68, 102, and 136 min conditions in the control group. As a result, we also conducted non-parametric Spearman correlations. The results of these non-parametric hypothesis tests matched the parametric tests in every condition. Finally, it is possible that the observed differences in the correlations between VO2max and performance between the groups could be associated with differences in the baseline condition. To address this we regressed VO2max against baseline RT and overall test RT, for each group independently. In the exercise group, overall test RT significantly predicted VO2max [β = −0.76, *t*(12) = −2.23, *p* = 0.050], whereas baseline RT did not significantly predict VO2max [β = 0.19, *t*(12) = 0.56, *p* = 0.59]. The overall model explained a marginally significant proportion of variance in VO2max [*R*<sup>2</sup> = 0.41, *F*(2,10) = 3.52, *p* < 0.07]. In the control group, overall test RT did not predict VO2max [β = 0.41, *t*(12) = 1.02, *p* = 0.33] and neither did baseline RT [β = −0.41, *t*(12) = −0.99, *p* = 0.34]. The overall model did not explain the variance in VO2max [*F*(2,10) = 0.60, *p* = 0.57].
These analyses confirm that the observed differences in correlations between VO2max and RT between the two groups are not just the result of differences at baseline.
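The bootstrap step described above can be illustrated with a minimal percentile-bootstrap sketch for a Pearson correlation. The data are synthetic (13 hypothetical VO2max and RT values, not the study data), the function names are assumptions, and the percentile method is one common choice; the exact resampling scheme the authors used is not specified. The helper `fcr_coverage` applies the Benjamini and Yekutieli (2005) adjustment, in which each of *R* selected intervals out of *m* candidates is built at level 1 − *R*·*q*/*m*.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation; returns None if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def fcr_coverage(n_selected, n_total, q=0.05):
    """Marginal coverage for FCR-adjusted intervals: 1 - R * q / m."""
    return 1.0 - n_selected * q / n_total

def bootstrap_ci(x, y, n_boot=1000, coverage=0.95, seed=0):
    """Percentile bootstrap: resample (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    rs = []
    while len(rs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:          # skip degenerate resamples
            rs.append(r)
    rs.sort()
    alpha = 1.0 - coverage
    lower = rs[math.floor(alpha / 2 * (n_boot - 1))]
    upper = rs[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return sum(rs) / n_boot, lower, upper

# Synthetic, strictly decreasing data for illustration only.
vo2max = [30 + 2 * k for k in range(13)]
rt_ms = [700 - 5 * v + (4 if k % 2 == 0 else -4) for k, v in enumerate(vo2max)]

# Example: all four test-session correlations selected out of four.
coverage = fcr_coverage(n_selected=4, n_total=4)
boot_mean, ci_lower, ci_upper = bootstrap_ci(vo2max, rt_ms, coverage=coverage)
print(f"mean r = {boot_mean:.2f}, interval = [{ci_lower:.2f}, {ci_upper:.2f}]")
```

An interval that excludes zero corresponds to a correlation that would be judged significant at the matching level, which is the sense in which the bootstrap intervals corroborate the hypothesis tests.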

### **DISCUSSION**

The goal of the present study was to investigate changes in multiple components of visual attention during a long bout of physical activity. Two key findings emerged from the behavioral data. First, the three-way interaction between load, session and group suggests that a long bout of exercise may impact the learning effect in a visual search task. Second, the significant negative relationship that emerges between VO2max and search task response time in the exercise group, but not in the control group, suggests that aerobic capacity may only be a good predictor of visual search performance in the current task when one is engaged in exercise. In the following sections, we discuss the implications of these findings for both the cognition and exercise literature and for models of selective attention.

#### **EXERCISE EFFECTS ON SEARCH TASK PERFORMANCE**

Between the baseline condition and 68 min, search task RTs trended downward in both groups. However, while the control group showed significant improvement from 68 to 102 min in the high load condition, the exercise group did not. This pattern of results suggests that an hour-long bout of exercise can have a detrimental impact on normal task learning during visual search, but only under demanding task load conditions. One possible explanation is that the acute bout of exercise drains processing resources and while there are still sufficient resources for the learning effect to continue under low search task load, there are not sufficient resources to support learning under high load. There is other similar evidence in the literature suggesting that participants who engage in 1 h of intense exercise are impaired at perceptual discrimination relative to participants that rest for an equivalent amount of time (Moore et al., 2012). Our findings not only provide further evidence for impaired perceptual discrimination after a long bout of acute exercise, but also suggest that task load may have an important role in determining the extent of the impairment. The present result, however, must be interpreted with some caution because of the difference between the groups in baseline performance in the high load condition. Also, although our physical stress manipulation was effective, as confirmed by increased salivary cortisol levels in the exercise group compared to the control group during the test session and increased RPEs over the course of the session in the exercise group, it is possible that the results of the present study may have been more robust if the exercise group had been required to exercise more intensively. For example, Moore et al. (2012) required subjects to exercise at a far higher intensity (90% of ventilatory threshold) than our subjects. Given that overall effects on cognitive performance generally tend to be small and affected by a range of behavioral and exercise intensity/duration related factors (e.g., Lambourne and Tomporowski, 2010; Chang et al., 2012), it is perhaps not surprising that our effects are also small.

#### **RELATIONSHIP BETWEEN AEROBIC CAPACITY AND SEARCH RT**

A robust, significant negative relationship between VO2max and RT emerged in the exercise group as soon as they began to exercise, and this relationship endured for the remainder of the session. This relationship was not present in either group during the baseline session, and did not emerge in the control group at any stage of the test session, suggesting that aerobic capacity is related to search performance in this task, but only during a bout of physical exercise. There was no relationship between aerobic capacity and accuracy, indicating that enhancement in processing speed did not come at the cost of increased errors.

To an extent, the discovery that people with higher aerobic capacity outperform people with lower capacity in our search task corroborates previous data that show enhanced cognitive performance in higher-fit people. Themanson and Hillman (2006) monitored brain activity using the event related potential (ERP) technique while subjects performed a flanker task before and after a 30-min bout of treadmill exercise. Although the authors found no effects of the acute bout of exercise on any dependent measures, they did find that higher-fit adults showed reduced error related negativity (ERN), increased error positivity (Pe), and increased post-error response slowing, all of which suggest increased involvement of top–down cognitive control mechanisms.

One may also draw associations between the present findings and results from studies of cognitive vitality in older adults. A meta-analytical study conducted to examine the effects of aerobic fitness training interventions on cognitive performance in sedentary older adults (aged 55–80 years) found that training had robust, reliable effects on cognitive performance across various domains, including executive function and visuospatial task performance (Colcombe and Kramer, 2003). Another study of older adults found that high-fit, aerobically trained participants showed reduced interference effects in an Eriksen Flanker task when compared to low-fit participants (Colcombe et al., 2004). However, our findings are unique in showing that under the present experimental conditions, aerobic capacity may be a relevant factor for visual search proficiency *after* an individual has begun an acute bout of exercise. This is in contrast to the vast majority of other research, which demonstrates a relationship between aerobic capacity and cognitive performance while participants are at rest. Thus, there may be a common mechanism whereby aerobic capacity is relevant for certain types of cognitive performance while resting and especially important for other types of cognitive task during an acute bout of exercise.

It is possible that common mechanisms can account for these results, and evidence for this possibility comes from both animal and human studies. Animal models demonstrate that increased aerobic fitness is associated with higher levels of BDNF, a growth factor that protects and supports the function and survival of neurons (Neeper et al., 1995). This and other neurotrophins are responsible for the survival of neurons (Lewin and Barde, 1996), the regulation of neuronal connectivity and synaptic efficacy (Lu and Chow, 1999) and neurogenesis (van Praag et al., 1999). Thus, aerobic activity results in greater neural efficiency and plasticity, meaning that animals with greater aerobic capacity can show improved cognitive performance (van Praag et al., 1999). There is also evidence that BDNF plays a key role in the human brain. Knaepen et al. (2010) carried out a systematic review of studies that evaluated the effects of acute exercise or a training intervention on human participants. They conclude that aerobic exercise can result in both higher BDNF synthesis and upregulation of the synthesis, reabsorption and release of BDNF by cells, thus inducing both neuroprotective and neurotrophic effects. It is therefore possible that increased cardiovascular fitness leads to an increased number of synapses in frontal and parietal regions of the human brain (Colcombe et al., 2004). It may also be the case that enhancements in oxygen transportation to the brain that are associated with higher levels of cardiovascular fitness (Endres et al., 2003; Swain et al., 2003) boost cognitive performance due to the increased availability of metabolites to neurons (Chodzko-Zajko and Moore, 1994). The physiological changes associated with these gains in aerobic fitness have been referred to as the cardiovascular fitness hypothesis (North et al., 1990; Etnier et al., 2006). Critically, within the present context it is plausible that this increased interconnectivity and metabolic efficiency may allow for greater recruitment and supply of neurons under conditions in which metabolic demands are increased. This may explain why higher-fit participants in our study were able to search for the target more efficiently than low-fit participants during the prolonged bout of exercise.

#### **IMPLICATIONS FOR MODELS OF SELECTIVE ATTENTION**

### *Neural theory of visual attention*

In addition to providing insight into the complex interaction between physical fatigue and cognition, the present results can be interpreted within the context of models of attention that are not explicitly designed to explain these effects. Consider the theory of visual attention (TVA: Bundesen, 1990), a computational theory that is able to account for a wide range of aspects of the operation of selective attention. TVA is based on two equations, which, together, account for two fundamental aspects of selective attention: filtering (object selection) and pigeonholing (feature selection). The updated neural TVA (NTVA; Bundesen et al., 2005) is an attempt to account for these mechanisms at the neural level. According to the model, filtering changes the number of cortical neurons that represent an object, whereas pigeonholing changes activation levels in neurons responsible for coding particular features. Together, these two mechanisms are responsible for controlling the activity levels of populations of neurons responsible for signaling specific object categories. These populations then compete with populations of neurons in the visual system responsible for signaling other object categories, with the level of activation of each population determining whether a category will be encoded into visual short-term memory. According to NTVA, if a behaviorally relevant object has a high attentional weight then a larger set of neurons representing that object should be active. Having large populations of neurons available to represent the behaviorally relevant object is therefore vital for the rapid and accurate categorization of that object. Within the present context, if high aerobic capacity translates into increased synaptic interconnectivity and metabolic supply allowing for the recruitment of greater numbers of neurons, then individuals with higher aerobic capacity may be more readily able to recruit larger populations of neurons to represent the behaviorally relevant object, and readily supply those neurons with the necessary metabolites. Hence aerobic fitness may become an important factor in determining speed of behaviorally relevant object identification under conditions of high metabolic load, such as during or immediately after an acute bout of exercise.

### *Perceptual load theory*

Conditions of high perceptual load lead to reduced interference from competing distractors (Lavie and Tsal, 1994; Lavie, 1995; see Lavie, 2006, 2010, for reviews), supporting the idea that attention is a limited capacity resource. Although perceptual load theory does not make any specific predictions regarding the effects of psychological or physiological stress and fatigue on selectivity, Lavie and Tsal (1994) acknowledge that in addition to perceptual load, attentional capacity can also be modulated by factors such as the temporal state of alertness and availability of resources. For example, evidence from hybrid flanker visual search tasks similar to the one used in the present study indicates that mental fatigue leads to increased distractor interference under low perceptual load (Csathó et al., 2012), and social stress causes reduced distractor interference under low load and increased interference under high load (Sato et al., 2012). Although the presence of a significant load × distraction interaction in our data confirmed that distractor interference was greater under low load than high load, this interference effect was not modulated by exercise at any point during the test session. This result concurs with previous studies that demonstrate an acute bout of exercise does not interact with flanker interference in an Eriksen flankers task (Hillman et al., 2003; Themanson and Hillman, 2006; Davranche et al., 2009). However, it appears there is a contrast between physical activity induced stress, which does not impact upon distractor interference, and psychological stress and fatigue, which are known to interact with distractor interference in this task (Csathó et al., 2012; Sato et al., 2012).

### **CONCLUSION**

Relatively few attempts have been made to study the effects of prolonged exercise on cognitive performance in the laboratory, and the effects of exercise-induced arousal and fatigue on selective attention and cognitive control are unclear. In the present study, the data suggest that aerobic capacity may be an important determinant of visual search performance, with high-fit participants able to identify a target more rapidly than low-fit participants during a bout of physical exercise. We also provide tentative evidence that prolonged exercise can have a detrimental effect on visual search under high search task load.

Our findings are important and unique for five main reasons. First, these data are the first to suggest that a relationship between aerobic capacity and cognitive performance can emerge after a bout of physical exercise. Second, our study is the first to test exercise effects on visual search and perceptual distraction. Third, the present study is one of a handful of studies that test the effects of a prolonged bout of exercise on cognitive performance. Fourth, while a relationship between aerobic capacity and cognitive performance is well established in aging populations (Colcombe and Kramer, 2003), our study is one of a handful that also demonstrates this effect in a sample of young, healthy adults. Finally, the present results have implications for the generality of current theories of selective attention, which are largely based on behavioral performance measured at rest. While there appears to be a link between the physiological changes that occur with exercise and cognitive function that is consistent with the NTVA, future work is needed to investigate the empirical viability of this link.

#### **ACKNOWLEDGMENTS**

This work was supported by the Institute for Collaborative Biotechnologies through contract W911NF-09-0001 from the U.S. Army Research Office. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 July 2014; accepted: 24 October 2014; published online: 11 November 2014.*

*Citation: Bullock T and Giesbrecht B (2014) Acute exercise and aerobic fitness influence selective attention during visual search. Front. Psychol. 5:1290. doi: 10.3389/fpsyg.2014.01290*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Bullock and Giesbrecht. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Effects of monitoring for visual events on distinct components of attention

### *Christian H. Poth<sup>1,2</sup>\*, Anders Petersen<sup>3</sup>, Claus Bundesen<sup>3</sup> and Werner X. Schneider<sup>1,2</sup>*

<sup>1</sup> Neuro-Cognitive Psychology, Department of Psychology, Bielefeld University, Bielefeld, Germany

<sup>2</sup> Center of Excellence Cognitive Interaction Technology, Bielefeld University, Bielefeld, Germany

<sup>3</sup> Center for Visual Cognition, Department of Psychology, University of Copenhagen, Copenhagen, Denmark

#### *Edited by:*

Søren Kyllingsbæk, University of Copenhagen, Denmark

#### *Reviewed by:*

Agnieszka Konopka, Max Planck Institute for Psycholinguistics, Netherlands

Jeff P. Hamm, The University of Auckland, New Zealand

#### *\*Correspondence:*

Christian H. Poth, Neuro-Cognitive Psychology, Department of Psychology, Bielefeld University, P. O. Box 10 01 31, D-33501 Bielefeld, NRW, Germany e-mail: c.poth@uni-bielefeld.de

Monitoring the environment for visual events while performing a concurrent task requires adjustment of visual processing priorities. By use of Bundesen's (1990) Theory of Visual Attention, we investigated how monitoring for an object-based brief event affected distinct components of visual attention in a concurrent task. The perceptual salience of the event was varied. Monitoring reduced the processing speed in the concurrent task, and the reduction was stronger when the event was less salient. The monitoring task affected neither the temporal threshold of conscious perception, nor the storage capacity of visual short-term memory, nor the efficiency of top-down controlled attentional selection.

**Keywords: visual attention, TVA, sustained attention, vigilance, salience, event-based prospective memory**

### **INTRODUCTION**

Monitoring the environment for visual events is a frequent requirement of everyday tasks. Monitoring errors can have tremendous consequences, for example when an anesthesiologist fails to detect aberrations in displays of a patient's vital signs (see e.g., Seagull et al., 2001) or when an air traffic controller misses conflicting routes of aircraft (see e.g., Remington et al., 2000).

One may define *monitoring* as preparing for detecting visual events in the environment. Events could, for instance, refer to a change in appearance of a task-relevant object. In realistic situations, monitoring tasks can occur interwoven with a plethora of other activities. To date, however, little is known about the effects of monitoring for visual events on visual information processing within other concurrent activities.

For decades, monitoring has been investigated in the field of *vigilance* research (for an overview, see Davies and Parasuraman, 1982). Classical vigilance experiments required participants to detect and respond to rare visual events that occurred in between periods of visual stimulation without response requirements (e.g., Mackworth, 1948). In such tasks, detection performance typically declines over time. This result was termed "vigilance decrement," and it has given rise to a conceptualization of vigilance as a "... psychological readiness to perceive and respond ..." (Mackworth, 1948). Numerous explanations for the vigilance decrement have been proposed (Davies and Parasuraman, 1982) and are still debated (Hancock, 2013; see also Ariga and Lleras, 2011). The bulk of vigilance experiments comprised scenarios of preparatory monitoring during periods of low external stimulation where attention had to be sustained (Hancock, 2013). This research focused on the outcomes of monitoring, namely on detection performance. No studies asked how components of visual information processing are modified within the period of monitoring. In our investigation, we focus on the issue of how monitoring affects components of visual attention.

*Visual attention* summarizes a number of processes that govern priorities in visual information processing (e.g., Duncan, 2006; Chun et al., 2011). As a result of these priorities, currently relevant rather than irrelevant information is internally represented to be available for action control (Allport, 1987; Neumann, 1987; Desimone and Duncan, 1995; Schneider, 1995; Duncan, 2006). A framework for describing how this prioritization might be accomplished is provided by the "biased competition" approach. All inputs from the visual field may compete against each other for being available to control actions and this competition may be biased by representations of the current task (Desimone and Duncan, 1995). Such task representations define sets of processing priorities ("attentional sets") and may stem from working memory (e.g., Olivers et al., 2011) or long-term memory (e.g., Woodman et al., 2013).

A specific biased competition theory about distinct attentional components was suggested by Bundesen's (1990) computational *Theory of Visual Attention* (TVA). TVA assumes visual information processing to be organized in separate processing stages. Visual short-term memory (VSTM) is the stage at which visual information is temporarily maintained and becomes available for action control, so that it can be reported or acted upon. At a preceding stage, competition of objects in the visual field takes place. This competition is characterized as a race of possible categorizations of objects. The categorizations that finish processing first win the race and enter VSTM, provided there is still a free "slot" for storing information about the respective object (for evidence for such a capacity limitation of VSTM, see Luck and Vogel, 1997). In TVA, two attentional mechanisms (*filtering* and *pigeonholing*) jointly determine which of the categorizations of objects in the visual field win the race and are encoded into VSTM. TVA's filtering mechanism was used to explain observations of "focused attention" where processing was concentrated on relevant visual objects at the expense of irrelevant ones (for a review, see Bundesen and Habekost, 2008). Visual processing appears to be limited in capacity so that only a certain number of visual objects can be processed at a given time (e.g., Duncan, 2006). Filtering is assumed to operate by assigning each object in the visual field an attentional priority reflecting its current relevance and salience (Nordfang et al., 2013). In TVA, these attentional priorities of objects are called "attentional weights." The processing speed of a categorization of an object is proportional to the absolute attentional weight allocated to this object, normalized by the attentional weights allocated to all objects in the visual field. Therefore, filtering directly affects the probability that an object-categorization finishes processing earlier than others and wins the race. TVA's second attentional mechanism, pigeonholing, consists of biases for categorizing visual objects as belonging to certain task-relevant categories. Like filtering, it is assumed to influence the processing speed of categorizations and thus their probability of winning the race to VSTM.
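The normalization of attentional weights described above can be illustrated with a minimal sketch. It rests on a simplifying assumption that is not part of the text: all objects are taken to have equal sensory evidence and equal categorization bias, so that each object's processing rate is simply a total capacity multiplied by its relative attentional weight. The function name, the weights, and the capacity value are hypothetical and chosen for illustration only.

```python
def processing_rates(weights, total_capacity):
    """Distribute a total processing capacity across objects in proportion
    to their relative attentional weights (each weight divided by the sum
    of the weights of all objects in the visual field)."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one object needs a positive weight")
    return [total_capacity * w / total_weight for w in weights]


# Hypothetical display: two targets (weight 1.0) and four distractors
# (weight 0.5). Both the weights and the capacity of 60 are illustrative.
rates = processing_rates([1.0, 1.0, 0.5, 0.5, 0.5, 0.5], total_capacity=60.0)
```

In this sketch, lowering the distractor weights shifts capacity toward the targets without changing the total, which is the sense in which filtering changes the odds of winning the race.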

Using a TVA-based letter-report task (Vangkilde et al., 2011; see below), five distinct attentional components can be obtained. Each component represents a specific aspect of attentional functioning (for overviews, see also Bundesen and Habekost, 2008, 2014). (1) The threshold of conscious perception, *t*<sub>0</sub>, is the maximum ineffective exposure duration of a visual stimulus. That is, it indexes the time necessary to initiate the race of object categorizations to VSTM. (2) The maximum number of visual objects that can be maintained in VSTM, *K*, is taken as a measure of VSTM storage capacity. (3) Visual processing speed, *C*, reflects the overall available visual processing capacity which is distributed across the objects in the visual field according to their attentional weights. (4) The top-down controlled attentional selectivity, α, represents the efficiency of TVA's filtering process. It captures how efficiently the processing of relevant visual objects can be prioritized over the processing of irrelevant ones. (5) The laterality of attentional weighting, *w*<sub>index</sub>, provides an index of spatial attentional biases that lead to preferred encoding of visual objects from one of the visual hemifields (cf. Duncan et al., 1999; Vangkilde et al., 2011). To date, these TVA-based components have been applied to characterize attentional performance in a variety of psychological, neuropsychological, and physiological studies (for overviews, see Bundesen and Habekost, 2008, 2014). They may thus provide suitable measures to capture effects of event monitoring on visual attention.
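To make the role of the perceptual threshold and the processing rate concrete, the following sketch computes the probability that a single object has been encoded by the end of a given exposure. It assumes exponentially distributed processing times, as is common in TVA-based modeling, and it ignores the VSTM capacity limit *K*. The function name and the numerical values are hypothetical and are not estimates from the present study.

```python
import math


def encoding_probability(exposure_s, rate_hz, t0_s):
    """Probability that one object is encoded by the end of the exposure.

    Below the threshold t0 nothing is encoded; above it, the effective
    exposure (exposure - t0) drives an exponential race with the given rate.
    """
    if exposure_s <= t0_s:
        return 0.0
    return 1.0 - math.exp(-rate_hz * (exposure_s - t0_s))


# Illustrative values only: a rate of 15 items/s and a threshold of 20 ms.
p_short = encoding_probability(0.010, rate_hz=15.0, t0_s=0.020)
p_long = encoding_probability(0.200, rate_hz=15.0, t0_s=0.020)
```

Under these assumptions, a lower processing rate flattens the rise of the curve with exposure duration, whereas a higher threshold shifts the whole curve to longer durations.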

Whether or not monitoring affects visual attention components might be especially important in situations where, concurrently to monitoring, another task has to be performed. In the above sketched real-life situation, an anesthesiologist might have to prepare for detecting changes in visual displays of a patient's vital signs while performing other tasks, such as the administration of anesthetics. Such situations may be reminiscent of tasks of event-based prospective memory. In such tasks, the intention to respond to an external event is formed and maintained over a retention interval while another ongoing task is performed, and then enacted when the event actually occurs (for reviews, see Ellis and Kvavilashvili, 2000; Burgess et al., 2011). During this retention interval, performance in the ongoing task seems impaired (e.g., Smith, 2003). In the prospective memory literature, this finding of task interference was interpreted as evidence for an overlap of attentional processing resources recruited by both the ongoing and the event-based prospective memory task. Consequently, preparatory attentional processes were proposed, which monitor the environment for events to support event-based prospective memory (Smith, 2003; Smith and Bayen, 2004). It was implicitly assumed that only one form of attentional resource exists that has to be shared between tasks (for a critical discussion, see Allport, 1987; Neumann, 1987). In this view, tasks may either consume this resource or be "automatic" without resource consumption (for a critical discussion, see Neumann, 1984). Using TVA-based assessment, this suggestion of sharing one general attentional resource can be replaced by assuming several independent attentional resource components such as VSTM storage capacity, attentional weighting and top-down selectivity.

One factor that may be critical for effects of monitoring for visual events on visual attention may be the perceptual salience of the events. Salience plays a central role in all three of the mentioned research domains. Firstly, performance in detecting rare events seems not to decline with time-on-task when events are highly salient (Helton and Warm, 2008). Secondly, visual stimuli of high salience can capture visual attention, that is, can intrinsically call for a prioritization in processing over low-salience stimuli (e.g., Theeuwes, 2004). Thirdly, the attentional demands made by event-based prospective memory tasks are thought to be lower in case of highly salient events (e.g., McDaniel and Einstein, 2000; Hicks et al., 2005). Therefore, the effects of monitoring for visual events on visual attention could vary depending on the expected perceptual salience of the events. For instance, monitoring for highly salient events might rely more on the expected ability of the events to capture attention. This may lead to weaker effects on visual attention in a concurrent ongoing task, compared to monitoring for events of lower salience.

The aims of the present study were to investigate, first, how monitoring for visual events affects the distinct TVA-based attentional components in a concurrent ongoing task and, second, whether these effects vary with the expected salience of the events. The experiment consisted of two overlapping tasks. First, participants performed an ongoing letter-report task enabling the TVA-based estimation of the attentional components (cf. Bundesen, 1990; Vangkilde et al., 2011). Second, event monitoring was required while engaging in the ongoing letter-report task. Events were brief luminance increases of a central fixation cross. Each participant performed three conditions. In the *low-* and *high-salient event conditions*, participants monitored for and responded to events. Events were of lower salience in the former than in the latter condition. In the *control condition*, events were as salient as in the low-salient event condition, but participants were not required to monitor for them or to respond to them. Importantly, we attempted to capture only effects on the attentional components that were due to the preparatory monitoring for events. Therefore, only trials containing neither events nor related responses were analyzed. Moreover, we aimed at keeping the extent to which events were monitored constant for each of the two event conditions by regularly giving feedback about detection performance. This feedback (cf. Ariga and Lleras, 2011) and possibly the structure of the letter-report task should have prevented time-on-task effects, such as vigilance decrements (McAvinue et al., 2012; see also Matthias et al., 2010). In addition to our first experiment (Experiment A), we conducted a replication experiment comprising identical experimental conditions (Experiment B).

How might event monitoring affect the TVA-based attentional components? First, monitoring for a visual event could involve the maintenance of a representation of this event in VSTM to enable its use to define the attentional set for the task (cf. Olivers et al., 2011). This requirement should have lowered the storage capacity available to maintaining letters of the letter-report task. In this case, the estimated VSTM capacity should have been lower in the two event conditions compared to the control condition where monitoring was not necessary. In addition, if the salience of events affected the need for such a representation, then the estimated VSTM capacity should also differ between the two event conditions.

Second, the event-based prospective memory literature suggests that monitoring should consume unspecific and capacity-limited attentional resources (Smith, 2003; Smith and Bayen, 2004). In the framework provided by TVA (Bundesen, 1990), specific attentional resources such as visual processing speed are assumed. In the present experiments, more visual processing resources should be assigned to process the fixation cross when it is monitored for events compared to when it is not. Under such circumstances, fewer resources should be available for processing the letters of the letter-report task. Therefore, the processing speed of the letters should be reduced in the event conditions compared to the control condition. Moreover, events with a higher salience could intrinsically call for visual processing resources (e.g., Theeuwes, 2004), which might lessen the amount of resources that must be reserved for monitoring potential events. In the high-salient event condition, this should lead to more available resources for processing the letters and thus to a higher processing speed compared to the low-salient event condition.

In addition to the TVA-based components of VSTM capacity and visual processing speed, we explored potential effects of monitoring on the threshold of conscious perception, the top-down controlled selectivity, and the laterality of attentional weighting and examined the potential role of the salience of events.

To preview the results, the monitoring manipulation affected the processing speed as predicted and, additionally, the laterality of attentional weighting. However, no effects of monitoring were observed with respect to VSTM capacity, the threshold of conscious perception, and the top-down controlled attentional selectivity.

### **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Twenty students (13 females, 7 males) from the University of Copenhagen, Denmark took part in Experiment A, receiving a shopping voucher for participation. Four participants stated being left-handed, 16 being right-handed. Their ages ranged from 20.28 to 34.26 years, with a mean age of 24.92 (SD = 2.91).

Nineteen students (13 females, 6 males) from Bielefeld University, Germany, participated in Experiment B, some of whom received course credit for participation. Three reported being left-handed, 16 right-handed. The participants were between 19.91 and 31.75 years old, with a mean age of 25.91 (SD = 3.15).

All participants of both experiments reported having normal or corrected-to-normal visual acuity as well as normal color vision. Written informed consent was obtained from all of them before participation.

#### **APPARATUS AND STIMULI**

Participants performed Experiment A in a dimly lit room at the University of Copenhagen, wearing disconnected headphones (SRH 440, Shure, Niles, IL, USA) to be shielded from sounds. Stimuli were displayed on a 21-inch or a 19-inch CRT monitor (G220f, ViewSonic, Walnut, CA, USA, or Flatron 915 FT plus, LG, Seoul, South Korea, respectively) running at 100 Hz with a resolution of 1024 × 768 pixels corresponding to physical dimensions of 40 (width) × 30 (height) cm or 36 (width) × 27 (height) cm, respectively. One participant performed the complete experiment with a screen refresh rate of 120 Hz. This participant's data were retained in the analysis because the applied modeling procedure took the refresh rate into account, so that estimations of the attentional components should not have been affected. The experiments were programmed and conducted using the E-Prime 2.0 software, ensuring accurate timing of the visual presentation (see Schneider et al., 2002). Participants viewed the screen from a distance of approximately 60 cm. Responses were recorded using a standard computer keyboard.

Stimuli were presented on a black background. A gray "plus" character written in Bell MT font with a size of 18 pt., corresponding to approximately 0.5° × 0.5° of visual angle, and with a luminance of 33.12 cd/m<sup>2</sup> was used as central fixation cross. Letter stimuli consisted of red (RGB: 253, 43, 43; 24.57 cd/m<sup>2</sup>) and blue (RGB: 43, 53, 255; 14.05 cd/m<sup>2</sup>) capital letters, written in Arial font with a size of 68 pt., corresponding to approximately 2.7° × 2.3° of visual angle. Letter masks were made of red and blue letter fragments and covered an area of 100 × 100 pixels. Events were luminance increases of the fixation cross to either 43.20 cd/m<sup>2</sup> in the low-salient event and control conditions or to 111.57 cd/m<sup>2</sup> in the high-salient event condition.

Experiment B took place in a dimly lit and sound-attenuated room at Bielefeld University. Stimuli were shown on a 19-inch CRT monitor (G90fB, ViewSonic, Walnut, CA, USA) with a resolution of 1024 × 768 pixels, corresponding to physical dimensions of 36 (width) × 27 (height) cm. Participants viewed the screen from a distance of 71 cm, while their head was stabilized by a chin-rest. The stimuli of Experiment B were identical to those of Experiment A, except for size and luminance. The dimensions of the central fixation cross were 0.4° × 0.4° of visual angle and its luminance was 42.00 cd/m<sup>2</sup>. Letter stimuli covered 2.3° × 1.9° of visual angle. The luminance of the red target letters was 26.00 cd/m<sup>2</sup>, and that of the blue distractor letters was 18.00 cd/m<sup>2</sup>. Events were luminance increases of the fixation cross to 51.00 cd/m<sup>2</sup> in the low-salient event and control conditions and to 85.00 cd/m<sup>2</sup> in the high-salient event condition.
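The relation between visual angle, viewing distance, and physical stimulus size can be sketched with standard trigonometry. The example below converts the reported letter widths (2.7° at approximately 60 cm in Experiment A, 2.3° at 71 cm in Experiment B) into on-screen extents. The function names are hypothetical, and the observation that the two physical widths come out similar is an inference from the reported numbers, not something stated in the text.

```python
import math


def extent_cm(angle_deg, distance_cm):
    """Physical size of a stimulus centered on the line of sight that
    subtends the given visual angle at the given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_deg(size_cm, distance_cm):
    """Inverse conversion: visual angle subtended by a centered stimulus."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Letter widths reported for the two experiments.
width_a_cm = extent_cm(2.7, 60.0)   # Experiment A, approximately 60 cm
width_b_cm = extent_cm(2.3, 71.0)   # Experiment B, 71 cm
```

Both widths evaluate to roughly 2.8 cm, which would be consistent with similar physical letter sizes viewed from different distances.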

**FIGURE 1 | Experimental paradigm. (A)** Course of an example trial. Participants viewed a fixation cross, followed by a letter display which was masked afterward. Then they were to report the presented letters. On 10% of all trials, events consisting of brief luminance increases of the fixation cross occurred. Participants had to respond to luminance increases in the low-salient event and high-salient event conditions, but not in the control condition. Luminance increases were greater in the high-salient event condition compared to the other two conditions. **(B)** Letter displays could be of three different types, enabling the estimation of attentional components (Vangkilde et al., 2011). **(C)** Across each condition, waiting periods to events approximated the non-aging geometric distribution.

### **PROCEDURE AND DESIGN**

The experimental task for measuring the TVA-based attentional components consisted of a modified version of the CombiTVA-paradigm, a letter-report task combining whole and partial report techniques (Vangkilde et al., 2011). **Figure 1A** illustrates the course of an example trial. Each trial began with the presentation of a central fixation cross for 1000 ms. Then a letter display was presented that could be one of three types (see **Figure 1B**). In whole report trials, either two or six red target letters were presented whereas partial report trials comprised two red target letters and four blue distractor letters. In each block, nine 2-letter and eighteen 6-letter whole report as well as nine partial report trials were administered in mixed order. Letters in each trial were chosen randomly without replacement from the set of all capital letters except I, Q, U, and W, which were omitted to reduce confusability between letters. Letter displays containing six target letters were presented for either 10, 20, 50, 80, 140, or 200 ms. All other letter displays were shown for 80 ms. Letters were presented at 45, 90, 135, 225, 270, and 315° of an imaginary circle with a radius of approximately 7.5° of visual angle (6.35° in Experiment B) around the fixation cross. The letter display was followed by a masking display shown for 500 ms, which always contained six letter masks at the six possible letter locations. Then a response screen was presented on which participants could enter letters using the keyboard. These letters were displayed until participants pressed the "enter"-key to start the next trial. The participants' task was to report only the red target letters, while ignoring the blue distractor letters. They were instructed to type in the letters (in arbitrary order) they were "fairly certain" of having seen, refraining from pure guessing. Moreover, they were instructed to aim at an accuracy of their letter reports between 80 and 90% (i.e., at error rates between 10 and 20%). The participant's accuracy level was displayed after each block.
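The construction of a letter display can be sketched in a few lines of Python. This is only an illustration of the sampling rule described above, not the experimental program that was actually used: the function name `make_display`, the random assignment of letters to positions, and the coordinate convention are assumptions made for the example.

```python
import math
import random
import string

# Letters eligible for display: all capitals except I, Q, U, and W (as in the text).
LETTERS = [c for c in string.ascii_uppercase if c not in "IQUW"]
# Six possible letter positions on the imaginary circle, in degrees.
ANGLES_DEG = [45, 90, 135, 225, 270, 315]
RADIUS_DEG = 7.5  # radius in degrees of visual angle (6.35 in Experiment B)


def make_display(n_targets, n_distractors, rng):
    """Return a list of (letter, color, x, y) tuples for one letter display.

    Hypothetical helper: how letters were assigned to positions is not
    stated in the text, so positions are simply drawn at random here.
    """
    n_items = n_targets + n_distractors
    letters = rng.sample(LETTERS, n_items)    # without replacement: no duplicates
    angles = rng.sample(ANGLES_DEG, n_items)  # assumption: random positions
    colors = ["red"] * n_targets + ["blue"] * n_distractors
    display = []
    for letter, color, angle in zip(letters, colors, angles):
        x = RADIUS_DEG * math.cos(math.radians(angle))
        y = RADIUS_DEG * math.sin(math.radians(angle))
        display.append((letter, color, round(x, 2), round(y, 2)))
    return display


rng = random.Random(1)
partial_report = make_display(n_targets=2, n_distractors=4, rng=rng)
whole_report_6 = make_display(n_targets=6, n_distractors=0, rng=rng)
```

Drawing with `random.Random.sample` guarantees that no letter appears twice within a display, which is the property the "without replacement" rule is meant to secure.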

In 10% of all trials, an event (i.e., luminance increase of the fixation cross) was presented for 100 ms. The event's onset could occur either during the sole presentation of the fixation cross, during the letter display, or during the first 400 ms of the masking display (i.e., the first, second, and third displays in **Figure 1A**). The following steps were taken to keep the temporal expectation of events constant throughout each condition of the experiment. The specific trials on which an event occurred in each condition were randomly determined for each participant. For these trials, the onset of the event was set to a frame in the refresh cycle of the screen that was randomly chosen as well. Only one event could appear in each of those trials. Simulations of 75 complete runs of one condition revealed that the waiting periods (measured in frames of 10 ms) to the occurrences of the first, second, etc., and last event each approximated geometric distributions as assessed by diagnostic histograms. Waiting periods across all occurrences of the events approximated a geometric distribution with a probability parameter of 0.00138401 (maximum-likelihood), corresponding to a mean waiting period of 7.23 s (723 frames of 10 ms). A histogram of these overall waiting periods is depicted in **Figure 1C**. Waiting times following non-aging probability distributions have been used previously to establish constant temporal expectations (Vangkilde et al., 2012).
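The relation between the reported probability parameter and the mean waiting period, and the non-aging property itself, can be checked with a short Python sketch. The per-frame probability and the frame duration are taken from the text; the function names and the sampling routine are illustrative assumptions and do not reproduce the simulation that was actually run.

```python
import random

P_EVENT = 0.00138401  # per-frame probability reported in the text
FRAME_S = 0.010       # one frame lasts 10 ms

# Mean of a geometric distribution counted in frames up to and including the event.
mean_frames = 1.0 / P_EVENT
mean_seconds = mean_frames * FRAME_S


def pmf(k, p=P_EVENT):
    """P(first event occurs on frame k), for k = 1, 2, ..."""
    return (1.0 - p) ** (k - 1) * p


def hazard(k, p=P_EVENT):
    """P(event on frame k | no event on frames 1..k-1)."""
    no_event_yet = (1.0 - p) ** (k - 1)
    return pmf(k, p) / no_event_yet


def sample_wait(rng, p=P_EVENT):
    """Draw one waiting period (in frames) by repeated Bernoulli trials."""
    frames = 1
    while rng.random() >= p:
        frames += 1
    return frames
```

Because the hazard is the same on every frame, having already waited for some time carries no information about when the next event will occur, which is why such distributions are suited to keeping temporal expectation constant.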

The monitoring task was manipulated as an independent variable in a within-subject design. Each participant performed the three different conditions, each comprising nine blocks of 36 trials, and 36 additional trials that were distributed across the nine blocks and each contained an event. In the low-salient event and high-salient event conditions, participants were required to press the space-bar as quickly as possible when an event was presented. Participants were told that it was very important not to overlook the events and they were instructed to refrain from reporting letters on these trials. The two event conditions differed only insofar as events in the high-salient event condition were greater luminance increases than in the low-salient event condition. After each block in these conditions, the number of events the participants correctly responded to and the number of missed events were displayed on the screen. In the control condition of the monitoring task, luminance increases of the fixation cross were presented that were identical to those of the low-salient event condition, but participants were instructed not to respond. They were, however, informed that irrelevant luminance increases of the fixation cross occurred.

Experiments A and B both took place in two sessions of approximately 90 min, on two days. In Experiment A, participants were examined either individually or in groups of up to five participants, separated by non-transparent curtains. In Experiment B, participants always performed the experiment alone. Each session comprised all three conditions. Prior to each condition, participants performed one block of training trials. The order in which the conditions were administered in the first session was reversed in the second, to control for effects of fatigue. The overall order of conditions was counterbalanced across the sample. Prior to the start of each condition, participants read written instructions on the screen and reported them to the experimenter in their own words. In case participants did not report the instructions correctly, the experimenter repeated the instructions verbally after which participants paraphrased them again.

### **DEPENDENT MEASURES**

Measures of monitoring performance were based on the hits (i.e., space-bar presses following events within the same trial) and false alarms (i.e., space-bar presses on trials without events). The nonparametric measures of signal detection, *A*′ and *B*″<sub>D</sub> (Donaldson, 1992), were used to quantify the participants' sensitivity in detecting events and their bias for reporting an event, respectively. Values for *A*′ range from 0.50 (chance level) to 1.00 (perfect sensitivity). Values for *B*″<sub>D</sub> range from −1.00 to 1.00. Negative values indicate a bias to respond that an event had been presented, a value of 0.00 indicates no bias, and positive values indicate a bias to respond that no event had been presented (Donaldson, 1992). In addition, reaction times (RTs; in ms) for correct responses to events were collected.
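For readers unfamiliar with these measures, the following Python sketch computes both from a hit rate and a false-alarm rate using the formulas commonly given for *A*′ and *B*″<sub>D</sub>. That exactly these formulas, and this handling of the degenerate case, were used in the present analyses is an assumption; the function names are hypothetical.

```python
def a_prime(hit_rate, fa_rate):
    """Nonparametric sensitivity A' (0.5 = chance, 1.0 = perfect)."""
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance performance: mirror image of the formula above.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


def b_double_prime_d(hit_rate, fa_rate):
    """Nonparametric bias B''_D (negative = bias toward 'event',
    positive = bias toward 'no event')."""
    h, f = hit_rate, fa_rate
    miss_times_cr = (1 - h) * (1 - f)  # P(miss) * P(correct rejection)
    hit_times_fa = h * f               # P(hit) * P(false alarm)
    denominator = miss_times_cr + hit_times_fa
    if denominator == 0:
        # Assumption: treat the undefined case (e.g., h = 1 and f = 0) as no bias.
        return 0.0
    return (miss_times_cr - hit_times_fa) / denominator
```

A participant who detects most events while almost never pressing the space-bar on trials without events will thus obtain a high *A*′ together with a positive *B*″<sub>D</sub>, which is the pattern reported below.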

Because participants were instructed to aim at an accuracy level of their letter reports between 80 and 90%, their error rates in the three conditions (the proportion of typed-in letters that were wrong) were collected as a control variable. The five TVA-based attentional components (see below) were estimated for each participant in each of the three conditions. These estimates were obtained using a maximum-likelihood fitting procedure based on the number of correctly reported target letters for the different trial types (i.e., 2 and 6-letter whole report and partial report trials) and presentation durations of the letter displays (an overview of how the attentional components are estimated from performance in the CombiTVA-paradigm is given by Vangkilde et al., 2011; for a detailed description of the fitting procedure, see Kyllingsbæk, 2006 and Dyrholm et al., 2011). This means that these estimations only took the letters of the letter-report task into account. Since performance in event detection was assessed separately, the fixation cross was not included in estimations of the attentional components. A model with 14 free parameters was fitted using the LIBTVA toolbox (Dyrholm et al., 2011; see also Habekost et al., 2014) for MATLAB® (R2011a, The Mathworks, Natick, MA, USA). The first three attentional components were estimated from the numbers of correctly reported target letters on 6-letter whole report trials.


(5) The fifth attentional component, *w*<sub>index</sub>, represents the laterality of attentional weighting across both visual hemifields, assessed as the sum of the attentional weights allocated to letters in the left visual hemifield in relation to the sum of all attentional weights in the entire visual field. Thus, *w*<sub>index</sub> values of 0.5 indicate no spatial bias, whereas values close to 0 indicate a right-sided bias and values close to 1 a left-sided bias. As mentioned above, estimations of the attentional components were based on only those trials on which no events had been presented and on which participants did not press the space-bar.
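The definition of the laterality index amounts to a simple ratio, sketched here in Python. The function name and the example weights are hypothetical and serve only to illustrate the direction of the index.

```python
def w_index(weights_left, weights_right):
    """Share of the total attentional weight allocated to the left hemifield."""
    left = sum(weights_left)
    total = left + sum(weights_right)
    return left / total


# Hypothetical attentional weights for three left and three right letter positions.
balanced = w_index([1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
left_biased = w_index([1.2, 1.2, 1.2], [0.8, 0.8, 0.8])
```

With equal weights on both sides the index is 0.5; shifting weight toward the left positions moves it above 0.5.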

### **RESULTS**

The significance criterion was set to *p* = 0.05 for the statistical analyses. Unless stated otherwise, differences between the three conditions were examined by using repeated-measures analyses of variance (ANOVAs) followed by *t*-tests for dependent samples. These *t*-tests were corrected for multiple comparisons using a modified Bonferroni adjustment following Keppel (1973). The significance criterion of *p* = 0.05 was multiplied by the quotient of the degrees of freedom of the examined effect and the number of comparisons. That is, the significance criterion was set to *p* = 0.05 × 2/3 = 0.033. Mauchly's test was used to test the assumption of sphericity of the ANOVAs. The Greenhouse-Geisser correction was applied to non-spherical data. Effect sizes are stated as η<sup>2</sup><sub>G</sub> for ANOVAs (Bakeman, 2005) and Cohen's *d*<sub>z</sub> (e.g., Faul et al., 2007) for *t*-tests.
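The adjusted criterion can be reproduced with the following Python sketch. The function name is hypothetical, and capping the result at the unadjusted criterion is an assumption that does not matter for the present case of three comparisons and two degrees of freedom.

```python
def adjusted_alpha(alpha, df_effect, n_comparisons):
    """Modified Bonferroni criterion: alpha times df of the effect,
    divided by the number of comparisons (never above alpha itself)."""
    return min(alpha, alpha * df_effect / n_comparisons)


# Three conditions: the main effect has 2 degrees of freedom,
# and there are 3 pairwise comparisons.
criterion = adjusted_alpha(0.05, df_effect=2, n_comparisons=3)


def is_significant(p_value, alpha=criterion):
    """Compare a post hoc p value against the adjusted criterion."""
    return p_value < alpha
```

Under this criterion, a *post hoc* comparison with *p* = 0.021 counts as significant, whereas one with *p* = 0.060 does not.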

In Experiment A, a few trials in each condition had to be discarded from the analyses because of technical problems (the presentation of duplicate letters due to an error in the experimental program). These trials occurred at random because the letters were chosen randomly (see above). The mean percentage of discarded trials amounted to 5.36% (SD = 1.35%) in the low-salient event condition, to 5.24% (SD = 1.53%) in the high-salient event condition, and to 5.49% (SD = 0.95%) in the control condition. No trials were discarded in Experiment B.

#### **MONITORING PERFORMANCE**

Since participants were not required to perform the event monitoring task in the control condition, monitoring performance was only analyzed for the low- and high-salient event conditions. The majority of participants did not respond to any events in the control condition. However, two participants in Experiment A each responded to one of the 36 events and one pressed the space-bar on two trials with no events in the control condition. In Experiment B, one participant responded once to an event and another once pressed the space-bar on a trial with no event in the control condition.

Because monitoring performance was modeled nonparametrically, the two event conditions were compared using the non-parametric Wilcoxon signed-rank test for which *r* is reported as effect size (e.g., Field et al., 2012). In Experiment A, the participants' sensitivities for detecting events, as assessed by *A*′, were significantly lower in the low-salient event (*Mdn* = 0.85, *Min* = 0.65, *Max* = 0.92) than in the high-salient event condition (*Mdn* = 0.95, *Min* = 0.87, *Max* = 1.00), *z* = −4.76, *p* < 0.001, *r* = −0.75. This was replicated in Experiment B (low-salient event condition: *Mdn* = 0.87, *Min* = 0.75, *Max* = 0.94; high-salient event condition: *Mdn* = 0.95, *Min* = 0.90, *Max* = 0.98; *z* = −4.62, *p* < 0.001, *r* = −0.75).

The participants' bias values for *B*″<sub>D</sub> were positive in both conditions in Experiment A. This indicates that they were biased for deciding that no event had been presented. Biases were stronger in the low-salient event (*Mdn* = 0.98, *Min* = 0.82, *Max* = 1.00) than in the high-salient event condition (*Mdn* = 0.92, *Min* = 0.39, *Max* = 0.99), *z* = 4.13, *p* < 0.001, *r* = 0.65. Again, this was replicated in Experiment B (low-salient event condition: *Mdn* = 0.98, *Min* = 0.73, *Max* = 1.00; high-salient event condition: *Mdn* = 0.94, *Min* = 0.74, *Max* = 0.99; *z* = 2.98, *p* = 0.002, *r* = 0.48).

In Experiment A, RTs for correct responses to events were numerically longer in the low-salient event (*M* = 666 ms, SD = 139) than in the high-salient event condition (*M* = 606, SD = 128) and this difference approached significance, *t*(19) = 2.00, *p* = 0.060, *d*<sub>z</sub> = 0.45. In Experiment B, RTs were significantly longer in the low-salient event (*M* = 664, SD = 157) than in the high-salient event condition (*M* = 568, SD = 101), *t*(18) = 2.66, *p* = 0.016, *d*<sub>z</sub> = 0.61. Previous research found RTs for categorizing stimuli to be longer when available categories were of higher similarity (Cartwright, 1941). Here, in the low-salient event condition, luminance increases of the fixation cross (i.e., events) were more similar to the fixation cross of displays without events compared to the high-salient event condition. Therefore, the RT results are in line with the lower sensitivities (*A*′) in the low-salient compared to the high-salient event condition.
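The effect sizes reported for monitoring performance can be recovered from the test statistics with the standard conversions sketched below in Python. Two points are assumptions rather than statements from the text: that *r* was computed as *z* divided by the square root of the total number of observations (two per participant), and that the sample sizes were 20 in Experiment A and 19 in Experiment B, as implied by the degrees of freedom of the *t*-tests.

```python
import math


def wilcoxon_r(z, n_observations):
    """Effect size r for a Wilcoxon signed-rank test: z / sqrt(N),
    with N the total number of observations across both conditions."""
    return z / math.sqrt(n_observations)


def cohens_dz(t, n_pairs):
    """Cohen's d_z for a dependent-samples t-test: t / sqrt(n pairs)."""
    return t / math.sqrt(n_pairs)


# Sensitivity, Experiment A: z = -4.76 with 20 participants x 2 conditions.
r_sensitivity_a = wilcoxon_r(-4.76, 2 * 20)
# Bias, Experiment B: z = 2.98 with 19 participants x 2 conditions.
r_bias_b = wilcoxon_r(2.98, 2 * 19)
# RTs: t(19) = 2.00 in Experiment A and t(18) = 2.66 in Experiment B.
dz_rt_a = cohens_dz(2.00, 20)
dz_rt_b = cohens_dz(2.66, 19)
```

Rounded to two decimals, these conversions agree with the values of *r* and Cohen's *d*<sub>z</sub> stated above, which supports the assumed sample sizes.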

#### **ERROR RATES AND ATTENTIONAL COMPONENTS**

Mixed ANOVAs with Experiment as between-factor did not reveal any interactions between Experiment and any of the three conditions regarding error rates and attentional components (all *F*s < 1.71, all *p*s > 0.189).

In Experiment A, the participants' error rates differed significantly between the three conditions, *F*(2,38) = 6.76, *p* = 0.003, η<sup>2</sup><sub>G</sub> = 0.066. Error rates were higher in the low-salient event (*M* = 0.18, SD = 0.08) than in the control condition (*M* = 0.13, SD = 0.08), *t*(19) = 4.35, *p* < 0.001, *d*<sub>z</sub> = 0.97. Likewise, error rates were higher in the high-salient event (*M* = 0.16, SD = 0.08) than in the control condition, *t*(19) = 2.52, *p* = 0.021, *d*<sub>z</sub> = 0.56. No significant differences were observed between the error rates of the low- and high-salient event conditions, *t*(19) = 0.94, *p* = 0.361, *d*<sub>z</sub> = 0.21. In Experiment B, differences between the participants' error rates in the three conditions did not reach significance, *F*(1.47,26.48) = 3.14, *p* = 0.073, η<sup>2</sup><sub>G</sub> = 0.022. Nevertheless, *post hoc* analyses were conducted because this effect might be regarded as close to significance. As in Experiment A, error rates were higher in the low-salient event (*M* = 0.15, SD = 0.09) than in the control condition (*M* = 0.12, SD = 0.06), but this difference also did not reach significance, *t*(18) = 1.91, *p* = 0.073, *d*<sub>z</sub> = 0.72. Different from Experiment A, error rates in the low-salient event condition were significantly higher than in the high-salient event condition (*M* = 0.12, SD = 0.08), *t*(18) = 3.13, *p* = 0.006, *d*<sub>z</sub> = 0.72, whereas there were no significant differences between the high-salient event and the control condition, *t*(18) = 0.23, *p* = 0.821, *d*<sub>z</sub> = 0.05. It should be noted that these differences in error rates between the conditions could not have compromised the observed differences in processing speed, *C*, between the conditions (see below), because the two sets of differences were always in the same direction rather than revealing trade-off effects.

#### **Table 1 | Descriptive statistics of attentional components.**

Means (M) and standard deviations (SD) of attentional components in the three conditions of both experiments. Described are the threshold of conscious perception, *t*<sub>0</sub> (ms), the capacity of visual short-term memory, *K* (number of letters), the processing speed, *C* (letters/s), the top-down controlled attentional selectivity, α (0 indicates perfect selection, 1 indicates non-selectivity), and the laterality of attentional weighting, *w*<sub>index</sub> (a value near 0 indicates a bias to the right, a value near 1 indicates a bias to the left visual hemifield).

**Table 1** provides descriptive statistics for the attentional components in the three conditions of both experiments. Results of ANOVAs of the attentional components are shown in **Table 2**. Results of the accompanying *post hoc* analyses can be found in **Table 3**.

The thresholds of conscious perception, *t*<sub>0</sub>, did not significantly differ between the conditions of the two experiments.

Likewise, VSTM capacity, *K*, did not significantly differ between the conditions of the two experiments.

In contrast, visual processing speed, *C*, differed between the conditions in both experiments. In Experiment A, participants on average processed 25.70 letters/s less in the low-salient event than in the control condition. This was replicated in Experiment B, where participants processed on average 28.35 letters/s less in the low-salient event than in the control condition. They processed on average 16.02 letters/s less in the high-salient event compared to the control condition in Experiment A. Consistently, Experiment B yielded a difference in processing speed of 15.44 letters/s. In addition, participants processed on average 9.68 letters/s less in the low-salient event than in the high-salient event condition in Experiment A. This was again replicated in Experiment B, where they processed on average 12.91 letters/s less in the low-salient event compared to the high-salient event condition.

In both experiments, there were no significant differences between the top-down controlled attentional selectivity, α, in the three conditions.

Moreover, differences between the laterality of attentional weighting, *w*<sub>index</sub>, were observed in both experiments. In Experiment A, values for *w*<sub>index</sub> were significantly higher in the low-salient event compared to the control condition, indicating a spatial attentional bias to the left visual hemifield in the former relative to the latter condition. Likewise, values for *w*<sub>index</sub> were significantly higher in the high-salient event than in the control condition. There were no significant differences regarding the values for *w*<sub>index</sub> between the low- and high-salient event conditions. In Experiment B, values for *w*<sub>index</sub> were also higher in the low-salient event compared to the control condition, but this difference did not reach the adjusted significance criterion. In contrast to Experiment A, however, the high-salient event and the control condition did not differ significantly regarding *w*<sub>index</sub>. Moreover and again different from Experiment A, values for *w*<sub>index</sub> were higher in the low-salient event compared to the high-salient event condition, although this difference only approached the adjusted significance criterion. To follow up on these results, the data from both experiments were collapsed. For the collapsed data, *w*<sub>index</sub> differed between conditions, *F*(2,76) = 9.94, *p* < 0.001, η<sup>2</sup><sub>G</sub> = 0.023. Values for *w*<sub>index</sub> were significantly higher in the low-salient event (*M* = 0.54, SD = 0.13) than in the control (*M* = 0.49, SD = 0.11), *t*(38) = 3.88, *p* < 0.001, *d*<sub>z</sub> = 0.62, and the high-salient event condition (*M* = 0.51, SD = 0.13), *t*(38) = 2.29, *p* = 0.028, *d*<sub>z</sub> = 0.37. In addition, values were significantly higher in the high-salient event condition compared to the control condition, *t*(38) = 2.55, *p* < 0.015, *d*<sub>z</sub> = 0.41.

**Table 2 | Analyses of variance of the attentional components for the three conditions.**

Repeated-measures analyses of variance of the following attentional components of both experiments: the threshold of conscious perception, *t*<sub>0</sub>; the capacity of visual short-term memory, *K*; the processing speed, *C*; the top-down controlled attentional selectivity, α; and the laterality of attentional weighting, *w*<sub>index</sub>. Stated are *F* values, degrees of freedom for their numerator (df<sub>n</sub>) and for their denominator (df<sub>d</sub>), *p* values, and η<sup>2</sup><sub>G</sub> as effect size.

**Table 3 | Pairwise comparisons between attentional components of the three conditions.**

Post hoc tests for analyses of variance (*Table 2*) with significant effects. Analyzed were the processing speed, *C* (letters/s) and the laterality of attentional weighting, *w*<sub>index</sub>. Stated are *t* values with degrees of freedom (df), *p* values, and Cohen's *d*<sub>z</sub> as effect size. Comparisons were conducted with *t*-tests for dependent samples which were corrected for multiple comparisons using a modified Bonferroni adjustment (Keppel, 1973). \*p < 0.033 (adjusted significance criterion).

### **DISCUSSION**

In two experiments with identical conditions, we asked whether monitoring affects distinct TVA-based components of visual attention that were estimated from performance in a concurrent task (see Bundesen, 1990; Vangkilde et al., 2011). More specifically, monitoring of a specific object in the environment for events (luminance increases) was required. The key question was how such event monitoring affected attentional components in a concurrently performed letter-report task. This was investigated by comparing the attentional components when monitoring for events was required (low-salient event condition) and when it was not required (control condition). We were furthermore interested in whether the effects of such monitoring vary with the expected degree of perceptual salience of the event. To address this, a third condition was included (high-salient event condition), in which monitoring was required and events were of higher salience as compared to the low-salient event condition. In both experiments, events were detected more frequently (and responded to faster) in the high-salient event than in the low-salient event condition, implying that events in the former were in fact more salient than in the latter. Also, participants showed a greater bias for reporting the presence of an event when it was more salient.

The results show differential effects of event monitoring on the five TVA-based attentional components.

First, in both experiments, the temporal thresholds of conscious perception in the letter-report task did not vary depending on whether monitoring was required or not, or on whether events were of a higher salience. Monitoring thus did not affect the time necessary for visual processing to start in the letter-report task, or to initiate the race to VSTM, respectively (cf. Bundesen, 1990).

Second, storage capacity of VSTM was approximately equal in all conditions, in each of the two experiments. It hence appears as if the number of VSTM slots (cf. Bundesen, 1990; Luck and Vogel, 1997) that were available to the letters of the letter-report task did not depend on whether monitoring was necessary or not, or on whether events were more salient. In the present study, monitoring required an attentional set that defined the fixation cross as a relevant source of events. A previous study suggests that sustaining an attentional set to perform a visual monitoring task and maintaining information in VSTM can interfere in some circumstances (Helton and Russell, 2013). This is compatible with the idea that activation-based maintenance in VSTM relies on neuronal resources necessary for encoding new visual information (Petersen et al., 2012; Schneider, 2013). The present results do not indicate such interference. Crucially, the monitored object, the fixation cross, did not change across experimental trials in either monitoring condition. Based on previous evidence (for a review, see Woodman et al., 2013), one could assume that therefore the attentional set used to monitor the object was retained in long-term memory and not in VSTM. This should have decreased the costs imposed on VSTM (Woodman et al., 2013). This attentional set was only necessary in the two conditions that required monitoring but not in the control condition. Since VSTM capacity was comparable in all three conditions, it seems unlikely that the attentional set constantly occupied a slot in VSTM.

Third, strong reductions in the processing speed of the letters of the letter-report task were observed when the fixation cross was monitored compared to when it was not monitored for events. This result was obtained in both experiments, and irrespective of whether events were more or less salient. Monitoring thus decreased the portion of overall visual processing capacity that could be used for processing the letters. The idea that attentional monitoring processes call for processing resources that are needed by concurrent, ongoing tasks was also put forward in the literature on event-based prospective memory (Smith, 2003). In this research field, however, theories mainly focused on how intentions are retrieved from memory under certain trigger conditions (McDaniel and Einstein, 2000; Smith, 2003; Einstein and McDaniel, 2005). Therefore, they did not attempt to specify the mechanisms that may underlie monitoring the environment for events. For the visual domain, this might be done using TVA (Bundesen, 1990). For the present application, the effects on the processing speed can be explained by TVA's filtering mechanism. When the fixation cross was monitored for events, it was more relevant to the task and hence received a higher attentional weight compared to when monitoring was unnecessary. From the viewpoint of TVA's neural interpretation, NTVA (Bundesen et al., 2005), more neuronal processing resources should have been allocated to the fixation cross due to its higher attentional weight. Consequently, fewer neuronal resources should have been available for processing the letters of the letter-report task, leading to a lower processing speed. Distributing neuronal processing resources with NTVA's filtering mechanism may thus provide a means to monitor the environment for visual events. In other words, we suggest that monitoring may be accomplished via attentional weights.
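The logic of this account can be made concrete with a deliberately simplified Python sketch. It assumes, in line with TVA's rate equation, that a fixed total processing capacity is divided among the objects in the visual field in proportion to their attentional weights. All numbers (total capacity, weights) are hypothetical and chosen for illustration only; they are not estimates from the present data.

```python
def letter_processing_speed(total_capacity, letter_weights, fixation_weight):
    """Portion of the total capacity (letters/s) that goes to the letters
    when capacity is shared in proportion to attentional weights."""
    letter_weight_sum = sum(letter_weights)
    total_weight = letter_weight_sum + fixation_weight
    return total_capacity * letter_weight_sum / total_weight


TOTAL_CAPACITY = 100.0      # hypothetical overall capacity in letters/s
LETTER_WEIGHTS = [1.0] * 6  # six equally weighted target letters

# Hypothetical weights of the fixation cross in the three conditions.
speed_control = letter_processing_speed(TOTAL_CAPACITY, LETTER_WEIGHTS, 0.2)
speed_high_salient = letter_processing_speed(TOTAL_CAPACITY, LETTER_WEIGHTS, 1.0)
speed_low_salient = letter_processing_speed(TOTAL_CAPACITY, LETTER_WEIGHTS, 2.0)
```

In this sketch, raising the weight of the fixation cross lowers the capacity left for the letters, so the ordering of the three speeds (control, then high-salient, then low-salient) mirrors the ordering of the estimated values of *C* without any change in the total capacity.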

Interestingly, the letters in the letter-report task were processed faster when the events' salience was expected to be higher rather than lower. This result was again obtained in both experiments. A recent TVA-based study revealed that salience makes an additional contribution to attentional weights, independent of the task-defined relevance of object features (Nordfang et al., 2013). In the present study, the attentional weight of the fixation cross might have been adjusted to the expected salience of the events. More salient events could have allowed participants to rely more on the events' intrinsic ability to capture attention (e.g., Theeuwes, 2004), thereby saving visual processing resources for the letter-report task. Nevertheless, the visual processing speed in the letter-report task was still reduced when monitoring was targeted at salient events compared to when it was unnecessary. The attentional weight of the fixation cross still seemed increased as a consequence of monitoring, suggesting "monitoring via attentional weights." In sum, monitoring for salient events imposes weaker costs on visual processing speed in a concurrent task. This is partly in line with the idea that increasing the salience of events might reduce interference between an event-based prospective memory task and an ongoing task (McDaniel and Einstein, 2000). In contrast to this account, however, capturing interference effects by an estimation of visual processing speed does not imply a unitary attentional resource (cf. Allport, 1987; Neumann, 1987). Instead, it focuses on one specific attentional resource of visual processing (cf. Bundesen, 1990).

Fourth, in neither of the two experiments was the top-down controlled attentional selectivity reliably affected by the monitoring manipulation. The up-regulation of the attentional weight of the fixation cross when it was monitored for events (evident from the effects on visual processing speed, see above), thus did not seem to affect the relation between the attentional weights of target and distractor letters within the letter-report task. In the present experiments, only one object at one location in the visual field had to be monitored for events, and it remained visible throughout the relevant periods of a trial. This might resemble situations in everyday life where it is clear which location has to be monitored to detect events of interest. A question for future research may be whether this holds also for situations of location uncertainty regarding the monitored information source. For instance, adopting a less strict selectivity within a concurrent task may be advantageous in some cases. It could facilitate the detection of events at previously unknown locations which are irrelevant to the concurrent task but which are supposed to trigger retrieval of an intention from prospective memory (cf. Vangkilde et al., 2011).

Fifth, the laterality of attentional weighting seemed affected by the monitoring manipulation. Stronger biases to the left visual hemifield were observed when less-salient events were monitored for. In Experiment A, a left-sided bias was also observed when events were more salient compared to the condition without monitoring requirements. In Experiment B, no such bias was observed in this condition. When the data were collapsed across experiments, stronger biases to the left hemifield became apparent in the low-salient compared to the other two conditions as well as in the high-salient event compared to the control condition. Because the two experiments did not agree in all comparisons, these results await further replication and investigation. Monitoring requirements might have led to an increase in the experienced task-difficulty, and this increase might have been stronger for low-salient than for high-salient events. This interpretation corresponds to the higher error rates in the two event conditions, which were known to participants because they received feedback. Consequently, the participants' level of alertness may have been higher in the two event conditions and highest in the low-salient event condition. There is some evidence that alerting participants by means of warning signals can introduce left-sided biases of attentional weighting (Matthias et al., 2010). A possible interpretation of the present results may thus be that monitoring requirements can also increase the participants' alertness level, and thereby provoke shifts of attentional weighting to the left hemifield.

Moreover, the finding that event monitoring affected only two out of five attentional components is not compatible with the assumption of unspecific attentional resource costs (cf. Smith, 2003; Smith and Bayen, 2004).

As mentioned in the introduction, classical studies on monitoring were part of research on vigilance (Mackworth, 1948) or sustained attention (e.g., Sarter et al., 2001). The experimental task of the present study required sustained attention in the sense of attentional sets that had to be persistently maintained across trials. These attentional sets consisted of task-based priorities for target and distractor letters of the letter-report task and for the fixation cross, which was monitored for events or not. Monitoring should have affected these attentional sets. That is, it should have affected the processing priorities (e.g., attentional weights of objects) within the task. Outside the monitoring context, however, sustained attention may also mean to engage in a cognitive task (cf. McAvinue et al., 2012; Hancock, 2013). The maintained set of processing priorities would here encompass the entire task in question, as opposed to other tasks or "absentmindedness" (cf. Manly et al., 1999). On this level of analysis, sustaining attention to a task and TVA's attentional components (threshold of conscious perception, VSTM capacity, visual processing speed, and top-down controlled attentional selectivity; estimated from performance within this task) do not appear to be interrelated (McAvinue et al., 2012).

In summary, the present study suggests that monitoring prespecified objects in the environment for visual events reduces the speed of visual processing in a concurrent task. In such situations, "monitoring-via-attentional weights" could take place. That is, the monitored object receives an increased attentional weight and visual processing resources are redistributed accordingly. Therefore, the amount of processing resources available for the concurrent task and the corresponding visual processing speed are reduced. Moreover, monitoring for high-salient events seems to impose weaker costs on the concurrent task than monitoring for low-salient events. By adjusting attentional weights to the expected high salience of events, that is, by relying more on the events' stimulus-driven ability to attract attention, it seems possible to save resources for the concurrent task. There is a variety of safety-critical settings where pre-specified locations or objects have to be monitored for certain events while performing other visual tasks. Performance in such settings may benefit from applying the results of the present study. Increasing the salience of events that are to be monitored for may be used to mitigate detrimental effects of monitoring on concurrent tasks. In the example above, anesthesiologists who monitor displays of a patient's vital signs while accomplishing other tasks may perform better in these tasks when aberrations in these displays can be expected to be highly salient.

#### **ACKNOWLEDGMENTS**

Portions of this work were presented at the 56th "Tagung Experimentell Arbeitender Psychologen" (TEAP) 2014, Giessen, Germany, at the Conference "Competitive visual processing across space and time: interactions with memory," 2014, Center for Interdisciplinary Research (ZiF), Bielefeld University, Bielefeld, Germany, and at the 13th Meeting of the Vision Sciences Society 2014, St. Pete Beach, FL, USA. The study was funded by the DFG, Cluster of Excellence 277 "Cognitive Interaction Technology (CITEC)." We acknowledge support for the Article Processing Charge by the Deutsche Forschungsgemeinschaft and the Open Access Publication Funds of Bielefeld University Library.

#### **REFERENCES**




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 April 2014; accepted: 05 August 2014; published online: 21 August 2014. Citation: Poth CH, Petersen A, Bundesen C and Schneider WX (2014) Effects of monitoring for visual events on distinct components of attention. Front. Psychol. 5:930. doi: 10.3389/fpsyg.2014.00930*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Poth, Petersen, Bundesen and Schneider. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Automatic attraction of visual attention by supraletter features of former target strings

### *Søren Kyllingsbæk<sup>1</sup>\*, Sven Van Lommel<sup>2</sup>, Thomas A. Sørensen<sup>3</sup> and Claus Bundesen<sup>1</sup>*

<sup>1</sup> Department of Psychology, Center for Visual Cognition, University of Copenhagen, Copenhagen, Denmark

<sup>2</sup> Katholieke Universiteit Leuven – University of Leuven, Leuven, Belgium

<sup>3</sup> Department of Communication and Psychology, Aalborg University, Aalborg, Denmark

#### *Edited by:*

Signe Allerup Vangkilde, University of Copenhagen, Denmark

#### *Reviewed by:*

Edmund Wascher, Leibniz Research Centre for Working Environment and Human Factors, Germany

Jan Brascamp, Utrecht University, Netherlands

#### *\*Correspondence:*

Søren Kyllingsbæk, Department of Psychology, Center for Visual Cognition, University of Copenhagen, Øster Farimagsgade 2A, DK-1353 Copenhagen K, Denmark e-mail: sk@psy.ku.dk

Observers were trained to search for a particular horizontal string of three capital letters presented among similar strings consisting of exactly the same letters in different orders. The training was followed by a test in which the observers searched for a new target that was identical to one of the former distractors. The new distractor set consisted of the remaining former distractors plus the former target. On each trial, three letter strings were displayed, which included the target string with a probability of 0.5. In Experiment 1, the strings were centered at different locations on the circumference of an imaginary circle around the fixation point. The training phase of Experiment 2 was similar, but in the test phase of the experiment, the strings were located in a vertical array centered on fixation, and in target-present arrays, the target always appeared at fixation. In both experiments, performance (d') degraded on trials in which former targets were present, suggesting that the former targets automatically drew processing resources away from the current targets. Apparently, the two experiments showed automatic attraction of visual attention by supraletter features of former target strings.

**Keywords: attention, visual search, capture, visual perception, letters**

### **INTRODUCTION**

Following the lead of Shiffrin and Schneider (1977, Experiment 4d), Kyllingsbæk et al. (2001) explored the extent to which visual features of alphanumeric characters gain in pertinence (propensity to attract attention to characters with the given features) by prolonged and consistent training in visual search for characters with these features. In a simple and instructive experiment (Kyllingsbæk et al., 2001, Experiment 4), six different (types of) letters (H, N, L, T, X, and Z) were used as stimuli. For each participant, one of the six different letters served as the target throughout the training phase, while the other five letters served as distractors. On each trial, a circular array of letters was presented briefly, followed by a pattern mask. The participant's task was to indicate whether the target letter appeared in the array. No time pressure was imposed on the response. Training sessions were run during four successive days. On the fifth day of the experiment, the target and distractor sets were redefined. One of the five letters that had been used as distractors during the training was selected to be the new target. The new distractor set consisted of the four remaining former distractors plus the former target. The presentation of the former target, instead of a former distractor, caused a decrement in *d'* averaging 0.15 units (*breakthrough* effect). Apparently, the former target letter automatically drew processing resources away from the current target.

The results of Shiffrin and Schneider (1977) and Kyllingsbæk et al. (2001) suggest that visual attention can be attracted by shapes as complex as those of individual alphanumeric characters. As noted by Kyllingsbæk et al. (2001), other evidence seems to suggest that the initial allocation of attention to items in a visual display is insensitive to words of four letters or more. Bundesen et al. (1997) presented observers with briefly exposed visual displays of words, which were common first names with a length of four to six letters. In the primary experiment, each display consisted of four words: two names shown in red and two shown in white. The observer's task was to report the red names (targets), but ignore the white ones (distractors). On some trials the observer's own name appeared as a display item (target or distractor). Presentation of the observer's name as a distractor caused no more interference with report of targets than did presentation of other names as distractors. Apparently, visual attention was not automatically attracted by the observer's own name. By contrast, a supplementary single-stimulus identification experiment showed that observers were more accurate in reading their own name than in reading other names (for a similar finding, see Shapiro et al., 1997).

If a visual 4-letter word could attract attention automatically, we would expect the attention of an observer with a 4-letter name to be attracted automatically by his or her own name (see Moray, 1959). As suggested by Bundesen et al. (1997), the contrast between findings with single letters and digits and findings with short words may be explained by assuming that visual attention can be attracted by individual alphanumeric characters, but not by shapes as complex as those of 4-letter words. To further explore this issue, we conducted two new experiments investigating attentional effects of prolonged search for strings of three letters. We chose 3-letter strings rather than 4-letter strings to decrease the complexity of the stimuli and thus increase the likelihood that they would be able to attract attention after training. The letter strings all contained the same three letters and could only be distinguished from each other by considering the ordering of the three letters comprising each string (supraletter features).

#### **EXPERIMENT 1**

In Experiment 1, the participants trained for 2 days searching for a pre-designated 3-letter target-string presented among similar strings consisting of the same letters in different orders. On the third day the task was changed to one of searching for one of the former distractors while ignoring the former target string.

The stimulus material was designed so that it was impossible to discriminate any of the stimulus strings from all of the remaining ones by considering only simple features of individual letters or identities of the individual letters making up the strings. The stimuli were defined as all the possible ordered combinations of the letters *E*, *L*, and *O*, which yielded six 3-letter strings: *ELO, EOL, LEO, LOE, OEL,* and *OLE*. The only way in which any of the six strings could be discriminated from the rest of the stimuli was by considering the ordering of the three letters comprising the string.
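The construction of the stimulus set can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the experiment; the names `LETTERS`, `STRINGS`, and `same_letters` are assumptions introduced here.

```python
from itertools import permutations

# The three letters from which every stimulus string is built.
LETTERS = "ELO"

# All orderings of the three letters (3! = 6 strings).
STRINGS = ["".join(p) for p in permutations(LETTERS)]

# Every string contains exactly the same letters, so letter identity
# alone cannot separate any one string from the others.
same_letters = all(sorted(s) == sorted(LETTERS) for s in STRINGS)

print(STRINGS)
print(same_letters)
```

Because the six strings differ only in letter order, any cue that separates one of them from all the others must involve more than one letter position.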

Three of the stimuli were common Danish first names (*ELO, LEO,* and *OLE*), whereas the rest of the stimuli were non-words in Danish. Propensity to attract attention may develop more easily for familiar stimuli such as letters or words (see Czerwinski et al., 1992) than for less familiar stimuli such as non-words. Our stimulus material made it easy to test this possibility.

#### **METHOD**

#### *Participants*

Five students (all females) from the University of Copenhagen participated in the experiment. Each participant was paid DKK 100 (\$14) per hour. The ages of the participants ranged between 18 and 25 years. All participants had normal or corrected-to-normal visual acuity. The experiment was approved by the local ethical committee of the Department of Psychology, University of Copenhagen.

#### *Stimuli*

Six letter strings (*ELO, EOL, LEO, LOE, OEL,* and *OLE*) were used as stimulus material. Each stimulus frame contained eight possible stimulus positions (N, NE, E, SE, S, SW, W, NW) on the circumference of an imaginary circle centered on fixation. Each stimulus display contained three stimuli, which were distributed randomly across the eight positions. The distance from the center of a letter string to a small white fixation cross at the center of the screen was 40 mm (1.9°). The width and height of the letter strings were 18 mm (0.9°) and 8 mm (0.4°), respectively. All stimuli were presented in white on a black background at a viewing distance of 1.2 m.
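The visual angles quoted above follow from the physical sizes and the viewing distance. The sketch below shows the conversion; the function name `visual_angle_deg` is an assumption introduced for illustration, and the simple arctangent form is used because the extents are small relative to the viewing distance.

```python
import math

def visual_angle_deg(size_mm, distance_mm):
    """Angle in degrees subtended by an extent of size_mm at distance_mm."""
    return math.degrees(math.atan(size_mm / distance_mm))

VIEWING_DISTANCE_MM = 1200.0  # 1.2 m, as stated in the text

eccentricity = visual_angle_deg(40, VIEWING_DISTANCE_MM)  # fixation to string center
width = visual_angle_deg(18, VIEWING_DISTANCE_MM)         # string width
height = visual_angle_deg(8, VIEWING_DISTANCE_MM)         # string height

print(round(eccentricity, 1), round(width, 1), round(height, 1))
```

Rounded to one decimal place, the three values agree with the 1.9°, 0.9°, and 0.4° reported in the text.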

#### *General procedure*

The experiments were run on a CRT controlled by a PC. The participants were seated in a semi-darkened room 1.2 m from the screen. The participant started each trial by first fixating the fixation cross and when ready pressing a key, which immediately released a brief 200-ms exposure of the stimulus frame. The stimulus frame was immediately succeeded by a 500-ms exposure of a frame with eight masks, one at each of the eight possible stimulus positions (see **Figure 1**). The participant's task was to indicate whether a pre-designated target was present in the stimulus frame. Participants responded *present* by pressing the right key and *absent* by pressing the left key of a response box. A short warning sound was given as feedback when an error was made.

#### *Training*

For each participant, one of the six strings served as the target throughout the training phase, while the other five letter strings served as distractors. On each trial, the target appeared in the display with a probability of 0.5.

One session consisted of 2,000 trials (100 blocks of 20 trials each) and took about 2 h. For each trial, three strings were presented and the distractors were randomly drawn without replacement from the set of five distractor strings. Two training sessions were run during two successive days.

#### *Test*

On the third day of the experiment, target and distractor sets were redefined. One of the five strings that had been used as distractors during the training was selected to be the new target. The new distractor set consisted of the four remaining former distractors plus the former target string. One test session was run with the new target and distractor sets. The former target appeared (once per display) in one half of the stimulus displays. Except as noted, the procedure during the test phase was the same as during the training. Thus, the probability that the former target appeared in a stimulus display was exactly the same as the probability that any other particular member of the new distractor set appeared in the display.
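The equality of presentation probabilities can be checked with exact arithmetic. The sketch below assumes that distractors are drawn uniformly without replacement from the five-member distractor set, as described for the training phase; the names are introduced here for illustration only.

```python
from fractions import Fraction

P_TARGET = Fraction(1, 2)  # target-present probability (from the text)
DISPLAY_SIZE = 3           # strings per display (from the text)
N_DISTRACTORS = 5          # size of the distractor set (from the text)

def p_distractor_shown(n_slots):
    """Chance that one particular distractor is among n_slots strings drawn
    uniformly without replacement from the distractor set (assumption)."""
    return Fraction(n_slots, N_DISTRACTORS)

# Target-present displays leave two slots for distractors;
# target-absent displays leave three.
p_any_distractor = (P_TARGET * p_distractor_shown(DISPLAY_SIZE - 1)
                    + (1 - P_TARGET) * p_distractor_shown(DISPLAY_SIZE))

print(p_any_distractor)
```

Under these assumptions each distractor appears in one half of the displays, which matches the stated presentation rate of the former target.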

#### **RESULTS**

The error rates were analyzed by use of *signal-detection theory* (Green and Swets, 1966) to disentangle variations in sensitivity (measured by parameter *d'*) from variations in response bias (measured by the natural logarithm of parameter β). Learning curves for each participant are shown with respect to both sensitivity (**Figure 2**, Panel A) and bias (**Figure 2**, Panel B). The data were split into subblocks of 500 trials each and the following analyses were also done with this division of the data. A linear regression analysis across participants showed a significant increase in sensitivity during the training period [*F*(1,3) = 16.72, *p* < 0.05]<sup>1</sup>. The rate of increase in *d'* averaged 0.11 units per subblock. The linear trend in log β as a function of session number did not reach significance [*F*(1,3) = 3.21, *p* = 0.17].

<sup>1</sup>The analyses of the training sessions were based on only the data from Participants 2–5, because the data from the last subblock was lost for Participant 1.

**FIGURE 2 |** […] **(A)** shows variations in sensitivity (d'), and **(B)** shows variations in bias (log β). As can be seen from the graphs for Participant 1 (P1), the data from the last subblock was lost for this participant.

The effects of former targets on sensitivity and bias in the test phase are illustrated in **Figure 3**. We computed sensitivity and bias values for the two conditions by first separating trials where the former target was present and absent, respectively. We then computed hit and false-alarm rates within the two sets of trials and from these sensitivity and bias values for the two conditions. As can be seen in Panel A of **Figure 3**, sensitivity was lower when the former target was present than when the former target was absent. The effect of the former target was significant [*t*(4) = 2.97, *p* < 0.05] and present in all five participants. The decrement in *d'* averaged 0.19 units, range 0.06–0.43. The effect on bias bordered on significance [*t*(4) = 2.03, *p* = 0.06] suggesting that participants may have been more conservative in the training phase compared to the test phase (see **Figure 3B**). When testing if the effect on *d'* depended on whether the former target was a word or a non-word, we found no significant difference [*t*(3) = 1.12, *p* = 0.35] (see also **Table 1**). Of course, with only five participants, the null result may be a Type II error due to lack of power (but see Experiment 2).

**FIGURE 3 |** […] present (FTP), respectively. **(A)** shows variations in sensitivity (d') across participants, and **(B)** shows variations in bias (log β).
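For readers unfamiliar with the measures, the following sketch shows how *d'* and log β are conventionally obtained from hit and false-alarm rates under the equal-variance Gaussian model of signal-detection theory. It is not the authors' analysis code; the function name `sdt_measures` is an assumption, and the rates in the usage lines are hypothetical values, not data from the experiment.

```python
from statistics import NormalDist

_Z = NormalDist().inv_cdf  # inverse of the standard normal CDF

def sdt_measures(hit_rate, fa_rate):
    """Return (d', ln beta) for the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1.
    """
    z_hit, z_fa = _Z(hit_rate), _Z(fa_rate)
    d_prime = z_hit - z_fa                    # sensitivity
    ln_beta = (z_fa ** 2 - z_hit ** 2) / 2.0  # log likelihood ratio at criterion
    return d_prime, ln_beta

# Hypothetical rates for illustration only; NOT data from the study.
d_prime, ln_beta = sdt_measures(hit_rate=0.80, fa_rate=0.20)
print(round(d_prime, 2))
```

Applying the same function separately to trials with and without the former target gives the two pairs of values that are compared in the test-phase analysis.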

#### **DISCUSSION**

The decrement in sensitivity observed when the former target was presented as a distractor extended the findings reported by Shiffrin and Schneider (1977) and Kyllingsbæk et al. (2001). The decrement we found in *d'* (0.19 units) was comparable in magnitude to the decrement (0.25 units) found by Kyllingsbæk et al. (2001, Experiment 4) in a study of search for a single letter target in displays of three letters. We found no evidence for differential effects of presentation of former targets depending on whether these were words or non-words.

The main finding from Experiment 1 was that the breakthrough of former targets demonstrated by Shiffrin and Schneider (1977) and Kyllingsbæk et al. (2001) for individual alphanumeric characters could be obtained for former targets that were 3-letter strings defined neither by visual features of individual letters, nor by the global shapes of individual letters, but, apparently, by features that reflected the ordering of the letters in the target string: supraletter visual features.

#### **EXPERIMENT 2**

The results of Experiment 1 might be interpreted not as a result of automatic attraction of attention by the former 3-letter target, but rather as attention getting stuck at the former target when accidentally encountered, assuming, for example, that attention is allocated to the display items in a random order. To test this hypothesis, we fixed the location of the target in the test phase of Experiment 2.

The training phase of Experiment 2 was identical to the one used in Experiment 1. However, in the test phase the display setup was changed so that the target could be selected by location. Instead of a circular search display with varying stimulus locations, the three display elements were always located in a vertical column centered at fixation. Further, the new target string always appeared at the central location if present (known by the participants), whereas the former target never appeared at the central location (not known by the participants). If participants were able to ignore the former target by attending exclusively to the string presented at the central location, we should expect no effect of presentations of the former target.

#### **METHOD**

#### *Participants*

Eight students (four females and four males) from the University of Copenhagen participated in the experiment. Each participant was paid DKK 100 (\$14) per hour. The ages of the participants ranged between 17 and 29 years. All participants had normal or corrected-to-normal visual acuity. The experiment was approved by the local ethical committee of the Department of Psychology, University of Copenhagen.

#### *Stimuli*

The stimulus material was the same as the one used in Experiment 1. Only the stimulus frame during the test phase was different. In the test phase of Experiment 2, the three letter strings were positioned in a vertical column centered at fixation (see **Figure 4**). The center-to-center distance between the strings was 12 mm (0.6°).

**Table 1 | Targets during training and test for participants in Experiments 1 and 2.**


#### *Procedure*

The procedure during the training phase was identical to that of Experiment 1. In the test phase, however, the participants were instructed to attend exclusively to the middle one of the three strings presented (i.e., the string presented at fixation). Participants were told that the new target string would always appear at the central location if present in the display. Participants thus had a clear incentive to attend to the stimulus at the central location and ignore the two flanking distractor strings. Further, the former target string never appeared at the central location. Participants were not made aware of this fact and none reported having noticed it when questioned after the end of the experiment. The exposure duration was calibrated before the start of the test phase for each participant to prevent ceiling and floor effects. The exposure duration ranged between 40 and 80 ms across the eight participants.

#### *Design*

Again, one of the six letter strings was designated as target for each participant and two sessions of 2,000 trials each were run as training. As in Experiment 1, the test phase comprised 2,000 trials.

Because of the constraint that the former target could not appear at the central location during the test phase, the new target had to be selected from a particular subset of the five distractor strings from the training phase in order to prevent participants from using a strategy whereby the new target string could be identified by looking for only one of the letters in the string. For example, if the former target was *OLE*, the new target was either *EOL* or *LEO*. That is, the new target was one of the two strings in which neither *O* appeared as the first letter (i.e., as in *OEL*), nor *L* appeared as the second letter (i.e., as in *ELO*), nor *E* appeared as the third letter (i.e., as in *LOE*). If any of the strings *OEL*, *ELO*, or *LOE* had been chosen as new target, participants would have been able to identify the new target at the central location by looking for an *O* at the first position in the string, an *L* at the second, or an *E* at the third, respectively.
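The selection rule amounts to choosing an ordering that shares no letter position with the former target. A minimal sketch, with the hypothetical helper name `eligible_new_targets`, reproduces the example given in the text:

```python
from itertools import permutations

def eligible_new_targets(former_target):
    """Orderings of the same letters that share no letter position with
    the former target."""
    return [
        "".join(p)
        for p in permutations(sorted(former_target))
        if all(a != b for a, b in zip(p, former_target))
    ]

print(eligible_new_targets("OLE"))
```

For the former target *OLE* this yields *EOL* and *LEO*, the two candidates named above. Every other ordering shares one position with *OLE*, so with *OLE* barred from the central location a single letter at that position would have identified the new target.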

#### **RESULTS**

**Figure 5** shows learning curves for sensitivity (Panel A) and bias (Panel B). As in Experiment 1, there was a strong and significant linear trend for sensitivity [*F*(1,7) = 41.36, *p* < 0.001], but no significant trend for bias (*F* < 1). The average rate of increase in *d'* was 0.11 units per subblock of 500 trials.

The effect of former targets on sensitivity and bias in the test phase is shown in **Figure 6**. Panel A shows that *d'* was again lower when the former target was present compared to trials in which it was absent [*t*(7) = 1.94, *p* < 0.05]. The effect was observed in seven out of the eight participants. The decrement in *d'* averaged 0.03 units. The effect on bias did not reach significance [*t*(7) = −1.01, *p* = 0.35]. Again, we found no effect on *d'* of whether the former target was a word or a non-word [*t*(6) = −0.423, *p* = 0.69] (see also **Table 1**).

**FIGURE 6 |** […] separately shown for trials in which the former target was absent (FTA) and present (FTP), respectively. **(A)** shows variations in sensitivity (d') across participants, and **(B)** shows variations in bias (log β).

#### **DISCUSSION**

In Experiment 2, uncertainty concerning the possible target location was reduced to a minimum by using a fixed location centered at fixation. The former target never appeared at this location, but only at the two flanking locations. As in Experiment 1, presentation of the former target impeded detection of a simultaneously presented current target. The decrement in *d'* found in Experiment 2 (0.03 units) was smaller than the decrement found in Experiment 1 (0.19 units), but still statistically significant.

### **GENERAL DISCUSSION**

Experiments 1 and 2 provided clear evidence of automatic attraction of visual attention by supraletter features of letter strings following prolonged and consistent practice in search for these targets. Each experiment replicated the breakthrough effect of Shiffrin and Schneider (1977, Experiment 4d) and Kyllingsbæk et al. (2001) with a stimulus ensemble consisting of 3-letter strings that were constructed in such a way that it was impossible to determine whether a string was a target or a distractor by testing for either features of individual letters or presence of particular individual letters within the string. Thus, because the former targets and other distractors consisted of exactly the same letters, our findings suggest that supraletter visual features that reflected the ordering of the letters in the targets gained pertinence (propensity to attract attention to objects with the given features) during the training (see, e.g., Bundesen, 1990; Nordfang et al., 2013).

The nature and complexity of the supraletter visual features in question are still a matter of speculation. Most obviously, having the shape of the 3-letter target string (e.g., ELO) as a whole may be one supraletter visual feature that gained pertinence and, accordingly, enhanced the attentional weight of the target during training. However, supraletter visual features need not be complex. Containing a particular bigram (ordered pair of letters such as EL or LO) within the target string, or containing a bigram with particular features, is a simpler supraletter visual feature that also may have gained pertinence and, thereby, enhanced the attentional weight of the target during training (see Dehaene et al., 2005, for a proposal for a neural code for written words in which bigrams, including "open bigrams," have a pivotal role). Indeed, the supraletter visual features that gained pertinence could in principle have been any features of multiletter units that were useful in discriminating the target string of letters from the distractor strings.

The results found by Shiffrin and Schneider (1977) have had a strong impact on the development of general theories of attention (see, e.g., Duncan, 1980; Treisman, 1988; Duncan and Humphreys, 1989; Bundesen, 1990; van der Heijden, 1992; Wolfe, 1994; Lavie, 1995; Schneider, 1995, 1999). Proponents of late selection theories of attention (e.g., Shiffrin and Schneider, 1977) have argued that if a particular type of stimuli automatically attracts attention, recognition of this type of stimuli must be possible preattentively and in parallel across all objects in the visual field (see also Kyllingsbæk and Bundesen, 2007). A weaker and safer claim is that if a particular type of stimuli automatically attracts attention, retrieval of evidence that stimuli belong to the type in question must be possible preattentively and in parallel across the visual field. Thus, the results of the present experiments suggest that simultaneously presented visual stimuli defined by supraletter features can be compared in parallel against representations in visual long-term memory.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 August 2014; accepted: 12 November 2014; published online: 27 November 2014.*

*Citation: Kyllingsbæk S, Van Lommel S, Sørensen TA and Bundesen C (2014) Automatic attraction of visual attention by supraletter features of former target strings. Front. Psychol. 5:1383. doi: 10.3389/fpsyg.2014.01383*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Kyllingsbæk, Van Lommel, Sørensen and Bundesen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Cognitive programs: software for attention's executive

### *John K. Tsotsos<sup>1</sup>\* and Wouter Kruijne<sup>2</sup>*

<sup>1</sup> Department of Electrical Engineering and Computer Science and Centre for Vision Research, York University, Toronto, ON, Canada

<sup>2</sup> Department of Cognitive Psychology, Vrije Universiteit, Amsterdam, Netherlands

#### *Edited by:*

Søren Kyllingsbæk, University of Copenhagen, Denmark

#### *Reviewed by:*

Ulrich Ansorge, University of Vienna, Austria

Claus Bundesen, University of Copenhagen, Denmark

#### *\*Correspondence:*

John K. Tsotsos, Department of Electrical Engineering and Computer Science, York University, 4700 Keele St., Toronto, ON M3J 1P3, Canada e-mail: tsotsos@cse.yorku.ca

What are the computational tasks that an executive controller for visual attention must solve? This question is posed in the context of the Selective Tuning model of attention. The range of required computations goes beyond top-down bias signals or region-of-interest determinations, and must deal with overt and covert fixations, process timing and synchronization, information routing, memory, matching control to task, spatial localization, priming, and coordination of bottom-up with top-down information. During task execution, results must be monitored to ensure the expected results. This description includes the kinds of elements that are common in the control of any kind of complex machine or system. We seek a mechanistic integration of the above, in other words, algorithms that accomplish control. Such algorithms operate on representations, transforming a representation of one kind into another, which then forms the input to yet another algorithm. Cognitive Programs (CPs) are hypothesized to capture exactly such representational transformations via stepwise sequences of operations. CPs, an updated and modernized offspring of Ullman's Visual Routines, impose an algorithmic structure on the set of attentional functions and play a role in the overall shaping of attentional modulation of the visual system so that it provides its best performance. This requires that we consider the visual system as a dynamic, yet general-purpose processor tuned to the task and input of the moment. This differs dramatically from the almost universal cognitive and computational views, which regard vision as a passively observing module to which simple questions about percepts can be posed, regardless of task. Differing from Visual Routines, CPs explicitly involve the critical elements of Visual Task Executive (vTE), Visual Attention Executive (vAE), and Visual Working Memory (vWM). Cognitive Programs provide the software that directs the actions of the Selective Tuning model of visual attention.

**Keywords: visual attention, executive, visual routines, working memory, selective tuning**

### **INTRODUCTION**

The corpus of theories and models of visual attention has grown rapidly over the past two decades (see Itti et al., 2005; Rothenstein and Tsotsos, 2008; Nobre and Kastner, 2013). It has become difficult to keep track of these and even more difficult to compare and contrast them with respect to their effectiveness at explaining known phenomena and predicting new ones. Surprisingly, few have attempted to go beyond the creation of saliency maps or re-creation of single cell response profiles. Larger efforts aimed at connecting visual attention with its executive controller or with real-world tasks such as recognition, motor behavior or visual reasoning, are not common. Such a larger scale effort is precisely our long-term goal and a first step will be proposed. The key question addressed is: What are the computational tasks that an executive controller for visual attention must solve? The answer to this question would play a major role in any cognitive architecture. Unfortunately, the previous literature on large-scale cognitive frameworks does not provide much guidance as can be seen from the following synopsis. A great review can be found in Varma (2011).

Dehaene and Changeux (2011), in an excellent review paper, point out that "Posner (Posner and Snyder, 1975; Posner and Rothbart, 1998) and Shallice (Shallice, 1972, 1988; Norman and Shallice, 1980) first proposed that information is conscious when it is represented in an *executive attention* or *supervisory attentional* system that controls the activities of lower-level sensory-motor routines and is associated with prefrontal cortex. In other words, a chain of sensory, semantic, and motor processors can unfold without our awareness, [....] but conscious perception seems needed for the flexible control of their execution, such as their onset, termination, inhibition, repetition, or serial chaining." This viewpoint puts our effort squarely on the same path as those that address consciousness; however, we will stop short of making this link. Our focus is to develop this supervisor for attention so that it is functionally able to provide a testable implementation that uses real images. It can be considered in larger roles as Dehaene and Changeux suggest but we reserve further discussion on this for future work. It might be that by defining concrete mechanisms for executive attention, contributions to our understanding of consciousness will emerge.

Two of the best-known cognitive architectures are SOAR (Laird et al., 1987) and ACT-R (Anderson and Lebiere, 1998). Within SOAR, designed to provide the underlying structure that would enable a system to perform the full range of cognitive tasks, an attentional component named NOVA was defined (Wiesmeyer and Laird, 1990). Attention is claimed to precede identification, is a deliberate act mediated by an ATTEND operator, and functions as a gradient-based, zoom lens of oval shape that separates figure from ground. Attended features move on to recognition. This reflects an "early selection" conceptualization (Broadbent, 1958). ACT-R, designed with the same goals as SOAR, defines perceptual-motor modules that take care of the interface with the environment. Perception operates in a purely bottom-up manner and is assumed to have the function of parsing the visual scene into objects and their features. Attention then is used to select objects and recognize them in a manner that combines a spotlight model with search guidance. The firing of production rules controls shifts of attention. This model also reflects an early selection strategy.

More recently, massive neuronal network simulations have become possible, in no small part due to increased computing power and large engineering feats. Zylberberg et al. (2010) develop a large-scale neural system that embodies attention in the form of a router whose job is to set up the precise mapping between sensory stimuli and motor representations, capable of flexibly interconnecting processors and rapidly changing its configuration from one task to another. This captures the information routing part of the problem, but does not include the dynamic nature of attentive single neuron modulations. Eliasmith et al. (2012) describe another large-scale neural model, impressive for its ability to generalize performance across several tasks. The entire vision component is modeled using a Restricted Boltzmann Machine as an auto-encoder, but attention is not used. The major brain areas included in the model are modeled using abstract functional characterizations and are structured, for the most part, in a feed-forward processing pipeline. Each of these, and in fact most major proposals, views the visual system as a passively observing, data-driven classifier of some sort, exactly the kind of computational system that Marr had envisioned (Marr, 1982) but not of the kind indicated by modern neurobiology. Specifically, the enormous extent of inter-connectivity and feedback connections within the brain (Markov et al., 2014) seems to elude modeling attempts and their function remains a major unknown.

On the neuroscience side, there have been three recent attempts to capture the essence of top-down control for visual attention; executive control has been of interest for some time (e.g., Yantis, 1998; Corbetta and Shulman, 2002; Rossi et al., 2009, and many others). They also provide steps toward understanding the role of the feedback connections. In Baluch and Itti (2011), a map of brain areas and their top-down attentive connections is presented. They nicely overview a number of attentional mechanisms but it is odd that no top-down attentional influences are included for areas V1, V2, and LGN, areas where attentional modulation has been observed (e.g., O'Connor et al., 2002). By contrast, Miller and Buschman (2013) describe several pathways for top-down attention, all originating in frontal cortex and influencing areas LIP, MT, V4, V2, V1. They provide a good picture of top-down attentional connections but not much on exactly how this influence is determined and executed. Finally, Raffone et al. (2014) take an additional important step by defining a visual attentional workspace consisting of areas FEF, LIP and the pulvinar, this workspace being supported by a global workspace in LPFC. This more complex structure likely comes closest to what we seek, but Raffone et al. do not provide mechanistic explanations as to how this concert of areas operates in a coordinated fashion. Such work subscribes to the philosophy that by combining experimental observations, one can develop an understanding without detailing workable mechanisms. Our perspective is exactly the opposite: we will proceed by trying to determine what needs to be solved first (see Marr's computational level of analysis; Marr, 1982) and how those solutions may come about (Marr's algorithmic and representational level). Experimental observations play the role of constraining the set of possible solutions (see Tsotsos, 2014).

Nevertheless, the past work reviewed above has value for our efforts. We notice that each of the above works embodies the idea that there exists a sequence of representational transformations required to take an input stimulus and transform it into a representation of objects, events or features that are in the right form to enable solution of a task. This concept is key and points us to Ullman's Visual Routines (VRs). Ullman presented a strategy for how human vision might extract shape and spatial relations (Ullman, 1984). Its key elements included:


Why is the visual routine a useful concept? The key requirement for a solution to our goal is an approach that is centered on the visual representations important for the completion of perceptual tasks, and transformations between representations that traverse the path from task specification to stimulus presentation to task completion. VRs depend on visual representations and represent algorithms for how one representation is transformed to another toward the overall task satisfaction and as such present us with a starting point for our goal. We will generalize this concept beyond its utility for shape and spatial relation computation, but first look at previous developments of VRs.

A number of researchers have pursued the visual routines concept. Johnson (1993) and McCallum (1996) looked into how VRs may be learned, using genetic programming and reinforcement learning. Horswill (1995) developed a system that performs visual search to answer queries in a blocks world. He included a set of task-specific weights to compute a saliency map, a set of markers that hold the centroids of regions, and a return inhibition map that masks out regions that should not be selected. Brunnström et al. (1996) propose an active approach including an attentional mechanism and selective fixation. They define VRs that can rapidly acquire information to detect, localize and characterize features. Ballard et al. (1997) emphasize the need for an attentive "pointing device" in visual reasoning. Rao's (1998) primitive VR operations are: shift of focus of attention; operations for establishing properties at the focus; location of interest selection. These enable VRs for many visuospatial tasks. Ballard and Hayhoe (2009) describe a gaze control model for event sequence recognition. They highlight problems with saliency map methods for task-based gaze control. VRs also found utility in practical domains: control of humanoids (Sprague and Ballard, 2001); autonomous driving (Salgian and Ballard, 1998); natural language interpretation and motor control (Horswill, 1995); control of a robot camera system (Clark and Ferrier, 1988).

Neurobiologists have also embraced VRs. Roelfsema et al. (2000, 2003) and Roelfsema (2005) have provided neurophysiologic support. They discovered neurons in motor cortex selective for movement sequences. They also monitored the progression of a sequence by recording activity of neurons in early visual cortex, associating elemental operations with changes in neuron response. They thus suggested an enhanced set of VRs: visual search, cuing, trace, region filling, association, working memory, suppression, matching, and motor acts. This work forms a nice stepping-stone onto the path we will take.

However, almost everything has changed in our knowledge of vision and attention since Ullman described visual routines in 1984 and this necessitates at least an update of its conceptualization. We know that attention is more complex than region-of-interest selection for gaze change. It also involves top-down priming of early visual computations and feedback processing, imposes a suppressive surround around attended items to ignore background clutter, and modulates individual neurons to optimize them for the task at hand, both before the stimulus is presented and during its perception. Attentive modulation can change the operating characteristics of single neurons virtually everywhere in the visual cortex (see Itti et al., 2005; Carrasco, 2011; Nobre and Kastner, 2013). Moreover, we know the time course of attentive effects differs depending on task; attentional effects are seen *after* Marr's (1982) limit of 160 ms. Further, we now know there are no independent modules, as Marr believed, because most neurons are sensitive to more than one visual modality/feature. We also know that the feedforward pass of the visual cortex has limits on what can and cannot be processed. It is not the case that this feedforward pass, as Marr had thought, suffices to compute a complete base representation on which any additional reasoning can take place. If anything, that feedforward pass is only the beginning of the act of perception (Tsotsos, 1990, 2011; Tsotsos et al., 2008). The view that is becoming more accepted is that vision is a dynamic process. For example, Di Lollo et al. (2000) conclude that mismatches between the reentrant visual representations and the ongoing lower level activity lead to iterative reentrant processing. Lamme and Roelfsema (2000) provide a more general view of this idea with motivations from neurophysiology. They show the activity of cortical neurons is not determined by this feedforward sweep alone. 
Horizontal connections within areas, and higher areas providing feedback, result in dynamic changes in tuning. The feedforward sweep rapidly groups feature constellations that are hardwired in the visual brain, and in many cases, recurrent processing is necessary before the features of an object are attentively grouped. Cichy et al. (2014) provide a comprehensive view of object recognition during the first 500 ms of processing showing that early visual representations (while the stimulus is still on) develop over time and are transient while higher level representations (with greater temporal duration than the stimulus) and various categorical distinctions emerge with different and staggered latencies. Rather than being purely stimulus-driven, visual representations interact through recurrent signals to infer meaning (Mur and Kriegeskorte, 2014). As a result, the vision system is far more complex than Ullman had considered and the control issues become critical.

### **WHAT DO COGNITIVE PROGRAMS HAVE TO CONTROL?**

Our original question was "What are the computational tasks that an executive controller for visual attention must solve?" and we posed it in the context of the Selective Tuning (ST) model. ST functionality includes not only the often-seen top-down bias signals or region-of-interest determinations, but also overt and covert fixation change, parameter determinations, information routing, localization, priming, and coordination of bottom-up with top-down information. Elements that seem necessary but not currently within ST include representations of task, short-term memory, and task execution. During task execution, results must be monitored to ensure the expected results are obtained. Similar elements are common in the control of any kind of complex system and typically, such tasks are represented within algorithms designed to accomplish control. Such algorithms operate on representations, transforming a representation of one kind into another, which then forms the input to yet another algorithm. Cognitive Programs are hypothesized to capture exactly such representational transformations via stepwise sequences of operations. Cognitive Programs (CPs), an updated and modernized offspring of Ullman's seminal Visual Routines, provide an algorithmic structure to the set of attentional functions and play a role in the overall shaping of attentional modulation of the visual system so that it provides its best performance. We consider the visual system as a dynamic, yet general-purpose processor, tuned to the task and input of the moment. This differs dramatically from what is most common in previous theories of cognition and current computational vision, which regard vision as a passively observing module to which simple questions about percepts can be posed, with the tacit assumption that this suffices for any task. Just and Varma (2007) make exactly the same point after reviewing how recent brain imaging results impact the design of complex cognitive systems. 
It is important to note that for the balance of this presentation, the motivation for the components of Cognitive Programs arises exclusively from the functional needs of the ST attentional process in its expanded role of tuning the visual system for a given task.

First, let us make the notion of a Cognitive Program more concrete. Ullman defined his visual routines as sequences of elemental operations as described earlier in this paper, and he distinguished universal routines from "regular" ones. Here, Cognitive Programs will be of two types also, but the similarity ends there. The first type is termed *methods*, and whereas Ullman suggested that universal routines can be usefully applied to any scene to provide some initial analysis, and transform input into a representation that is then amenable to the regular kind of routine, here methods cannot be applied without some degree of adaptation to task and/or input (including sensor) characteristics of the moment. For example, a CP method that encodes how to perform visual search needs a specification of the target being sought. It might also be tuned to overall light levels, any context information available, and so on, all useful information for tuning the attentive behavior of the system.

*Scripts* are the executable versions of tuned methods and can be used directly to provide the necessary information to initiate, tune, and control visual processing. Here, all CPs do more than transform one representation into another. They may also encode decision-making elements and set control signals in addition to sequencing representational transformations. The elemental operations differ from Ullman's VRs as well. CPs are composed of accesses to memory (both read and write), yes-no decision points decided by the execution of particular functions, and determination of control signal settings. CPs can be formed by the composition of other CPs. Whereas Ullman's VRs included high-level actions such as boundary tracing as elemental operations, here, tracing will be composed of more primitive elements and will result in a CP of its own. This is simply one example of how CPs may be considered as a more fine-grained version of VRs.

**discrimination task.** The traversal of the graph from start to end provides the algorithm required to execute a discrimination task. The kinds of operations involve several instances of moving information from one place to another (in red), executing a process (in green), making a selection (in blue), or setting parameters (in orange). In words, this algorithm has the following steps: (i) the visual task executive receives the task specifications; (ii) using those specifications, the relevant methods are retrieved from the methods long-term memory; (iii) the most appropriate method is chosen and tuned into an executable script; (iv) the script is then executed, first activating in parallel the communication of the task information to the visual attention executive and initiating the attentive cycle; (v) the visual hierarchy is primed using task information (where possible) and in parallel attention is disengaged from the previous focus; (vi) the visual attention executive sets the parameters for executing the competition for selecting the focus of attention; (vii) disengage attention involves inhibiting the previously attended pathways and any previously applied surround suppression is also lifted (note that steps v–vii are executed in parallel before the visual stimulus appears, to the extent possible); (viii) the stimulus flows through the tuned visual hierarchy in a feed-forward manner; (ix) the central focus of attention is selected at the top of the visual hierarchy; the central focus of attention is communicated to the visual task executive that then matches it to the task requirements; (x) if the selected focus and the task requirements match, the task is complete.

A sample CP may help clarify their form. **Figure 1** shows a simple CP method, one intended for the visual task of discrimination. Acronyms and some components are not fully defined until a subsequent section; this example is given in order to show only the form of CPs and the kinds of elemental operations that will come into play. Discrimination, following Macmillan and Creelman (2005), is defined as a task where a yes-no response is required on viewing a display with a stimulus drawn from one of two classes, and where one class may be noise. As can be seen from the figure, the kinds of operations involve several instances of moving information from one place to another (in red), executing a process (in green), making a selection (in blue), or setting parameters (in orange). The first step is for the visual task executive (vTE) to receive the specification of the task (from an unspecified source external to this model). The details of the task can be used as indices into the methods long-term memory (mLTM) in order to select and fetch the most appropriate method. This implies that the memory itself is organized in an associative manner that reflects key task elements. The chosen method is then tuned using the task specification and becomes an executable script. The script initiates the attentive cycle, and sends the elements of the task that are required for attentive tuning to the visual attention executive (vAE). 
The vAE then primes the visual hierarchy (VH) with the appropriate top-down signals that reflect expectations of the stimulus (e.g., the display will consist of a ring of 8 items) or instructions to the subject (e.g., search for the green item) and also sets any parameters needed for stimulus competition for attention. How is it possible to communicate a "cue" to a subject? One way is to simply show the cue; it would be processed by exactly the same system, attended, and the resulting output representation (later termed *attentional sample*) used as the basis for priming the system for the upcoming stimulus. While priming is occurring, attention is also being disengaged from its previous focus, and here disengage means that any attentive spatial surround suppression (Hopf et al., 2010) or feature surround suppression (Störmer and Alvarez, 2014) imposed for previous stimuli is lifted, and any previously attended pathways are inhibited (implementing an object-based inhibition of return). This last set of functions gives an excellent example of predictions this kind of analysis provides. The notion of disengaging attention is a common one in the literature but it has not previously been operationalized. Here, an operational definition is presented, amenable to experimental verification, and functionally consistent with the needs of ST. Continuing with the example, once all of this is complete, the feedforward signal appears and traverses the tuned VH. In other words, these actions would occur before stimulus onset, consistent with Müller and Rabbitt's (1989) conclusion that in order for priming to be effective subjects must be informed of it 300–80 ms before stimulus onset. 
Once the feedforward pass is complete, ST's θ-WTA process (a winner-take-all decision process based on a binning threshold θ that selects a spatially contiguous set of largest values within some retinotopic representation, such as the responses of a specific neural selectivity or filter across the visual field—see Tsotsos et al., 1995; Rothenstein and Tsotsos, 2014) makes a decision as to what to attend and passes this choice on to the next stage. The vTE, which is monitoring the execution of the script, then takes this choice, compares it to the task goals, and decides on whether the discrimination task is completed in a positive or negative manner and the task is complete.
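The selection step described above can be illustrated with a minimal Python sketch. This is not the ST formulation of θ-WTA (see the cited references for that); it only illustrates the idea of keeping a spatially contiguous set of the largest responses. The function name `theta_wta`, the use of 4-connectivity, and the reading of θ as "within θ of the peak response" are all assumptions made for illustration.

```python
from collections import deque


def theta_wta(response, theta):
    """Toy selection: return the set of (row, col) positions that are
    within `theta` of the peak response AND spatially connected to it.

    `response` is a 2D list standing in for one retinotopic map.
    Assumption: 4-connected neighbourhood, single global peak.
    """
    rows, cols = len(response), len(response[0])

    # Locate the strongest response (first one found wins on ties).
    peak = (0, 0)
    for r in range(rows):
        for c in range(cols):
            if response[r][c] > response[peak[0]][peak[1]]:
                peak = (r, c)
    peak_val = response[peak[0]][peak[1]]

    # Grow a contiguous region outward from the peak.
    winners = {peak}
    frontier = deque([peak])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in winners:
                if peak_val - response[nr][nc] <= theta:
                    winners.add((nr, nc))
                    frontier.append((nr, nc))
    return winners


# Hypothetical response map: a strong blob plus one isolated strong value.
demo_map = [
    [0.10, 0.20, 0.10, 0.98],
    [0.20, 1.00, 0.95, 0.10],
    [0.10, 0.97, 0.20, 0.10],
]
selected = theta_wta(demo_map, theta=0.1)
```

In this toy map the value at position (0, 3) is within θ of the peak but is not selected, because it is not contiguous with the winning region.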

It may seem that a neural realization of such an abstract and complex process is doubtful. However, recently, Womelsdorf et al. (2014) have detailed a broad variety of simple neural circuit elements that provide precisely the kinds of functionality CPs require, including gating, gain control, feedback inhibition and integration functions. An important future activity is to see how to assemble such circuit elements into the functions described here for CPs.

Now it is clear how CPs are the software for the vision executive; the example of **Figure 1** is a *flowchart* representing the algorithm that may solve discrimination. There is no illusion here that this specification is all that is needed. Much additional processing is required by each of the components, but the additional computations are all known and fit into the existing ST methodology. However, at an abstract level, this description suffices. In comparison with Ullman's visual routines, this description has a finer grain of detail.
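To make the flowchart reading concrete, the sequence of steps in the discrimination example can be written as a short Python sketch. It is a toy under strong assumptions, not the ST implementation: the visual hierarchy is reduced to a dictionary of feature responses, priming is reduced to a multiplicative gain on the target feature, and every name (`run_discrimination_cp`, `prime_gain`, the dictionary layouts) is hypothetical.

```python
def run_discrimination_cp(task, methods_ltm, stimulus):
    """Toy walk through the discrimination steps.

    task:        dict with hypothetical keys 'kind' and 'target'
    methods_ltm: dict mapping task kind -> untuned method (a dict)
    stimulus:    dict mapping feature name -> bottom-up response
    Returns (yes_no_answer, trace_of_steps).
    """
    trace = ["receive_task"]                      # task executive gets the task

    method = methods_ltm[task["kind"]]            # fetch method from memory
    trace.append("fetch_method")

    script = dict(method, target=task["target"])  # tune method into a script
    trace.append("tune_method_into_script")

    # Before stimulus onset: prime the target feature. Disengaging from
    # the previous focus is modelled only as starting from a clean state.
    primed_gain = {script["target"]: script["prime_gain"]}
    trace.append("prime_and_disengage")

    # Feedforward pass through the (toy) tuned hierarchy.
    responses = {
        feature: value * primed_gain.get(feature, 1.0)
        for feature, value in stimulus.items()
    }
    trace.append("feedforward")

    # Select the focus of attention: here simply the strongest response.
    focus = max(responses, key=responses.get)
    trace.append("select_focus")

    # The task executive matches the selection against the task.
    trace.append("match_to_task")
    return focus == script["target"], trace


methods = {"discrimination": {"prime_gain": 1.5}}
task = {"kind": "discrimination", "target": "green"}
answer, steps = run_discrimination_cp(task, methods, {"green": 0.6, "red": 0.7})
```

With the assumed priming gain the weaker "green" response wins the competition; with a gain of 1.0 the same stimulus would yield a "no" answer, which is the sense in which the method must be tuned before it becomes a usable script.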

A brief overview of ST is in order. For a full description of ST see Tsotsos (2011) and Rothenstein and Tsotsos (2014); those details will not be repeated here. The roots of ST lie in a set of formal proofs regarding the difficulty of comparing one image to another using the methods of computational complexity (Tsotsos, 1989, 1990). It was shown that a passive, feedforward pass was insufficient to solve the task in its general form, given the resources of the human brain. This paradox (the human brain is very good at solving this problem) underlies the implausibility of vision as a passive observer and points to a dynamically modifiable, active vision process. Further, the range of visual tasks humans perform requires a time course often longer than that provided by a single pass through the visual cortex. This characteristic, with the flexibility to tune or parameterize, functionally re-purposing the processing network for each pass, distinguishes ST from its competitors (recurrence in a dynamical system or in a neural network is not the same).

In **Figure 2**, a caricature of the visual processing hierarchy is shown for descriptive purposes. It is intended that this simple 4-level structure represent the full ventral and dorsal visual networks and from here on, the acronym VH refers to this. The manner in which ST operates on a visual hierarchy shows how feedforward and recurrent traversals are inter-leaved. Within ST, the basic attentive cycle consists of a first stage, labeled B in **Figure 2**, that represents a task-based priming stage. The time period of this stage can range from 300 to 80 ms before stimulus presentation, as described along with **Figure 1**. That is, if during an experiment some priming signal is shown to a subject within that time period, there is enough time for it to be processed so that it affects the perception of the test stimulus. Here, not only is this scenario covered, but also any priming can be included, such as the impact of world knowledge; it is assumed that in order to affect processing, a top-down traversal of VH would be required based on the content of the priming stimulus. The next stage, C, is the stimulus-driven feedforward processing stage (requiring about 150 ms for a full traversal), followed by selection and task-specific decision. Then stage D, a recurrent tracing, localization and surround suppression stage (needing about 100–150 ms for a full top-down pass), and E, a modified feedforward processing stage that permits a recomputation of the stimulus with background clutter suppressed with the intent of optimizing neural responses to the attended item.

Each of the stages is parameterized differently depending on task. Some of the stages may not be needed for a given stimulus and task. If the decision stage C, for example, determines that the task is satisfied by the output of the first feedforward stage, then no further stages are needed. For the visual tasks of discrimination, categorization, or identification (in all cases following their definitions in Macmillan and Creelman, 2005; Tsotsos, 2011), stages A–C usually suffice. For the tasks of within-category identification, A–D are needed with the option of stage D requiring only a partial recurrent pass. Full localization tasks require a complete stage D, while segmentation, visual search, and other more complex tasks require all stages and perhaps multiple repetitions of the cycle. These stages are more fully described in Tsotsos et al. (2008), Tsotsos (2011). Any controller will have to manage these differences.
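The task-to-stage dependencies listed above amount to a small lookup that a controller would have to consult. The sketch below encodes them in Python. The table and function names are hypothetical, stage A is not described in the text here and is treated simply as the initial state (an assumption), and the partial versus complete distinction for stage D is not modelled.

```python
# Stage labels follow the text: B priming, C feedforward plus selection
# and decision, D recurrent localization, E modified feedforward pass.
# Stage A is assumed to be the initial state of the hierarchy.
TASK_STAGES = {
    "discrimination": "ABC",
    "categorization": "ABC",
    "identification": "ABC",
    "within_category_identification": "ABCD",  # D may be only partial
    "localization": "ABCD",                    # D must be complete
    "segmentation": "ABCDE",                   # possibly several cycles
    "visual_search": "ABCDE",                  # possibly several cycles
}


def plan_cycle(task, satisfied_after_feedforward=False):
    """Return the list of stages to run for `task`.

    If the decision at stage C already satisfies the task, the cycle
    is cut short after C, as the text describes.
    """
    stages = list(TASK_STAGES[task])
    if satisfied_after_feedforward:
        stages = stages[: stages.index("C") + 1]
    return stages


search_plan = plan_cycle("visual_search")
early_exit_plan = plan_cycle("visual_search", satisfied_after_feedforward=True)
```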

The requirement for an additional top-down pass for localization is not inconsistent with the claims of Isik et al. (2014). There, it is shown that IT neural representations encode position information that can be decoded by a classifier, and thus the authors conclude that position is represented with a latency of about 150 ms, consistent with a feedforward progression through the visual hierarchy. In ST, it is the recurrent localization process that replaces the role of the classifier, and in contrast to current classifiers presents a biologically plausible mechanism (and is partially supported experimentally, Boehler et al., 2009). It also provides a mechanism for tracing down to earlier levels, functionality that classifiers do not possess, and thus providing more detailed position information if required. This highlights the conceptual difference between the time at which information is *available* from which position may be computed—which the Isik et al. paper well documents—and the time at which that information is decoded and made *usable* for processes needing position information, which is what ST can accomplish. Further, a simple classifier cannot easily determine position from a spatial representation containing multiple objects; a selection method is needed, and ST provides this.

There is one important concept to introduce at this point, namely that of the Attentional Sample (AS), which was mentioned earlier. During the recurrent tracing stage (**Figure 2D**), θ-WTA decision processes at each level of the hierarchy select the representational elements computed at that level that correspond to the attended stimulus<sup>1</sup>. It will not always suffice to make this selection at the highest level, say at the level of object categories for example. Some tasks will require more details, such as locations of object parts, or feature characteristics. In general, the AS is formally a subset of the full hierarchy, that is, the set of neural pathways from the top of the hierarchy to the earliest level including all intermediate paths, and where at every level there is a connected subset of neurons with spatially adjacent receptive fields that represent the selected stimulus at that level of representation. The recurrent localization process will select these portions of the representation. Multiple stimuli that are distinguishable from one another on the basis of their constituent features are not typically selected together as part of a single AS. Since the selection occurs in a top-down fashion, each selection becomes part of the overall attentional sample that represents what is being attended and it can be added to working memory, as will be seen below. **Figure 3** provides an illustration of the concept of an attentional sample with its components highlighted on the appropriate processing stage in **Figure 2**, while **Figure 4** shows the AS computed by the functioning model. **Figure 4** shows a snapshot of how ST attends to a rotating object in an image sequence. Not only is the object selected as a portion of the input image (a common result of most attention models), but ST also connects that input selection to the particular neurons that have played a role in its selection throughout the processing hierarchy. In other words, if each level of processing computes selectivity of a different kind of feature abstraction (velocity, direction, velocity gradient, rotation/expansion/contraction, etc.), this feature set is localized within the hierarchy and can be thought of as the feature vector that best describes what is attended. It is this attentional sample that is then used by other visual computations for further processing. A classifier might consider this AS as its input. To draw a further comparison to Isik et al. (2014), the position information Isik et al. refer to is what is represented at the top level of the hierarchy only. It is position at its coarsest spatial resolution. ST, on the other hand, provides not only that but also position at increasingly higher spatial resolutions through the hierarchy, to the level needed by task requirements.

<sup>1</sup>This process was first described in Tsotsos (1991), and is fully detailed in Tsotsos (2011) and Rothenstein and Tsotsos (2014). Its mathematical properties, under the assumptions of the overall model, plus a strategy for monitoring and correcting the top-down traversal are also detailed in those presentations.
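The idea of an attentional sample as a top-to-bottom trace through the hierarchy can be sketched in a few lines of Python. The toy below is an assumption-laden simplification, not ST itself: the hierarchy is one-dimensional, each unit sums two adjacent units from the level below, and a single winner (rather than a connected subset) is kept at each level. The function names are hypothetical.

```python
def feedforward(signal, levels):
    """Build a toy pyramid: each unit sums two adjacent units below it.

    Assumes len(signal) is divisible by 2**levels. No maximum is taken
    on the way up, so the selection decision is deferred to the
    top-down pass.
    """
    pyramid = [list(signal)]
    for _ in range(levels):
        prev = pyramid[-1]
        pyramid.append([prev[i] + prev[i + 1] for i in range(0, len(prev) - 1, 2)])
    return pyramid


def trace_attentional_sample(pyramid):
    """Top-down tracing: pick the winner at the top, then at each lower
    level pick the strongest unit among the winner's two inputs.

    Returns a list of (level, index) pairs from top level to input.
    """
    top = pyramid[-1]
    idx = max(range(len(top)), key=top.__getitem__)
    sample = [(len(pyramid) - 1, idx)]
    for level in range(len(pyramid) - 2, -1, -1):
        layer = pyramid[level]
        children = (2 * idx, 2 * idx + 1)   # receptive field of the winner
        idx = max(children, key=layer.__getitem__)
        sample.append((level, idx))
    return sample


toy_pyramid = feedforward([1, 0, 2, 1, 0, 5, 3, 1], levels=3)
toy_sample = trace_attentional_sample(toy_pyramid)
```

The returned list holds one selected unit per level, so position is available at the coarse top level and at progressively finer resolution further down, which is the property the comparison with classifier-based decoding turns on.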

### **EXTENSIONS OF SELECTIVE TUNING TO ENABLE COGNITIVE PROGRAMS**

A new functional architecture, based on Selective Tuning, for executive control via a Cognitive Program strategy can now be proposed (and is an extension of Kruijne and Tsotsos, 2011; Tsotsos, 2013). It must be stressed that this architecture was developed not by examining the literature to see what functions are attributed to, for example, working memory, or other functional units. Rather, the only components of function included are those that the algorithm for Selective Tuning requires (e.g., computation of its various parameters or control signals). This is a risky approach because it might seem that there are obvious missing pieces or inconsistencies. However, it is a unique approach in that it uses the ST foundation, which has been proven in many ways and provides a strong functional base, something that purely experimental work does not. In other words, here we present a strategy designed in a top-down manner as required by an existing successful algorithm, with the ultimate goal of trying to discover new components or functions that might stand as predictions for future experimental work. As such, the architecture stands as a hypothesis and there is no claim whatsoever that this functional architecture suffices to explain the existing literature in its full breadth and detail. However, it is claimed that it will suffice to augment ST to enable it to execute a broad family of visual tasks in a manner that is extensible to more complex tasks and is consistent with much (but perhaps not all) of the relevant aspects of human visual performance.

continue to further processing levels. On the left is the further abstraction of translational computation at coarser spatial resolutions (areas MT, MST, and 7a) while the right hand side is concerned with computation for spatial velocity gradients, spiral motion (rotation, expansion, etc.) and full field egomotion (areas MT, MST, and 7a). In total, there are 654 separate filter types in this hierarchy. The recurrent localization process begins at the top, selects strongest responses, and then refines that selection tracing back the neural inputs that are responsible for that strongest, top-level response (Tsotsos, 2011).

**Figure 5** gives the block diagram of the major components needed and their communication connections. Brief descriptions of each follow. Evidence from primarily human studies as to the functionality of such components is detailed in Kruijne and Tsotsos (2011) and, due to space limits, will not be repeated here.

The visual hierarchy (VH), in a form that is amenable to the ST method of attention, represents the full ventral and dorsal streams of the visual processing areas. Partial implementations have been previously reported (Tsotsos et al., 2005a; Rodríguez-Sánchez and Tsotsos, 2012). The qualification of "amenable to the ST method of attention" is important. The majority of current popular visual representations such as HMAX (Riesenhuber and Poggio, 1999) or Convolution Nets (LeCun and Bengio, 1995) contain components that make them entirely unsuitable for attentive processing of the kind ST employs, among them feedforward max-pooling operations. Since ST requires a top-down, recurrent max-finding operation, methods that choose maximum responses on the feedforward pass make their decisions too early (and against Marr's principle of least commitment, 1982) and prevent the recurrent method of ST, or perhaps even any recurrent process at all. Arguments as to why ST's recurrent version is more consistent with known neurobiology are provided in Tsotsos (2011).

**of instructions for visual task execution.**

**processing with communication channels indicated by the red arrows.**

The Fixation Control mechanism (FC) was first described in Tsotsos et al. (1995) and a cursory implementation was shown. It has since been further detailed and completely implemented (Zaharescu et al., 2005; Tsotsos, 2011; Wloka, 2012), but its details will not be included here. The fixation control mechanism includes two important representations. The first is the Peripheral Priority Map (PPM) that represents the saliency of the peripheral visual field, biased by task and computed using the AIM algorithm (Bruce and Tsotsos, 2009). The other is the History Biased Priority Map (HBPM) which combines the focus of attention derived from the central visual field (cFOA—defined as the image region with strongest response profile at the highest levels of representation within the visual hierarchy) after processing by the full visual hierarchy and the foci of attention derived from the peripheral visual field (pFOA), i.e., the top few most salient items of the PPM. The point is to provide a representation that includes central fixation items (that do not require gaze change), peripheral fixation items (that do require gaze change), and task influence on these, on which computations of next target selection can be performed.
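How the two maps might fit together can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation cited above: maps are represented as dictionaries from locations to priorities, the fixation history enters as a multiplicative inhibition term, and the names `top_peaks` and `build_hbpm` are hypothetical.

```python
from typing import Dict, List, Tuple

Location = Tuple[int, int]


def top_peaks(ppm: Dict[Location, float], k: int) -> List[Tuple[Location, float]]:
    """pFOA candidates: the k most salient entries of the peripheral map."""
    return sorted(ppm.items(), key=lambda item: item[1], reverse=True)[:k]


def build_hbpm(
    cfoa: Tuple[Location, float],
    pfoas: List[Tuple[Location, float]],
    fhm: Dict[Location, float],
) -> Dict[Location, float]:
    """Combine the central focus and the peripheral foci into one map.

    `fhm` maps a location to an inhibition value in [0, 1]; multiplicative
    inhibition is an assumption made for this sketch.
    """
    hbpm: Dict[Location, float] = {}
    for loc, strength in [cfoa, *pfoas]:
        biased = strength * (1.0 - fhm.get(loc, 0.0))
        hbpm[loc] = max(hbpm.get(loc, 0.0), biased)
    return hbpm


ppm = {(0, 5): 0.4, (3, 9): 0.8, (7, 2): 0.6, (8, 8): 0.1}
fhm = {(3, 9): 0.9}          # (3, 9) was fixated recently
cfoa = ((4, 4), 0.7)         # strongest response in the central field

hbpm = build_hbpm(cfoa, top_peaks(ppm, k=2), fhm)
next_target = max(hbpm, key=hbpm.get)
print(next_target)
```

Here the most salient peripheral item is suppressed by its fixation history, so the central item wins and no gaze change would be needed.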

The Long Term Memory for methods (mLTM) stores CP methods, as described earlier. Where CPs might come from is not addressed here; we may assume that they are learned through some unspecified process external to this framework. **Figure 1** shows an example method (and was detailed earlier). An important characteristic of mLTM is that it will require a powerful indexing scheme to enable fast search among all of the methods for the particular ones most relevant to the task at hand. That is, in an associative manner, elements of the task description should quickly identify relevant methods.

Visual Working Memory (vWM) contains at least two representations. Within the vWM is the Fixation History Map (FHM) that stores the last several fixation locations. Each decays over time but while active provides the location for location-based inhibition-of-return (IOR) signals. This inhibition is intended to bias against revisiting previously seen locations (Klein, 2000) but can be overridden by task demands. The second representation is the Blackboard (BB), introduced in Tsotsos (2011), where more details can be found. The BB stores the current *attentional sample* (the selected locations, features, concepts at each level of the VH as described earlier) determined by the recurrent attentional localization process so that it may be used by all other components.
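The behavior described for the FHM (a bounded store of recent fixations, each decaying over time, with a task override) can be sketched as a small data structure. The capacity, the exponential decay, and the cut-off threshold below are assumptions chosen for illustration; the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, Tuple

Location = Tuple[int, int]


class FixationHistoryMap:
    """Toy FHM: remembers the last few fixations with decaying strength."""

    def __init__(self, capacity: int = 5, decay: float = 0.5, floor: float = 0.05):
        self.decay = decay    # multiplicative decay per time step (assumed)
        self.floor = floor    # entries weaker than this are forgotten
        self.entries: Deque[Tuple[Location, float]] = deque(maxlen=capacity)

    def record(self, location: Location) -> None:
        """Store a new fixation at full strength; the oldest is dropped if full."""
        self.entries.append((location, 1.0))

    def tick(self) -> None:
        """Advance time by one step: decay all entries and forget weak ones."""
        decayed = [(loc, s * self.decay) for loc, s in self.entries]
        self.entries = deque(
            [(loc, s) for loc, s in decayed if s >= self.floor],
            maxlen=self.entries.maxlen,
        )

    def inhibition(self, location: Location, task_override: bool = False) -> float:
        """IOR signal for a location; task demands can switch it off."""
        if task_override:
            return 0.0
        return max((s for loc, s in self.entries if loc == location), default=0.0)


fhm = FixationHistoryMap(capacity=3, decay=0.5)
fhm.record((2, 3))
fhm.tick()
fhm.record((5, 5))
older = fhm.inhibition((2, 3))
newest = fhm.inhibition((5, 5))
overridden = fhm.inhibition((5, 5), task_override=True)
for _ in range(4):
    fhm.tick()
expired = fhm.inhibition((2, 3))
```

With these settings the older fixation is inhibited less than the newest one and eventually drops out of the map altogether, while the override returns no inhibition regardless of history.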

Task Working Memory (tWM) includes the Active Script NotePad which itself might have several compartments. One such compartment would store the active scripts with pointers to indicate progress along the sequence. Another might store information relevant to script progress including the sequence of attentional samples and fixation changes as they occur during the process of fulfilling a task. Another might store relevant world knowledge that might be used in executing the CP. The Active Script NotePad would provide the vTE with any information required to monitor task progress or take any corrective actions if task progress is unsatisfactory.

The vTE reads tasks, selects task methods, tunes methods into executable scripts, deploys scripts to tune the vision processes, and monitors and adapts script progress. It receives input in the form of a task encoding from outside the structure of **Figure 5**. Sub-elements include the Script Constructor that tunes methods into scripts, the Script Executor that moves along the script step by step, sending the appropriate commands to the correct places, and the Script Monitor. The Script Monitor checks each step of the script to ensure the appropriate results are achieved. The full details of task execution are represented by the attentional sample AS, and the sequence of AS's, fixations, and other information stored in the Active Script NotePad. In other words, it has access to the history of important computations performed and their results during the process of performing the task. If those details do not confirm script success, there might be remedial action taken by making small alterations to the script or replacing the current script by a different one.
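The division of labor among the Script Constructor, Script Executor, and Script Monitor can be made concrete with a small sketch. It assumes that a method is a list of steps, that tuning a method into a script means binding task parameters into each step, and that the only remedial action is a bounded retry. None of these choices is specified by the framework, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    name: str
    action: Callable[[Dict], object]   # performs the step, returns a result
    check: Callable[[object], bool]    # Script Monitor test for this step


@dataclass
class NotePad:
    """Stand-in for the Active Script NotePad: progress pointer plus history."""
    pointer: int = 0
    history: List[Tuple[str, object]] = field(default_factory=list)


def construct_script(method: List[Step], task: Dict) -> List[Step]:
    """Script Constructor: bind the task parameters into every step."""
    return [
        Step(s.name, (lambda ctx, a=s.action: a({**ctx, **task})), s.check)
        for s in method
    ]


def execute(script: List[Step], notepad: NotePad, max_retries: int = 1) -> bool:
    """Script Executor and Monitor: run steps in order, checking each result."""
    while notepad.pointer < len(script):
        step = script[notepad.pointer]
        for attempt in range(max_retries + 1):
            result = step.action({"attempt": attempt})
            notepad.history.append((step.name, result))
            if step.check(result):
                break
        else:
            return False  # a fuller system might alter or replace the script
        notepad.pointer += 1
    return True


method = [
    Step("bias_hierarchy", lambda ctx: ctx["target"], lambda r: r == "curve"),
    Step(
        "attend",
        lambda ctx: "found" if ctx["attempt"] > 0 else "missed",
        lambda r: r == "found",
    ),
]
notepad = NotePad()
ok = execute(construct_script(method, {"target": "curve"}), notepad)
print(ok, notepad.history)
```

The history kept in the notepad plays the role of the record of computations and results that the monitor consults; here the second step fails once and succeeds on retry.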

The vAE contains a Cycle Controller and algorithms to translate task parameters into control signals, and it communicates with external elements. The Cycle Controller is responsible for initiating and terminating each stage of the ST process (shown in **Figure 2**). For example, it would initiate the θ-WTA process for the top of the VH in order to determine the focus of attention in the central visual field (cFOA). The vAE also initiates and monitors the recurrent localization process of **Figure 2D**. This process is fully detailed in Rothenstein and Tsotsos (2014). There, we present the implementation of ST's neural encoding scheme integrated with attentional selection. We show that it models firing rates observed in experimental work on single cells as well as across hierarchies of neurons. The Cycle Controller cycles repeatedly until the task is complete, a determination that is made by the vTE.

**Figure 6** shows the details of the major components shown in **Figure 5** and graphically links the details of the descriptions above to each other. Importantly, it shows the various control signals and information pathways identified to enable ST to function. The figure highlights several functional components that also stand as experimental predictions, among them:


There are more characteristics that stand as experimental predictions. One dimension of prediction that is not included is that of associating specific brain areas to the functions in the figure. Although it is possible to make some associations (for example, VH represents the set of ventral and dorsal visual areas, the BB may be part of the pulvinar, the PPM and HBPM may be area V6, the FHM may be part of FEF—justifications for these appear in Tsotsos, 2011), we refrain from emphasizing these. The reason is simply that it is more important to confirm the function of each component and of the framework as a whole. Once we have strong evidence that the correct functional pieces are included, we can start to consider which brain areas might correspond.

It is important to ensure that there is sufficient justification for the decision to functionally separate the elements described in this section. For example, why separate the tWM from the vWM? Or the vAE from the vTE? In both cases, the intent is clear. The tWM is intended to keep track of any information that relates to status and progress relating to completion of the task at hand. Think of it as the storage for each of the major checkpoints that must be satisfied during task execution, whether due to visual, motor, reasoning or other actions. The vWM, on the other hand, stores all the actual visual information extracted from the input stream and processing by the VH, whether they correspond to components of task checkpoints or not. It corresponds to whatever is seen and remembered for short-term processing and provides input to the determination of whether or not checkpoints are satisfied. The vAE and vTE have a similar distinction. The vAE applies its processing to the VH only; it controls the VH to adapt it to the task and input. The vTE is not concerned with this but focuses on setting up all the task components into an executable script, and of course, this includes the attentional aspects. Functionally it makes some sense to have separate components in both cases, even though it may appear as if it might be possible to embed the vWM within the tWM and the vAE within the vTE. There is little or no empirical evidence of which we are aware for either strategy in the brain; it is clear that from a pure modeling perspective either approach can be made to function correctly. The separation suggested is a logical one and would stand as a prediction for future experimental work.

**Figure 7** shows a set of linked method CPs, extensions to the Discrimination CP described earlier. The Discrimination CP has been previously described while the Visual Search: Overt CP will be used in the example of the next section. The Visual Search: Covert CP is a straightforward extension of Discrimination while the Localize/Reinterpret CP reflects the recurrent localization mechanism of the descriptions accompanying **Figures 2** and **3**.

### **THE CURVE TRACING EXAMPLE**

We turn now to a simple example, one that has appeared in the context of visual routines previously (Jolicoeur et al., 1986). Their main experimental task was to quickly decide whether two Xs lay on the same curve or on different curves in a visual display. Mean response time for "same" responses increased monotonically with increasing distance along the curve between the Xs. The authors, based on this and similar results on a related experiment, concluded that humans can trace curves in a visual display internally at high speed (the average rate of tracing was about 40° of visual angle per second). The curves were displayed approximately foveally, with the distance between the Xs being between 2.2° and 8.8° of visual angle. There were no cross points of the curves, and it seems the curves were "simple" and not overly close to one another nor convoluted in shape. The authors conclude that curve tracing is a basic visual process. Here, we show how the CP strategy can provide an explanation for curve tracing but, in order to make the demonstration more interesting, the display is assumed to be large enough to require eye movements. The same strategy, as should be apparent, can deal with smaller displays such as those Jolicoeur et al. used, which do not require eye fixation changes.

The sequence of figures below (**Figures 8**–**14**) shows the steps executed by the model in order to achieve a single step of tracing; details are in the figure captions. Clearly, several steps such as these are required to complete the task. Using the CP's of **Figure 6**, the details required for curve tracing can be seen in the Visual Search: Overt network, although the task specific component of tracing is not specifically shown. Smaller displays might only require the Visual Search: Covert CP.
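The task-specific constraint used when choosing the next fixation (stated in the caption of Figure 13: the next fixation must lie on the curve and be connected to the previous fixation along a portion not already inhibited) can be sketched as follows. The sketch assumes the curve is available as an ordered list of points and that priority-map peaks are given as a dictionary; both representations, and the function name, are hypothetical simplifications.

```python
from typing import Dict, List, Optional, Set, Tuple

Location = Tuple[int, int]


def next_fixation(
    curve: List[Location],
    current: Location,
    inhibited: Set[Location],
    peaks: Dict[Location, float],
) -> Optional[Location]:
    """Choose the highest-priority peak that is reachable along the curve.

    A point is reachable if it can be reached from the current fixation by
    walking along the curve without crossing an inhibited point.
    """
    idx = curve.index(current)
    reachable: Set[Location] = set()
    for direction in (1, -1):
        i = idx + direction
        while 0 <= i < len(curve) and curve[i] not in inhibited:
            reachable.add(curve[i])
            i += direction
    candidates = {loc: p for loc, p in peaks.items() if loc in reachable}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


curve = [(0, 0), (1, 1), (2, 1), (3, 2), (4, 2), (5, 3)]
current = (2, 1)
inhibited = {(0, 0), (1, 1)}      # portion of the curve already traced
peaks = {
    (9, 9): 0.9,                  # salient but not on the curve
    (0, 0): 0.8,                  # on the curve but already traced
    (4, 2): 0.5,
    (5, 3): 0.3,
}
chosen = next_fixation(curve, current, inhibited, peaks)
print(chosen)
```

The most salient peak overall is rejected because it is off the curve, and the already traced portion is excluded by inhibition, so a less salient but admissible location is selected.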

**FIGURE 8 | Suppose the task is to trace this curve.** The current fixation is at the red dot, the visual processing hierarchy has been biased to be more selective to curved lines, and the curve portion highlighted in red has already been tracked and this is recorded in the Active Script NotePad.

**FIGURE 9 | This figure represents the central attentional field (the central 10° or so of the image) representation at the top of the VH.** The curve at current fixation has already been attended; the attentional sample in BB contains its details. The central field is examined via the θ-WTA mechanism to find the largest responding element other than the current fixation; the already tracked curve has been suppressed by the inhibition of return mechanism. The green dot is selected as the central attentional focus.

**FIGURE 10 | The figure represents the Fixation History Map.** The FHM represents the previous fixation (red dot) and the already traced portion of the curve in red. These provide the inhibition of return bias for the HBPM. Note how the FHM represents a spatially larger area than the visual field because it must also include extra-retinal space in order to reduce the possibility of incorrect gaze oscillatory behavior.

**FIGURE 11 | This figure shows the contents of the Peripheral Priority Map.** The PPM gives the salient locations outside the central attentional field. The locations would be added to the HBPM. Note that higher saliency is represented by darker shading.

**FIGURE 12 | This shows the History Biased Priority Map.** Once the next central focus (green dot), FHM, and PPM are all combined into the HBPM, this representation can now serve as the basis for selecting the next fixation.

**FIGURE 13 | The HBPM is shown again.** The choice of next fixation is computed from the set of salient peaks with the additional constraint that the next fixation must lie along the curve and be connected to the previous fixation along a portion of the curve not already inhibited. This is the yellow dot and the choice would lead to a saccade.

the central field moves, the new visual field is processed, the already determined central fixation is attended, the attentional sample is recorded, and the process repeats.

How does this differ from previous explanations of human curve-tracing behavior? It is a more detailed and generalizable explanation than what was previously presented by Ullman (1984), Jolicoeur et al. (1986) or Roelfsema et al. (2000). Specifically, Roelfsema et al. (2000) propose that attentional mechanisms realize an *attentional label* that spreads along the activated units that belong to the target object or region, binding them into a single representation. Ullman's operations can all be re-interpreted using this spread of the attentional label, either along a curve or over a region. Although this is a sensible proposal, it describes the process at only one level of representation; that is, it assumes that all required data and computation can be performed within the representation of a curve. How this might generalize to other kinds of visual tasks is an open question. The difference with our approach is that the generalization to a broad set of tasks is more apparent.

### **DISCUSSION**

In addition to comparisons to the various theories and systems already mentioned, here a further comparison can be made with the Neural Theory of Visual Attention (NTVA) (Bundesen et al., 2005; Kyllingsbæk, 2006). At the outset, the Cognitive Programs that control ST (termed CP-ST here for ease of reference) embody major differences when compared to NTVA, both in approach and in goals. NTVA is purely a theory of visual attention, not addressing how vision functions and assuming that the visual system uses the Gestalt principles to segment, in an unspecified manner, the visual scene into objects as part of its first wave of processing. NTVA then describes how objects and features are subsequently selected in the second wave of processing. In contrast, CP-ST attempts to represent the visual process itself, using abstractions of its elements, neurons and synapses, as well as a full set of selection mechanisms.

Secondly, the NTVA system relies on the concept of resource allocation at the heart of attentional processing, following many previous works going back to the earliest explanations of attention (Tsotsos et al., 2005b). During the two waves of processing in NTVA, the first allocation of processing resources is at random while in the second pass they are allocated according to attentional weights that are computed for each object in the visual field such that the number of neurons allocated to an object increases with its attentional weight. There is no similar neurons-to-visual-object allocation within CP-ST. The processing architecture is constant throughout processing and only its parameters change, making some neurons more or less selective and connections more or less transmissive. NTVA uses only two mechanisms to accomplish its goals, *filtering* (selection of objects) and *pigeonholing* (selection of features). CP-ST employs many mechanisms in each of the *suppression*, *selection* (filtering and pigeon-holing are two of ST's 8 selection mechanisms) and *restriction* categories (see Tsotsos, 2011). As a result, it is difficult to see how NTVA can account for the top-down latency of attentional modulation, for the attentive suppressive surround, for receptive field narrowing, for inhibition of return, and other aspects of attention, whereas CP-ST inherits these from ST.

A major strength of NTVA is the quantitative comparisons possible using its two major equations, as is illustrated in Bundesen et al. (2005), covering a wide range of effects observed in the firing rates of single cells in primates. CP-ST has yet to prove itself; it is a hypothesis at this stage. However, ST has recently shown these same effects (Rothenstein and Tsotsos, 2014) and has the additional strength that it can accept real images and process them, exhibiting attentive behavior as would an experimental subject. There are 10 free parameters for the basic ST equations (Rothenstein and Tsotsos, 2014). The CP-ST framework, however, would have more and at this stage it is unknown what they may be. TVA, on which NTVA is based, has fewer parameters, namely 4 (Andersen and Kyllingsbæk, 2012). Although an ability to represent behavior with as few parameters as possible is an important consideration, it cannot be expected that complex behavior comes without a price. The trick is to not have more free parameters than needed; it's an Occam's Razor issue. In general, most models are not currently detailed enough for a comparison on this point. Summarizing this comparison, it would be an interesting and likely valuable research project to detail the connections between NTVA and CP-ST and to see if unification might lead to a productive result.
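For orientation, the two equations referred to above are usually stated in the TVA literature as a rate equation and a weight equation. The form below is supplied as a reminder in the commonly used notation and is not reproduced from this paper:

$$
v(x, i) = \eta(x, i)\, \beta_i\, \frac{w_x}{\sum_{z \in S} w_z},
\qquad
w_x = \sum_{j \in R} \eta(x, j)\, \pi_j
$$

Here $v(x, i)$ is the rate at which the categorization "object $x$ has feature $i$" is processed, $\eta(x, i)$ is the strength of the sensory evidence for that categorization, $\beta_i$ is the perceptual bias toward category $i$, $w_x$ is the attentional weight of object $x$, $S$ is the set of objects in the visual field, $R$ is the set of categories, and $\pi_j$ is the pertinence of category $j$. Under this reading, filtering operates through the pertinence values and hence the attentional weights, while pigeonholing operates through the bias parameters.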

The Cognitive Programs framework, although containing elements seen in other models, provides an implementable (see Kotseruba and Tsotsos, 2014) model that we hypothesize exhibits task behavior comparable to the behavior of human subjects performing the same tasks.

#### **CONCLUSIONS**

This paper began with a question: What are the computational tasks that an executive controller for visual attention must solve?

The answer is not a simple enumeration of tasks as one might have hoped. Rather, exploring this question has led to a complex set of inter-connected and cooperating functional components, each a hypothesis with several sub-hypotheses within. **Figure 15** summarizes the tasks that our controller must address—in other words, this is the answer to our original motivating question using the same figure structure as **Figures 5**, **6**. Within each box, the major tasks that must be performed are listed and these arise from the detailed structure of **Figure 6**.

The value of a hypothesis rests solely with the possibility of testing its validity and here it is important to ensure the proposed components can be tested. Testing would proceed computationally to ensure computational performance with respect to human behavior is satisfactory as well as experimentally in search of evidence supporting the many predictions of the overall theory. The task would be daunting if the framework of **Figure 6** were composed of entirely new components. However, the fact that so many components have already been examined with success gives the hypothesis represented by the overall integration some degree of plausibility.

The full system of **Figure 6** can be best tested via computational implementation. A success would provide an existence proof that all of these components perform their intended function and that in concert they function as a controller for visual task execution. Such a test is not easy to conduct but it is important that any test use images and a non-trivial task. This has been accomplished to a large degree through a computer system that plays a video game (Kotseruba and Tsotsos, 2014). This implementation did not test all of the elements of **Figure 6** but does test the subset required for the game and also demonstrates that the form of the CP's presented in **Figures 1**, **7** is feasible.

With respect to the CP's shown in **Figure 7**, it is important to note that all of these represent computationally confirmed processes. That is, they are simply encodings of the algorithms presented in our past publications. In some cases elements are also supported by experiment. For example, the Localization function includes sub-components of an attentive suppressive surround and also requires this to be a result of recurrent processes. These two elements have experimental support (Hopf et al., 2006; Boehler et al., 2009). However, the extraction of the attentional sample from the Localization function and its use within the BB of the CP framework does not yet have experimental support. The Visual Search CP's generally encode behavior that is well documented experimentally in the visual search literature and also is shown to be a characteristic of ST (Rodríguez-Sánchez et al., 2007; Tsotsos, 2011). Finally, the VH of **Figure 4** is also defined and shown to perform under attentive conditions (Tsotsos et al., 1995, 2005a; Tsotsos, 2011; Rodríguez-Sánchez and Tsotsos, 2012; Rothenstein and Tsotsos, 2014). The fixation control (FC) component has also been implemented and successfully tested (Tsotsos et al., 1995; Zaharescu et al., 2005; Bruce and Tsotsos, 2009; Wloka, 2012) with all of its sub-components included. Attentive behavior and predictions of Selective Tuning have been extensively tested both with computational and human experiments (detailed in Tsotsos, 2011; Rothenstein and Tsotsos, 2014).

Cognitive Programs grew out of Ullman's Visual Routines, but represent a generalized and updated conceptualization. Although many aspects of the CP framework have been successfully tested, some computationally and some experimentally, the full framework awaits testing as do the many experimental predictions that it expresses. The main hypothesis presented by this paper then is that the Cognitive Programs framework, built upon the substrate of the Selective Tuning model, suffices to provide an executive controller for ST, and that it also offers a testable, conceptual structure for how visual task execution might be accomplished.

### **ACKNOWLEDGMENTS**

Funding was gratefully received from the Canada Research Chairs Program and the Natural Sciences and Engineering Research Council of Canada. Wouter Kruijne is thankful for research support from York University and from Vrije Universiteit during his thesis residency while performing research presented in this manuscript. Portions of this manuscript were developed during John K. Tsotsos's sabbatical stay in the Department of Brain and Cognitive Sciences and in the McGovern Institute for Brain Research at the Massachusetts Institute of Technology and John K. Tsotsos is grateful for the support of James DiCarlo and Robert Desimone. We thank Eugene Simine for programming assistance and Yulia Kotseruba and Calden Wloka for discussion and comments on early drafts.

#### **REFERENCES**


Boehler, C. N., Tsotsos, J. K., Schoenfeld, M., Heinze, H.-J., and Hopf, J.-M. (2009). The center-surround profile of the focus of attention arises from recurrent pro-


*Computational Vision: Second International Workshop, WAPCV 2004, Revised Selected Papers, Lecture Notes in Computer Science Vol. 3368/2005* (Heidelberg: Springer-Verlag), 133–147.

Zylberberg, A., Slezak, D. F., Roelfsema, P. R., Dehaene, S., and Sigman, M. (2010). The brain's router: a cortical network model of serial processing in the primate brain. *PLoS Comput. Biol.* 6:e1000765. doi: 10.1371/journal.pcbi.1000765

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 August 2014; accepted: 17 October 2014; published online: 25 November 2014.*

*Citation: Tsotsos JK and Kruijne W (2014) Cognitive programs: software for attention's executive. Front. Psychol. 5:1260. doi: 10.3389/fpsyg.2014.01260*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Tsotsos and Kruijne. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
