# MORPHOLOGICALLY COMPLEX WORDS IN THE MIND/BRAIN

EDITED BY: Alina Leminen, Harald Clahsen, Minna Lehtonen and Mirjana Bozic PUBLISHED IN: Frontiers in Human Neuroscience

### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

*All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-803-0 DOI 10.3389/978-2-88919-803-0

# **MORPHOLOGICALLY COMPLEX WORDS IN THE MIND/BRAIN**

Topic Editors: **Alina Leminen**, Aarhus University, Denmark; **Harald Clahsen**, Potsdam Research Institute for Multilingualism, Germany; **Minna Lehtonen**, Abo Akademi University and University of Helsinki, Finland; **Mirjana Bozic**, University of Cambridge, UK

The question of how morphologically complex words (assign-ment, listen-ed) are represented and processed in the brain has been one of the most hotly debated topics in the cognitive neuroscience of language. Do complex words engage cortical representations and processes equivalent to single lexical objects or are they processed as sequences of separate morpheme-like units? Research on morphological processing has suggested that adults make efficient use of both lexical (i.e., whole word) storage and retrieval, as well as combinatorial computation in processing morphologically complex words. Psycholinguistic studies have demonstrated that processing of complex words can be affected both by properties of the morphemes and the whole words, such as their frequency, transparency, and regularity. Furthermore, this research has been informative about the time-course of complex word recognition and production, and the role of morphological structure in these processes. At the neural level, left-hemisphere inferior frontal and superior temporal areas, and negative-going event-related potentials, have been consistently associated with morphological processing.

While most previous research has been done on the recognition of morphologically complex words in adult native speakers, much less is known about neurocognitive processes involved in the on-line production of morphologically complex words, and even less on morphological processing in children and non-native speakers. Moreover, we have limited understanding of how linguistically distinct morphological processes, e.g. inflectional (listen-ed) versus derivational (assign-ment), are handled by the cortical language networks.

This e-book gives an up-to-date overview of the questions currently addressed in the field of morphological processing. It highlights the significance of morphological information in language processing, both written and spoken, as assessed by a variety of methods and approaches. It also points to a number of unresolved issues, and provides future directions for research in this key area of cognitive neuroscience of language.

**Citation:** Leminen, A., Clahsen, H., Lehtonen, M., Bozic, M., eds. (2016). Morphologically Complex Words in the Mind/Brain. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-803-0

# Table of Contents



Naama Friedmann, Aviah Gvion and Roni Nisim

*Evidence from neglect dyslexia for morphological decomposition at the early stages of orthographic-visual analysis*

Julia Reznick and Naama Friedmann

# Editorial: Morphologically Complex Words in the Mind/Brain

Alina Leminen<sup>1,2</sup>\*, Minna Lehtonen<sup>2,3</sup>, Mirjana Bozic<sup>4</sup> and Harald Clahsen<sup>5</sup>

*<sup>1</sup> Department of Clinical Medicine, Center of Functionally Integrative Neuroscience and MINDLab, Aarhus University, Aarhus, Denmark, <sup>2</sup> Cognitive Brain Research Unit, Cognitive Science, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland, <sup>3</sup> Department of Psychology, Abo Akademi University, Turku, Finland, <sup>4</sup> Department of Psychology, University of Cambridge, Cambridge, UK, <sup>5</sup> Potsdam Research Institute for Multilingualism, University of Potsdam, Potsdam, Germany*

Keywords: morphology, derivation, inflection, compound, L2, dyslexia, semantics, decomposition

# **The Editorial on the Research Topic**

## **Morphologically Complex Words in the Mind/Brain**

In most languages, sentences can be broken down into words, which can themselves be further decomposed into units that carry meaning of their own, so-called morphemes (e.g., "play" or the plural suffix "-s"). Morphemes are the main building blocks that we use to create and change words. The representation of morphologically complex words (inflected, derived, and compound) in the mental lexicon and their neurocognitive processing have been vigorously investigated in psycholinguistics and the cognitive neuroscience of language. Are morphologically complex words such as "player" and "plays" decomposed into their constituents (i.e., into the stem "play" and the agentive suffix "-er" or the plural suffix "-s") or are they processed and represented holistically ("player" and "plays")? Despite extensive research, many important questions remain unanswered. Our Research Topic addresses several currently unresolved questions concerning the time-course of morphological analysis and the relationship between form and meaning information in morphological parsing. The studies also ask how inflections and derivations differ in the way they are handled by the mental lexicon, how compound words are recognized and produced, and how morphologically complex words are processed within the bilingual mental lexicon and by different clinical populations.

### Edited and reviewed by:

*Srikantan S. Nagarajan, University of California, San Francisco, USA*

### \*Correspondence:

*Alina Leminen alina.leminen@helsinki.fi*

Received: *10 December 2015* Accepted: *28 January 2016* Published: *16 February 2016*

### Citation:

*Leminen A, Lehtonen M, Bozic M and Clahsen H (2016) Editorial: Morphologically Complex Words in the Mind/Brain. Front. Hum. Neurosci. 10:47. doi: 10.3389/fnhum.2016.00047*

With respect to the time-course of morphological processing and the interplay between form and meaning, many current models assume that morphological processing proceeds by analyzing form first at the very earliest stages of processing, after which the meaning of the morphemes is accessed (e.g., Rastle and Davis, 2008). In contrast, Feldman et al. provided evidence for the view that meaning information comes into play even at the very early stages of morphologically complex word recognition. Two studies focusing on the role of semantic transparency and regularity in derived and inflected words (Estivalet and Meunier; Smolka et al.) indicate decomposition of both opaque and transparent words, semantically and phonologically, in two different languages. That is, both semantically transparent and opaque derivations were found to be represented and processed in similar ways in German (Smolka et al.), and all inflected verbal forms in French showed decomposition effects during visual recognition (Estivalet and Meunier), regardless of their regularity and phonological realization, thus supporting models of obligatory morphological decomposition (e.g., Taft, 2004). Two neuroimaging studies in this Research Topic elucidated the neural correlates of the processing of regular vs. irregular inflection, a highly debated issue. Using time-resolved magnetoencephalography (MEG) with English verbs, Fruchter et al. found priming effects for visually presented irregular stimuli quite early in processing, within the left fusiform and inferior temporal regions. The results were interpreted as favoring a single mechanism account of the English past tense, in which even irregulars are decomposed into stems and affixes prior to lexical access (Stockall and Marantz, 2006), as opposed to a dual mechanism model, in which irregulars are recognized as whole forms (e.g., Pinker, 1991).
On the other hand, working with Russian, a language that has received very little scrutiny so far, and using a relatively novel analysis of fMRI functional connectivity, Kireev et al. reported that functional connectivity between the left inferior frontal gyrus (LIFG) and bilateral superior temporal gyri (STG) was significantly greater for regular real verbs than for irregular ones during production. The results shed new light on the functional interplay within the language-processing network and stress the role of functional temporo-frontal connectivity in complex morphological processes. These two studies, with arguably different outcomes, suggest that the debate on regular vs. irregular form processing continues. However, they also point to the potentially critical influences of the processing modality (written vs. spoken) as well as the task (comprehension vs. production) on the mechanism of morphological processing.

Turning to the question of inflected and derived word processing, several previous studies have observed differences in the underlying neural mechanisms (e.g., Leminen et al.; Leminen et al., 2013; Leminen et al.; for a review see e.g., Bozic and Marslen-Wilson, 2010). Service and Maury report differences between derivations and inflections in working memory (as measured by simple and complex span tasks), suggesting different levels of lexical competition and hence differential lexical storage. Using combined magneto- and electroencephalography (M/EEG), Whiting et al. defined the spatiotemporal patterns of activity that support the recognition of spoken English inflected and derived words. Results demonstrated that spoken complex word processing engages the left hemisphere's fronto-temporal language network and, importantly, does not require focused attention on the linguistic input (Whiting et al.). Using a similar auditory passive oddball paradigm and EEG, Hanna and Pulvermüller observed that the processing of spoken derived words was governed by a distributed set of bilateral temporo-parietal areas, consistent with the previous literature (Bozic et al., 2013; Leminen et al.). In addition, derived words were found to have full-form memory traces in the neural lexicon (see e.g., Clahsen et al., 2003; Bozic and Marslen-Wilson, 2010; Leminen et al.), activated automatically (see also Leminen et al., 2013).

In the cognitive neuroscience of language, the neural processing of compound words has been largely under-investigated. An article by Brooks and Cid da Garcia therefore makes an important contribution to elucidating this issue. Their primed word naming task revealed decompositional effects in access to both transparent and opaque compounds. In the MEG results, the left anterior temporal lobe (LATL) as well as the left posterior superior temporal gyrus showed increased activity only for the transparent compounds. These effects were concluded to be related to compositional processes and lexical-semantic retrieval, respectively. Our Research Topic also presents novel findings on written production of compounds, where Bertram et al. introduce an approach rarely used with morphologically complex words. Specifically, they investigated the interplay between central linguistic processing and peripheral motor processes during typewriting. Bertram et al. concluded that compound words seem to be retrieved as whole words before writing is initiated and that linguistic planning is not fully complete before writing, but cascades into the motor execution phase.

With respect to the important topic of bilingual morphological processing, our Research Topic presents three studies and one commentary. Lensink et al. used a priming paradigm to show that both transparent (e.g., moonlight) and opaque (e.g., honeymoon) compounds in the second language (L2) undergo morphological analysis in production. The second study (De Grauwe et al.) used fMRI to assess the processing of Dutch prefixed derived words, demonstrating a priming effect for L2 speakers in the LIFG, an area that has been associated with morphological decomposition. De Grauwe et al. concluded that L2 speakers decompose transparent derived verbs rather than process them holistically. In his commentary on De Grauwe et al.'s article, Jacob discusses the specific aspect of decomposition that the LIFG finding might reflect, as well as the extent to which the findings can be generalized to all derivations rather than one particular verb class. In the third article, Mulder et al. examined the role of orthography and task-related processing mechanisms in the activation of morphologically related complex words during bilingual word processing. Their study shows that the combined morphological family size is a better predictor of reaction times (RTs) than the family size of individual languages. This study also demonstrates that the effect of morphological family size is sensitive to both semantic and orthographic factors, and that it also depends on task demands.

Last but not least, two studies aimed to provide insights into morphological processing by analyzing neglect and letter position errors in dyslexic populations. Reznick and Friedmann suggested that the effect of morphology on reading patterns in neglexia provides supportive evidence that morphological decomposition occurs pre-lexically, in an early orthographic-visual analysis stage. Using a different dyslexic population, individuals with letter position dyslexia, Friedmann et al. reached a similar conclusion that morphological parsing takes place at an early, pre-lexical stage and that decomposition is structurally rather than lexically driven.

To summarize, this Research Topic presents an overview of a wide range of questions currently addressed in the field of morphological processing. It highlights the significance of morphological information in language processing, both written and spoken, as assessed by the variety of methods and approaches presented here. The partly discrepant findings in some of the contributions to our Research Topic also underline the need for increased cross-talk between researchers using different methods, modalities, and paradigms.

# AUTHOR CONTRIBUTIONS

AL wrote the main paper; ML and MB edited the manuscript; HC provided conceptual advice.

# ACKNOWLEDGMENTS

We would like to thank all the authors and reviewers who contributed to this Research Topic. AL is funded by the Lundbeck Foundation (PI Yury Shtyrov) and the Kone Foundation. ML is funded by the Academy of Finland (grant #288880) and HC holds an Alexander-von-Humboldt Professorship.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Leminen, Lehtonen, Bozic and Clahsen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neural dynamics of inflectional and derivational processing in spoken word comprehension: laterality and automaticity

# *Caroline M. Whiting<sup>1,2</sup>\*, William D. Marslen-Wilson<sup>1,2</sup> and Yury Shtyrov<sup>2,3,4</sup>*

<sup>1</sup> Department of Psychology, University of Cambridge, Cambridge, UK

<sup>2</sup> MRC Cognition and Brain Sciences Unit, Cambridge, UK

<sup>3</sup> Center of Functionally Integrative Neuroscience, Aarhus University, Aarhus, Denmark

<sup>4</sup> Centre for Languages and Literature, Lund University, Lund, Sweden

### *Edited by:*

Alina Leminen, University of Helsinki, Finland

### *Reviewed by:*

Dirk Koester, Bielefeld University, Germany; Linnaea Stockall, Queen Mary University of London, UK

### *\*Correspondence:*

Caroline M. Whiting, Department of Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK e-mail: cmw59@cam.ac.uk

Rapid and automatic processing of grammatical complexity is argued to take place during speech comprehension, engaging a left-lateralized fronto-temporal language network. Here we address how neural activity in these regions is modulated by the grammatical properties of spoken words. We used combined magneto- and electroencephalography to delineate the spatiotemporal patterns of activity that support the recognition of morphologically complex words in English with inflectional (-s) and derivational (-er) affixes (e.g., bakes, baker). The mismatch negativity, an index of linguistic memory traces elicited in a passive listening paradigm, was used to examine the neural dynamics elicited by morphologically complex words. Results revealed an initial peak 130–180 ms after the deviation point with a major source in left superior temporal cortex. The localization of this early activation showed a sensitivity to two grammatical properties of the stimuli: (1) the presence of morphological complexity, with affixed words showing increased left-laterality compared to non-affixed words; and (2) the grammatical category, with affixed verbs showing greater left-lateralization in inferior frontal gyrus compared to affixed nouns (bakes vs. beaks). This automatic brain response was additionally sensitive to semantic coherence (the meaning of the stem vs. the meaning of the whole form) in left middle temporal cortex. These results demonstrate that the spatiotemporal pattern of neural activity in spoken word processing is modulated by the presence of morphological structure, predominantly engaging the left hemisphere's fronto-temporal language network, and does not require focused attention on the linguistic input.

**Keywords: morphology, MEG, EEG, inflection, derivation, language comprehension, attention**

# **INTRODUCTION**

Successful speech comprehension involves extracting linguistic information from a spoken input and accessing a unique representation from the mental lexicon. In mapping from speech to meaning, converging evidence from behavioral, neuroimaging, and neuropsychological studies indicates that the grammatical structure of a word is automatically detected and segmented – e.g., *darkness* is broken down into two morphemes, the stem *dark* and the affix -*ness* (Taft and Forster, 1975; Marslen-Wilson et al., 1994; see Rastle and Davis, 2008 for review). This has motivated longstanding questions about how lexical representations are organized and accessed, in particular for words containing more than one morpheme.<sup>1</sup> Morphological complexity plays a key role in languages such as English by introducing systematic and productive elements to the language, broadening the range of possible meanings through the use of multiple morphemes within a word. A critical question in this study will be how the language system identifies and processes this linguistic complexity as the speech signal unfolds.

We examine two types of affixes in English, inflectional (*-s*) and derivational (*-er*), both of which combine with a stem to form a morphologically complex word.<sup>2</sup> Forms containing an inflectional suffix are semantically transparent, such that the meaning of the complex form is predictable from the meaning of the stem (e.g., *jump-jumps-jumped*). It has been argued that inflections create a new form but not a new lexical entry (Clahsen et al., 2003). Derivational affixes, by contrast, change the meaning and in many cases the grammatical category of the stem (e.g., *farm-farmer*). To date, extensive evidence from masked priming in the visual domain supports the claim for automatic morphological segmentation (Rastle et al., 2000, 2004; Longtin et al., 2003; Longtin and Meunier, 2005; Marslen-Wilson et al., 2008), where *any* word containing a potential stem and affix is segmented. This work has primarily focused on derived forms, but research on inflected forms – often centered on distinctions between regular and irregular past-tense processing – has also pointed to early morphological decomposition (Meunier and Marslen-Wilson, 2004; Crepaldi et al., 2010). Converging evidence for processing of inflected forms has come from spoken word comprehension. Spoken forms ending in the characteristic pattern of regular inflection in English – a final coronal consonant (d, t, s, z) that agrees in voicing with the preceding phoneme (e.g., *played* and *plays* as opposed to vowel–consonant voicing mismatch in *plate* and *place*) – will trigger automatic morphological decomposition (Tyler et al., 2005; Post et al., 2008). Though this appears counterproductive for words such as *corner* or *trade*, where a decompositional reading of *corn* + -*er* or *tray* + -*ed* has no relationship to the meaning of the whole form, it suggests a tuned sensitivity of the language system to morphological structure.

<sup>1</sup>We use here the standard linguistic definition of the morpheme as the minimal meaning-bearing linguistic unit (e.g., Matthews, 1991), distinguishing between 'content' morphemes like the stem {dark}, and grammatical morphemes like the derivational affix {-ness}.

<sup>2</sup>A third type of morphological complexity in English involves compound words (e.g., blackboard), composed of multiple roots as opposed to a root and affix, which are not assessed in the present study (but see MacGregor and Shtyrov, 2013, for related evidence on compound processing).
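As a rough illustration of the phonological cue to regular inflection described above, the following Python sketch checks whether a phoneme sequence ends in a coronal consonant (d, t, s, z) that agrees in voicing with the preceding phoneme. The function name `has_inflectional_rhyme`, the ARPAbet-style phoneme symbols, and the simplified voicing table are assumptions made for this example; they are not taken from the studies cited above.

```python
# Hypothetical, simplified voicing table: any phoneme not listed here
# (vowels, nasals, liquids, voiced obstruents) is treated as voiced.
VOICELESS = {"P", "T", "K", "F", "S", "TH", "SH", "CH", "HH"}

# Final consonants that can realise regular English inflection.
CORONAL_FINALS = {"D", "T", "S", "Z"}


def has_inflectional_rhyme(phonemes):
    """Return True if the sequence ends in d/t/s/z agreeing in voicing
    with the immediately preceding phoneme."""
    if len(phonemes) < 2:
        return False
    previous, final = phonemes[-2], phonemes[-1]
    if final not in CORONAL_FINALS:
        return False
    # Voicing agreement: both voiceless or both voiced.
    return (final in VOICELESS) == (previous in VOICELESS)


# Illustrative transcriptions (assumed for this sketch).
played = ["P", "L", "EY", "D"]   # voiced vowel + voiced d: pattern present
plate = ["P", "L", "EY", "T"]    # voiced vowel + voiceless t: mismatch
trade = ["T", "R", "EY", "D"]    # pattern present although not inflected
```

Under these assumptions, *played* and *trade* both carry the pattern while *plate* does not, which mirrors the point that the cue is defined over form and is blind to whether the word is actually inflected.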

To address the neural foundations of this automatic morphological process, it is essential to use a brain imaging technique which can provide not only spatial but also temporal precision. For this reason, we use concurrent magnetoencephalography (MEG) and electroencephalography (EEG) recordings of brain responses to morphologically simple and complex words. In the visual domain, converging cross-linguistic evidence using EEG has pointed to specific processes linked to the presence of morphological complexity in the time window of the N400 (Münte et al., 1999; Rodriguez-Fornells et al., 2001; Lavric et al., 2007; Lehtonen et al., 2007), with additional studies showing earlier effects between 150 and 300 ms (Morris et al., 2008; Lehtonen et al., 2011; Lavric et al., 2012; Morris and Stockall, 2012). Recent MEG evidence has revealed early effects before 200 ms (Zweig and Pylkkänen, 2009; Lewis et al., 2011), as well as effects peaking at 400 ms (Vartiainen et al., 2009). Taken as a whole, these studies provide evidence for sensitivity to potential morphological structure, with the work on derived forms showing that complex and pseudo-complex forms like *farmer* and *corner* produce a similar neural pattern relative to orthographic controls such as *scandal* (*scan* + non-affix *-dal*; Morris et al., 2008; Lehtonen et al., 2011; Lavric et al., 2012). These findings have been taken as evidence for automatic morphological segmentation independent of word meaning, confirming the behavioral masked priming effects.

Evidence for blind morphological decomposition does not, however, require a decompositional representation for all words containing morphological structure – and indeed would not be appropriate for pseudo-affixed words such as *corner*. Dual-route accounts have been proposed which argue for decompositional processes, but allow for the co-existence of whole-word and morphologically decomposed representations (Caramazza et al., 1985; Marslen-Wilson et al., 1994; Schreuder and Baayen, 1995). This presupposes a level of processing where forms are accessed in terms of their constituent morphemes, but does not assume all complex words are accessed through parsing. Electrophysiological evidence for dual-route recognition has been demonstrated through sensitivity to surface frequency and the relationship between stem and suffix (transition probability), suggesting that both whole form and morphological factors modulate early stages of word processing (Lewis et al., 2011). Features of the affix are thought to play a key role in determining whether a form is represented decompositionally or as a full form, including word formation type (inflected vs. derived) and the productivity of the affix (Bertram et al., 2000).

There is accumulating neuroimaging and neuropsychological evidence to suggest that the presence of an inflectional ending engages left hemisphere fronto-temporal regions, with specific involvement of the left inferior frontal gyrus (IFG; Laine et al., 1999; Longworth et al., 2005; Tyler et al., 2005; Lehtonen et al., 2006; Bozic et al., 2010). Derivationally complex forms appear to show a distinct neural pattern, engaging a more bilateral system (Meinzer et al., 2009; Leminen et al., 2011; Bozic et al., 2013), and suggesting that lexical access to derivations may be achieved via the full forms. We aim to detail these putative differences in brain activation dynamics by comparing EEG/MEG activation elicited by inflections and derivations in a tightly controlled stimulus set. We focus in this study on the initial stages of morphological processing involved in identifying complexity. If there is rapid morphological segmentation, as has been argued in the visual domain (see Rastle and Davis, 2008 for review), we would hypothesize that this process will be triggered for both types of affixes (inflectional and derivational) once phonological cues to the presence of the affix are identified.

Particular challenges arise when addressing morphological processing in the auditory domain. Unlike written text, where there are discrete letters available simultaneously to the reader, spoken language is uttered in a continuous stream. The listener must recognize linguistic units within a stream that is evolving over time, with new information constantly arriving at the auditory system. Models of spoken word processing state that listeners are able to recognize words before they have finished hearing them (Marslen-Wilson and Welsh, 1978; Grosjean, 1980), where multiple candidates compete for selection until the speech input is uniquely identifiable. The notion of simultaneous activation of all potential candidate words is a fundamental concept in many spoken word recognition models (Marslen-Wilson and Welsh, 1978; McClelland and Elman, 1986; Norris, 1994). Thus, an important issue is determining the point in the speech signal at which there is sufficient information to determine its correct identity, in particular when considering the relationship between the meaning of the stem and the meaning of the complex form (*jump-jumps, farm-farmer, corn-corner*). By tracking the time course of spoken word comprehension using time-resolved MEG/EEG, it is possible to time-lock neural responses precisely to the suffix onset and thus investigate how the suffix triggers segmentation once it can be identified in the speech signal.
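The idea that candidates compete until the input is uniquely identifiable can be sketched in a few lines of Python. This is a toy illustration only: letters stand in for phonemes, the six-word lexicon is invented for the example, and the function name `uniqueness_point` is hypothetical rather than drawn from any of the models cited above.

```python
def uniqueness_point(word, lexicon):
    """Return the 1-based position at which `word` no longer shares its
    onset with any other lexicon entry, or None if it never diverges
    (e.g., when the word is itself the onset of a longer entry)."""
    competitors = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        # Keep only competitors still consistent with the input so far.
        competitors = [w for w in competitors if w.startswith(prefix)]
        if not competitors:
            return i
    return None


# Toy lexicon built from the stem/complex-form pairs mentioned in the text.
LEXICON = ["jump", "jumps", "farm", "farmer", "corn", "corner"]

farmer_up = uniqueness_point("farmer", LEXICON)  # diverges from "farm" at the affix
farm_up = uniqueness_point("farm", LEXICON)      # never diverges from "farmer"
```

In this toy lexicon a complex form such as *farmer* only becomes unique once the affix begins, while the bare stem *farm* remains compatible with *farmer* throughout, which is one way to see why the suffix onset is a natural point for time-locking.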

fMRI studies delineating the neural systems underlying speech comprehension have shown that a bilateral fronto-temporal network is engaged in the processing of spoken words, including superior and middle temporal regions which are linked to the processing of lexical meaning (Binder et al., 2000; Scott et al., 2000; Davis and Johnsrude, 2003; Hickok and Poeppel, 2007). A further left-lateralized subsystem of this network has been implicated in the processing of morphological complexity, comprising left-hemisphere frontal, temporal, and parietal regions (Friederici et al., 2003; Marslen-Wilson and Tyler, 2007; Bozic et al., 2010). Thus, by manipulating the presence or absence of potential morphological complexity, we can investigate how these bilateral and left-lateralized networks are activated during spoken word comprehension. Once evidence has accumulated that a potential affix is present in the speech signal, we would predict that processing should automatically shift to the left-lateralized fronto-temporal system.

To address these issues neurophysiologically, it is necessary to use brain responses that reflect automatic processing, provide accurate information on the time course of stimulus-specific processing in the brain, and that are sensitive to the linguistic properties of the stimuli. For these reasons, the present study involves the use of the mismatch negativity (MMN), a neural response component elicited by rare unexpected changes in the auditory stream. The paradigm consists of an oddball design in which a sequence of a frequent "standard" stimulus is occasionally replaced by a rare "deviant" stimulus (Näätänen et al., 1978). It has been argued that the MMN – typically a negative deflection peaking 100–200 ms after the onset of the change between deviant and standard – can reflect the activation of experience-dependent auditory memory traces (Näätänen et al., 1997).

Crucially for our study, the mismatch response is sensitive to linguistic sounds such as syllables and words, resulting in an increased left-lateralized response for language deviants (Näätänen et al., 1997; Shtyrov et al., 2005). The amplitude of the MMN shows a specific increase for real words compared to acoustically matched pseudowords (Korpilahti et al., 2001; Pulvermüller et al., 2001). This lexical enhancement effect is explained by the existence of cortical memory traces that are automatically activated for known words in passive oddball sequences, but fail to activate for pseudowords that are not stored in the lexicon (Pulvermüller et al., 2001; Näätänen et al., 2007). Importantly, the timing of the mismatch response has been linked to behaviourally determined word-specific recognition times (Pulvermüller et al., 2006), whilst temporal patterns of local cortical activation spread show fine-tuned specificity for linguistic stimulus properties (Pulvermüller and Shtyrov, 2009).

Evidence from English inflectional morphology has shown that the mismatch response is modulated by grammatical changes due to the presence of morphological structure, with effects emerging in left-lateralized perisylvian areas for affixed deviants as compared to stems (Shtyrov and Pulvermüller, 2002); similar activity patterns were found for MMN responses elicited by differences in morphological structure in Arabic (Boudelaa et al., 2010). Our focus in this study is on the initiation of morphological segmentation of potentially complex forms, which is argued to be triggered automatically (e.g., Tyler et al., 2002; Post et al., 2008). A key advantage of the MMN paradigm is the ability to record neural responses elicited in the absence of focused attention on the auditory stream, enabling an investigation into early stages of spoken word recognition and the initiation of morphological processing before strategic effects or conscious processing of the word forms have taken place.

The MMN paradigm relies on a small set of items, implying that caution is needed in generalizing MMN results to the entire language. However, it offers a number of benefits, which make it an attractive tool for studies of spoken word recognition. It allows for ruling out acoustic confounds by incorporating the same acoustic/phonological contrast (e.g., addition of the same consonant) into different linguistic contexts which can themselves be tightly matched acoustically. By determining the time-point of standard-deviant divergence (such as addition of an affix here), neural responses can be aligned precisely allowing for a direct comparison between different morphological conditions. Finally, as mentioned above, it is an automatic response in that its elicitation does not depend on focusing attention on stimuli or engaging in a stimulus-related task.

In the present study, we include a matched set of inflected, derived and non-affixed forms. The inflectionally complex forms (*bakes, beaks*) allow us to examine how neural activity in the language system is modulated by the presence of an affix which results in a fully transparent form. Inflectional suffixes do not modify the meaning of the stem, and it has been argued that regularly inflected forms are represented and accessed compositionally (Pinker and Ullman, 2002; Marslen-Wilson and Tyler, 2007). We predict that inflected forms should trigger automatic decomposition, engaging a left-lateralized network including inferior frontal cortex compared to non-affixed forms (Tyler et al., 2005). Converging MMN findings show a left-lateralized response to inflected forms at ∼150 ms (Shtyrov and Pulvermüller, 2002; Shtyrov et al., 2005), indicating that the mismatch response can reveal specific memory trace activations in the neural subsystems involved in morphological decomposition.

Further, such a stimulus design allows us to directly contrast the verb (*bakes*) and the noun (*beaks*) inflection in order to examine potential differences related to grammatical class (signaling agreement as opposed to nominal plural). Differential noun vs. verb processing has been suggested in previous studies, where inflected verbs have revealed an increased left-lateralized distribution compared to inflected nouns, with key involvement of left inferior frontal cortex (Shapiro and Caramazza, 2003; Tyler et al., 2004; Longe et al., 2007). Though both forms are morphologically complex and would require segmentation into stem and suffix, it has been argued that verbs and nouns differentially engage the neural systems involved in morphological processing when they are inflected. This has been linked to differences in grammatical function of verbs and nouns in English, where verbs can be associated with a greater range of inflections to mark number, tense and person (unlike nouns, which only mark number), thus playing a greater role in the structural interpretation of a sentence (Tyler et al., 2004). However, the automaticity of this neural distinction between word classes remains unexplored. In the present study, we hypothesize increased engagement of left fronto-temporal regions for suffixed verbs compared to nouns, in particular in left inferior frontal cortex.

Using the derivational suffix *-er*, we investigate a further contrast between semantically transparent and pseudo-affixed word forms (*baker* vs. *beaker*) in order to examine whether morphological processing is indeed unaffected by the lexical appropriateness of the segmentation, as indicated by the previous behavioral investigations. Like the inflected forms, we would predict automatic segmentation of complex and pseudo-complex forms, with both derived forms patterning with the inflected forms compared to non-affixed forms. This would indicate the existence of discrete neural networks for automatic morphological processing which should be engaged for all forms containing potential complexity (e.g., Morris et al., 2008; Lehtonen et al., 2011; Lavric et al., 2012).

The two affixed conditions (*bakes/baker, beaks/beaker*) are contrasted with non-affixed forms (*bacon/beacon*) that embed the same (false) stems but should not trigger any attempts at segmentation as no affix is present. These non-complex forms are likely to engage a more bilateral cortical distribution, since morphological processes may not be engaged when no clues for morphological segmentation (such as a valid suffix) are present (Bozic et al., 2010). In addition, we include a control condition aimed at assessing acoustic/phonological effects by incorporating the same deviant contrasts in a meaningless pseudoword (*boke*). This provides a way of assessing whether effects are due to processing of low-level acoustic changes, rather than morphological processing.

In summary, the aim of this work is to examine how the spatiotemporal dynamics of word processing are modulated once a potential affix is identified in the speech signal. We focus on pinpointing when and where morphological information is accessed, and whether this is done automatically in the absence of attention, using the fine-grained spatiotemporal resolution of combined magneto- and electroencephalography (MEG–EEG). We address two issues of morphological processing: contrasting suffixed and non-suffixed forms, as well as inflected and derived forms, the latter comprising both semantically transparent and opaque derivations. We predict increased left fronto-temporal engagement for all morphologically complex forms compared to simple forms, regardless of word meaning, triggered by the presence of an inflectional or derivational suffix. Furthermore, with the inflected forms we examine processing of grammatical category, contrasting noun and verb forms. Verbal *-s* forms should elicit more left fronto-temporal activation, in particular in IFG, compared to nominal -*s* forms. To assess this potential shift in left hemisphere engagement for morphologically complex forms, we incorporate a laterality analysis (Shtyrov et al., 2005) to examine hemispheric differences across the complex and non-complex forms. The MMN paradigm has revealed increased left-lateralization for language stimuli (Näätänen et al., 1997), and we would predict this laterality should increase for morphologically complex forms compared to simple forms, and for verbs compared to nouns, both properties which have been shown to modulate the degree of left fronto-temporal activity. In this way we can examine how the addition of a suffix modulates the spatiotemporal pattern of word recognition as the speech signal unfolds, as well as the networks that support recognition of morphologically simple and complex spoken words.

# **MATERIALS AND METHODS**

## **SUBJECTS**

Fifteen subjects (13 female) took part in the experiment. All were right-handed (handedness tested according to Oldfield, 1971; range: 85–100%) native British English speakers aged 19–34 years (mean age 24.9) with normal hearing, normal or corrected-to-normal vision, and no history of neurological disease. All gave written consent to take part and were paid for their time.

## **MATERIALS**

Standards and deviants were selected so that acoustic differences were minimized while lexicality, semantic transparency (the relationship between stem and whole form), and morphological complexity (the presence of a potential suffix) were manipulated. Two word conditions (*bake, beak*) and one pseudoword condition (*boke*) were presented as standards in separate experimental blocks. Three deviants were created by adding an inflectional affix (*-s*), a derivational affix (*-er*), and a non-affix (*-on*) to each of the standards (see **Table 1**). Crucially, the addition of *-er* produced a semantically transparent or opaque meaning in relation to the stem (*baker* vs. *beaker*). Both inflected forms (*bakes, beaks*) produced a valid complex form but differed in word class (verb vs. noun). Stimuli were matched on spoken wordform frequency, taken from the Celex database (Baayen et al., 1995), and neighborhood size (*N*).

Unaffixed stem stimuli (*bake, beak, boke*) were spoken by a female native British English speaker. Multiple versions of the standards were recorded, and the selected stimuli were closely matched on pitch/fundamental frequency, intensity and duration. The [b + vowel] segment was cut from each standard and served as the base form for all stimuli in the experiment. These base forms were adjusted to be of equal length (165 ms); they were also normalized for their peak amplitude. Endings for the standards and deviants were taken from recordings of the words *wreck*, *wrecks*, *wrecker*, and *reckon*; thus, the speaker produced the endings in the context of real words without a specific co-articulation bias toward any vowel used in the test stimuli. Multiple tokens of these words were also recorded, and the selected words were closely matched on pitch, intensity and duration to the main test stimuli. Each [k + ending] was spliced after the [b + vowel] following a 75 ms pause, which signaled the closure period before the release of the [k] typical of stop consonants in the English language. The durations of the deviant endings were also adjusted to be of equal length starting from the [k] release.

Within each condition, the same [b + vowel] was used, and within each deviant set, the same [k + ending] was used. Thus, the stimuli of a given condition (i.e., all *bake* forms) were identical until the release of the [k]. This occurred at 240 ms post-stimulus onset, and all deviants were 460 ms long in total (see **Figure 1**). In this way, a set of naturally sounding but strictly controlled stimuli was obtained that was matched for acoustic–phonetic properties between conditions; furthermore, the deviant-standard contrasts (the critical feature determining purely acoustic MMN) were identical across the three main sets. At the same time, the context in which these contrasts occurred was systematically modulated, allowing us to rule out any acoustic confounds and concentrate on the linguistic context effects.
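The stimulus timing described above can be checked with a few lines of arithmetic. The sketch below uses only the durations reported in the text; the variable and function names are illustrative, and the 220 ms ending duration is derived here rather than stated in the original.

```python
# Durations in ms, as reported in the text.
BASE_MS = 165           # [b + vowel] segment shared within a condition
CLOSURE_MS = 75         # silent closure period before the [k] release
DEVIANT_TOTAL_MS = 460  # total length of every deviant

# Divergence point: the [k] release, where standard and deviant start to differ.
k_release_ms = BASE_MS + CLOSURE_MS

# Derived (not stated in the text): length of the [k + ending] portion.
ending_ms = DEVIANT_TOTAL_MS - k_release_ms


def to_stimulus_time(latency_from_release_ms):
    """Convert a latency measured from the [k] release to time from stimulus onset."""
    return latency_from_release_ms + k_release_ms


print(k_release_ms, ending_ms, to_stimulus_time(100))  # 240 220 340
```

Because all latencies in the analysis are reported relative to the [k] release, a helper of this kind is useful mainly when relating them back to the raw stimulus-onset triggers.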

## **PROCEDURE**

Stimuli were presented pseudo-randomly in blocks of approximately 20 min in length, with short pauses between blocks and in the middle of each block. The order of the conditions was randomized across subjects. The pseudo-randomization within each block ensured that at least two standards appeared between successive deviants; the order of the deviants within the blocks was otherwise completely random. Each stimulus was presented for 460 ms with a jittered inter-trial offset-to-onset interval of 460–500 ms. For each condition, 100 trials of each of the three deviants were presented in the context of 900 standards, constituting 25% deviants (8.3% of each) and 75% standards. Ten filler trials of the standard stimulus were used at the beginning of each block to build up a representation of the standard, and were not included in the average event-related field. Every standard that appeared after a deviant was also discarded, as it might produce a change detection response of its own when immediately following the deviant.

### **Table 1 | Standards and deviants used in MMN study.**

\* indicates pseudoword.
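The pseudo-randomization constraints in the procedure (at least two standards between deviants, random deviant order, leading filler standards, and exclusion of the standard that directly follows a deviant) can be sketched as follows. This is an illustrative reconstruction, not the E-Prime script used in the study: the function names and labels are hypothetical, and treating the ten fillers as additional to the 900 standards is an assumption.

```python
import random


def make_oddball_sequence(deviants=("s", "er", "on"), n_per_deviant=100,
                          n_standards=900, min_gap=2, n_fillers=10, seed=0):
    """Build one condition's trial list with >= min_gap standards before each deviant."""
    rng = random.Random(seed)
    dev_order = [d for d in deviants for _ in range(n_per_deviant)]
    rng.shuffle(dev_order)  # deviant order is random

    # Reserve min_gap standards per deviant, then scatter the rest at random.
    n_extra = n_standards - min_gap * len(dev_order)
    if n_extra < 0:
        raise ValueError("not enough standards to satisfy min_gap")
    extra = [0] * (len(dev_order) + 1)  # one slot per deviant plus a trailing slot
    for _ in range(n_extra):
        extra[rng.randrange(len(extra))] += 1

    seq = ["filler"] * n_fillers  # assumption: fillers are not part of the 900
    for dev, n_more in zip(dev_order, extra):
        seq += ["standard"] * (min_gap + n_more)
        seq.append(dev)
    seq += ["standard"] * extra[-1]
    return seq


def usable_standards(seq, deviants=("s", "er", "on")):
    """Indices of standards kept for averaging: not fillers, not right after a deviant."""
    return [i for i, item in enumerate(seq)
            if item == "standard" and (i == 0 or seq[i - 1] not in deviants)]


seq = make_oddball_sequence()
n_dev = sum(seq.count(d) for d in ("s", "er", "on"))
print(len(seq), n_dev / (n_dev + seq.count("standard")))  # 1210 0.25
```

Reserving the mandatory standards first and distributing the remainder afterwards guarantees the minimum gap by construction, so no rejection sampling is needed.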

Stimuli were presented binaurally through non-magnetic earpieces attached to plastic tubes while subjects were seated in front of a screen inside a dimly lit, magnetically shielded room. Before the experiment began, subjects were given a hearing test to ensure they could hear sounds equally well in each ear. Subjects were instructed to attend to a silent video during the experiment and did not perform a task on the stimuli, which they were instructed to ignore. They were told there would be a questionnaire following the experiment on details concerning the film, and all subjects scored at least 90% on the questionnaire. The experiment was run using E-Prime 1.0 (Psychology Software Tools, Sharpsburg, PA, USA) and lasted approximately 60 min.

## **DATA ACQUISITION**

Concurrent MEG–EEG data were acquired at a sampling rate of 1000 Hz (passband 0.10–330 Hz), with triggers placed at the onset of each stimulus. Neuromagnetic signals were recorded continuously with a 306-channel (102 magnetometers and 204 planar gradiometers) Vectorview MEG system (Elekta Neuromag, Helsinki, Finland). Electrical activity was recorded using a 70-channel EEG cap (Easycap, Herrsching, Germany), using a reference electrode on the nose. Prior to recording, five electromagnetic coils were positioned on the head and digitized along with the EEG electrodes using the Polhemus Isotrak digital tracker system (Polhemus, Colchester, VT, USA) with respect to three standard anatomical landmarks (nasion, left and right pre-auricular points). During the recording, the position of the magnetic coils was continuously tracked using continuous head position identification (cHPI), providing information on the exact head position within the MEG dewar for later movement correction. Four electrooculogram (EOG) electrodes were placed laterally to each eye and above and below the left eye to monitor horizontal and vertical eye movements during the recording.

## **PRE-PROCESSING**

Continuous raw data were pre-processed off-line with the MaxFilter (Elekta Neuromag) implementation of the signal-space separation (SSS) technique with a temporal extension (tSSS; Taulu and Simola, 2006), which minimizes movement artifacts and effects of magnetic sources outside the head. Averaging was performed using the MNE Suite (Athinoula A. Martinos Center for Biomedical Imaging, Boston, MA, USA). Epochs containing gradiometer, magnetometer, or EEG/EOG peak-to-peak amplitudes larger than 3000 fT/cm, 6500 fT, or 200 μV, respectively, were rejected. Trials were averaged by condition with epochs generated from −50 to 500 ms from the [k] release (at 240 ms after stimulus onset), at which point the standard and deviant stimuli started to diverge. Averaged data were low-pass filtered at 45 Hz and baseline corrected using the −50 to 0 ms interval before the divergence point. This interval was selected because it falls within the closure period preceding the [k] release (a silent period of 75 ms), and the standard and all the deviants are identical up to this point; there should therefore be no differences before this time point except random noise-related variations, which the baseline-correction procedure should remove. The average response for the standards was subtracted from the three associated deviants to produce the MMN. For sensor-level analysis, tSSS was used to transform MEG data to the head position coordinates of the subject with the median head position within the helmet, to minimize transformation distance.
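The epoch-level steps described in this section (peak-to-peak rejection, averaging, baseline correction, and deviant-minus-standard subtraction) can be illustrated with a minimal pure-Python sketch. It is not the MNE Suite pipeline used in the study: the data layout and function names are hypothetical, the thresholds are kept in the units reported in the text, and the 45 Hz low-pass filter and tSSS steps are omitted.

```python
# Rejection thresholds as reported: fT/cm (gradiometers), fT (magnetometers), uV (EEG/EOG).
REJECT_P2P = {"grad": 3000.0, "mag": 6500.0, "eeg": 200.0}


def peak_to_peak(samples):
    return max(samples) - min(samples)


def keep_epoch(epoch):
    """epoch maps a channel type to a list of channels, each a list of samples."""
    return all(peak_to_peak(ch) <= REJECT_P2P[kind]
               for kind, channels in epoch.items() for ch in channels)


def average(traces):
    """Sample-wise mean over a list of equal-length single-channel traces."""
    n = len(traces)
    return [sum(vals) / n for vals in zip(*traces)]


def baseline_correct(trace, times_ms, start=-50, stop=0):
    """Subtract the mean of the pre-divergence interval (closure period)."""
    base = [v for v, t in zip(trace, times_ms) if start <= t <= stop]
    offset = sum(base) / len(base)
    return [v - offset for v in trace]


def mismatch_response(deviant_traces, standard_traces, times_ms):
    """Deviant average minus standard average, both baseline corrected."""
    dev = baseline_correct(average(deviant_traces), times_ms)
    std = baseline_correct(average(standard_traces), times_ms)
    return [d - s for d, s in zip(dev, std)]


times = [-50, -25, 0, 25, 50]  # ms relative to the [k] release
mmn = mismatch_response([[1, 1, 1, 3, 5]], [[2, 2, 2, 2, 2]], times)
print(mmn)  # [0.0, 0.0, 0.0, 2.0, 4.0]
```

In the toy example the two traces differ by a constant offset before the divergence point, which baseline correction removes, so the difference wave is zero until the deviant starts to depart from the standard.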

## **SENSOR-LEVEL ANALYSIS**

Analyses at the sensor level were conducted on EEG, gradiometers, and magnetometers separately using the sensor-space statistical parametric mapping (SPM) SensorSPM analysis method implemented in SPM5 (www.fil.ion.ucl.ac.uk/spm/). EEG and magnetometer data were used as such, whilst for each pair of gradiometer channels, a vector sum was calculated reconstructing the field gradient from its two orthogonal components, and its amplitude (computed as the square root of the sum of squared amplitudes in the two channels) was used in further analysis. For each subject and condition, a series of *F*-tests was performed on a three-dimensional topography (2D sensors by 1D time) image. Each contrast results in an SPM, in which clusters of contiguous suprathreshold voxels are corrected using Random Field Theory (Kiebel and Friston, 2004). The 3D images were thresholded at a voxel level of *p* < 0.005, and corrected for cluster size at *p* < 0.05. These clusters could extend in space (distributed across the topography) and in time. This made it possible to compare conditions across every sensor over time while still correcting for multiple comparisons, allowing us to investigate a wider spatiotemporal array (Shtyrov et al., 2012). This provides a conservative approach to defining significant effects, avoiding any bias inherent to conventional visual inspection.
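The gradiometer vector-sum amplitude described above is the Euclidean norm of the two orthogonal planar channels at each time point. A minimal sketch, with a hypothetical function name and toy values:

```python
import math


def gradiometer_amplitude(channel_a, channel_b):
    """Per-sample amplitude of a planar gradiometer pair: sqrt(a**2 + b**2)."""
    return [math.hypot(a, b) for a, b in zip(channel_a, channel_b)]


print(gradiometer_amplitude([3.0, 0.0], [4.0, -2.0]))  # [5.0, 2.0]
```

Note that the result is non-negative by construction, so the combined gradiometer signal carries magnitude but not polarity.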

## **MRI ACQUISITION AND SOURCE ESTIMATION**

MPRAGE T1-weighted structural images with a 1 mm × 1 mm × 1 mm voxel size were acquired on a 3-Tesla Trio Siemens Scanner for each subject (repetition time [TR] = 2250 ms, echo delay time [TE] = 2.99 ms, flip angle 9°, field of view [FOV] = 256 mm × 240 mm × 192 mm), which were used for source reconstruction of the cortical surface using FreeSurfer (Athinoula A. Martinos Center for Biomedical Imaging). The L2 minimum-norm estimation (Hämäläinen and Ilmoniemi, 1994) technique was applied for source reconstruction as implemented in the MNE Suite. A three-layer boundary element model (scalp, inner skull, outer skull) was created for each subject and was used to compute the combined MEG + EEG forward solutions. An average cortical solution was created from the fifteen subjects, and data from individual subjects were morphed to this cortical surface in 5 ms time-steps. The cortical representation provided by FreeSurfer was decimated to 10,242 dipoles per hemisphere, providing, at every time-step, source estimates for over 20,000 dipoles.

Regions of interest (ROIs) were anatomically defined based on the Desikan–Killiany atlas of the brain (Desikan et al., 2006) as implemented in the FreeSurfer package, with the exception of the temporal regions, which were subdivided into an anterior and a posterior region (the pre-defined ROIs extend the entire length of the temporal lobe). ROIs were defined on the average cortical surface, and for each subject the mean value for all dipoles from within each region was extracted for statistical analysis. Selected ROIs were: superior and middle temporal gyrus (STG and MTG, respectively) and inferior frontal gyrus (IFG). Time windows were defined by the results from the 3D SensorSPM analyses where significant effects were found, and were subject to further statistical analysis using repeated measures ANOVAs with condition and ROI as within-subject factors. Source-level results are visualized on the inflated cortical surface of the average subject's brain.

## **LATERALITY ANALYSIS**

Lateralization at the source level was calculated using a laterality coefficient *Q* as previously applied in psychoacoustic research and in MEG (e.g., Shtyrov et al., 2000, 2005; Holland et al., 2012):

$$Q = \frac{A_{\rm l} - A_{\rm r}}{A_{\rm l} + A_{\rm r}} \times 100\%,$$

where *A*<sub>l</sub> and *A*<sub>r</sub> are the mean amplitude across vertices in the left and right hemispheres, respectively. In this way we could assess the degree of lateralization for each condition and compare across deviant types by removing any differences in absolute magnitude. Statistical analysis was carried out using repeated measures ANOVAs, with condition and ROI as within-subject factors.
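As a worked illustration of the laterality coefficient, the sketch below computes *Q* from per-vertex amplitudes in the two hemispheres. The function name and the toy values are hypothetical; it assumes non-negative source amplitudes whose hemispheric means do not sum to zero.

```python
def laterality_q(left_amplitudes, right_amplitudes):
    """Laterality coefficient Q in percent: positive = left-lateralized."""
    a_l = sum(left_amplitudes) / len(left_amplitudes)    # mean over left-hemisphere vertices
    a_r = sum(right_amplitudes) / len(right_amplitudes)  # mean over right-hemisphere vertices
    return (a_l - a_r) / (a_l + a_r) * 100.0


print(laterality_q([2.0, 4.0], [1.0, 1.0]))  # 50.0
print(laterality_q([1.0, 3.0], [2.0, 2.0]))  # 0.0
```

Because the difference is normalized by the summed amplitude, doubling the response in both hemispheres leaves *Q* unchanged, which is what allows deviants with different overall response magnitudes to be compared.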

# **RESULTS**

In the presentation of the results, sensor-level results are presented separately for gradiometers, magnetometers, and EEG. **Figure 2** shows the MMN responses averaged across word conditions (*bake* and *beak*) at the sensor and source level, with the MMN defined as the peak between 100 and 200 ms with a major source in posterior temporal cortex. The zero time point was placed at the release of the [k], which was equivalent across conditions. The [-s] deviant had the earliest mismatch response, peaking at approximately 135 ms, while the [-er] deviant peaked at 165 ms and the [-on] deviant at 185 ms. As expected for word deviants, all three conditions showed a left-lateralized MMN, with largest activation within left temporal sensors in MEG and fronto-central electrodes in EEG. The combined MEG–EEG source solutions, seen in **Figure 2B**, confirmed this left-lateralized response, which localized primarily to posterior superior temporal cortex (**Figure 2C**).

In the laterality analysis, a 30-ms window around the peak of each mismatch response was used in order to compare across deviant conditions with differing onset latencies. We included frontal and temporal regions bilaterally, which covers the main sources of the mismatch response across the three deviant types (see **Figure 2B**), and which encompasses ROIs that have previously been implicated in morphological processing (Tyler et al., 2005; Lehtonen et al., 2006; Marslen-Wilson and Tyler, 2007; Bozic et al., 2010). Comparing the three deviants averaged across the two stems (*bakes/beaks*, *baker/beaker*, and *bacon/beacon*), there was a significant main effect of condition (*F*(1,14) = 5.62, *p* < 0.05), but no effect of ROI (*F*(4,56) = 1.60, *p* > 0.05) and no interaction between the two factors (*F* < 1). The effect of condition showed increased left-lateralization for the [-s] and [-er] deviants compared to [-on] (*p* < 0.05), as seen in **Figure 3A**. Based on the lack of a main effect of ROIs, we collapsed data across the five ROIs, which showed that the left-lateralization for the [-er] and [-s] conditions was in itself significant (i.e., greater than zero; *t*(14) = 2.58, *p* < 0.01 and *t*(14) = 2.59, *p* < 0.05, respectively), and was not significant for the [-on] condition (*t*(14) = 1.18, *p* > 0.05; two-tailed).

Within individual affix types (*bakes* vs. *beaks*, *baker* vs. *beaker*, *bacon* vs. *beacon*), the inflected [-s] forms were the only words to reveal a difference in laterality, with the verbal form *bakes* showing a more left-lateralized response compared to the nominal form *beaks* (**Figures 3B,C**). There was no significant main effect of condition (*F* < 1) or of ROI (*F*(4,56) = 1.21, *p* > 0.05), but there was a significant interaction between condition and ROI (*F*(4,56) = 2.96, *p* < 0.05) from 160 to 240 ms. We assessed this interaction statistically by carrying out a series of planned comparisons, showing greater laterality in IFG for the verb compared to the noun (*F*(1,14) = 5.30, *p* < 0.05). The timing of this effect corresponds to the second half of the mismatch response for the [-s] deviants (see **Figure 2B**). **Figure 3C** demonstrates the difference in amplitude between the two hemispheres from 160 to 240 ms (LH minus RH at each vertex), with yellow/red indicating increased left hemisphere activity, and blue indicating increased right hemisphere activity. As revealed by the laterality analysis, the verb deviant showed increased left hemisphere activity in frontal and temporal areas.

## **WORD–PSEUDOWORD**

To test for a lexical enhancement effect (e.g., Pulvermüller et al., 2001), each deviant type ([-er], [-s], and [-on]) was analyzed separately, contrasting the two word conditions (*bake, beak*) with the pseudoword (*boke*). The [-er] deviants elicited a significant effect in the gradiometers within left temporal sensors with a greater response to the two word conditions compared to the pseudoword condition (see **Figure 4A**). The cluster was significant from 150 to 185 ms with a peak at 165 ms, which corresponds to the timing and the topography of the mismatch response in the [-er] deviants. Though this predominantly gradiometer-driven effect did not reach significance in the magnetometers or EEG, the topographies in **Figure 4A** showed a greater response to the word conditions (more negative for EEG) compared to the pseudoword condition across the time window of the mismatch response. No other time windows were significant. Source analyses were performed on time windows from the sensor analysis where significant effects were found. Using combined MEG and EEG at the source level, the [-er] word–pseudoword contrast (*baker, beaker* vs. *boker*) localized primarily to left posterior temporal cortex (**Figure 4B**). Significant effects of condition (*F*(1,14) = 5.30, *p* < 0.05), ROI (*F*(4,56) = 12.61, *p* < 0.001) and the interaction of condition and ROI (*F*(4,56) = 2.89, *p* < 0.05) emerged in the left hemisphere from 150 to 185 ms. Planned comparisons showed increased amplitude for words compared to pseudowords in left posterior STG (*F*(1,14) = 11.35, *p* < 0.005). In the right hemisphere, there was a significant effect of ROI (*F*(4,56) = 3.73, *p* < 0.01), but no significant effect of condition (*F*(1,14) = 2.53, *p* > 0.05) and no interaction between the two factors (*F* < 1).

Turning to the unaffixed [-on] deviants (**Figure 5**), these revealed a significant cluster from 175 to 200 ms within anterior right temporal gradiometers, corresponding to the timing of the [-on] mismatch response (see **Figure 5A**). Unlike the sensor-level analysis, no source ROIs showed a significant lexicality effect for the [-on] word–pseudoword contrast (*bacon, beacon* vs. *bokon*). In the left hemisphere, there was a significant effect of ROI (*F*(4,56) = 12.14, *p* < 0.001) but no effect of condition (*F* < 1) or an interaction between condition and ROI (*F* < 1). In the right hemisphere, there was an effect of ROI (*F*(4,56) = 5.39, *p* < 0.001), but no effect of condition (*F*(1,14) = 1.38, *p* > 0.05) or an interaction between the two factors (*F* < 1).

## **DERIVATIONAL TRANSPARENCY CONTRAST: BAKER vs. BEAKER**

At the sensor level, the two word conditions were contrasted for each deviant type separately. Within the [-er] deviants (corresponding to the derivational affix), the words elicited a significant difference starting at 240 ms (see **Figure 6A**). In the magnetometers, the semantically opaque deviant (*beaker*) showed increased activity within right-hemisphere sensors compared to the transparent deviant (*baker*) from 240 to 270 ms. This time window corresponds to the second half of the MMN response curve, which peaks at 165 ms. The significant effect in EEG covered the time window of 240–280 ms, corresponding to distinct spatial distributions for the two conditions: a negativity for the semantically transparent deviant (*baker*) in posterior electrodes and a positivity for the semantically opaque deviant (*beaker*) in central electrodes. No significant differences were found in the gradiometers.

At the source level, an effect between the two word deviants emerged in left anterior MTG, as seen in **Figure 6B**. From 260 to 270 ms, there was no main effect of condition (*F*(1,14) = 2.05, *p* > 0.05), but a significant effect of ROI (*F*(4,56) = 3.30, *p* < 0.05), and a significant condition by ROI interaction (*F*(4,56) = 3.35, *p* < 0.05). Planned comparisons revealed increased activity for *beaker* compared to *baker* in left anterior MTG (*F*(1,14) = 4.94, *p* < 0.05). In the right hemisphere, there was a significant effect of ROI (*F*(4,56) = 2.70, *p* < 0.05), but there was no effect of condition (*F* < 1) or an interaction between the two factors (*F* < 1).

## **INFLECTIONAL WORD CLASS CONTRAST: BAKES vs. BEAKS**

In contrast to the [-er] forms, both word deviants with [-s] endings were morphologically complex and semantically transparent. At the mismatch response, peaking at 135 ms, the only difference between the [-s] deviants was linked to lateralization as described above (see **Figure 3B**).

## **MONOMORPHEMIC STIMULI WITH EMBEDDED STEMS: BACON vs. BEACON**

In contrast with multiple results obtained for affixed conditions, no significant differences in the MMN response were found at the sensor level between the non-affixed monomorphemic deviant stimuli.

# **DISCUSSION**

The aim of this study was to investigate the spatiotemporal pattern of morphological processing in the context of spoken word recognition, focusing on how neural activity within the bilateral frontal–temporal language network is modulated by the presence of a derivational or inflectional suffix. Results revealed language-specific responses that rapidly and automatically dissociated between words based on the presence of possible morphological complexity. All three conditions contained an embedded stem, and the addition of an ending that signaled either a potentially complex word or a non-affixed word resulted in distinct cortical distributions. For all conditions, the mismatch response peaked between 130 and 190 ms after the deviation point from the standard, and the source-level analysis revealed that neural activity within this time window showed a left-lateralized distribution in fronto-temporal regions. We focus on three major findings: the shift in the laterality of the brain response based on the grammatical properties of the deviants; the selectivity of the neural response for words compared to pseudowords; and the divergence between semantically transparent and opaque complex words.

## **LATERALIZATION**

The deviants all showed a left-lateralized distribution, but there was a significant shift in the degree of lateralization which was modulated by the presence of a potential affix. Both the [-s] and [-er] conditions showed increased left-lateralization within frontal and temporal regions compared to the [-on] condition during the mismatch response, and the lateralization for the affixed deviants was significantly greater than zero. This would suggest that the addition of a derivational or inflectional affix triggered increased engagement of left hemisphere fronto-temporal language regions, and this process occurred automatically once the suffix was present in the speech signal. This is in line with previous fMRI findings showing increased involvement of left-hemisphere perisylvian regions in morphological processing (Tyler et al., 2005; Lehtonen et al., 2006; Bozic et al., 2010), and supports the claim that the left-lateralized subsystem of the fronto-temporal network is specialized for processing of morphological complexity (e.g., Marslen-Wilson and Tyler, 2007). Importantly, unlike previous behavioral and fMRI results that could not speak to the timing of these events and were obtained using active tasks, the present study demonstrates that these fronto-temporal systems are triggered rapidly and automatically in the course of spoken word comprehension.
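A laterality measure of the kind discussed here is often expressed as a normalized left-right difference. The sketch below uses the common index (L − R) / (L + R), where positive values indicate left-lateralization and zero indicates a bilateral response. This formula is an assumption made for illustration and may differ from the index defined in the Methods; the amplitudes are hypothetical.

```python
def laterality_index(left: float, right: float) -> float:
    """Return (L - R) / (L + R) for non-negative activity magnitudes.

    Positive values mean stronger left-hemisphere activity; 0 means bilateral.
    """
    total = left + right
    if total == 0:
        raise ValueError("left + right must be non-zero")
    return (left - right) / total


# Hypothetical mean source magnitudes (arbitrary units), not study data.
affixed = laterality_index(left=3.0, right=1.0)
non_affixed = laterality_index(left=2.0, right=1.5)
```

In this toy case both indices are above zero, but the affixed value is larger, which is the pattern described for the [-s] and [-er] deviants relative to the [-on] deviants.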

This increase in left hemisphere engagement was present for both suffix types, derivational and inflectional. Previous MMN research has not focused on derivationally complex forms; however, source estimation from other studies examining morphological complexity and grammatical processing has demonstrated the key role of the left perisylvian areas in early stages of spoken word recognition (Shtyrov and Pulvermüller, 2002; Shtyrov et al., 2003). Furthermore, we found increased left-lateralization for both semantically transparent and opaque forms (*baker* and *beaker*), suggesting that morphological processing is triggered for any form containing morphological structure, regardless of word meaning. This is consistent with evidence from the visual domain showing automatic segmentation of word forms containing a stem and an affix, both behaviorally (Longtin et al., 2003; Rastle et al., 2004), and with MEG/EEG (Lavric et al., 2007; Morris et al., 2008; Lehtonen et al., 2011; Lewis et al., 2011), as well as fMRI evidence from spoken word comprehension demonstrating automatic decomposition of a stem and suffix (Tyler et al., 2002). Our findings are also in line with a dual-route account, in which parallel access through the full form as well as the constituents is engaged from early stages of recognition (Schreuder and Baayen, 1997). Word forms containing a stem and suffix would be initially decomposed; at a later stage the acceptability of the parsed form would be assessed, and semantically opaque forms would not be consistent with the decompositional route. However, the current study cannot speak directly to falsifying or strongly supporting dual-route accounts. Our results support initial morphological processing for all forms containing a potential suffix, which does not discount representation as whole forms.

We found additional laterality effects based on differences related to word class. The inflected word deviants contained a verb (*bakes*) and a noun (*beaks*). As both forms are semantically transparent and should be segmented into a stem and a suffix, they should not result in any differential processing due to the presence of morphological complexity. There were sustained laterality differences during the mismatch response, showing increased left-lateralization for the verb compared to the noun in frontal regions. The laterality analysis at the source level was in line with the evidence that verbs engage greater left perisylvian activity when they are inflected, which may be linked to the greater number of roles verbal affixes play in specifying number, tense and person (Tyler et al., 2004; Longe et al., 2007).

# **LEXICALITY**

The mismatch response showed a sensitivity to lexicality, with an increased response to words compared to the pseudoword which was strongest for the derived forms, i.e., [-er] deviants. The effect for the [-er] deviants appeared in left temporal sensors when comparing words vs. pseudowords, and at the source level was localized to left posterior STG. This is consistent with previous MMN findings showing a lexical enhancement effect (e.g., Pulvermüller et al., 2001), and indicates that lexical processing takes place automatically and does not require focused attention on the linguistic input. The presence of robust lexicality effects within left posterior temporal cortex during the mismatch response suggests that this area is involved in signaling acoustic changes (when deviants are sufficiently different from the standard) that are language-specific and in activating long-term cortical memory traces for stored words. In fMRI, left middle and superior temporal regions have been shown to play a key role in accessing stored lexical representations (Indefrey and Levelt, 2004; Hickok and Poeppel, 2007; Turken and Dronkers, 2011). Left STG was previously identified as underlying lexical MMN enhancement both in MEG (Shtyrov et al., 2005) and fMRI (Shtyrov et al., 2008).

The monomorphemic [-on] deviants also showed a left-lateralized distribution in temporal sensors, but the difference between word and pseudoword deviants appeared in the right hemisphere, showing increased activity for words. This suggests that both hemispheres respond to spoken words, although there may be a stronger left hemisphere involvement in this response. Whereas previously reported MMN lexicality effects were focused on the left temporal cortex, the potential role of right hemisphere generators has not been ruled out; furthermore, in at least one previous study a bilateral MMN response to concrete imageable nouns was linked to semantic stimulus features that are encoded by memory circuits encompassing both hemispheres (Pulvermüller et al., 2004). This is in line with extensive evidence for the involvement of the right hemisphere in language comprehension (e.g., Federmeier et al., 2008), as well as for increased bilateral engagement for morphologically simple words (Bozic et al., 2010). As we see in the laterality analysis, the monomorphemic [-on] deviants, which do not contain a potential suffix, show more bilateral fronto-temporal activity compared to the bimorphemic [-s] and [-er] forms, with the [-s] forms showing almost no right hemisphere activity at the peak of the MMN response (see **Figures 2B,C**). The combination of lexicality and laterality results points to the engagement of both the left and right temporal regions in lexical processing.

There was no significant lexicality effect in the inflected [-s] deviants, suggesting that the inflectional suffix was processed similarly for all forms, regardless of the meaning of the stem. This points to a specificity in the processing of the inflectional affix, which plays a grammatical role but does not alter the meaning of the stem (unlike derivational affixes, which change meaning and grammatical category). The inflectionally complex forms *bakes* and *beaks* do not require access to a separate representation from the stem, based on the argument that inflected forms are represented decompositionally (e.g., Pinker and Ullman, 2002). Thus, the same process of morphological segmentation should apply to both the words and pseudowords, suggesting that the [-s] suffix is triggering morphological processing as opposed to additional lexical processing.

# **SEMANTIC TRANSPARENCY**

The [-er] word forms varying in semantic transparency (*baker, beaker*) showed differential processing starting at 240 ms following the deviation point. We found increased processing of the semantically opaque word (*beaker*) which occurred more anteriorly, engaging left middle temporal cortex. We did not find similar amplitude differences between [-s] and [-on] pairs. This supports claims from the visual domain for a processing stage following blind segmentation which is constrained by word meaning, whereby the appropriateness of the segmentation is analyzed (Dominguez et al., 2004; Lavric et al., 2012). Semantically opaque forms such as *beaker* would require re-analysis since a decompositional meaning is not appropriate. The involvement of left anterior MTG points to additional processing demands required in accessing the appropriate meaning after an incorrect segmentation. Left MTG has been shown to be a key region in language comprehension (Turken and Dronkers, 2011), and anterior MTG in particular has been previously implicated in lexical retrieval (Damasio et al., 1996; Martin and Chao, 2001).

# **AUTOMATICITY**

In the present study, we extend the issue of automatic morphological processing to investigate how suffixed and non-suffixed forms are processed when attention is not focused on the stimuli and participants are not engaged in a stimulus-related task. Our results suggest that morphological segmentation is triggered automatically by the presence of a suffix, regardless of word meaning, activating a left-lateralized network of frontal and temporal regions. This would point to a primarily feedforward stimulus-driven process, driven by acoustic cues to morphological structure (-*s* or -*er* suffix). We report further evidence for automatic lexical processing, a finding which has been previously demonstrated when attention is not directed towards word identity (Price et al., 1996; Hinojosa et al., 2004). This does not disregard the crucial role of top-down processing, a relevant issue for understanding interactions between feedforward and feedback processes during word recognition – for instance, in examining task-relevant effects and how neural responses linked to morphological processing may be tuned by task demands (e.g., Wright et al., 2011). MEG and EEG could be beneficial in future studies in tracking neural activity across time between regions in the language network in order to investigate recurrent interactions between bottom-up and top-down processes during morphological and lexical processing.

Whilst using a limited set of stimuli, the MMN methodology offers a number of unique advantages because it (1) provides a tightly controlled method for studying neural processing of spoken words that are well-matched for acoustic and phonological similarity, (2) allows for examining language processes that occur independently of focused attention, and (3) allows for precise time-locking of brain activation to word recognition points in the spoken stimuli (Shtyrov and Pulvermüller, 2007). Variability in uniqueness point across words presents a challenge for examining large, controlled sets of stimuli in a typical event-related design. Precise time-locking is particularly important for suffixed words, since it makes it possible to control the point at which information about the stem and suffix is present in the speech signal across conditions. Importantly, at least in lexical and syntactic domains, initial MMN findings on rapid automatic processing could be confirmed in multi-item non-oddball designs (Hasting and Kotz, 2008; MacGregor and Shtyrov, 2013) when similarly rigorous stimulus matching was applied. In this way, focused MMN results could pave the way for further studies using more ecologically valid designs. Future studies are needed to confirm the current MMN findings using other paradigms, including for example multi-item stimulus sequences with uniqueness point time-locking (cf. Leminen et al., 2011).

It is therefore crucial to consider how we can extrapolate to other words, and whether we can draw conclusions about derivational, inflectional, and non-affixed words more generally from this study. The effects within this paradigm were robust and showed spatiotemporal patterns consistent with previous findings using morphologically simple and complex word forms. In order to further establish these results, additional studies looking at morphological complexity need to be performed using the MMN and other paradigms in the spoken domain. Given the limited morphological complexity of English in comparison with other languages, future studies are needed that will allow us to confirm these results using different stimuli in different languages. Using combined MEG and EEG and focusing analyses at the source level, it is possible to dissociate morphological processing from later stages involved in integration of semantic and syntactic aspects of the word, providing a more complete picture of the neural processing streams that support recognition of morphologically simple and complex words.

# **CONCLUSION**

We recorded automatic brain responses to acoustically and psycholinguistically controlled sets of morphologically complex words, monomorphemic items and pseudoword control stimuli using combined MEG–EEG. In this study, we found:


• Modulation of automatic brain response to complex forms by their semantic coherence (transparency/opacity).

This study provides evidence that the spatiotemporal pattern of speech processing is modulated by the morphological status of the word ending. These results demonstrate processing of lexical and morphological features in the absence of focused attention, pointing to the key role that morphology plays in language comprehension.

# **ACKNOWLEDGMENTS**

This research was supported by a grant to William D. Marslen-Wilson from the European Research Council (NEUROLEX 230570) and by MRC Cognition and Brain Sciences Unit funding (William D. Marslen-Wilson: U.1055.04.002.00001.01, Yury Shtyrov: U.1055.04.014.00001.01). Caroline M. Whiting was supported by funding from the Cambridge Trusts and a Howard Research Studentship from Sidney Sussex College, Cambridge.

# **REFERENCES**




Shtyrov, Y., Smith, M., Horner, A. J., Henson, R., Nathan, P. J., Bullmore, E. T., et al. (2012). Attention to language: novel MEG paradigm for registering involuntary language processing in the brain. *Neuropsychologia* 50, 2605–2616. doi: 10.1016/j.neuropsychologia.2012.07.012


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 May 2013; accepted: 22 October 2013; published online: 18 November 2013.*

*Citation: Whiting CM, Marslen-Wilson WD and Shtyrov Y (2013) Neural dynamics of inflectional and derivational processing in spoken word comprehension: laterality and automaticity. Front. Hum. Neurosci. 7:759. doi: 10.3389/fnhum.2013.00759*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Whiting, Marslen-Wilson and Shtyrov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*


# MEG masked priming evidence for form-based decomposition of irregular verbs

### *Joseph Fruchter <sup>1</sup> \*, Linnaea Stockall <sup>2</sup> and Alec Marantz <sup>1,3,4</sup>*

*<sup>1</sup> Department of Psychology, New York University, New York, NY, USA*


### *Edited by:*

*Srikantan S. Nagarajan, University of California, San Francisco, USA*

### *Reviewed by:*

*Matthew K. Leonard, University of California, San Francisco, USA Marc Joanisse, The University of Western Ontario, Canada*

### *\*Correspondence:*

*Joseph Fruchter, Department of Psychology, New York University, 6 Washington Place, 2nd Floor, New York, NY 10003, USA e-mail: fruchter@nyu.edu*

To what extent does morphological structure play a role in early processing of visually presented English past tense verbs? Previous masked priming studies have demonstrated effects of obligatory form-based decomposition for genuinely affixed words (teacher-TEACH) and pseudo-affixed words (corner-CORN), but not for orthographic controls (brothel-BROTH). Additionally, MEG single word reading studies have demonstrated that the transition probability from stem to affix (in genuinely affixed words) modulates an early evoked response known as the M170; parallel findings have been shown for the transition probability from stem to pseudo-affix (in pseudo-affixed words). Here, utilizing the M170 as a neural index of visual form-based morphological decomposition, we ask whether the M170 demonstrates masked morphological priming effects for irregular past tense verbs (following a previous study which obtained behavioral masked priming effects for irregulars). Dual mechanism theories of the English past tense predict a rule-based decomposition for regulars but not for irregulars, while certain single mechanism theories predict rule-based decomposition even for irregulars. MEG data was recorded for 16 subjects performing a visual masked priming lexical decision task. Using a functional region of interest (fROI) defined on the basis of repetition priming and regular morphological priming effects within the left fusiform and inferior temporal regions, we found that activity in this fROI was modulated by the masked priming manipulation for irregular verbs, during the time window of the M170. We also found effects of the scores generated by the learning model of Albright and Hayes (2003) on the degree of priming for irregular verbs. The results favor a single mechanism account of the English past tense, in which even irregulars are decomposed into stems and affixes prior to lexical access, as opposed to a dual mechanism model, in which irregulars are recognized as whole forms.

**Keywords: masked priming, MEG and EEG, neurolinguistics, visual word recognition, morphological processing, past tense debate**

# **INTRODUCTION**

### **BACKGROUND: PAST TENSE DEBATE**

The distinction between regular (e.g., *jump*-*jumped*) and irregular (e.g., *teach*-*taught*) morphology in the English past tense has served as the basis for much debate in the psycholinguistic literature. Some have argued for a dual mechanism account, in which regular verbs are generated from their stems by rule, and irregular verbs are memorized as whole forms and stored in the lexicon (Pinker and Prince, 1988; Pinker, 1991). Under this account, irregulars, hypothesized to be stored as whole forms, are predicted to display surface (i.e., whole word) frequency effects, while regulars, hypothesized to be computed by rule from a stem and suffix, are predicted to display stem frequency effects; Pinker (1991) cites confirmatory evidence from experiments on ratings of past tense forms, as well as reaction times (RTs) in a verb generation task. Similarly, separate neural bases for regular and irregular inflection are also predicted by this account. Utilizing an ERP morphological violation paradigm, Luck et al. (2006) found that auditory presentation of invalid words generated by adding a regular suffix to a stem that requires irregular suffixation (i.e., overregularizations) elicited LAN/P600 effects, while presentation of invalid words generated by adding an irregular ending to a stem that requires regular suffixation (i.e., irregularizations) produced N400 effects. These findings were interpreted as illustrating the syntactic nature of overregularization, since the LAN and P600 are generally associated with syntactic violations (Friederici, 2002), and the lexical nature of irregularization, since the N400 is generally associated with word-level violations (Kutas and Schmitt, 2003). In an fMRI experiment, Vannest et al. (2005) compared suffixed words that showed behavioral evidence of decomposition (i.e., stem frequency effects for words ending in -*ness*, -*less*, and -*able*) with suffixed words that failed to show such effects (i.e., only surface frequency effects for words ending in -*ity* and -*ation*), and they found that decomposability was associated with increased levels of activity in Broca's area and the basal ganglia (argued by Ullman et al., 1997, to be part of the procedural circuit for grammatical rule processing). The distinction between surface frequency and stem frequency effects was thus taken by Vannest et al. (2005) as a diagnostic for the distinction between storage and computation, allowing them to argue for separate neural bases for the processing of decomposable and non-decomposable complex words.

Some have argued for alternative models of the English past tense, in which regular and irregular inflection are processed by a single mechanism. Under one such theory, regular and irregular verbs are both represented in a single connectionist network with quantifiable mappings between stems and candidate past tense forms (Rumelhart and McClelland, 1986; see McClelland and Patterson, 2002a,b for specific arguments in favor of single mechanism connectionist models of morphological complexity and irregularity). A different type of single mechanism account is advanced by Stockall and Marantz (2006), who argue that both regular and irregular verbs are composed by rule from their stems, in contrast both to the connectionist account of morphological relationships as a type of similarity, and to the dual mechanism account in which only regulars are composed by rule from their stems. Their evidence for this position comes from an MEG evoked response associated with lexical access (the M350) that displayed equivalent morphological priming effects for both regular and irregular verb-stem pairs, but no priming effects for pairs such as *boil-broil*, which are phonologically and semantically similar, but have no plausible morphological relationship.

It should be noted that under both types of single mechanism account, frequencies associated with the computation of the past tense should be relevant during the early stages of recognition of past tense verbs, for both regulars and irregulars; in contrast, under the dual mechanism account, only surface frequency should be relevant during the early stages of recognition of irregular verbs, and thus the results of Pinker (1991) and Vannest et al. (2005), inter alia, would seem to argue against such single mechanism models.

However, in the recent psycholinguistic literature, there have been findings that complicate the previously drawn binary distinction between storage and computation, as measured by the difference between surface frequency and stem frequency effects. Taft (2004) noted that stem frequency effects may be attenuated by the later stage of recombination of stem and affix: specifically, when matched for surface frequency, complex words with higher stem frequencies are more difficult to recombine than those with lower stem frequencies, thus canceling out the earlier stem recognition advantage. Baayen et al. (2007) argued that the dichotomy between storage and computation is false, since even low frequency regular verbs show effects of being stored in memory (i.e., surface frequency effects). Albright and Hayes (2003) presented behavioral ratings data for novel past tense forms, which demonstrated an influence of the phonological features of the stem, for both regulars and irregulars. One conclusion that we can draw from these findings is that processing of regular verbs is affected by a language user's prior experience with productive combination of their constituent morphemes. In other words, regulars are similar to irregulars, in that they display the effects of experience with their complex forms. In the present study, we ask the converse question: are irregulars similar to regulars, in that they show the effects of decomposition into their constituent morphemes?

If the predictions of the single mechanism account of Stockall and Marantz (2006) are correct, then we should find evidence for early visual word form based decomposition of irregular verbs. In order to experimentally verify these predictions, we combine MEG recordings with the behavioral masked priming paradigm (Forster and Davis, 1984). We contrast the predictions of Stockall and Marantz (2006) with those of the dual mechanism theory, in which irregulars are not predicted to be decomposed into stems and affixes at the early stages of word recognition. We do not specifically test the predictions of the single mechanism connectionist account, since any such predictions would be highly dependent on the details of a particular instantiation of a connectionist network (see Seidenberg and Plaut, in press for a discussion of the issues involved in taking any particular connectionist model as representative or complete).

# **EARLY STAGES OF VISUAL PROCESSING OF COMPLEX REGULAR WORDS**

There is much evidence for the importance of morphological structure during the early stages of visual word recognition. Rastle et al. (2004) reported findings from a masked priming experiment, which demonstrated significant levels of RT priming for genuinely affixed word-stem pairs (e.g., teacher-TEACH), as well as for pseudo-affixed word-stem pairs (e.g., corner-CORN), but not for pairs exhibiting similar orthographic overlap, but which cannot be exhaustively parsed into a possible stem and affix (e.g., brothel-BROTH; see Rastle and Davis, 2008 for a review of 19 studies reporting similar findings). Since real affixation and pseudo-affixation (but not simple orthographic overlap) lead to masked priming effects, we can conclude that the visual word recognition system is sensitive to the potential presence of morphological structure, before the lexical representation of a word is accessed; the latter point follows from the fact that pseudo-affixed words would presumably not be decomposed if the lexical entry were already retrieved, and the apparent morphological decomposition found to be erroneous. Thus, findings from the psycholinguistic literature can be taken as support for the presence of an early stage of morphological decomposition of visual word forms, independent of semantics, which takes place prior to lexical access.
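The RT priming pattern described above can be made concrete with a small sketch. A priming effect is computed here as the mean lexical decision RT after an unrelated prime minus the mean RT after a related prime, so that positive values indicate facilitation. The condition labels follow the text, but all RT values are invented for illustration and do not come from Rastle et al. (2004) or any other study.

```python
from statistics import mean


def priming_effect(unrelated_rts, related_rts):
    """Return mean RT (ms) after unrelated primes minus mean RT after related primes."""
    return mean(unrelated_rts) - mean(related_rts)


# Hypothetical lexical decision RTs in ms: (unrelated prime, related prime).
conditions = {
    "affixed (teacher-TEACH)": ([620, 640, 630], [590, 600, 610]),
    "pseudo-affixed (corner-CORN)": ([625, 635, 630], [600, 605, 610]),
    "orthographic (brothel-BROTH)": ([628, 632, 630], [626, 630, 634]),
}

effects = {name: priming_effect(u, r) for name, (u, r) in conditions.items()}
```

In this toy data set the affixed and pseudo-affixed conditions show facilitation while the orthographic control does not, mirroring the qualitative pattern reported in the masked priming literature.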

There is also neural evidence, from non-priming paradigms, for early form-based morphological decomposition of complex visual words. MEG studies of visual word recognition, employing a single word reading paradigm and using correlational analyses to model evoked neural effects, have shown that an early evoked response from the visual word form area (Cohen et al., 2000) in the left fusiform gyrus, known as the M170 (Pylkkänen et al., 2002), is modulated by the transition probability from stem to affix in derivationally complex words [e.g., *p*(*teacher* |*teach*); Solomyak and Marantz, 2010], as well as the transition probability from (pseudo-)stem to (pseudo-)affix in pseudo-affixed words [e.g., *p*(*corner* |*corn*); Lewis et al., 2011]. Thus, the M170 can be regarded as a neural index of visual morphological decomposition, insofar as it is sensitive to statistical variables related to the morphological structure of visual word forms. However, previous M170 results have only involved derivational morphology; it thus remains an open question as to whether inflectional morphology will play a similar role in modulating the M170. If the M170 does indeed show sensitivity to inflectional morphology, then it is reasonable to predict that it should be modulated by a masked priming manipulation with past tense verbs.
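One simple way to estimate a stem-to-affix transition probability such as *p*(*teacher* |*teach*) is to divide the frequency of the complex word by the summed frequency of all forms that continue the same stem. The sketch below follows that assumption; the exact estimator used in the cited MEG studies may differ, and the counts and the membership of the stem family are hypothetical.

```python
def transition_probability(word, stem_family, freq):
    """Estimate p(word | stem) as freq(word) / total frequency of the stem family.

    stem_family lists every form counted as containing the stem,
    including the bare stem and the target word itself.
    """
    if word not in stem_family:
        raise ValueError("word must belong to the stem family")
    stem_total = sum(freq[form] for form in stem_family)
    if stem_total == 0:
        raise ValueError("stem family has zero total frequency")
    return freq[word] / stem_total


# Hypothetical corpus counts, chosen only to keep the arithmetic simple.
freq = {"teach": 50, "teacher": 30, "teaches": 10, "teaching": 10}
family = ["teach", "teacher", "teaches", "teaching"]

p_teacher = transition_probability("teacher", family, freq)
```

Passing the stem family explicitly, rather than matching on spelling, avoids counting unrelated words that merely begin with the same letters.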

In summary, there is strong, convergent support for visual word form based morphological decomposition, which occurs rapidly and automatically for all potentially complex regular words. This decomposition seems blind to semantic factors, though sensitive to transitional probabilities of the component morphemes. In the present study, we will utilize the neural index of the decomposition process (i.e., the M170) to investigate processing of inflectional morphology. Given this background for the behavioral and neural consequences of regular affixal morphological decomposition, we can now turn specifically to the issue of irregular past tense morphology.

## **MODELING IRREGULAR PAST TENSE MORPHOLOGY**

Albright and Hayes (2003) conducted a computational test of a single mechanism account of the English past tense, which featured stochastic rules as the basis of past tense generation<sup>1</sup>. The evidence for their rule-based account consisted of behavioral ratings of novel past tense forms; crucially, ratings of both regular and irregular forms were affected by the phonological features of their respective stems, suggesting raters were making use of these stem features in evaluating the well-formedness of the inflected forms. For both Albright and Hayes (2003) and Stockall and Marantz (2006), irregular inflection is generated by rule, such that, for example, one might produce *gled* as the past tense form of the novel verb *gleed*, due to the morphophonological rule i → ε / [X{l,r}\_d][+past] (as in *lead-led*, *bleed-bled*, *breed-bred*, etc.). Albright and Hayes (2003) refer to a phonological context of relatively high consistency for a particular rule as an "island of reliability." Interestingly, their results demonstrated that English speakers were sensitive to such islands of reliability not only for novel irregular verbs (e.g., *fleep-flept* and *gleed-gled*), but also for novel regular verbs (e.g., *bredge-bredged* and *nace-naced*). The latter finding is contrary to the predictions of the dual mechanism theory, in which all regular verbs are derived via a single rule (with three predictable allomorphs: -*t*, -*d*, -ə*d*), and thus would not be expected to demonstrate effects of differing phonological contexts.

Albright and Hayes (2003) also developed a computational model that learned rules for the mapping from the phonological form of a stem to the phonological form of the past tense. Using this model, we were able to derive scores for the past tense verbs in our study, which represent the degree to which a particular past tense verb is supported by the morphophonological rules governing past tense formation in general (this measure will be referred to as "AlbrightScore"). Within the irregular verbs, there was a wide distribution over the AlbrightScore measure (from a minimum value of 0 to a maximum value of 1): for example, the irregular pair *send*-*sent* has a relatively high value for AlbrightScore (0.72), since it is supported by the related pairs *lend*-*lent* and *rend*-*rent*, while the irregular pair *fly*-*flew* has the lowest possible value for AlbrightScore (0), since it is not supported by any related forms.
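The notion of rule support can be illustrated with the raw reliability of a single rule, taken here as the number of verbs the rule derives correctly (hits) divided by the number of verbs that meet its structural description (scope). The sketch applies the rule i → ε in the context [X{l,r}\_d] to a toy lexicon. It is not the Albright and Hayes (2003) learner and does not reproduce AlbrightScore; the ad hoc transcription (`i` for the stem vowel, `E` for ε, `I` for a reduced vowel) and the lexicon entries are assumptions made for illustration.

```python
import re

# Structural description of the rule: any material ending in l or r, then i, then d.
RULE_CONTEXT = re.compile(r"^(.*[lr])i(d)$")


def apply_rule(stem):
    """Return the past tense predicted by i -> E / [X{l,r}_d], or None if out of scope."""
    match = RULE_CONTEXT.match(stem)
    if match is None:
        return None
    return match.group(1) + "E" + match.group(2)


def raw_reliability(lexicon):
    """Return hits / scope for the rule over a stem -> attested past tense mapping."""
    scope = hits = 0
    for stem, past in lexicon.items():
        predicted = apply_rule(stem)
        if predicted is None:
            continue  # the stem does not meet the rule's structural description
        scope += 1
        if predicted == past:
            hits += 1
    if scope == 0:
        raise ValueError("no stem in the lexicon falls within the rule's scope")
    return hits / scope


# Toy lexicon in an ad hoc transcription (hypothetical, for illustration only).
lexicon = {
    "blid": "blEd",    # bleed-bled
    "brid": "brEd",    # breed-bred
    "lid": "lEd",      # lead-led
    "rid": "rEd",      # read-read
    "plid": "plidId",  # plead-pleaded, in scope but regular
    "nid": "nidId",    # need-needed, out of scope
}

reliability = raw_reliability(lexicon)
```

In this toy lexicon the rule covers five stems and derives four of them correctly, so a novel stem such as *gleed* falls in a context where the rule is fairly consistent.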

We tested the effect of AlbrightScore on the level of M170 morphological priming, in order to look for evidence of rule application during processing of irregular verbs: specifically, we predicted that irregular morphological priming effects would be stronger for those verbs with higher AlbrightScore (e.g., *send*-*sent*), since they would have a greater degree of support for their particular past tense inflections from the overall rule structure governing past tense formation. Such evidence would support a rule-based decomposition model for all past tense verbs, as argued for by Stockall and Marantz (2006); it would also be contrary to the predictions of the dual mechanism account, under which irregular verbs (such as *sent*) are retrieved as whole forms from the lexicon.

### **BEHAVIORAL MASKED PRIMING: EVIDENCE FOR DECOMPOSITION OF IRREGULARS**

As outlined above, given the extensive theoretical debate regarding the English past tense, we designed the present study to investigate the following question: does early form-based decomposition take place only on the basis of regular affixal morphology, or does it also apply to irregular morphology? Since stem allomorphy is extremely pervasive and systematic across languages, it seems highly unlikely that a sophisticated visual linguistic pattern detection system would only be capable of detecting affixal morphology. In fact, a recent behavioral study has shown that, despite the lack of a visual morphemic segmentation for irregularly inflected words, masked presentation of such words facilitated lexical decision RTs to their corresponding stems, more than orthographically related primes and unrelated control primes (Crepaldi et al., 2010). This study also included a pseudo-irregular condition containing words that shared the orthographic sub-regularities of the irregular items (e.g., bell-BALL matches the orthographic pattern in fell-FALL). If the masked irregular priming effects are due to an early visual word form based decomposition using these orthographic sub-regularities, then the pseudo-irregular condition would be predicted to show the same masked priming effects. Contrary to this prediction, Crepaldi et al. (2010) found no such pseudo-irregular priming effect. Since this finding seems to argue against a form-based decomposition mechanism operating over irregularly inflected forms, they interpret the result as implying the existence of an additional lemma level source of morphological priming. However, their conclusion may be premature, since they matched their pseudo-irregular items to real irregular items based only on their orthographic patterns, while allowing divergence in their phonological patterns (e.g., drought-DRINK was matched to thought-THINK). We matched our pseudo-irregulars to real irregulars based on orthography as well as phonology<sup>2</sup>.

<sup>1</sup>While the past tense verb generation models from Albright and Hayes (2003) and Stockall and Marantz (2006) were similar, there was a significant difference between them: Stockall and Marantz (2006) argued that past tense verbs are generated via affixation to a stem, followed by a phonological readjustment to the stem when followed by that affix, while Albright and Hayes (2003) argued that past tense verbs are generated directly via a phonological rule applied to the features of the stem. This distinction will not be explicitly tested in the present study. It is also important to note that Albright and Hayes (2003) were offering an analysis of verb learning, and their predictions regarding online processing of familiar verbs are thus unclear.

Despite this complication with the pseudo-irregular condition, the behavioral masked priming evidence is consistent with a single mechanism account of the past tense: complex words seem to be decomposed into their stems, irrespective of whether they contain regular or irregular morphology. Since effects of semantic relatedness are not typically observed in a masked priming paradigm (at least for an SOA of 43 ms; Rastle et al., 2000), the priming observed for even irregular verbs must be form-based: brief (i.e., <50 ms) exposure to the irregular past tense form *left* is sufficient to parse this form as *lef* + *-t*, and to recognize *lef* as an allomorph of *leave*. It is less obvious how to explain irregular decomposition under a dual mechanism theory, in which the nature of the connection between irregular verbs and their stems is that of a semantic link between different lexical items.

### **MEG/EEG PRIMING LITERATURE**

Though there have been several previous MEG studies of visual word priming, none of these studies have presented clear data relating the priming manipulation to the M170. Stockall and Marantz (2006) utilized an overt (i.e., unmasked) priming paradigm, which provided evidence that both regular and irregular verbs prime their stems; however, their dependent measure was the M350, a late evoked response from the left superior and middle temporal regions that has been associated with lexical access (Pylkkänen and Marantz, 2003).

A number of recent studies have combined masked morphological priming with EEG or MEG measurements, but the earliest evoked response showing sensitivity to morphological complexity peaks between 200 and 300 ms (EEG: Lavric et al., 2007; Morris et al., 2007, 2008; Morris and Stockall, 2012; Royle et al., 2012; MEG: Lehtonen et al., 2011). Lavric et al. (2007), Morris et al. (2008), and Morris and Stockall (2012), all using EEG to measure neural processing, do report sensitivity to masked repetition priming in an evoked response peaking 130–200 ms after target onset (N/P 150), but Monahan et al. (2008), using MEG, find the earliest effects of masked repetition priming at ∼225 ms. The EEG studies also report a later masked priming effect, namely an attenuation of the N400 response, which is sensitive to both repetition priming and morphological priming (Lavric et al., 2007; Morris et al., 2008; Morris and Stockall, 2012).

Given that there is evidence that the lexical access process has already begun at 300 ms (or earlier), from an MEG study of homonyms that demonstrated effects of meaning entropy at this latency (Simon et al., 2012), and given the MEG single word reading evidence for sensitivity to morphological complexity in the M170 response, the lack of any observed M170 masked priming sensitivity is surprising.

A preliminary goal of our study is thus to investigate whether there is indeed an M170 masked priming effect, in general. An important difference between the current study and the previous EEG and MEG masked priming research is that rather than analyze averaged sensor data, we use minimum norm estimation to determine the plausible neural generators of the evoked sensor data, and then use anatomically and functionally defined regions of interest (ROIs) to constrain our analyses in source space. As outlined above, a further goal of our study is to investigate whether there is an early form-based masked morphological priming effect for irregular verbs specifically. Such an effect would be consistent with the single mechanism account of the English past tense (Stockall and Marantz, 2006), as well as with the behavioral masked priming results (Crepaldi et al., 2010). It would also highlight the importance of the M170 as an index of visual form-based morphological decomposition, not only for the previously studied cases of regular derivational morphology, but also for inflectional morphology, both regular and irregular. Finally, given the EEG evidence for N400 effects of masked priming, we also verify that there is a later MEG effect of masked priming, during the time window of the M350/N400m (i.e., the MEG evoked response analogous to the N400, discussed in Helenius et al., 1998 and Halgren et al., 2002).

# **MATERIALS AND METHODS**

### **DESIGN AND STIMULI**

Our experiment consisted of a visual masked priming lexical decision task, with simultaneous MEG recording of the magnetic fields induced by electrical activity in the brain. There were four conditions of interest, with 50 trials in each condition: **identity** (car-CAR), **regular** (jumped-JUMP), **irregular** (fell-FALL), and **pseudo-irregular** (bell-BALL). The irregular and pseudo-irregular items were matched on both their orthographic and phonological patterns. Primes were presented in lower case and targets were presented in upper case, in order to ensure that any priming effects would not be due merely to repetition of the low-level visual features of the stimuli. There were an equal number of trials in which the same targets were preceded by unrelated primes (wing-FALL). We did not include orthographic or semantic control conditions, since there is no evidence of a facilitatory masked priming effect for orthographically or semantically similar words, given the SOA (33.3 ms) and the average word length (4.2 letters) of the stimuli in this experiment<sup>3</sup>. Words were excluded from our study if they had a mean accuracy rate below 55% in lexical decision tasks, as measured by the English Lexicon Project, or ELP (Balota et al., 2007).

<sup>2</sup>We also restricted ourselves to irregular verbs, both as the items in the irregular condition, and as the basis for generating the pseudo-irregulars. Crepaldi et al. (2010) included irregular plural nouns and pseudo-irregulars based on these patterns (e.g., *mice-mouse* and *spice-spouse*) in their experiments. The set of irregular nouns is much smaller than the set of irregular verbs (i.e., less than 20 irregular nouns vs. at least 150 irregular verbs), and the islands of sub-regularity are consequently much smaller. See Yang (2002) for extensive discussion of the limits of productive rule learning given small sample sizes.

<sup>3</sup>Rastle et al. (2000) did not find a significant masked priming effect for orthographic relatedness (electrode-ELECT) at an SOA of 43 ms, and the effect tended toward inhibition at longer SOAs. Davis and Lupker (2006) noted that most masked priming experiments with word primes reported inhibitory effects of orthographic relatedness, and that the facilitatory effects with word primes noted in some experiments were likely due to the greater length of their stimuli (8–9 letters), which would tend to boost the facilitation of the target (as a result of the larger amount of letter overlap), relative to the inhibition produced by activating its lexical competitors. With word lengths closer to those used in the present experiment (4–5 letters), the inhibitory effects of form overlap are predicted to outweigh the facilitation of the target. Additionally,

Frequency counts for the words in this experiment were obtained from CELEX (Baayen et al., 1995). Surface frequency for the regular and irregular verb primes was taken to be the logarithm of the CELEX wordform frequency for the particular past tense verb.

**Table 1** summarizes the mean values of word length, log surface frequency, and orthographic neighborhood size for the different experimental conditions. We chose the primes for these conditions so as to minimize the difference between related and unrelated primes along the above three dimensions; in particular, the related and unrelated primes were pairwise matched for word length, and listwise matched for surface frequency and orthographic neighborhood size, for each of the different conditions of interest (identity, regular, irregular, and pseudo-irregular).

We also selected 200 non-word targets from the ELP, which could be transformed into real words upon substitution of a single letter. We sought to minimize the difference between the mean values of word length and orthographic neighborhood size for the word and non-word targets; the non-word targets were thus pairwise matched to the word targets for length, and they were listwise matched for neighborhood size. The non-word targets were preceded by real word primes: 75 were orthographically related via the single letter substitution, and 125 were orthographically unrelated. Of the unrelated primes, 25 ended in "-ed" in order to match the 25 related primes in the regular verb condition. The primes for the non-word targets were listwise matched to the primes for the word targets on all three variables. None of the primes were non-words, in order to ensure that the lexicality of the prime could not be used as evidence toward the lexical decision on the target. The stimuli for this experiment are listed in the Appendix.

Since we did not want a given participant to view the same target twice, we developed two versions (A and B) of the experiment. In each version, half of the real word targets were preceded by related primes, and the other half were preceded by unrelated primes; the 200 non-word trials remained the same in both versions of the experiment. Versions A and B were counterbalanced across participants. Thus, a given participant viewed a total of 200 unique real word targets (preceded by 100 related primes and 100 unrelated primes, from version A or B) and 200 unique non-word targets (preceded by 75 related primes and 125 unrelated primes, with no difference between versions).
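The counterbalancing scheme can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' stimulus-preparation script: the placeholder target names, the function `build_versions`, and the choice to assign the first half of each condition's targets to related primes in version A are all hypothetical.

```python
CONDITIONS = ("identity", "regular", "irregular", "pseudo-irregular")


def build_versions(targets_by_condition):
    """Return two trial lists in which every target swaps prime type.

    Within each condition, half of the targets receive a related prime in
    version A and an unrelated prime in version B; the other half receive
    the opposite assignment.
    """
    version_a, version_b = [], []
    for condition, targets in targets_by_condition.items():
        half = len(targets) // 2
        for index, target in enumerate(targets):
            related_in_a = index < half
            version_a.append(
                (condition, target, "related" if related_in_a else "unrelated")
            )
            version_b.append(
                (condition, target, "unrelated" if related_in_a else "related")
            )
    return version_a, version_b


# Hypothetical placeholder targets: 50 per condition, as in the design.
targets = {c: [f"{c}_{i:02d}" for i in range(50)] for c in CONDITIONS}
version_a, version_b = build_versions(targets)

n_related_a = sum(1 for _, _, prime in version_a if prime == "related")
print(len(version_a), n_related_a)  # 200 word trials, 100 with related primes
```

Under this scheme no participant sees a target twice, while every target appears with both prime types across the two versions.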

AlbrightScore values were generated for the regular and irregular verbs, using the past tense learner program available online (Albright, 2003). The input to the learner consisted of the phonological representations of the verb stems in our experiment. The score for a given past tense form was taken from the program's output, if available for that verb; otherwise, if the program did not produce a given inflection, the AlbrightScore was assigned to be 0 (i.e., the minimum value for the measure). The AlbrightScore measure thus ranged from 0 (no support for the past tense form) to 1 (complete support for the past tense form).

**Table 1 | Mean values of word length, log CELEX surface frequency (Freq), and orthographic neighborhood size (***N***) for the different experimental conditions.**

Rastle et al. (2000) found no masked priming effect for semantic relatedness (cello-VIOLIN) at an SOA of 43 ms, though the effect tended toward significance at longer SOAs.

### **EXPERIMENTAL PROCEDURES**

Sixteen right-handed native English speakers (8 males and 8 females) participated in the MEG experiment. All subjects provided written informed consent to participate in the study.

DMDX (Forster and Forster, 2003) was used as the presentation platform for the experiment. The font was Courier New, size 28. Each trial of the experiment consisted of a string of hash marks appearing for 500 ms ("#######"), a lower-case prime appearing for 33.3 ms ("fell"), and an upper-case target displayed for 300 ms ("FALL"). Subjects were instructed to respond to the target stimulus by pressing one button if they recognized the string as a valid word of English, and a second button if the string was invalid. After the experiment, subjects were asked whether they were able to read the masked primes; none of the subjects indicated an ability to do so.

A 157-channel axial gradiometer whole-head MEG system (Kanazawa Institute of Technology, Kanazawa, Japan) recorded the MEG data at a sampling frequency of 1000 Hz. The data was filtered between DC and 500 Hz, with a band elimination filter of 60 Hz. The subjects' heads were digitized prior to entering the magnetically shielded room. The head positions during the experiment were determined via coils attached to anatomical landmarks. Structural MRIs were also obtained for all the subjects, and the coil locations were used to translate from the MEG spatial coordinates to the MRI coordinates.

### **ANALYSIS**

### *Behavioral analysis*

Reaction times and accuracy data were recorded for each trial of the lexical decision task. Subjects with a mean RT greater than 2 standard deviations above the mean RT for all subjects, or an RT standard deviation greater than 2 standard deviations above the mean RT standard deviation for all subjects, were removed from the behavioral analysis; this resulted in the removal of two subjects, while maintaining the counterbalancing between the two versions of the experiment. Trials with an RT that was either less than 300 ms or greater than 2 standard deviations above the mean RT across subjects (within the given condition) were also removed from the behavioral analysis. Two of the subjects had accuracy rates slightly worse than 2 standard deviations below the mean accuracy rate (88.75 and 89%), but we included them in the analysis, since removing their data would ruin the counterbalancing across the two versions of the experiment.
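The exclusion criteria described above might be implemented along the following lines. This is a minimal sketch: the function names are hypothetical, and it assumes that the subject-level cut-offs are computed over per-subject means and standard deviations, and that the trial-level ceiling is computed over the RTs pooled within one condition.

```python
from statistics import mean, stdev


def flag_subjects(rts_by_subject, n_sd=2.0):
    """Return subjects whose mean RT or RT variability is unusually high."""
    means = {s: mean(rts) for s, rts in rts_by_subject.items()}
    sds = {s: stdev(rts) for s, rts in rts_by_subject.items()}
    mean_values, sd_values = list(means.values()), list(sds.values())
    # Cut-offs: 2 SD above the across-subject mean of each measure.
    mean_cutoff = mean(mean_values) + n_sd * stdev(mean_values)
    sd_cutoff = mean(sd_values) + n_sd * stdev(sd_values)
    return sorted(
        s for s in rts_by_subject
        if means[s] > mean_cutoff or sds[s] > sd_cutoff
    )


def keep_trials(condition_rts, floor_ms=300.0, n_sd=2.0):
    """Drop RTs below the floor or more than n_sd SDs above the mean."""
    ceiling = mean(condition_rts) + n_sd * stdev(condition_rts)
    return [rt for rt in condition_rts if floor_ms <= rt <= ceiling]


# Hypothetical data: nine typical subjects and one very slow subject.
example = {f"s{i:02d}": [600.0, 620.0, 640.0] for i in range(1, 10)}
example["s10"] = [1200.0, 1220.0, 1240.0]
print(flag_subjects(example))

print(keep_trials([250, 500, 520, 540, 560, 580, 600, 620, 640, 2000]))
```

The trial-level filter removes both anticipatory responses (below 300 ms) and slow outliers, while leaving the remaining distribution untouched.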

In order to analyze the correlation of RT with the masked priming manipulation, we used linear mixed effects models (Baayen et al., 2008) with RT as the dependent variable, PrimeType (related vs. unrelated) as the fixed effect, and subject and item as random effects. The linear mixed effects models were constructed using the lmer function of the lme4 package in R (Bates and Maechler, 2009). The *p*-values were computed via Monte Carlo (MC) simulation with 10,000 iterations each. In order to determine whether the pseudo-irregular items displayed a significantly different level of priming than the irregular items, following Crepaldi et al. (2010), we analyzed the interaction between PrimeType and Pseudo-irregularity (i.e., irregular vs. pseudo-irregular) for the irregular and pseudo-irregular items only. In order to analyze this interaction, we first fit a linear mixed effects model with PrimeType and Pseudo-irregularity as fixed effects. We then fit a second linear mixed effects model with the two measures and their interaction as fixed effects. Finally, we performed a likelihood ratio test of the two nested models, which produces a χ<sup>2</sup>-value and an associated *p*-value, indicating the significance of adding the interaction term to the model.
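The likelihood ratio test for the interaction term can be sketched as follows. The sketch assumes that the two nested models differ by exactly one parameter (the interaction of two two-level factors), so that the test statistic is compared against a χ<sup>2</sup> distribution with one degree of freedom; the log-likelihood values in the example are hypothetical and do not come from the study. The models themselves were fit with lmer in R; only the comparison step is illustrated here.

```python
import math


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Compare two nested models that differ by a single parameter.

    Returns the chi-square statistic 2 * (LL_full - LL_reduced) and its
    p-value under a chi-square distribution with 1 degree of freedom,
    using the identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    chi2 = 2.0 * (loglik_full - loglik_reduced)
    p_value = math.erfc(math.sqrt(max(chi2, 0.0) / 2.0))
    return chi2, p_value


# Hypothetical log-likelihoods for the additive and interaction models.
chi2, p_value = likelihood_ratio_test(loglik_reduced=-100.0, loglik_full=-98.08)
print(round(chi2, 2), round(p_value, 3))
```

A small χ<sup>2</sup> value indicates that adding the interaction term barely improves the fit, which is the pattern expected if two conditions show comparable priming.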

### *MEG analysis*

*Data analysis.* The MEG data was noise reduced via the Continuously Adjusted Least-Squares Method (Adachi et al., 2001), in the MEG160 software (Yokogawa Electric Corporation and Eagle Technology Corporation, Tokyo, Japan). Cortically constrained minimum-norm estimates were calculated via MNE (MGH/HMS/MIT Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA). The cortical reconstructions were obtained using FreeSurfer (CorTechs Labs Inc., La Jolla, CA and MGH/HMS/MIT Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA). A source space of 5124 points was generated for each reconstructed surface, and the BEM (boundary-element model) method was employed on activity at each source to calculate the forward solution. Using the grand average of all trials for a particular subject, after baseline correction with the pre-target interval (−150, −50 ms) [or, equivalently, the interval (−117, −17 ms) relative to the presentation of the prime] and low pass filtering at 40 Hz, the inverse solution was computed from the forward solution, in order to determine the most likely distribution of neural activity. The inverse solution was computed with a free orientation for the source estimates, meaning that the estimates were unconstrained with respect to the cortical surface. The resulting minimum norm estimates were signed, with positive values indicating an upward directionality, and the negative values indicating a downward directionality, in the coordinate space defined by the head<sup>4</sup>. The signed estimates were transformed into (signed) noise-normalized dynamic statistical parameter maps (dSPMs; following Dale et al., 2000). FreeSurfer's automatically-parcellated anatomical ROIs were used to obtain estimates of the average noise-normalized neural activity (i.e., dSPM values) within left temporal cortical regions. In order to analyze the grand-averaged evoked activity across all subjects, we morphed each individual subject's brain to the common space of a single representative subject's brain. In order to analyze the functionally defined ROI (fROI), we drew an ROI in the common neuroanatomical space, morphed it back into each individual subject's neuroanatomical space, and extracted the average dSPM values within the fROI for each subject.

<sup>4</sup>MNE provides the user with a choice of several orientation constraints for the source estimates: free, fixed, or loose (with respect to the cortical surface normal). Under the latter two orientation constraints, the sign of the resulting source estimates indicates the directionality with respect to the cortical surface: positive indicates a current directed outward from the cortex, and negative indicates a current directed inward toward the cortex. For our choice of free orientation, the sign of the source estimates has a different meaning: positive indicates a current directed upward, and negative indicates a current directed downward, and the coordinate space used to determine these directions is one defined by the head (i.e., MNE's "MEG head coordinate frame," rather than the cortical surface). Thus, the sign of the data is not meant to indicate the sign of the current, but rather the directionality within this particular coordinate space.

Outlier trials were removed based on an absolute threshold of ±2.5 pT, enforced over the time window (−150, +300 ms) for the noise reduced MEG data.
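A threshold-based rejection of this kind could look like the sketch below. It assumes that a trial is rejected if any sample in any channel exceeds the threshold in absolute value within the window; the text does not specify whether the criterion was applied per channel, so that detail, along with the function name and data layout, is an assumption.

```python
THRESHOLD_TESLA = 2.5e-12  # 2.5 pT


def is_outlier_trial(epoch, times_ms, tmin=-150.0, tmax=300.0,
                     threshold=THRESHOLD_TESLA):
    """Return True if any channel exceeds the threshold inside the window.

    `epoch` is a list of channels, each a list of field values in tesla,
    sampled at the time points listed in `times_ms`.
    """
    in_window = [i for i, t in enumerate(times_ms) if tmin <= t <= tmax]
    return any(
        abs(channel[i]) > threshold for channel in epoch for i in in_window
    )


# Hypothetical two-channel epochs sampled at four time points.
times = [-200.0, -100.0, 100.0, 400.0]
clean = [[1e-13, 2e-13, -3e-13, 1e-13], [0.0, 1e-13, 1e-13, 9e-12]]
noisy = [[1e-13, 2e-13, -3e-12, 1e-13], [0.0, 1e-13, 1e-13, 0.0]]
print(is_outlier_trial(clean, times), is_outlier_trial(noisy, times))
```

Note that the large value at 400 ms in the first epoch falls outside the window and therefore does not trigger rejection.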

*Anatomical ROI analysis.* We examined two cortical areas of interest within the left temporal lobe, since this general location is associated with the M170 response and the M350 response (Pylkkänen and Marantz, 2003; Solomyak and Marantz, 2010). In particular, we used the FreeSurfer-generated anatomical ROIs for the fusiform and middle temporal regions (**Figure 1**).

For the M170 analysis, we investigated the effect of PrimeType (related vs. unrelated) on activity in the fusiform ROI; the time window of interest was a 50 ms interval centered at the peak of the M170 (i.e., the peak of the mean fusiform activity across trials and across subjects). For the M350/N400m analysis, we investigated the effect of PrimeType on activity in the middle temporal ROI; the time window of interest was the general late interval 300–500 ms post-target onset.
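The selection of the M170 time window can be sketched as follows. The 50 ms width comes from the description above; the restriction of the peak search to 100–200 ms post-target onset, the function name, and the synthetic waveform are assumptions made for illustration.

```python
def m170_window(times_ms, mean_activity, search=(100.0, 200.0), width_ms=50.0):
    """Locate the positive peak in the search range and center a window on it."""
    candidates = [
        (activity, t)
        for t, activity in zip(times_ms, mean_activity)
        if search[0] <= t <= search[1]
    ]
    _, peak_time = max(candidates)  # largest positive value of mean activity
    half = width_ms / 2.0
    return peak_time, (peak_time - half, peak_time + half)


# Hypothetical grand-average waveform that peaks at 170 ms.
times = [float(t) for t in range(0, 301, 10)]
activity = [-abs(t - 170.0) for t in times]
peak, window = m170_window(times, activity)
print(peak, window)
```

Defining the window from the grand average across all conditions keeps the selection independent of the PrimeType contrast that is subsequently tested within it.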

*Functional ROI analysis.* In our analysis of the grand-averaged evoked activity across all subjects and all trials in the experiment (**Figure 2A**), we observed a large patch of positive (i.e., upward) activity in the occipitotemporal region, as well as a separate patch of negative (i.e., downward) activity more anteriorly within the temporal lobe. Both of these patches of activity overlapped with the fusiform ROI; the former positive patch overlapped with the posterior part of the fusiform, and the latter negative patch overlapped with the anterior part of the fusiform. The time course of the positive patch was consistent with the M170 response, showing a positive peak at ∼170 ms post-target onset, while the time course of the negative patch showed a more gradual decline in the negative (downward) direction (not shown). The presence of two separate response components within the same ROI yields a potential confound for our anatomical ROI analysis of the M170 priming effect. Due to the uncertainty arising from this confusion of separate evoked responses, we decided to conduct a functional region of interest (fROI) analysis as well.

We defined an fROI on the basis of the identity and regular priming conditions (i.e., repetition priming and regular morphological priming). Specifically, within the cortical area covered by the fusiform and inferior temporal anatomical ROIs in the common neuroanatomical space of a representative subject's brain, we drew an fROI around the peak facilitatory priming effect<sup>5</sup> (across all subjects) in the identity and regular conditions combined, during the time window around the M170 (**Figure 3A**). We then morphed this fROI from the representative subject's brain to the neuroanatomical space for each individual subject. We investigated the effect of PrimeType (related vs. unrelated) on activity within this fROI for the irregular and pseudo-irregular conditions. Additionally, we investigated whether there was an interaction of AlbrightScore and PrimeType for the irregular verbs; specifically, we hypothesized that there would be a greater priming effect for the irregular verbs that had a higher AlbrightScore value.

*Statistical methodology.* To analyze the masked priming effects in the MEG data, we employed linear mixed effects models (Baayen et al., 2008) millisecond-by-millisecond (i.e., we used separate models at each time point), with the average neural activity in an ROI as the dependent variable, PrimeType as the fixed effect, and subject and item as random effects. The *t*-values for the fixed effect<sup>6</sup> were then corrected for multiple comparisons over the selected time window of interest only. The linear mixed effects models were constructed using the lmer function of the lme4 package in R (Bates and Maechler, 2009). The technique that we used for multiple comparisons correction is based on the methods of Maris and Oostenveld (2007), as adapted by Solomyak and Marantz (2009). Specifically, we computed Σ*t*, the sum of all *t*-values within a single temporal cluster of consecutive significant effects in the same direction (where significant is defined by |*t*| > 1.96, *p* < 0.05 uncorrected). The highest absolute value of Σ*t*, for any cluster within the whole time window, was then compared to the results of the same procedure repeated on 10,000 random permutations of the independent variable (i.e., PrimeType). An MC *p*-value was thus computed, based on the percentage of times a random permutation of the independent variable led to a larger maximum absolute value of Σ*t* than the original maximum absolute value of Σ*t* (as computed on the actual data).
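The cluster-based permutation procedure can be sketched as follows. To keep the example self-contained, a Welch two-sample *t* statistic at each time point stands in for the fixed-effect *t*-value from the linear mixed effects models actually used in the study; that substitution, the function names, the synthetic data, and the reduced number of permutations are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def max_cluster_sum(t_values, threshold=1.96):
    """Largest |sum of t| over runs of consecutive same-sign significant t."""
    best, running, sign = 0.0, 0.0, 0
    for t in t_values:
        current = (t > threshold) - (t < -threshold)  # +1, -1, or 0
        if current != 0 and current == sign:
            running += t                  # extend the current cluster
        elif current != 0:
            running, sign = t, current    # start a new cluster
        else:
            running, sign = 0.0, 0        # sub-threshold point ends a cluster
        best = max(best, abs(running))
    return best


def welch_t(a, b):
    """Welch t statistic (a stand-in for the mixed-model t-value)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def t_series(labels, trials):
    """One t-value per time point, contrasting unrelated vs. related trials."""
    series = []
    for k in range(len(trials[0])):
        unrelated = [x[k] for lab, x in zip(labels, trials) if lab == "unrelated"]
        related = [x[k] for lab, x in zip(labels, trials) if lab == "related"]
        series.append(welch_t(unrelated, related))
    return series


def cluster_permutation_p(labels, trials, n_perm=1000, seed=0):
    """Monte Carlo p-value for the largest observed cluster statistic."""
    rng = random.Random(seed)
    observed = max_cluster_sum(t_series(labels, trials))
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute PrimeType labels across trials
        if max_cluster_sum(t_series(shuffled, trials)) > observed:
            exceed += 1
    return observed, exceed / n_perm


# Hypothetical data: 20 trials per prime type, 8 time points, with an effect
# of PrimeType confined to time points 2-5.
labels = ["related"] * 20 + ["unrelated"] * 20
trials = [
    [((i * 7 + k * 3) % 5 - 2) * 0.1 + (1.0 if i >= 20 and 2 <= k <= 5 else 0.0)
     for k in range(8)]
    for i in range(40)
]
observed, p_value = cluster_permutation_p(labels, trials, n_perm=200)
```

Because only the largest cluster statistic in each permutation enters the null distribution, the resulting *p*-value is corrected for the number of time points tested within the window.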

In order to analyze interaction effects, say for measures A and B, we first fit a linear mixed effects model with A and B as fixed effects. We then fit a second linear mixed effects model with A, B, and their interaction as fixed effects. Finally, we performed a likelihood ratio test of the two nested models, which produces a χ<sup>2</sup>-value, indicating the significance of adding the interaction term to the model. To correct for multiple comparisons over a time window of interest, we performed a similar procedure to the one described above, except with the square root of the χ<sup>2</sup> values rather than *t*-values, and with random permutations of two independent variables (A and B).

<sup>5</sup>We chose the larger anterior patch (**Figure 3A**) as the fROI, rather than the (uncorrected) posterior effect, since the anterior effect was stronger, more widespread, and most importantly, in the expected direction (i.e., facilitatory, with a larger magnitude of negative activity in the unrelated condition).

<sup>6</sup>No degrees of freedom are provided for the *t*-values generated by the linear mixed effects models; due to the large number of observations, the *t*-distribution effectively converges to the standard normal distribution (Baayen et al., 2008: Note 1).

**FIGURE 2 | (A)** Mean whole-brain activity across all subjects and all trials at 170 ms post-target onset, shown on a representative subject's inflated cortical surface (ventral view, left hemisphere). Positive activity (i.e., upward with respect to the head) is shown in red/yellow, and negative activity (i.e., downward with respect to the head) is shown in blue. The anatomical fusiform ROI is highlighted in green. **(B)** Mean activity in the fusiform ROI, collapsed across all four conditions: identity, regular, irregular, and pseudo-irregular. **(C)** Mean activity in the fusiform ROI, separated by PrimeType, and pooled across the 4 conditions of identity, regular, irregular,

### *AlbrightScore analysis*

We also conducted a test of the scores generated by the past tense learning model from Albright and Hayes (2003). More specifically, we analyzed the interaction of AlbrightScore with PrimeType for the irregular verbs, in order to test whether the gradient measure of a past tense form's support from the various past tense phonological rules (i.e., its AlbrightScore) would impact the degree of priming from the masked past tense form to its corresponding stem. We performed this analysis for both the behavioral data (i.e., RT) and the MEG data (i.e., the M170 response), using the same methodology described in the above sections. Due to the extremely high mean AlbrightScore for the regular verbs (close to the maximum possible value, in fact), we refrained from performing a comparable AlbrightScore analysis for those items.

# **RESULTS**

### **BEHAVIORAL RESULTS**

The mean accuracy rate across all subjects was 94.4% (±2.65%). The mean RT across all subjects was 620.7 ms (±178.7 ms). Significant RT priming was found for the identity condition (33.3 ms; *t* = 4.66, MC-corrected *p* = 0.0001), the regular condition (22.5 ms; *t* = 3.21, MC-corrected *p* = 0.002), and the irregular condition (14.2 ms; *t* = 2.07, MC-corrected *p* = 0.042). The pseudo-irregular condition displayed a trend toward significance (14.6 ms; *t* = 1.72, MC-corrected *p* = 0.083). Our behavioral analysis showed no significant interaction between PrimeType and Pseudo-irregularity (χ<sup>2</sup> = 0.002, *p* = 0.97), consistent with the fact that the irregulars and pseudo-irregulars demonstrated comparable levels of priming.

### **MEG RESULTS**

Visual inspection of the grand-averaged evoked fusiform activity (**Figure 2B**) reveals that it peaks in the positive direction (i.e., upward with respect to the head) during the time window 100–200 ms post-target onset. In fact, there appears to be an all subjects, we morphed each individual subject's brain to the common space of a single representative subject's brain. In order to analyze the functionally defined ROI (fROI), we drew an ROI in the common neuroanatomical space, morphed it back into each individual subject's neuroanatomical space, and extracted the average dSPM values within the fROI for each subject.

Outlier trials were removed based on an absolute threshold of ±2.5 pT, enforced over the time window (−150, +300 ms) for the noise reduced MEG data.

*Anatomical ROI analysis.* We examined two cortical areas of interest within the left temporal lobe, since this general location is associated with the M170 response and the M350 response (Pylkkänen and Marantz, 2003; Solomyak and Marantz, 2010). In particular, we used the FreeSurfer-generated anatomical ROIs for the fusiform and middle temporal regions (**Figure 1**).

For the M170 analysis, we investigated the effect of PrimeType (related vs. unrelated) on activity in the fusiform ROI; the time window of interest was a 50 ms interval centered at the peak of the M170 (i.e., the peak of the mean fusiform activity across trials and across subjects). For the M350/N400m analysis, we investigated the effect of PrimeType on activity in the middle temporal ROI; the time window of interest was the general late interval 300–500 ms post-target onset.

*Functional ROI analysis.* In our analysis of the grand-averaged evoked activity across all subjects and all trials in the experiment (**Figure 2A**), we observed a large patch of positive (i.e., upward) activity in the occipitotemporal region, as well as a separate patch of negative (i.e., downward) activity more anteriorly within the temporal lobe. Both of these patches of activity overlapped with the fusiform ROI; the former positive patch overlapped with the posterior part of the fusiform, and the latter negative patch overlapped with the anterior part of the fusiform. The time course of the positive patch was consistent with the M170 response, showing a positive peak at ∼170 ms post-target onset, while the time course of the negative patch showed a more gradual decline in the negative (downward) direction (not shown). The presence of two separate response components within the same ROI yields a potential confound for our anatomical ROI analysis of the M170 priming effect. Due to the uncertainty arising from this confusion of separate evoked responses, we decided to conduct a functional region of interest (fROI) analysis as well.

We defined an fROI on the basis of the identity and regular priming conditions (i.e., repetition priming and regular morphological priming). Specifically, within the cortical area covered by the fusiform and inferior temporal anatomical ROIs in the common neuroanatomical space of a representative subject's brain, we drew an fROI around the peak facilitatory priming effect <sup>5</sup> (across all subjects) in the identity and regular conditions combined, during the time window around the M170 (**Figure 3A**). We then morphed this fROI from the representative subject's brain to the neuroanatomical space for each individual subject. We investigated the effect of PrimeType (related vs. unrelated) on activity within this fROI for the irregular and pseudo-irregular conditions. Additionally, we investigated whether there was an interaction of AlbrightScore and PrimeType for the irregular verbs; specifically, we hypothesized that there would be a greater priming effect for the irregular verbs that had a higher AlbrightScore value.

*Statistical methodology.* To analyze the masked priming effects in the MEG data, we employed linear mixed effects models (Baayen et al., 2008) millisecond-by-millisecond (i.e., we used separate models at each time point), with the average neural activity in an ROI as the dependent variable, PrimeType as the fixed effect, and subject and item as random effects. The *t*-values for the fixed effect <sup>6</sup> were then corrected for multiple comparisons over the selected time window of interest only. The linear mixed effects models were constructed using the lmer function of the lme4 package in R (Bates and Maechler, 2009). The technique that we used for multiple comparisons correction is based on the methods of Maris and Oostenveld (2007), as adapted by Solomyak and Marantz (2009). Specifically, we computed Σ*t*, the sum of all *t*-values within a single temporal cluster of consecutive significant effects in the same direction (where significant is defined by |*t*| > 1.96, *p* < 0.05 uncorrected). The highest absolute value of Σ*t*, for any cluster within the whole time window, was then compared to the results of the same procedure repeated on 10,000 random permutations of the independent variable (i.e., PrimeType). An MC *p*-value was thus computed, based on the percentage of times a random permutation of the independent variable led to a larger maximum absolute value of Σ*t* than the original maximum absolute value of Σ*t* (as computed on the actual data).
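
The cluster-based correction described above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study: the per-time-point statistic here is a plain Welch *t*-statistic standing in for the mixed-model *t*-value, and all function names (`cluster_sums`, `max_abs_cluster`, `cluster_permutation_p`) and the data layout are assumptions made for illustration only.

```python
import math
import random

def cluster_sums(t_values, threshold=1.96):
    """Sum t-values within each run of consecutive supra-threshold
    time points that share the same sign."""
    sums, current, sign = [], 0.0, 0
    for t in t_values:
        s = (t > threshold) - (t < -threshold)  # +1, -1, or 0
        if s != 0 and s == sign:
            current += t                         # extend the open cluster
        else:
            if sign != 0:
                sums.append(current)             # close the previous cluster
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums

def max_abs_cluster(t_values, threshold=1.96):
    """Largest absolute cluster sum in the window (0.0 if no cluster)."""
    return max((abs(c) for c in cluster_sums(t_values, threshold)), default=0.0)

def welch_t(a, b):
    """Welch t-statistic; an assumed stand-in for the mixed-model t-value."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))

def t_series(trials, labels):
    """One t-value per time point; trials is a list of per-trial time
    series and labels marks each trial as related (1) or unrelated (0)."""
    out = []
    for i in range(len(trials[0])):
        a = [tr[i] for tr, lab in zip(trials, labels) if lab == 1]
        b = [tr[i] for tr, lab in zip(trials, labels) if lab == 0]
        out.append(welch_t(a, b))
    return out

def cluster_permutation_p(trials, labels, n_perm=1000, seed=0):
    """Return the observed maximum cluster statistic and the share of
    label permutations that yield a larger one."""
    rng = random.Random(seed)
    observed = max_abs_cluster(t_series(trials, labels))
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                    # permute the condition labels
        if max_abs_cluster(t_series(trials, shuffled)) > observed:
            exceed += 1
    return observed, exceed / n_perm
```

Because only the single largest cluster statistic from each permutation enters the null distribution, the resulting *p*-value is corrected over the whole time window rather than per time point.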

In order to analyze interaction effects, say for measures A and B, we first fit a linear mixed effects model with A and B as fixed effects. We then fit a second linear mixed effects model with A, B, and their interaction as fixed effects. Finally, we performed a likelihood ratio test of the two nested models, which produces a χ<sup>2</sup>-value, indicating the significance of adding the interaction term to the model. To correct for multiple comparisons over a

<sup>5</sup>We chose the larger anterior patch (**Figure 3A**) as the fROI, rather than the (uncorrected) posterior effect, since the anterior effect was stronger, more widespread, and most importantly, in the expected direction (i.e., facilitatory, with a larger magnitude of negative activity in the unrelated condition).

<sup>6</sup>No degrees of freedom are provided for the *t*-values generated by the linear mixed effects models; due to the large number of observations, the *t*-distribution effectively converges to the standard normal distribution (Baayen et al., 2008: Note 1).

### *M170 analysis: Functional ROI*

Our fROI, defined on the basis of the facilitatory identity and regular priming effect, was localized to the middle-to-anterior part of the fusiform and inferior temporal regions (**Figure 3A**). Visual inspection of the time course of the average activity within this fROI reveals an evoked response moving gradually in the negative (i.e., downward) direction, starting at ∼100 ms post-target onset (**Figure 3B**), consistent with the anterior negative evoked response seen in the temporal lobe for the grand-averaged whole-brain data. When corrected for a 50 ms window centered at the peak of the identity and regular priming effect (i.e., 183 ms), there is a significant effect of PrimeType for the irregular condition (*p* = 0.017 for the cluster at 158–183 ms, MC-corrected for 158–208 ms; **Figure 3C**), but no effect for the pseudo-irregular condition (**Figure 3D**). Unlike the M170 anatomical ROI analysis, the direction of the priming effect within the fROI is such that neural activity in the fROI is *reduced* (i.e., less negative) in the related prime condition.

# *M350/N400m analysis: middle temporal ROI*

As can be seen in **Figure 4**, there is sustained negative activity (i.e., downward with respect to the head) in the middle temporal ROI at ∼200–400 ms. Our M350/N400m priming analysis reveals a clear pattern of facilitatory priming effects in this region (i.e., less negative activity for the related PrimeType condition). When corrected for the general late time window 300–500 ms, the priming effects for the identity (*p* = 0.002 for the cluster at 427–500 ms, MC-corrected for 300–500 ms), regular (*p* < 0.0001 for the cluster at 385–493 ms, MC-corrected for 300–500 ms), and irregular (*p* = 0.003 for the cluster at 406–484 ms, MC-corrected for 300–500 ms) priming manipulations were each highly significant on their own, while the pseudo-irregular condition showed no effect (**Figure 4**). Consistent with this pattern of results, there was a significant interaction of PrimeType and Pseudoirregularity (*p* = 0.029 for the cluster at 405–439 ms, MC-corrected for 300–500 ms), when comparing only the irregulars and pseudo-irregulars.

# **ALBRIGHTSCORE RESULTS**

The mean AlbrightScore of the irregular verbs was 0.514 (± 0.228), in contrast to the regular verbs, whose mean AlbrightScore was 0.975 (± 0.025); this disparity is due to the fact that the regular rules are always more supported than the irregular rules for past tense formation, given the overwhelming number of regular verbs. Given the tight clustering of the regular AlbrightScore values at close to the maximum value (i.e., 1), we refrained from analyzing them further.

# *AlbrightScore behavioral results*

First, we tested the effect of AlbrightScore on the degree of RT priming for the irregular verbs. Since AlbrightScore and surface frequency are correlated (*r* = 0.29, *p* < 0.0001), we orthogonalized AlbrightScore with respect to surface frequency (AlbrightScoreO). The effect of the interaction of AlbrightScoreO and PrimeType on RT was not significant for the irregulars (χ<sup>2</sup> = 1.06, *p* = 0.30).
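
The two computational steps in this paragraph can be sketched in a few lines of Python. The sketch assumes that orthogonalization means taking the residuals of a simple linear regression of AlbrightScore on surface frequency, and that the interaction test has one degree of freedom (one added parameter); both are assumptions, as the text does not spell them out. The function names are hypothetical.

```python
import math

def orthogonalize(y, x):
    """Residuals of y after simple linear regression on x (with intercept).
    The residuals are uncorrelated with x by construction."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

def lrt_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)

def chi2_p_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of
    freedom (assumed here: the interaction adds a single parameter)."""
    return math.erfc(math.sqrt(chi2 / 2.0))
```

Under the one-degree-of-freedom assumption, `chi2_p_df1(1.06)` evaluates to roughly 0.30, which agrees with the *p*-value reported above.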

# *AlbrightScore MEG results*

We also tested the effect of AlbrightScore on the degree of M170 priming for the irregular verbs (**Figure 5**). Given the fact that the irregular priming effect was in the expected direction only for the functional ROI analysis, we used that same analysis to look at the effect of AlbrightScore. When corrected for a 50 ms window centered at the peak of the identity and regular priming effect (i.e., 183 ms), the interaction of AlbrightScoreO and PrimeType had a significant effect on activity within the fROI for the irregular condition (*p* = 0.004 for the cluster at 176–208 ms, MC-corrected for 158–208 ms; **Figure 5A**). When we divide the data into two bins, high AlbrightScore (defined as >0.5) and low AlbrightScore (defined as <0.5), we see a striking disparity: after correction for a 50 ms window centered at the peak of the identity and regular priming effect (i.e., 183 ms), there is a very significant priming effect for the irregulars with high AlbrightScore (*p* = 0.0009 for the cluster at 158–208 ms, MC-corrected for 158–208 ms; **Figure 5B**), and no effect for the irregulars with low AlbrightScore (**Figure 5C**).

# **DISCUSSION**

Our behavioral analysis confirmed that masked presentation of primes significantly facilitated RTs for lexical decision on their targets, in the identity, regular, and irregular conditions, with near-significant facilitation for pseudo-irregulars. This partially confirms the findings of Crepaldi et al. (2010) of RT facilitation due to masked irregular morphological priming, with the caveat that they did not find any hint of priming for the pseudo-irregular condition.

Our MEG analysis confirmed that there is indeed an M170 masked priming effect in the left fusiform gyrus, which is earlier than the effects previously found in MEG studies of masked priming (Monahan et al., 2008; Lehtonen et al., 2011). Interestingly, the direction of the M170 priming effect was such that fusiform activity was greater in the related prime condition than in the unrelated prime condition. Since the direction of this effect is counterintuitive, and we observed that there are actually two, potentially confusable, response components within the same fusiform ROI, we decided to conduct a functional ROI (fROI) analysis as well. The fROI, localized to the middle-to-anterior portion of the fusiform and inferior temporal regions, displayed a significant morphological priming effect for the irregular verbs, and this effect was in the expected facilitatory direction. The presence of an early masked priming effect for irregular verbs suggests that they are decomposed into their stems for lexical access, despite the fact that, unlike regular verbs, they do not necessarily contain their stems in an orthographic sense. Our masked priming results thus provide additional evidence for the single mechanism theory of the English past tense (Stockall and Marantz, 2006), as opposed to the dual mechanism theory (Pinker and Prince, 1988), which would predict early decomposition effects only for regular verbs.

There is an even earlier priming effect for the identity and regular conditions, during the time window of the M100, which was not entirely expected given that the primes and targets were presented in distinct cases. However, there is a precedent in the literature for this type of abstract letter priming: Pylkkänen and Okano (2010) found equal amounts of masked repetition priming for primes and targets in distinct Japanese scripts, as well as visual word form frequency effects at the M100 regardless of the particular script that a word was presented in. Finally, we also found a late M350/N400m masked priming effect in the middle temporal ROI, which was highly significant for the identity, regular, and irregular conditions individually, but not for the pseudo-irregular condition. Thus, while the pseudo-irregular condition displayed a trend toward significance in the behavioral priming analysis, it did not yield similarly significant neural priming effects (in either the M170 or M350/N400m analyses). Given the fact that pseudo-affixed words (e.g., *corner*) do indeed prime their pseudo-stems (e.g., *corn*) in a masked priming paradigm (Rastle et al., 2004), as well as the observation that the transition probability from pseudo-stem to pseudo-affix modulates the M170 in single word reading (Lewis et al., 2011), the failure to obtain clear verification of a pseudo-*irregular* priming effect is surprising. One possibility is that the pseudo-irregular behavioral priming trend is driven by a post-decision process, which may be localized to brain regions outside of left temporal cortex. Additionally, it is possible that the lack of pseudo-irregular MEG priming effects is due to an issue related to the AlbrightScore of the pseudo-irregular pairs, as will be discussed further below.

Our AlbrightScore analysis provides additional evidence supporting the single mechanism account of the past tense. While the behavioral findings were not conclusive, we did find a significant effect of AlbrightScore on the level of priming in the functional ROI for irregular verbs, during the rough time window of the M170 (i.e., 150–250 ms). These results show that the masked morphological priming effect for the irregular verbs only arises because of the high AlbrightScore items; the low AlbrightScore irregulars display no priming effects within the fROI. This confirms the predictions of the single mechanism, form-based, account, in which the irregular past tense forms that are more rule-like (i.e., receive greater support from the general rule structure of how past-tense inflections are computed within English) might be expected to prime their stems to a significantly greater degree than the more exceptional (i.e., less supported) irregulars would, for their respective stems (cf. Stockall and Marantz, 2006, who found different M350 and RT priming effects for high overlap irregular verb-stem pairs, such as *gave*-*give*, and low overlap pairs, such as *taught*-*teach*, with an overt, or unmasked, priming paradigm, and Kielar et al., 2008, who found that *–t* affixed irregular past tense forms prime their stems as effectively as regulars, while *-∅* affixed past tense forms do not, in a masked priming paradigm). Since the masked irregular morphological priming effect seemed to be concentrated at the high end of the AlbrightScore measure for the irregular verbs, it is possible that this fact explains the failure to obtain a significant level of pseudo-irregular priming in the MEG analysis: if the pseudo-irregular condition were analyzed within the high end of an AlbrightScore measure appropriately tailored for those items<sup>7</sup>, we might then observe a significant neural priming effect within those higher AlbrightScore (i.e., more rule-like) pseudo-irregulars.

In summary, the M170 masked morphological priming effect for irregular verbs, as well as the effect of AlbrightScore on the priming effect, suggests that processing of irregular verbs involves application of rules of the sort that generative linguistics predicts would be used to map between stems and their past tense forms.

# **ACKNOWLEDGMENTS**

This material is based upon work supported by the National Science Foundation under Grant No. BCS-0843969, and by the NYU Abu Dhabi Research Council under Grant No. G1001 from the NYUAD Institute, New York University Abu Dhabi. We thank Gwyneth Lewis for assistance in collecting the experimental data.

<sup>7</sup>It is not obvious how to compute AlbrightScore values for the pseudo-irregular items, since the morphophonological rules in the learning algorithm of Albright and Hayes (2003) take into account the onsets of the words. We would instead need a measure that reflects the similarity of the pseudo-irregular items to the irregular items, in terms of the inflectional patterns of the rhymes (without regard to the onsets).

# **REFERENCES**


mental dictionary is part of declarative memory, and that grammatical rules are processed by the procedural system. *J. Cogn. Neurosci.* 9, 266–276. doi: 10.1162/jocn.1997.9.2.266

Vannest, J., Polk, T. A., and Lewis, R. L. (2005). Dual-route processing of complex words: new fMRI evidence from derivational suffixation. *Cogn. Affect. Behav. Neurosci.* 5, 67–76. doi: 10.3758/CABN.5.1.67

Yang, C. D. (2002). *Knowledge and Learning in Natural Language*. Oxford, New York: Oxford University Press.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 June 2013; accepted: 02 November 2013; published online: 22 November 2013.*

*Citation: Fruchter J, Stockall L and Marantz A (2013) MEG masked priming evidence for form-based decomposition of irregular verbs. Front. Hum. Neurosci. 7:798. doi: 10.3389/fnhum.2013.00798*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Fruchter, Stockall and Marantz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*



# L2 speakers decompose morphologically complex verbs: fMRI evidence from priming of transparent derived verbs

# *Sophie De Grauwe<sup>1</sup>\*, Kristin Lemhöfer<sup>1</sup>, Roel M. Willems<sup>1,2</sup> and Herbert Schriefers<sup>1</sup>*

<sup>1</sup> Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands <sup>2</sup> Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands

### *Edited by:*

Mirjana Bozic, University of Cambridge, UK

### *Reviewed by:*

Christos Pliatsikas, University of Kent, UK

Gunnar Jacob, University of Potsdam, Germany

### *\*Correspondence:*

Sophie De Grauwe, Donders Institute for Brain, Cognition and Behaviour, Radboud University, P.O. Box 9104, 6500 HE Nijmegen, Netherlands e-mail: s.degrauwe@donders.ru.nl

In this functional magnetic resonance imaging (fMRI) long-lag priming study, we investigated the processing of Dutch semantically transparent, derived prefix verbs. In such words, the meaning of the word as a whole can be deduced from the meanings of its parts, e.g., wegleggen "put aside." Many behavioral and some fMRI studies suggest that native (L1) speakers decompose transparent derived words. The brain region usually implicated in morphological decomposition is the left inferior frontal gyrus (LIFG). In non-native (L2) speakers, the processing of transparent derived words has hardly been investigated, especially in fMRI studies, and results are contradictory: some studies find more reliance on holistic (i.e., non-decompositional) processing by L2 speakers; some find no difference between L1 and L2 speakers. In this study, we wanted to find out whether Dutch transparent derived prefix verbs are decomposed or processed holistically by German L2 speakers of Dutch. Half of the derived verbs (e.g., omvallen "fall down") were preceded by their stem (e.g., vallen "fall") with a lag of 4–6 words ("primed"); the other half (e.g., inslapen "fall asleep") were not ("unprimed"). L1 and L2 speakers of Dutch made lexical decisions on these visually presented verbs. Both region of interest analyses and whole-brain analyses showed that there was a significant repetition suppression effect for primed compared to unprimed derived verbs in the LIFG. This was true both for the analyses over L2 speakers only and for the analyses over the two language groups together. The latter did not reveal any interaction with language group (L1 vs. L2) in the LIFG. Thus, L2 speakers show a clear priming effect in the LIFG, an area that has been associated with morphological decomposition. Our findings are consistent with the idea that L2 speakers engage in decomposition of transparent derived verbs rather than processing them holistically.

**Keywords: language, fMRI, bilingual, morphological processing, priming, derivations**

## **INTRODUCTION**

During the past few decades, the processing of morphologically complex words has led to considerable debate. Many studies have been devoted to the question whether these words are decomposed into their constituent parts or processed holistically. Semantically transparent derivations (e.g., *reread*, derived from *read*) provide an interesting case in this debate. On the one hand, they differ from semantically opaque derivations (e.g., *understand*, derived from *stand*) in terms of meaning compositionality: their meaning as a whole is related to the meaning of their constituent parts, in contrast with opaque derivations, whose meaning cannot be inferred from the meaning of their parts. Thus, lexical access to transparent derivations might be accomplished by decomposition of these words into their constituent parts. On the other hand, transparent derivations differ from inflections (e.g., *reads*, the present tense third person singular form of *read*), in that they, like opaque derivations, are the result of historical word formation processes, whereas inflections are the result of syntactic operations. Thus, transparent derivations constitute new words, in contrast with inflections, which constitute different forms of the same word. As a result, transparent derivations might be associated with full lexical entries in the so-called "mental lexicon," potentially leading to holistic processing of these complex words (see, for example, Marslen-Wilson, 2007, for a discussion of this issue).

As we will see below, the majority of the available evidence suggests that native (L1) speakers decompose transparent derivations. This makes them a particularly interesting test case for morphological processing in non-native (L2) speakers, as one could hypothesize that L2 speakers may not (yet) have grasped the compositionality of these words, and thus tend to process them holistically (see, for example, Clahsen et al., 2010). Most studies on the processing of transparent derivations have tested (especially L1) speakers in behavioral tasks. In this study, we use functional magnetic resonance imaging (fMRI) to investigate the neural correlates of the processing of semantically transparent derivations in L2 speakers.

Many behavioral studies on L1 processing of transparent derivations have used the morphological priming/lexical decision method. In this approach, a target word is preceded by a morphologically related word or an unrelated word. For example, a morphologically complex word such as *reread* is preceded by its stem (*read*), or vice versa. Participants have to decide as quickly as possible whether the target is a real word or not (lexical decision task). In visual priming (targets and primes presented visually), primes and targets may be separated by several intervening stimuli (long-lag priming) or follow each other without intervening stimuli (short-lag priming)<sup>1</sup>. The underlying idea is that if *reread* and *read* are separate entries in the mental lexicon, *read* should not facilitate the recognition of *reread* any more than a control prime like *think* does. In contrast, if the recognition of the target word *reread* involves its decomposition into *re-* and *read*, the previous encounter with one of these parts (*read*) should speed up recognition. The results of these studies mostly show significant facilitatory priming for transparent derivations in L1 speakers, both in long-lag priming (Napps, 1989; Raveh and Rueckl, 2000; Rueckl and Aicher, 2008) and in short-lag priming (Feldman and Soltano, 1999; Rastle et al., 2000; Feldman et al., 2002, 2004; Smolka et al., 2009, 2014). These results have been interpreted as evidence that transparent derivations are decomposed during lexical access.

However, the interpretation of priming effects with transparent derivations is complicated by the fact that transparent derivations are not only morphologically, but also semantically and formally related to their stems. Thus, the observed priming effects could be due to the semantic and/or form overlap between transparent derivations and their stems, rather than to their morphological relationship. However, long-lag priming typically elicits facilitatory effects of morphological relatedness, but not of semantic or form relatedness (Napps and Fowler, 1987; Napps, 1989; Feldman, 2000; Rueckl and Aicher, 2008, Experiment 1). For example, in a series of long-lag priming experiments, morphologically related word pairs such as *manager–manage* led to significant facilitatory priming, whereas no priming was found for form-related (e.g., *ribbon–rib*) or semantically related (e.g., *ache–pain*) word pairs (Napps and Fowler, 1987; Napps, 1989). Therefore, long-lag priming seems particularly useful for the study of transparent derivations: any facilitatory priming effects for transparent derivations in long-lag priming will likely be due to the morphological relationship of the prime-target pair rather than their semantic or form relationship.

In fMRI studies on the processing of transparent derivations, the left inferior frontal gyrus (LIFG) has often been associated with morphological decomposition of these words. For example, in two lexical decision fMRI studies (Meinzer et al., 2009; Pliatsikas et al., 2014b), increased LIFG activation was found for morphologically complex compared to morphologically less complex semantically transparent words. The two conditions were matched on a number of lexical and semantic characteristics, such as length, frequency, concreteness, etc., and only differed in degree of derivational complexity. In both studies, the authors therefore concluded that transparent derivations are decomposed, and that this decomposition process is supported by the LIFG (see also Vannest et al., 2005, 2011, for similar results for "decomposable" vs. "non-decomposable" derived words and for derived vs. simple words, respectively; but see Davis et al., 2004; Bozic et al., 2013, who found no selective activation of the LIFG for derived vs. simple words).

The fMRI studies mentioned so far did not use morphological priming. In contrast, Bozic et al. (2007) used a long-lag priming paradigm in an fMRI study contrasting morphologically, semantically and form-related word pairs. In fMRI studies, priming often leads to "repetition suppression": a decrease in the BOLD response to primed compared to unprimed targets. This decrease is supposed to reflect faster or "more efficient" processing of the primed target in a certain brain region, due to the application of the same processes in that brain region as during exposure to the prime (Schacter and Badgaiyan, 2001; Henson, 2003). In the unprimed condition, the same process is supposed to operate on the stimulus, but in this case, processing is not facilitated by the earlier presentation of a prime – there is no prime "greasing the tracks," so to say (Henson, 2003). Thus, if a brain area such as the LIFG displays a decreased hemodynamic response to a morphologically complex word that is primed by its stem, this is an indication that, in this brain region, processing of the complex word involves processing of its stem – suggesting that the complex word is morphologically decomposed. This is precisely what Bozic et al. (2007) found: the LIFG showed lower activation for target words primed by morphologically related primes than for unprimed target words. This was not the case for semantically or form-related prime-target pairs, indicating that the long-lag priming effect was not due to the overlap between form and meaning. The LIFG therefore seemed to be specifically involved in morphological processing.

Several other brain areas have been implicated in the processing of derivations, such as the right inferior frontal gyrus (Bick et al., 2010; Bozic et al., 2013), middle temporal cortex (Meinzer et al., 2009; Bozic et al., 2013), superior temporal cortex (Meinzer et al., 2009; Vannest et al., 2011; Bozic et al., 2013), inferior temporal and occipital-temporal cortex (Bick et al., 2010), and occipital cortex (Meinzer et al., 2009; Bick et al., 2010). However, only a minority of fMRI studies on derivation processing report evidence of their involvement, in contrast with the more consistent evidence that exists for the involvement of the LIFG. This is why, as we will see later, we conducted region of interest (ROI) analyses of the LIFG only, whereas the potential involvement of other brain areas was assessed through whole-brain analyses.

Only a few behavioral studies have been conducted on the processing of transparent derivations in L2 speakers. To our knowledge, all of them used paradigms other than unmasked priming. These studies have produced conflicting results. In a masked priming experiment, Clahsen and Neubauer (2010) found no priming effect for morphologically related prime-target pairs (German derived nouns and their stems) in Polish L2 speakers of German, as opposed to L1 speakers of German. In another masked priming study, Silva and Clahsen (2008) found that priming was reduced for morphologically related prime-target pairs (English derived nouns and their stems) compared to word pairs with identical prime and target in Chinese and German L2 speakers of English. In contrast, L1 speakers of English showed similar effects for morphological and identical priming. The results of these experiments were interpreted as suggesting that L2 speakers relied more on holistic processing than L1 speakers.

<sup>1</sup>We only review L1 behavioral studies in which a similar method and similar stimuli are used as in the present study, i.e., unmasked visual priming and transparent derivations. In contrast, our review of the L1 fMRI and L2 literature also includes studies in which other methods and/or stimuli are used, as there are hardly any fMRI and/or L2 studies on the processing of transparent derivations using unmasked visual priming.

Other studies, however, report no differences between L1 and L2 speakers in terms of the processing of transparent derivations. Diependaele et al. (2011) also used masked priming, and found similar facilitatory priming effects for transparent derivations in L1 speakers of English and in L2 speakers of English (with either Spanish or Dutch as their L1). These results suggest that both native speakers and bilinguals decomposed the complex words (see also Kirkici and Clahsen, 2013, for similar results for derivations in their masked priming experiment with L2 speakers of Turkish). In an unprimed visual lexical decision study, Portin and Laine (2001) found that both L1 speakers of Swedish and early Finnish–Swedish bilinguals showed shorter lexical decision latencies to transparent derived nouns than to morphologically simple nouns of the same length and frequency. One of the possible interpretations discussed by the authors refers to parallel dual-route models (more specifically the morphological race model proposed in Frauenfelder and Schreuder, 1992). According to this interpretation, transparent derivations might be processed faster because of a race between two parallel lexical access routes (a decompositional route and a whole-word route). In contrast, simple nouns can only be processed through the whole-word route, and thus would not benefit from the race between two competing routes.
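
The logic of the race interpretation can be made concrete with a small simulation. The Python sketch below is purely illustrative and is not taken from Frauenfelder and Schreuder (1992): it assumes that the two routes finish independently, that both have the same normally distributed finishing time, and that recognition time equals the faster of the two. The function name and all parameter values are hypothetical.

```python
import random

def simulate_race(n_trials=10000, mean_ms=600.0, sd_ms=50.0, seed=0):
    """Compare mean recognition time with one route (simple words) and
    with two racing routes (transparent derived words)."""
    rng = random.Random(seed)
    whole_only_total, race_total = 0.0, 0.0
    for _ in range(n_trials):
        whole = rng.gauss(mean_ms, sd_ms)     # whole-word route
        decomp = rng.gauss(mean_ms, sd_ms)    # decompositional route
        whole_only_total += whole             # simple word: single route
        race_total += min(whole, decomp)      # derived word: faster route wins
    return whole_only_total / n_trials, race_total / n_trials

mean_simple, mean_derived = simulate_race()
print(f"simple: {mean_simple:.1f} ms, derived: {mean_derived:.1f} ms")
```

Even though neither route is faster on average under these assumptions, taking the minimum of two finishing times lowers the mean, which is the kind of advantage for derived words that the race account relies on.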

The conflicting evidence reported in these behavioral studies may be due to differences in paradigms: masked priming (Silva and Clahsen, 2008; Clahsen and Neubauer, 2010; Diependaele et al., 2011) vs. unprimed lexical decision (Portin and Laine, 2001); materials: homogeneous (Portin and Laine, 2001; Silva and Clahsen, 2008; Clahsen and Neubauer, 2010) vs. inhomogeneous (Diependaele et al., 2011) in terms of suffix and/or word class of derived words, matched vs. unmatched in terms of length and/or frequency of derived and unrelated primes (Silva and Clahsen, 2008: prime length not matched, no information on whole-word prime frequency; Clahsen and Neubauer, 2010: no information on prime frequency); participants: early (Portin and Laine, 2001) vs. late (Silva and Clahsen, 2008; Clahsen and Neubauer, 2010; Diependaele et al., 2011) bilinguals; and/or differences in L1–L2 combinations (Clahsen and Neubauer, 2010: Polish-German; Silva and Clahsen, 2008: Chinese/German-English; Diependaele et al., 2011: Spanish/Dutch-English; Portin and Laine, 2001: Finnish–Swedish).

In the fMRI literature, to our knowledge, only three studies have addressed morphological processing in L2 speakers: two on inflectionally complex words (Lehtonen et al., 2009; Pliatsikas et al., 2014a) and one on derivations (Bick et al., 2010). In all three studies, the LIFG was associated with morphological processing. Lehtonen et al. (2009) used an unprimed visual lexical decision task with early Finnish–Swedish bilinguals. Each participant saw two lists of simple and inflected nouns: a Swedish list and a Finnish list. The results showed increased activation of the LIFG for Finnish inflected nouns compared to Swedish inflected nouns and to Finnish simple nouns, suggesting decomposition in Finnish and holistic processing in Swedish. This was linked to the structural difference between Finnish (morphologically rich) and Swedish (morphologically poor). Pliatsikas et al. (2014a) used a masked priming task involving inflected verbs with late Greek L2 learners of English. They found activation in a network including the LIFG for morphologically related regular verb pairs compared to morphologically related irregular verb pairs (which are more likely to be represented holistically) and to unrelated regular verb pairs. This pattern of results was found for the combined group of L1 and L2 speakers of English, with no indication of any between-group differences. Therefore, the L2 speakers were interpreted to use the same decompositional strategy as the L1 speakers. Masked priming was also used by Bick et al. (2010) in their study of derivational processing in early Hebrew–English bilinguals. A bilateral network including the LIFG was found to show lower activation for morphologically related prime-target pairs compared to semantically related and orthographically related prime-target pairs. This repetition suppression effect was found for both Hebrew and English transparent derivations, suggesting decomposition in both languages. Although all three studies found evidence for the involvement of the LIFG in L2 morphological processing, none of them contrasted L1 and L2 processing of transparent derivations. The neural correlates of derivational processing in late bilinguals remain to be investigated.

With this study, we want to find out whether transparent derivations are decomposed or processed holistically in late bilinguals. Decomposition may be challenging for L2 speakers because it requires an understanding of the morphological structure of words – an understanding which may develop only after extended experience with the language. However, holistic processing also comes at a cost, as it requires extended memory resources for the storage of whole-word forms. The behavioral evidence on this issue is mixed. By using fMRI, this study may shed new light on derivational processing in late bilinguals.

The stimuli used in this experiment consisted of two types of prefix verbs, i.e., particle verbs (verbs with separable particles, e.g., *meenemen* "take along") and prefixed verbs (verbs with nonseparable particles, e.g., *omvatten* "enclose"). Particle verbs differ from prefixed verbs in that their particles are separated from their stem when used in finite form in main clauses (e.g., *Zij neemt het boek mee* "She takes the book along"). One could hypothesize that, because of their separability, particle verbs are more likely to be morphologically decomposed than prefixed verbs. However, several studies comparing the two types of prefix verbs have found no processing differences between prefixed and particle verbs in terms of decomposition (Schriefers et al., 1991; Lüttmann et al., 2011). For this reason, both types of stimuli were used in this study. Care was taken that the proportion of each type was balanced over conditions.

In this fMRI study, we contrasted native speakers of Dutch with late learners of Dutch who had German as their L1. Using long-lag priming, the processing of semantically transparent derived verbs was investigated in both groups. We wanted to determine whether L1 and L2 speakers show a repetition suppression effect for morphologically primed vs. unprimed derived verbs in the LIFG in particular. We expected this to be the case for L1 speakers, thus replicating Bozic et al.'s (2007) results. For L2 speakers, no clear prediction can be formulated on the basis of the mixed existing literature. If L2 speakers decompose transparent derived verbs, we should also find an LIFG repetition suppression effect for derived verbs primed by their stems. If they process these verbs holistically, we should not find such an effect.

Since we had a clear prediction for the involvement of the LIFG in derivation processing (at least in L1 speakers), we used ROI analyses to investigate effects in this area. Regarding the involvement of other brain areas, predictions were less clear, because of the inconsistency in the existing literature on derivation processing. However, because there is at least some evidence that brain areas such as temporal cortex may be involved, we also conducted whole-brain analyses. In this way, we made sure not to miss effects in brain areas less attested in the literature.

The present study was the second part of a two-part fMRI session<sup>2</sup>. Each part of this session constituted an experiment on its own. The results of the first part are reported in De Grauwe et al. (2014). The second part provided the data reported in the current study. In the description of the methods used, the reader is referred to De Grauwe et al.'s (2014) study where appropriate.

As mentioned above, a long-lag priming methodology was used. Complex transparent verbs (targets) were preceded by their stems (primes), with four to six intervening stimuli (primed condition). This condition was contrasted with a condition with complex verb targets that were not preceded by their stem (unprimed condition). To keep the set of stimuli similar across the two priming conditions, the verb targets in the unprimed condition were followed by their stem, with the same number of intervening stimuli. The potential priming effect in the primed condition was enhanced by making use of part 1 of the two-part fMRI session: in addition to its presentation as a prime for the primed complex verb target in part 2, the stem had already been presented twice in part 1, once as a simple verb and once as the stem of a semantically opaque complex verb. Thus, primed complex targets were primed three times: twice in part 1 and once in part 2. In contrast, the stems of unprimed complex targets had not been presented before (neither in part 1 nor 2). An overview of the design can be found in **Table 1**.

# **MATERIALS AND METHODS**

### **PARTICIPANTS**

Initially, 21 L1 speakers<sup>3</sup> of Dutch and 29 German L2 speakers of Dutch participated in the study. After exclusion (for details, see Results below), 18 L1 speakers (14 female, four male) and 21 L2 speakers (13 female, eight male) remained. The mean age of the remaining participants was 22.11 (SD: 2.42, range 18–26) for L1 speakers and 24.62 (SD: 2.13, range 22–29) for L2 participants.

The L2 participants, most of them students at the Radboud University Nijmegen, had German as their dominant language, had lived and/or studied in the Netherlands for at least 1.5 years, and used Dutch regularly for their studies, work and/or private life. Prior to the fMRI experiment, they were asked to complete the online version of the Dutch LexTALE test (Lemhöfer and Broersma, 2012), a non-speeded visual lexical decision test. Only participants with a minimum score of 67.50% were invited for the fMRI experiment. The average score of the selected participants on the LexTALE test was 78.04% (SD 7.63%). After participating in the fMRI experiment, L2 participants completed a self-assessment rating on their proficiency in Dutch (see Supplementary Material, Table S1, for results). Their mean age of acquisition of Dutch was 20.10 (SD 2.45), and they had an average of 4.52 (SD 3.03) years of experience with Dutch.

**Table 1 | Design. Triple priming vs. no priming.**

German and English translations in parentheses. Targets are printed in bold.

The L1 participants, most of them students at the Radboud University Nijmegen, had Dutch as their first and dominant language. They had lived in the Netherlands from birth.

All participants were right-handed and reported having no reading disorders. They gave their written consent in accordance with national legislation and the Helsinki Declaration of 1975, revised in 2004. The study received ethical approval from the local reviewing committee (Commissie Mensgebonden Onderzoek, regio Arnhem Nijmegen; approval number 2001/095 and amendment "Imaging Human Cognition" 2006, 2008).

### **MATERIALS**

Seventy Dutch morphologically complex verbs were selected as targets (see **Table 1** for examples). They were all semantically transparent, derived Dutch prefix verbs. Because of the high similarity between Dutch and German, it was not possible to select enough non-cognate verbs of this type. Therefore, we restricted ourselves to cognate verbs. These were mostly non-identical in form (e.g., *inslapen* – German: *einschlafen*/English: *fall asleep*), except for two verbs (*bedienen* – German: *bedienen*/English: *serve*; *bemerken* – German: *bemerken*/English: *notice*). Half of the targets occurred in the primed condition, the other half in the unprimed condition. The primed condition contained 28 particle (i.e., separable) verbs and seven prefixed (i.e., non-separable) verbs, whereas the unprimed condition contained 27 particle verbs and eight prefixed verbs.

<sup>2</sup>Parts 1 and 2 of the fMRI session took place immediately after each other. In between the two parts, participants could take a small break of several minutes, during which they remained in the scanner.

<sup>3</sup>In part 1 (De Grauwe et al., 2014), 22 L1 participants took part. One of them only participated in part 1 and not in part 2, resulting in 21 initial L1 participants for the current study.

Complex targets were selected on the basis of two prior rating studies. First, the degree of transparency of the complex verbs was determined on the basis of the transparency/opacity rating reported by De Grauwe et al. (2014). Primed and unprimed transparent complex verbs were matched on degree of transparency, as determined by a *t*-test (*p* > 0.47). Second, De Grauwe et al. (2014) had selected stems such that they were either clearly motor-related or not. Thus, the stems of the primed complex targets in the current study were either clearly motor-related or not. To match these stems with the stems of the unprimed complex targets (which did not occur in De Grauwe et al., 2014), the same number of motor- and non-motor-related stems was included in both priming conditions (19 motor-related and 16 non-motor-related stems in each condition). In addition, the degree of motor-relatedness was rated (see De Grauwe et al., 2014) and matched for stems in the primed and unprimed conditions (*p* > 0.66). Primed and unprimed complex verbs were also matched in terms of whole-word length and stem length (number of letters; *p*s > 0.53), and whole-word frequency and stem frequency (log-transformed lemma frequency, based on the Celex database, Baayen et al., 1995; *p*s > 0.39). (See Supplementary Material, Table S2, for further details on stimulus characteristics.)

Thus, participants saw 140 words: 35 primed complex targets, 35 unprimed complex targets, 35 stems used as primes for the primed complex targets, and 35 stems used as fillers (following the complex targets in the unprimed condition). Twenty-eight pseudo-words were added, all of them verb-like (ending in the Dutch infinitive suffix "en") and obeying the phonotactic rules of Dutch. They were created by changing one or more letters of real Dutch words. Half of them were "complex," consisting of an existing Dutch prefix and a non-existing stem. The other half were "simple," being the non-existing stems of the complex pseudo-words. Half of the complex pseudo-words were "primed," that is, they were preceded by their stem in the present study (i.e., in part 2 of the fMRI session) and had also been presented in part 1 of the fMRI session (see **Table 1**). The other half of the complex pseudo-words were "unprimed."

### **STIMULUS PRESENTATION**

Participants saw the stimuli through a mirror attached to the head coil while lying on their back in the scanner. Their task was to respond to pseudo-words only (go/no-go task), by pushing a button on a response box with their right index finger. Each trial started with a blank screen presented for a variable jitter time (0–2000 ms), followed by a fixation cross (400 ms). Then the stimulus appeared and remained on the screen for 2000 ms or until a response was recorded. Finally, a blank screen was presented until the fixed trial length of 8440 ms was reached. Word and pseudoword trials were interspersed with 28 null trials. These consisted of a blank screen shown for 8440 ms. The stimuli were presented in 20-point, light-gray, lower-case letters in Arial font against a black background using Presentation software (developed by Neurobehavioral Systems, http://www.neurobs.com).

Four different lists were generated. Each list was randomized with the restriction that words of the same word condition and pseudo-words were not presented on more than three consecutive trials. Primed complex verbs were always preceded by their stem, while unprimed complex verbs were always followed by their stem, with four to six intervening stimuli between a complex verb and its stem in both cases. Participants saw all 196 trials in one block, which lasted ∼30 min.
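The trial counts and timing reported above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text (including the EPI repetition time of 2110 ms given in the acquisition section); the variable names are illustrative and not part of the original experiment scripts.

```python
# Trial counts and timing as reported in the text (variable names are illustrative).
N_WORDS = 4 * 35        # primed targets, unprimed targets, stem primes, stem fillers
N_PSEUDOWORDS = 28
N_NULL_TRIALS = 28
TRIAL_MS = 8440         # fixed trial length in ms
TR_MS = 2110            # EPI repetition time in ms

n_trials = N_WORDS + N_PSEUDOWORDS + N_NULL_TRIALS
block_minutes = n_trials * TRIAL_MS / 60000
trs_per_trial = TRIAL_MS / TR_MS

print(n_trials)                 # 196
print(round(block_minutes, 1))  # 27.6
print(trs_per_trial)            # 4.0
```

The block length of about 27.6 min is consistent with the reported ∼30 min, and each fixed-length trial spans exactly four repetition times.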

Before the fMRI session, participants were familiarized with the task in a practice block of eight word and eight pseudo-word trials outside the scanner. Following the fMRI session, they completed two off-line ratings: a motor-relatedness rating of the words of part 1 (see De Grauwe et al., 2014) and a familiarity rating of the words of part 2. In the familiarity rating, participants were asked to indicate for each word if they knew it or not. Finally, L2 participants filled out a language background questionnaire to rate their proficiency in Dutch (see Supplementary Material, Table S1, for results).

### **BEHAVIORAL DATA ANALYSIS**

Mean error percentages to words and pseudo-words were calculated. Error percentages to complex words were analyzed with a 2 × 2 repeated-measures analysis of variance (ANOVA) with the factors of Language (between-participant factor; L1 vs. L2) and Priming (within-participant factor; Primed vs. Unprimed).

Participants were excluded from further analysis if they made more than 30% errors to pseudo-words or if fewer than 25 trials per critical condition remained in the fMRI analysis. Items were excluded from further analysis for a certain language group if their error percentage was more than three standard deviations above the mean of their language group. Only correctly answered trials were included in the fMRI analyses.
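The exclusion rules above can be written out as a minimal sketch. The function names and the toy data are hypothetical, and two details are assumptions rather than facts from the text: the item cutoff is computed over the per-item error rates within one language group, and the sample standard deviation is used.

```python
from statistics import mean, stdev

MAX_PSEUDOWORD_ERROR = 0.30    # exclude participants with more than 30% errors
MIN_TRIALS_PER_CONDITION = 25  # exclude participants with fewer usable trials
ITEM_SD_CUTOFF = 3.0           # exclude items more than 3 SD above the group mean


def exclude_participant(pseudoword_error_rate, trials_per_condition):
    """Return True if a participant fails either exclusion criterion."""
    too_many_errors = pseudoword_error_rate > MAX_PSEUDOWORD_ERROR
    too_few_trials = min(trials_per_condition.values()) < MIN_TRIALS_PER_CONDITION
    return too_many_errors or too_few_trials


def excluded_items(item_error_rates):
    """Return the items whose error rate lies above mean + 3 SD (sample SD assumed)."""
    rates = list(item_error_rates.values())
    cutoff = mean(rates) + ITEM_SD_CUTOFF * stdev(rates)
    return sorted(item for item, rate in item_error_rates.items() if rate > cutoff)


# Made-up data for illustration only: 19 error-free items and one difficult item.
rates = {f"item{i:02d}": 0.0 for i in range(19)}
rates["item19"] = 0.5

print(exclude_participant(0.12, {"primed": 33, "unprimed": 31}))  # False
print(exclude_participant(0.10, {"primed": 24, "unprimed": 30}))  # True
print(excluded_items(rates))                                      # ['item19']
```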

### **fMRI DATA ACQUISITION AND ANALYSIS**

Whole-brain images were acquired on a Siemens TRIO 3.0T MRI system (Siemens, Erlangen, Germany). For the EPI images, the following acquisition parameters were used: 31 axial slices, TR = 2110 ms, TE = 30 ms, flip angle = 90°, voxel size = 3.5 mm × 3.5 mm × 3.5 mm. High-resolution anatomical images were acquired using an MPRAGE sequence (192 sagittal slices, TR = 2300 ms, TE = 3.03 ms, FOV = 256, voxel size = 1 mm × 1 mm × 1 mm).

Imaging data were analyzed using SPM8 (Statistical Parametric Mapping, http://www.fil.ion.ucl.ac.uk/spm). After discarding the first five volumes, preprocessing was performed by motion correction through rigid body registration along three translations and three rotations, slice timing correction using the middle slice (slice 17) as reference, normalization to the T1 image in MNI space and spatial smoothing using an isotropic 8-mm FWHM Gaussian kernel. For one participant, the normalization procedure led to considerable distortion. Therefore, this participant's images were normalized to a standard EPI template centered in MNI space.

For the first-level analysis, the preprocessed functional images of each participant were analyzed using the general linear model with regressors for each word condition (Primed, Unprimed, Stem Prime, and Stem Filler). A regressor for the null trials was added, as well as the six realignment parameters generated during motion correction (three translation and three rotation parameters). The regressors were convolved with a canonical hemodynamic response function.

### *ROI analyses*

To find out whether primed complex verbs (compared to unprimed complex verbs) led to repetition suppression in the LIFG, three ROIs were defined in this area: Brodmann Area (BA) 44, 45, and 47. For this, the BAs section of the Talairach Daemon database was used in the WFU PickAtlas toolbox (Lancaster et al., 1997, 2000; Maldjian et al., 2003, 2004). Together, these three ROIs make up most of the LIFG gray matter. Using these ROIs thus allows us to draw conclusions regarding activation in the LIFG ROIs separately (if an interaction with the ROI factor is found) or regarding activation in the LIFG as a whole (if the effects found are not modulated by the ROI factor).

For each participant and each ROI, the contrast values for each complex verb condition compared to the null condition were calculated using MarsBar, and averaged across all voxels in the ROI (Brett et al., 2002). These were entered into a (3 × 2 × 2) repeated-measures ANOVA with the factors of ROI (BA44 vs. BA45 vs. BA47), Language (L1 vs. L2) and Priming (Primed vs. Unprimed). In addition, results for each language group were analyzed separately using repeated-measures ANOVAs with the factors of ROI (BA44 vs. BA45 vs. BA47) and Priming (Primed vs. Unprimed). Only effects and interactions involving Priming are reported. A significance level of α = 0.05 was used, and the Greenhouse and Geisser (1959) correction was applied to correct for violations of sphericity when there was more than one degree of freedom in the numerator. In those cases, original degrees of freedom and adjusted *p*-values are reported.

### *Whole-brain analyses*

To determine whether other brain regions are also involved in the processing of morphologically complex words, we conducted a second-level random effects analysis over both language groups. For this, the contrast images of the complex word conditions vs. the null condition of each participant were entered into a full-factorial 2 × 2 analysis (Language: L1 vs. L2; Priming: Primed vs. Unprimed). The main effect of Priming and the interaction between Language and Priming were investigated with directional *t*-tests: Unprimed – Primed and reverse, and L1 (Unprimed – Primed) – L2 (Unprimed – Primed) and reverse, respectively. In addition, *t*-tests were used to investigate whether the effect of Priming was present for each of the two language groups separately.

A double threshold was used to protect against false positives: a voxel-level *p*-value of *p* < 0.005 (uncorrected) was combined with a minimum cluster size of 65 voxels. This led to a correction for multiple comparisons of *p* < 0.05, as determined by the randomization method proposed by Slotnick et al. (2003; see also De Grauwe et al., 2014, for more details).
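To illustrate how such a double threshold operates, the following sketch keeps only clusters of supra-threshold voxels that reach the minimum extent. It is not the analysis code used in the study: the function name, the dictionary representation of the *p*-map and the toy data are hypothetical, voxels are treated as connected only when they share a face (analysis packages may use a different neighborhood), and the cluster size of 65 is taken as given rather than derived by simulation.

```python
from collections import deque

# Face-sharing neighbors in 3D (an assumption; other definitions exist).
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def surviving_clusters(p_map, p_thresh=0.005, min_cluster_size=65):
    """Return clusters (lists of voxel coordinates) that pass both thresholds.

    p_map maps (x, y, z) voxel coordinates to uncorrected p-values.
    """
    supra = {voxel for voxel, p in p_map.items() if p < p_thresh}
    clusters, seen = [], set()
    for start in sorted(supra):
        if start in seen:
            continue
        seen.add(start)
        queue, cluster = deque([start]), []
        while queue:  # breadth-first search over connected supra-threshold voxels
            x, y, z = queue.popleft()
            cluster.append((x, y, z))
            for dx, dy, dz in NEIGHBORS:
                nb = (x + dx, y + dy, z + dz)
                if nb in supra and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        if len(cluster) >= min_cluster_size:
            clusters.append(cluster)
    return clusters


# Toy p-map: a 75-voxel block, a separate 8-voxel block, and one weak voxel.
toy = {(x, y, z): 0.001 for x in range(5) for y in range(5) for z in range(3)}
toy.update({(x, y, z): 0.001 for x in range(10, 12) for y in range(2) for z in range(2)})
toy[(20, 20, 20)] = 0.2

clusters = surviving_clusters(toy)
print([len(c) for c in clusters])  # [75]
```

Only the large block survives: the small block passes the voxel-level threshold but not the extent criterion, and the isolated voxel fails the voxel-level threshold.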

# **RESULTS**

Eight (one L1, seven L2) out of the original 50 participants were excluded because their number of errors exceeded the criteria set. One additional L2 participant was excluded because of excessive motion, and two additional L1 participants were excluded because of compromised data quality. For each language group, three items were excluded because their percentage of errors exceeded the criterion set (see Supplementary Material, Table S3, for details).

### **BEHAVIORAL RESULTS**

On average, the L1 participants made only 1.5% errors to words (SD 1.6%) and 4.2% errors to pseudo-words (SD 4.8%). L2 participants made 5.2% errors to words (SD 4.1%) and 11.9% errors to pseudo-words (SD 8.7%), indicating that, as expected, the task was more demanding for them.

**Table 2** gives the mean error percentages for complex verbs for L1 and L2 speakers. The repeated-measures ANOVA on the error percentages on complex verbs revealed significant main effects of Language [*F*(1,17) = 9.52, *p* < 0.01] and Priming [*F*(1,17) = 8.16, *p* < 0.01], modulated by a significant Language by Priming interaction [*F*(1,17) = 6.20, *p* < 0.05]. Follow-up analyses for the two language groups separately showed that L2 speakers made fewer errors to primed than to unprimed complex verbs (*p* = 0.001), whereas no difference was found between the two conditions in L1 speakers (*p* > 0.79).

### **fMRI RESULTS**

### *ROI analyses*

The ANOVA over both groups revealed that the main effect of Priming was significant, indicating that primed complex verbs elicited less activation in the LIFG than unprimed complex verbs (**Table 3**). None of the interactions of Priming with the other two variables (Language and ROI) was significant.

Although the interactions involving Priming and Language were not significant, L1 and L2 speakers were also analyzed separately for exploratory purposes, to make sure that the Priming effect was indeed present in both groups (see **Figure 1**). The ANOVA for L2 speakers showed that the LIFG was activated less for primed than for unprimed complex verbs. For L1 speakers, however, no such difference was found: none of the effects or interactions was significant.

**Table 2 | Behavioral results. Mean error percentages to complex verbs.**

Standard deviation in parentheses.

**Table 3 | ROI analyses. Repeated-measures ANOVAs on contrast values for complex verbs.**

–, not applicable.

To determine whether the null hypothesis (i.e., no difference between primed and unprimed complex verbs) can be accepted for L1 speakers, we performed a Bayesian analysis of the L1 data. For this, we used Masson's (2011) approach, which is based on a transformation of the sum-of-squares values obtained in a regular ANOVA. For the main effect of Priming with L1 speakers, the resulting Bayes factor was 2.53. This is equivalent to 71.6% support for the null hypothesis, as opposed to 28.4% support for the alternative hypothesis. According to Raftery (1995), this constitutes weak evidence in favor of the null hypothesis.
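The step from a Bayes factor to a percentage of support can be sketched as follows. The code reflects our understanding of the BIC approximation described by Masson (2011) and of the verbal ranges attributed to Raftery (1995). The function names are hypothetical, equal prior odds are assumed, and the sums of squares in the last call are made-up numbers, not values from this study.

```python
import math


def bf01_from_sse(n, sse_alt, sse_null, extra_params=1):
    """BIC-based approximation of the Bayes factor in favor of H0.

    n is the number of independent observations; sse_alt and sse_null are the
    unexplained sums of squares under H1 and H0 (assumed inputs).
    """
    delta_bic = n * math.log(sse_alt / sse_null) + extra_params * math.log(n)
    return math.exp(delta_bic / 2.0)


def posterior_null(bf01):
    """Posterior probability of H0 given BF01, assuming equal prior odds."""
    return bf01 / (1.0 + bf01)


def raftery_label(p):
    """Verbal label for a posterior probability (ranges as commonly cited)."""
    if p < 0.50:
        return "favors the other hypothesis"
    if p < 0.75:
        return "weak"
    if p < 0.95:
        return "positive"
    if p < 0.99:
        return "strong"
    return "very strong"


p_h0 = posterior_null(2.53)
print(round(p_h0, 3))       # 0.717
print(raftery_label(p_h0))  # weak

# Illustrative call with made-up sums of squares (not data from this study).
print(bf01_from_sse(n=18, sse_alt=90.0, sse_null=100.0) > 1)  # True
```

Starting from the rounded Bayes factor of 2.53, the conversion gives about 71.7%; the small difference from the reported 71.6% presumably arises because the reported Bayes factor is itself rounded.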

We also wanted to know whether the neural priming effect found for L2 speakers is due to the increased difficulty of unprimed compared to primed complex verbs. Therefore, a regression analysis was performed. The predictor in this analysis was the difference in error percentage between unprimed and primed complex verbs for L2 participants. For the dependent variable, an LIFG ROI was created by combining the BA44, BA45 and BA47 ROIs. For this ROI, contrast values were extracted for primed and unprimed conditions for each L2 participant using MarsBar (Brett et al., 2002). The difference between the contrast values for unprimed and primed complex verbs constituted the dependent variable. Results showed no evidence that the size of the priming effect in error percentages predicted the difference in contrast values between unprimed and primed conditions (*p* = 0.84).

So far, the results indicate that L2 participants show a clear Priming effect for complex verbs in the LIFG. The results for L1 participants are not as clear: descriptively, they also show a Priming effect, and the analysis over both groups shows no evidence of an interaction of the significant Priming effect with participant group. However, in the analysis over L1 speakers only, the Priming effect fails to reach significance. Still, the Bayesian analysis of the L1 results only provides weak evidence for the absence of a Priming effect. These results will be addressed in more detail in the Discussion.

### *Whole-brain analyses*

To examine whether the Priming effect was present not only in the LIFG but also in other brain regions, a full-factorial second-level analysis over both groups was performed (see Supplementary Material, Table S4, for an overview of significant activations).

The Unprimed vs. Primed contrast yielded five significant left-lateralized clusters of activation: from the pars orbitalis to the pars triangularis in the LIFG (overlapping with the BA47 ROI), in the pars opercularis of the LIFG (overlapping with the BA44 and some of the BA45 ROI) reaching into the insula, in the supramarginal gyrus, in the posterior superior temporal sulcus and in the bilateral medial superior frontal gyrus (see **Figure 2**).

The reverse contrast (Primed vs. Unprimed) also revealed five significant clusters: one cluster extended from the left insula to the left superior temporal gyrus, one was found in the right superior temporal gyrus, one in the right hippocampus reaching into the parahippocampal gyrus, one in the bilateral cerebellum and one in the right inferior parietal lobule.

For the Language by Priming interaction contrast [L1 (Unprimed – Primed) – L2 (Unprimed – Primed)], two significant clusters were found bilaterally in the posterior insula. For the reverse contrast, no significant clusters were found. To informally inspect whether the lack of significant activations was due to thresholding issues, the threshold was lowered to *p* < 0.005 (uncorrected). With this threshold, clusters were found in the pars opercularis of the LIFG and the left insula. However, they were too small (*k* < 7) to satisfy the corrected *p* < 0.05 threshold.

**Figure 2 | [Unprimed – Primed] contrast in the full-factorial whole-brain analysis.** Red: both groups; green: L2 speakers; yellow: overlap between activations for both groups and for L2 speakers. *p* < 0.005/*k* > 65, leading to a correction for multiple comparisons of *p* < 0.05. No significant activation was found for L1 speakers for this contrast.

When L1 speakers were analyzed separately, the Unprimed vs. Primed contrast revealed no significant clusters. Again, to rule out thresholding issues, the threshold was lowered to *p* < 0.005 (uncorrected). The only cluster coming close to significance at this threshold was located in the left posterior superior temporal sulcus (*k* = 39). For the reverse contrast, no significant clusters were found either. At the *p* < 0.005 (uncorrected) threshold, small clusters were found in the right superior temporal gyrus, the left cerebellum, the right inferior parietal lobule, the right inferior frontal sulcus and left periventricular white matter. However, they were all too small to satisfy the corrected *p* < 0.05 threshold (*k* < 14).

In contrast, L2 speakers showed significant activation for the Unprimed vs. Primed contrast (see **Figure 2**). A large left-lateralized cluster stretched from the pars orbitalis over the pars triangularis to the pars opercularis of the LIFG (overlapping with the three LIFG ROIs), reaching into the insula. With the threshold lowered to *p* < 0.005 (uncorrected), the only other cluster coming close to significance was situated in the left supramarginal gyrus (*k* = 39). For the reverse contrast, significant bilateral clusters were found in the superior temporal gyrus, extending into the ventral insula, and in the dorsal insula, reaching into the right parietal operculum. Another significant right-lateralized cluster stretched from the parahippocampal gyrus into the hippocampus.

To summarize, the whole-brain analysis confirmed a clear Priming effect in the LIFG over both groups and for L2 participants, and revealed additional clusters of activation in bilateral temporal, parietal, and frontal regions over both groups.

# **DISCUSSION**

In this long-lag priming fMRI study, the processing of semantically transparent derived verbs was investigated in L1 and L2 speakers. The priming paradigm allowed us to determine whether the LIFG showed a repetition suppression effect to primed compared to unprimed transparent derivations. Such an effect would indicate that, in the LIFG, the primed target (derivation) is processed more efficiently because the same process has already been applied to the prime (stem). Since long-lag priming is supposed to reflect morphological rather than semantic or formal processing, this facilitation should be due to morphological decomposition rather than to semantic and/or form similarities between stem and derivation (see Introduction). Both ROI analyses and whole-brain analyses revealed that repetition suppression effects were indeed present in the LIFG for primed compared to unprimed complex verb targets. This was true both for the analyses over the two language groups together and for the analyses of L2 participants only. When L1 speakers were analyzed separately, no such priming effect was found. However, no evidence was found of a difference between the two language groups in the LIFG, as shown by the lack of a Language by Priming interaction in this area. The whole-brain analysis over both groups also revealed additional repetition suppression effects in mainly left-lateralized temporal, parietal, and frontal regions, and increased activations or repetition enhancement effects for primed compared to unprimed derived verbs in bilateral temporal and cerebellar regions and right parietal areas.

The involvement of the LIFG in morphological processing has been revealed in many neuroimaging studies on derivational and inflectional processing in L1 and L2 speakers, both in studies using a priming paradigm (L1 derivations: Bozic et al., 2007; Bick et al., 2009; L2 derivations: Bick et al., 2010; L2 inflections: Pliatsikas et al., 2014a) and in studies not using a priming paradigm (L1 derivations: Vannest et al., 2005, 2011; Meinzer et al., 2009; Pliatsikas et al., 2014b; L1 inflections: Laine et al., 1999; Tyler et al., 2005; Lehtonen et al., 2006; L2 inflections: Lehtonen et al., 2009; for a discussion of the potential effect of using a priming paradigm, see below). The involvement of the LIFG has been interpreted as evidence for decomposition of morphologically complex words. More specifically, the LIFG has been postulated to be involved in morpho-phonological segmentation of complex words (Tyler et al., 2005). In another account (Lehtonen et al., 2006), however, this segmentation function is attributed to more posterior areas, such as the left occipitotemporal cortex (OT), whereas the LIFG is supposed to support later combinatorial processes in which stem and affix are phonologically and semantically integrated. This account is supported by studies suggesting that the LIFG is involved in controlled retrieval and manipulation processes of semantic and phonological representations (e.g., Poldrack et al., 1999; Wagner et al., 2001). In addition, several masked priming fMRI studies on morphological processing showed repetition suppression in the left OT for morphologically related word pairs, suggesting that this region is involved in early stages of morphological processing (L1 derivations: Gold and Rastle, 2007; L2 derivations: Bick et al., 2010; L2 inflections: Lehtonen et al., 2009). In our study, we did not find any involvement of the OT. This may be related to our use of long-lag priming, which may not be as sensitive to early effects as masked priming.

Left inferior frontal gyrus involvement in morphological processing is sometimes accompanied by the involvement of the left or bilateral posterior superior temporal gyrus (pSTG; L1 derivations: Meinzer et al., 2009; Vannest et al., 2011; Bozic et al., 2013; L1 inflections: Laine et al., 1999; Tyler et al., 2005) or superior temporal sulcus (pSTS; L1 inflections: Lehtonen et al., 2006). So far, this has only been found in studies on L1 morphological processing. The pSTG has been associated with phonological and/or lexico-semantic processing (phonological: Binder et al., 2009; Graves et al., 2014; lexico-semantic: Grindrod et al., 2008; Ruff et al., 2008; Ulrich et al., 2013), whereas activation of the pSTS has mainly been found for phonological processing (Price, 2000; Buchsbaum et al., 2001; Turkeltaub et al., 2003). The involvement of these areas in morphological processing has been attributed to lexical access to the stems of inflected words (Tyler et al., 2005) or access to semantic, phonological and/or syntactic representations of stems and affixes (Lehtonen et al., 2006).

In the current study, the priming paradigm led to a pattern of repetition suppression and repetition enhancement effects in both inferior frontal and posterior temporal areas for primed compared to unprimed morphologically complex words: repetition suppression effects were found in the LIFG and left pSTS, and repetition enhancement effects were found in bilateral pSTG. According to Henson (2003), repetition suppression indicates that the same type of processing occurs for primed and unprimed stimuli in the areas showing this effect, a processing that is facilitated by the prime in the primed condition, but not in the unprimed condition (for a more elaborate explanation, see Introduction). In contrast, repetition enhancement effects are generally interpreted to show additional processing for primed compared to unprimed stimuli in the areas showing increased activation (Henson, 2003). First, we will discuss the repetition suppression effects we found; then we will go into the repetition enhancement effects.

The repetition suppression effect in the left pSTS indicates that the (phonological) representations of the stems are accessed for both primed and unprimed transparent verbs, but that this is facilitated for the former because their stems have already been accessed upon presentation of the stem primes. The controlled retrieval account of the LIFG (e.g., Poldrack et al., 1999) suggests that the LIFG controls access to these representations, following decomposition of the complex verb into stem and affix (Lehtonen et al., 2006). In this account, the repetition suppression effect found in the LIFG indicates that controlled retrieval of the representations of stem and affix occurs for both primed and unprimed complex verbs, but that this is facilitated for the former because the stem representation is already retrieved upon presentation of the prime. The alternative account, i.e., that the LIFG supports the morphological segmentation process itself (Tyler et al., 2005), seems more difficult to integrate with the repetition suppression results. The facilitation reflected by repetition suppression is supposed to be due to performance of the same process on the prime as on the primed stimulus (Henson, 2003). Therefore, presentation of the stem prime should not lead to facilitation of the morphological segmentation process of the primed complex verb, as morphological segmentation is not performed on the stem prime itself. Of course, the LIFG may support morphological segmentation for both primed and unprimed complex verbs to a similar degree, in addition to controlling access to stem representations. This cannot be determined on the basis of the current study, as our results are dependent on the comparison of primed and unprimed complex verbs.

Next, we turn to the repetition enhancement effect. The increased activation in the bilateral pSTG indicates that additional semantic and/or phonological processing occurs for primed compared to unprimed complex verbs (see above). One could hypothesize, first, that priming of the stem can also lead to increased competition between the representation of the stem and the representation of the complex verb, and/or additional comparison processes between these representations. It is unclear, though, why this would not also lead to repetition enhancement effects in (subregions of) the LIFG, as the latter is supposed to control such processing.

Alternatively, the repetition enhancement effect in the pSTG may be related to learning. Repetition enhancement rather than repetition suppression effects have been found to occur with unfamiliar stimuli (Segaert et al., 2013). The repetition of unfamiliar stimuli may lead to the creation of new representations, which involves increased activation. In contrast, familiar stimuli already have stable representations, so that no increased activation is necessary to build their representations. The stimuli used in the current experiment were moderately frequent (approximately 13 per million). Thus, they would be familiar enough for L1 speakers, but probably relatively unfamiliar for L2 speakers, as also reflected by their relatively high error percentage. As shown in the whole-brain analyses, the repetition enhancement effects in our analysis over both groups seem to be primarily driven by the L2 speakers' results. In fact, the only significant interaction between Language and Priming is due to repetition enhancement effects in the bilateral posterior insula in L2 speakers and not L1 speakers. Activation in this area has been related to (bilingual) language learning (Ardila et al., 2014). The presence of repetition enhancement effects in the right hippocampal and parahippocampal regions also seems to support the learning account, as activation in these areas may indicate that memory encoding is taking place (e.g., Stark and Okado, 2003).

The studies on morphological processing discussed so far have all found evidence for decomposition of transparent derivations by revealing the involvement of the LIFG (sometimes combined with the pSTS/STG) in their processing. In contrast, in some (non-priming) fMRI studies on L1 speakers, either no evidence for decomposition or evidence for holistic processing of transparent derived words was found. Davis et al. (2004) found no significant differences between transparent derived or inflected words vs. simple words. Bozic et al. (2013) reported increased activation in bilateral frontotemporal regions for opaque derivations (e.g., *archer*, *breadth*) and transparent unproductive derivations (e.g., *warmth*) compared to simple words (but not for transparent productive derivations (e.g., *bravely*) compared to simple words). This bilateral activation pattern (including LIFG and RIFG) was interpreted to reflect more general perceptual and semantic processes supporting language comprehension. Since no specific left-lateralized system was engaged, (transparent and opaque) derived words were supposed to be processed holistically. In contrast, inflected words were argued to be decomposed (Bozic et al., 2010), because they were processed by such a left-lateralized frontotemporal system (including LIFG but not RIFG), supposedly specialized for grammatical computations. In the present study, a repetition suppression effect was found in the LIFG and no effects were found in the RIFG for derived verbs. According to the account proposed by Bozic et al. (2010, 2013) this would be an indication that the transparent derived verbs were decomposed.

Several explanations can be provided for the discrepancy between our results (involvement of LIFG but not RIFG) and Bozic et al.'s (2013) results (involvement of both LIFG and RIFG). Firstly, we used a morphological priming paradigm, whereas Bozic et al. (2013) used direct comparisons between simple and complex words. Possibly, priming increases the probability that derived words are decomposed: presentation of the stem may increase the chance that the morphological structure of subsequently presented derived words is recognized. This explanation is supported by the results of Bozic et al. (2007). In the latter fMRI study, a long-lag priming paradigm was used with derived words, and a left-lateralized effect was found: a repetition suppression effect in the LIFG and not in the RIFG. The idea that priming may lead to increased decomposition is in line with results showing that the processing of derived words is influenced by factors affecting the recognition of their morphological structure. For example, derived words with longer suffixes tend to be decomposed rather than being processed holistically (Kuperman et al., 2010).

Another factor which may influence the processing of derived words is the choice of task. Like Bozic et al. (2007), Meinzer et al. (2009), and Pliatsikas et al. (2014b), we used a linguistic task (lexical decision), whereas Bozic et al. (2013) used a non-linguistic task (detection of silent gaps within auditory stimuli). Possibly, the lexical decision task directs attention more to the morphological structure of derived words than gap detection does.

So far, we have only discussed the analyses over both groups. These revealed a pattern of repetition suppression and repetition enhancement effects in LIFG and pSTS/STG. In contrast, the analyses over L1 speakers only did not show any significant effects. It is difficult to draw any conclusions from this, however, as no significant Language by Priming interactions were found in the left frontotemporal regions which are normally associated with morphological processing (LIFG and left posterior temporal cortex). Thus, no evidence was found of a difference between L1 and L2 speakers in terms of derivation processing. Also, the Bayesian analysis of the L1 ROI data only revealed weak evidence in favor of the null hypothesis of no priming in L1 speakers. Finally, in the pSTS, a cluster was found just below significance for L1 speakers, which did reach significance in the analysis over both groups. As mentioned above, the pSTS has also been associated with morphological processing. One possible explanation for the absence of a clear priming effect in L1 speakers may be related to the familiarity of our stimuli. As mentioned before, unfamiliar stimuli often elicit repetition enhancement effects, whereas familiar stimuli generally elicit repetition suppression effects. For the L1 speakers, our stimuli were moderately familiar, i.e., they may have been too familiar to elicit repetition enhancement effects, but not familiar enough to elicit clear repetition suppression effects. However, stimulus familiarity cannot account for the whole pattern of results, as L2 participants, for whom the stimuli were relatively unfamiliar, displayed both repetition enhancement and repetition suppression effects.

In contrast with L1 speakers, L2 participants did display clear priming effects in the LIFG. This suggests that L2 speakers do decompose transparent derived verbs, rather than relying on holistic processing. This confirms some of the previous results on morphological processing in L2 speakers (Bick et al., 2010; Diependaele et al., 2011; Pliatsikas et al., 2014a), but contrasts with other studies (Silva and Clahsen, 2008; Clahsen and Neubauer, 2010). As mentioned in the Introduction, however, none of these studies used an unmasked priming paradigm, which may explain the differences found with this study. (For a further discussion of whole-brain analysis results, see Supplementary Material, Further Discussion of Whole-Brain Results).

Besides the significant repetition suppression effect in the LIFG, L2 speakers also displayed a significant behavioral effect: more errors were made to unprimed than to primed complex verbs. However, the regression analysis we conducted showed that there is no indication that the neural priming effect found for L2 speakers is due to the increased difficulty of unprimed compared to primed complex verbs.

A limitation of the present study is that, due to the high degree of relatedness between Dutch and German, we could not use noncognate verbs as stimuli (see Materials section). Therefore, our conclusions only pertain to the processing of cognate derivations by L2 speakers. Cognates have a special status in bilingual language processing, as they are not only similar in meaning in two languages, but also similar in form. The so-called "cognate facilitation effect" (e.g., Dijkstra et al., 2010) has shown that there might be transfer from L1 to L2 through cognates, at least in simple word recognition. It is not clear whether this special status also holds for morphological processing, and it remains to be investigated whether the same results are obtained for non-cognate as for cognate derived verbs. For this, a different language pair should be used, for example French L2 speakers of Dutch, so that enough non-cognate stimuli can be selected. Also, since our stimuli contained more particle (separable) verbs than prefixed (non-separable) verbs, the results we obtained may primarily have been driven by the particle verbs. However, as mentioned before, studies comparing the processing of particle and prefixed verbs have found no differences between the two types (Schriefers et al., 1991; Lüttmann et al., 2011). Therefore, we have no reason to assume that results would have been different if only prefixed verbs had been included.

To conclude, the central result of the present study is that L2 speakers of Dutch (with German as their L1) show a repetition suppression effect in the LIFG when processing semantically transparent derived Dutch verbs primed by their stems. In the context of other studies on the processing of morphologically complex words in L1 speakers, this indicates that German L2 speakers of Dutch decompose such morphologically complex verbs. In the whole-brain analysis over both L1 and L2 speakers of Dutch, the involvement of the LIFG was supplemented by a repetition suppression effect in the pSTS. This suggests that the (phonological) representations of the stems of the derivations are accessed after morphological decomposition, with the LIFG possibly controlling access to these stem representations. Additionally, L2 speakers of Dutch showed repetition enhancement effects in the bilateral superior temporal gyrus and insula and in the right parahippocampal gyrus. These may be related to L2 language learning, as the presentation of relatively unfamiliar stimuli may lead to the creation of new representations. Future research should address the question whether, first, the sensitivity of L2 speakers to morphological structure is restricted to morphologically complex words of the type investigated in this study, i.e., prefix verbs, or also generalizes to other types of morphologically complex words, such as suffixed nouns; and second, whether this morphological sensitivity of L2 speakers is restricted to languages with a similarly rich morphological system, such as Dutch and German (Basnight-Brown et al., 2007; Portin et al., 2008), or also generalizes to other language pairs (Pliatsikas et al., 2014a).

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnhum.2014.00802/abstract

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

### *Received: 25 July 2014; accepted: 20 September 2014; published online: 10 October 2014.*

*Citation: De Grauwe S, Lemhöfer K, Willems RM and Schriefers H (2014) L2 speakers decompose morphologically complex verbs: fMRI evidence from priming of transparent derived verbs. Front. Hum. Neurosci. 8:802. doi: 10.3389/fnhum.2014.00802 This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2014 De Grauwe, Lemhöfer, Willems and Schriefers. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The L2 decomposition of transparent derived verbs - Is it 'morphological'? A commentary on De Grauwe, Lemhöfer, Willems, & Schriefers (2014)

Gunnar Jacob\*

Potsdam Research Institute for Multilingualism, University of Potsdam, Potsdam, Germany

Keywords: morphological processing, derivational affixes, decomposition, non-native speakers

### **A commentary on**

# **L2 speakers decompose morphologically complex verbs: fMRI evidence from priming of transparent derived verbs**

by De Grauwe, S., Lemhöfer, K., Willems, R. M., and Schriefers, H. (2014). Front. Hum. Neurosci. 8:802. doi: 10.3389/fnhum.2014.00802

Assume you come across a morphologically-complex Japanese word such as " ." Even if you have absolutely no knowledge of Japanese at all, and are therefore completely insensitive to the word's morphological structure, you might still be able to distinguish between the stem " " and the affix " ." This is because in Japanese, stems are typically written in Kanji, while affixes are written in Hiragana, with the surface form differences between these two scripts being distinct enough that they might even be noticeable for someone without any knowledge of Japanese. As a result, you might actually be able to "decompose" the word, but this decomposition process obviously does not operate on morphological units. Instead, you simply make use of the fact that, in addition to being morphological units, the head " " and the affix " " also constitute units on a completely different level; they are at the same time also orthographic units.

How is this (admittedly rather far-fetched) example related to De Grauwe et al.'s (2014) study on morphological decomposition in non-native (L2) speakers? In their fMRI experiment, De Grauwe and colleagues convincingly show that L2 speakers of Dutch, just like native speakers, are able to decompose transparent derived verbs such as "opstaan" into the head "staan" and the modifier "op." Based on these findings, the authors argue against accounts of L2 morphological processing which assume qualitative differences between native speakers and L2 speakers with regard to morphological decomposition.

In De Grauwe's study, stems and affixes were of course not written in different scripts. However, just as in the Japanese example, the head "staan" and the modifier "op" in a Dutch word such as "opstaan" are not only morphological units, but also constitute units on other linguistic levels. First, at least for the vast majority of the materials used in De Grauwe's study, head and modifier are also existing lexical units. Specifically, "op" is a Dutch preposition, while "staan" is a verb. For separable verbs (which constitute 55 out of 70 verbs used in the experiment), this is actually the case by default, assuming that such verbs are either "phrasal constructs" (Booij, 1990) or derived through incorporation of a preposition into a verb (Van Riemsdijk, 1978). As a result, modifiers in separable verbs automatically also have to be existing words of their own. Second, "op" and "staan" also constitute syntactic units (Booij, 2002). While verbs are usually syntactic islands (i.e., a syntactic operation such as inflection is normally conducted on the entire verb), separable verbs are an exception to this; for example, in order to produce a grammatically correct Dutch sentence based on the verb "opstaan," such as "Marie staat op," the formulator has to separate head and modifier, and subsequently perform different syntactic operations (e.g., inflecting the head, moving each unit to its correct position in the sentence) on each of the two.

### Edited by:

Minna Lehtonen, University of Helsinki, Finland

### Reviewed by:

Jon Andoni Dunabeitia, Basque Center on Cognition, Brain and Language, Spain

Eva Smolka, University of Konstanz, Germany

### \*Correspondence:

Gunnar Jacob, gujacob@uni-potsdam.de

Received: 01 December 2014 Accepted: 07 April 2015 Published: 23 April 2015

### Citation:

Jacob G (2015) The L2 decomposition of transparent derived verbs - Is it 'morphological'? A commentary on De Grauwe, Lemhöfer, Willems, & Schriefers (2014). Front. Hum. Neurosci. 9:220. doi: 10.3389/fnhum.2015.00220

Thus, while the effects reported in De Grauwe's study presumably involve a form of decomposition, the particular properties of the derived verbs used in the study raise the question whether this decomposition mechanism really operates on morphological units. In other words, even a parser which is completely insensitive to morphology might be able to decompose "opstaan" into "op" and "staan," provided that it has access to either information about syntactic properties of separable verbs or to a lexicon which contains separate entries for "op" and "staan."
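The point can be made concrete with a minimal sketch of such a morphology-blind parser. Everything in it is an assumption made for illustration: the function name, the toy lexicon, and the simplification that a split counts whenever both parts are listed as free-standing words. The parser knows nothing about stems, affixes, or word formation rules.

```python
# Hypothetical toy lexicon of free-standing Dutch words.
# It carries no morphological information at all.
LEXICON = {"op", "staan", "aan", "komen", "schoon"}


def split_by_lexicon(word: str, lexicon: set[str]) -> list[tuple[str, str]]:
    """Return every split of `word` into two parts that are both lexicon entries."""
    return [
        (word[:i], word[i:])
        for i in range(1, len(word))
        if word[:i] in lexicon and word[i:] in lexicon
    ]


print(split_by_lexicon("opstaan", LEXICON))     # [('op', 'staan')]
print(split_by_lexicon("schoonheid", LEXICON))  # []
```

Under these assumptions, "opstaan" is split because both parts happen to be words, whereas a suffixed form such as "schoonheid" is not, since the suffix "-heid" has no lexical entry of its own. Successful "decomposition" of separable verbs therefore does not by itself show that morphological units are involved.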

A possible counter-argument against this is based on the particular area for which the decomposition effect occurred in the fMRI study. De Grauwe and colleagues correctly point out that the effect occurred in the LIFG, an area which, in several previous papers, has been found to play a role in morphological decomposition. However, it could simply be that the LIFG is generally involved in all sorts of decomposition processes. The same knife can theoretically be used to cut all sorts of different things into pieces.

Given these particular linguistic properties of their materials, how does De Grauwe's study relate to the current debate about L1/L2 differences in morphological processing? While previous behavioral studies investigating the L2 processing of derived forms (e.g., Silva and Clahsen, 2008; Clahsen and Neubauer, 2010; Diependaele et al., 2011; Kirkici and Clahsen, 2013) have come to different conclusions about L2 processing, all of these studies have discussed their findings with reference to the early morpho-orthographic segmentation mechanism proposed by Rastle et al. (2004) and Marslen-Wilson (2007). Crucially, this account assumes a decomposition mechanism which operates specifically on morphological units (in Rastle's case, morphemes; in Marslen-Wilson's case, affixes). Unlike De Grauwe and colleagues, the L2 studies mentioned above used derived forms in which stems and affixes constitute units only at the morphological level, and also, through appropriate control conditions, went to great lengths to ensure that priming effects are morphological in nature. Thus, while De Grauwe and colleagues interpret their findings as evidence against L1/L2 differences, the linguistic properties of their materials make it difficult to discuss their findings with reference to these previous studies. In this respect, behavioral studies which have found similar priming effects for derived forms in L1 and L2 speakers (e.g., Diependaele et al., 2011) can possibly be considered more direct evidence against the idea of fundamental differences between L1 and L2 processing. Additionally, De Grauwe's study also differs from these previous studies with regard to the methodological approach (long-lag priming vs. masked priming) and with regard to the possible role of the L1 in L2 processing (all stimuli were Dutch/German cognates), making the studies difficult to compare.

Importantly, these issues do not diminish De Grauwe's contribution to the field in any way. Their fMRI study quite convincingly shows that L2 speakers do not have a general problem with the decomposition mechanism per se. Also, De Grauwe's claims about the processing of the particular class of verbs investigated in the study, and the lack of fundamental L1-L2 differences with regard to these verbs, remain valid. The key question is whether these findings can be generalized from this particular verb class to all derivations, or whether such verbs possess specific linguistic properties which make them uniquely different from other types of morphologically-complex forms. Hence, it would be interesting to see whether L2 speakers show similar effects for types of morphologically-complex words in which stems and affixes only constitute units at the morphological level, such as derived nominalizations or even inflected forms.

# References


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Jacob. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neurophysiological evidence for whole form retrieval of complex derived words: a mismatch negativity study

# **Jeff Hanna \* and Friedemann Pulvermüller**

Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, Berlin, Germany

### **Edited by:**

Minna Lehtonen, University of Helsinki, Finland

### **Reviewed by:**

Cyril R. Pernet, University of Edinburgh, UK Juan Esteban Kamienkowski, Universidad de Buenos Aires, Argentina Joanna Morris, Hampshire College, USA

### **\*Correspondence:**

Jeff Hanna, Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, WE4, Habelschwerdter Allee 45, 14195 Berlin, Germany e-mail: jeff.hanna@gmail.com

Complex words can be seen as combinations of elementary units, decomposable into stems and affixes according to morphological rules. Alternatively, complex forms may be stored as single lexical entries and accessed as whole forms. This study uses an event-related potential brain response capable of indexing both whole-form retrieval and combinatorial processing, the Mismatch Negativity (MMN), to investigate early brain activity elicited by morphologically complex derived words in German. We presented complex words consisting of stems "sicher" (secure), or "sauber" (clean) combined with abstract nominalizing derivational affixes -heit or -keit, to form either congruent derived words: "Sicherheit" (security) and "Sauberkeit" (cleanliness), or incongruent derived pseudowords: \*"Sicherkeit", and \*"Sauberheit". Using this orthogonal design, it was possible to record brain responses for -heit and -keit in both congruent and incongruent contexts, therefore balancing acoustic variance. Previous research has shown that incongruent combinations of symbols elicit a stronger MMN than congruent combinations, but that single words or constructions stored as whole forms elicit a stronger MMN than pseudowords or non-existent constructions. We found that congruent derived words elicited a stronger MMN than incongruent derived words, beginning about 150 ms after perception of the critical morpheme. This pattern of results is consistent with whole-form storage of morphologically complex derived words as lexical units, or mini-constructions. Using distributed source localization methods, the MMN enhancement for well-formed derivationally complex words appeared to be most prominent in the left inferior anterior-temporal, bilateral superior parietal and bilateral post-central, supra-marginal areas. In addition, neurophysiological results reflected the frequency of derived forms, thus providing further converging evidence for whole form storage and against a combinatorial mechanism.

**Keywords: morphology, derivation, MMN, ERP, EEG, German**

# **INTRODUCTION**

Arguably, the defining characteristic of human language is the ability to iteratively combine units of meaning into more and more complex meaningful structures. The atomic meaning carriers are called *morphemes* and their combinations can be described by morphosyntactic rules. However, recent research in cognitive linguistics has cast doubt on the view that morphologically complex words are in all cases combined and assembled from their composite parts. Compelling arguments have been raised that at least a subset of the frequently used complex forms are stored as whole forms or mini-constructions in a lexicon or "constructicon" (Langacker, 1987; Goldberg, 2003). Consequently, these stored forms would be activated as whole units in the word recognition and language comprehension process. Such whole-form constructions may exist at the level of sentences (idioms, for example), phrases, or single, morphologically complex words.

In the present study we explore the processing of morphologically complex words bearing a derivational affix (e.g., calm-ness). As German is well-known for its rich derivational-morphological system, German derived word stimuli are well-suited for such investigations. Derivational affixes modify the meaning of a word and, in many cases, change its lexical category. For example, English derivational affixes -ness and -dom are taken on by adjectives, and convert them into nouns (calm*ness*, free*dom*). The German affixes we used in this study, -heit and -keit, share this property of converting adjectives into nouns. Additional advantages of the German forms are their phonological similarity to each other and their often unpredictable pairing with word stems; nearly all adjectives only allow pairing with one of them and, in exemplary cases, no phonological criteria are available that could firmly determine the to-be-chosen affix (Fleischer and Barz, 2012). As it is not straightforward to formulate a unique set of algorithmic rules describing relationships between their stems and affixes that encompasses all cases, these linguistic forms appear as good candidates for exploring the possibility that complex words may be stored as whole forms.
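The crossing of stems and affixes used here can be sketched in a few lines. The stems, affixes, and attested forms are those named in the abstract; the variable names and the use of a lookup set to decide congruency are illustrative assumptions, not a description of how the stimuli were actually constructed. The lookup set also serves as a toy picture of whole-form storage: congruency is decided by checking a list of stored forms rather than by applying a rule.

```python
from itertools import product

# Attested pairings named in the abstract; every other
# stem-affix combination is treated as a pseudoword.
ATTESTED = {"Sicherheit", "Sauberkeit"}

stems = ["sicher", "sauber"]
affixes = ["heit", "keit"]

design = []
for stem, affix in product(stems, affixes):
    form = (stem + affix).capitalize()
    condition = "congruent" if form in ATTESTED else "incongruent"
    design.append((form, affix, condition))

for form, affix, condition in design:
    print(f"{form:<12} -{affix}  {condition}")
```

Each affix appears once in a congruent and once in an incongruent form, which is what allows the acoustic properties of the critical morpheme to be balanced across conditions.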

Current theories of derivational processing range from total *obligatory decomposition*, where all derived forms are combined from their morphemes (Taft, 2004), to *dual-route* models allowing for both whole-form storage and composition, depending on linguistic properties of the word or the individuals' cognitive systems, which vary, for example, in maturation or language exposure (Caramazza et al., 1985; Schreuder and Baayen, 1995; Clahsen, 1999; Pinker, 1999; Ullman, 2001). A large body of evidence in the domain of visual masked priming (Rastle and Davis, 2008, for review) indicates that derived words undergo an obligatory morphological decomposition at an early stage of processing, and not only in the expected case of semantically transparent, morphologically complex words (e.g., hunter = hunt + er), but also in semantically opaque cases, where the word has the appearance of a derived form, but is actually morphologically simplex (e.g., corner ≅ corn + er) (Longtin et al., 2003; Rastle et al., 2004). Results from masked visual priming fMRI studies showing modulation of brain activity by morphological relatedness in left inferior frontal gyrus (LIFG) (Bozic et al., 2007; Levy et al., 2008) or occipital areas (Gold and Rastle, 2007), have been interpreted in favor of this account, although such activation *per se* cannot speak to the issue of whether whole-form storage or rather combinatorial processes are brought about by derived forms. Results from priming tasks where primes are fully perceivable have been used to suggest that semantically transparent derived forms are typically decomposed into their morphological constituents (e.g., Marslen-Wilson and Warren, 1994; Rastle and Davis, 2008 for full review).

Much prior neurophysiological work on derived forms also supports the obligatory decomposition hypothesis, though in many cases with evidence for an active, whole-form access route being available under special circumstances, for example with semantically opaque items. Two studies found enhanced N400 components to incorrectly derived words (Janssen et al., 2006; Leminen et al., 2010), which can be seen as supporting decomposition, as the N400 is known to be enhanced to semantically incongruous combinations of words (Kutas and Federmeier, 2011). Another EEG study found a reduced N400-like component in response to morphologically complex target stimuli primed by forms sharing their stem with the targets, in comparison to prime-target pairs with no morphological relationship (Lavric et al., 2007), which was used to argue that the same morphological unit is included in both prime and target, thus supporting composition and combination. However, later studies showed that such relatively late effects, following the critical stimulus word by 400 ms and longer, are only present in specific tasks and that early brain responses are increased to congruent derived forms compared with forms that violate morphological regularities, thus going against the N400 pattern (Leminen et al., 2011, 2013a). Bölte et al. (2009) found that incongruently derived words produced a left anterior negativity (LAN), which is generally thought to reflect "syntactic" or combinatorial processing (Kutas et al., 2006). Other studies found an ERP/F component around 200 ms after stimulus presentation, which was modulated according to whether there was a potential morphological relationship between prime and target, but not whether this relationship was semantically transparent or opaque (Zweig and Pylkkänen, 2009; Lehtonen et al., 2011; Lavric et al., 2012), which was also used to argue in favor of obligatory decomposition of derived forms. Solomyak and Marantz (2010) found that M170 amplitude correlates with the transition probability of lemma to suffix, but found no correlation to bigram-based transition probabilities on the same items. The authors interpret this as consistent with obligatory decomposition. However, a follow-up study with more items and participants also found an accompanying effect for surface form frequency, suggesting a parallel, whole-form access route (Lewis et al., 2011). Finally, Leminen et al. (2011) compared inflectional and derivational morphology processing and found that while the former produced a tight, consistent left lateralized activation of cortical sources in the perisylvian language cortex, derived and simplex words sparked a more dispersed and bilateral network of sources, with stronger RH activity for derived than simplex and inflected words. The authors interpret this topographical difference as evidence for whole-form access of derived words, with the possibility that derived forms are also in some cases decomposed in parallel.

In sum, consistent with a major part of the linguistic literature, most of the past behavioral, neurophysiological and brain imaging research, largely done in the visual modality, seems to support obligatory decomposition of morphologically complex derived words. The handful of studies which used the spoken modality produced results more consistent with a dual route account, suggesting that at least under specific circumstances and early after the onset of the critical morphologically derived stimulus (100–300 ms), whole form access may become relevant (Leminen et al., 2010, 2011; Whiting et al., 2013). As a fundamental theoretical caveat, the rationale underlying the interpretation of brain activation results relied on heuristics which were not always straightforward. For example, an N400 increase was sometimes used as an argument for combinatorial (de)composition, although it is well-known that this brain response also distinguishes whole-form-stored words from novel and therefore not stored pseudo-words, so that it is not a unique indicator of either storage or combination (Kutas and Federmeier, 2011). Other questionable heuristics concern the brain loci activated: left inferior frontal activity was sometimes used as an argument for decomposition, although single word and construction processing engage this locus too (Pulvermüller et al., 2009; Allen et al., 2012; Bozic et al., 2013a). For these reasons, it is desirable to investigate the brain basis of derivationally complex words (i) using spoken language as the primary and native modality of language; and (ii) using a theoretically founded neuromechanistic rationale for interpreting brain responses to language.

Whole forms are stored by memory traces, which, at the neurobiological level, are neuronal circuits that develop when words and constructions are being learned (Pulvermüller and Fadiga, 2010). Neurocomputational simulations and neuroimaging work show that these neuronal circuits are typically distributed over several areas (Garagnani et al., 2008). Activation generated by these memory circuits may add to the activation provided by sensory stimulation, so that when familiar words are recognized, stronger overall brain activity is elicited compared with the processing of acoustically similar pseudo-words, which would not activate a corresponding distributed neuronal assembly (Pulvermüller et al., 2014). The neurophysiological difference between existing, stored forms and unstored, novel forms should therefore be relatively greater activation to the stored forms (**Figure 1**, left). In contrast, combinatorial processes are supported by mechanisms that apply the same algorithm or combinatorial schema to a whole class of stored items. At the neuromechanistic level, this mechanism is captured by combinatorial neuronal circuits linked with two or more sets of neuronal assemblies for stored items (Pulvermüller, 2010). In this case, the typical combinatorial context of a target word leads to preactivation or priming of a target word's representation, so that, when the word itself appears, its neuronal assembly is already active to a degree and the additional activation process to bring it to full ignition is therefore reduced compared with the unprimed case (**Figure 1**, right). The neurophysiological difference between forms that are connected by a combinatorial mechanism and unlinked ones is therefore relatively reduced activation for the former. Thus, whereas stored forms should *increase* the brain response relative to unstored ones, regularly-combined forms should elicit *smaller* brain responses than ill-combined ones. These reverse neurophysiological indicators of combination and storage are underpinned by explicit neurocomputational simulations and experimental results. In this context, one brain response has been particularly fruitful, the mismatch negativity, or MMN, as we will explain below.

**of whole form retrieval (left side) and combinatorial processing (right)**. Left: The neurobiological substrate of a whole-form-stored word or construction is seen as a strongly connected, distributed neuronal circuit, encompassing not only the word's phonemic and acoustic properties (black nodes), but also its lexical and semantic properties (gray nodes). In contrast, an unfamiliar pseudo-word would activate only phonemic and acoustic networks, with comparatively weaker connections (dashed lines). Due to the broader connections of the construction circuit, it generates stronger activation than the weakly connected neuron set, as reflected in the differing strengths of their corresponding MMN brain responses.

[...] is seen as a set of stand-alone circuits with strong combinatorial links between them. These strong between-circuit links are missing in case of a sequence that violates combinatorial regularities. When activated, the terminal member of the combinatorial circuit creates less activation than the terminal member in the incoherent circuit, because the priming between strongly-connected network members reduces the final activation enhancement needed to fully ignite the terminal element. This extra activation is reflected in higher MMNs for the terminal element of an ungrammatical string (MMN data adopted from Pulvermüller et al., 2001; Pulvermüller and Assadollahi, 2007).

The MMN is an ERP which indexes the perception of change, for example when a series of frequently presented identical "standard" stimuli is interrupted by a rarely appearing and therefore unexpected "deviant" stimulus (Näätänen et al., 2007). In comparison to the ERP responses to standard stimuli, the ERP response to deviant stimuli shows a negative deflection manifesting in the fronto-central electrodes, typically somewhere between 100–200 ms after acoustic deviance. Interestingly, it could be shown that this MMN response to spoken words and constructions shows exactly the dynamics to stored and combined forms predicted by the neuromechanistic model summarized above: words elicit larger MMN responses than acoustically and psycholinguistically matched, novel, pseudo-word syllable combinations. We call this extra MMN activation for words or whole-forms the "lexical MMN" (lMMN; Korpilahti et al., 2001; Pulvermüller et al., 2001, 2004; Kujala et al., 2002; Shtyrov and Pulvermüller, 2002; Endrass et al., 2004; Pettigrew et al., 2004; Shtyrov et al., 2005, 2010). On the other hand, grammatically congruent combinations of words and morphemes elicit *reduced* MMNs relative to the large ones elicited by ungrammatical strings, called here "syntactic MMNs" (sMMN), indexing lack of a combinatorial mechanism (Pulvermüller and Shtyrov, 2003; Shtyrov et al., 2003; Pulvermüller and Assadollahi, 2007; Herrmann et al., 2009; Bakker et al., 2013; Hanna et al., 2014). Therefore, the MMN offers the opportunity to address questions about storage and combination at the neurophysiological level.

Over and above its properties as a neurophysiological index of whole-form-storage and combination, the MMN brings several further advantages for neuroscience investigations into language. First it manifests early, within 100–200 ms after the critical information about a construction can first be distinguished and understood. This is important, because language comprehension is a fast and early process, and responses with longer latency therefore run into the problem that it can become difficult to decide whether any brain processes indexed are indeed a hallmark of first-access parsing and understanding, or are rather epiphenomenal (Pulvermüller et al., 2009). Second, the MMN is elicited regardless of whether participants focus their attention on the stimuli or elsewhere. This is important because natural language is mostly understood without effort; in fact, it is very difficult *not* to understand one's native language. An ERP which disappears with the participant's attention is therefore not likely to index natural language processing *per se*, but possibly metalinguistic, post-linguistic, or task-related processes. Additional strengths of linguistic MMN experiments are that they use orthogonal designs and make it possible to minimize the variance caused by acoustic variation. These features make the MMN an ideal tool for investigating higher linguistic and cognitive processes, and especially for looking at the brain basis of storage and combination (Pulvermüller and Shtyrov, 2006).

Because of its double potential as an index of both whole form storage and combination, the MMN has indeed recently been used to inform the linguistic debate around whole form retrieval vs. combinatorial processing of complex words and constructions. Looking at inflected forms, Bakker et al. (2013) found larger MMN responses for incongruently inflected past-tense forms, i.e., a sMMN, suggesting combinatorial processing for regular past-tense, rather than whole-form storage. Cappelle et al. (2010) found that particle verbs, in spite of their manifestation as different words dispersed over a sentence, still behave neurophysiologically as single, stored lexical items, with congruent particle verbs like "heat...up" producing stronger MMNs than incongruent ones like <sup>∗</sup> "cool...up". Leminen et al. (2013b), used an orthogonal MMN design to directly compare inflectional and derivational processing in Finnish and found not only the whole-form-storage index (lMMN) for derived forms, but also the combinatorial pattern (sMMN) for inflected forms. This would indicate a status as whole-form items for complex derived words, which goes against the body of evidence favoring (de-) composition and combination. As highlighted in the discussion above, data and opinions diverge about the status of semantically opaque complex forms, but it is relatively uncontroversial that semantically transparent complex derivational forms are seen as combined from their parts (Marslen-Wilson et al., 1994). To clarify the issue, we looked here at transparent derived forms in a language with rich derivational morphology, German.

The current study exploits the sMMN/lMMN to explore how German derived nouns are processed by native speakers. In German, an adjective may be turned into an abstract noun by use of the derivational suffixes -heit and -keit (similar to English -ity or -ness). For example, "sicher" means "secure", "sicherheit" means "security", "sauber" means "clean", "sauberkeit" means "cleanliness." Note that these forms are semantically transparent so that classic morphological theories predict decomposition and combination. We presented "sicherheit", <sup>∗</sup> "sicherkeit", <sup>∗</sup> "sauberheit", and "sauberkeit" as deviant stimuli in the context of standard stimuli "sicher" and "sauber". When, for example, "sicher" is a standard and "sicherheit" follows as a deviant, an MMN is elicited from the onset of the "h" sound, and will additionally be modulated either by its status as a real word, or its status as a morphosyntactically correct combination. When <sup>∗</sup> "sicherkeit" follows as a deviant, however, the MMN response will be modulated by the word's status either as a pseudoword or an incongruent combination of morphemes. Any difference between these MMN responses, however, could easily be explained by the acoustic differences between "heit" and "keit", so a further control condition is necessary. We accomplish this by introducing an experimental block where "sauber" replaces "sicher" as the standard stimulus and root lexeme in the deviant stimuli. In this case, -heit completes a pseudoword/incongruent combination and -keit completes a real word/congruent combination, thus yielding an orthogonal design in which additive effects of any of the stems or affixes cannot act as confounds. If the congruent forms "sicherheit" and "sauberkeit" produce stronger responses than the discordant ones, <sup>∗</sup> "sicherkeit" and <sup>∗</sup> "sauberheit", the neurophysiological evidence speaks in favor of whole form retrieval. If the incongruent forms produce stronger responses, however, there is a brain-based argument for combinatorial processing and decomposition.
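The logic of the orthogonal design can be made concrete with a small numerical sketch. The code below is illustrative only: the function names and all amplitude values are hypothetical and are not data from this study. It shows that purely additive effects of stem and suffix (for example, an acoustic difference between "heit" and "keit") cancel in the congruent-minus-incongruent contrast, so that only a ROOT × SUFFIX interaction can produce a congruency effect.

```python
# Hypothetical illustration of the 2 x 2 design (ROOT x SUFFIX).
# All numbers are invented response magnitudes, not measured data.
CONGRUENT = {("sicher", "heit"), ("sauber", "keit")}


def is_congruent(root, suffix):
    """True for the two existing words, False for the two starred forms."""
    return (root, suffix) in CONGRUENT


def congruency_contrast(responses):
    """Sum of congruent responses minus sum of incongruent responses."""
    total = 0.0
    for (root, suffix), value in responses.items():
        total += value if is_congruent(root, suffix) else -value
    return total


# Purely additive effects of stem and suffix (assumed values).
root_effect = {"sicher": 1.0, "sauber": 0.4}
suffix_effect = {"heit": 0.3, "keit": 0.7}

additive = {
    (r, s): root_effect[r] + suffix_effect[s]
    for r in root_effect
    for s in suffix_effect
}

# Same additive effects plus an extra response for congruent forms only.
with_congruency = {
    cell: value + (0.5 if is_congruent(*cell) else 0.0)
    for cell, value in additive.items()
}

additive_contrast = congruency_contrast(additive)            # cancels out
congruency_effect = congruency_contrast(with_congruency)     # 2 * 0.5
```

In this sketch the additive case yields a contrast of zero (up to floating-point error), whereas adding an assumed congruency effect of 0.5 to both existing words yields a contrast of 1.0.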

# **MATERIALS AND METHODS**

### **DESIGN**

This experiment elicited MMNs using the classic oddball paradigm, where deviants occur rarely in a stream of more frequent standard stimuli. In total, there were 1260 standards (3/4 of all stimuli) and 420 deviants. The stem ("sicher" or "sauber") served as the standard in a given block, and the corresponding deviants were the stem appended with "-heit" or "-keit" (see Materials). The result is four deviants: sicherheit, <sup>∗</sup> sicherkeit, <sup>∗</sup> sauberheit, and sauberkeit (see **Figure 2**).

There were between three and five occurrences of standard stimuli between deviants, and an initial habituation period at the beginning of each block, where the standard was repeated 15 times consecutively. Brain responses to these 15 repetitions were not included in the ERP averages, nor were brain responses to the standard stimuli occurring immediately after a deviant stimulus.

"Sicher" and "Sauber" stimuli were segregated into separate blocks; each block contained 630 presentations of the standard stimulus, and two deviants presented 105 times each. There was a stimulus onset asynchrony (SOA) of 2 s. Block order was counterbalanced across participants. Stimuli were presented with E-Prime 2.0<sup>1</sup>.

# **PARTICIPANTS**

We collected data from 33 participants, recruited from the student population of the Freie Universität Berlin, who were right-handed, as confirmed using the Edinburgh Handedness Inventory (Oldfield, 1971), were native speakers of German, and had no linguistic or neurological disorders. The experiments were performed with the approval of the Ethics committee of the Charité Universitätsmedizin, Campus Benjamin Franklin, Berlin.

# **STIMULI**

The MMN is highly sensitive to acoustic variation, so stimuli must be temporally aligned and identical, except where demanded by the parameters of the experiment. Toward this end, we recorded a female native speaker of German pronouncing "sicherheit" and "sauberkeit" several times (with a pause between the root and suffix to minimize coarticulatory bias in the root to a particular suffix), as well as the same stems followed by the word "zeit." We selected the "sicher" and "sauber" recordings that were most similar to each other in terms of length and peak sound energy as measured by acoustic wave forms and spectrograms, and eliminated remaining differences along these dimensions by selectively cutting the length of silence at the beginning of the recordings, with the result that they terminate at the same time, and by normalizing their sound energy to −5 dB after splicing (see below). Care was taken that stimuli shared the same intonational contour, as judged by a panel of three native German speakers listening to the stimulus candidates. In the same fashion, the most similar recordings of "heit" and "keit" were selected. The final [t] phoneme was stripped out of both recordings and replaced by the [t] phoneme from a recording of "Zeit". These edited "heit" and "keit" recordings were then spliced onto the "sicher" and "sauber" recordings. In order to achieve a natural intonation, the "heit" and "keit" recordings were reduced in amplitude by 5 dB, transposed down half a step in pitch, and the initial phoneme ([h] or [k]) was faded in from 50% to 100% of the original volume. These steps smoothed the transition from the root into the suffix, resulting in stimuli that sounded like naturally pronounced, multi-morphemic words. Both standards were 625 ms long, and all deviants were 1125 ms long, with the "heit" or "keit" morpheme beginning at 625 ms. Sound recording and editing were performed with Audacity 2.0.3<sup>2</sup>. Acoustic spectra of the stimuli are shown in **Figure 2**.

<sup>1</sup>http://www.pstnet.com/eprime.cfm
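For readers less familiar with the units involved, the amplitude and pitch adjustments applied to the suffix recordings correspond to simple scale factors. The sketch below uses the standard conversions for amplitude decibels and equal-tempered semitones. It assumes that "half a step" means one equal-tempered semitone and that the fade-in was linear, neither of which is stated explicitly; the function names are illustrative, and this is not the editing procedure that was carried out in Audacity.

```python
def db_to_amplitude_ratio(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def semitones_to_frequency_ratio(semitones):
    """Convert a pitch shift in equal-tempered semitones to a frequency factor."""
    return 2 ** (semitones / 12)


def linear_fade_in(n_samples, start=0.5, end=1.0):
    """Gain envelope rising linearly from `start` to `end` (assumed shape)."""
    if n_samples == 1:
        return [end]
    step = (end - start) / (n_samples - 1)
    return [start + i * step for i in range(n_samples)]


gain = db_to_amplitude_ratio(-5)           # roughly 0.56 of the original amplitude
pitch = semitones_to_frequency_ratio(-1)   # roughly 0.94 of the original frequency
envelope = linear_fade_in(5)               # 0.5, 0.625, 0.75, 0.875, 1.0
```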

According to the dlexDB psycholinguistic database for the German language (Heister et al., 2011), "Sicherheit" and "Sauberkeit" have normalized frequencies (n/million) of 116.5 and 5.4, respectively, and lemma frequencies of 118.3 and 5.4. "sicher" and "sauber" have 117.6 and 15.9, respectively, and lemma frequencies of 173.1 and 29.5. So "sicher" and "Sicherheit" are considerably more frequent than "sauber" and "Sauberkeit".

### **PROCEDURE**

Participants were seated in a comfortable chair facing a monitor, through which they watched a silent distractor movie with no linguistic content. They were instructed that they should ignore the acoustic stimuli, and may simply relax and watch the film. Stimuli were presented binaurally through high-quality headphones. The experiment lasted approximately 1 hour.

### **EEG RECORDING**

Electroencephalogram data were recorded with 128 active electrodes (actiCAP system, BrainProducts, Gilching, Germany), with a ground electrode at AFz, and a reference electrode on the nose tip. Scalp electrodes were arranged in a modified 10–5 system, with occipital electrodes OI1h, OI2h, I1, and I2 removed. The electrooculogram (EOG) was recorded through three electrodes, two above and below the left eye, and one lateral to the right eye. The two vertical EOG electrodes were off-line re-referenced against each other to form the vertical EOG signal (vEOG), and this signal was then referenced against the third electrode to form the horizontal EOG signal (hEOG). Data were band-pass filtered (0.1–250 Hz) and sampled at 1000 Hz. Recordings were taken in an electrically and acoustically shielded chamber.

<sup>2</sup>http://audacity.sourceforge.net/

# **Table 1 | Mean and range of ERP trials remaining after pre-processing**.

# **EEG PRE-PROCESSING**

The following stages of pre-processing were carried out in EEGLAB 11.5.4.b<sup>3</sup>. Data were downsampled to 200 Hz, and bandpass filtered at 0.3–30 Hz. We then carried out a manual inspection of the data to remove bad channels and non-systematic bursts of noise. Electrooculogram channels were re-referenced offline as described above. Independent component analysis was used to derive 64 components from the data. Components which correlated with either vEOG or hEOG with *r* < −0.3 or *r* > 0.3 were removed from the data, thus substantially reducing eye-related artefacts. Removed channels were then spherically interpolated back into the data. Triggers used in the averaging process were set to the point where deviant stimuli first diverge acoustically from the standard stimuli, and moved forward 25 ms to compensate for the delay between trigger and auditory stimulus onset inherent in the stimulus delivery system. The continuous recording was then epoched into trials of 850 ms, starting 50 ms before the trigger and ending 800 ms after it. This 50 ms period before the trigger served as the baseline.
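The timing arithmetic of the epoching step can be sketched as follows. This is a minimal single-channel illustration in plain Python, not the EEGLAB procedure itself. It assumes that the 25 ms correction shifts the trigger later in time and is applied to the 200 Hz data, and that the baseline is removed by subtracting the mean of the 50 ms pre-trigger period; the function names are hypothetical.

```python
def ms_to_samples(ms, sfreq):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * sfreq / 1000))


def extract_epoch(signal, trigger_sample, sfreq=200,
                  delay_ms=25, pre_ms=50, post_ms=800):
    """Cut one baseline-corrected epoch from a single-channel signal.

    The trigger is shifted later by `delay_ms` (assumed direction) before
    the epoch from -pre_ms to +post_ms is taken.
    """
    onset = trigger_sample + ms_to_samples(delay_ms, sfreq)
    start = onset - ms_to_samples(pre_ms, sfreq)
    stop = onset + ms_to_samples(post_ms, sfreq)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    n_baseline = ms_to_samples(pre_ms, sfreq)
    baseline = sum(epoch[:n_baseline]) / n_baseline
    return [value - baseline for value in epoch]


# Toy signal: a ramp, so that the effect of baseline correction is visible.
toy_signal = [float(i) for i in range(400)]
toy_epoch = extract_epoch(toy_signal, trigger_sample=100)
n_epoch_samples = len(toy_epoch)   # 850 ms at 200 Hz
```

At 200 Hz one sample spans 5 ms, so the 25 ms shift corresponds to 5 samples, the baseline to 10 samples, and the full 850 ms epoch to 170 samples.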

From this point, data were pre-processed in SPM8<sup>4</sup>. Epochs with a peak-to-peak (maximum minus minimum) voltage difference >120 µV or a >25 µV jump across two consecutive data points were removed, and the remaining trials were averaged into ERPs for each condition and subject. The mean and range of the number of remaining trials after cleaning and rejection are displayed below in **Table 1**.
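The two rejection criteria can be expressed compactly. The following sketch is a plain-Python illustration under the assumption that both criteria are evaluated per channel and that a violation on any channel rejects the whole epoch; the text does not specify this, and the code does not reproduce SPM8's own routines. The function name and the toy epochs are hypothetical.

```python
def reject_epoch(epoch_uv, max_range_uv=120.0, max_step_uv=25.0):
    """Return True if an epoch violates either artefact criterion.

    epoch_uv: list of channels, each a list of samples in microvolts.
    Assumption: a violation on any single channel rejects the epoch.
    """
    for channel in epoch_uv:
        # Criterion 1: peak-to-peak amplitude within the epoch.
        if max(channel) - min(channel) > max_range_uv:
            return True
        # Criterion 2: jump between two consecutive data points.
        for previous, current in zip(channel, channel[1:]):
            if abs(current - previous) > max_step_uv:
                return True
    return False


clean = [[0.0, 5.0, -5.0, 3.0]]
slow_drift = [[float(v) for v in range(-70, 61, 10)]]   # range of 130 µV
sudden_jump = [[0.0, 30.0, 0.0]]                        # step of 30 µV

rejected = [reject_epoch(e) for e in (clean, slow_drift, sudden_jump)]
```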

Participants who produced ERP signals with low signal-to-noise ratios were excluded from the pool. These were identified by reversing the polarity of half the epochs for all deviant stimuli, and averaging them. On the standard ERP assumption that the signal remains constant across trials, the signal in the half of the trials with reversed polarity would cancel the signal in the other half. Therefore the average of flipped and non-flipped trials would be the noise component of the ERP (Schimmel, 1967; Campos, unpublished). The average root mean square (RMS) of this noise was calculated for the *a priori* defined time window of interest (100–200 ms after acoustic deviance), and the RMS of the ERP signal for the same time period was divided by it, producing a signal-to-noise ratio (SNR). Five participants either had an SNR less than one, or a signal less than 1 µV, and a further two had excessive muscle artefacts. These participants' data were therefore excluded, leaving 26 (four male) participants.
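The plus-minus averaging logic described above can be sketched in a few lines. The code is a minimal single-channel illustration with invented data. It assumes that alternate trials are the ones whose polarity is reversed (the text only says "half the epochs") and that the window is given in sample indices; the function names are hypothetical.

```python
import math


def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mean_waveform(epochs):
    """Point-by-point average across trials."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


def plus_minus_snr(epochs, window):
    """Signal-to-noise ratio from ordinary and polarity-alternated averages.

    epochs: list of equal-length trials for one channel (or region average).
    window: (start, stop) sample indices of the time window of interest.
    Assumption: every second trial is flipped; use an even trial count.
    """
    signal = mean_waveform(epochs)
    flipped = [
        [-v for v in trial] if index % 2 else list(trial)
        for index, trial in enumerate(epochs)
    ]
    noise = mean_waveform(flipped)   # constant signal cancels, noise remains
    start, stop = window
    noise_rms = rms(noise[start:stop])
    if noise_rms == 0:
        return math.inf
    return rms(signal[start:stop]) / noise_rms


# Invented example: a fixed waveform plus a trial-specific offset as "noise".
waveform = [0.0, 1.0, 2.0, 1.0, 0.0]
offsets = [0.1, -0.1, 0.3, 0.1]
trials = [[v + offset for v in waveform] for offset in offsets]

snr = plus_minus_snr(trials, window=(1, 4))
```

In this toy case the ordinary average is the waveform plus 0.1, the alternated average is a flat 0.1, and the resulting ratio is about 15.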

# **SENSOR SPACE STATISTICS**

Sensor space data were analyzed in two ways. The first was the standard approach, where condition values for each participant were computed for each of the four deviant conditions (sicherheit, <sup>∗</sup> sicherkeit, <sup>∗</sup> sauberheit, sauberkeit) by taking the mean amplitude across the time windows and electrode configurations where deviant response amplitude was strongest. These mean values were entered into a repeated measures analysis of variance (ANOVA), with ROOT (sicher and sauber) and SUFFIX (-heit and -keit) as two-level factors.

<sup>3</sup>http://sccn.ucsd.edu/eeglab/

<sup>4</sup>http://www.fil.ion.ucl.ac.uk/spm/software/spm8/

The second method is cluster-based permutation on ERP data in a 3d-volume format, where the spatial configuration of the electrodes as a flat surface comprises two dimensions, and peri-stimulus time comprises the third dimension (Maris and Oostenveld, 2007). The relevant statistical tests were then performed on each voxel. Voxels where *p*-values were below a given threshold were grouped into clusters, and the "weight" of the clusters was determined by adding the *F*-ratios of all voxels in a given cluster together. In order to determine what cluster weights are likely to reflect real differences, a permutation-based Monte Carlo simulation is run, where for each iteration of the simulation, conditions are randomly distributed through the model, in effect simulating a null hypothesis. For each iteration the clusters are weighed, and the heaviest cluster is selected. The distribution of these null-hypothesis cluster weights across many iterations (in our case, 1000 iterations) provides a measure of the likelihood that the clusters found in the original statistical test are false positives.
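The cluster-weighting and Monte Carlo steps can be illustrated with a deliberately simplified sketch. It differs from the analysis described above in several labeled ways: it works on one dimension (time) rather than a 3d volume, it thresholds on the *F*-value directly, it represents each participant by a single interaction contrast per time point, and it simulates the null hypothesis by randomly flipping the sign of each participant's contrast. All names and the synthetic data are hypothetical.

```python
import math
import random


def f_values(contrasts):
    """One-sample F (t squared) at each time point.

    contrasts[s][t]: interaction contrast of participant s at time point t.
    """
    n = len(contrasts)
    result = []
    for column in zip(*contrasts):
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / (n - 1)
        if var == 0:
            result.append(math.inf if mean else 0.0)
        else:
            result.append(n * mean ** 2 / var)
    return result


def max_cluster_mass(fvals, threshold):
    """Largest summed F over runs of adjacent supra-threshold points."""
    best = current = 0.0
    for f in fvals:
        current = current + f if f > threshold else 0.0
        best = max(best, current)
    return best


def cluster_permutation_p(contrasts, threshold, n_perm=1000, seed=0):
    """Observed maximum cluster mass and its Monte Carlo p-value."""
    rng = random.Random(seed)
    observed = max_cluster_mass(f_values(contrasts), threshold)
    at_least_as_heavy = 0
    for _ in range(n_perm):
        # Null hypothesis: the sign of each participant's contrast is arbitrary.
        permuted = []
        for row in contrasts:
            sign = rng.choice((-1.0, 1.0))
            permuted.append([sign * v for v in row])
        if max_cluster_mass(f_values(permuted), threshold) >= observed:
            at_least_as_heavy += 1
    return observed, at_least_as_heavy / n_perm


# Synthetic data: 12 participants, 20 time points, effect at points 8 to 12.
data_rng = random.Random(1)
synthetic = [
    [data_rng.gauss(0, 1) + (3.0 if 8 <= t < 13 else 0.0) for t in range(20)]
    for _ in range(12)
]
# 4.84 approximates the critical F(1, 11) at p < 0.05 for 12 participants.
observed_mass, p_value = cluster_permutation_p(synthetic, threshold=4.84)
```

The *p*-value here is the proportion of null-hypothesis iterations whose heaviest cluster is at least as heavy as the observed one.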

# **SOURCE LOCALIZATION**

Electrodes were co-registered in the standard 10–5 spatial configuration onto the scalp of the EEG boundary element forward model, based on the canonical MRI template included in SPM8.

For source localization, conditions were averaged according to congruency ("Sicherheit" and "Sauberkeit" vs. <sup>∗</sup> "Sicherkeit" and ∗ "Sauberheit"). Distributed source localization was carried out with the multiple sparse priors (MSP) approach (Friston et al., 2008) in SPM8. Group inversion was performed, thereby constraining spatial source solutions uniformly across participants (Litvak and Friston, 2008). Voxel images were produced summarizing the source activity at time points of interest (**Figure 5B**), and smoothed with a kernel size of 12 mm. These images were then submitted to their respective multi-voxel paired sample *t*-test.

# **RESULTS**

### **SENSORS**

### **Standard analysis**

Mismatch negativities (deviant minus its corresponding standard) and deviant topographies for the four conditions are displayed in **Figure 3** relative to a trigger point at the onset of the derivational suffixes "heit" or "keit", where the acoustic waveforms of the standard and deviant stimuli first differed. Topographies show that the negative deflections occurred in fronto-central electrodes, as is typical for acoustic MMN paradigms, so waveforms used for display and statistics were calculated from the average of 46 electrodes in this area (pictured in **Figure 3A**). Note that MMNs are displayed in **Figures 3A,B,D**, but all analysis and source localization was carried out directly on the unsubtracted deviants.

of congruent (average of Sicherheit and Sauberkeit) and incongruent (average of \*Sicherkeit and \*Sauberheit) deviants, in the time windows selected for statistical comparison. **(D)** MMNs for congruent and incongruent conditions, green vertical lines indicate time windows

The main MMN deflection emerged between 135–175 ms after acoustic divergence. In addition to this 40 ms-wide window, we investigated several other peaks for sensitivity to linguistic processes: a very early negative deflection (40–80 ms), a large positive deflection directly following the MMN (230–270 ms), and a late, extended negative deflection (340–500 ms).

At the main MMN peak (135–175 ms; **Figure 4**), congruent derived words produce stronger responses in both "sicher" and "sauber" conditions, and "sicher" conditions produce stronger responses than "sauber" conditions. Statistical results confirmed this impression. A 2 × 2 ANOVA with ROOT (sicher, sauber) and SUFFIX (-heit, -keit) as factors revealed a significant crossover interaction of ROOT and SUFFIX (*F*(1,25) = 9.5, *p* = 0.005). Planned comparisons showed that "sicherheit" produced a reliably stronger response than <sup>∗</sup> "sicherkeit" (*F*(1,25) = 4.4, *p* = 0.045), and "sauberkeit" produced a reliably stronger response than <sup>∗</sup> "sauberheit" (*F*(1,25) = 5.3, *p* = 0.03). In addition, the main effect of ROOT was significant (*F*(1,25) = 5, *p* = 0.035), indicating that, on average, "sicher" conditions produced stronger responses than "sauber" conditions.
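For a fully within-participant 2 × 2 design, the interaction test has a simple equivalent form: its *F*(1, n − 1) equals the squared one-sample *t* statistic computed on each participant's difference of differences. The sketch below illustrates this with invented amplitudes for four hypothetical participants; it is not the analysis code used here, and the numbers bear no relation to the reported results.

```python
def interaction_f(sicherheit, sicherkeit, sauberheit, sauberkeit):
    """ROOT x SUFFIX interaction F for a 2 x 2 repeated measures design.

    Each argument lists one mean amplitude per participant, in the same
    participant order. Returns (F, (df1, df2)).
    """
    contrasts = [
        (a - b) - (c - d)
        for a, b, c, d in zip(sicherheit, sicherkeit, sauberheit, sauberkeit)
    ]
    n = len(contrasts)
    mean = sum(contrasts) / n
    var = sum((v - mean) ** 2 for v in contrasts) / (n - 1)
    # F equals t squared, where t = mean / sqrt(var / n).
    return n * mean ** 2 / var, (1, n - 1)


# Invented amplitudes in microvolts (more negative = stronger response).
f_ratio, dof = interaction_f(
    sicherheit=[-5.0, -6.0, -5.0, -5.0],
    sicherkeit=[-4.0, -4.0, -4.0, -4.0],
    sauberheit=[-3.0, -3.0, -3.0, -3.0],
    sauberkeit=[-4.0, -5.0, -5.0, -5.0],
)
```

With these invented values the per-participant contrasts are −2, −4, −3 and −3, giving *F*(1, 3) = 54.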

Already in the earlier time window (40–80 ms) there was a negative deflection, with the congruent deviants producing a seemingly stronger signal than the incongruent deviants. This pattern was marginally significant, with a cross-over interaction of *F*(1,25) = 3.2, *p* = 0.086. There were no significant main effects.

for signal space analysis and topography display in **(C)**. **(E)** Results of voxel-wise factorial ANOVAs of ERP data, converted into a 3d-volume. X and Y axes represent 2d electrode positions, and Z axis represents time. Gray voxels are where interaction of ROOT and SUFFIX factors reached p < 0.05, the orange cluster survived multiple-comparisons correction, and the purple voxels are where planned comparisons showed stronger responses for congruent conditions in both "sicher" and "sauber" conditions.

**FIGURE 4 | Average deviant ERP voltages for the 135–175 ms time window measured at fronto-central channels obtained in the four conditions**. Note that for each pair and experimental block, the congruous condition elicits a significantly larger ERP than the infelicitous one. The interaction is significant. Error bars are standard error of the mean.

In the later positive deflection (230–270 ms) there was a much stronger response to sicher and -keit conditions compared with their respective sister forms, confirmed by a main effect of ROOT (*F*(1,25) = 13.4, *p* < 0.001) and SUFFIX (*F*(1,25) = 6, *p* = 0.021). There was no interaction of these factors.

In the latest time window (340–500 ms), the negative deflection yielded only a main effect of ROOT (*F*(1,25) = 21.92, *p* < 0.001).

### **Cluster-based permutation**

**Figure 3E** shows the results of the same repeated measures ANOVA described in the previous section, applied voxel-wise to ERP data in 3d-volume format. Gray-scale voxels show uncorrected *F*-values on the interaction of ROOT and SUFFIX, thresholded at *F* > 4.24 (*p* < 0.05, df = 1, 25). When these voxels were grouped into clusters, one cluster (shown in orange on **Figure 3E**), corresponding to the 135–175 ms time window, was heavier than 965 of the 1000 maximum cluster weights in the Monte Carlo simulation of the null hypothesis (*p* = 0.035). No other cluster passed the *p* < 0.05 threshold. The area shown in purple on **Figure 3E**, corresponding to fronto-central electrodes at around 165–175 ms, is where planned comparisons showed both that the Sicherheit response was significantly more negative than the <sup>∗</sup> Sicherkeit response, and that the Sauberkeit response was significantly more negative than <sup>∗</sup> Sauberheit (*p* < 0.05 in both cases).

### **SOURCES**

For all conditions generally, sources were concentrated in classical language areas: perisylvian, temporo-parietal, and inferior frontal gyrus, in both hemispheres, as thresholded at *p* < 0.05, family-wise corrected with random field theory (Brett et al., 2004; **Figure 5A**). For statistical comparisons between congruent and incongruent conditions, we focused on those time windows when ROOT and SUFFIX interacted, namely 40–80 ms and 135–175 ms, and produced voxel images summarizing source activity at each time window's peak global field power, 45 ms and 170 ms, respectively (**Figure 5B**). Unidirectional, voxel-wise *t*-tests on these images found that at 170 ms, congruent deviants produced stronger responses than incongruent deviants at clusters in bilateral superior parietal regions, bilateral central/postcentral/supramarginal regions, and left superior post-central regions (*p* < 0.05, uncorrected), as well as a difference in the left middle/inferior temporal gyrus (*p* < 0.01, uncorrected). At 45 ms, two clusters in bilateral superior parietal regions were significant (*p* < 0.05, uncorrected), contained entirely within the parietal clusters active at 170 ms. Cluster peak coordinates are summarized in **Figure 5C** and **Table 2**. Unidirectional *t*-tests in the other direction (incongruent stronger than congruent) produced no significant voxels at either time point. We stress here that statistical comparisons between congruent and incongruent conditions did not survive whole-brain family-wise error correction, and so should be interpreted with appropriate caution.

### **DISCUSSION**

Derived words including stem and affix consistently produced stronger ERP responses than incongruent sequences of the same stems and derivational affixes, a pattern consistent with an lMMN, and therefore whole-form storage. Uncorrected source localization indicated that the generators underlying the enhancement of the MMN response to congruent relative to incongruent forms, the lMMN, were located primarily in bilateral posterior-parietal areas (angular gyrus), the left inferior temporal gyrus, and pericentral sensorimotor areas extending into anterior supramarginal gyrus. There was also a weaker, marginally significant "pre-lMMN" effect at 40–80 ms, localized in bilateral parietal areas.

congruent and incongruent conditions. The "X" markers indicate the time points of interest when source activity was analyzed. **(C)** Voxels where and when congruent conditions elicited a stronger response than incongruent conditions. All statistics uncorrected for multiple comparisons.



### **DERIVED FORMS ARE STORED AS WHOLE-FORMS, NOT COMBINED**

Our present results show the brain activation correlates of whole-form storage for derived German words. Therefore, the data can be used to argue that the brain mechanisms sparked by these forms are those of stored whole form retrieval. In contrast, standard grammar theories and psycholinguistic models viewing derivation as a combinatorial process are not supported by these data (for discussion of psycholinguistic implications, see below). The present results cohere with prior studies that used the MMN to study derivational processing. Leminen et al. (2013b) also found larger MMNs to congruent derived forms of Finnish than to incongruent combinations, thus revealing the same neurophysiological signature of stored-form-retrieval as our present data on German nouns do. Leminen et al.'s derivational whole-form-storage MMNs were generated in left temporal areas, as ours here, and these authors also reported that their high-frequency derived words produced a larger MMN in comparison to low-frequency derived words. Whiting et al. (2013) localized MMNs for derived English words to the left middle temporal lobe, again where the lMMN enhancement was most reliably localized in our present study.

The cortical sources of MMN responses to stored linguistic forms and especially the activation enhancement for stored over unstored forms ("lexical MMN" or lMMN) have previously been localized in a range of different areas, most commonly in left or bilateral superior-temporal regions (Pulvermüller et al., 2001, 2004; Shtyrov et al., 2005). Inferior-frontal sources were seen especially for words and constructions semantically related to actions (e.g., Shtyrov et al., 2004; Pulvermüller et al., 2005; Pulvermüller and Shtyrov, 2009). Posterior-inferior parietal sources have been reported too, with special emphasis that these can vary between words (Pulvermüller et al., 2004); parietal sources were previously seen to be pronounced to prepositions and verb particles (Cappelle et al., 2010). This pre-existing research shows that localization of the lexical enhancement can vary substantially in its brain topography, and it appears plausible that this variability depends, in part, on lexical and psycholinguistic features of the particular word stimuli probed (Pulvermüller et al., 2009). Our present results show overall activation to linguistic stimuli across all the regions previously found active in this type of experiment (**Figures 5A,C**, *p* < 0.05, FWE-corrected), including superior-temporal, inferior-frontal and inferior-parietal areas within the perisylvian language cortex and also dorsolateral central cortex, posterior parietal cortex and inferior temporal lobe outside, in "extrasylvian" space. However, amongst these areas generally active to both congruent and incongruent forms, only a subset seemed more active to congruent than to incongruent forms ending in a derivation suffix. These were the extrasylvian parietal and temporal areas around the angular gyrus and the temporal pole, both known as areas that have recently been proposed as "semantic hubs" that process meaning-related information (Patterson et al., 2007; Pulvermüller, 2013). 
In addition, lMMN sources in perisylvian frontocentral sensorimotor cortex and anterior supramarginal gyrus may suggest action-related meaning processes. Still, we have to warn against giving these results any strong interpretation, as the levels of significance at which between-condition differences in source space could be documented were low (*p* < 0.01 or 0.05), and still more importantly, did not pass family-wise error correction, in spite of the clear and significant differences in signal space. Regardless of the precise interpretation of the source dynamics, the results seem to speak against the involvement of combinatorial processes. The sMMN to ungrammatical strings, which we take as evidence for a combinatorial process, has its typical sources in left superior temporal areas with MEG (Shtyrov et al., 2003; Pulvermüller and Assadollahi, 2007; Herrmann et al., 2009; Bakker et al., 2013) and left inferior frontal areas with EEG (Pulvermüller and Shtyrov, 2003; Hanna et al., 2014), but not where the current source analysis suggested generator differences between conditions. As inferior-temporal and posterior-parietal sources are typical for semantic brain activity frequently seen to single words, whereas combinatorial linguistic processes usually have a perisylvian signature, the present source pattern supports the lexical whole-form storage interpretation.

Given the tentative nature of our present localizations of lMMN sources obtained for derived words, and of any neurophysiological source localization generally (Hämäläinen et al., 1994), it is important to note that the suggested activation loci agree with those of two recent fMRI studies which focused specifically on derivational morphology processing to auditory stimuli. These studies consistently found activation in bilateral middle temporal lobes (Bozic et al., 2013a,b) when brain responses to derived forms were compared with inflected forms. Our present results, demonstrating left anterior inferior and bilateral middle-temporal activation enhancements to congruent derived forms compared with incongruent forms, show a reasonable agreement with these authors' main findings.

Even though the words used in the present study were not matched for all psycholinguistic factors that could potentially affect the brain response, one of them is clearly more common and more frequently used than the other in standard German (dlexDB normalized word frequencies 116.5 vs. 5.4). It is therefore noteworthy that, consistent with previous results (Alexandrov et al., 2011; Shtyrov et al., 2011), a stronger MMN emerged for the more frequent item ("Sicherheit"). The frequency sensitivity of the MMN suggested by the present data provides a further indication that we measured a whole-form retrieval process and not a combinatorial one. Word frequency is one of the oldest and most robust test variables for lexical status, widely measured in behavioral tasks (e.g., Balota et al., 2004), metabolic neuroimaging (e.g., Hauk et al., 2008), ERPs (e.g., Hauk et al., 2006; Kutas and Federmeier, 2011; Shtyrov et al., 2011), and specifically is also indexed by MMN to monomorphemic words (Alexandrov et al., 2011; Leminen et al., 2013b). Brain responses indexing combinatorial processes invoked by inflectional and syntactic mechanisms by contrast do not seem to be affected by the frequency of their lexical roots (Pulvermüller and Assadollahi, 2007; Leminen et al., 2013b). The frequency-independence of combinatorial processes which can be described using algorithmic rules is a well-known phenomenon supported by substantial psycholinguistic evidence (Pinker, 1997). The neuromechanistic basis for the frequency-sensitivity of whole-form access can be theoretically grounded in the postulate that whole forms are stored as distributed neuronal assemblies that become more frequently connected internally the more frequently they are activated together, thus yielding more strongly connected assemblies for high-frequency words and constructions than for low-frequency ones (Pulvermüller, 1999). 
Activation dynamics reflect connection strength, producing stronger activation with stronger links. In contrast, combinatorial processes rely on mechanisms binding information across large groups of lexical items so that the combinatorial links apply equally to high- and low-frequency items and are therefore independent of the frequency of a particular sequence of words (Pulvermüller, 2010).

## **IMPLICATIONS FOR PSYCHOLINGUISTIC THEORIES**

These results seem to argue against psycholinguistic models of obligatory decomposition (Clahsen et al., 2003; Marslen-Wilson, 2007). Even if such models allow for secondary whole-form access under special circumstances, i.e., a "rules and words" framework (Pinker, 1997), it would need to be explained why two morphologically different German words show the neurobiological signature of whole-form access and retrieval at earliest latencies (135–175 ms), with marginally significant foreshadowing of such difference already at *ca.* 40 ms, and why similar previous studies by Leminen et al. (2013b) revealed comparable results for derived forms of Finnish. The special significance of this present study in German comes from the complexity of the morphological rules and construction schemes underlying the forms "Sicherheit" and "Sauberkeit". According to standard German morphology and grammar (Fleischer and Barz, 2012), there is a semi-regularity according to which a bisyllabic adjective ending in the syllable "er" (common to both stems here) tends toward the nominalizing derivational suffix -keit, not -heit. The assumption of such a regular pattern is supported by the fact that many more nominalizations of bisyllabic er-adjectives take -keit than -heit. "Sauberkeit" could therefore be an instance of a rule-combined form. For nouns including an "er" adjective with two syllables and -heit, the argument can be made that they represent exceptions from the "keit-rule" and can thus be regarded as whole-form-stored mini-constructions. Such exceptional whole-form storage should apply to "Sicherheit." The prediction of this theory is that the brain dynamics elicited by congruent -keit forms are those of combination and composition, whereas those to -heit forms should index whole-form storage. In showing that the whole-form storage pattern is elicited by both types, our results speak against this "mixed" account.

However, it must be pointed out here that while "Sicher" and derivatives are much more frequent than "Sauber" and derivatives, both are quite frequent in German. It may be that when very infrequent words are tested against frequent words, a combinatorial mechanism is used in the former, and a whole-form mechanism in the latter. This remains a promising avenue for future research.

At the level of linguistic theory, the present results seem to sit comfortably with current approaches to construction grammar according to which a large repertoire of constructions can be learned and stored from experience (Goldberg, 2003). In this approach, derived forms would be considered mini-constructions stored on an item-by-item basis, based on general neurobiological laws such as Hebbian learning (Pulvermüller, 1999). It is clear that, if linguistic forms are frequently recombined with each other, this combinatorial information is also mapped at the biological circuit level so that combination schemas are created. The neurobiological mechanisms for such formation of combinatorial schemas have been explored with neurocomputational network simulations, and the linguistic theory for such schemas is particularly well developed in the domain of argument structure constructions (Goldberg, 2006). This research encourages future empirical questions, especially ones about the cause behind the shift between storage of single whole forms and the development of a combinatorial schema and structural construction.

The most probable reason for the discrepancy between the dominating opinion in psycholinguistics and our present findings is that most studies that investigated this issue in the past used written stimuli, whereas we used auditory stimuli. While spoken and written language clearly must at some point share common linguistic substrates, they also must use distinct systems, and this is more likely to be so in the earlier stages of processing. Processing of written language also relies partly on visual object identification systems, further shaped by the non-innate capability to read and write (Rastle and Davis, 2008). We therefore recommend that this imbalance be corrected through further research on the early-stage neurophysiology of morphological processing in the auditory modality.

# **MMN AS A TOOL FOR PSYCHOLINGUISTIC INVESTIGATION**

Neuroscience research on the psycholinguistic question about whole-form retrieval or combinatorial processing of complex symbols and constructions requires a brain response that shows different dynamics to the fundamentally different types of predictions these mechanisms entail. Whole construction retrieval of a complex form AB implies that a single representation or neuronal circuit is activated partially by utterance part A and that the second utterance part B fully "ignites" the unitary AB circuit. The ignition of the larger circuit AB produces more activation than the activation of the composite circuit B on its own. In sharp contrast to this dynamic, a combinatorial mechanism connecting forms A and B implies separate autonomous mechanisms for the processing of both constituents and a functional combinatorial link between them. In this case, utterance part A activates its own circuit, which, in turn, leads to partial activation (priming) of circuit B by way of the combinatorial mechanism. When B appears in this combinatorial congruent context, its circuit is already pre-active and therefore its full ignition leads to less activation relative to the pre-B baseline than when B appears in an incongruent context, where no combinatorial priming is present. To the best of our knowledge, the only brain response that reflects this difference between storage-related and combinatorial mechanisms of prediction and processing in different and opposite dynamics is the MMN. Most other brain responses that have been successfully used to investigate language and cognitive processing show a "surprise signature" according to which the less expected event leads to increased amplitudes relative to the expected or predicted one. 
This expectancy violation or prediction error signature is well-documented for event-related responses including the N1 and P300 (sensory expectation and attention), N400 (lexical or semantic expectation), and ELAN, LAN and P600 (syntactic expectation) (Donchin, 1981; Neville et al., 1991; Osterhout et al., 1997; Kutas et al., 2006; Näätänen et al., 2007; Kutas and Federmeier, 2011). The opposite dynamics of the MMN to whole-form retrieval and combinatorial processing also make it possible to obtain information from neuroimaging experiments about the cortical loci of activation, which may also provide clues about the storage-related or combinatorial nature of the neurocognitive processes. Looking back at the surprising set of results recently revealed by linguistic MMN research, including the evidence for combinatorial processing of inflected forms, whole-form retrieval of derived ones and whole-form storage of particle verbs (Cappelle et al., 2010; Bakker et al., 2013; Leminen et al., 2013b), this response offers itself as a fruitful tool for future investigation of the neurobiological basis of words, constructions and meaningful communication generally.
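The opposite predictions of the two mechanisms described above can be made concrete with a toy calculation. The following is a minimal sketch for illustration only: the function names, the unit ignition value, the whole-form bonus and the priming amount are all arbitrary assumptions, not quantities estimated from the data.

```python
def whole_form_response(congruent, part_ignition=1.0, whole_bonus=0.5):
    """Toy response to part B if AB is stored as one circuit.

    A congruent B completes the stored AB circuit, which adds activation
    on top of the ignition of B's own circuit (assumed values).
    """
    return part_ignition + (whole_bonus if congruent else 0.0)


def combinatorial_response(congruent, part_ignition=1.0, priming=0.4):
    """Toy response to part B if A and B are linked combinatorially.

    A congruent context pre-activates B, so the increase relative to the
    pre-B baseline is smaller than in an incongruent context.
    """
    return part_ignition - (priming if congruent else 0.0)


def congruency_effect(model):
    """Congruent minus incongruent response; positive means larger to congruent."""
    return model(True) - model(False)


print(congruency_effect(whole_form_response) > 0)      # storage: congruent larger
print(congruency_effect(combinatorial_response) < 0)   # combination: congruent smaller
```

Only the sign of the congruency effect matters here: under these assumptions it is positive for whole-form storage and negative for combination, which is the contrast the MMN is argued to reflect.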

# **CONCLUSION**

We investigated early, automatic brain responses to derived words in German using the lMMN. The results indicate such words are processed as whole forms, evidenced as follows:


In sum, these findings provide new evidence for a robust whole-form access route in the auditory perception of derived words—even highly transparent, productive ones—in the form of enhanced MMNs for existing, derived words, presumably reflecting extra activation from their lexical memory circuits. We hope these results shed new light on a crucial linguistic and psychological issue, namely the interplay between stored units or forms, and the combinatorial mechanisms which productively combine them.

# **ACKNOWLEDGMENTS**

We wish to thank Verena Büscher, Philip Schimpf, Sarah von Saldern, Anne Autenrieb, Rosario Tomasello, Mauro Cantino, Natalie Miller, and Laura Besch for their assistance in stimuli production and data collection. We also wish to thank the three reviewers for their insights and helpful suggestions. This project was supported by the Freie Universität Berlin, the Deutsche Forschungsgemeinschaft (Excellence Cluster Languages of Emotion, Project Pu 97/16-1 on "Construction and Combination") and the Engineering and Physical Sciences and Behavioral and Brain Sciences Research Councils (UK) (BABEL grant, EP/J004561/1).

# **REFERENCES**


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 July 2014; accepted: 15 October 2014; published online: 06 November 2014*.

*Citation: Hanna J and Pulvermüller F (2014) Neurophysiological evidence for whole form retrieval of complex derived words: a mismatch negativity study. Front. Hum. Neurosci. 8:886. doi: 10.3389/fnhum.2014.00886*

*This article was submitted to the journal Frontiers in Human Neuroscience*.

*Copyright © 2014 Hanna and Pulvermüller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

# Morphological priming during language switching: an ERP study

# *Saskia E. Lensink<sup>1,2</sup>\*, Rinus G. Verdonschot<sup>2,3</sup> and Niels O. Schiller<sup>1,2</sup>*

<sup>1</sup> Faculty of Humanities, Leiden University Centre for Linguistics, Leiden University, Leiden, Netherlands

<sup>2</sup> Leiden Institute for Brain and Cognition, Leiden University, Leiden, Netherlands

<sup>3</sup> Graduate School of Languages and Cultures, Nagoya University, Nagoya, Japan

### *Edited by:*

Minna Lehtonen, University of Helsinki, Finland

### *Reviewed by:*

Juhani Järvikivi, University of Alberta, Canada; Eleonora Rossi, Penn State University, USA

### *\*Correspondence:*

Saskia E. Lensink, Faculty of Humanities, Leiden University Centre for Linguistics, Leiden University, Van Eyckhof 3, 2311 BV Leiden, Netherlands

e-mail: s.e.lensink@hum.leidenuniv.nl

Bilingual language control (BLC) is a much-debated issue in recent literature. Some models assume BLC is achieved by various types of inhibition of the non-target language, whereas other models do not assume any inhibitory mechanisms. In an event-related potential (ERP) study involving a long-lag morphological priming paradigm, participants were required to name pictures and read aloud words in both their L1 (Dutch) and L2 (English). Switch blocks contained intervening L1 items between L2 primes and targets, whereas non-switch blocks contained only L2 stimuli. In non-switch blocks, target picture names that were morphologically related to the primes were named faster than unrelated control items. In switch blocks, faster response latencies were recorded for morphologically related targets as well, demonstrating the existence of morphological priming in the L2. However, the ERP data showed a reduced N400 trend only in non-switch blocks, possibly suggesting that participants made use of a post-lexical checking mechanism during the switch block.

**Keywords: morphological priming, compounds, bilingual language processing, language switch, ERP**

# **INTRODUCTION**

It is not clear how morphologically complex words are represented and processed in non-native speakers. This study aims to shed more light on this issue by means of an overt speech production experiment where both behavioral and event-related potential (ERP) data were collected. Participants were presented with a morphological priming task where Dutch speakers with English as their L2 were required to read aloud words and name pictures in both English and Dutch. The results of this study not only provide more insights into the issue of how morphologically complex words of the L2 are represented in the brain, but also inform theories concerning bilingual language control.

Bilingual language control has been a much-debated issue in the literature over the past few years (Green, 1998, 2011; Christoffels et al., 2007; Abutalebi and Green, 2008; Colzato et al., 2008; Verdonschot et al., 2012; Bobb and Wodniecka, 2013). People who are fluent in more than one language are quite capable of keeping their languages apart. This process seems to be effortless and usually is without intrusions from one language into the other (Poulisse, 1999). This is particularly striking considering the evidence suggesting that both languages of bilinguals are active, even when only one is being used (Green, 1986; Kroll et al., 2006; Van Heuven et al., 2008).

It is generally assumed that in language production, lexical items compete for selection (Levelt et al., 1999; Bloem and La Heij, 2003; but see Mahon et al., 2007), and that the item with the highest level of activation wins. In bilingual language production this would potentially pose problems, as the same concept would activate two lexical representations in both language A and language B. There are several models accounting for how bilinguals manage selecting the target language. One type of model assumes an active or reactive inhibition of the non-target language – the Inhibitory Control or IC models (Green, 1998, 2011). Conversely, activation levels of the target language could be raised, so that no inhibitory processes need to be posited to achieve selection of the correct language – the non-Inhibitory Control or non-IC models (Costa et al., 1999; Costa and Santesteban, 2004). A third possibility concerns a hybrid model, where both inhibition of the non-target language and raising activation levels of the target language occur, with context influencing which process is employed, and at what level – global or local. Global inhibition suggests that the whole lexicon of a language is inhibited, whereas local inhibition refers to the inhibition of a small group of semantically and/or phonologically related items, or even the inhibition of a single item in the non-target language (Colzato et al., 2008; see Green, 2011, for an overview of different types of inhibition).

One way to investigate which of the two possibilities, IC or non-IC, is most accurate in predicting naming latencies, is to make use of long-lag morphological priming experiments. In long-lag priming experiments, the prime and target are separated from each other by several intervening items. IC models assuming global inhibition make a specific prediction for these types of experiments. If a language switch causes inhibition of the non-target language, then it is expected that any heightened activation of the prime and its related items would be inhibited after a language switch has occurred. That in turn would result in a reduced facilitation if not inhibition of the target item. For instance, when a prime in language A is followed by a language switch to language B, then the heightened activation of this prime and at least its related items will be decreased by the inhibition exerted by language B. Therefore, priming of a target in language A would possibly not be measurable anymore after this language switch.
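The contrast between the two predictions can be sketched with a toy activation model. This is an illustration under stated assumptions, not a model proposed in the literature cited above: the function name, the per-trial decay factor and the inhibition amount are hypothetical, and net inhibition of the target below baseline is deliberately not modelled.

```python
def residual_priming(initial_boost, lag, decay=0.95, inhibition_per_trial=0.0):
    """Activation boost of the primed item that remains after `lag` trials.

    Each intervening trial multiplies the boost by `decay` (assumed).
    Intervening trials in the other language additionally subtract
    `inhibition_per_trial`, which is zero for non-IC models.
    The boost is floored at zero, so net inhibition is not represented.
    """
    boost = initial_boost
    for _ in range(lag):
        boost = max(0.0, boost * decay - inhibition_per_trial)
    return boost


# Hypothetical lag of 8 intervening trials.
no_inhibition = residual_priming(1.0, lag=8)
global_inhibition = residual_priming(1.0, lag=8, inhibition_per_trial=0.2)
print(no_inhibition > global_inhibition)
```

With these assumed values the boost survives the lag when no inhibition is applied but is wiped out when every intervening trial inhibits the prime's language, which corresponds to the absence of measurable priming that global IC models would predict.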

Long-lag morphological priming was first used by Zwitserlood et al. (2000), who showed that morphological priming, a form of priming where the prime is morphologically related to the target, survives intervening lags of ten trials. However, effects of semantic and phonological priming were not obtainable anymore at those lags. These findings suggest that priming in those cases does not occur at a phonological or semantic level, but takes place at a separate morphological level. Koester and Schiller (2008, 2011) replicated these results in a combined behavioral and ERP study, and later in an fMRI study, showing that Dutch compounds prime morphologically related target picture names. Mere form overlap, as in *jasmijn* 'jasmine' – JAS 'coat' did not facilitate picture naming, suggesting that morphological priming is indeed another form of priming, different from identity priming (Wheeldon and Monsell, 1992). Their ERP data showed a reduced N400 component in posterior scalp regions when targets were preceded by morphologically related primes, but not when items with mere form overlap preceded them. Generally, an increased N400 component can be measured when participants are presented with unexpected items (Kutas and Federmeier, 2011). As the priming of a target item subconsciously prepares a participant for what is coming next, a primed target is less unexpected compared to an unprimed target, and therefore a reduced N400 peak is expected. The Koester and Schiller (2011) study found a neural priming effect in the left inferior frontal gyrus.

Interestingly, both transparent primes, where the target word is semantically related to one of the constituents of the compound prime (e.g., *eksternest* 'magpie nest' – EKSTER 'magpie'), and opaque primes, where there is no semantic relationship between the compound and the target (e.g., *eksteroog* 'corn,' lit. 'magpie eye' – EKSTER 'magpie'), resulted in faster naming latencies. These results suggest that complex words that need to be stored as wholes due to their non-decompositional semantics, the opaque compounds, are not only represented as wholes in the lexicon, but are also parsed into their constituents. This is in line with Baayen et al.'s (1997) parallel dual-route model, where both a computational and a storage component work in parallel when producing or retrieving a complex word.

However, it seems that L2 speakers rely much more on a storage component than on computational processes (Brovetto and Ullman, 2001; Ullman, 2005; Silva and Clahsen, 2008). Brovetto and Ullman (2001) report on a speeded production task where high-frequency irregular past tense verbs were responded to faster, but high-frequency regular past tense forms did not show any frequency effects in L1 speakers. They argue that this reflects processes of storage for the irregulars, but computational processes for the regular forms. L2 speakers, however, showed frequency effects for both the irregular forms and for the regulars, indicating that their L2 knowledge relied much more on storage (but see Baayen et al., 2002, for a more subtle view on the balance of storage and computation in L1 speakers). Ullman (2005) proposes in his declarative/procedural model of memory that L2 speakers rely more on the declarative memory system and less on the procedural system – in other words, instead of parsing morphologically complex forms, L2 speakers are expected to employ full-form storage much more than L1 speakers. Silva and Clahsen (2008) conducted several masked priming experiments in which they also found that L2 speakers process morphologically complex words differently from L1 speakers, relying much more on full-form storage than on computational processes. In their L1 groups, clear priming effects for complex words were found, where the prime consisted of an inflected or derived word form, and its simplex form as the target. In their group of L2 speakers, consisting of Chinese, Japanese and German learners of English, only identity priming was found, but no priming for the morphologically related forms.

Hahne et al. (2006) report on a behavioral and ERP study where they, however, did find evidence for both processes of storage and computation in L2 speakers. Their participants responded differently to violations of regular and irregular inflections – the first elicited an anterior negativity and a P600, whereas the latter resulted in an N400 effect. Their ERPs were very similar to those of L1 speakers.

Morphologically complex words, both transparent and opaque compounds, also facilitate naming their constituents when participants have to switch between their native language and their L2. Verdonschot et al. (2012) conducted a long-lag morphological priming experiment where participants had to switch between Dutch and English. They instructed Dutch (L1) – English (L2) bilinguals to read aloud words and to name pictures switching between Dutch and English. The primes and targets were presented in Dutch, whereas the intervening trials were presented in English for the switch blocks, and in Dutch during non-switch blocks. The primes consisted of Dutch compound words in which one of the constituents was identical to the target picture name. They used both transparent primes (e.g., *jaszak* 'coat pocket') and opaque primes (e.g., *grapjas* 'funny person,' lit. 'joke coat'). Results showed that targets combined with morphologically related compounds, both transparent and opaque, yielded significantly faster naming latencies than targets preceded by morphologically unrelated primes. Despite the intervening language switch, priming effects still occurred, which according to the authors suggested that in switching to L2, no reactive inhibition is employed to suppress activation at the morphological level in L1 – for otherwise the heightened activation levels of the L1 primes would likely not have survived the repeated activation of the L2 and supposed inhibition of L1, and could therefore not have facilitated target picture naming.

However, it is possible that the dominant L1 (i.e., Dutch) is represented much more strongly in the brain than the L2 (English) in Dutch-English bilinguals. If this were the case, then these inhibitory effects might be minor compared to the strength of the Dutch representation. This could mean that even though inhibition has taken place, priming effects are still measurable as the L2 was not strong enough to suppress all activation of the L1. In other words, the findings from Verdonschot et al. (2012) are still compatible with an inhibitory model where switching to another language causes the former language to be reactively inhibited by the language currently in use. Moreover, as the experiment was conducted in Dutch, in a Dutch-speaking country, with Dutch L1 speakers who would use Dutch in the majority of their daily lives, it is not unlikely that their Dutch lexical representations were much more strongly activated overall. Therefore, if inhibition indeed plays a role, then one would expect that priming effects should be absent when the two languages are switched (e.g., using English primes/targets and Dutch intervening trials). If indeed Dutch is represented much more strongly, and if there are mechanisms of inhibition employed when switching between languages, then the effect of inhibition caused by the Dutch intervening trials should be much larger than the priming effect of the English primes and targets, and therefore it is very likely that priming effects no longer appear or are at least so much reduced that they cannot be measured by current experimental methods.

Given the previous research, several questions emerged. First of all, we aimed to replicate the morphological priming effects after a language switch, but this time using L1 intervening items and L2 primes and targets, for reasons explained above. As a consequence, the experiment would involve priming in an L2. Therefore, not only an experiment with a language switch was needed, but also an experiment completely conducted in L2, to see whether morphological priming would occur at all in an L2. As L1 speakers seem to parse both transparent and opaque compounds (Koester and Schiller, 2008, 2011; Verdonschot et al., 2012), the question arises whether L2 speakers might process these two types of compounds differently. If they parse opaque compounds, like native speakers do, a morphological priming effect for opaque compounds is expected. If, however, they would simply retrieve the stored full forms, no morphological priming of the opaque forms would be observable. All these questions can be addressed by using an experimental design similar to that of Verdonschot et al. (2012) but with the primes and targets in L2, English in this case, and the fillers in L1, Dutch.

Furthermore, we tested whether we would observe an N400 effect in morphological priming in an L2. Reduced N400 peaks have been found to occur in lexical priming paradigms (Kutas and Federmeier, 2011). Moreover, several studies have found N400 effects in early L2 speakers in semantic, associative, and categorical priming paradigms (Mueller, 2005). Therefore, it is expected that, if indeed there is morphological priming occurring in L2, there will be a reduced N400 effect in both the transparent and opaque conditions, both in the non-switch (only English items) and in the switch block. This was indeed observed in the ERP study of Koester and Schiller (2008) for morphological priming in the L1. When measured in late L2 learners (who acquired their L2 after the age of 11), the N400 peak is often delayed and has a decreased amplitude (Mueller, 2005). However, as most of our participants acquired English in primary school, starting with classes at age 10, we do not expect them to behave very differently from native speakers.

We were able to bring all these elements together in the following experiment. We had Dutch speakers read aloud words and name pictures in a long-lag priming experiment, consisting of an English block and a block where they had to switch between English and Dutch. We collected both behavioral and ERP data in order to get a more fine-grained idea of the underlying processes of morphological processing in bilinguals.

# **MATERIALS AND METHODS**

### **PARTICIPANTS**

Thirty-six Dutch-English bilingual speakers currently enrolled in higher education or with a graduate degree in higher education (18 female, average 24.2 years), who had not participated in the Verdonschot et al. (2012) study, participated in the experiment. All had normal or corrected-to-normal visual and auditory acuity. They completed a questionnaire, which included general and language-specific questions. They were asked to rate their Dutch and English proficiency on a scale from 1–10 (with 1: very poor and 10: native-like). The average self-assessment of English proficiency was 7.8 (SD = 1.2) and Dutch proficiency 9.8 (SD = 0.5). Participants were also asked about their average proportion of English use per day. On average their percentage of English use per day, with respect to their use of Dutch, was 21.1% for speaking, 52.8% for reading, and 47.1% for listening. All participants gave informed consent and took part in an off-line English proficiency assessment (Meara and Buxton, 1987)<sup>1</sup>.

Four participants were excluded from the EEG analysis due to excessive movement artifacts consisting of eye blinks and/or muscular activity, and three were not included because they were left-handed. The remaining 29 were on average 23.6 years of age (15 male).

### **STIMULUS MATERIAL**

The target stimulus set consisted of 36 black-and-white line drawings of concrete objects. Each target picture was combined with three different English compound words as primes. Two of the primes were morphologically related to the target, as one of their constituents was identical to the target picture name. The third type of prime was used as a control and therefore was neither morphologically, phonologically, nor semantically related to the target. The two morphologically related primes were either transparent, with the compound semantically related to the picture name, or opaque, where the compound is not semantically related to the picture name. An example of a transparent prime-target combination is moonlight-MOON. The first constituent of the compound is identical to the target MOON, and both the constituent and the target are identical in meaning. The opaque variant used in the experiment is honeymoon-MOON. Here, the constituent 'moon' of 'honeymoon' does not literally mean 'moon' in the compound it appears in. The compound 'earring' was used as the unrelated prime to the target MOON. It is neither phonologically nor semantically related to the word 'moon.'

Word frequency, number of syllables, word length in phonemes, word length in letters, and stress position were matched – see **Table 1** for more information. Zwitserlood et al. (2000) have shown that the position of the target morpheme does not influence morphological priming – both the first and second constituent of compounds cause faster naming latencies in morphologically related targets. This finding has also been replicated in Koester and Schiller (2008) and Verdonschot et al. (2012). Thus, the position of the target morpheme was evenly distributed across conditions, in half of the cases as the first constituent, and in the other half as the second constituent. Also, only compounds written as one orthographic word were used, as De Jong et al. (2002) have found that compounds written with a space between the constituents are processed differently from compounds that are written as one orthographic word.

<sup>1</sup>Average scores were 4898/4414, with a SD of 130/353.

**Table 1 | Mean and SD (between parentheses) of the number of syllables, word frequency per million, number of phonemes, word length and stress position for each prime type and for the targets.**

To assess the semantic transparency of the opaque and transparent primes, a group of 31 students who did not participate in the experiment was asked to rate the semantic relationship of each target picture name to a constituent in either a transparent or opaque prime. They rated this relationship on a seven-point scale (1: not at all semantically related, 7: identical in meaning). Transparent compounds were rated more semantically related (5.9) than opaque compounds (2.9), *t*(70) = 14.6; *p* < 0.01. For all 36 targets, the transparent primes received on average higher scores than the opaque primes.

As a long-lag morphological priming design was employed, additional fillers to create intervening trials were used. As it was crucial that participants actually accessed their L1 in the switch block, we also employed pictures as intervening items. By using pictures, participants could not just rely on an orthography-to-speech route where they would not have to access the concepts themselves, and thereby possibly not the indicated language. Therefore, these intervening trials consisted of both words and pictures. An additional 25 pictures and 140 English and 140 Dutch filler words were selected.

In the appendix, an overview of all prime-target combinations used in the experiment can be found.

### **DESIGN**

For this experiment, the design was identical to Verdonschot et al. (2012). The experiment was designed and controlled using E-Prime 2.0 (Psychology Software Tools). A 2 (Block Type: Switch vs. non-Switch) × 3 (Prime Type: Opaque, Transparent, and Unrelated) design was implemented, using six different experimental lists. The lists consisted of two different orders and three different prime-target combinations. Each participant saw each picture only twice – once in the non-switch condition, and once in the switch condition. This resulted in 72 (2 × 36) target trials per participant over all blocks. This way, participants did not see a target twice in the same condition and all targets were tested in all conditions across all participants. In **Table 2**, an example target with its three prime conditions is given.
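The counterbalancing described above can be sketched in a few lines. This is a minimal illustration, not the authors' E-Prime implementation: the rotation rule, the function name `build_list`, and the decision to shift the prime type by one step in the switch block are assumptions made only for the example.

```python
from itertools import product

PRIME_TYPES = ("opaque", "transparent", "unrelated")
# Two block orders (which block is presented first).
BLOCK_ORDERS = (("non-switch", "switch"), ("switch", "non-switch"))
N_TARGETS = 36


def build_list(order_index, combination):
    """Return (block, target, prime_type) triples for one experimental list."""
    trials = []
    for block in BLOCK_ORDERS[order_index]:
        # Assumption: the switch block is shifted by one step so that a target
        # is paired with a different prime type in each block.
        shift = 0 if block == "non-switch" else 1
        for target in range(N_TARGETS):
            prime = PRIME_TYPES[(target + combination + shift) % 3]
            trials.append((block, target, prime))
    return trials


# 2 orders x 3 prime-target combinations = 6 lists.
lists = [build_list(order, comb) for order, comb in product(range(2), range(3))]
```

Under this rotation each list contains 72 target trials, and across the three prime-target combinations every target occurs with every prime type in both block types.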

Between each prime and target, filler items were included to create intervening trials. Previous experiments have shown that morphological priming effects even survive lags of up to 10 items (Zwitserlood et al., 2000), but to reduce the length of the experiment, only lags of either seven or eight items long were used. Each trial consisted of both pictures and words. They were positioned in the experiment such that intervening trials did not contain any items that were phonologically or semantically related to the following target picture in either language. Before every target picture, another picture was inserted that was to be named in English, to prepare participants for naming pictures instead of reading words, and in the switch blocks, also to avoid any additional language switching costs. These pre-stimulus pictures were also neither phonologically nor semantically related to the target pictures. See **Figure 1** for a prime-target example in both a non-switch and a switch block. To avoid any order expectation, additional sequences of words and pictures that did not match the order of a regular prime-target sequence were included.

**Table 2 | Example of a target with all three prime types – transparent, opaque and unrelated.**

### **PROCEDURE**

Before the start of the experiment, participants were given information about the experiment (written in Dutch), completed a questionnaire about general and language-specific information, and gave written informed consent. Participants were given 5 min to familiarize themselves with the Dutch and English names of the pictures used in the experiment by studying a booklet. In the booklet, all pictures were printed accompanied by their Dutch and English name, the Dutch name printed in a red font, and the English name printed in a blue font. Next, participants were seated individually in front of a computer screen in a quiet room, and were connected to an EEG setup. On the computer screen, Dutch instructions were presented in white letters against a black background. The participants were asked to read aloud words and name pictures as fast and accurately as possible. A voice-key (SR-BOX) was used to measure the naming latencies.

First, a practice block was administered which was identical in form to a switch block. This block consisted of 50 words and pictures; participants had to repeatedly switch between reading words and naming pictures in both Dutch and English. This way, the participants could familiarize themselves with the task, and the experimenter was able to see whether the voice-key was reacting appropriately to the voice of the participant.

The main experiment consisted of two blocks, one of which only contained English words and English pictures (the non-switch block), and the other which contained English primes and targets and Dutch intervening trials (the switch block). In the non-switch block, all words and pictures were presented in white against a black background. In the switch block, red words and a red frame indicated that the participants had to use Dutch, whereas blue indicated that English was to be used. The words were already written in the target language, and no translation was required – the colors were added to facilitate picture naming in the correct language.

Each trial began with a fixation cross in the middle of the screen for 250 ms, followed by a blank screen for 250 ms. Next, a picture or word was presented for 400 ms, after which it disappeared from the screen and participants had an additional 1,100 ms to name the item. The experimenter assessed the validity of the trial on-line, indicating whether word errors or voice-key errors occurred. After the experiment, participants completed an off-line English proficiency assessment task (Meara and Buxton, 1987).
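As a quick check on the trial structure, the phase durations given above can be laid out as a timeline. The sketch below only restates those durations; the names `TRIAL_PHASES` and `phase_onsets` are hypothetical and not part of the original experiment script.

```python
# Phase durations in ms, as described in the procedure.
TRIAL_PHASES = (
    ("fixation cross", 250),
    ("blank screen", 250),
    ("stimulus (picture or word)", 400),
    ("additional response window", 1100),
)


def phase_onsets(phases):
    """Return a list of (phase, onset in ms) and the total trial duration."""
    onsets, elapsed = [], 0
    for name, duration in phases:
        onsets.append((name, elapsed))
        elapsed += duration
    return onsets, elapsed


onsets, total_ms = phase_onsets(TRIAL_PHASES)
```

With these durations the stimulus appears 500 ms after trial onset and a full trial lasts 2,000 ms, of which 1,500 ms (400 + 1,100) are available for naming.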

### **EEG RECORDINGS**

The EEG was recorded using 32 Ag/AgCl electrodes (BioSemi ActiveTwo), which were placed on the scalp sites according to the standards of the American Electroencephalographic Society (1991). Eye blinks were measured by two flat electrodes placed at the sub- and supra-orbital ridge of the left eye (VEOG1 and VEOG2), horizontal eye movements were measured by two flat electrodes placed at the right and left outer canthi (HEOG1 and HEOG2), and two flat electrodes were placed at the two mastoids. The electrodes CMS and DRL were used as ground references. The EEG signal was later re-referenced off-line using the mean of the two mastoids. Sampling occurred at 512 Hz, and a band-pass filter of 0.01–30 Hz was applied off-line.
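Off-line re-referencing to the mean of the two mastoids amounts to subtracting, at every sample, the average of the two mastoid channels from each channel. The sketch below illustrates this with plain Python lists; the channel labels `M1` and `M2` and the toy values are assumptions for illustration, and the band-pass filter is not included.

```python
def rereference_to_mastoids(eeg, left="M1", right="M2"):
    """Re-reference every channel to the mean of the two mastoid channels.

    eeg maps a channel name to a list of samples in microvolts.
    The mastoid labels are hypothetical; use the labels of the actual montage.
    """
    reference = [(l + r) / 2 for l, r in zip(eeg[left], eeg[right])]
    return {
        channel: [sample - ref for sample, ref in zip(samples, reference)]
        for channel, samples in eeg.items()
    }


# Toy values only, not recorded data.
toy = {"Cz": [10.0, 10.0], "M1": [2.0, 4.0], "M2": [4.0, 8.0]}
rereferenced = rereference_to_mastoids(toy)
```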

### **DATA ANALYSIS**

Participant errors (7.7%) and voice key errors (8.3%) were excluded from further analysis. Reaction times that deviated more than 2.5 SDs from a participant's mean per condition (4.1%) were removed<sup>2</sup>. The trimmed data were non-normally distributed as indicated by a Shapiro–Wilk test (all *p*s < 0.05); therefore, the reaction times were log-transformed (natural logarithm). A repeated-measures ANOVA was performed on both the trimmed data and the trimmed, log-transformed data. ANOVAs from both sets are reported.
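The trimming and transformation steps can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `trim_and_log` is hypothetical, the sample standard deviation (n − 1) is used, and values lying exactly on the 2.5 SD boundary are kept; the text does not specify these details.

```python
import math
from statistics import mean, stdev


def trim_and_log(rts, criterion=2.5):
    """Trim one participant's RTs (ms) in one condition and log-transform them.

    Returns (kept, logged): the RTs within `criterion` SDs of the mean,
    and their natural logarithms.
    """
    if len(rts) < 2:
        # An SD cannot be computed from fewer than two values; keep everything.
        kept = list(rts)
    else:
        m, sd = mean(rts), stdev(rts)
        kept = [rt for rt in rts if abs(rt - m) <= criterion * sd]
    return kept, [math.log(rt) for rt in kept]


# Toy latencies for illustration only, not data from the experiment.
toy_rts = [590, 600, 610, 595, 605, 600, 590, 610, 600, 605, 595, 1500]
kept, logged = trim_and_log(toy_rts)
```

In the toy example only the 1,500 ms latency exceeds the criterion, so it is dropped before the transformation.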

Mauchly's test showed violations of sphericity for the factor Prime Type (*F*1) and for the interaction of Prime Type and Block (both *F*1 and *F*2), *W*(2) = 0.73, *p* < 0.01, *W*(2) = 0.70, *p* < 0.01, and *W*(2) = 0.70, *p* < 0.01, respectively. A 2 × 3 repeated-measures ANOVA was conducted with a Greenhouse-Geisser correction (ε = 0.78, ε = 0.79, ε = 0.77) to test for statistical significance, with Block and Prime Type as factors, and participants (*F*1) and stimuli (*F*2) as random factors. For all six different experimental conditions, the mean, SD, and 95% confidence intervals were calculated for picture naming latencies.

Regarding the ERP data, four participants were excluded from the analysis due to excessive movement artifacts. Trials were mostly excluded because of movement artifacts due to eye blinks and overt speech. These were trials with amplitudes below –200 μV, above 200 μV or trials within which there was an absolute voltage difference of more than 200 μV. Also, all trials that were responded to incorrectly by participants were excluded from the EEG analysis. Therefore, a total of 41.3% of all trials was used in the averaging procedure (41.1% for the opaque non-switch condition, 40.6% for the transparent non-switch condition, 38.3% for the unrelated non-switch condition, 43.8, 40.6, and 43.5% for the opaque, transparent, and unrelated switch condition, respectively).
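The amplitude criteria above translate into a simple per-epoch check. The following is a sketch only: the function name `is_artifact` is hypothetical, and applying the test to the samples of a single channel within a single trial is an assumption, since the text does not say how channels were combined.

```python
def is_artifact(epoch, limit=200.0):
    """Return True if an epoch (samples in microvolts) violates the criteria:
    any sample below -limit, any sample above +limit, or an absolute
    voltage difference (max - min) of more than limit within the epoch.
    """
    lowest, highest = min(epoch), max(epoch)
    return lowest < -limit or highest > limit or (highest - lowest) > limit


# Toy epochs for illustration only.
clean = [0.0, 50.0, -100.0]   # range 150, within all limits
drift = [-120.0, 100.0]       # range 220, exceeds the difference criterion
```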

Mean amplitude ERPs were calculated for each participant separately, using a time window of 100 ms prior and 600 ms following picture onset. Between 0 and 600 ms post stimulus onset, mean amplitudes per time windows of 50 ms, with an overlap of 25 ms, were evaluated for an N400. Repeated-measures ANOVAs with Greenhouse-Geisser corrections were used to analyze the ERP amplitudes.
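The overlapping analysis windows can be enumerated directly. The sketch below assumes that the windows start at stimulus onset and advance in 25 ms steps until the last window ends at 600 ms; the function name `sliding_windows` is hypothetical.

```python
def sliding_windows(start=0, stop=600, width=50, step=25):
    """Return (onset, offset) pairs in ms for overlapping analysis windows."""
    return [(t, t + width) for t in range(start, stop - width + 1, step)]


windows = sliding_windows()
# Windows that fall entirely inside the 400-575 ms range analysed later.
late = [w for w in windows if w[0] >= 400 and w[1] <= 575]
```

Under these assumptions there are 23 windows in total, running from 0–50 ms to 550–600 ms.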

# **RESULTS**

### **BEHAVIORAL DATA**

There is a main effect of Block, *F*1(1,35) = 80.55, *MSE* = 5433.07, *p* < 0.01, η<sup>2</sup> = 0.18; *F*2(1,35) = 79.60, *MSE* = 5598.64, *p* < 0.01, η<sup>2</sup> = 0.20; *min F*′(1,70) = 40.04, *p* < 0.01, showing that there is a significant difference in naming latencies between trials from the switch block and from the non-switch block. After the log-transformation results were similar, again showing a main effect of Block: *F*1(1,35) = 76.02, *MSE* = 0.03, *p* < 0.01, η<sup>2</sup> = 0.16; *F*2(1,35) = 92.90, *MSE* = 0.03, *p* < 0.01, η<sup>2</sup> = 0.20; *min F*′(1,70) = 41.8, *p* < 0.01. Participants took on average 85 ms longer to name items when they had to constantly switch between Dutch and English. The Block Type did not affect accuracy; participants made on average 2.6% errors in the switch blocks, compared to 2.9% in the non-switch blocks. There was also a main effect of Prime Type, *F*1(2,70) = 21.64, *MSE* = 2926.42, *p* < 0.01, η<sup>2</sup> = 0.06; *F*2(2,70) = 19.11, *MSE* = 2804.29, *p* < 0.01, η<sup>2</sup> = 0.06; *min F*′(2,140) = 10.15, *p* < 0.01, indicating that the type of prime influenced response latencies. After the log-transformation of the per-item scores, the statistics showed similar values: *F*1(2,70) = 20.95, *MSE* = 0.02, *p* < 0.01, η<sup>2</sup> = 0.05; *F*2(2,70) = 19.17, *MSE* = 0.02, *p* < 0.01, η<sup>2</sup> = 0.06; *min F*′(2,140) = 10.01, *p* < 0.01. The interactions between Block and Prime Type did not reach significance, *F*1(2,70) = 3.08, *MSE* = 2789.46, *p* = 0.052, η<sup>2</sup> = 0.009; *F*2(2,70) = 1.34, *MSE* = 2858.76, *p* = 0.27, η<sup>2</sup> = 0.004; *min F*′(2,121) = 0.93, *p* = 0.40. After the log transform, the interaction was not significant either: *F*1(2,70) = 1.60, *MSE* = 0.02, *p* = 0.21, η<sup>2</sup> = 0.004; *F*2(2,70) = 1.34, *MSE* = 0.02, *p* = 0.48, η<sup>2</sup> = 0.002; *min F*′(2,139) = 0.73, *p* = 0.48.

<sup>2</sup>The errors and trimmed (removed) data were evenly distributed over the six conditions. The percentages are 2.9% for non-switch opaque, 3.0% for non-switch transparent, 3.2% for non-switch unrelated, 3.8% for switch opaque, 3.3% for switch transparent and 3.9% for the switch unrelated condition.
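The min F′ statistics reported in this section combine the by-participant (F1) and by-item (F2) ratios. A minimal sketch of the standard formula is given below; the function and parameter names are hypothetical, and the check at the end simply recomputes values from the F ratios and error degrees of freedom reported in the text.

```python
def min_f_prime(f1, f2, df_error_1, df_error_2):
    """Combine by-participant (f1) and by-item (f2) F ratios into min F'.

    df_error_1 and df_error_2 are the error (denominator) degrees of freedom
    of f1 and f2. Returns (min F', denominator degrees of freedom).
    """
    value = (f1 * f2) / (f1 + f2)
    df_denominator = (f1 + f2) ** 2 / (f1 ** 2 / df_error_2 + f2 ** 2 / df_error_1)
    return value, df_denominator


# Main effect of Block on the untransformed latencies.
block_value, block_df = min_f_prime(80.55, 79.60, 35, 35)
# Block x Prime Type interaction on the untransformed latencies.
inter_value, inter_df = min_f_prime(3.08, 1.34, 70, 70)
```

Rounded, these reproduce the reported min F′(1,70) = 40.04 for Block and min F′(2,121) = 0.93 for the interaction.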

*Post hoc* paired *t*-tests revealed that in the non-switch block, targets primed by both opaque and transparent primes differed significantly from the control targets. As expected, there was no significant difference between the opaque and the transparent condition. In the switch block, both opaque and transparent primes resulted in significantly faster naming latencies of the target picture than the control (unrelated) primes did. There was again no significant difference between transparent and opaque primes. See **Figure 2** for a plot with the average reaction times. All mean reaction times, SD, 95% confidence intervals, error rates, and the *post hoc* paired *t*-tests are shown in **Table 3**.

### **ERP DATA**

One set of analyses was conducted for lateral sites, divided into four different Regions of Interest (ROIs; see **Figure 3**): anterior-left (F3, FC5, C3), anterior-right (F4, FC6, C4), posterior-left (CP5, P3, PO3), and posterior-right (CP6, P4, PO4), with Prime Type (3), Block (2), and ROI (4) as factors. Another set comprised the midline electrodes Fz, Cz, and Pz, with a Prime Type by Block by Electrode (3) design. All the mean amplitude values were compared in a repeated-measures ANOVA per blocks of 50 ms, with an overlap of 25 ms each, within the 0 and 600 ms post-stimulus-onset time window.
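The grouping of electrodes into ROIs can be written down as a small lookup table. The sketch below uses the electrode assignments given above; averaging the electrodes within an ROI with equal weight, and the function name `roi_means`, are assumptions for illustration.

```python
ROIS = {
    "anterior-left": ("F3", "FC5", "C3"),
    "anterior-right": ("F4", "FC6", "C4"),
    "posterior-left": ("CP5", "P3", "PO3"),
    "posterior-right": ("CP6", "P4", "PO4"),
}
MIDLINE = ("Fz", "Cz", "Pz")


def roi_means(amplitudes):
    """Average mean amplitudes (electrode -> microvolts) within each lateral ROI."""
    return {
        roi: sum(amplitudes[electrode] for electrode in electrodes) / len(electrodes)
        for roi, electrodes in ROIS.items()
    }


# Toy amplitudes for illustration only, not recorded data.
toy_amplitudes = {e: 0.0 for electrodes in ROIS.values() for e in electrodes}
toy_amplitudes.update({"F3": 1.0, "FC5": 2.0, "C3": 3.0})
toy_roi = roi_means(toy_amplitudes)
```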


**Table 3 | Overview of all mean reactions times (RT), error rates (E), 95% confidence intervals, differences between the conditions, and paired comparisons.**

Standard deviations are between parentheses.

The time window between 400 and 575 ms showed a significant interaction or trends toward an interaction between Block and Condition for both the midline electrodes and the lateral regions.

Although the behavioral data showed no difference in priming between the non-switch and the switch block, the EEG data do (see **Figures 4** and **5**). In the non-switch block, the unrelated condition shows an increased negativity around 400 ms post-stimulus onset in frontal regions, whereas in the switch block the transparent condition shows a reduced N400. As the graphs seem to show a different pattern for the non-switch and the switch block, and the factor Block was involved in significant interactions in both the lateral and midline regions, separate analyses were performed for the 400–575 ms time window for the non-switch and the switch block. For the midline electrodes, there was no significant interaction between Condition and Electrode in either the non-switch or the switch block, *F*(4,88) = 0.92, *p* = 0.42, and *F*(4,88) = 0.76, *p* = 0.50. Condition is a significant main effect only in the non-switch block, *F*(2,44) = 4.81, *p* = 0.01. *Post hoc* Tukey tests indicate that there is no difference between the opaque and transparent condition (*p* = 0.79), but that the unrelated condition is significantly different from the opaque condition (*p* = 0.01) and near-significantly different from the transparent condition (*p* = 0.08).

Since there was also a near-significant interaction between Block and Condition (*F*(2,44) = 3.47, *p* = 0.06) in the lateral regions, separate analyses for the non-switch and switch block were conducted. In the switch block, there is no significant interaction between Condition and ROI, *F*(6,132) = 0.47, *p* = 0.64, and no main effect of either Condition or ROI, *F*(2,44) = 4.42, *p* = 0.25, and *F*(3,66) = 0.42, *p* = 0.63. In the non-switch condition, there is a significant main effect of Condition, *F*(2,44) = 3.90, *p* = 0.03, and a main effect of ROI, *F*(3,66) = 5.68, *p* = 0.002. *Post hoc* Tukey tests for the factor Condition in the non-switch block show no difference between the opaque and the transparent condition (*p* = 0.32), but a significant difference between the unrelated condition and both the transparent (*p* < 0.01) and opaque condition (*p* < 0.01).

# **DISCUSSION**

Participants took on average longer to name items in the switch block than in the non-switch block. These switching costs were expected, as the task in the switch blocks was more difficult. Participants not only had to switch between reading words and naming pictures, but were also required to switch between languages (Koch et al., 2010; Verdonschot et al., 2012). In both the non-switch and the switch blocks, the participants named targets preceded by morphologically related primes significantly faster than target pictures preceded by unrelated primes. Whether the primes were opaque or transparent compounds did not influence the priming effects, as both types resulted in significantly faster naming latencies and there were no significant differences between those two conditions. These results lend further support to models of language production where morphemes constitute a separate level, which is independent of semantics (e.g., Levelt et al., 1999). What is even more interesting is that this study suggests that this independent morpheme level is also present in the L2 of proficient bilinguals.

The independence of morphology from semantics is further supported by the fact that in this and other studies (Koester and Schiller, 2008, 2011; Verdonschot et al., 2012) there is no statistical difference between the effects of transparent and opaque primes. Transparent and opaque compounds differ in whether their meaning is compositional, so that the meaning of the whole compound can be derived from the meanings of its constituents, as in 'moonlight,' or not compositional, so that the meaning of the whole compound is not derivable from the meanings of its constituents, as in 'honeymoon.' Thus, in transparent compounds the constituent identical to the target still shares semantic content, whereas in opaque compounds this is not the case: the morpheme shared by prime and target overlaps only in form, not in meaning. However, as Koester and Schiller have shown, form overlap does not lead to priming in long-lag designs. Therefore, in order to account for the presence of priming effects in the opaque condition, at least some form of processing of separate morphemes of complex, opaque words has to be assumed. Consequently, even though the meaning of opaque compounds such as 'butterfly' needs to be stored, the separate morphemes of the compound are also available. This seems to be the case for opaque compounds both in the L1 (Verdonschot et al., 2012) and in an L2, as this study has shown.

Importantly, the results of this study suggest that morphological priming does occur in L2 in proficient bilinguals. Considering only the behavioral data, it also seems to be the case that a language switch to L1 between the prime and target does not interfere with priming effects, suggesting that a language switch from L2 to L1 does not result in reactive inhibition of, at least, morphologically related items in L2. However, the ERP data seem to suggest otherwise. In the non-switch block, a reduced N400 effect was found for unrelated primes, corroborating Koester and Schiller's (2008) results. However, the ERP data from the switch block do not show any N400 effects. This might indicate different participant strategies for the non-switch and the switch block.

The language switch might have made the relation between prime and target too salient so that participants recognized this relation. Therefore, they could have employed a post-lexical checking strategy that facilitated naming of the target items, which also resulted in faster response latencies. After having uttered the prime item, hypotheses about possible following items could have been checked against the concept accessed when naming the target picture. This could then have sped up the naming process, or it might have even sped up the recognition of the picture itself or the access of the concept related to the picture. Thus, whereas the patterns seen in the non-switch block seem to reflect an automatic priming process, the patterns from the switch block could reflect a less automatic process where participants relied more on a post-lexical checking strategy. It is also possible that in the switch block, having to switch from one language to the other constantly led to an increased activation of the translated concept in the other language. In that way, when a participant had to read out loud a compound in English, it might have led to increased activity not only of the English constituents of this compound, but also of the translated variants in Dutch. Previously, Christoffels et al. (2007) found phonological activation for cognates in the non-response language. Therefore, it seems likely that also in this study participants might have activated the translated variants of cognate items, or even all translated variants in the non-response language, i.e., Dutch. These Dutch concepts might then later have facilitated picture naming in English. This process is different from a pure priming process in only English, but would also lead to faster picture naming. This could explain why the behavioral data show decreased reaction latencies in both the switch and the non-switch block, but why the ERP data only show an N400 effect in the non-switch block<sup>3</sup>.

If it is indeed the case that participants used a different mechanism in the switch block, then it raises the question of what this means for the conclusions that can be drawn for BLC. In any case, a full inhibition of the L2 could not have taken place, even if participants just relied on a post-lexical checking strategy. Otherwise they would not have been able to keep the concepts related to the primes active during the intervening Dutch trials. However, only items related to the prime could have been held active until the target was encountered, which is compatible with an account assuming almost full inhibition of the non-target language, with just a very marginal activation of specific items. This is also compatible with an account assuming no inhibition at all.

<sup>3</sup>We thank an anonymous reviewer for this suggestion.

The results of this priming study show that both transparent and opaque compounds in the L2 are parsed up to the morphological level, suggesting that even compounds that need to be stored as wholes, as their semantics are not compositional, are internally parsed. The results also indicate that behavioral data benefit from being augmented with EEG data, i.e., only the ERP data showed that participants were actually processing languages differently when switching between their L1 and their L2 compared with speaking only in their L2. Moreover, the study has shown that accounts assuming full inhibition of the non-target language in bilinguals are not compatible with the observations made in Verdonschot et al. (2012) and the current study.

Combining ERP data with behavioral data in language switching paradigms, as well as using a diverse range of participants with different language backgrounds, language ecologies, and language proficiencies may shed further light on the issue of the representation of complex words.

# **ACKNOWLEDGMENTS**

Many thanks go to Renée Middelburg for her valuable feedback on earlier drafts of this paper, and to Joey L. Weidema and Leticia Pablos Robles who provided valuable assistance with the ERP data and analysis. Furthermore we would like to thank two reviewers for their helpful suggestions and insightful comments.

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnhum.2014.00995/abstract

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 July 2014; accepted: 23 November 2014; published online: 12 December 2014.*

*Citation: Lensink SE, Verdonschot RG and Schiller NO (2014) Morphological priming during language switching: an ERP study. Front. Hum. Neurosci. 8:995. doi: 10.3389/fnhum.2014.00995*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2014 Lensink, Verdonschot and Schiller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Differential recall of derived and inflected word forms in working memory: examining the role of morphological information in simple and complex working memory tasks

# *Elisabet Service<sup>1,2</sup>\* and Sini Maury<sup>2</sup>*

<sup>1</sup> Language, Memory and Brain Lab, Department of Linguistics and Languages, McMaster University, Hamilton, ON, Canada <sup>2</sup> Cognitive Science, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland

### *Edited by:*

Alina Leminen, Aarhus University, Denmark

### *Reviewed by:*

Lorraine Boran, Dublin City University, Ireland Alexandre Nikolaev, University of Eastern Finland, Finland

### *\*Correspondence:*

Elisabet Service, Language, Memory and Brain Lab, Department of Linguistics and Languages, McMaster University, Togo Salmon Hall 614, 1280 Main Street West, Hamilton, ON L8S 4M2, Canada e-mail: eservic@mcmaster.ca

Working memory (WM) has been described as an interface between cognition and action, or a system for access to a limited amount of information needed in complex cognition. Access to morphological information is needed for comprehending and producing sentences. The present study probed WM for morphologically complex word forms in Finnish, a morphologically rich language. We studied monomorphemic (boy), inflected (boy+'s), and derived (boy+hood) words in three tasks. Simple span, immediate serial recall of words, in Experiment 1, is assumed to mainly rely on information in the focus of attention. Sentence span, a dual task combining sentence reading with recall of the last word (Experiment 2) or of a word not included in the sentence (Experiment 3) is assumed to involve establishment of a search set in long-term memory for fast activation into the focus of attention. Recall was best for monomorphemic and worst for inflected word forms with performance on derived words in between. However, there was an interaction between word type and experiment, suggesting that complex span is more sensitive to morphological complexity in derivations than simple span. This was explored in a within-subjects Experiment 4 combining all three tasks. An interaction between morphological complexity and task was replicated. Both inflected and derived forms increased load in WM. In simple span, recall of inflectional forms resulted in form errors. Complex span tasks were more sensitive to morphological load in derived words, possibly resulting from interference from morphological neighbors in the mental lexicon. The results are best understood as involving competition among inflectional forms when binding words from input into an output structure, and competition from morphological neighbors in secondary memory during cumulative retrieval-encoding cycles. Models of verbal recall need to be able to represent morphological as well as phonological and semantic information.

**Keywords: complex span, morphological processing, inflected, derived, Finnish, morphology**

# **INTRODUCTION**

The study described below attempts to answer questions addressing characteristics of the interface between long-term memory (LTM) for word forms, i.e., the mental lexicon, and working memory (WM) for word forms bound together in focused attention. How does morphological complexity affect performance in two classes of WM tasks involving word recall: immediate serial recall of words and serial recall of words in the presence of interference from distractor sentences? These two tasks also reflect capacity for learning and recalling word forms and collections of words in linear order or as parts of more complex structures. They can, therefore, speak to our ability to learn and call up from memory morphologically complex forms.

The past few decades have experienced a growing interest in the mental representation of morphologically complex word forms (see e.g., Feldman, 1995; Bertram et al., 2011). The great majority of this work has addressed the question of morphological decomposition (e.g., Marslen-Wilson and Welsh, 1978; Butterworth, 1983; Taft, 1991; Niemi et al., 1994; Schreuder and Baayen, 1995; Stockall and Marantz, 2006; Marslen-Wilson and Tyler, 2007; Rastle and Davis, 2008). Most of the research has been based on experiments on written word access as well as case studies of neurological patients. The present study looks at morphological processing from a WM perspective. Following up on some of our earlier results (Service and Tujulin, 2002), we wished to explore how morphological complexity affects the ability to keep word forms active for immediate binding for serial recall [usually referred to as short-term memory (STM)] as well as for recall in connection with interference from a secondary task of sentence processing. We used standard STM and WM paradigms, i.e., immediate serial recall, and serial recall immediately following tasks that have been designed to involve both storage and processing demands (complex span tasks). Immediate serial recall provides a means for estimating capacity to bind together words and word forms into an ordered structure. Complex span tasks interleave encoding of words into a memory list with secondary processing tasks, such as sentence reading, equation verification, counting, etc. Such complex span tasks are thought to measure capacity of attentional control and targeted search in LTM to keep memory content such as word sequences available in the focus of attention (i.e., STM) for tasks such as serial recall (Unsworth and Engle, 2007; Shipstead et al., 2014). Our rationale for choosing these two types of task was that they should reveal how morphological complexity affects binding processes in STM on the one hand and WM processes involving LTM on the other.

The materials we used to study morphological WM cost were Finnish word forms. Finnish is a morphologically rich language that typically expresses a number of syntagmatic relations by changing the forms of words by adding suffixes to them. Finnish nouns, adjectives, and numerals can appear in at least 12 different case forms that are actively used. Typically, case inflections are used to express syntactic functions like subject, object, modifier, as well as semantic functions like proximity, possession, location, and change of location. Most functions expressed by case inflections in Finnish are signaled by word order and prepositional phrases in English. In addition to case endings, there are various clitics that can be added to nominals to express meanings often conveyed by function words in English. Adding such extraparadigmatic clitics does not change the form of the word (Niemi et al., 2009). In contrast, case suffixes are attached to a limited number of word stem variants that typically differ from the nominative, dictionary form, of the word. An example of perfectly normal Finnish agglutination is the word "laulajattarillammeko" (it is our female singers that have?), which can be broken up into the following elements: LAULA (derivational root related to the infinitive form "laulaa" = to sing) + JA (derivational suffix for agent = "er") + (T)TAR (derivational suffix expressing female sex like "ess" in lioness; the t-sound is lengthened by a morpho-phonological process) + I (a plural marker, which takes a specific form in inflected words) + LLA (adessive case marker = "at", expresses possession in this case) + MME (possessive clitic = "our") + KO (question clitic). The double letters in the example above are Finnish orthographic signs for long sounds.
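The segmentation of the example word can be written out as an ordered list of morphemes with the glosses given above. The sketch is purely illustrative: it concatenates the surface forms to show that they add up to the full word form and makes no claim about how such forms are parsed or stored.

```python
# Surface form and gloss of each element, in order of attachment.
MORPHEMES = [
    ("laula", "derivational root, cf. 'laulaa' = to sing"),
    ("ja", "derivational suffix for agent, like English '-er'"),
    ("ttar", "derivational suffix expressing female sex, with lengthened t"),
    ("i", "plural marker"),
    ("lla", "adessive case marker, here expressing possession"),
    ("mme", "possessive clitic, 'our'"),
    ("ko", "question clitic"),
]

# Agglutination: the word form is the simple concatenation of its elements.
word = "".join(form for form, _gloss in MORPHEMES)
segmented = "+".join(form for form, _gloss in MORPHEMES)
```

Here `word` is "laulajattarillammeko" and `segmented` is "laula+ja+ttar+i+lla+mme+ko".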

Representation of morphology in the mental lexicons of Finnish speakers has been studied in a number of experiments inspecting visual lexical decision, naming, eye movements in word recognition, picture matching, and visual word recognition during progressive demasking, and other, mostly visual, paradigms (for a review see e.g., Soveri et al., 2007). More recent work has used electrophysiological and brain imaging methods with these tasks (Lehtonen et al., 2009; Leminen et al., 2011). Another source of knowledge has been the detailed analysis of morphological task performance in Finnish aphasics (Laine et al., 1994, 1995; Laine and Niemi, 1997). Early findings were summarized in the so-called SAID (Stem Allomorph/Inflectional Decomposition) model (Niemi et al., 1994) proposing that: (1) The nominative singular is a psychologically real base form of Finnish nouns; (2) Inflected, but not derived, nouns are parsed into stems and affixes in word recognition; (3) In production, Finnish case inflected nouns are constructed from stems and affixes and derived nouns from roots and affixes; (4) The different variants of a stem occurring with different endings (the stem allomorphs) are separately represented; (5) Decomposition proceeds only to the level of the allomorph, i.e., all stem variants are represented as wholes. Thus, more opaque forms are processed similarly to transparent forms. A slight revision to this model, postulating possible morphological decomposition for derived forms at recognition, was suggested later (Laine, 1996). Recent evidence from visual lexical decision (Soveri et al., 2007) supports full-form representations for some inflected forms in the orthographic input lexicon but only for those of very high frequency.

The main hypotheses of the SAID model concerning the organization of the Finnish input lexicon have been mainly upheld in newer work. However, theory has moved toward assuming multiple levels of representation (e.g., Schreuder and Baayen, 1995; Järvikivi et al., 2006), i.e., a form level that is separate from a more abstract conceptual or lemma level which connects to syntactic and semantic knowledge about the form. Variants of multilevel models involve both bottom–up and top–down flow of information, as well as possible lateral inhibition between competitors at different levels. Further, stem allomorphs and suffixes resulting from form decomposition in comprehension appear to serve as entry points of access to the more abstract lemma level, which holds syntactic and semantic information for the family of stems belonging to a specific word (Järvikivi and Niemi, 2002a,b). Electrophysiological and magnetoencephalographic work (Lehtonen et al., 2007; Vartiainen et al., 2009; Leminen et al., 2012) able to track processing in time indicates that the processing cost attached to comprehension of inflected word forms stems from the semantic-syntactic level of processing rather than early form decomposition in word recognition. Derived forms have been studied less in Finnish than inflected ones. In general, full-form input processing has been supported (e.g., Vannest et al., 2002). However, recent work (Järvikivi et al., 2006) suggests that derived forms may also undergo decomposition in processing if the derivational affixes are very salient, in particular when they have one or few allomorphic variants, whereas no evidence for decomposition could be detected for words with even highly productive affixes if these had many allomorphs.

The postulated compulsory composition process at output for derived as well as inflected words originally rested on evidence from one Finnish aphasic patient who produced a number of false stem/root + affix combinations in reading (Laine et al., 1995). However, a body of later international work supports the use of compositional representations in word production (for a review, see Cohen-Goldberg, 2013). The present study takes as its starting point the conclusion that sufficient evidence exists for separate stem and affix representations for Finnish inflected words and for differences between processing of inflected and derived words, mainly in visual word recognition, but also in spoken word processing (Leminen et al., 2011, 2013). The present question concerns the extent to which morphological complexity adds processing cost to binding word sequences into ordered representations for immediate recall (usually referred to as short-term or primary memory) as compared to the processes that underlie recall from activated LTM (also referred to as secondary memory) in complex span tasks.

According to the influential WM framework developed by Baddeley and Hitch (1974) and Baddeley (2012), serial word recall relies mainly on the phonological loop component of WM, which in turn consists of a phonological store and an articulatory rehearsal process. If only these components were involved in immediate serial recall of words, we should not see any effects of morphology on performance. However, although this is the prototype STM task, it is well known that it is also affected by lexical factors in LTM (Tehan and Humphreys, 1988; Hulme et al., 1991). Lexical representations are thought to provide a means of patching up partly damaged, or incorrectly encoded, representations at recall. In Baddeley's (2000) framework, the more recently introduced episodic buffer component is responsible for integrating different sources of information in STM, possibly including also morphological structure as it is represented in the mental lexicon. Alternative models view WM as an activated part of LTM. In the embedded processes model by Cowan (1995, 2005), WM consists of an area of activated content in LTM with a few items in a limited-capacity focus of attention. A similar model has been proposed by Oberauer (2002), with the difference that one item is selected for processing at any one time within the focus of attention. Cowan and Oberauer do not assume modality-specific systems in STM. Their view of representation of information in the focus of attention is compatible with the feature model of Nairne (1990), in which items are represented as feature vectors. Features of the same type (i.e., modality-specific vs. modality-free) can overwrite each other. These types of representations are also used in recent computational models of serial STM and WM (Farrell and Lewandowsky, 2002; Lewandowsky and Farrell, 2008; Oberauer et al., 2012). They are, therefore, more compatible with the idea that morphological information may load up verbal STM over and above other types of information, such as sensory traces, phonology, meaning, etc.
To summarize, models of immediate memory in the activated LTM view can readily accommodate effects of morphological load in immediate recall. Morphological forgetting can be assumed to result from overwriting of morphological features of both inflected and derived forms. The Baddeley and Hitch (1974) framework may be able to do so through top–down LTM effects in the episodic buffer. Such effects in the present study would depend on representation of inflectional and derivational affixes or their conceptual counterparts in the mental lexicon.

If the original SAID model is right, there is an asymmetry between input and output representations so that both monomorphemic words in the base form, i.e., nominative singular, and uninflected derived words have full-form *input* representations that could directly support recall, whereas only the monomorphemic word forms would have full-form support for immediate memory from *output* forms, if such support was available, for example, for rehearsal. A serial recall task makes it possible to explore differences between the three types of words: monomorphemic base forms, derived base forms and case inflected words. If differences are found between inflected and uninflected word forms this supports the psychological significance of the base form. If monomorphemic words are better remembered than derived words, separate representations for roots and derivational affixes or the syntactic and semantic information attached to them have to be postulated in some part of the lexicon. From the point of view of immediate memory, morphological load effects for inflected forms can be accommodated by top–down effects from stem and affix allomorph representations in the mental lexicon through the episodic buffer in the Baddeley and Hitch (1974) model of WM and through feature overwriting in variants of the embedded processes model. Morphological load effects for derived forms are less readily accommodated by the Baddeley and Hitch (1974) framework if the input lexicon does not have decomposed forms, as the phonological store is proposed to be an input store. In feature models of STM representation, derivational information would be treated in the same way as inflectional information with the difference that the availability or weight of derivational features could be more limited than those of inflectional features. This would be especially plausible for Finnish as its inflectional affixes are formally invariant (with the exception of low-level phonological processes). 
Finnish derivational affixes take many different forms depending on the additional inflections that they frequently occur together with, making them less salient for decomposition (Järvikivi et al., 2006).

# **EXPERIMENT 1**

The first experiment employed serial recall of word lists of fixed length to explore differences in immediate memory for Finnish monomorphemic nouns, derived nouns, and nouns inflected in case forms. Results from serial recall of morphologically complex word forms have been reported in two previous studies. Service and Tujulin (2002) found for two groups of 8-year-old children that lists of spoken monomorphemic words were better recalled than both lists of inflected and of derived Finnish words, and that derived words were better recalled than inflected. For an adult sample of university students, performance on monomorphemic lists was again superior to performance on inflected lists. However, performance on lists of derived words did not significantly differ from performance on monomorphemic lists. Whereas all groups showed signs of morphological load when recalling inflected forms, the children, but not the adults, were also sensitive to morphological load when recalling uninflected derived words. The difference between age groups could have resulted from some difference other than morphology between the derived and monomorphemic word sets. For instance, the derived words may have been less familiar to the children. Németh et al. (2011) studied serial recall in Hungarian adults. They report several signs pointing to morphological information creating a load in serial STM. Recall was better for monomorphemic word lists compared to inflected word lists, and better for derived word lists than inflected word lists. Furthermore, words with two suffixes were harder to remember than words with one suffix, which were harder than monomorphemic words. Regularly inflected words were easier than irregularly inflected words. However, the authors do not report a direct comparison between recall for monomorphemic and derived word lists. 
Previous studies, thus, suggest that inflectional information limits capacity to bind words together for ordered serial recall whereas the evidence for the role of derivational information remains less clear. In our first experiment, we simply asked whether immediate serial recall performance differs for monomorphemic, derived, and inflected word forms. Such differences could be modality-specific, for instance, because of greater auditory than visual perceptual confusability between suffixes. We therefore investigated recall of both auditorily and visually presented lists.

The words were presented in blocked lists to accentuate possible morphological effects. Based on the SAID model of Finnish morphological processing, we expected uninflected word forms to be better remembered than inflected forms. A finding of morphological load (inflection and/or derivation) affecting recall would suggest that immediate serial recall for word lists is not limited by phonological information only, as suggested in the Phonological Loop model of verbal STM. Such effects could be better handled by the feature model of Nairne (1990) or variants of the distributed serial order in a box (SOB) model (Farrell and Lewandowsky, 2002; Lewandowsky and Farrell, 2008) which accommodate information of many kinds to be represented in any immediate recall task.

### **METHOD**

### *Participants*

Twenty students volunteered for the experiment, either for course credit or a small sum of money. There were 16 females and 4 males, whose ages ranged from 18 to 54 years (mean = 26.3). All participants spoke Finnish as their first language and none had experienced problems with reading or writing.

### *Stimuli*

Ten lists of seven nouns were constructed for each of the word types: monomorphemic, inflected, and derived, by random selection without replacement from pools of 70+70+70 words. Frequency information for the words was acquired from an unpublished computerized corpus which includes 22.7 million word tokens from a major Finnish newspaper *Turun Sanomat*. Lemma frequency (i.e., frequency of the word in any form) was controlled between different word types, as were word length in letters or phonemes (in Finnish these are identical with a few rare exceptions; see **Table 1**). The frequency of the surface form could not be perfectly controlled at the same time as the lemma frequency. However, it was known (**Table 1**) and could therefore be used in item analyses as a covariate. To avoid a confound between word type and concreteness, we tried to match imageability between the three word types by selecting the stimuli in triplets that were subjectively similar in evoking imagery associations. The monomorphemic forms were words in singular and nominative case (dictionary form) with no derivational affixes. The nominative case is in Finnish the unmarked subject case, but it is also used for predicate complements and objects in certain constructions. Eight different cases were used to make up the inflected forms. All these cases have multiple functions. The cases used and their most prototypical functions are: genitive (expressing possessor)/accusative (unmarked object case), partitive (most common object case with many other functions as well), inessive (locative form "*in*"), elative (locative form "*from within*"), illative (locative form "*into*"), essive ("as something"), and translative [expresses state that something changes/has changed into, e.g., "*Lumi* (snow) *sulaa* (melts) *vedeksi* (water+translative)." *Snow melts into water*]. Seven of the cases were physically different (the genitive and the accusative singular are homonymous in Finnish). The derived words had constructions employing eight different productive derivational suffixes (deadjectival -*UUs*, denominal -*UUs*, deverbal -*Us*, deadjectival -*Us*, deverbal -*nA*, deverbal -*ntA*, deverbal -*nti*, deverbal -*jA*; capitals indicate vowels that change as a consequence of vowel harmony, i.e., only front vs. back vowels are allowed in a specific word form, /e/ and /i/ are treated as neutral; double letters indicate long sounds). Homonymous forms could not be avoided as the number of frequent productive derivational endings of a certain length is limited. It should be pointed out, though, that nominalizations of different word classes constitute different derivational processes (e.g., the method of choosing the root that the ending has to be attached to as well as the semantic effect differ) and are therefore not confusable. Moreover, despite similar affixes, the resulting word forms often had different last syllables because of phonological processes (such as vowel harmony) or because of resyllabification after an ending was added. To allow control for similarity of word endings within a list, similarly ending words were also included in the monomorphemic lists. Most of the derivational suffixes have multiple allomorphic forms (e.g., nominative *virta-us*, flow, has the genitive form *virta-uksen*). None of the derivational endings were structurally invariant in both singular and plural forms. 
As structural invariance has been found to increase the salience of Finnish derivational affixes (Järvikivi et al., 2006), it can be noted that our set of derived words was not biased to maximize decomposition in this respect. Example lists of word forms to remember are shown in **Table 2**.
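The list construction described at the start of this section (ten lists of seven words per word type, drawn at random without replacement from a pool of 70) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function `build_lists`, the fixed seed, and the placeholder word labels are all hypothetical, and the matching of frequency, length, and imageability is not modeled.

```python
import random


def build_lists(pool, n_lists=10, list_length=7, seed=None):
    """Randomly partition a word pool into lists, without replacement.

    Hypothetical helper: every word in the pool ends up in exactly one list.
    """
    if len(pool) != n_lists * list_length:
        raise ValueError("pool size must equal n_lists * list_length")
    rng = random.Random(seed)
    # random.Random.sample draws without replacement; sampling the whole
    # pool therefore gives a random permutation of it.
    shuffled = rng.sample(pool, k=len(pool))
    return [
        shuffled[start:start + list_length]
        for start in range(0, len(shuffled), list_length)
    ]


# Placeholder pools: the real stimuli were Finnish nouns, not these labels.
pools = {
    word_type: [f"{word_type}_{i:02d}" for i in range(70)]
    for word_type in ("monomorphemic", "inflected", "derived")
}
lists = {
    word_type: build_lists(pool, seed=1)
    for word_type, pool in pools.items()
}
```

Sampling the entire pool once and slicing it into consecutive chunks guarantees that no word appears in more than one list, which is what selection without replacement requires.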

### *Procedure*

Every participant took part in an auditorily and a visually presented condition. In the auditory condition words were presented from a minidisk at a rate of one word per second. At the end of each seven-word list, the participants immediately orally recalled as many words in their presented form as they could remember. In the visual condition, PsychLab software was used to present words in black Geneva 36-point font at the center of a Macintosh Quadra 950 computer screen at a one-word-per-second rate. The same words were presented in both modalities but in differently ordered lists. Half of the subjects received one set of lists in the auditory modality and the other set in the visual modality. For the other half of subjects the list sets were reversed. The order of the different presentation modalities and the blocks with the three types of words was counterbalanced between participants. Participants had been instructed to recall the words in the same order as they had been heard. We initially scored using both a strict order criterion, scoring only correct word forms that were produced in the same order as presented, and a more lenient item criterion scoring each correctly recalled word form for a list. The main results were practically identical. As the item score allowed us to also look at confusions combining stems/roots with incorrect endings we have opted to report only the item scores here.

**Table 1 | Frequency per million words, word length, and imageability mean (standard deviation in parentheses), and ranges for the word stimuli in Experiments 1−3.**

**Table 2 | Examples of lists of monomorphemic, derived, and inflected words.**

Note that both stems and suffixes have allomorphic variants for different case forms of the word so that a mechanic agglutinating process of adding endings to an invariant root or stem is often not possible.
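The two scoring criteria mentioned in the Procedure can be made concrete with a short sketch. The function names are hypothetical, and the strict criterion is implemented under one possible reading (a word form counts only if it is produced at the same position at which it was presented); the text does not specify the exact rule, so this is an assumption.

```python
def item_score(presented, recalled):
    """Lenient item criterion: count presented word forms that appear
    anywhere in the recall protocol, regardless of position."""
    return len(set(presented) & set(recalled))


def strict_order_score(presented, recalled):
    """Strict order criterion, under an assumed position-by-position
    reading: a word form counts only if it is recalled at the same
    position at which it was presented."""
    return sum(p == r for p, r in zip(presented, recalled))


# Placeholder tokens stand in for the seven words of one list.
presented = ["w1", "w2", "w3", "w4", "w5", "w6", "w7"]
recalled = ["w1", "w3", "w2", "w4"]

print(item_score(presented, recalled))          # four list words recalled
print(strict_order_score(presented, recalled))  # two of them in position
```

In this example two words are transposed, so the item score credits four words while the position-based score credits only two. An exact match on the word form is required in both functions, so a correct stem with an incorrect ending earns no credit under either criterion.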

### **RESULTS**

Recall performance for items per list is shown in **Figure 1A** and **Table 3**. The immediate recall scores (number of words recalled across 10 lists) were subjected to a 2 (Modality: auditory vs. visual) × 3 (Word type: monomorphemic vs. inflected vs. derived) analysis of variance (ANOVA) with repeated-measures and explored by planned contrasts, comparing the different word types. Because generalization in language experiments is made both from individuals to a population and from the sampled language items to all similar items in the language, the analysis by subjects was complemented by an analysis with items as the random factor. In the latter, the number of subjects recalling each item was used as the dependent measure. Because different word types were represented by different items, the analysis was a less powerful between-items model. Both measures were normally distributed (Kolmogorov–Smirnov test). We report effect sizes for the statistical tests based on recommendations proposed by Lakens (2013), giving both $\eta_p^2$ and $\eta_G^2$; the former is able to inform power analyses and the latter allows comparison of between- and within-designs. The two main effects of Modality and Word type were significant in analyses with both subjects and items as random effects. Recall was better in the auditory than in the visual condition, $F_1(1,19) = 23.18$, $p < 0.0001$, $\eta_p^2 = 0.55$, $\eta_G^2 = 0.15$; $F_2(1,207) = 58.20$, $p < 0.0001$, $\eta_p^2 = 0.22$, $\eta_G^2 = 0.05$, thus showing a typical modality effect. Word type also made a difference, $F_1(2,38) = 109.45$, $p < 0.0001$, $\eta_p^2 = 0.85$, $\eta_G^2 = 0.32$; $F_2(2,207) = 19.08$, $p < 0.0001$, $\eta_p^2 = 0.16$, $\eta_G^2 = 0.13$. Recall for monomorphemic forms was better than for inflected forms (subjects: $p < 0.0001$, $\eta_p^2 = 0.84$, $\eta_G^2 = 0.35$; items: $p < 0.0001$, $\eta_p^2 = \eta_G^2 = 0.15$) in both analyses. It was also better than for derived words in the subject although not the item analysis (subjects: $p < 0.005$, $\eta_p^2 = 0.21$, $\eta_G^2 = 0.03$; items: $p = 0.1258$). Recall for derived forms was significantly better than for inflected forms in both the subjects ($p < 0.0001$, $\eta_p^2 = 0.76$, $\eta_G^2 = 0.25$) and items ($p < 0.0001$, $\eta_p^2 = \eta_G^2 = 0.09$) analyses. The interaction between the factors did not reach significance in either subject or item analysis, $F_1(2,38) = 2.68$, $p = 0.0817$, $\eta_p^2 = 0.12$, $\eta_G^2 = 0.01$; $F_2(2,207) = 1.35$, $p = 0.2605$, $\eta_p^2 = 0.01$, $\eta_G^2 = 0.002$. The same pattern of results was clear also in separate ANOVAs on the auditory and visual data. The main effect of Word type was highly significant both in the auditory data, $F_1(2,19) = 78.36$, $p < 0.0001$, $\eta_p^2 = 0.80$, $\eta_G^2 = 0.34$; $F_2(2,207) = 21.14$, $p < 0.0001$, $\eta_p^2 = \eta_G^2 = 0.17$, and the visual data, $F_1(2,19) = 49.58$, $p < 0.0001$, $\eta_p^2 = 0.72$, $\eta_G^2 = 0.31$; $F_2(2,207) = 11.05$, $p < 0.0001$, $\eta_p^2 = \eta_G^2 = 0.10$. Additional analyses looking at all correctly recalled roots/stems, i.e., including suffix confusions as correct, were performed to see if the differences between word types could be explained by confusions between suffixes. These analyses showed a similar pattern to the original analysis with one exception. The difference between monomorphemic and derived words did not reach significance in either the auditory or the visual condition, suggesting that the above reported differences between these word types in the subject analysis were mainly due to suffix confusions. Examples of suffix confusions in the derived lists are *keila–us* (bowling) for *keilaa–ja* (bowler) or *melo–ja* (canoer) for *melo–nta* (canoeing). However, impaired recall for inflected forms compared to uninflected forms could be seen even when suffix confusions were ignored. As we had not been able to perfectly match surface frequency between the different types of words we also ran analyses of covariance (ANCOVAs) in both modalities, using surface frequency as a covariate. This made no difference to the results and the relationship between form frequency and recall did not approach significance for either presentation modality.
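For readers who wish to relate the reported *F* ratios to the partial eta squared values, the standard conversion is partial eta squared = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies it to the omnibus tests of the 2 × 3 ANOVA; the function name is a hypothetical choice, and generalized eta squared is not computed because it requires sums of squares that are not reported here.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Omnibus tests from the 2 x 3 ANOVA reported above: (label, F, df1, df2).
reported_tests = [
    ("Modality, by subjects", 23.18, 1, 19),
    ("Modality, by items", 58.20, 1, 207),
    ("Word type, by subjects", 109.45, 2, 38),
    ("Word type, by items", 19.08, 2, 207),
    ("Interaction, by subjects", 2.68, 2, 38),
]

for label, f_value, df_effect, df_error in reported_tests:
    value = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: {value:.2f}")
```

The values obtained this way agree with the reported partial eta squared for these five tests to two decimal places.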

### **DISCUSSION**

Experiment 1 showed very clearly that morphologically complex words were harder to recall than monomorphemic words. The largest effect sizes were between monomorphemic words and inflected words. The analysis by subjects revealed a smaller difference between monomorphemic words and derived words. In the analysis by items this difference did not reach significance. It is possible that this lack of effect is a result of some of the presented derived forms being treated as lexicalized despite their derivational endings being productive in principle. Whereas all monomorphemic forms are lexicalized irrespective of their frequency of use, only derived nouns with a high frequency are likely to be represented solely as whole forms in the mental lexicon. All of our derivational suffixes also have many allomorphs, especially the –(U)Us endings, known to decrease affixal salience (Järvikivi et al., 2006). Inflected forms are also potentially more confusable than derived forms. All Finnish nouns can take most case forms, whereas specific derivational suffixes can only be applied to restricted subsets of words (cf. Bybee, 1985). Whatever the explanation for the relative size of the effect, the difference between derived and monomorphemic words replicates the finding from children (Service and Tujulin, 2002) with a completely new set of words. It suggests the presence of a decomposed representation for derivations at some level of the lexicon, either formal or conceptual or both. This is in line with work on Finnish production (Niemi et al., 1994) and reception of derived words with formally invariant affixes. The pattern of effects did not depend on modality and can therefore not be readily explained by perceptual differences between the different word types. Word length in terms of phonemes and letters was controlled and should not have affected the phonological coding or rehearsal of the words.

**FIGURE 1 | Mean recall of monomorphemic, inflected, and derived Finnish words in lists of seven words in (A) Experiment 1, in the Auditory or Visual Words tasks; in Experiments 2 and 3 in (B) the Last Words and (C) the Independent Words tasks.** The blue portions of the columns indicate perfectly correctly recalled forms. Form confusions are marked by the red portions of the columns. The error bars indicate standard error of the mean for the correctly recalled word forms.

**Table 3 | Recall performance: mean number of words recalled by participants and mean number of participants recalling each item, standard deviation in parentheses.**

Immediate serial recall is generally thought to depend on phonologically coded STM, the phonological loop in the WM model by Baddeley and Hitch (1974) and Baddeley (1986). However, this memory component may be aided by LTM if the items have familiar lexical representations (Tehan and Humphreys, 1988; Hulme et al., 1991). The differences found between the different types of words could depend on differences in the LTM support available to them during the task. Complex WM tasks that combine processing and storage (Daneman and Carpenter, 1980) are thought to reflect capacity to activate and manipulate selected portions of LTM whereas there is limited opportunity for rehearsal of phonologically coded words in an internal loop. These tasks have been interpreted to depend on both attentional control to manage primary memory contents and ability to do effective searches in secondary memory for the information that has been displaced from primary memory during the processing task (Unsworth and Engle, 2006; Shipstead et al., 2014). In Experiment 2, we investigated the effects of morphological complexity on a variant of complex span that we call Last Words as it involved reading of sentences and memorizing of their last words. If both inflected and derived words have decomposed representations in the mental lexicon, this task, relying more on secondary memory, should be even more sensitive to morphological load than simple span recall.

# **EXPERIMENT 2**

Different views on complex span performance largely agree that performance depends on the availability of attentional resources for binding memoranda into a list representation for recall while fending off forgetting resulting from the processing task. Models differ on whether they assume forgetting of memory words to be a result of decay with time (Barrouillet et al., 2004) or a result of feature overwriting from other memory items as well as distractor items included in the secondary processing task (Oberauer et al., 2012; Oberauer and Lewandowsky, 2014). They also hypothesize different roles for attention in either refreshing memoranda through rehearsal (Barrouillet et al., 2004) or by suppressing distractors (Oberauer and Lewandowsky, 2014). Furthermore, rehearsal can rely on two mechanisms: attentional or articulatory refreshing (Camos et al., 2009). Some critical factors supporting memory in complex span tasks are the strength of the bindings of the memoranda to their list positions for recall (order memory) and their discriminability from other memoranda (item memory) as well as the availability of attentional resources to establish a good search structure to support recall.

Because complex span tasks depend on the availability of attentional resources to boost memory, we thought that these tasks could be even more sensitive to morphological operations than simple serial recall. Thus, inserting our stimulus words into a memory task that combines reading of sentences and memory for their last words could show the effect of morphological complexity on recalling words in a task that depends on alternating between encoding memory words into a cumulative list and processing distractors. Recall of morphologically complex words in complex span tasks with sentence processing as the secondary task has been reported in two previous studies. Service and Tujulin (2002) found better recall for monomorphemic words than both inflected and derived word forms in two groups of 8-year-old children and one group of adults. However, unlike in the simple serial recall task, none of the groups recalled derived words better than inflected words. Cohen-Mimran et al. (2013) studied recall of regularly and irregularly inflected Arabic nouns in a listening span task. Eleven-year-old children listened to sentences and memorized their last words. Memory was better for monomorphemic words than inflected words, and better for regular forms compared to irregular forms. Thus, two previous studies suggest that complex span tasks are sensitive to morphological complexity. In the present study, we hypothesized that decomposed representations for inflected forms would result in them receiving less support from lexical memory at recall, as uninflected nominative forms are the preferred access forms for nouns in Finnish (Niemi et al., 1994; Laine, 1996). This would also be true for the roots of derived words with productive derivational suffixes (Laine, 1996) to the extent that their morphological features can be expected to decay or be overwritten independently of the root. 
However, some of the derived words are likely to be treated as lexical wholes, and therefore the effect would be smaller for derived words as a group. These hypotheses were tested in Experiment 2.

In the second experiment we employed a Last Words task, closely resembling the sentence span task developed by Daneman and Carpenter (1980). In this task the participants were shown sentences on a computer screen and asked to read them aloud. For every sentence they also had to memorize the last word. The main difference to the Daneman and Carpenter (1980) procedure was that rather than determining individual spans we tested the participants on 10 groups of seven sentences, aiming to be above span for most individuals. We hypothesized that LTM support from the mental lexicon would lead to the best recall for monomorphemic words in nominative case. Nominative singular is often the most frequent form of a word. Because it also has special communicative functions, such as in introducing a word (*This is an X*<sub>nominative singular</sub>), it is also likely to be special at a more abstract lemma level, binding together syntactic and semantic information (Järvikivi and Niemi, 2002a). Second best recall could be expected to occur for derived words, which again are nominative singulars, but which could also activate competitors through a parallel route, based on parsing the units into roots and suffixes. Recall would be worst for inflected words, for which syntactic-semantic decomposition processes are believed to be obligatory in Finnish.

# **METHOD**

### *Participants*

Twenty native Finnish-speaking students volunteered for the experiment for course credit. Of them, 15 were females and 5 males, with ages ranging from 19 to 35 years (mean = 21.9). None of them had taken part in Experiment 1. Neither had any of the participants experienced reading or writing difficulties.

### *Stimuli*

The same lists of monomorphemic, inflected, and derived words as those in Experiment 1 were used. There were again two versions of the stimulus material, presenting the words in different orders. Sentences were constructed containing these words as their last elements. The sentences were controlled for length (ranging from 9 to 13 words, means = 11.2, 11.1, and 11.0 in monomorphemic, derived, and inflected conditions, respectively) and complexity: each sentence consisted of a main clause and either a subordinate clause or a participial phrase. Two versions of 10 lists of seven sentences were formed for each word type. Examples of the sentences can be seen in **Table 4**.

### *Procedure*

The participant's task was to read aloud the sentences and try to memorize their last words. The stimulus sentences were presented using PsychLab software and a Macintosh Quadra 950 computer. They were shown slightly above the center of the computer screen in Monaco 24-point font. When the participant finished reading a sentence aloud, the experimenter pressed a button revealing the next sentence after a 2009-ms blank screen. At the end of each list of seven sentences, the participants immediately retrieved as many of the last words as they could in the same order as they had appeared in the lists. Presentation was blocked by word type. The order of presentation of the three different types of words was counterbalanced between participants. Half of the participants saw one version of the stimulus lists, and the other half the other version. Item scores based on one point for each correctly recalled word form are reported.
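The difference between the item scoring reported here and a strict serial-position scoring can be illustrated with a minimal sketch. The function names and the placeholder word labels are hypothetical and are not taken from the experiment materials.

```python
def item_score(presented, recalled):
    """One point per presented word form that appears anywhere in the recall output."""
    remaining = list(recalled)
    score = 0
    for word in presented:
        if word in remaining:
            remaining.remove(word)  # each recalled token is credited only once
            score += 1
    return score


def serial_score(presented, recalled):
    """One point per word form recalled in its original serial position."""
    return sum(p == r for p, r in zip(presented, recalled))


# Placeholder labels standing in for the seven last words of one list.
presented = ["w1", "w2", "w3", "w4", "w5", "w6", "w7"]
recalled = ["w1", "w3", "w2", "w7"]

print(item_score(presented, recalled))    # 4
print(serial_score(presented, recalled))  # 1
```

Item scoring credits every correctly reproduced word form regardless of output position, so it is the more lenient of the two measures.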

### **RESULTS**

The mean number of recalled words per list can be seen in **Figure 1B** and descriptive statistics are shown in **Table 5**. The data were analyzed with a repeated-measures ANOVA with Word type as a within-subjects factor and number of words recalled in all lists as the dependent variable in the subject analysis. In the item analysis, Word type was a between-items factor and the number of subjects who had recalled a word form was the dependent variable. Both dependent variables were normally distributed (Kolmogorov–Smirnov test). There was again a significant effect of Word type [$F_1(2,38) = 20.72$, $p < 0.0001$, $\eta_p^2 = 0.52$, $\eta_G^2 = 0.14$; $F_2(2,207) = 6.09$, $p < 0.005$, $\eta_p^2 = \eta_G^2 = 0.06$] resulting from better recall of monomorphemic words than inflected words ($p < 0.0001$, $\eta_p^2 = 0.52$, $\eta_G^2 = 0.14$, in the subject analysis, and $p < 0.001$, $\eta_p^2 = \eta_G^2 = 0.05$, in the item analysis). Recall was also better for monomorphemic than derived words ($p < 0.0005$, $\eta_p^2 = 0.30$, $\eta_G^2 = 0.06$, in the subject analysis, and $p < 0.05$, $\eta_p^2 = \eta_G^2 = 0.02$, in the item analysis). A somewhat smaller advantage for derived compared to inflected words was significant in the subject analysis ($p < 0.05$, $\eta_p^2 = 0.12$, $\eta_G^2 = 0.02$) but not in the item analysis ($p = 0.1699$, $\eta_p^2 = \eta_G^2 = 0.009$).

The Finnish memory words are shown in bold italic font and their English translations in regular bold font.

An analysis based on accepting all correctly recalled roots/stems (ignoring suffix errors) revealed no significant differences between word types [$F_1(2,38) = 0.72$, $p = 0.4932$, $\eta_p^2 = 0.04$, $\eta_G^2 = 0.02$].
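As a consistency check on the reported statistics, the p-value of an F test with two numerator degrees of freedom can be computed in closed form. The sketch below assumes the standard error degrees of freedom for a one-way repeated-measures design, (levels - 1) × (participants - 1); the function name is hypothetical.

```python
def p_value_f2(f_stat, df_error):
    """Upper-tail p-value for an F statistic with 2 numerator degrees of freedom.

    For F(2, n) the survival function has the closed form (1 + 2F/n) ** (-n/2).
    """
    return (1.0 + 2.0 * f_stat / df_error) ** (-df_error / 2.0)


n_participants = 20
n_word_types = 3
df_error = (n_word_types - 1) * (n_participants - 1)  # 2 * 19 = 38

p = p_value_f2(0.72, df_error)
print(df_error, round(p, 3))  # 38 0.493
```

With 38 error degrees of freedom the result agrees with the reported p = 0.4932 to within rounding of the F value.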



Thus, all detectable word type differences in the Last Words task seemed to be due to explicit suffix confusions. Item ANCOVAs with lemma and surface form frequencies as covariates did not change the pattern of results. Neither were the frequency factors significant (*F*s < 1).
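The two dependent variables used throughout these analyses (words recalled per participant for the subject analysis, and participants recalling each word for the item analysis) can be derived from the same recall matrix. The following sketch uses invented toy data; all names and values are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: recall[participant][word] is True if the word form was recalled.
word_type = {"w1": "mono", "w2": "mono", "w3": "inflected", "w4": "derived"}
recall = {
    "s1": {"w1": True, "w2": True, "w3": False, "w4": True},
    "s2": {"w1": True, "w2": False, "w3": False, "w4": True},
    "s3": {"w1": True, "w2": True, "w3": True, "w4": False},
}


def by_subject(recall, word_type):
    """Words recalled per participant and word type (dependent variable for F1)."""
    totals = defaultdict(int)
    for subject, responses in recall.items():
        for word, correct in responses.items():
            totals[(subject, word_type[word])] += int(correct)
    return dict(totals)


def by_item(recall):
    """Number of participants who recalled each word (dependent variable for F2)."""
    totals = defaultdict(int)
    for responses in recall.values():
        for word, correct in responses.items():
            totals[word] += int(correct)
    return dict(totals)


subject_scores = by_subject(recall, word_type)
item_scores = by_item(recall)
print(subject_scores[("s1", "mono")])  # 2
print(item_scores["w3"])               # 1
```

In the subject analysis each participant contributes a score for every word type, which is why Word type is a within-subjects factor there but a between-items factor in the item analysis, where each word belongs to exactly one type.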

### **DISCUSSION**

Although the main effect of Word type could be replicated in Experiment 2, the pattern of results was slightly different when simple list recall was replaced by performance in the Last Words task. Monomorphemic words were still recalled the best, probably because they have strong lexical representations and there is no parallel access route based on decomposition for them. However, this time the main split seemed to be between monomorphemic and morphologically complex items rather than between uninflected and inflected forms. Together, Experiments 1 and 2 replicate the pattern of effects found in our earlier study in both children and adults (Service and Tujulin, 2002).

The difference in results between Experiments 1 and 2 could reflect the increased influence of lexical memory on recall performance in a complex WM task where articulatory rehearsal is prevented. Monomorphemic words would receive maximal lexical LTM support in complex span tasks. Derived words would have both whole-word and root + suffix routes available, which might decrease direct lexical support for the word forms as wholes and increase the tendency to substitute one derived form for another. For inflected words, access would always be followed by syntactic-semantic decomposition and direct lexical support would not be available for production. This would result in competition between several activated inflections and be reflected in suffix confusions. There is one detail in the results that does not support this analysis: it does not look like the relative performance on inflected forms was worse in the Last Words task than in the simple spans in Experiment 1, where the influence of the lexicon can be assumed to have been smaller because of active rehearsal. In fact, it looks rather as if performance on inflected forms had slightly improved in comparison to the two other types of words. It also appears as if relative performance on derived words in Experiment 2 was somewhat poorer than in Experiment 1. We will return to a statistical comparison between tasks in connection with Experiment 3.

A more or less similar pattern of results could be expected if competition for shared processes between morphological processes and sentence reading rather than amount of support from the lexicon determined the main pattern of results. The morphological processing involved in dealing with monomorphemic words would be minimal, with derived words intermediate, and with inflected forms the most demanding. Recall could be assumed to reflect this ordering of the processing demands of the different types of words. At the same time recall could be expected to be somewhat lower than in the simple list recall task with articulatory rehearsal and no extra processing demands. This is the general pattern that was found. However, again the relatively improved performance on all three types of words, compared to that, at least, in the visual condition in Experiment 1, undermines the credibility of this argument.

# **EXPERIMENT 3**

A possible explanation for the relatively improved recall in the Last Words task could be the effect of context. Presentation in sentence context could be thought to aid the recall of, especially, inflected words, as the inflectional forms were tied to the syntactic-semantic relations that were expressed in the sentences. Recall of these forms could, therefore, have been relatively easier than in the first Experiment. The sentence context in which the words were embedded could have provided additional memory support in many ways. An episodic context that included the words could have been re-activated at recall. This would have supported all three kinds of words. The semantic context, provided by the sentence, could also have supported recall of all three types of words. A final possibility is that the syntactic and/or semantic role assigning processes invoked in sentence reading and understanding either led to richer encoding of the forms, or still remained partly active at the time of word recall, thus providing priming or support from within WM (Potter and Lombardi, 1998). These possibilities were inspected in the third experiment.

Experiment 3 was a replication of Experiment 2, except that now participants were presented with extra words after the sentences they had to read, for later recall. The extra words were thus included in the episodic context of sentence reading but were not syntactically or semantically connected with the sentences. We hypothesized that if only episodic context mattered in creating richer memory representations then the pattern of results would be similar to that in Experiment 1, with a clear advantage for uninflected words compared to inflected words for lexical processing reasons, but overall recall would be better than for a simple word list. If, on the other hand, syntactic or semantic sentence context had mattered in Experiment 2, this effect should now be missing. If the semantic context of the sentence plays a role it should have decreased both inflectional and derivational confusions in Experiment 2. With the semantic context removed, performance for both types of morphologically complex words, and to some extent monomorphemic words as well, should be worse in Experiment 3 than 2. If the syntactic context had supported recall in Experiment 2, this should have predominantly helped recall of inflected forms. With the syntactic context removed in Experiment 3, performance for inflected forms should in this case suffer relatively more than for derived forms, as syntactic structure should have restricted the range of possible inflectional, but not derivational, confusions in Experiment 2.

# **METHOD**

### *Participants*

Twenty students took part in the experiment for course credit. There were 16 females and 4 males, with ages ranging from 19 to 38 years (mean = 24.8). None of them had participated in the previous experiments. They were all native speakers of Finnish and none had experienced problems with reading or writing.

### *Stimuli*

The same lists of monomorphemic, inflected, and derived words as in Experiments 1 and 2 were again used. The sentences from Experiment 2 were taken as a starting point and 3 × 70 new sentences were constructed by replacing the last words in the new versions. The original last words were now presented separately. The lists of sentences and target words were recombined. For instance, the last word of a sentence ending in an inflected form in Experiment 2 was replaced, and a monomorphemic or derived target word was attached to the sentence. The sentences were controlled for length (ranging from 9 to 14 words; means = 11.1, 11.2, and 11.3 for monomorphemic, derived, and inflected conditions, respectively) and complexity, as in Experiment 2.

### *Procedure*

The procedure was identical to that in Experiment 2, with the exception that the reading aloud of the last word in each sentence was now followed by a 500-ms blank screen, after which a single unrelated word was presented at the center of the screen in Monaco 28-point font for 510 ms. The word was one of the three word types. If it was an inflected word, its case form was different from that of the last word of the sentence. The participants were asked to memorize this word rather than the last word of each sentence. The memory word was followed by a 2009-ms interstimulus interval before presentation of the following sentence. After seven sentences and target words had been shown the participant attempted to recall the words in the order they had been presented. However, only item scores irrespective of output order are reported here.

### **RESULTS**

### *Recall of independent words*

The mean number of recalled words of different types per list is shown in **Figure 1C**. Descriptive statistics are in **Table 5**. Analyses by subjects were carried out on the mean number of words recalled in the different conditions and analyses by items on the number of participants recalling a word. Both variables were normally distributed (Kolmogorov–Smirnov test). A one-way repeated-measures ANOVA with Word type as the within-subjects factor again showed a significant main effect, paralleled by a between-items effect of Word type in the item analysis [$F_1(2,38) = 30.83$, $p < 0.0001$, $\eta_p^2 = 0.62$, $\eta_G^2 = 0.19$; $F_2(2,207) = 21.52$, $p < 0.0001$, $\eta_p^2 = 0.17$, $\eta_G^2 = 0.17$]. Planned contrasts revealed that monomorphemic words were remembered more often than inflected words [$F_1(1,38) = 59.98$, $p < 0.0001$, $\eta_p^2 = 0.61$, $\eta_G^2 = 0.18$, for subjects, and $F_2(1,207) = 41.96$, $p < 0.0001$, $\eta_p^2 = \eta_G^2 = 0.17$, for items] and derived words [$F_1(1,38) = 7.54$, $p < 0.01$, $\eta_p^2 = 0.17$, $\eta_G^2 = 0.03$, for subjects, and $F_2(1,207) = 5.48$, $p < 0.05$, $\eta_p^2 = \eta_G^2 = 0.03$, for items]. Furthermore, an advantage for derived words over inflected words was significant for both subjects and items [$F_1(1,38) = 24.98$, $p < 0.0001$, $\eta_p^2 = 0.40$, $\eta_G^2 = 0.09$; $F_2(1,207) = 17.11$, $p < 0.0001$, $\eta_p^2 = \eta_G^2 = 0.08$].
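The partial eta squared values reported alongside these tests can be recovered from the F statistic and its degrees of freedom, since $\eta_p^2 = F \cdot df_{effect} / (F \cdot df_{effect} + df_{error})$. The short sketch below applies this standard identity to four of the values above; the function name is hypothetical.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


# (F, df_effect, df_error) triples copied from the Experiment 3 results.
tests = {
    "Word type, subjects": (30.83, 2, 38),
    "Word type, items": (21.52, 2, 207),
    "mono vs. inflected, subjects": (59.98, 1, 38),
    "derived vs. inflected, subjects": (24.98, 1, 38),
}

for label, (f_stat, df_effect, df_error) in tests.items():
    print(label, round(partial_eta_squared(f_stat, df_effect, df_error), 2))
```

The computed values (0.62, 0.17, 0.61, and 0.40) match the reported partial eta squared values. Generalized eta squared cannot be recovered this way because it also depends on variance components that are not reported here.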

Including surface frequency as a covariate in the item model did not change the results in any way, and it did not have a significant effect in the model ($p = 0.2897$). Lemma frequency as a covariate was significant, $F_2(1,206) = 5.480$, $p < 0.05$, but it did not change the other effects. When all correctly recalled roots/stems were analyzed ignoring morphological errors, a main effect of Word type remained, $F_1(2,38) = 9.35$, $p < 0.005$, $\eta_p^2 = 0.33$, $\eta_G^2 = 0.06$. In planned contrasts, significantly superior recall was found for the two uninflected word types compared to words encountered in inflected form [$F_1(1,38) = 18.59$, $p < 0.0005$, $\eta_p^2 = 0.32$, $\eta_G^2 = 0.06$, between monomorphemic and inflected words; $F_1(1,38) = 5.97$, $p < 0.05$, $\eta_p^2 = 0.14$, $\eta_G^2 = 0.02$, between derived and inflected words]. The advantage for monomorphemic compared to derived words also approached significance, $F_1(1,38) = 3.49$, $p = 0.0696$, $\eta_p^2 = 0.08$, $\eta_G^2 = 0.01$.

### *Comparison between experiments*

To see whether there were significant interactions between memory task and type of word we entered the results of all three experiments in one ANOVA model with Experiment (Visual Words vs. Last Words vs. Independent Words) as a between-subjects factor and Word type as a within-subjects variable. Only the results in the visual condition of Experiment 1 were used, as there had been a modality effect in this experiment and presentation in the two other experiments was visual. As with subjects, we also carried out an analysis by items including the data from Experiment 1 (Visual Words task), Experiment 2 (Last Words task), and Experiment 3 (Independent Words task). The dependent variable was the number of subjects that had recalled an item, with Word type as between-items variable and Experiment as within-items variable. Both analyses showed a significant main effect of Word type [$F_1(2,114) = 88.74$, $p < 0.0001$, $\eta_p^2 = 0.61$, $\eta_G^2 = 0.19$; $F_2(2,207) = 18.55$, $p < 0.0001$, $\eta_p^2 = 0.15$, $\eta_G^2 = 0.10$]. Monomorphemic words were better remembered than inflected words [$F_1(1,114) = 174.61$, $p < 0.0001$, $\eta_p^2 = 0.61$, $\eta_G^2 = 0.19$; $F_2(1,207) = 36.47$, $p < 0.0001$, $\eta_p^2 = \eta_G^2 = 0.15$] or derived words [$F_1(1,114) = 26.41$, $p < 0.0001$, $\eta_p^2 = 0.19$, $\eta_G^2 = 0.03$; $F_2(1,207) = 5.46$, $p < 0.05$, $\eta_p^2 = \eta_G^2 = 0.03$], and performance on derived words was better than on inflected words [$F_1(1,114) = 65.21$, $p < 0.0001$, $\eta_p^2 = 0.36$, $\eta_G^2 = 0.08$; $F_2(1,207) = 13.70$, $p < 0.0005$, $\eta_p^2 = \eta_G^2 = 0.06$]. The main effect of Experiment did not reach significance in the analysis by participants [$F_1(2,57) = 1.68$, $p = 0.1965$, $\eta_p^2 = \eta_G^2 = 0.05$] although it did in the item analysis [$F_2(2,414) = 12.56$, $p < 0.0001$, $\eta_p^2 = 0.06$, $\eta_G^2 = 0.02$]. The analysis by participants also revealed a significant interaction between Experiment and Word type [$F_1(4,114) = 2.81$, $p < 0.05$, $\eta_p^2 = 0.09$, $\eta_G^2 = 0.01$], although an interaction between these factors did not reach significance in the analysis by items [$F_2(4,414) = 1.79$, $p = 0.1307$, $\eta_p^2 = 0.02$, $\eta_G^2 = 0.01$]. The interaction appears to stem from the fact that although recall for the three word types differed from each other in all three experiments, the main split in the Visual Words task was between inflected and uninflected words, whereas it was between morphologically simple and complex words in the Last Words task. Results in the Independent Words task fell somewhere in between. The interaction is further investigated below.

One difference between the Last Words task and the two other tasks was that the words to be remembered had been said aloud as the sentences had been read. It is conceivable that an auditory trace could have helped memory for the last word in each sequence before recall. Other auditory traces can be assumed to have been masked by subsequent orally read sentences. To see if the results had been affected by this difference between tasks we reanalyzed the recall data ignoring the results for the seventh words in the seven-word sequences. The item analysis was now based on 52 monomorphemic words, 53 derived words, and 52 inflected words. The results suggest that auditory persistence may have played some role in the Last Words task. Even in the original analysis, the main effect of Experiment had not been significant in the subjects analysis ($p = 0.1965$). In the new six-word analysis there was not even a hint left of an overall difference between tasks in either analysis [$F_1(2,57) = 0.03$, $p = 0.9683$, $\eta_p^2 = \eta_G^2 = 0.001$; $F_2(2,308) = 1.17$, $p = 0.3105$, $\eta_p^2 = 0.01$, $\eta_G^2 = 0.002$], suggesting that the overall advantage for words in the Last Words task was a modality effect, limited to the last words in the sequences. Otherwise the results remained very much the same. The main effect of Word type was again significant [$F_1(2,114) = 85.55$, $p < 0.0001$, $\eta_p^2 = 0.60$, $\eta_G^2 = 0.19$; $F_2(2,154) = 14.6$, $p < 0.0001$, $\eta_p^2 = 0.16$, $\eta_G^2 = 0.12$], showing the same pattern as in the original analyses. The interaction between Experiment and Word type was now significant in both analyses [$F_1(4,114) = 2.77$, $p < 0.05$, $\eta_p^2 = 0.09$, $\eta_G^2 = 0.02$; $F_2(4,308) = 2.71$, $p < 0.05$, $\eta_p^2 = 0.03$, $\eta_G^2 = 0.01$], indicating that the three word types were differently affected by task. This effect appeared to be due to the derived words, which were less well recalled in the complex span tasks than in simple span. Lastly, we carried out an ANCOVA on the item data with surface frequency as a covariate. However, this factor was not significant and did not change the pattern of effects.
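The degrees of freedom reported for the combined analyses follow from the design, which can be verified with a short sketch. It assumes a standard two-factor mixed ANOVA with one between-units and one within-units factor, and 20 participants per experiment (the figure for Experiment 1 is inferred from the reported error degrees of freedom, not stated in this section). The function name is hypothetical.

```python
def mixed_design_df(n_units, n_between_levels, n_within_levels):
    """Degrees of freedom for a two-factor mixed ANOVA (one between, one within factor).

    n_units is the total number of sampling units (participants or items).
    """
    between_error = n_units - n_between_levels
    within = n_within_levels - 1
    return {
        "between": (n_between_levels - 1, between_error),
        "within": (within, within * between_error),
        "interaction": ((n_between_levels - 1) * within, within * between_error),
    }


# By subjects: 3 experiments x 20 participants; Experiment is between, Word type within.
subjects = mixed_design_df(n_units=60, n_between_levels=3, n_within_levels=3)
# By items, six-word reanalysis: 52 + 53 + 52 words; Word type is between, Experiment within.
items = mixed_design_df(n_units=52 + 53 + 52, n_between_levels=3, n_within_levels=3)

print(subjects)  # {'between': (2, 57), 'within': (2, 114), 'interaction': (4, 114)}
print(items)     # {'between': (2, 154), 'within': (2, 308), 'interaction': (4, 308)}
```

Note that the roles of the two factors swap between analyses: Experiment is the between factor for subjects but the within factor for items, because each word appeared in all three tasks.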

### **DISCUSSION**

Experiment 3 was carried out to determine if the smaller disadvantage for inflected words seen in Experiment 2, compared to Experiment 1, had been caused by their inclusion in sentence contexts. In Experiment 3 the words to be recalled were independent of the sentence that had to be read aloud. The results supported the hypothesis. In the analysis by subjects the pattern fell somewhere between that in Experiments 1 and 2: monomorphemic words were easiest to remember, derived words were significantly harder to remember, but the greatest gap was between the two uninflected word types and inflected words. This picture was further supported by a significant interaction between Experiment and Word type in a combined data analysis. Morphological information appears to, at least sometimes, result in a processing cost for derived words, compared to monomorphemic words, more easily detected in the more complex tasks. To control for a possibly enhanced auditory recency effect in the Last Words experiment, data were reanalyzed excluding the seventh word in each list. The new analysis removed a recall advantage in the Last Words task that seemed to affect all types of words in seventh position, interpreted to stem from a general modality effect. It is, however, also possible to think of it as an effect created by a sentence context still active at the time of recall of the last words in the last sentences within a group of seven. One detail speaking against this interpretation is that the boost in seventh-word recall seemed to be the same for all three types of word, whereas the sentence context manipulation affected different types of words in different ways. An auditory trace effect could be expected to be the same for all types of word but a sentence inclusion effect would not. Similarly, a non-semantic, i.e., purely episodic, context effect could be expected to be neutral to Word type. The interaction between Experiment and Word type in the item analysis revealed a similar pattern for the different types of words as the analysis by subjects in analyses both including and excluding the seventh words in the Last Words task, suggesting that morphological load varied from one task to another. Further interpretation of the interaction requires caution because different groups of participants were involved in the tasks and individual differences may have played a role.

### **EXPERIMENT 4**

The conclusions of an interaction between memory task and morphological word type depend so far on the combination of results from three different experiments with different participants. Furthermore, it is unclear to what extent a confound with modality of stimulus processing (listening/silent reading/oral reading) may have contributed to the interaction. Our last experiment was designed to combine simple list recall with the two complex span tasks in a single experiment to reveal if the pattern of results could be replicated. As recall had been somewhat low for the seven-word lists, we now used lists with six words. The possible effects of pronouncing words aloud in some conditions and not others were controlled by asking participants to read aloud the visually presented words in the simple list condition, read aloud the sentences including their last words in the Last Words condition, and read aloud also the additional words in the Independent Words condition. To encourage participants to process the sentences more deeply in the Last Words and Independent Words conditions, we added a task that required recognition of the gist of the sentences after each word list recall had been completed.

### **METHOD**

### *Participants*

Eighteen students (mean age = 23 years, SD = 5.4), 13 females and 5 males, took part in the experiment. They received either course credit or a cinema ticket for their participation. All participants spoke Finnish as their native language, and none had had any known problems with learning to read or spell.

### *Stimuli*

The stimulus words were identical to those in Experiments 1, 2, and 3, except that 10 words from each of the three stimulus groups (monomorphemic, inflected, and derived) were excluded. The words in the new stimulus groups were controlled for length and lemma frequency (see **Table 6**). The sentences in the Last Words and Independent Words conditions were similar to those in Experiments 2 and 3. In the Last Words condition they were 11.2, 11.07, and 10.97 words long and in the Independent Words condition 11.1, 11.1, and 11.3 words long in the monomorphemic, derived, and inflected conditions, respectively. The 60 words in each of the morphological stimulus groups were randomly assigned to lists of six items. A different random order was created for each of the three memory tasks. Thus, the same words occurred in all tasks but were randomly ordered to form different lists in each task.
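The random assignment of 60 words to 10 six-item lists, repeated independently for each task, could be implemented along the following lines. This is only an illustrative sketch: the placeholder word labels, the seeds, and the function name are assumptions, and the actual randomization procedure is not described in the text.

```python
import random


def make_lists(words, list_length, seed):
    """Shuffle the words and cut them into consecutive lists of equal length."""
    rng = random.Random(seed)
    shuffled = list(words)
    rng.shuffle(shuffled)
    return [shuffled[i:i + list_length] for i in range(0, len(shuffled), list_length)]


# Placeholder labels stand in for the 60 stimulus words of one word type.
words = [f"word{i:02d}" for i in range(60)]
tasks = ["Visual Words", "Last Words", "Independent Words"]

# An arbitrary, distinct seed per task yields an independent random order per task.
lists_by_task = {task: make_lists(words, 6, seed) for seed, task in enumerate(tasks)}

for task, lists in lists_by_task.items():
    print(task, len(lists), "lists of", len(lists[0]), "words")
```

Every task thus draws on the same 60 words but groups them into different lists, so list composition is not confounded with memory task.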

### *Procedure*

*In the Visual Words condition* the stimuli appeared at the center of a computer screen in Monaco 36-point font at a rate of one item per 1250 ms. The participants were instructed to read aloud each word. At the end of each six-word list, they had to orally recall as many words as they could in their presented form and order. All correctly recalled items were scored irrespective of output order for the analyses reported here. The equipment used was identical to that in Experiments 1, 2, and 3.

*The Last Words condition* was similar to Experiment 2, except that now each trial included only six sentences and the last word of each sentence was always written in capital letters. Furthermore, a sentence recognition task was presented after the recall of each six-word list. In the recognition task, one of the six sentences just read was shown in its original or an altered form. The participant had to say whether the sentence had been changed or not from one in the list of six. Half of the probe sentences had been altered by replacing one word (never the last one) with a word that changed the meaning of the sentence (*When I go to a familiar barber I always get a little reduction from the normal price/When I go to a familiar dentist I always get a little reduction from the normal price*). To further emphasize the importance of deeper processing of the sentences, the experimenter gave feedback after each recognition trial.

**Table 6 | Frequency per million words and word length mean, standard deviation in parentheses, and range for the word stimuli in Experiment 4.**

*The Independent Words condition* was conducted as in Experiment 3. However, unlike before, the unrelated word was written in capital letters and presented on the screen together with, rather than after, the sentence. The participants were instructed to read aloud both the sentence and the word-to-be-remembered. When the participant finished reading, the experimenter pressed a button to proceed to the next sentence–word pair, which followed after a 2009-ms blank screen. After each list recall, a recognition probe similar to the one in the Last Words condition was presented.

The order of the blocks with the three types of words, as well as the presentation order of the three tasks, was counterbalanced between participants. To keep the testing time reasonable, the whole experiment was divided into two parts, so that the shorter Visual Words condition was always run together with either one of the two longer conditions. At least 1 week intervened between the two testing sessions. The scoring procedures were the same as previously.
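One simple way to counterbalance the order of the three word-type blocks across 18 participants is to cycle through all six possible orders. This is a sketch of that idea only; it is not necessarily the scheme the authors used, and it does not cover the additional counterbalancing of task order or the two-session constraint.

```python
from collections import Counter
from itertools import permutations

word_types = ["monomorphemic", "inflected", "derived"]
orders = list(permutations(word_types))  # the 6 possible block orders


def block_order(participant_index):
    """Cycle through the six orders so that each is used equally often."""
    return orders[participant_index % len(orders)]


assignment = [block_order(i) for i in range(18)]
usage = Counter(assignment)
print(len(orders), set(usage.values()))  # 6 {3}
```

With 18 participants each of the six orders is used exactly three times, so every word type appears equally often in each block position.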

### **RESULTS**

The number of recalled words per list can be seen in **Figure 2**. Descriptive statistics are shown in **Table 7**. The results with participants as random variable were subjected to a 3 × 3 repeated-measures ANOVA with Word type and Memory task as within-subjects variables. The dependent variable was the number of words recalled irrespective of their order across all lists in the experiment. Conservative Greenhouse–Geisser corrected degrees of freedom were used when appropriate. The dependent variable in the 3 × 3 item analysis with Word type as a between-items variable and Memory task as a within-items variable was the number of subjects that had recalled the word. The dependent variables in both analyses were normally distributed (Kolmogorov–Smirnov test). The main effect of Word type was again significant in both analyses [$F_1(2,34) = 37.20$, $p < 0.0001$, $\eta_p^2 = 0.69$, $\eta_G^2 = 0.14$; $F_2(2,177) = 14.31$, $p < 0.0001$, $\eta_p^2 = 0.14$, $\eta_G^2 = 0.09$]. Planned contrasts showed that monomorphemic words were better recalled than inflected [$F_1(1,34) = 74.39$, $p < 0.0001$, $\eta_p^2 = 0.69$, $\eta_G^2 = 0.21$; $F_2(1,177) = 28.44$, $p < 0.0001$, $\eta_p^2 = \eta_G^2 = 0.14$] and derived words [$F_1(1,34) = 19.59$, $p < 0.0005$, $\eta_p^2 = 0.37$, $\eta_G^2 = 0.07$; $F_2(1,177) = 5.27$, $p < 0.05$, $\eta_p^2 = \eta_G^2 = 0.03$].

**forms in Visual Words, Last Words, and Independent Words tasks in Experiment 4.**

The effect of Memory task was not significant in the analysis by participants, *F*<sub>1</sub>(2,34) = 1.05, *p* = 0.3493, η<sup>2</sup><sub>p</sub> = 0.06, η<sup>2</sup><sub>G</sub> = 0.02, although it was in the item analysis, *F*<sub>2</sub>(2,354) = 6.31, *p* < 0.005, η<sup>2</sup><sub>p</sub> = 0.03, η<sup>2</sup><sub>G</sub> = 0.01, reflecting the fact that words were recalled by a greater number of participants in the Visual Words task compared to the Last Words and Independent Words tasks. Most importantly, there was again an interaction between Word type and Memory task in both analyses [*F*<sub>1</sub>(4,68) = 4.37, *p* < 0.01, η<sup>2</sup><sub>p</sub> = 0.20, η<sup>2</sup><sub>G</sub> = 0.03; *F*<sub>2</sub>(4,354) = 5.19, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.06, η<sup>2</sup><sub>G</sub> = 0.02]. The interaction reflected the result that the advantage for monomorphemic compared to inflected words changed little from one memory task to another, whereas the disadvantage for derived compared to monomorphemic words depended on the memory task.
Both types of morphologically complex words were harder to recall than monomorphemic words in both complex span tasks, i.e., inflected words [*F*<sub>1</sub>(1,34) = 30.83, *p* < 0.0001, η<sup>2</sup><sub>p</sub> = 0.44, η<sup>2</sup><sub>G</sub> = 0.16; *F*<sub>2</sub>(1,177) = 14.38, *p* < 0.0005, η<sup>2</sup><sub>p</sub> = η<sup>2</sup><sub>G</sub> = 0.06] and derived words [*F*<sub>1</sub>(1,34) = 10.18, *p* < 0.005, η<sup>2</sup><sub>p</sub> = 0.20, η<sup>2</sup><sub>G</sub> = 0.06; *F*<sub>2</sub>(1,177) = 3.813, *p* = 0.0524, η<sup>2</sup><sub>p</sub> = η<sup>2</sup><sub>G</sub> = 0.02, approaching significance] in the Last Words task as well as inflected [*F*<sub>1</sub>(1,34) = 41.24, *p* < 0.0001, η<sup>2</sup><sub>p</sub> = 0.49, η<sup>2</sup><sub>G</sub> = 0.21; *F*<sub>2</sub>(1,177) = 31.99, *p* < 0.0001, η<sup>2</sup><sub>p</sub> = η<sup>2</sup><sub>G</sub> = 0.15] and derived [*F*<sub>1</sub>(1,34) = 28.57, *p* < 0.0001, η<sup>2</sup><sub>p</sub> = 0.40, η<sup>2</sup><sub>G</sub> = 0.15; *F*<sub>2</sub>(1,177) = 19.02, *p* < 0.0001, η<sup>2</sup><sub>p</sub> = η<sup>2</sup><sub>G</sub> = 0.10] words in the Independent Words task.
In contrast, only the difference between monomorphemic words and inflected words [*F*<sub>1</sub>(1,34) = 13.60, *p* < 0.0008, η<sup>2</sup><sub>p</sub> = 0.38, η<sup>2</sup><sub>G</sub> = 0.07; *F*<sub>2</sub>(1,177) = 9.98, *p* < 0.005, η<sup>2</sup><sub>p</sub> = η<sup>2</sup><sub>G</sub> = 0.05] was significant in the Visual Words simple span task, whereas the difference between monomorphemic and derived words did not even approach significance [*F*<sub>1</sub> and *F*<sub>2</sub> < 1], the means for derived words being, in fact, a little higher. This pattern replicates the one seen across experiments above, in which inflected forms were less well recalled than uninflected forms (monomorphemic and derived) in simple span whereas both complex forms were more poorly recalled than monomorphemic words in complex span tasks. However, in this within-subjects experiment, which stressed comprehension, effects in the Last Words and Independent Words tasks were in the same direction for inflected and derived words.

Two variables not formally controlled in our tasks were the imageability of the items and the number of orthographic neighbors they have. We asked 55 students at the Faculty of Behavioural Sciences at the University of Helsinki to rate the imageability of all 210 items used in the experiments on a 7-point scale (1 = hard to generate an image; 7 = easy to image). Most of the items fell


**Table 7 | Recall performance: mean number of words recalled by participants and mean number of participants recalling items in Experiments 2 and 3; standard deviations are in parentheses.**

into the middle range (see **Tables 1** and **6**). The values were a little lower for the inflected items. However, using imageability rating means as a covariate in an item ANCOVA of the recall data from Experiments 1–3 did not affect the main effect of word type or the interaction results between task and word type. The main effect of task was no longer significant. Imageability correlated significantly with recall in all three tasks [*r*s(208) = 0.20, 0.21, and 0.28, *p*s < 0.005, for Visual Words, Last Words, and Independent Words, respectively]. The number of orthographic neighbors (this is almost identical to phonological neighbors in a near-perfectly transparent orthography) was checked using the online dictionary by the Institute of the Languages of Finland and Kielikone Oy at http://www.kielitoimistonsanakirja.fi/netmot.exe?motportal=80 [accessed November 11, 2014]. The orthographic neighbor count for monomorphemic and derived words did not significantly differ (*M* = 1.41, SD = 2.12 for monomorphemic and *M* = 1.94, SD = 2.30 for derived words; *F*(1,138) = 2.00, *p* = 0.1596). Larger neighborhoods have been found to boost immediate recall (Roodenrys et al., 2002). However, in our data, all correlations between orthographic neighborhood and recall were close to zero (*r*s between −0.13 and 0.03). We repeated these analyses for Experiment 4 but found again that the effects of word type and interaction between word type and task remained as in the original analysis.
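The text does not spell out how neighbors were counted. A common operationalization, assumed here purely for illustration, treats two words as orthographic neighbors when they have the same length and differ in exactly one letter position. The lexicon below is a toy list, not the dictionary consulted in the study, and the function names are hypothetical.

```python
def is_neighbor(word: str, other: str) -> bool:
    """True if the words have equal length and differ in exactly one position."""
    if len(word) != len(other):
        return False
    return sum(a != b for a, b in zip(word, other)) == 1


def neighbor_count(word: str, lexicon: list[str]) -> int:
    """Count distinct lexicon entries that are one-letter-substitution neighbors."""
    return sum(is_neighbor(word, entry) for entry in set(lexicon))


# Toy lexicon for illustration only (an assumption, not the study's dictionary).
toy_lexicon = ["kala", "pala", "kana", "kalat", "talo"]
print(neighbor_count("kala", toy_lexicon))
```

Under this definition the target word itself and words of a different length are never counted.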

### **DISCUSSION**

Experiment 4 was carried out to see if the interaction between memory task and word type found in an analysis over Experiments 1–3 could be replicated in a single within-subjects experiment. The main pattern of results was replicated showing relatively poorer recall for derived words in complex than simple span. One subtle difference was that the results of the Independent Words task in Experiment 4 now looked more like those of the Last Words task. One reason for this may have been the two small methodological changes that had been made to the task. As the word to be memorized was presented on the same screen as the sentence, and semantic processing of the sentence was encouraged by the gist recognition task, the memory words probably became harder to isolate from the distracting sentences. This would have made the task less like a simple span task and increased the necessity to allocate attention to creating good search structures in LTM for later recall (Unsworth and Engle, 2007).

This process may have been more demanding for items containing more morphological information than for monomorphemic words. Thus, the results of Experiment 4 strengthen the conclusion that morphological information creates different challenges for immediate serial recall and complex span tasks. The differences appear to relate to the fact that inflectional suffixes are highly activated in simple span and ready to recombine with different stems. There are fewer affordances for derivational suffixes to recombine in STM. However, the complex span tasks reveal that derivational affixes add to the information that has to be organized for later selective recall from an activated part of LTM.

# **GENERAL DISCUSSION**

We examined the effects of morphological complexity on recall in four different WM tasks: auditory and visual serial recall, complex span with recall of last words of read sentences and complex span with recall of independent words. All four tasks revealed robust effects of morphological word type. These effects showed that monomorphemic, derived, and inflected words are all processed somewhat differently in WM. Thus, there must be differences in the representations of all three word types in the mental lexicons of Finnish speakers. If we assume separate input and output lexicons, these differences could lie in both the input and the output lexicon, as suggested by the revised version of the SAID model (Laine, 1996), which proposed decomposed representations for both inflected and derived words in both lexicons. The evidence for the revised model was found in an experiment with pseudoroots and derivational suffixes (Laine, 1996). However, more recent work suggests that derived words with salient suffixes with no or few allomorphic variants also show decomposition effects in input processing (Järvikivi et al., 2006). In the present experiment, the majority of derivational suffixes have many allomorphs, biasing the stimulus material against detecting morphological load effects for derived words. The suggestion of derivational decomposition in the output lexicon derives from studies of a Finnish aphasic (Laine et al., 1995). The present study revealed differences between monomorphemic and derived words in basic form when unimpaired participants were tested with a good-sized sample of real Finnish words. Furthermore, none of the examined word characteristics explained our findings. It is, of course, possible that some other systematic difference, such as familiarity or emotional

valence, in the three sets of words accounts for the pattern of results. This is for future studies to explore further with specific hypotheses in mind.

Recent work suggests that the original SAID based on form representations was too simple. Various complicated effects found in later work are better modeled by assuming two levels with both form and more abstract syntactic-semantic representations of stems/roots and affixes (cf., Järvikivi and Pyykkönen, 2011). Such models have been suggested by Schreuder and Baayen (1995) and Diependaele et al. (2005, 2009). Recent brain imaging studies have also provided further evidence by highlighting the dynamic character of word processing. Several studies (Lehtonen et al., 2007; Vartiainen et al., 2009; Leminen et al., 2012) of reading or listening to Finnish inflected words suggested that processing costs are incurred at a relatively late, presumably syntactic-semantic rather than orthographic/phonological, stage, and that they require attention. It seems reasonable to assume that morphological information present in the language must also be represented in the human language system. However, this information may play different roles in different tasks. The present studies have revealed differences in morphological load effects of derivational affixes and inflectional affixes when word forms are held in an ordered structure in the focus of attention in STM. Here, Finnish inflectional suffixes appear to compete whereas derivational suffixes are supported by the roots they are attached to. When the task is to find an ordered word set from an activated part of LTM, as is required in verbal complex span tasks, morphological information related to both derivational roots and affixes may be separately activated, leading to competition between morphological neighbors and opportunities for recombination of roots and affixes. It is also possible that lingering activation of morphological information from earlier trials affects recall.

From a memory point of view, the results revealed differential sensitivity to morphological load of complex span compared to simple span tasks. Based on further work in our lab (not reported here) we suspect this may have resulted from the particular implementation of the complex span tasks in the present study. In our versions, a 2-s inter-stimulus interval followed the word that had to be memorized before the next sentence was presented for reading. This was inserted because pilot studies suggested participants tried to rehearse between words while reading the sentences aloud. We wanted to concentrate rehearsal at the end of the sentence for all participants. However, a consequence of this decision was that there was enough time for cumulative rehearsal, i.e., for participants to retrieve the previously memorized words from LTM and bind the newest item to the list on each trial. Instead of making the task more like simple span, relying on newly encoded phonological and morphological information, the establishment of a search set in LTM (Unsworth and Engle, 2006) could now be prioritized. Such strategic choice of refreshment strategies in complex span has been shown in other work (Camos et al., 2011). In our case, it seems to have revealed a dissociation of morphological information processing in immediate serial recall, showing larger morphological effects for inflected than derived words, on the one hand, and a task relying on repeated searches from an activated part of LTM

(Cowan, 1995) on the other, being more sensitive to morphological neighbors of derived words. For the Baddeley and Hitch (1974) WM framework, our morphological load results suggest that immediate serial recall of words relies on a combination of information from the phonological loop and other information, perhaps best represented as feature vectors as proposed by Nairne's (1990) feature model. In the most recent description of the WM framework (Baddeley, 2012), recall would then be from the episodic buffer.

### **ACKNOWLEDGMENTS**

We would like to thank Marika Liimatainen for help with collecting preliminary data and Matti Laine and Turun Sanomat for letting us use a large newspaper text corpus with coded lemma and surface form frequencies in Finnish. The research was funded by The Academy of Finland, grant 39253, Finnish Cultural Foundation and Natural Sciences and Engineering Research Council of Canada (NSERC), grant 311804-07 to Elisabet Service.

## **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnhum.2014.01064/abstract

# **REFERENCES**

Baddeley, A. D. (1986). *Working Memory*. Oxford: Clarendon Press.


Vartiainen, J., Aggujaro, S., Lehtonen, M., Hulten, A., Laine, M., and Salmelin, R. (2009). Neural dynamics of reading morphologically complex words. *Neuroimage* 47, 2064–2072. doi: 10.1016/j.neuroimage.2009.06.002

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 August 2014; accepted: 22 December 2014; published online: 15 January 2015.*

*Citation: Service E and Maury S (2015) Differential recall of derived and inflected word forms in working memory: examining the role of morphological information in simple and complex working memory tasks. Front. Hum. Neurosci. 8:1064. doi: 10.3389/fnhum.2014.01064*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2015 Service and Maury. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Decomposability and mental representation of French verbs

# *Gustavo L. Estivalet 1,2\* and Fanny E. Meunier 1,2*

*<sup>1</sup> Centre National de la Recherche Scientifique UMR5304, Laboratoire sur le Langage, le Cerveau et la Cognition, Lyon, France <sup>2</sup> Université de Lyon, Université Claude Bernard Lyon 1, Lyon, France*

### *Edited by:*

*Mirjana Bozic, University of Cambridge, UK*

### *Reviewed by:*

*Marcus Taft, University of New South Wales, Australia João Veríssimo, University of Potsdam, Germany*

### *\*Correspondence:*

*Gustavo L. Estivalet, Laboratoire sur le Langage, le Cerveau et la Cognition, Institut de Sciences Cognitives, 67 Boulevard Pinel, 69675 – Bron CEDEX, France e-mail: gustavo.estivalet@isc.cnrs.fr*

In French, regardless of stem regularity, inflectional verbal suffixes are extremely regular and paradigmatic. Considering the complexity of the French verbal system, we argue that all French verbs are polymorphemic forms that are decomposed during visual recognition independently of their stem regularity. We conducted a behavioral experiment in which we manipulated the surface and cumulative frequencies of verbal inflected forms and asked participants to perform a visual lexical decision task. We tested four types of verbs with respect to their stem variants: a. fully regular (*parler* "to speak," *[parl-]*); b. phonological change e/E verbs with orthographic markers (*répéter* "to repeat," *[répét-]* and *[répèt-]*); c. phonological change o/O verbs without orthographic markers (*adorer* "to adore," *[ador-]* and *[adOr-]*); and d. idiosyncratic (*boire* "to drink," *[boi-]* and *[buv-]*). For each type of verb, we contrasted four conditions: forms with high and low surface frequencies and forms with high and low cumulative frequencies. Our results showed a significant cumulative frequency effect for the fully regular and idiosyncratic verbs, indicating that different stems within idiosyncratic verbs (such as *[boi-]* and *[buv-]*) have distinct representations in the mental lexicon, just as different fully regular verbs do. For the phonological change verbs, we found a significant cumulative frequency effect only when considering the two forms of the stem together (*[répét-]* and *[répèt-]*), suggesting that they share a single abstract and underspecified phonological representation. Our results also revealed a significant surface frequency effect for all types of verbs, which may reflect the recombination of the stem lexical representation with the functional information of the suffixes.
Overall, these results indicate that all inflected verbal forms in French are decomposed during visual recognition and that this process could be due to the regularities of the French inflectional verbal suffixes.

**Keywords: morphology, regularity, decomposition, lexical access, frequency effects, verb inflection**

# **INTRODUCTION**

The surface frequency effect, which reflects differences in word recognition as a function of form frequency, is one of the most reliable phenomena described in the psycholinguistic field in the last 35 years (Taft and Forster, 1975; Taft, 1979, 2004; Burani et al., 1984; Meunier and Segui, 1999b; Domínguez et al., 2000). Polymorphemic words, in addition to their surface frequency, are characterized by their cumulative frequency (also called lemma frequency), which is defined as the sum of the frequencies of all affixed words that carry that stem (e.g., for the stem *[parl-]*, the sum of the surface frequency of *parlons* "we speak" + the surface frequency of *parlez* "you speak" + the surface frequency of *parlent* "they speak," etc.). Therefore, word and morpheme frequencies are directly related to the time spent for word recognition, with more frequent words being recognized faster than less frequent ones (Taft and Forster, 1975). The effects of the different frequencies of polymorphemic words are of great interest in the investigation of morphemic representations in the mental lexicon and morphological decomposition during word processing (Colé et al., 1989; Domínguez et al., 2000), especially in languages with rich and paradigmatic morphological systems. The cumulative frequency effect is interpreted as reflecting a decomposition process and shows the influence of the morpheme frequency in retrieval and lexical access (Taft and Forster, 1975; Taft, 2004), whereas the surface frequency effect is interpreted as reflecting either the time spent to retrieve and access a whole word in the mental lexicon (Manelis and Tharp, 1977; Butterworth, 1983) or the morphosyntactic recombination process between stem and affixes (Taft, 1979, 2004).
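The definition of cumulative frequency can be made concrete with a small sketch. The frequency counts below are invented for illustration and do not come from any corpus; the variable and function names are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical surface frequencies (occurrences per million), keyed by
# (stem, inflected form). The numbers are made up for illustration.
surface_frequency = {
    ("parl", "parlons"): 12.0,
    ("parl", "parlez"): 30.0,
    ("parl", "parlent"): 25.0,
    ("ram", "ramons"): 0.5,
    ("ram", "rament"): 1.5,
}


def cumulative_frequencies(surface: dict[tuple[str, str], float]) -> dict[str, float]:
    """Sum the surface frequencies of all inflected forms sharing a stem."""
    totals: dict[str, float] = defaultdict(float)
    for (stem, _form), frequency in surface.items():
        totals[stem] += frequency
    return dict(totals)


cumulative = cumulative_frequencies(surface_frequency)
print(cumulative["parl"])  # sum over parlons, parlez, and parlent
```

On a decomposition account, two forms matched on surface frequency would still be expected to differ in recognition time if their stems differ in this summed value.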

In this research, we investigated the mental representation of French verb stems, their allomorphy (the alternative forms of a morpheme depending on its phonological and morphological context) and verbal decomposability. Unlike the English verbal system, which is generally divided into two groups (regular and irregular verbs) with few suffixes (i.e., walk[s], walk[ed] and walk[ing]) (Stanners et al., 1979; Aronoff, 1994), the French verbal system has different degrees of stem regularity and a paradigmatic set of suffixes for tenses and agreements. Similarly to other Romance languages (Oltra-Massuet, 1999; Domínguez et al., 2000; Say and Clahsen, 2002; Veríssimo and Clahsen, 2009),


**Table 1 | Examples of the three French verbal groups conjugated in the present tense showing the stem regularity and the suffix paradigms.**

French has three groups of verbs (see **Table 1**). However, in contrast to most Romance languages, the French verbal groups are not explicitly defined as a function of the theme vowels. Moreover, French has a particular iambic prosodic system that directly influences the phonetic production of the stems and inflectional suffixes in a predictable way (Aronoff, 2012; Andreassen and Eychenne, 2013). In particular, the pronunciation of the syllables to the right of the stem produces prosodic consequences, which are reflected in phonetic production. Thus, for verbs from the first group that undergo phonological changes, the last vowel of the stem is pronounced open (/E/ and /O/) if the stem is merged with a non-pronounced suffix (e.g., *[-e], [-es], [-ent]* as in *[répèt]e /Re'pEt/* "I/he/she repeat(s)") but is pronounced close (/e/ and /o/) if the stem is merged with a suffix that has a pronounced vowel (e.g., *[-ons], [-ez], [-ai], [-i], [-er]* as in *[répét]ons /Repe'tõ/* "we repeat") (Touratier, 1996). A question that remains open is whether different phonological forms of a verb have different lexical representations or whether they share an abstract or underspecified representation.
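The alternation described above for e/E verbs can be sketched as a lookup on the suffix type. This is a toy orthographic illustration that assumes the open vowel is spelled with a grave accent (as in *répète*). It covers only the suffixes listed in the text, ignores verbs that mark the open vowel with other spellings, and is not meant as a model of French morphophonology. The names are hypothetical.

```python
# Suffix sets as listed in the text.
SILENT_SUFFIXES = {"e", "es", "ent"}
PRONOUNCED_SUFFIXES = {"ons", "ez", "ai", "i", "er"}


def inflect_e_verb(stem: str, suffix: str) -> str:
    """Attach a suffix to an e/E stem, opening the last stem vowel before a silent suffix."""
    if suffix in PRONOUNCED_SUFFIXES:
        # Suffix carries a pronounced vowel: the stem keeps its close vowel, spelled <é>.
        return stem + suffix
    if suffix in SILENT_SUFFIXES:
        # Non-pronounced suffix: the last <é> of the stem surfaces as open <è>.
        position = stem.rfind("é")
        if position == -1:
            raise ValueError(f"stem {stem!r} has no <é> to alternate")
        open_stem = stem[:position] + "è" + stem[position + 1:]
        return open_stem + suffix
    raise ValueError(f"unknown suffix: {suffix!r}")


print(inflect_e_verb("répét", "ons"))
print(inflect_e_verb("répét", "e"))
print(inflect_e_verb("céd", "es"))
```

The sketch encodes the surface rule only. Whether speakers store both stem forms or a single underspecified one is exactly the empirical question raised in the text.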

The first verbal group in French is regular concerning its conjugations and is characterized by the infinitive ending *[-er]*. The second group is also regular and is characterized by the infinitive ending *[-ir]* associated with the realization of the morpheme *[-ss-]* before suffixes beginning with vowels. The third group comprises irregular verbs, including verbs with different infinitive endings (e.g., *[-dre]*, *[-ire]*, *[-oir]*, etc.) and a different number of stems per verb (Kilani-Schoch and Dressler, 2005; Aronoff, 2012). Therefore, the first group has verbs with just one stem, such as *ramer* "to paddle." The only modification that is observed within a sub-group of stems is a phonologically predictable alternation in the stem (these verbs can also be called morpho-phonological verbs), such as the verb *céder* "to cede" (e.g., *[cèd]es /'sEd/* "you cede," *[céd]ons /se'dõ/* "we cede") (Halle and Idsardi, 1996; Andreassen and Eychenne, 2013). Stems from the second group are always the same in the full inflectional system (e.g., *[fini]r* "to finish"). Finally, the third group includes verbs with just one stem, such as *rendre* "to render," verbs with small changes in the stem, such as *écrire* "to write" (e.g., *[écri]t* "he/she writes," *[écriv]ons* "we write"), and verbs with idiosyncratic stem allomorphs, such as *devoir* "must" (e.g., *[doi]s* "I/you must," *[dev]ons* "we must," *[doiv]ent* "they must") (Touratier, 1996). Unlike stems that carry the lexical meaning, the morphosyntactic inflectional system of tense and agreement suffixes in French is extremely paradigmatic and can be easily detached from the stem to which it is merged (Meunier and Marslen-Wilson, 2004). Thus, all verbal inflected forms in French can be decomposed based on their regular and salient inflectional system of suffixes, and this evident morphosyntactic decomposition may determine the mental representation of verbal stems.

The first objective of the current work was to determine whether the systematic French verbal inflectional system underlies the morphological decomposition of all forms on visual recognition (Rastle and Davis, 2008) or whether inflected verbs can be accessed as whole words. The second objective was to investigate how stems are represented in the mental lexicon as a function of their regularity (Bybee, 1995). For this purpose, participants performed a visual lexical decision task on French inflected verbs. We manipulated the surface and cumulative frequencies for four types of stem variants: a. fully regular verbs from the first group (*parler* "to speak," one form *[parl-]*); b. phonological change e/E verbs from the first group with orthographic markers (*répéter* "to repeat," two forms *[répét-] /repet-/* and *[répèt-] /repEt/*); c. phonological change o/O verbs from the first group without orthographic markers (*adorer* "to adore," two forms *[ador-] /ador-/* and */adOr-/*); and d. idiosyncratic verbs from the third group with different stems (*boire* "to drink," two forms *[boi-]* and *[buv-]*). We tested two different phonological change verbs (i.e., with and without orthographic markers) because the orthographic markers can be a strong hint for phonetic realization (Kilani-Schoch and Dressler, 2005) in visual stimulation, yielding different results (Seidenberg, 1992; Rastle and Davis, 2008).

To explain the word-recognition process, different models have been suggested to account for morphological processing in lexical access. The first type of model proposes an obligatory decomposition process for polymorphemic words upon lexical retrieval and recognition (Halle, 1973; Taft and Forster, 1975; Taft, 1979; Halle and Marantz, 1993; Marantz, 2013) in which the components of polymorphemic words are represented at the form and morphemic levels. The meaning of the whole word form is retrieved when the lexical information of the stem is combined with the morphosyntactic information of the affixes. The second type of model proposes an exclusively associative whole-word lexical access (Manelis and Tharp, 1977; Butterworth, 1983). This type of model includes the connectionist model, with its different variations (Rumelhart and McClelland, 1986; Seidenberg, 1992; Baayen et al., 2011), basically suggesting that morphology emerges from the overlap between meaning, phonology and orthography. The third model type aggregates both decompositional and associative lexical access to propose a dual-route model (Caramazza et al., 1988; Baayen et al., 1997; Clahsen, 1999, 2006; Pinker, 1999).

The dual-route models, such as the Augmented Address Model (AAM) (Burani et al., 1984; Caramazza et al., 1988), the Race Model (RM) (Baayen et al., 1997; Schreuder and Baayen, 1997), and the Words and Rules model (W&R) (Pinker, 1999; Pinker and Ullman, 2002), have been supported by a significant amount of research in different languages in the past few years, with different specifications for each of their versions. However, more specifically for our study, the Minimalist Morphology model (MM) (Wunderlich, 1996) uses the morpheme-based assumption, highlighting the computational route by proposing that regular inflected forms are established by merging constant lexical entries and affixes and that irregular inflected forms are represented by subnodes of lexical entries containing variables (Clahsen, 1999, 2006). Empirical research has been conducted to better understand the general principles of word recognition, including specific morphological parameters that drive the morphological processing and representation in different languages (Beard, 1995). These examinations in verbal inflection have been conducted in English with the now-famous English past tense debate (Stanners et al., 1979; Rumelhart and McClelland, 1986; Marslen-Wilson and Tyler, 1998; Pinker, 1999; McClelland and Patterson, 2002; Pinker and Ullman, 2002; Fruchter et al., 2013), German (Clahsen, 1999), Dutch (Baayen et al., 1997; Schreuder and Baayen, 1997), and Finnish (Leinonen et al., 2008). Romance languages have also been investigated, including Spanish (Domínguez et al., 2000), Catalan (Oltra-Massuet, 1999), Portuguese (Sicuro Corrêa et al., 2004; Veríssimo and Clahsen, 2009), Italian (Burani et al., 1984; Caramazza et al., 1988; Orsolini and Marslen-Wilson, 1997; Say and Clahsen, 2002), and French (Meunier and Marslen-Wilson, 2004; Meunier et al., 2008, 2009).

Altogether, the literature clearly shows that morphological processing has a fundamental role in lexical access, especially in inflected polymorphemic words in which the computational system and the mental lexicon interact for word recognition (Halle, 1973; Colé et al., 1997; Marslen-Wilson and Tyler, 1998; Clahsen, 1999). Concerning verbal form identification, findings in English, Dutch, and German are clear, with multiple sources of evidence in favor of a lexical associative process for irregular words and a rule-based process for regular ones. These findings suggest that regular inflected words are completely combinatorial, whereas irregular inflected words are internally structured and represented in the mental lexicon (Wunderlich, 1996; Baayen et al., 1997; Pinker, 1999). However, based on a facilitatory priming effect for irregular pairs such as *fell—fall* in masked priming, Crepaldi et al. (2010) recently challenged the idea of an exclusively semantic relationship between the irregularly inflected forms and their base forms (see also Forster et al., 1987). These authors proposed a shared representation that underlies both forms at the lemma level where inflected words share their representation irrespective of orthographic regularity (McCormick et al., 2008; Crepaldi et al., 2010). The results observed within Romance languages with a richer verbal morphology are somewhat more puzzling than these results in Germanic languages. For example, using a cross-modal priming paradigm in Italian, Orsolini and Marslen-Wilson (1997) did not report any difference between effects observed for regular (e.g., *amarono*—*amare*, "they loved"—"to love") and irregular sub-class (e.g., *presero*—*prendere*, "they took"—"to take") verbs (but see Say and Clahsen, 2002). In contrast, findings in Portuguese have supported dual-route models, differentiating the lexicon and computational systems (Sicuro Corrêa et al., 2004; Veríssimo and Clahsen, 2009).
These language-specific differences may reflect cross-linguistic specificities that are broadly noted in the morphological components (Beard, 1995; Chomsky, 1995; Marslen-Wilson, 2007).

Very few studies have assessed French inflectional categories to understand their lexical representation, access, and processing. Meunier and Marslen-Wilson (2004) used cross-modal and masked priming paradigms and showed that French inflected verbal forms present a facilitatory priming effect independently of their degree of stem regularity and allomorphy. In the cross-modal priming experiment, the priming effects were on the order of 51 ms for all types of verbs. In the masked priming experiment, significant priming effects varied from 16 ms up to 32 ms, depending on specific conditions. The authors concluded that morphologically related primes in French significantly facilitated response times (RTs) for all types of verbs, suggesting that decomposition takes place regardless of stem regularity. However, the variability of the effects observed in the masked priming experiment may suggest a more complex picture because the stem included in a prime such as *buvais* "I/you drank" overlaps minimally with the target *boire* "to drink." Thus, if *[buv]ais* is decomposed, the remaining stem *[buv-]* does not overlap with the target stem *[boi-]*, as it does in the case of fully regular verbs (e.g., *[pass]ais* - *[pass]er* "I/you passed"—"to pass"). Therefore, the priming effects for idiosyncratic verbs, much like the system for their stem representation in the mental lexicon, remain open to question.

The use of priming techniques may cause specific experimental effects due to form-related processing that overlaps between prime and target (Allen and Badecker, 2002). One effective method to test verbal form decomposition is to measure the influence of the surface and cumulative frequencies on RT modulation (Taft and Forster, 1975; Taft, 1979, 2004; Burani et al., 1984; Colé et al., 1989, 1997; Schreuder and Baayen, 1995, 1997; Baayen et al., 1997; Meunier and Segui, 1999a; Domínguez et al., 2000). Therefore, we conducted a visual lexical decision task experiment in which we manipulated the cumulative and surface frequencies of verbs that differed in stem regularity.

In a seminal work on English word recognition, Stanners et al. (1979) showed that words matched in surface frequency have RTs modulated as a function of the cumulative frequency, with more frequent stems being recognized faster than less frequent ones. In Dutch, Schreuder and Baayen (1997) found the same type of results between high and low cumulative frequency words matched in the singular form in medium surface frequency. In a frequency study investigating Italian inflected verbs, Burani et al. (1984) obtained a significant difference between words with high and low cumulative frequencies matched in low surface frequency. Therefore, verbal inflection processing may be strongly related to cumulative frequency given its influence in the morphemic representation (Aronoff, 1994).

**Table 2 | Examples of experimental items according to verb type and frequency conditions.**

In French, as in other Romance languages, the right side of a verb has verbal suffixes that are paradigmatic realizations of morphosyntactic features of tense and agreement. The left side of the verb has a stem containing the root, which provides lexical information (Halle and Marantz, 1993; Kilani-Schoch and Dressler, 2005; Aronoff, 2012). In our experimental paradigm, we tested four verb types. (a) Fully regular verbs from the first group have just one stem representation in the mental lexicon, which can be merged with the complete inflectional paradigm (Bybee, 1995). Thus, our hypothesis is that these verbs are decomposed prior to lexical access, yielding a cumulative frequency effect between the forms of two regular verbs matched on their surface frequencies but with different cumulative frequencies. (b) Phonological change e/E verbs with orthographic markers are verbs from the first group but with two different predictable phonetic outcomes for the last ⟨e⟩ of the stem, depending on the suffix with which the stem is merged (e.g., *[mèn]es* "you lead," *[men]ons* "we lead"). They have an orthographic marker associated with the open phonetic production (i.e., ⟨è⟩, ⟨\_ll⟩ or ⟨\_tt⟩). (c) Phonological change o/O verbs without orthographic markers are verbs from the first group that present a predictable phonetic alternation in the last ⟨o⟩ of the stem but without any orthographic marker (e.g., *[dévOr]es* "you devour," *[dévor]ons* "we devour") (Kilani-Schoch and Dressler, 2005; Andreassen and Eychenne, 2013). For these two verb types, the question is whether French speakers have two different phonetic representations of the stem in their mental lexicon or one phonological abstract underspecified representation of the stem that receives its phonetic form only in the spell-out of the word (Halle and Marantz, 1993; Marslen-Wilson and Zhou, 1999; Embick, 2013). This point was tested by contrasting the cumulative frequencies of different phonetic stem alternations. Finally, (d) idiosyncratic verbs from the third group have two or more unpredictable stem allomorphs to which the suffixes are merged (e.g., *[peu]t* "he/she can," *[pouv]ons* "we can," *[pu]* "could," *[puiss]e* "I/he/she can"). Although previous results from Meunier and Marslen-Wilson (2004) suggested that these verbal forms are processed as fully regular ones, contrasting the cumulative frequencies of the different stems will allow us to test whether these idiosyncratic verbs have two or more different stem representations in the mental lexicon.

# **MATERIALS AND METHODS**

### **PARTICIPANTS**

Thirty-two adult native speakers of French between the ages of 18 and 32 (mean age: 20.31, 16 females) took part in this experiment as volunteers. All of the participants were right-handed, had normal hearing, normal or corrected-to-normal vision, no history of any cognitive disorder, and were undergraduate students at the *Université Lumière Lyon 2*. The participants did not know the purpose of the research and provided written consent to take part in the experiment as volunteers.

### **MATERIALS AND DESIGN**

We asked the participants to perform a lexical decision task on visually presented items. The participants gave their responses on a computer keyboard using two hands, a right-hand button "yes" to indicate existing words and a left-hand button "no" to indicate pseudowords. All of the words were chosen from the French corpus *Lexique 3* (http://www.lexique.org/; New et al., 2004), which gives the frequency of the whole-word form (surface) and the frequency of the lemma per million words. In our study, the stem cumulative frequency was defined by summing the surface frequency of all inflected forms from each stem of interest.
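The definition of stem cumulative frequency given above can be sketched in a few lines of Python. This is only an illustration: the record layout, the function name `stem_cumulative_frequency`, and all frequency values are hypothetical and are not taken from Lexique 3.

```python
from collections import defaultdict

# Hypothetical (inflected form, stem, surface frequency) records.
# The values are invented for illustration; they are NOT Lexique 3 data.
FORMS = [
    ("buvons", "buv", 1.2),
    ("buvais", "buv", 3.4),
    ("buvez", "buv", 2.0),
    ("boirons", "boi", 0.4),
    ("boirai", "boi", 0.9),
]


def stem_cumulative_frequency(records):
    """Sum the surface frequencies of all inflected forms sharing a stem."""
    totals = defaultdict(float)
    for _form, stem, surface_freq in records:
        totals[stem] += surface_freq
    return dict(totals)


cumulative = stem_cumulative_frequency(FORMS)
```

Each stem allomorph receives its own total, so two stems of the same verb can fall into different cumulative frequency ranges.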

To observe the different effects on the RTs as a function of the whole-word form and stem frequencies, we thoroughly manipulated and matched the cumulative and surface frequencies in the high and low ranges (Taft, 1979; Burani et al., 1984; Colé et al., 1989; Meunier and Segui, 1999a) as shown in **Table 2**.

Eighty stem pairs from the four verb types researched were selected, with 20 pairs for each verb type. All of the experimental words were inflected French verbs. We avoided inflected forms from the *passé simple*, the *subjonctif imparfait* and the participles because of their morphological productivity and specificity. The four verb types investigated were as follows: a. fully regular verbs, b. phonological change e/E verbs with orthographic markers, c. phonological change o/O verbs without orthographic markers, and d. idiosyncratic verbs. For the fully regular verbs, we did not use a stem pair from the same verb because these verbs have only one stem; instead, we used two different verbs with the same surface frequency. For the phonological change verbs, we calculated the stem cumulative frequency by summing all forms of each stem's phonetic realization. For the idiosyncratic verbs, we summed all forms of each allomorphic stem. We manipulated the cumulative and surface frequencies to match the four different conditions: two conditions with high cumulative frequency and high or low surface frequencies and two conditions with low cumulative frequency and high or low surface frequencies.
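As a rough sketch of the 2 × 2 frequency design, the function below assigns an item to one of the four conditions. It assumes the cutoffs reported in the following paragraph of this section (cumulative frequency above 140 or below 80; surface frequency above 5 or below 0.5); the function name and the choice to return `None` for items outside those ranges are illustrative assumptions, not the authors' selection procedure.

```python
def frequency_condition(cumulative, surface):
    """Return (cumulative_level, surface_level), or None if the item
    falls between the high and low ranges on either dimension."""
    if cumulative > 140:
        cumulative_level = "high"
    elif cumulative < 80:
        cumulative_level = "low"
    else:
        return None  # intermediate cumulative frequency: not usable

    if surface > 5:
        surface_level = "high"
    elif surface < 0.5:
        surface_level = "low"
    else:
        return None  # intermediate surface frequency: not usable

    return (cumulative_level, surface_level)
```

Leaving a gap between the high and low ranges keeps the two levels of each factor clearly separated.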

The experimental words in all verb types and conditions were not homographic with any other existent forms in French and had between six and eleven letters, between three and nine phonemes, and between one and four syllables. The words had an orthographic neighborhood measure between one and three, as indexed by the orthographic Levenshtein distance 20 (OLD20), which is computed against all words in the lexicon, including words of different lengths (Yarkoni et al., 2008). All of the experimental words were matched in their number of letters, number of phonemes, number of syllables, and OLD20 (see **Table 3**). The high cumulative frequency condition contained words with stem cumulative frequencies greater than 140, whereas the low cumulative frequency condition contained words with stem cumulative frequencies lower than 80. The high surface frequency condition contained words with surface frequencies greater than 5, whereas the low surface frequency condition contained words with surface frequencies lower than 0.5. The complete list of experimental stimuli is available in the Supplementary Material.
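For readers unfamiliar with OLD20, the measure is the mean Levenshtein (edit) distance between a word and its 20 closest orthographic neighbors (Yarkoni et al., 2008). The sketch below shows one way to compute it; the function names and the toy lexicon are hypothetical, and a real computation would run over the full lexicon with `n=20`.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def old_n(word, lexicon, n=20):
    """Mean Levenshtein distance from `word` to its n closest neighbors."""
    distances = sorted(levenshtein(word, other)
                       for other in lexicon if other != word)
    closest = distances[:n]
    if not closest:
        raise ValueError("lexicon contains no other words")
    return sum(closest) / len(closest)


# Toy lexicon for illustration only; n is reduced because the list is tiny.
TOY_LEXICON = ["passer", "passez", "passes", "lasser", "penser"]
toy_value = old_n("passer", TOY_LEXICON, n=3)
```

Because edit distance allows insertions and deletions, neighbors of different lengths contribute to the measure, unlike in counts restricted to same-length substitution neighbors.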

A set of 320 pseudowords was added to the 320 experimental items to elicit the non-existent word response, such that the experiment had 640 stimuli in total. The pseudowords were constructed by merging a non-existent but possible stem with an existent verbal inflectional suffix in French (pseudoverbs) (e.g., ∗*[[pors]ent]*, ∗*[[[lomb]i]ons]*). Four different lists were constructed in a strict pseudo-random order to counterbalance the sequence of stimulus presentation between conditions. Each list was completed by eight participants. The lists met the following criteria: a. a stimulus was never preceded by another stimulus starting with the same letter, b. at most three words or three pseudowords were presented consecutively, c. there were at least 20 stimuli between words from the same lemma, and d. there were at least five stimuli between words/pseudowords with the same suffixes.
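The four ordering criteria can be checked mechanically. The sketch below is a hypothetical validator, not the procedure the authors used to build their lists; the `Stimulus` fields and the reading of "at least N stimuli between" as N intervening trials are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Stimulus(NamedTuple):
    text: str
    is_word: bool
    lemma: Optional[str]  # None for pseudowords
    suffix: str


def violations(sequence, max_run=3, lemma_gap=20, suffix_gap=5):
    """Return (index, reason) pairs for every breach of criteria a-d."""
    problems = []
    run_length = 0
    last_lemma_index = {}
    last_suffix_index = {}
    for i, stim in enumerate(sequence):
        previous = sequence[i - 1] if i > 0 else None
        # a. never preceded by a stimulus starting with the same letter
        if previous and previous.text[0].lower() == stim.text[0].lower():
            problems.append((i, "same initial letter"))
        # b. at most max_run consecutive words (or pseudowords)
        if previous and previous.is_word == stim.is_word:
            run_length += 1
        else:
            run_length = 1
        if run_length > max_run:
            problems.append((i, "run too long"))
        # c. at least lemma_gap stimuli between words of the same lemma
        if stim.lemma is not None:
            seen = last_lemma_index.get(stim.lemma)
            if seen is not None and i - seen - 1 < lemma_gap:
                problems.append((i, "lemma too close"))
            last_lemma_index[stim.lemma] = i
        # d. at least suffix_gap stimuli between items sharing a suffix
        seen = last_suffix_index.get(stim.suffix)
        if seen is not None and i - seen - 1 < suffix_gap:
            problems.append((i, "suffix too close"))
        last_suffix_index[stim.suffix] = i
    return problems


example = [
    Stimulus("PASSONS", True, "passer", "ons"),
    Stimulus("PORSENT", False, None, "ent"),
]
example_problems = violations(example)
```

A list generator could shuffle the stimuli repeatedly and keep the first order for which `violations` returns an empty list.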

### **PROCEDURE**

Participants were tested individually in a quiet room in the library at the *Université Lumière Lyon 2*. We used the E-Prime v2.0 Professional® (Schneider et al., 2012) software to construct the experiment as well as for stimulus presentation and data collection. Each trial followed the same sequence. First, a fixation point was displayed in the center of the screen for 500 ms while a beep sound was played. Immediately following the fixation offset, the target stimulus was displayed in the center of the 15-inch LCD screen in 18-point Courier New font in white letters against a black background. The target stimuli were presented in upper-case letters to avoid extra processing of the French accents. The RT recording started with the onset of the target stimulus presentation, which remained on the screen for 2000 ms or until the participant's response. After the target stimulus disappeared, the next trial started with the presentation of the fixation point. Participants were asked to perform a visual lexical decision task in which they decided whether the stimulus was an existent or a non-existent word (pseudoword) in each trial, pushing one of two keys as quickly and accurately as possible to indicate their choice. If the stimulus presented was an existent word, the participants were asked to push the right button; if the stimulus was a non-existent word (pseudoword), they were asked to push the left button. The experiment started with an instructional screen followed by a practice phase with eight stimuli. One break was provided in the middle of the experiment after 320 trials. The entire experiment lasted approximately 18 min.

**Table 3 | Stimulus frequencies, letters, phonemes, syllables and OLD20.**

# **RESULTS**

For the experimental words, the by-participant average RT of correct acceptance was 695 (197) ms. Incorrect responses (9.62%) were removed from further analysis. Responses faster than 400 ms or slower than 1800 ms were also discarded (0.36%). Overall, 9.94% of the responses from the original data were discarded prior to statistical analysis.
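The trimming steps described above can be sketched as follows. The function name, the trial layout, and the example trials are hypothetical; the cutoffs are the ones stated in the text, and RTs exactly equal to a cutoff are kept because the text excludes only responses faster than 400 ms or slower than 1800 ms.

```python
def trim_trials(trials, fastest=400, slowest=1800):
    """Keep correct responses within [fastest, slowest] ms.

    `trials` is a list of (rt_ms, is_correct) pairs. Returns the kept
    RTs and the proportion of trials discarded.
    """
    if not trials:
        raise ValueError("no trials to trim")
    kept = [rt for rt, is_correct in trials
            if is_correct and fastest <= rt <= slowest]
    discarded = 1 - len(kept) / len(trials)
    return kept, discarded


# Invented trials for illustration only.
EXAMPLE_TRIALS = [(350, True), (650, True), (700, False),
                  (1900, True), (800, True)]
kept_rts, discarded_share = trim_trials(EXAMPLE_TRIALS)
```

In this toy example three of five trials are dropped: one error, one anticipatory response, and one slow response.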

RTs were logarithmically transformed to normalize their distribution. We conducted a mixed-effect model analysis (Baayen et al., 2008) on the data, with the logarithm of the RTs as the dependent variable in one analysis, and the accuracy as the dependent variable and a binomial distribution specified in another. Participants and Items were the random variables, and the Cumulative Frequency (high vs. low), Surface Frequency (high vs. low), and Verb Type (a. fully regular, b. phonological change e/E verbs with orthographic markers, c. phonological change o/O verbs without orthographic markers, and d. idiosyncratic) were the fixed-effect variables. The general RT means with their standard deviations in parentheses and the error rates for each type of verb and each condition based on the by-participant analysis are displayed in **Table 4**.
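The RT model described above can be written schematically as follows. This is a sketch under the assumption of crossed by-participant and by-item random intercepts only; the text does not state the random-effects structure, and all symbols are introduced here for illustration.

$$
\log(\mathrm{RT}_{ij}) = \beta_0 + \beta_C\, C_j + \beta_S\, S_j + \boldsymbol{\beta}_V^{\top} \mathbf{v}_j + (\text{interaction terms}) + u_i + w_j + \varepsilon_{ij}
$$

$$
u_i \sim \mathcal{N}(0, \sigma_u^2), \qquad w_j \sim \mathcal{N}(0, \sigma_w^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
$$

Here $i$ indexes participants and $j$ indexes items, $C_j$ and $S_j$ code the high vs. low levels of cumulative and surface frequency, and $\mathbf{v}_j$ contrast-codes the four verb types. The accuracy analysis would replace the left-hand side with the log-odds of a correct response and drop the residual term.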

### **RT RESULTS**

Overall, we found a significant effect for surface frequency [*F*(1, 293) = 22.494, *p* < 0.001] and cumulative frequency [*F*(1, 293) = 12.861, *p* < 0.01], but we did not find a significant effect between the different verb types [*F*(3, 293) = 0.462, *p* = 0.709]. Regarding the general interactions, the only one that reached significance was between verb type and cumulative frequency [*F*(3, 293) = 8.238, *p* < 0.05]. This significant interaction is discussed below in terms of the different representations of regular and idiosyncratic verbs compared with phonological change verbs. Our main goal was to determine how the RT differences behaved for each verb type in terms of the surface and cumulative frequencies.

Planned comparisons given by the mixed-effect model showed that fully regular verbs demonstrated a main effect for surface frequency, with high-frequency words being recognized faster than low-frequency words. This effect of 26 ms for high cumulative frequency verbs and 27 ms for low cumulative frequency verbs was significant [*t*(292) = 2.942, *p* < 0.01]. There was also a main effect for cumulative frequency, with high-frequency words having faster responses than low-frequency words. This effect of 17 ms for high surface frequency verbs and 19 ms for low surface frequency verbs was also significant [*t*(289) = 2.442, *p* < 0.05]. There was no significant interaction between cumulative and surface frequencies [*t*(294) = 0.181, *p* = 0.857], suggesting that the two effects are independent of each other.

For phonological change e/E verbs with orthographic markers, there was a significant effect for surface frequency [*t*(293) = 2.802, *p* < 0.05] of 25 ms in high cumulative frequency and 30 ms in low cumulative frequency verbs. However, there was no cumulative frequency effect [*t*(290) = 0.521, *p* = 0.603], with a negative difference of −2 ms in high surface frequency verbs and only 3 ms in low surface frequency verbs, indicating that different frequencies in the stems of the phonological change e/E verbs with orthographic markers do not elicit different RTs for word recognition. There was no significant interaction between cumulative and surface frequencies [*t*(291) = 0.535, *p* = 0.593].

For phonological change o/O verbs without orthographic markers, there was a significant effect for surface frequency [*t*(294) = 2.406, *p* < 0.01], confirming the surface effect. This effect was 24 ms in high cumulative frequency verbs and 23 ms in low cumulative frequency verbs. However, there was no cumulative frequency effect [*t*(292) = 0.078, *p* = 0.938], with a difference of only 3 ms in high surface frequency and 2 ms in low surface frequency verbs. There was no significant interaction between cumulative and surface frequencies [*t*(294) = 1.358, *p* = 0.175].

Finally, idiosyncratic verbs showed a main effect of surface frequency of 16 ms in high cumulative frequency and 18 ms in low cumulative frequency verbs. This effect was significant [*t*(292) = 3.397, *p* < 0.01], confirming the surface effect. Importantly, there was also a significant main effect of cumulative frequency of 17 ms in high surface frequency verbs and 19 ms in low surface frequency verbs [*t*(292) = 2.312, *p* < 0.05]. There was no significant interaction between cumulative and surface frequencies [*t*(294) = 0.149, *p* = 0.882], suggesting that the surface and cumulative frequency effects are independent.

### **RT DISCUSSION**

Overall, we systematically observed a surface frequency effect for the four types of verbs tested; however, the picture for the cumulative frequency is different. Although its effect is clearly observed in the fully regular and idiosyncratic verb types, it does not appear in either type of phonological change verbs (with or without orthographic markers). This result explains why we found a significant interaction between verb type and cumulative frequency in the general analysis: regular and idiosyncratic verbs have different cumulative frequency behaviors compared with phonological change verbs. Because we did not find any cumulative frequency effect in this last verb type, phonetic alternations in the stem production may not be considered to be differently represented in the mental lexicon (Marslen-Wilson and Zhou, 1999; Embick, 2013). Therefore, these phonetic alternations do not result from different phonological representations but are most likely due to phonological abstract representations that receive their phonetic form after suffix computation in a later stage (Embick and Halle, 2005).

**Table 4 | Overall RT means, standard deviations, and error rates for each type of verb and condition.**

To test this interpretation, we reconsidered the cumulative frequency for the stems as being the total cumulative frequency (i.e., the lemma frequency provided by the corpus), meaning the sum of both phonological changes for each type of verb (e.g., for the verb *lever* "to lift," the cumulative frequency of the stem *[lev-]* of 347 per million was added to the cumulative frequency of the stem *[lEv-]* of 91 per million, resulting in a total cumulative frequency of 438 per million for all of its verb forms). We then conducted a *post-hoc* analysis through a new mixed-effect model (Baayen et al., 2008) that used the frequency values of surface and cumulative frequencies as continuous predictors. The logarithm of the RTs was the dependent variable, Participants and Items were the random variables, and the TotalCumulativeFrequency (numeric), SurfaceFrequency (numeric), and Verb Type (b. phonological change e/E verbs with orthographic markers, and c. phonological change o/O verbs without orthographic markers) were the fixed-effect variables.

For phonological change e/E verbs with orthographic markers in this analysis, there was a main effect of surface frequency [*t*(291) = 2.495, *p* < 0.01]. Most importantly, there was a main effect of total cumulative frequency [*t*(292) = 2.929, *p* < 0.01], confirming that the cumulative frequency of the phonological change verbs should not be considered separately between the different phonetic stem realizations. There was no significant interaction between total cumulative and surface frequencies [*t*(287) = 1.055, *p* = 0.292].

For phonological change o/O verbs without orthographic markers, similarly to the phonological change e/E verbs, there was a main effect of surface frequency [*t*(295) = 2.104, *p* < 0.01], and most importantly, there was also a main effect of total cumulative frequency [*t*(288) = 2.238, *p* < 0.05], confirming the total cumulative frequency effect in phonological change verbs. There was no significant interaction between total cumulative and surface frequencies [*t*(292) = 0.868, *p* = 0.386], suggesting that both effects are independent.

These results confirm that phonological stem changes have only one abstract phonological underspecified representation in the mental lexicon (Marslen-Wilson and Zhou, 1999) and that the different phonetic productions are reflexes of phonological rules driven by the merger operation between the stem and suffixes (Embick, 2013).

### **ERROR RATE RESULTS**

Fully regular verbs had an error rate of 8.12%, phonological change e/E verbs had an error rate of 8.83%, phonological change o/O verbs had an error rate of 9.88%, and idiosyncratic verbs had an error rate of 9.65%. High and low surface frequencies had error rates of 8.24% and 11.01%, respectively, whereas high and low cumulative frequencies had error rates of 7.79% and 11.45%, respectively. Overall, we did not find any significant error rate difference between the verb types [*F*(3, 303) = 0.216, *p* = 0.885]. However, we did find significant error rate differences for surface frequency [*F*(1, 303) = 5.202, *p* < 0.05], suggesting that words with higher surface frequencies are recognized not only more quickly but also more accurately, and for cumulative frequency [*F*(1, 303) = 9.149, *p* < 0.01], suggesting that more frequent stems are more easily recognized than less frequent ones. No interaction reached significance, suggesting that the effects of verb type, surface frequency and cumulative frequency on error rates are independent.

# **GENERAL DISCUSSION**

In this work we investigated the mental representations and decomposability of French verbs. French is a morphologically rich language in terms of lexical morphemes, with fully regular stems, phonological stem changes, and idiosyncratic allomorphy in the stem (Kilani-Schoch and Dressler, 2005; Aronoff, 2012). We conducted an experiment in which the cumulative and surface frequencies were manipulated using high and low frequency conditions. Participants were asked to perform a lexical decision task as quickly and accurately as possible on visual items. The RTs and error rates were then analyzed in light of our hypotheses.

We observed surface frequency effects for all types of verbs tested. More importantly, we observed cumulative frequency effects for the fully regular verbs from the first group and for the idiosyncratic verbs from the third group. The phonological change verbs presented slightly different results, yielding no cumulative frequency effect when the frequencies of the two phonetic stem forms were computed separately. However, the phonological change verbs yielded a significant total cumulative frequency effect when the cumulative frequency count included all of the conjugated forms of the verb, regardless of the phonetic form alternations. These results shed light on how verbal inflected forms are processed and how stems are represented in the mental lexicon depending on their type of regularity.

### **REGULARITY**

Fully regular French verbs from the first group have a single stem on which the verbal inflectional paradigm is based. Due to the paradigmatic system of verbal suffixes, it is extremely easy to identify and decompose the lexical morpheme (stem) from the inflectional endings containing morphosyntactic features (suffixes) (Bybee, 1995). Confirming our hypothesis, the significant cumulative frequency effect indicates that cumulative frequency is a predictive factor in word recognition, and its manipulation results in RT modulations (Taft, 1979). In this context, according to Taft (2004, p. 747), the surface frequency effect "is explained in terms of the ease with which the information associated with the stem can be combined with the information associated with the affix."

### **PHONOLOGICAL CHANGES**

Unlike fully regular verbs, phonological change verbs have predictable alternations in their phonetic forms according to the phonological properties of the suffix with which the stem is merged (Embick, 2013). Therefore, the lack of a cumulative frequency effect between the phonetic alternation forms and the significant effect of total cumulative frequency together confirm our hypothesis that verbs with phonological changes have an abstract phonological underspecified representation that is contacted during processing. Verbs with phonological changes are decomposed, and the different phonetic forms activate a single phonological underspecified stem (Marslen-Wilson and Zhou, 1999). An alternative hypothesis is that the two phonetic stems are related by rule and only one of them is stored in the lexicon.

### **IDIOSYNCRASY**

For idiosyncratic verbs, similarly to the other verb types, the surface frequency effect should be interpreted as the recombination between the stem and affixes (Taft, 1979, 2004). Interestingly, we found a significant main effect in the cumulative frequency that can be broadly interpreted as differential access to different mental representations of the idiosyncratic stem allomorphs (Forster et al., 1987). However, this finding also suggests that even idiosyncratic known verbs are decomposed during visual recognition. These results are incompatible with models postulating that known words or irregular words are accessed by the direct whole-word route, such as the AAM (Caramazza et al., 1988) and the W&R (Pinker, 1999). Our results are in accordance with the earlier priming study in French on inflected verb recognition (Meunier and Marslen-Wilson, 2004). In French, even idiosyncratic verbs from the third group are decomposed due to the paradigmatic verbal inflectional system of suffixes (Bybee, 1995; Kilani-Schoch and Dressler, 2005).

### **DECOMPOSABILITY AND REGULARITY**

According to Rastle and Davis (2008), the recognition of polymorphemic words in visual modality begins with a morphological decomposition based on an analysis of orthography. Thus, because the orthographic regularity and relationships across the stems and the suffixes are extremely consistent in French (Bybee, 1995), we suggest that morphological decomposition is triggered more by the decomposability of verbal forms than by their regularity *per se*. Therefore, we argue that all French inflected verbs are first decomposed to their stem and suffixes and then these morphemes are accessed according to their cumulative frequency, generating the cumulative frequency effect. This decomposition activates lexical and morphosyntactic information systems, which are later recombined and verified for word recognition, generating the surface frequency effect. This assumption strongly supports the full-decomposition models (Halle, 1973; Taft, 1979, 2004; Halle and Marantz, 1993; Embick and Halle, 2005; Marantz, 2013) or the dual-route models, with a special emphasis on the combinatorial route (Wunderlich, 1996; Baayen et al., 1997; Orsolini and Marslen-Wilson, 1997; Clahsen, 1999). In this case, the bound-stems are stored in the mental lexicon, and inflected verbs share morphemic representations (such as roots, stems and suffixes) with all of the words from the same morphological family that have their own lexical entry representation.

### **NATURE OF THE REPRESENTATION**

Studies conducted on Spanish have shown that word stress is defined by word structure, meaning that the morphemic nodes and the phonological characteristics of the merged morphemes are crucial for word stress (Oltra-Massuet and Arregi, 2005). The same analysis was conducted in Catalan (Oltra-Massuet, 1999), and similar assumptions were made by Andreassen and Eychenne (2013) in French, although their argument was not developed in depth. Nevertheless, we suggest that word stress in French is strongly driven by word structure. In the case of verbs, word stress is defined by the tense and agreement nodes. The French iambic prosodic system is different from other Romance languages, which have a trochaic prosodic system. In this sense, it is the stressed syllable that defines the phonetic production in French phonological change verbs (Kilani-Schoch and Dressler, 2005; Aronoff, 2012). This means that the different phonetic stem productions of phonological change verbs are exclusively driven by prosodic rules, not by different morphological representations (Halle and Idsardi, 1996; Marslen-Wilson and Zhou, 1999; Embick, 2013). In accordance with this assumption, our results showed that the two phonetic alternation forms did not differ from each other but activated a shared stem representation that is partly underspecified. Another possibility is that all morphemes are purely abstract and have no phonological content. Only after the morphemes are merged in the inflected word is the phonetic form guided by phonological readjustment rules and defined in a late insertion (Halle and Marantz, 1993; Embick and Halle, 2005; Marantz, 2013).

For idiosyncratic verbs, Meunier and Marslen-Wilson (2004) showed that different allomorphic stems have the same priming effects as fully regular verbs (e.g., *[boi]rons* "we will drink" and *[buv]ons* "we drink") when priming their infinitive form (*[boi]re*, "to drink"). Our results extend this investigation and suggest that allomorphic stems have different representations in the mental lexicon. Thus, the priming effect observed may be due to links between the different representations or, according to Crepaldi et al. (2010), to a shared underlying representation at the lemma level (Forster et al., 1987; Allen and Badecker, 2002). Our results show that idiosyncratic verbs are decomposed and recognized through the specific stem representations of a single verb in the mental lexicon (Aronoff, 2012). Idiosyncratic stem allomorphs are represented in the mental lexicon as different bound morphemes but are linked at a common abstract morphological level (Aronoff, 1994; Wunderlich, 1996; Clahsen, 1999). Thus, the time spent to recover a specific stem allomorph is modulated as a function of its cumulative frequency.

# **CONCLUSION**

The overall cumulative frequency effect is strong evidence that all inflected verbs in French are decomposed in visual modality independent of their stem regularity and phonological realization. Consequently, the surface frequency effect is interpreted as the result of the recombination between the lexical information of the stem and the morphosyntactic features of the suffixes (Taft, 1979, 2004). Taken together, our results can be explained by either an obligatory decomposition model (Halle and Marantz, 1993; Taft, 2004; Marantz, 2013) or a revised dual-route model similar to the MM model (Wunderlich, 1996), which posits completely combinatorial and internally structured representations.

# **ACKNOWLEDGMENTS**

We thank two anonymous *Frontiers* reviewers for helpful comments on earlier versions of this article. We are grateful to F. -X. Alario and D. Fabre who worked on a previous version of the experiment. This research was supported by funding from the "Centre National de la Recherche Scientifique— CNRS" (Fanny E. Meunier and Gustavo L. Estivalet: UMR5304). Gustavo L. Estivalet was supported in this research by a PhD Grant from the "National Council of Scientific and Technological Development—CNPq" (238186/2012-1).

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnhum.2015.00004/abstract

# **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 September 2014; accepted: 02 January 2015; published online: 20 January 2015.*

*Citation: Estivalet GL and Meunier FE (2015) Decomposability and mental representation of French verbs. Front. Hum. Neurosci. 9:4. doi: 10.3389/fnhum.2015.00004 This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2015 Estivalet and Meunier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Cross-language activation of morphological relatives in cognates: the role of orthographic overlap and task-related processing

# *Kimberley Mulder<sup>1</sup>\*, Ton Dijkstra<sup>1,2</sup> and R. Harald Baayen<sup>3</sup>*

<sup>1</sup> Centre for Language Studies, Radboud University Nijmegen, Nijmegen, Netherlands

<sup>2</sup> Donders Institute for Brain, Cognition, and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands

<sup>3</sup> Department of Linguistics, Eberhard Karls University, Tübingen, Germany

### *Edited by:*

Minna Lehtonen, University of Helsinki, Finland

### *Reviewed by:*

Laurie Feldman, The University at Albany – State University of New York, USA

Ruth De Diego-Balaguer, Institució Catalana de Recerca i Estudis Avançats, Spain

### *\*Correspondence:*

Kimberley Mulder, Centre for Language Studies, Radboud University Nijmegen, Wundtlaan 1, 6525 XD, Nijmegen, Netherlands e-mail: Kimberley.Mulder@mpi.nl

We considered the role of orthography and task-related processing mechanisms in the activation of morphologically related complex words during bilingual word processing. So far, it has only been shown that such morphologically related words (i.e., morphological family members) are activated through the semantic and morphological overlap they share with the target word. In this study, we investigated family size effects in Dutch-English identical cognates (e.g., tent in both languages), non-identical cognates (e.g., pill and pil, in English and Dutch, respectively), and non-cognates (e.g., chicken in English). Because of their cross-linguistic overlap in orthography, reading a cognate can result in activation of family members in both languages. Cognates are therefore well-suited for studying mechanisms underlying bilingual activation of morphologically complex words. We investigated family size effects in an English lexical decision task and a Dutch-English language decision task, both performed by Dutch-English bilinguals. English lexical decision showed a facilitatory effect of English and Dutch family size on the processing of English-Dutch cognates relative to English non-cognates. These family size effects were not dependent on cognate type. In contrast, for language decision, in which a bilingual context is created, Dutch and English family size effects were inhibitory. Here, the combined family size of both languages turned out to better predict reaction time than the separate family size in Dutch or English. Moreover, the combined family size interacted with cognate type: the response to identical cognates was slowed by morphological family members in both languages. We conclude that family size effects (1) are sensitive to the task performed on the lexical items, and (2) depend on both semantic and formal aspects of bilingual word processing. We discuss various mechanisms that can explain the observed family size effects in a spreading activation framework.

**Keywords: morphological family size, bilingual word recognition, response competition, cognates, spreading activation**

# **INTRODUCTION**

Past research has shown that the mental lexicon is a highly interactive system, in which words that share orthographic/phonological, morphological, or semantic features can be co-activated along with the actually presented word. One demonstration of this interactive nature is the finding that upon reading a word like *house*, morphologically related complex words are co-activated that contain this word, like *housekeeper, housing,* and *wheelhouse* (Schreuder and Baayen, 1997). The set of activated words has been called the 'morphological family' of the target word. Even more intriguing, reading the same English word *house* may activate morphologically related complex Dutch words such as *bejaardenhuis* 'elderly home' or *huizenmarkt* 'house market' in speakers that are familiar with both of these languages (Mulder et al., 2013). The set of Dutch items that is morphologically related to the English target word is called the 'cross-language morphological family' of that word.

This paper provides a more detailed investigation into the activation of such morphologically related complex words in bilingual word processing. More specifically, we investigate how the word recognition of bilinguals is affected by the activation of morphological families from one or both of their languages. These effects of within-language and cross-language family size (i.e., the total number of morphological family members of a word in the same or another language) are investigated in two different paradigms (lexical decision and language decision) and for three different types of words (identical and non-identical cognates, and non-cognates). The manipulation of task and item type allows us to test the hypothesis that bilingual family size effects vary in accordance with task demands and degree of cross-linguistic orthographic overlap. This extends current theories of morphological family size effects that have been proposed for monolinguals, and allows the development of a bilingual model for such effects. To set the stage for our experiments, we will first discuss the nature of family size effects in monolinguals and then address possible implications for bilingual processing.

In monolingual studies, words with larger morphological families are generally found to be processed faster and more accurately than words with smaller morphological families. Facilitatory effects are observed in lexical decision studies for several languages with a concatenative morphology (e.g., for Dutch: Schreuder and Baayen, 1997; Bertram et al., 2000; De Jong et al., 2000; De Jong, 2002; Kuperman et al., 2009; for English: Baayen et al., 1997; De Jong et al., 2002; Juhasz and Berkowitz, 2011; for (non-Germanic) Finnish: Moscoso del Prado Martín et al., 2004; Kuperman et al., 2008). Moreover, facilitatory effects are also observed for languages with an alphabetic writing system and a non-concatenative morphology (for Hebrew: Moscoso del Prado Martín et al., 2005; for Arabic: Boudelaa and Marslen-Wilson, 2011). Finally, written Chinese is non-alphabetic and non-concatenative, but shows effects similar to the family size effect in terms of the productivity of semantic radicals (Feldman and Siok, 1997).

Schreuder and Baayen (1997) explained facilitatory family size effects by means of global lexical activation along the lines of the multiple read-out model of Grainger and Jacobs (1996): words that co-activate many other words (lemmas<sup>1</sup>) give rise to more global lexical activation supporting a positive lexicality decision. De Jong et al. (2003) simulated this mechanism in a computational model of monolingual morphological processing, the Morphological Family Resonance Model (MFRM). They showed that read-out of global activation may not be necessary if activation is allowed to resonate between forms, lemmas, and meanings. In their model, associated lemmas (family members) of a target word are activated via the semantic representation of that target word, but not via its form representation. When a semantic representation of a target word is linked to many associated lemmas, a large amount of activation is spread back and forth between this semantic representation and the associated lemmas, gradually increasing the shared semantic activation and the activation level of the target lemma. Such resonance within the morphological family will thus speed up the rate at which the activation of the target lemma increases, resulting in faster word recognition.
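The resonance idea can be illustrated with a deliberately simplified spreading-activation loop. This is a toy sketch, not the MFRM itself: the update rules, the parameter values (`input_per_cycle`, `spread`, `feedback`, `threshold`), and the assumption that all family members behave identically are illustrative choices that do not come from De Jong et al. (2003). It only shows how feedback between a shared semantic representation and more family lemmas can bring the target lemma to threshold in fewer cycles.

```python
def cycles_to_threshold(family_size, threshold=1.0, input_per_cycle=0.05,
                        spread=0.1, feedback=0.1, max_cycles=1000):
    """Count update cycles until the target lemma reaches threshold.

    All parameters are hypothetical; activation is unbounded (no decay),
    which is a simplification of any realistic model.
    """
    target = 0.0     # activation of the target lemma
    semantics = 0.0  # activation of the shared semantic representation
    family = 0.0     # activation of each family lemma (assumed identical)
    for cycle in range(1, max_cycles + 1):
        # The target receives bottom-up input plus feedback from semantics.
        target += input_per_cycle + feedback * semantics
        # Semantics is fed by the target and by every family member.
        new_semantics = (semantics + spread * target
                         + spread * family * family_size)
        # Family members are reached via semantics only, not via form.
        family += spread * semantics
        semantics = new_semantics
        if target >= threshold:
            return cycle
    return max_cycles


if __name__ == "__main__":
    for size in (0, 5, 50):
        print(size, cycles_to_threshold(size))
```

In this sketch a larger `family_size` can only add activation to the semantic representation, so the target never needs more cycles than it does with a smaller family.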

Thus, although morphological family members are connected to a target word via both orthographic and semantic links, family size effects are generally assumed to be semantically driven (e.g., Schreuder and Baayen, 1997; Bertram et al., 2000; De Jong et al., 2000; Moscoso del Prado Martín et al., 2005; Mulder et al., 2013, 2014). For instance, Schreuder and Baayen (1997) showed that only semantically transparent family members contribute to the family size effect. Moscoso del Prado Martín et al. (2005) even observed inhibitory family size effects for family members that were not semantically related to the target. In line with this, words with a large family size (the family members being semantically related to the target; Mulder et al., 2013) and words with a large number of orthographic neighbors (the neighbors being semantically unrelated; Holcomb et al., 2002; Müller et al., 2010) elicited different N400 effects, showing facilitatory and inhibitory effects on word processing, respectively. Finally, Mulder et al. (2014) investigated primary and secondary family size effects. Secondary family size concerns the number of family members of family members. As an example, *work force* is a secondary family member of *clock*, because it is morphologically related to *clock work*, which is a primary family member of *clock*. Facilitatory effects of English primary family size were observed, while the activation of secondary family members elicited inhibitory effects, showing that when activation spreads too far out, to words that are semantically unrelated to the target, word processing is hindered. In sum, when these various observations are combined, they give rise to the hypothesis that it is the semantic convergence or divergence between target word and family members that determines the direction of the family size effect.

Although these studies indicate that the family size effect is predominantly semantic in nature, the role of orthography in activating morphologically related complex words has never been sufficiently investigated. In a lexical decision task with Dutch monolinguals, De Jong et al. (2000) observed family size effects for both regular and irregular past participles (e.g., *roei – geroeid*, 'row – rowed' versus *vecht – gevochten,* 'fight – fought'), even though the irregular past participle did not share the exact orthographic form with its stem and other family members. This suggests that, at least in monolinguals, the activation of morphologically related complex words is not dependent on complete orthographic overlap between target word and family members.

In the present paper, we investigate whether this finding can be generalized to family size effects in bilingual word processing and across empirical tasks. In bilingual processing, the co-activation of words in the non-target language for a large part depends on the degree of formal overlap between words in these languages. There are word pairs that have complete orthographic and nearly complete semantic overlap across languages. Such words are called identical cognates. An example word pair is provided by the English and Dutch word *tent*. Other translation equivalents have form overlap, but it is partial. An example word pair is *pill* (English) – *pil* (Dutch). Finally, there are translation equivalents with little or no overlap, for instance *chicken* (English) – *kip* (Dutch). In word processing by Dutch-English bilinguals, the English word *tent* is more likely to activate the Dutch orthographically similar word *tent* than *chicken* is to activate its Dutch translation equivalent *kip*. In the present paper, we hypothesize that differences in cross-linguistic overlap will have consequences for the activation of morphological family members in the two languages. Said differently, we expect that morphological family size effects will differ between identical cognates, non-identical cognates, and non-cognates. If this is the case, morphological family size effects are shown to be sensitive not only to crosslinguistic semantic overlap, but also to orthographic aspects of input words.

Until now, the few studies that addressed family size effects in bilingual word processing did not pay attention to this aspect. Some studies only used items that had complete orthographic overlap but different meanings between languages (i.e., interlingual homographs such as the English and Dutch word *room*, meaning 'cream' in Dutch; Dijkstra et al., 2005). Other studies did vary the degree of cross-linguistic orthographic overlap, but they did not consider how family size effects depended on such orthographic overlap between word representations (i.e., family size in Dutch-English cognates such as *tent/tent* and *pill/pil* in Mulder et al., 2013). In this paper, we test the effects of orthographic overlap by examining cross-language family effects in both identical (e.g., the English-Dutch word *tent*) and non-identical cognates (e.g., *admiral* – *admiraal*, in English and Dutch). Cognates are particularly useful to examine cross-language effects of family size, not only because their degree of orthographic overlap can be manipulated, but also because they can reveal whether, despite their overlap in semantics, the activation of cross-language family members facilitates a response in different task contexts.

<sup>1</sup>Lemmas are abstract word units. In Schreuder and Baayen's (1995) model of morphological processing, lemma nodes link form information at the access level with higher-order semantic and syntactic information. See also Taft (2011), who discussed an interactive activation framework incorporating a lemma level that captures lexical information.

We can extend the predictions of the monolingual MFRM, mentioned above, to bilingual family effects. The model suggests that the cross-language family size effect should be predominantly based on semantic co-activation and resonance between the semantic representation of the target word and the family members. Therefore, regardless of task, the response to a cognate would always be facilitated, because any converging cross-language semantic information strengthens the activation of the target. However, if family members are activated initially in a 'bottom-up' way via orthography, cross-language family size effects are not necessarily facilitatory, because they may induce response competition between activated within-language and cross-language representations. Moreover, if family members are activated via orthography, then the activation of cross-language family members depends on the degree of orthographic overlap between cognate representations. This means that upon reading an English target word like *work*, the Dutch family member *werkvergunning* 'work permit' is activated to a lesser extent than the English family member *workspace* as a result of less orthographic overlap. In this case, effects of family size should interact with cognate type.

We further investigate to what extent the effects of cross-linguistic orthographic overlap are task sensitive. To do so, we examine how cross-language family size affects the response to two types of cognates (with complete and non-complete form overlap) and non-cognates in a lexical decision task and a language decision task. In English lexical decision (Experiment 1), participants must decide if the input letter string is an English word or not. Because both readings of a cognate will become activated on the basis of the input letter string, a cognate facilitation effect should arise that is dependent on the degree of cross-linguistic orthographic overlap (thus, it will be larger for identical cognates than for non-identical cognates). Given the demands of the task, participants should base their response read-out primarily on the English lexical representation and English language membership of the word (Dijkstra, 2007). There will be relatively little time for the Dutch orthographic reading of the cognate to activate its family members; as a result, the activation of cross-language family members is expected to proceed indirectly and especially via semantic co-activation. This should lead to facilitatory family size effects for both identical and non-identical cognates, with relatively little difference between both types.

In contrast, in English-Dutch language decision (Experiment 2), participants have to decide as quickly and accurately as possible whether a presented letter string is an English word or a Dutch word. In the case of a cognate, a response conflict is expected to arise, because of the formal overlap between cognate representations. For instance, the words *tent* and *admiral* (in Dutch, *admiraal*) could activate both a Dutch and an English response. As a consequence, the response competition between the two readings of a cognate should result in a cognate inhibition effect (cf. Dijkstra et al., 2010). In this paradigm, co-activation of cross-language family members might be expected to lead either to facilitatory effects (because both families strengthen the activation of the target word via semantics) or to inhibitory effects (because of response competition and because both families reinforce English and Dutch language nodes). Especially in this mixed-language paradigm, in which the orthography is important for making a correct decision about the language membership of a word, an interaction between family size and cognate type is expected.

In all, we test the hypotheses that morphological family size is sensitive to cross-linguistic overlap and to task demands by including different item types (identical and non-identical cognates, and non-cognates) in two bilingual experiments: English lexical decision (Experiment 1) and English-Dutch language decision (Experiment 2).

# **EXPERIMENT 1 – ENGLISH LEXICAL DECISION**

## **METHOD**

### *Participants*

Twenty-nine native speakers of Dutch, mainly students of the Radboud University Nijmegen (mean age 23.8 years, SD = 5.49), took part in this experiment. All participants had English as their second language, having learnt English at school from around the age of 11. All had normal or corrected-to-normal vision. Participants were paid or received course credits for participating in the experiment.

### *Materials*

The stimulus set consisted of 400 items, half of which were English words and half of which were pseudo-words. All word items were selected from the CELEX database (Baayen et al., 1995). Only word items with an English lemma frequency of at least one per million in the CELEX lexical database and a length between three and eight letters were selected. All word items were mono-morphemic words. For each item, the English family size values and the English lemma frequencies per million were extracted from the CELEX database and logarithmically transformed. The English morphological family size of a word in CELEX is the number of English morphological derivations and compounds of a given word (not including inflections; for studies on inflectional family size effects, see Bertram et al., 2000; Traficante and Burani, 2003).

The experimental items were 90 Dutch-English cognates. Forty of these items were identical in form in Dutch and English (identical cognates; e.g., *horizon*–*horizon*), while the other 50 items were nearly identical in orthography in both languages (non-identical cognates; e.g., *admiral*–*admiraal*). The non-identical cognates were always presented in their English form. The degree of orthographic overlap was calculated with the Levenshtein (1966) distance measure. For each cognate item, the Dutch family size values and the Dutch lemma frequencies per million were extracted from the CELEX database and logarithmically transformed. Similar to the English family size values in CELEX, the Dutch morphological family size of a word is the number of Dutch morphological derivations and compounds of a given word (not including inflections). Half of the identical and half of the non-identical cognates had a large family size in Dutch, while the other half of these cognates had a small Dutch family size. The sets of identical and non-identical cognates with a large Dutch family size were matched on English Frequency, English Family Size<sup>2</sup> and Length (in letters) to the identical and non-identical cognates with small Dutch family size (*t*-tests, all *p*'s > 0.05). Moreover, the non-identical cognates with large and small family size were matched on Levenshtein Distance.
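As an illustration of the overlap measure, the following minimal sketch computes the standard Levenshtein edit distance (insertions, deletions, and substitutions, each with cost 1) for word pairs mentioned in the text. The function name and the unit-cost scheme are assumptions of this sketch; the article does not specify the implementation that was used.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit costs, computed row by row."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution/match
        previous = current
    return previous[-1]


if __name__ == "__main__":
    pairs = [("tent", "tent"), ("pill", "pil"),
             ("admiral", "admiraal"), ("chicken", "kip")]
    for english, dutch in pairs:
        print(english, dutch, levenshtein(english, dutch))
    # tent/tent -> 0, pill/pil -> 1, admiral/admiraal -> 1, chicken/kip -> 6
```

Identical cognates have a distance of 0, the non-identical cognates shown here have a distance of 1, and a non-cognate pair such as *chicken*–*kip* has a much larger distance.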

The experiment further included 90 English non-cognate words that were matched to the set of cognates on English Frequency, English Family Size, and Length, and 20 English filler words that were matched on Length to the cognates and non-cognates. Finally, 200 pseudo-words were added that were matched to the set of 200 word items on Length. These pseudo-words could be orthographically and phonologically legal words in English. **Table 1** presents the characteristics of the cognate and non-cognate items. The order of word and pseudo-word items was then pseudo-randomized with the restriction that no more than four words or pseudo-words were allowed to follow each other. A new pseudo-randomization was made for each participant.
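One way to produce an order that respects the run-length restriction is sketched below. The article does not describe the randomization algorithm, so this is only one possible approach: items are drawn at random, the other item type is forced whenever a run has reached the maximum, and the procedure restarts if it reaches a dead end. The function names, the weighting by remaining pool size, and the restart limit are assumptions of this sketch, and the resulting orders are not guaranteed to be uniformly distributed over all valid orders.

```python
import random


def longest_run(kinds):
    """Length of the longest run of identical consecutive labels."""
    longest = run = 0
    previous = None
    for kind in kinds:
        run = run + 1 if kind == previous else 1
        longest = max(longest, run)
        previous = kind
    return longest


def pseudo_randomize(words, pseudo_words, max_run=4, rng=None,
                     max_restarts=1000):
    """Return a list of (item, kind) pairs with no run longer than max_run."""
    rng = rng or random.Random()
    for _ in range(max_restarts):
        pools = {"word": list(words), "pseudo": list(pseudo_words)}
        for pool in pools.values():
            rng.shuffle(pool)
        order, run_kind, run_len = [], None, 0
        while pools["word"] or pools["pseudo"]:
            # A kind is allowed if items remain and the current run is not full.
            allowed = [k for k, pool in pools.items()
                       if pool and not (k == run_kind and run_len == max_run)]
            if not allowed:
                break  # dead end: restart with a fresh shuffle
            weights = [len(pools[k]) for k in allowed]
            kind = rng.choices(allowed, weights=weights)[0]
            order.append((pools[kind].pop(), kind))
            run_len = run_len + 1 if kind == run_kind else 1
            run_kind = kind
        else:
            return order  # both pools were emptied without a dead end
    raise RuntimeError("no valid order found; try more restarts")


if __name__ == "__main__":
    words = [f"word{i}" for i in range(200)]      # placeholder item labels
    pseudo = [f"pseudo{i}" for i in range(200)]
    order = pseudo_randomize(words, pseudo, rng=random.Random(1))
    print(len(order), longest_run(kind for _, kind in order))
```

Calling the function once per participant, each time with a differently seeded generator, would give every participant a new order.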

### *Procedure*

Participants performed an English visual lexical decision task. In this task, participants decide whether or not the visually presented stimulus is an existing English word by pressing a button corresponding to either the answer 'yes' or 'no.' The task was developed and carried out in *Presentation* version 13.0 (Neurobehavioral Systems<sup>3</sup>) and was run on a HP Compaq Intel Core 2 computer with 1.58 GHz memory and a refresh rate of 120 Hz. The participants were seated at a table at a 60 cm distance from the computer screen. The visual stimuli were presented in white capital letters (24 points) in font Arial in the middle of the screen on a dark gray background. Participants were tested individually in a soundproof room. The study was approved by the ethical committee of the Faculty of Social Sciences at Radboud University (ECG2912-2711-059).

Participants first read the English instructions, which informed them that they would be presented with word strings and which asked them to push the 'yes' button if the letter string they saw was an existing English word and to push the 'no' button if it was not. They were asked to react as accurately and quickly as possible. Participants pushed the 'yes' button with the index finger of their dominant hand and the 'no' response with the index finger of their non-dominant hand.

Each trial started with the presentation of a black fixation point '+,' which was displayed in the middle of the screen for 700 ms. After 300 ms the target stimulus was presented. The stimulus disappeared when the participant pressed a button, or when a time limit of 1500 ms was reached, and a new trial was started after an empty black screen of 500 ms.

The experiment was divided in two parts of equal length. The first part was preceded by 20 practice trials. After the practice trials, the participant could ask questions before continuing with the experimental trials. The two parts each contained 200 experimental trials. The proportion of items from each condition was the same in the two parts of the experiment. Each part began with three dummy trials to avoid lack of attention during the beginning of the two parts. The end of the first part was indicated by a pause screen. The experiment lasted for approximately 16 minutes.

After completing the lexical decision task, participants performed the X-LEX (Meara and Milton, 2003). This task was used to obtain a general indication of their proficiency in English in terms of vocabulary knowledge. Based on their scores (all scores >3200), all participants could be qualified as highly or intermediately proficient in English. Finally, participants were asked to fill out a language background questionnaire. The total session lasted approximately 30 minutes.

## **RESULTS**

Data cleaning was first carried out based on the error rate for participants and word items. Participants with an error rate of more than 15% on the word items were removed from the data set (participant accuracy mean ranged from 66 to 99%), which resulted in the exclusion of the data from five participants.

<sup>2</sup>Recently, Mulder et al. (2014) investigated English primary and secondary family size effects in English visual lexical decision with Dutch(L1)-English(L2) bilinguals. Their stimulus materials included both Dutch-English cognates and purely English items. No effects of Dutch primary and secondary family size were observed on the set of cognates. The authors argued that this occurred because the English family size was varied and, consequently, took away part of the effect. However, they hypothesized that cross-language family size effects might be observed in a design in which the family size of the target language is controlled for. This design is adopted in the present study.

<sup>3</sup>www.nbs.com

Three word items (*lung, alley,* and *toad*) that elicited errors in more than 25% of the trials were removed from the data set. After removal of these items, we were left with 4243 data points on the word items. RTs from incorrect responses or null responses were removed from the remaining data set (4.18% of the data points). This resulted in a data set with 4058 data points. Inspection of the distribution of the response latencies revealed non-normality. A comparison of a log transform and an inverse transform (RT = 1000/RT) revealed that the inverse transform was most successful in correcting this non-normality.
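The cleaning steps described above can be sketched as follows. The trial records, field names (`subject`, `item`, `rt`, `correct`), and the decision to compute item error rates after participant exclusion are assumptions of this sketch; only the thresholds (15% and 25%) and the inverse transform as written in the text (1000/RT) come from the article.

```python
from collections import defaultdict


def error_rates(trials, key):
    """Proportion of incorrect trials per subject or per item."""
    totals, errors = defaultdict(int), defaultdict(int)
    for trial in trials:
        totals[trial[key]] += 1
        if not trial["correct"]:
            errors[trial[key]] += 1
    return {k: errors[k] / totals[k] for k in totals}


def clean(trials, max_subject_error=0.15, max_item_error=0.25):
    """Apply the exclusion steps and add an inverse-transformed RT."""
    # Step 1: drop participants whose error rate exceeds the threshold.
    by_subject = error_rates(trials, "subject")
    kept = [t for t in trials if by_subject[t["subject"]] <= max_subject_error]
    # Step 2: drop items with too many errors (computed on remaining data).
    by_item = error_rates(kept, "item")
    kept = [t for t in kept if by_item[t["item"]] <= max_item_error]
    # Step 3: drop incorrect and null responses, then transform the RT.
    return [dict(t, inv_rt=1000 / t["rt"])
            for t in kept if t["correct"] and t["rt"] is not None]


if __name__ == "__main__":
    # Hypothetical trials, for illustration only.
    trials = [
        {"subject": "s1", "item": "tent", "rt": 500, "correct": True},
        {"subject": "s1", "item": "pill", "rt": 800, "correct": True},
        {"subject": "s2", "item": "tent", "rt": 600, "correct": False},
        {"subject": "s2", "item": "pill", "rt": None, "correct": False},
    ]
    for row in clean(trials):
        print(row)
```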

Response latencies were analyzed with a linear mixed effects model with subject and item as crossed random effects (see, e.g., Baayen, 2008; Baayen et al., 2008). We considered the following predictors: one lexical variable that is known to affect response latencies is target word frequency. Recent research shows that *SUBTLWF* (logarithmical transformation of English Subtitle frequency per million) is a better predictor of response latencies than the logarithmically transformed English CELEX frequencies per million (see Brysbaert and New, 2009). In the remainder of this experiment, we will use the term *English Frequency* to refer to the logarithmical transformation of *SUBTLWF* as a predictor of target word frequency. Moreover, because bilinguals are expected to be sensitive to non-target language word frequency, we considered the logarithmically transformed CELEX values per million for Dutch lemma frequency (*Dutch Frequency*).

Further, the logarithmically transformed CELEX values for English family size (*English Family Size*) and Dutch family size (*Dutch Family Size*) were included as predictors. The English family size values were collinear with the logarithmically transformed values of *English Frequency* and *Dutch Family Size*. To remove collinearity, we regressed *English Family Size* on *English Frequency* and *Dutch Family Size* and used the resulting residuals as a new predictor of English family size uncontaminated by English frequency. Similarly, *Dutch Family Size* was regressed on *Dutch Frequency* and *English Family Size*. Moreover, we added the predictor *Total Family Size* (the sum of the Dutch and English family sizes) to account for possible increased facilitation due to the large amount of global activation in the lexicon produced by the family members.
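The residualization step can be sketched with ordinary least squares. For brevity, this example regresses one predictor on a single other predictor, whereas the analysis described above regressed each family size measure on two predictors; the values are hypothetical and the function name is an assumption of this sketch.

```python
from statistics import mean


def residualize(y, x):
    """Residuals of a simple least-squares regression of y on x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # What remains of y after removing the part predictable from x.
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


if __name__ == "__main__":
    # Hypothetical log-transformed values, for illustration only.
    english_frequency = [1.2, 2.5, 3.1, 0.8, 4.0, 2.2]
    english_family_size = [0.7, 1.9, 2.0, 0.3, 3.1, 1.2]
    residual_family_size = residualize(english_family_size, english_frequency)
    print([round(r, 3) for r in residual_family_size])
```

By construction the residuals have a mean of zero and are uncorrelated with the predictor that was regressed out, which is what makes them usable as a decorrelated predictor.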

Besides these predictors for target and non-target language family size and frequency, other predictors were considered that could affect lexical decision latencies. In order to test whether cognate items were processed differently from non-cognate items, we included a factor *Cognate* with the levels 'cognate' and 'non-cognate.' Moreover, the predictor *Word Type*, containing three levels ('identical cognate,' 'non-identical cognate,' and 'non-cognate'), was included to account for the degree of form overlap between English and Dutch, with non-cognates having zero overlap, non-identical cognates having intermediate overlap, and identical cognates having maximal overlap. Furthermore, to be able to account for the possibility that family size effects are dependent on a "complete-or-not-complete" distinction in formal overlap, the factor *Identical Cognate* [with the levels Identical cognates and Other items (the latter including non-identical cognates and non-cognates)] was considered.

Further, *OLD* (the mean distance, in number of steps, from a word to the 20 closest Levenshtein neighbors in the lexicon; OLD-20; see Balota et al., 2007, and Yarkoni et al., 2008) was included as a predictor to account for effects of similarity between English words. Finally, we included *Trial* (the rank of the item in the experimental list) as predictor to account for learning effects during the experiment.
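The definition of *OLD* can be made concrete with a short sketch. The toy lexicon and the use of `n=3` instead of 20 neighbors are assumptions made to keep the example small; the published OLD-20 values are computed over a full lexicon.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + (ch_a != ch_b)))
        previous = current
    return previous[-1]


def old_n(word, lexicon, n=20):
    """Mean Levenshtein distance from word to its n closest neighbors."""
    distances = sorted(levenshtein(word, other)
                       for other in lexicon if other != word)
    closest = distances[:n]
    return sum(closest) / len(closest)


if __name__ == "__main__":
    # Toy lexicon, for illustration only.
    lexicon = ["tent", "test", "rent", "text", "tend", "horizon"]
    print(old_n("tent", lexicon, n=3))
    print(old_n("horizon", lexicon, n=3))
```

A low value means that a word sits in a dense orthographic neighborhood, while a high value means that even its closest neighbors are several edits away.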

We performed a stepwise variable selection procedure in which non-significant predictors were removed to obtain the most parsimonious model. Moreover, for each significant predictor, it was evaluated whether inclusion of this predictor resulted in a better model (i.e., containing a lower AIC compared to when this predictor was not part of the model). Next, potentially harmful outliers (defined as data points with standardized residuals exceeding 2.5 standard deviation units) were removed from the data set. We then fitted a new model with the same significant predictors to this trimmed data set.
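The trimming step can be sketched as follows. Here each residual is standardized by the mean and standard deviation of all residuals, which is an assumption of this sketch; the article does not state how the standardized residuals were computed, and the residual values below are hypothetical.

```python
from statistics import mean, pstdev


def trim_outliers(rows, residuals, cutoff=2.5):
    """Keep rows whose standardized residual is within the cutoff."""
    center, spread = mean(residuals), pstdev(residuals)
    return [row for row, residual in zip(rows, residuals)
            if abs((residual - center) / spread) <= cutoff]


if __name__ == "__main__":
    # Hypothetical residuals from a fitted model: 20 small ones, 1 extreme.
    residuals = [0.1, -0.1] * 10 + [5.0]
    rows = list(range(len(residuals)))  # stand-ins for data points
    trimmed = trim_outliers(rows, residuals)
    print(len(rows), len(trimmed))
```

After trimming, the model with the same predictors would be refitted to the remaining rows, as described above.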

The final model incorporated three parameters for the random-effects structure of the data: a standard deviation for the random intercepts for subject (SD = 0.21) and item (SD = 0.08), as well as a SD for the by-subject random slope for *Trial* (SD = 0.05). The standard deviation for residual error was 0.29. The model contained four numerical predictors (*English Frequency, Dutch Frequency*, *Dutch Family Size*, and *OLD*), one factorial predictor (*Identical Cognate*) and one two-way interaction (*Dutch Family Size*:*OLD*). The relevant statistics and corresponding coefficients of the final model are reported in **Table 2**. The significant partial effects of the final model are visualized in **Figure 1**. In both **Table 2** and **Figure 1C**, the two levels of *Identical Cognate* are specified as *True* and *False*: the former corresponding to the set of identical cognates, and the latter to the set of non-identical cognates and non-cognates.

The analyses showed a facilitatory effect on response latencies for *English Frequency*, while (non-target language) *Dutch Frequency* had an inhibitory effect. Moreover, the final model revealed a processing advantage for identical cognates in comparison to non-identical cognates and non-cognates. While models including either the predictors *Cognate* or *Word Type* also produced significant facilitation effects for cognates in comparison to non-cognates, with the latter predictor indicating the largest facilitation effects for identical cognates, *Identical Cognate* turned out to be a better predictor than either *Cognate* or *Word Type*, suggesting that it is maximal formal overlap with Dutch words that is most helpful in order to make an L2 lexical decision.

**Table 2 | Coefficients of the main effects and interaction effects of the final model, together with the standard error,** *t***-values and** *p***-values in English lexical decision (Experiment 1).**

*Dutch Family Size* was a better predictor than *Total Family Size,* which was not significant. *Dutch Family Size* had a significant facilitatory main effect on response latency. However, the significant interaction between *Dutch Family Size* and *OLD* shows that response latencies were slower when a word had a large *Dutch Family Size* and fewer close orthographic neighbors. In contrast, when a word had more close orthographic neighbors, a large *Dutch Family Size* was beneficial to word processing. No significant interaction between *Dutch Family Size* and either *Cognate Type* or *Identical Cognate* was observed.

# **DISCUSSION**

As predicted, in the English lexical decision task of Experiment 1, Dutch-English bilinguals were sensitive to the frequency of the English target words. English words with a higher frequency led to faster responses than lower frequency words. The effect of *English Family Size* of the target words was not significant. This is not surprising, because this factor was controlled for in order to allow non-target language (Dutch) family size effects to arise.

Importantly, statistical analyses revealed a significant effect of *Identical Cognate*. This predictor turned out to be a better predictor than both *Cognate* and *Word Type*. Responses to identical cognates were faster than to non-identical cognates and non-cognates. This result supports the distinction between identical cognates and non-identical cognates. This dissociation between the two cognate types is in line with the findings of Dijkstra et al. (2010), who observed a gradual decrease in L2 response latencies with an increase in similarity for non-identical cognates and a steep decline in response latencies going from non-identical to identical cognates. As the major mechanism underlying these findings, Dijkstra et al. (2010) proposed that the non-target L1 reading of the presented cognate was activated to an extent dependent on its degree of overlap with the input letter string. This then resulted in differences in semantic co-activation.

There was a significant facilitatory main effect of *Dutch Family Size.* Moreover, *Dutch Family Size* interacted significantly with *OLD*, a measure of orthographic neighborhood density. The interaction revealed a processing disadvantage for words with a large Dutch family size and more distant English orthographic neighbors. Thus, making a lexical decision on an English word is easier when a word is more 'English-like' (e.g., when it is orthographically closer to English neighbors) and generates less Dutch activation (e.g., when it has a small Dutch family size).

Interestingly, no significant interaction was observed between *Dutch Family Size* and *Identical Cognate*. A lack of a difference in the direction of the effect or the effect size for identical and non-identical cognates would follow if the family size effect is exclusively semantically driven. Therefore, although a morphological relationship links a target word to its family members, it seems that the effect of the activation of these family members itself is not dependent on the degree of formal overlap they share with the target word. However, while this may be true for the present situation in which bilinguals processed words in a largely monolingual task context, formal overlap might affect the family size effect when there is an explicit bilingual task context. This would especially be the case for a language decision task in which bilinguals have to judge the language membership of presented words (e.g., English or Dutch).

This issue is investigated in Experiment 2. Here Dutch-English bilinguals carried out a Dutch-English language decision task, in which they had to decide whether a presented word was English or Dutch. There were no pseudo-words in this task. In this task, the two readings of a cognate are linked to different responses. For instance, in Dutch-English language decision, the English reading *work* of the cognate *work* is linked to an English response, while the Dutch reading *werk* is linked to a Dutch response. Making a language decision on a cognate should therefore result in response competition between the representations of a cognate and slow down target word processing. The task dependency of processing form-similar words was earlier observed for both interlingual homographs (Dijkstra et al., 1998, 2000; Dijkstra, 2005) and cognates (Font, 2001; Dijkstra et al., 2010), showing a change in the directionality of the effects between (generalized) lexical decision and language decision. Moreover, Dijkstra et al. (2010) observed a discontinuous strong increase in response latencies in language decision going from nearly identical to identical cognates, mirroring the cognate effects found in lexical decision.

As was hypothesized in the Introduction, the activation of morphological family members of a cognate in language decision may affect target word processing in two ways. First, given that morphological family members of a cognate share part of their semantics with the cognate, activation of both within-language and cross-language family members could lead to facilitation for cognates with a large family size. This will then reduce the cognate inhibition effect.

Alternatively, activated morphological families may inhibit word processing given that they are linked to cognate representations that are in response conflict. Because family members are assumed to strengthen the activation of the target word to which they are linked, cognates with a large family size could then strengthen response competition and increase the cognate inhibition effect. Moreover, if language-specific information is necessary in order to resolve a response conflict, then family size effects might be sensitive to the degree of form overlap between cognate representations. If this is the case, stronger inhibitory effects of the family size of both languages are expected in identical cognates compared to non-identical cognates, because they activate less language-specific information.

# **EXPERIMENT 2 – DUTCH-ENGLISH LANGUAGE DECISION**

# **METHOD**

### *Participants*

Forty-five students of Radboud University Nijmegen (mean age 20.4 years, SD = 1.92) took part in this experiment. They were all native speakers of Dutch, having English as their second language. They were first exposed to English at school, approximately from the age of 11. They were paid or received course credits for participating in the experiment.

### *Materials*

The stimulus set consisted of 168 items. The set consisted of 72 Dutch-English noun cognates and 96 non-cognate items. The 72 cognate items were 24 form-identical Dutch-English cognates and 48 Dutch-English cognates that were not identical in form. The 96 non-cognate items were 48 English non-cognates and 48 Dutch non-cognates.

Because of the change from an English lexical decision task in Experiment 1 to an English-Dutch language decision task in Experiment 2, Dutch non-cognates and non-identical cognates had to be added to the stimulus materials. Further, 20 of the 90 cognates and 20 of the 90 non-cognates that were used in Experiment 1 (lexical decision) were also used in Experiment 2 (language decision). In Experiment 1, in order to observe Dutch family size effects, English family size was controlled for. As we wanted to look at response competition between Dutch and English and the contribution of their respective family sizes, we had to vary the English and Dutch family sizes; as a consequence, the item set of Experiment 1 was not completely suited for Experiment 2<sup>4</sup>.

The 48 non-identical cognates were presented in either Dutch or English orthography. A participant was presented with only half of the non-identical cognates in their Dutch form and the other half in their English form. Thus, for each participant, half of the items were Dutch and half of the items were English (apart from the 24 identical cognates, which could be both Dutch and English). In total, there were 72 Dutch words (24 Dutch non-identical cognates and 48 Dutch non-cognates) and 72 English words (24 English non-identical cognates and 48 English non-cognates).
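As a quick consistency check, the per-participant item counts given above can be tallied. The sketch below only restates numbers from the text; the dictionary keys are labels chosen here for illustration and are not identifiers from the study.

```python
# Per-participant item counts as stated in the Materials section.
# Key names are illustrative labels (an assumption), not study identifiers.
items_per_participant = {
    "identical_cognate": 24,
    "nonidentical_cognate_dutch_form": 24,
    "nonidentical_cognate_english_form": 24,
    "noncognate_dutch": 48,
    "noncognate_english": 48,
}

# Total number of items seen by one participant.
total_items = sum(items_per_participant.values())

# Language-specific words (identical cognates belong to both languages
# and are therefore left out of these two counts).
dutch_words = (items_per_participant["nonidentical_cognate_dutch_form"]
               + items_per_participant["noncognate_dutch"])
english_words = (items_per_participant["nonidentical_cognate_english_form"]
                 + items_per_participant["noncognate_english"])
```

Under these counts, the total matches the stated stimulus set of 168 items, with 72 language-specific words per language.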

Within each version, the two sets of 24 non-identical cognates were matched to each other on English Family Size and Dutch Family Size, English Frequency and Dutch Frequency (see Experiment 1 for a definition), Length (in letters), log English Bigram Frequency and log Dutch Bigram Frequency. Furthermore, the two sets of 24 language specific non-identical cognates of version 1 were matched on Length and their language specific bigram frequency with the non-identical cognates from the same language in version 2. Finally, the identical cognates were matched on Length, English Frequency, and English Family Size to the set of 48 non-identical cognate items, but could not be matched on Dutch Family Size and Dutch Frequency. The identical cognates have a lower mean Dutch Frequency and are less productive in terms of morphological family members than Dutch non-identical cognates.

The English and Dutch non-identical cognates and the identical cognates in each version were each matched on English Family Size and Dutch Family Size, English Frequency, and Dutch Frequency, Length, log English Bigram Frequency, and log Dutch Bigram Frequency to 24 English and 24 Dutch non-cognate items, respectively. These non-cognate items only had a noun-reading. **Table 3** presents the characteristics of the cognate and non-cognate stimuli.

<sup>4</sup>The materials were highly similar to the materials used in Mulder et al. (2014). All 24 identical cognates, 24 out of 25 English non-identical cognates and a large portion of the control words used in Mulder et al. (2014) were also used in Experiment 2. In this study, the family sizes of Dutch and English were varied.

**Table 3 | Item characteristics of the experimental items used in Experiment 2.**

The experiment consisted of two item blocks. The proportion of items from each condition was the same in the two parts of the experiment. The presentation order of the items within each item block was randomized for each participant with the restriction that no more than three cognates or non-cognates followed each other directly.
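The ordering restriction described above can be illustrated with a small rejection-sampling sketch. It assumes the restriction means that at most three items of the same type (cognate or non-cognate) appear in a row. The function names and the item representation are hypothetical; the experiment itself was run in *Presentation*, not Python.

```python
import random


def max_run_length(labels):
    """Length of the longest run of identical consecutive labels."""
    longest = 0
    current = 0
    previous = object()  # sentinel that equals no real label
    for label in labels:
        current = current + 1 if label == previous else 1
        previous = label
        longest = max(longest, current)
    return longest


def constrained_shuffle(items, is_cognate, max_run=3, rng=None, max_tries=10000):
    """Shuffle items until no more than max_run items of one type are adjacent.

    is_cognate is a function mapping an item to True or False.
    Rejection sampling is an assumption made for this sketch.
    """
    if rng is None:
        rng = random.Random()
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if max_run_length([is_cognate(item) for item in order]) <= max_run:
            return order
    raise RuntimeError("no valid order found within max_tries")
```

Rejection sampling keeps every admissible order equally likely, at the cost of repeated shuffling when the restriction is hard to satisfy.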

### *Procedure*

Participants performed a Dutch-English language decision task. In this language classification task, participants had to decide whether the visually presented stimulus was an existing English or Dutch word by pressing a button corresponding to either the answer 'English' or 'Dutch.' The study was approved by the ethical committee of the Faculty of Social Sciences at Radboud University (ECG2912-2711-059).

The task was developed and carried out in *Presentation* version 13 (Neurobehavioral Systems<sup>5</sup>) on an HP Compaq Intel Core 2 computer with 1.58 GHz memory and a refresh rate of 120 Hz. Participants were tested individually in a soundproof room. They were seated at a table at a 60 cm distance from the computer screen. The visual stimuli were presented in white capital letters (24 points) in Arial font in the middle of the screen on a dark gray background.

Participants first read the English instructions. These informed them that they would be presented with word strings, and asked them to push the 'left' button if the letter string they saw was an existing English word and the 'right' button if the letter string was a Dutch word. They were informed that some words in the experiment could belong to both Dutch and English. In those cases, they were free to choose whichever response they liked. They were asked to react as accurately and quickly as possible.

Each trial started with the presentation of a black fixation point '+,' which was displayed in the middle of the screen for 700 ms. After 300 ms the target stimulus was presented. It remained on the screen until the participant responded or until a maximum of 1500 ms passed by. The experiment was divided into two parts of equal length. The first part was preceded by 20 practice trials. After the practice trials, the participant could ask questions before continuing with the test trials. The two parts each contained 84 experimental trials, and each started with three dummy trials.

After completing the language decision task, participants performed the X-LEX (Meara and Milton, 2003). This task was used to obtain a general indication of their proficiency in English in terms of vocabulary knowledge. All participants obtained a score of 3200 or higher, which qualified them as intermediately or highly proficient in English. Finally, participants were asked to fill out a language background questionnaire. The experimental session lasted approximately 18 minutes.

# **RESULTS**

The data were first screened for high error rates of participants and items. Mean participant accuracy ranged between 90.3 and 100%. Given the small proportion of errors, no participant had to be excluded on the basis of accuracy. However, four participants were excluded because of their slow mean RTs on the task (more than 2 SDs from the group RT mean).

Items that had more than 20% of errors were removed from the data set. These included two cognate items (*priest* and *thee*) and one non-cognate item (*poem*). Note that responses to identical cognates, which have an identical form in English and Dutch, could never result in errors, because either an English or a Dutch response is appropriate. Incorrect and null responses were removed from the remaining data set. This resulted in a dataset of 6473 data points. Inspection of the distribution of the response latencies revealed non-normality, with outliers in both tails. An inverse transform (RT = 1000/RT) was most successful in attenuating this non-normality.
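The two screening steps that can be stated exactly, the 2 SD criterion for slow participants and the inverse transform, can be sketched as follows. This is an illustration under stated assumptions: the sample standard deviation is computed over participant mean RTs, only the slow side is excluded, and the function names are hypothetical. The text does not specify these details.

```python
from statistics import mean, stdev


def slow_participants(mean_rt_by_participant, n_sd=2.0):
    """Return participants whose mean RT exceeds group mean + n_sd * SD.

    Assumption: the SD is the sample SD of the participant means.
    """
    values = list(mean_rt_by_participant.values())
    cutoff = mean(values) + n_sd * stdev(values)
    return sorted(p for p, rt in mean_rt_by_participant.items() if rt > cutoff)


def inverse_rt(rt_ms):
    """Inverse transform as given in the text: 1000 / RT (RT in ms)."""
    return 1000.0 / rt_ms
```

Note that 1000/RT reverses the ordering of latencies: on the transformed scale, a larger value corresponds to a shorter raw latency.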

As in Experiment 1, the data were analyzed with a linear mixed effects model. We considered the same predictors as in Experiment 1. *Response Language* and *Previous Language* were added as variables. *Response Language* was defined as the value (Dutch or English) of the response given to the preceding word. *Previous Language* corresponded to the language membership of the preceding word (Dutch, English, or in the case of identical cognates, both). Moreover, we added the predictor *Total Family Size* (the sum of the Dutch and English family sizes) to account for possible increased response conflict due to the large amount of global activation in the lexicon produced by the family members. The same procedure as in Experiment 1 was applied to obtain the final model.

<sup>5</sup>www.nbs.com

Both *Dutch Family Size* and *English Family Size* were considered in one model. Both predictors had an inhibitory effect on response latencies when both were included in the same model or when included in a separate model with only one family size measure. Moreover, *Total Family Size* had an inhibitory effect. An ANOVA revealed that the model with *Total Family Size* was slightly better at explaining the variance (as reflected by lower AIC values). Therefore, *Total Family Size* was included in the model instead of *English Family Size* and *Dutch Family Size*. Further, the predictor *Dutch Frequency* produced a non-significant coefficient and was removed from the model. Finally, *Word Type*, *Identical Cognate,* and *Cognate* were considered. The model with *Identical Cognate* resulted in the best fit of the data.
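The AIC-based comparison mentioned above rests on the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L the maximized likelihood; the model with the lower value is preferred. The sketch below shows only this bookkeeping. The function names are hypothetical, and no log-likelihoods from the study are assumed or reproduced.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood


def preferred_model(candidates):
    """Return the name of the candidate model with the lowest AIC.

    candidates maps a model name to a (log_likelihood, n_parameters) pair.
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))
```

Because AIC penalizes each extra parameter, a model with one combined predictor can be preferred over a model with two separate predictors even when its log-likelihood is slightly lower.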

The final model incorporated two parameters for the random-effects structure of the data: a standard deviation for the random intercept for item (SD = 0.07) and subject (SD = 0.14), as well as a standard deviation for the by-subject random slope for *Trial* (SD = 0.06). The SD for residual error was 0.35. The model contained three numerical predictors (*English Frequency*, *Total Family Size,* and *OLD*), three factorial predictors (*Identical Cognate, Response Language*, and *Previous Language*), and four interactions (*Identical Cognate: Total Family Size*, *Identical Cognate: English Frequency*, *Total Family Size: Response Language*, and *Identical Cognate: Previous Language*). The relevant statistics and corresponding coefficients of the final model are reported in **Table 4**. The significant effects of the final model are visualized in **Figure 2**. In both **Table 4** and **Figures 2E,G**, *Identical Cognate* has two levels, *True* and *False*, the former corresponding to the set of identical cognates, and the latter to the set of non-identical cognates and non-cognates.

A significant facilitatory main effect of *English Frequency* was observed. Further, *Total Family Size* had an inhibitory effect on word processing. Moreover, *OLD* had an overall inhibitory effect, showing that the more distant orthographic neighbors are in terms of orthographic similarity, the harder it is to make a language decision.

**Table 4 | Coefficients of the main effects and interaction effects of the final model, together with the standard error,** *t***-values and** *p***-values in English-Dutch language decision (Experiment 2).**


The main effect of *Response Language* revealed slower response latencies when Dutch was chosen as response language (including responses to Dutch identical cognates and Dutch non-cognate words). Moreover, we observed an interaction between *Total Family Size* and *Response Language* demonstrating faster RTs for words with a large combined family size when the response language was Dutch.

There was no significant main effect of *Identical Cognate* when multiple interactions were included in the model. *Identical Cognate* interacted significantly with *Total Family Size* and revealed more inhibition with an increasing number of Dutch and English family members for identical cognates than for the other stimuli. Finally, *Identical Cognate* interacted with *Previous Language* showing faster response latencies for non-identical cognates and non-cognates compared to identical cognates when the response language was English.

The possibility of a response strategy was considered in a model predicting the response language chosen by the participant (English or Dutch) on identical cognates only. The same predictors that were considered in the analysis of the complete data set were included. Again, all non-significant predictors were removed.

The final model incorporated two parameters for the random-effects structure of the data: a standard deviation for the random intercept for item (SD = 0.09) and subject (SD = 0.16), as well as a standard deviation for the by-subject random slope for *Trial* (SD = 0.06). The standard deviation for residual error was 0.42. The model contained two numerical predictors (*Dutch Frequency* and *Dutch Family Size*) and one interaction (*Dutch Family Size: Dutch Frequency*). The relevant statistics and corresponding coefficients of the final model are reported in **Table 5**. The significant interaction of the final model is visualized in **Figure 3**.

*Dutch Family Size* interacted significantly with *Dutch Frequency*, revealing that a high *Dutch Frequency* led to more Dutch responses when the *Dutch Family Size* was small (and vice versa). When both the *Dutch Family Size* and *Dutch Frequency* were low, more English responses were given.

**FIGURE 2 | Partial effects of the significant predictors on response latencies in English-Dutch language decision (Experiment 2). (A)** Log English Frequency, **(B)** Log Total Family Size, **(C)** Response Language (2 levels: English and Dutch), **(D)** OLD, **(E)** the interaction of Log Total Family Size with Identical Cognate (2 levels: True, corresponding to the set of identical cognates, and False, corresponding to the set of non-identical cognates and non-cognates), **(F)** the interaction of Log Total Family Size and Response Language (2 levels: English and Dutch), and **(G)** the interaction of Previous Language (2 levels: English and Dutch) and Identical Cognate (2 levels: True, corresponding to the set of identical cognates, and False, corresponding to the set of non-identical cognates and non-cognates).

**Table 5 | Coefficients of the model predicting the choice for response language in identical cognates in Dutch-English language decision (Experiment 2).**

In order to obtain a more fine-grained picture, we further looked at non-linear relationships involving family size and cognate status. We therefore also analyzed the data by means of a generalized additive mixed model (GAMM)<sup>6</sup>. The parametric part of the model contained the predictor *IRL* specifying the four combinations of *Identical Cognate* and *Response Language*, while the non-parametric part included tensor product smooths for the interactions of *IRL* with *English Frequency* and *Total Family Size,* and smooth terms for item and the interaction of *Trial* by participant. **Table 6** presents the coefficients for the main effects and interaction effects of the GAMM, together with the standard error, *t*-value and *p*-value. **Figure 4** visualizes these effects. The results of the GAMM refined the results of the earlier linear mixed effects model as follows.

<sup>6</sup>A generalized additive mixed model (GAMM) extends the general linear model by allowing non-linear relationships between one or more predictors and the dependent variable. It consists of a parametric part that is identical to that of a standard (generalized) linear model, and a non-parametric part that provides functions for modeling non-linear functional relations in two or higher dimensions. GAMMs are especially useful for modeling interactions of numerical predictors. Whereas multiplicative interactions in the generalized linear model impose a very specific (and highly restricted) functional form, the so-called tensor product smooths of GAMMs make it possible to fit wiggly regression surfaces and hypersurfaces (see Wood, 2006, for further details).

In the parametric part of the model, the reference level of *IRL* refers to identical cognates responded to with an English decision (TRUE.EN in **Table 6**). Relative to identical cognates, responses with 'English' for non-identical cognates and English non-cognates were faster (by 0.093). Identical cognates that received a 'Dutch' response were responded to more slowly than identical cognates receiving an 'English' response (by 0.16). 'Dutch' responses to non-identical cognates and Dutch non-cognates were faster, just as for English (by 0.096). In other words, identical cognates were difficult to respond to, especially so when participants decided to go for 'Dutch' as response.

**FIGURE 3 | Significant interaction between Dutch Family Size and Dutch Frequency as a predictor of the choice for response language (0 = English, 1 = Dutch) on identical cognates.**

**Table 6 | Coefficients of the GAMM predicting response latencies in Dutch-English language decision (Experiment 2).**

The non-parametric part of the model showed that for identical cognates responded to with 'English,' mainly an effect arose of *Total Family Size*: a greater combined Dutch and English family size slowed the participants' responses. For identical cognates responded to with 'Dutch,' there was mainly a facilitatory effect of *Frequency*. For non-identical cognates and non-cognates in English, *Frequency* and *Total Family Size* were not predictive. Finally, for non-identical cognates and non-cognates with 'Dutch' as response language, both *Frequency* and *Total Family Size* were at work. Both effects were now facilitatory. The final two panels of **Figure 4** show a large variability in subjects and items. For subjects, the factor smooths show large differences between fast and slow subjects, plus considerable variation in how they proceeded through the experiment.

A second GAMM analysis was performed to analyze the choice for response language upon seeing an identical cognate. The model included the predictor *Total Family Size* as well as smooth terms for *RT*, item, and the interaction of *Trial* by participant. **Table 7** presents the coefficients for the main effects and interaction effects of the model, together with the standard error, *z*-value, and *p*-value. **Figure 5** visualizes these effects that assess the log of the Dutch/English odds ratio. The upper left panel indicates that, as *RT* increases, Dutch is more likely to be selected. For shorter response latencies, however, there is considerable uncertainty about the estimate, suggesting guessing behavior. The upper right panel shows that, with incomplete information about the time series of responses (when only identical cognates are included in the analysis), most of the participant differences concern a language bias on the part of the participants, some preferring Dutch, others preferring English. The lower left panel indicates that the item effects were fairly normal. Finally, the lower right panel presents the effect of *Total Family Size*. The greater the joint English-Dutch family size, the more likely Dutch was as the response category.
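To read effects on this scale, it can help to convert log odds back into a probability. The sketch below assumes that the plotted quantity is the log of the odds of a Dutch over an English response, so that a value of 0 corresponds to both responses being equally likely. The function name is hypothetical.

```python
import math


def log_odds_to_probability(log_odds):
    """Convert log odds of a Dutch response into a probability of 'Dutch'.

    Uses the logistic function in a form that avoids overflow
    for large negative or positive inputs.
    """
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    exp_value = math.exp(log_odds)
    return exp_value / (1.0 + exp_value)
```

Positive log odds thus indicate that a Dutch response is more likely than an English one, and negative log odds indicate the reverse.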

In sum, the model on response latencies reveals, from the shifts in intercepts, that when dealing with an identical cognate, participants were faster to choose English and slower to choose Dutch. When they chose English, a large *Total Family Size* (mostly coming from Dutch family size) worked against this decision (upper left panel of **Figure 4**). When they chose Dutch, a greater *Frequency* facilitated this response. When dealing with a non-identical cognate or a non-cognate, responses were on average faster: the item's orthography was informative about the language. For English, lexical distributional properties had no predictive value. For Dutch, *Frequency* and *Total Family Size* worked in the usual way, both affording facilitation. From the analysis of the language selected for response, we see that participants based their ultimate decision on semantics: the better integrated a word was in the lexical network, as evidenced by a large family size, the more likely a participant was to opt for Dutch. As family sizes in English are probably smaller than those for Dutch for these participants, using family size as a guide to language is a rational choice. Of course, using family size as a rationale for selecting Dutch words must give rise to longer decision latencies when actually a decision is made favoring English. This is exactly what we see in the reaction time data (upper left panel of **Figure 4**).

We conclude that participants performing this language decision task operate under two potentially conflicting sources of information. First, the orthography provides, for non-identical cognates and non-cognates, but distributionally also for identical cognates, a bias toward one or the other language. Second, the semantic activation of a word, gauged by its family size, does not allow a language decision. Participants in this experiment chose to optimize their responses by taking a large family as evidence for their native language. For English, this slowed their responses.

# **DISCUSSION**

The aim of Experiment 2 was to tap into the task dependency of the family size effect for cognates. In this experiment, we applied a language decision task in which participants had to decide if a visually presented word was either English or Dutch. Because in this task participants have to distinguish the two readings of a word, response conflicts are expected to arise upon seeing a cognate and these conflicts should result in a cognate inhibition effect. We hypothesized that activation of both target and non-target language family members should strengthen the activation of both representations and add to the response competition in cognates.

As was shown in a linear mixed effects model and confirmed by a GAMM, there was a clear dissociation between identical cognates and non-identical cognates in terms of response latencies. Identical cognates were processed more slowly than non-identical cognates and non-cognates, though the main effect of *Identical Cognate* disappeared when multiple interactions with *Identical Cognate* were considered in the linear mixed effects model. The inhibitory effect can be explained as follows. For identical cognates, which have the same orthography in both Dutch and English, there is no language-specific orthographic cue that will resolve the language decision, and both language responses are appropriate (the choice is left to the participant). This will induce response competition for identical cognates. The response competition is attenuated in non-identical cognates, because these items contain orthographic cues that resolve the language ambiguity, resulting in no significant inhibition for these types of cognates compared to language-specific non-cognates.

The family size effects were found to be inhibitory for both languages (in the final model, both family sizes were combined into one count *Total Family Size*, which resulted in an even larger coefficient for family size). This finding argues against the hypothesis that cross-language family size effects are exclusively driven by the semantic overlap between family members and target word. This would logically always lead to facilitatory effects in cognates. Instead, the inhibitory family size effects observed for both languages show that family size effects are sensitive to task context. Activated family members were found to increase the induced response competition between cognate representations (i.e., the more a word points to both languages, the more difficult it is to make a choice between a Dutch and an English response).

Interestingly, the observed dissociation between identical cognates and non-identical cognates was also reflected in the strength of the combined family size effect. *Total Family Size* interacted with *Identical Cognate*, reflecting a large inhibition effect for identical cognates but not for non-identical cognates and non-cognates. This shows that activation of Dutch and English morphological family members added to the competition in identical cognates, increasing the inhibitory effect for these words.

Surprisingly, although participants were more fluent in Dutch than in English, they were slower when they chose Dutch as a response language (both for items that require a Dutch response and for items that may receive one). Moreover, participants were slower on non-identical cognates and non-cognates compared to identical cognates when they were preceded by a Dutch item. This suggests that participants applied a response strategy in which English was set as a default response (cf. the language decision experiment in Dijkstra et al., 2010). Finally, *Total Family Size* moderated the Dutch responses: a Dutch response for words with a large combined family size resulted in faster response latencies.

The possibility of a response strategy was considered in a model predicting the choice for a given response in English or Dutch on identical cognates only. The choice pattern for identical cognates could be predicted from *Dutch Family Size* and *Dutch Frequency*. Identical cognates that were highly frequent in Dutch elicited more Dutch responses than less frequent identical cognates. Similarly, identical cognates that had a high productivity in terms of Dutch family members more often elicited a Dutch response than identical cognates with a smaller number of Dutch family members. However, when both the Dutch frequency and family size were either very low or very high, participants more often pressed the English response button. Relating this finding to the observed pattern in the response latencies, it suggests that our bilingual participants adopted a response strategy in which English was the default response language, which was hindered by the strong Dutch activation. These results were largely confirmed by the GAMM analysis: participants used the combined morphological family (consisting for a substantial part of Dutch family members) as a rationale for selecting Dutch words. This resulted in longer decision latencies when actually a decision was made favoring English.

In sum, the language decision results reveal that the direction of the family size effect is sensitive to task-induced processes such as response competition between cognate representations. Furthermore, we found a dependency of the family size effect on the cross-linguistic degree of form overlap in cognates, which is an indication that the activation of family members depends on their similarity to the input word. For instance, the input letter string *work* may activate a Dutch family member like *werkplaats* somewhat less than *hotel* would activate *hotelkamer*, because of the cross-linguistic difference in orthographic overlap between the target words and their family members. In language decision (in contrast to lexical decision), this effect of orthographic overlap becomes visible, because, due to an increased activation of both language nodes for identical cognates, response competition becomes enlarged and magnifies the family size effects.

# **GENERAL DISCUSSION**

The present study investigated the role of task-dependency and orthographic overlap in activating cross-language family members. By looking at family size effects in cognates, we aimed at answering two main questions. First, is the cross-language family size effect sensitive to language-specific orthographic cues of stimuli, such as the degree of orthographic overlap between cognate representations? Second, is the cross-language family size effect sensitive to more task-dependent processes, such as response competition between cognate representations? These questions were investigated with Dutch-English bilinguals in two behavioral experiments: an English lexical decision task (Experiment 1) and an English-Dutch language decision task (Experiment 2).

In Experiment 1, English lexical decision, a cognate facilitation effect was observed for both identical and non-identical cognates relative to English non-cognates, with the largest effects for identical cognates. Dutch family size was observed to have a facilitatory effect on cognate processing. Further, no interaction between Dutch family size and cognate type was found, indicating that the strength and the direction of the cross-language family size effect did not significantly change as a function of the degree of form overlap in the cognate items.

In Experiment 2, a Dutch-English language decision with the same type of bilinguals as was used in Experiment 1, response competition between Dutch and English cognate representations was experimentally induced by means of a two-choice forced decision about the language membership of the items. Relative to non-cognates, this resulted in an overall inhibitory cognate effect for identical cognates but not for non-identical cognates. English family size had an inhibitory effect on response latencies to both cognates and purely English words. With respect to Dutch family size effects, similar inhibitory effects were observed for cognates and purely Dutch items. Moreover, the inhibitory effects of Dutch and English family size in cognates were stronger when they were combined into one family size count (*Total Family Size*). These results demonstrate that the direction of the within-language and cross-language family size effects (facilitatory or inhibitory) is not only driven by semantic overlap in the morphological family, but is sensitive to other processes that play a role in the task at hand, such as response competition.

Interestingly, the combined family size effect was also found to depend on cognate type: a large combined morphological family induced more inhibition in identical cognates than in non-identical cognates. This can be explained by assuming that identical cognates, due to their complete orthographic overlap, lead to a stronger activation of semantics and of family members than non-identical cognates. In language decision, this complete cross-linguistic overlap might increase the amount of response competition between activated cognate representations.

How do these bilingual family size effects in cognates relate to the findings of earlier and predominantly monolingual studies that argued that the family size effect is a purely semantic effect? We found that the cross-language family size effect is sensitive to the demands posed by the task to be performed. In a task like English lexical decision (Experiment 1), only one language (English) is relevant for responding ("is this an English word or not"), and the activation of English words is assessed against the background activity in the lexicon produced by English non-words. In this task situation, English is not explicitly contrasted with Dutch. Under these circumstances, especially semantic convergence of family members in the two languages seems to determine the direction of the family size effect for cognates, resulting in facilitation. Similar findings arise for generalized lexical decision, in which words of both languages underlie the "yes, it is a word" response (e.g., Dijkstra et al., 2005; Mulder et al., 2014). These results are similar to those obtained in the monolingual domain (e.g., Schreuder and Baayen, 1997; De Jong, 2002).

In contrast, in our language decision task (Experiment 2), the two languages must be contrasted explicitly to arrive at a correct response ("is this word English or Dutch?"). Here orthographic language-specific information is relevant for distinguishing activated cognate representations, each of which is linked to a particular response. As a consequence, the processing of cognates suffers from response competition between activated representations. In line with this argumentation, Dijkstra et al. (2010) observed longer response latencies for identical cognates compared to non-identical cognates in language decision. This finding shows that the larger the orthographic overlap in cognates is, the larger the competition between activated representations is as well. Our data qualify this finding by showing that it is more a complete-incomplete distinction with respect to orthographic overlap rather than a graded effect. In this sense, identical cognates might have a special status that allows for maximal cross-linguistic effects to occur (cf. Mulder et al., 2014).

In fact, in our language decision experiment, semantic convergence between target and family members did not lead to facilitatory effects of family size, even though activated family members are assumed to strengthen the activation of each cognate representation to which they are linked. Due to the response competition between cognates, inhibitory family size effects arose. Especially in identical cognates, a large family size in one of the two languages is not beneficial for word processing in language decision: the activation of a large number of family members that contain language-ambiguous orthographic information (e.g., the activation of *water* in the English family member *water fall* and Dutch family member *drink water* for the target cognate *water*) increases the response conflict between competing cognate representations. This results in more inhibition for identical cognates with a large family size in one of the two languages relative to non-identical cognates (that contain more language-specific information to resolve the response conflict) with a large family size.

We note that the different direction of the family size effects observed in lexical decision (i.e., facilitation) and language decision (i.e., inhibition) is not due to a difference in the item set, because Mulder et al. (2014) also observed facilitatory effects of family size in lexical decision with an item set that was highly similar to the item set used in our language decision experiment. This indicates that the direction of the effect is not dependent on a specific subset of items but differs as a function of task demands.

Our findings have consequences not only for models explaining morphological effects but also for models of bilingual word processing. According to the MFRM (De Jong et al., 2003), family members are activated through the activated semantic representation of the target to which they are linked, and family size effects occur because of the resonance of activation between the activated family members and the semantic representation.

However, we argue that, in addition to the semantic family effects, in bilingual processing orthographic factors must also play a role. Reconsidering the way in which morphological family members may become activated, two possible routes may be assumed. The first possibility (similar to that in the MFRM) is that, upon reading the input *water*, the orthographic representations of 'water' in each of the two languages become activated. These may then activate their respective (or shared) semantic representations in each language, which will in turn activate their morphological family members. The second possibility is that family members can also be activated indirectly, via a formal route, e.g., the input *water* activates its family member *water fall*, *drinking water*, etcetera via their orthographic compound representations. Evidence for such bottom–up activation of family members is provided by the early family size effects observed in the ERP study of Mulder et al. (2013). The finding that family size effects occurred around 200 ms after stimulus onset could point to activation via a formal route, as it is not evident that semantic activation already is effective at this point in time. Furthermore, the assumption of a formal route leads to the prediction that there should be family size effects in progressive demasking. Although Schreuder and Baayen (1997) failed to observe family size effects in monolingual progressive demasking, this could have several causes. For instance, orthographic factors on family size effects might only play a role in bilingual processing, because of competing representations of different languages. Alternatively, the traditional ANOVA in their paper might not have been sensitive enough to pick up family size effects in progressive demasking. A replication of this monolingual study and additional bilingual progressive demasking experiments might unravel under which conditions the formal route plays a role.

In sum, the data presented in this paper support an account proposing two routes of activation for family members depending on the task at hand: a direct, bottom–up route via the orthographic representation of the target and an indirect, semantic route via resonance with the target. The explanation of family size effects presented above thus proposes a bilingual extension of the MFRM model of De Jong et al. (2003) in terms of adding an orthographic route to activate family members. Importantly, resonance of activation between the semantic level and lemma level can still occur via this route, and in many task situations, the semantic route may be the dominant route.

Our data are also in line with language non-selective access accounts of bilingual word processing, such as in bilingual interactive activation models like the BIA+ model (Dijkstra and Van Heuven, 2002). Although it allows co-activation of orthographically or phonologically related lexical items, the BIA+ model has no specific account for resonance between family members and the target to which they are linked. Integrating the MFRM model of De Jong et al. (2003) within the BIA+ model would result in a model that allows activation of family members via an orthographic route and a semantic route, and allows resonance between semantic and orthographic representations. This model is displayed in **Figure 6**<sup>7</sup>.

However, a further model extension is required to account for all our bilingual data. In a task situation in which two languages need to be distinguished, such as language decision, activation of language membership information determines the role of the activated family size. In language decision, a response conflict arises when activated representations from two languages overlap in form (e.g., cognates or interlingual homographs) and are linked to different responses. The response competition is more directly dependent on language membership information than on semantic convergence between target and family members. Inhibitory effects of family size of both languages can be explained by summed language membership activation that increases the response conflict. Language membership information based on the orthographic input should become available in parallel to the semantic representation that has been activated (cf. Van Kesteren et al., 2012). However, additional effects of response competition might influence later stages of word processing also when family members have been activated via the overlapping semantic representation. The effect of summed language membership activation on response competition is weaker when the orthographic overlap between the target word and family members is reduced (i.e., there is less activation sent to the inappropriate language membership node). Thus, in an interactive activation account, family size effects can ultimately be explained via three mechanisms: facilitation due to *orthographic* co-activation of morphological family members in cognates, facilitation due to *semantic* co-activation in cognates, and *response inhibition* due to co-activated morphological family members, captured in one value as summed language membership activation (as in language decision)<sup>8</sup>.

**FIGURE 6 | Schematic representation of activation of family members within a bilingual interactive activation model based on BIA+.** The activation of the morphological family of a target word can affect the processing of a target word positively when (a) family members are activated via a semantic route, or (b) family members are activated via an orthographic route but there is resonance of activation between semantics and orthography, and negatively when (a) activated family members map onto a different response or (b) family members are activated via an orthographic route and resonance of activation between semantics and orthography is still under development.

<sup>7</sup>Note that, in contrast to Schreuder and Baayen (1995) and De Jong et al. (2003), the BIA+ model does not specify a level for lemmas and morphemes, but contains an orthographic level.

In sum, we observed effects of cross-language family size for cognates in two paradigms (English lexical decision and English-Dutch language decision) that have similarities and differences in the demands they make on the participant. Semantic resonance between family members and target word was shown to be a major mechanism underlying family size effects, but orthographic overlap also played a role when it was relevant for making the correct response in language decision. All in all, we argue that the effect of morphological family size is sensitive to both semantic and orthographic factors, and also depends on task demands. As such, the research in this paper is of fundamental importance to the study of morphology, because it clarifies how simplex words activate morphologically complex associates (their family members) in bilingual word processing.

### **ACKNOWLEDGMENT**

We would like to express our gratitude to our late colleague, Prof. Robert Schreuder, who commented on the design of the experiments.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 19 September 2014; accepted: 09 January 2015; published online: 02 February 2015.*

*Citation: Mulder K, Dijkstra T and Baayen RH (2015) Cross-language activation of morphological relatives in cognates: the role of orthographic overlap and task-related processing. Front. Hum. Neurosci. 9:16. doi: 10.3389/fnhum.2015.00016*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2015 Mulder, Dijkstra and Baayen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Changes in functional connectivity within the fronto-temporal brain network induced by regular and irregular Russian verb production

# *Maxim Kireev<sup>1,2</sup>, Natalia Slioussar<sup>3,2</sup>\*, Alexander D. Korotkov<sup>1,4</sup>, Tatiana V. Chernigovskaya<sup>1,2</sup> and Svyatoslav V. Medvedev<sup>1</sup>*

<sup>1</sup> N.P. Bechtereva Institute of the Human Brain, Russian Academy of Sciences, St. Petersburg, Russia

<sup>2</sup> Faculty of Liberal Arts and Sciences, St. Petersburg State University, St. Petersburg, Russia

<sup>3</sup> Faculty of Philology, Higher School of Economics, Moscow, Russia

<sup>4</sup> Radiological Center of Tyumen Regional Oncology Center, Tyumen, Russia

### *Edited by:*

Mirjana Bozic, University of Cambridge, UK

### *Reviewed by:*

Yury Y. Shtyrov, Aarhus University, Denmark

Emmanuel A. Stamatakis, University of Cambridge, UK

### *\*Correspondence:*

Natalia Slioussar, Faculty of Philology, Higher School of Economics, Staraya Basmannaya Street 21/4, Moscow 105066, Russia e-mail: slioussar@gmail.com

Functional connectivity between brain areas involved in the processing of complex language forms remains largely unexplored. Contributing to the debate about neural mechanisms underlying regular and irregular inflectional morphology processing in the mental lexicon, we conducted an fMRI experiment in which participants generated forms from different types of Russian verbs and nouns as well as from nonce stimuli. The data were subjected to a whole brain voxel-wise analysis of context dependent changes in functional connectivity [the so-called psychophysiological interaction (PPI) analysis]. Unlike previously reported subtractive results that reveal functional segregation between brain areas, PPI provides complementary information showing how these areas are functionally integrated in a particular task. To date, PPI evidence on inflectional morphology has been scarce and only available for inflectionally impoverished English verbs in a same-different judgment task. Using PPI here in conjunction with a production task in an inflectionally rich language, we found that functional connectivity between the left inferior frontal gyrus (LIFG) and bilateral superior temporal gyri (STG) was significantly greater for regular real verbs than for irregular ones. Furthermore, we observed a significant positive covariance between the number of mistakes in irregular real verb trials and the increase in functional connectivity between the LIFG and the right anterior cingulate cortex in these trials, as compared to regular ones. Our results therefore allow for dissociation between regularity and processing difficulty effects. These results, on the one hand, shed new light on the functional interplay within the LIFG-bilateral STG language-related network and, on the other hand, call for partial reconsideration of some of the previous findings while stressing the role of functional temporo-frontal connectivity in complex morphological processes.

**Keywords: fMRI, Russian, inflectional morphology, functional connectivity, psycho–physiological interactions, fronto-temporal brain network, dual-route theories, single-route theories**

### **INTRODUCTION**

Numerous studies examine morphologically complex forms to compare different models of inflection in the mental lexicon. One of the crucial things they focus on is the distinction between regular and irregular forms. The so-called "dual route" (DR) approach assumes that the former are generated and processed by symbolic rules, while the latter are stored in the lexicon, from where they can be retrieved through associative memory mechanisms (e.g., Pinker and Prince, 1988; Pinker, 1991; Marslen-Wilson and Tyler, 1997; Orsolini and Marslen-Wilson, 1997; Clahsen, 1999; Ullman, 2004). According to the "single route" (SR) approach, all forms are computed by a single integrated system that contains no symbolic rules (e.g., Rumelhart and McClelland, 1986; MacWhinney and Leinbach, 1991; Plunkett and Marchman, 1993; Ragnarsdóttir et al., 1999; McClelland and Patterson, 2002).

Behavioral studies testing DR and SR approaches analyze a variety of languages, but neuroimaging studies rely primarily on English and German data (e.g., Jaeger et al., 1996; Indefrey et al., 1997; Ullman et al., 1997; Marslen-Wilson and Tyler, 1998; Münte et al., 1999; Newman et al., 1999, 2007; Beretta et al., 2003; Sach et al., 2004; Joanisse and Seidenberg, 2005; Desai et al., 2006; Sahin et al., 2006; Oh et al., 2011). Inflectional morphology in morphologically richer languages like Finnish, Polish, and Arabic was examined in a number of neuroimaging studies (e.g., Lehtonen et al., 2006; Boudelaa et al., 2010; Leminen et al., 2011; Szlachta et al., 2012). However, these studies did not compare regular and irregular forms, focusing on other problems (the distinction between inflectional and derivational morphology, the role of general perceptual and specifically linguistic complexity, etc.).

In the present study, we turned to Russian, a language with rich and diverse morphology, and conducted an fMRI investigation where participants were asked to generate present tense forms from different real and nonce (nonword) verbs and to pluralize real and nonce nouns. Addressing the problem of regularity in a morphologically rich language is important because one can tease apart several factors that are confounded in a language like English (while English definitely has its own advantages with its minimalist system and sharp contrasts between inflectional classes). To give one example, all regular past tense forms are morphologically complex in English, i.e., contain a stem and a suffix (*-ed*), while irregular forms are morphologically simplex. In Russian, all past tense forms are morphologically complex, which gives us an opportunity to find out whether the effects observed in English were due to regularity or to morphological complexity. Other properties of Russian that may be relevant for the debate will be pointed out in Section "A Brief Description of the Russian Verb and Noun Systems." We opted for a production task because it was used in the majority of neuroimaging studies focusing on regular vs. irregular inflectional morphology.

Experimental data reflecting the localization and the direction of the change in functional activity are reported in Slioussar et al. (2014). In this paper, we present a ROI-whole brain voxel-wise analysis of context dependent changes in functional connectivity [a psychophysiological interaction (PPI) analysis; Gitelman et al., 2003]. The first type of analysis makes it possible to reveal functionally segregated brain areas that change their activity in response to experimental manipulations, while PPI is a measure of functional connectivity, which provides complementary information showing how these segregated brain areas are integrated (Friston, 2011). Although PPI analysis does not make it possible to infer causal relationships, it gives an opportunity to observe how the functional interplay between involved brain regions is changed as a function of the psychological context.
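To make the logic of a PPI regressor concrete, the following is a minimal sketch on simulated data, not the pipeline used in this study. It assumes a seed time series (standing in for a region such as the LIFG) and a psychological regressor coded +1 for one condition and -1 for another, and it forms the interaction directly on the observed signals. The approach of Gitelman et al. (2003) instead forms the interaction at the neural level after deconvolution and then reconvolves it with a hemodynamic response function, a step omitted here. All names (`ppi_design`, `fit_ppi`, the simulated coupling values) are hypothetical.

```python
import numpy as np

def ppi_design(seed, psych):
    """Return a design matrix with columns: intercept, seed, psych, seed x psych."""
    seed = np.asarray(seed, dtype=float)
    psych = np.asarray(psych, dtype=float)
    seed_c = seed - seed.mean()      # centred physiological regressor
    psych_c = psych - psych.mean()   # centred psychological regressor
    interaction = seed_c * psych_c   # the PPI term
    return np.column_stack([np.ones_like(seed_c), seed_c, psych_c, interaction])

def fit_ppi(target, seed, psych):
    """Ordinary least-squares fit of a target time series on the PPI design."""
    X = ppi_design(seed, psych)
    betas, *_ = np.linalg.lstsq(X, np.asarray(target, dtype=float), rcond=None)
    return betas

# Simulated example (assumed values, for illustration only).
rng = np.random.default_rng(0)
n_scans = 400
# Alternating blocks of 20 scans: +1 (e.g., regular) and -1 (e.g., irregular).
psych = np.tile(np.repeat([1.0, -1.0], 20), n_scans // 40)
seed = rng.standard_normal(n_scans)
# The target couples with the seed more strongly in the +1 condition:
# coupling = 0.5 + 0.3 * psych.
target = (0.5 + 0.3 * psych) * seed + 0.1 * rng.standard_normal(n_scans)

betas = fit_ppi(target, seed, psych)
# betas[1] estimates the condition-independent coupling,
# betas[3] estimates the context-dependent change in coupling (the PPI effect).
```

A non-zero interaction coefficient in such a model indicates that the relationship between the two regions differs across conditions; as noted above, it does not license causal inference.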

Therefore, we saw PPI analysis as a valuable tool to approach the problem from a new angle, especially given the fact that we found only one previous PPI study of inflectional morphology (Stamatakis et al., 2005). Important similarities and differences between Stamatakis et al.'s (2005) findings and our results offer a novel perspective on our findings from Slioussar et al. (2014), the account proposed by Stamatakis et al. (2005) and a number of problems discussed in other studies.

# **A BRIEF DESCRIPTION OF THE RUSSIAN VERB AND NOUN SYSTEMS**

The Russian verb system is very complex, and there are several approaches to dividing verbs into classes. According to the one developed in Jakobson (1948), Townsend (1975) and Davidson et al. (1996), Russian has 11 verb classes and several so-called anomalous verbs. Ten classes are identified by their suffixes, while the 11th class has a zero suffix, and is subdivided into subclasses depending on the quality of the root-final consonant [Jakobson (1948) and Townsend (1975) counted them as 13 separate classes].

All verbs have two stems: the present/future tense stem and the past tense stem. Depending on the class, the correlation between them may include truncations or additions of the final consonant or vowel, stress shifts, suffix alternations, alternations of stem vowels, and stem-final consonants. The verb class also determines which set of endings is used in the present and future tense (first and second conjugation types). Usually, the class is unrecoverable from a particular form. For example, *délat'* 'to do' belongs to the AJ class, and its third person plural present tense form is *délaj-ut* (-*j*- suffix is added, first conjugation type).<sup>1</sup> *Pisát'* 'to write' belongs to the A class, and its third person plural present tense form is *píš-ut* (-*a*- suffix is truncated, first conjugation type, final consonant alternation, stress shift). *Deržát'* 'to hold' belongs to the ZHA class, and its third person plural present tense form is *derž-át* (-*a*- suffix is truncated, second conjugation type).
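The point that the class cannot be read off a single form can be illustrated with a small sketch that encodes only the three verbs given above. The data structure and the helper names (`Verb`, `strip_stress`, `classes_sharing_ending`) are hypothetical and serve illustration only; they are not a model of the Russian verb system.

```python
import unicodedata
from dataclasses import dataclass

@dataclass(frozen=True)
class Verb:
    infinitive: str     # scholarly transliteration, stress marked
    verb_class: str     # class label as used in the text
    conjugation: int    # 1 = first conjugation type, 2 = second
    third_plural: str   # third person plural present tense form

# Only the three examples from the text are encoded here.
VERBS = [
    Verb("délat'", "AJ", 1, "délaj-ut"),
    Verb("pisát'", "A", 1, "píš-ut"),
    Verb("deržát'", "ZHA", 2, "derž-át"),
]

def strip_stress(form):
    """Remove acute stress marks while keeping other diacritics (e.g., the caron)."""
    decomposed = unicodedata.normalize("NFD", form)
    without_acute = decomposed.replace("\u0301", "")
    return unicodedata.normalize("NFC", without_acute)

def classes_sharing_ending(verbs, ending):
    """List the verb classes whose infinitive (stress ignored) ends in `ending`."""
    return sorted({v.verb_class for v in verbs
                   if strip_stress(v.infinitive).endswith(ending)})

# All three infinitives end in -at', yet they belong to three different classes.
print(classes_sharing_ending(VERBS, "at'"))  # ['A', 'AJ', 'ZHA']
```

Because the same infinitive ending maps onto several classes with different present tense stems and endings, a speaker (or a model) cannot derive the present tense form from the infinitive by surface shape alone.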

Verb classes dramatically differ in frequency, and five of them are productive. Thus, there is no single productive pattern that can be applied to any stem irrespective of its phonological characteristics, and no obvious division into regular verbs (RVs) and irregular verbs (IVs) in this system. In our fMRI experiment, we decided to look at the two poles of this system, comparing verbs from the most frequent and productive AJ class to verbs from small unproductive classes (we reasoned that if any differences between these two groups were found, we could compare them to other verbs in subsequent studies). For the sake of brevity, we will further call these groups *regular* and *irregular*.

Russian nouns are inflected for number and case and are classified into different declensions depending on the set of their number and case endings. In many ways, this system is simpler than the system of verb classes. There are only three declensions (plus a group of nouns with adjectival endings, several exceptional cases and a number of uninflected nouns). These declensions differ in frequency, but all three are productive. Usually, the declension can be unambiguously determined from the nominative singular form. Inside every declension there are small groups of nouns with minor irregularities: unusual endings in some forms or stem alternations. For our study we selected a group of nouns that lose the last vowel of the stem in many forms including the nominative plural form (e.g., *koster* 'fire' – *kostry*) and a group where the stem never changes, as in the majority of Russian nouns (e.g., *šofer* 'driver' – *šofery*). We will further call the first group *irregular*, although this is a relatively minor irregularity.

## **PREVIOUS STUDIES TESTING THE SR AND DR APPROACHES ON RUSSIAN**

Behavioral studies testing SR and DR approaches on Russian looked at adult native speakers, L1 and L2 learners and subjects with various neurological and developmental deficits (e.g., Gor and Chernigovskaya, 2001, 2003, 2005; Gor, 2003, 2010; Chernigovskaya et al., 2007; Svistunova, 2008; Gor et al., 2009; Gor and Jackson, 2013). Participants were provided with infinitives or past tense forms of real or nonce verbs and prompted to generate first person singular and third person plural present tense forms. The findings did not unambiguously support either the DR or the SR approach. For example, on one hand, adults were shown to use the most frequent AJ class pattern as the default one, although Russian has several highly frequent productive verb classes. In particular, it was often applied to nonce verbs irrespective of their morphonological properties. On the other hand, children consecutively overgeneralize several conjugational patterns in the course of acquisition. As a result, the group of authors working on Russian argued that Yang's (2002) model relying on multiple rules of different status might be better suited to account for their findings. A similar model for Russian was developed by Gor (2003).

<sup>1</sup>There are several ways to transliterate Russian words from the Cyrillic to the Latin alphabet. In this paper, we use the so-called scholarly transliteration system.

The subtractive analysis of the data from the present experiment, which we reported in Slioussar et al. (2014), is the only fMRI study of Russian inflectional morphology we are aware of. Previous neuroimaging studies arguing for the DR approach, as well as a number of studies that do not directly address the DR vs. SR debate (e.g., Marslen-Wilson and Tyler, 2007; Bozic et al., 2010, 2013; Szlachta et al., 2012), argue that rule-based processing is supported by the frontoparietal network, particularly by Broca's area. However, only two fMRI studies comparing regular vs. irregular form production found more activation in Broca's area for regulars (Dhond et al., 2003; Oh et al., 2011). Increased left IFG activation for regulars was also observed in an fMRI study where the processing of spoken regular and irregular forms was compared in a same-different judgment task (Tyler et al., 2005).

Other fMRI studies report the opposite pattern: Broca's area was activated more by irregulars (Beretta et al., 2003; de Diego-Balaguer et al., 2006; Desai et al., 2006; Sahin et al., 2006). Two alternative explanations are proposed. Proponents of the DR approach suggest that these results can be explained by conflict monitoring between the regular rule and irregular form or by inhibition of regular rule application (e.g., Sahin et al., 2006). Desai et al. (2006) argue for the SR approach: they conclude that the observed activation differences reflect the greater processing load posed by irregulars, which rely on less frequent inflection patterns than RVs and therefore have greater attentional and response selection demands.

In Slioussar et al. (2014), nonce verbs and nouns were added to the comparison. Participants silently read stimuli and produced aloud particular forms from them. We found that functional activity within the fronto-parietal network was influenced by regularity and lexicality: it was greater for IVs than for RVs and for nonce verbs than for real ones. We demonstrated that the effects of regularity and lexicality were very similar and concluded that the observed BOLD changes were induced not by (ir)regularity as such, but by the increase of processing load from RVs to IVs to regular nonce verbs (RNVs) to irregular nonce verbs (INVs).

This conclusion was supported by the (RV > B) < (IV > B) < (RNV > B) < (INV > B) parametric contrast, where B is an implicitly modeled baseline, and by behavioral results: the number of mistakes increased from the RV to the IV to the RNV to the INV condition. The results for nouns were similar. Only the main effect of regularity did not reach significance in the factorial analysis of fMRI data – presumably, because the only irregular feature we could find for our noun stimuli was rather minor (see Section "A Brief Description of the Russian Verb and Noun Systems").
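An ordered contrast of this kind is commonly expressed as a set of mean-centred weights over the condition estimates. The sketch below shows one such weighting, assuming equally spaced linear weights; the weights actually used in Slioussar et al. (2014) are not specified here, and the beta values in the example are invented purely for illustration.

```python
import numpy as np

# Conditions in the hypothesised order of increasing processing load.
CONDITIONS = ["RV", "IV", "RNV", "INV"]

def linear_trend_weights(n_levels):
    """Mean-centred, equally spaced contrast weights (sum to zero)."""
    ranks = np.arange(1, n_levels + 1, dtype=float)
    return 2.0 * (ranks - ranks.mean())   # for four levels: -3, -1, 1, 3

def contrast_value(betas, weights):
    """Weighted sum of per-condition estimates (each already relative to baseline)."""
    return float(np.dot(weights, np.asarray(betas, dtype=float)))

weights = linear_trend_weights(len(CONDITIONS))

# Hypothetical per-condition estimates for one voxel (not data from the study).
increasing = [0.2, 0.4, 0.6, 0.8]
flat = [0.5, 0.5, 0.5, 0.5]

trend_increasing = contrast_value(increasing, weights)  # positive: activity rises with load
trend_flat = contrast_value(flat, weights)              # zero: no ordered change
```

Because the weights sum to zero, a uniform response across all four conditions yields a contrast value of zero, so only an ordered increase (or decrease) across conditions drives the statistic.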

### **A PREVIOUS PPI STUDY OF INFLECTIONAL MORPHOLOGY AND THE PRESENT STUDY**

We were able to find only one PPI study of inflectional morphology (Stamatakis et al., 2005). In this study, functional connectivity between functionally predefined regions of interest (ROIs) located in the left inferior frontal gyrus (LIFG), anterior cingulate cortex (ACC), superior temporal gyrus (STG), and middle temporal gyrus (MTG) was assessed during the same/different judgment task. Stimuli were aurally presented pairs of English words and nonce words, in particular, RV and IV pairs like *jumped – jump* and *thought – think*.

Stamatakis et al. (2005) report a positive influence of LIFG activity on the activity in the left STG/MTG and a modulatory influence of ACC activity on this fronto-temporal connectivity. The former effect did not depend on regularity *per se*, but we know from the subtractive analysis of the data reported in Tyler et al. (2005) that RVs activated the LIFG, bilateral STG and MTG significantly more than irregular ones in this study. The latter effect was significantly stronger for regulars than for irregulars. Stamatakis et al. (2005) believe that these findings indicate greater engagement of the fronto-temporal network in RV processing, with the ACC playing a monitoring role. They conclude: "this reflects the additional processing demands posed by regular inflected forms, requiring modulation of temporal lobe lexical access processes by morphological parsing functions supported by the LIFG" (p. 116).

Undertaking a PPI analysis of our data, we were primarily interested in two things. Firstly, an advantage of this approach is that task-dependent connectivity changes may be detected even when the levels of functional brain activity are not affected by experimental manipulations. We aimed to reveal functional interactions underlying changes in functional activity observed within the LIFG during regular and irregular form production (Slioussar et al., 2014). As we noted above, the increase in LIFG activity in IV trials was explained by the difference in processing load between these two tasks in Slioussar et al. (2014). In principle, this difference could attenuate functional activity changes associated with regularity. Therefore we turned to PPI analysis to find out whether this was indeed the case and to tease apart connectivity changes associated with morphological properties and with cognitive demands.

Secondly, we were interested in how our findings would compare to Stamatakis et al.'s (2005) given several important differences in our experiments. First of all, there are obvious differences in the experimental task and in the language used (morphologically poor English vs. morphologically rich Russian). Furthermore, subtractive analyses presented in Tyler et al. (2005) and Slioussar et al. (2014) revealed opposite results: in particular, the LIFG was more activated by regulars in the first study and by irregulars in the second. Finally, the analyses of behavioral data (the number of mistakes in different conditions) showed that irregular trials were more difficult than regular ones for the participants of our study, while Tyler et al. (2005) reported very similar accuracy rates.

In general, we wanted to see whether the functional connectivity of the LIFG would be substantially different during comprehension and production of regular vs. irregular forms (although our task definitely involves a silent reading stage as well). In particular, we expected that if the findings from Stamatakis et al. (2005) are genuine regularity effects, we might be able to replicate them despite all the differences, teasing them apart from processing difficulty effects identified in Slioussar et al. (2014). Foreshadowing the results, this is exactly what we did in the present study.

# **MATERIALS AND METHODS**

### **PARTICIPANTS**

Twenty-one healthy subjects participated in the study (13 females, 8 males). All participants were native speakers of Russian, 19–32 years of age, with no history of neurological or psychological disorders. All participants were right-handed, as assessed by the Edinburgh Handedness Inventory (Oldfield, 1971). Subjects were given no information about the specific purpose of the study. All subjects gave their written informed consent prior to the study and were paid for their participation. All procedures were in accordance with the Declaration of Helsinki and were approved by the Ethics Committee of the N.P. Bechtereva Institute of the Human Brain, Russian Academy of Sciences.

### **MATERIALS**

Materials consisted of eight groups of real and nonce verbs and nouns, illustrated in **Table 1** (a complete list is given in Supplementary Material). The first group of 35 real verbs belonged to the AJ class (RV); the second group contained 35 verbs from several small non-productive classes (IV). Only unprefixed imperfective verbs were used. Two matching groups of 35 nonce verbs (RNVs and INVs) mimicked the general characteristics of the corresponding real verb groups (length and phonological properties of the stem).

The first group of 35 real nouns had no stem changes (regular nouns, RN), while in the second group the last vowel of the stem was dropped in many forms including the nominative plural form (irregular nouns, IN): e.g., *šofer* 'driver' – *šofery* vs. *koster* 'fire' – *kostry*. All nouns were masculine, belonged to the first declension and had the nominative plural form ending in *-y*. Two groups of 35 nonce nouns (regular nonce nouns, RNN, and irregular nonce nouns, INN) were created to match two real noun groups. Frequency was balanced for all real stimulus groups using *The Frequency Dictionary of the Modern Russian Language* (Lyashevskaya and Sharoff, 2009). Stimuli in all groups were matched for length (see Supplementary Material).

Vowels are dropped only in a subgroup of noun stems ending in particular vowel and consonant clusters (e.g., *-er, -or, -el, -ol* etc.). We selected stems with such clusters both for irregular and for RN groups so as not to make the former more phonologically homogenous than the latter. Final vowel dropping is usually predictable from the combination of consonants before this vowel and from the position of the stress. However, since stimuli were presented visually, no information about stress was available for nonce nouns, and different nominative plural forms could be licitly derived from them.

### **LANGUAGE PROTOCOL AND EXPERIMENTAL fMRI PARADIGM**

In total, we had 280 stimuli. Each stimulus was visually presented for 700 ms. Fixation crosses ("xxxxx") were displayed during interstimulus intervals, which varied between 3100 and 3500 ms with a 100 ms step. 140 "null-events" (fixation crosses) were pseudorandomly intermixed with the stimuli (Friston et al., 1999). The experiment was divided into three consecutive runs with 2–5 min rest between them and was preceded by a short practice run. The first 10 dummy scans of each run were discarded. Stimulus delivery and synchronization with fMRI data acquisition were carried out via the Eloquence fMRI System (*In vivo*) and E-Prime software (version 1.1, Psychology Software Tools Inc., Pittsburgh, PA, USA).
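The event structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_schedule`, the plain shuffle used for "pseudorandom" intermixing, and the fixed seed are hypothetical, since the actual randomization constraints and the split of events across the three runs are not specified here. The duration of null events is likewise not given, so it is left unset.

```python
import random

# Timing values taken from the protocol description.
ISI_CHOICES_MS = (3100, 3200, 3300, 3400, 3500)  # 3100-3500 ms in 100 ms steps
STIMULUS_MS = 700

def build_schedule(n_stimuli=280, n_null=140, seed=0):
    """Return a shuffled list of events with a jittered interstimulus interval.

    Assumption: a simple shuffle stands in for the study's (unspecified)
    pseudorandomization procedure.
    """
    rng = random.Random(seed)
    kinds = ["stimulus"] * n_stimuli + ["null"] * n_null
    rng.shuffle(kinds)
    schedule = []
    for kind in kinds:
        schedule.append({
            "kind": kind,
            # Null-event duration is not specified in the text, so it stays None.
            "duration_ms": STIMULUS_MS if kind == "stimulus" else None,
            "isi_ms": rng.choice(ISI_CHOICES_MS),
        })
    return schedule

schedule = build_schedule()
n_real = sum(1 for event in schedule if event["kind"] == "stimulus")
print(len(schedule), n_real)
```

Jittering the interval and interleaving null events in this way is what allows event-related responses to be separated in a rapid design.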

Verbs were presented in the infinitive form, and nouns were presented in the nominative singular form. Subjects were instructed to generate aloud as fast as possible the first person singular present tense form if they saw a verb or the nominative plural form if they saw a noun. All responses were recorded simultaneously with fMRI data acquisition by means of the Persaio MRI Noise Cancelation System (Psychology Software Tools, Inc.). Their correctness was assessed offline. When a participant's responses were no longer appropriate for the target's category, the corresponding trials were discarded in the subsequent fMRI analyses.

### **MR IMAGING PROTOCOL**

Magnetic resonance imaging was performed on a 3 Tesla Philips Achieva scanner. In addition to a scout sequence, participants underwent structural and functional imaging. Structural images were acquired applying a T1-weighted pulse sequence (T1W-3D-FFE; TR = 2.5 ms; TE = 3.1 ms; 30° flip angle) measuring 130 axial slices (field of view, FOV = 240 mm × 240 mm; 256 × 256 scan matrix) of 0.94 mm thickness. Functional images were obtained using an echo planar imaging (EPI) sequence (TE = 35 ms; 90° flip angle; FOV = 208 mm × 208 mm; 128 × 128 scan matrix). Thirty-two continuous 3.5 mm thick axial slices (voxel size = 3 mm × 3 mm × 3.5 mm) covering the entire cerebrum and most of the cerebellum were oriented with reference to the structural image. The images were acquired with a repetition time (TR) of 2000 ms. In order to avoid extensive head motions we used an MR-compatible soft cervical collar.

### **CONNECTIVITY ANALYSIS**

fMRI data preprocessing included realignment, slice-time correction, spatial normalization, and 8 mm full-width/half-maximum isotropic Gaussian smoothing (for details, see Slioussar et al., 2014). It was carried out using SPM8 software (Wellcome Department of Cognitive Neurology, London, UK). Artifact Detection Toolbox<sup>2</sup> was used to remove fMRI outliers from the analysis.

<sup>2</sup>http://www.nitrc.org/projects/artifact\_detect/

**Figure 1 | (A)** Three ROIs overlaid on the areas Slioussar et al. (2014) identified as sensitive to the main effect of regularity (regular real and nonce verbs were compared with irregular real and nonce verbs). **(B)** Increase in functional connectivity induced by regular real verb production in the RV > IV comparison for the LIFGop3 seed region. **(C)** Covariance between the number of mistakes in irregular real verb production and functional connectivity induced by irregular real verbs in the IV > RV comparison for the LIFGop3 seed region.

During the PPI analysis, ROIs were selected from the cluster in the LIFG, which exhibited greater BOLD values for the production of irregular forms (Slioussar et al., 2014). Three ROIs were created by centering a 4 mm radius sphere on the corresponding local maxima in the opercular part of the LIFG (BA 44, see **Figure 1A**), as defined by the Anatomy toolbox 2.0 (Eickhoff et al., 2005). The analysis of functional connectivity changes was performed between each of the selected ROIs and the remaining voxels of the brain using the generalized PPI toolbox<sup>3</sup> (McLaren et al., 2012) and included the following steps. First, neuronal activity underlying the observed BOLD changes in every ROI was mathematically estimated (Gitelman et al., 2003). Then the estimated neuronal activity was multiplied by the vectors of each condition's ON times and convolved with the hemodynamic response function (McLaren et al., 2012; Cisler et al., 2013).
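The last two steps (multiplying the estimated neuronal signal by a condition's ON times and convolving the product with a hemodynamic response function) can be illustrated with a minimal, standard-library Python sketch. It is not the gPPI toolbox code: the function names are hypothetical, the deconvolution step of Gitelman et al. (2003) is skipped (the neuronal estimate is simply assumed to be given), the double-gamma parameters are illustrative, and everything is sampled at scan resolution rather than at the finer time grid a real implementation would use.

```python
import math

def double_gamma_hrf(tr, duration=32.0):
    """Double-gamma response sampled every `tr` seconds (illustrative parameters)."""
    samples = []
    for i in range(int(duration / tr)):
        t = i * tr
        peak = t ** 5 * math.exp(-t) / math.gamma(6)
        undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
        samples.append(peak - undershoot / 6.0)
    total = sum(samples)
    return [s / total for s in samples]

def condition_boxcar(n_scans, tr, onsets_s, duration_s):
    """1.0 for scans whose acquisition time falls inside an ON period, else 0.0."""
    box = []
    for k in range(n_scans):
        t = k * tr
        on = any(onset <= t < onset + duration_s for onset in onsets_s)
        box.append(1.0 if on else 0.0)
    return box

def causal_convolve(signal, kernel):
    """Discrete convolution truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j in range(min(i + 1, len(kernel))):
            out[i] += signal[i - j] * kernel[j]
    return out

def ppi_regressor(neural, onsets_s, duration_s, tr):
    """Interaction term: (neuronal estimate x condition ON vector) convolved with the HRF."""
    box = condition_boxcar(len(neural), tr, onsets_s, duration_s)
    interaction = [x * b for x, b in zip(neural, box)]
    return causal_convolve(interaction, double_gamma_hrf(tr))

# Hypothetical neuronal estimate for ten scans at TR = 2 s, one ON period from 2 s to 6 s.
neural = [0.0, 1.0, 0.5, -0.5, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
regressor = ppi_regressor(neural, onsets_s=[2.0], duration_s=4.0, tr=2.0)
```

Forming the product at the neuronal level, before convolution, is the point of the deconvolution step: the interaction is assumed to take place between neuronal signals, not between already blurred BOLD signals.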

<sup>3</sup>http://www.nitrc.org/projects/gppi

As a result, PPI-regressors corresponding to every experimental trial were created, and the PPI analysis was performed using the general linear model (GLM). Additionally, the GLM included the following nuisance variables: (1) regressors modeling the BOLD signal changes induced by eight experimental conditions and mistake trials (as in the conventional subtractive GLM analysis); (2) head motion parameters and the global mean fMRI outliers; (3) a regressor reflecting the time series of BOLD signal changes within the ROI to exclude context-dependent changes occurring at the hemodynamic level.

To be able to compare our results to those of Stamatakis et al. (2005), we focused on the contrast between regular and irregular *real* word trials in the connectivity analysis, as these authors did. The fronto-temporal connectivity observed by Stamatakis et al. (2005) is most reasonably described as a frontal modulation of lexical access processes, which is obviously not applicable to nonce stimuli. However, the findings from other comparisons are also reported. As in Slioussar et al. (2014), we analyzed verbs and nouns separately rather than putting them together and treating word category as the third factor, primarily because the type of irregularity we were able to find for nouns was very minor compared to what we had in the case of verbs. Thus, RV > IV and IV > RV contrasts of PPI-parameters were estimated with the use of one-sample *t*-tests. Additionally, PPI-parameters for all real and nonce verb trials were analyzed using the ANOVA with two repeated measure factors: lexicality (real vs. nonce) and regularity. The same was done for nouns.

Statistical parametric mappings were computed using the *p* < 0.001 voxel-wise uncorrected threshold. To avoid false positive findings, the FWE *p* < 0.05 correction for multiple comparisons was applied at the cluster level. Since two *t*-tests were calculated for each of the three ROIs, an additional Bonferroni–Holm correction for multiple comparisons was used. The anatomical location of the functional connectivity changes revealed was identified by the Anatomy toolbox. The REX toolbox<sup>4</sup> was used to demonstrate differences between beta values reflecting functional connectivity changes in the revealed clusters.
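For readers unfamiliar with the Bonferroni–Holm step-down procedure, a minimal Python sketch follows. The function name `holm_bonferroni` and the six *p*-values (two tests for each of three seeds) are hypothetical and are not the values obtained in this study; exactly which *p*-values entered the procedure is an assumption of the example.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    P-values are tested from smallest to largest against alpha / (m - rank);
    testing stops at the first non-significant p-value.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject

# Hypothetical p-values: two contrasts (RV > IV, IV > RV) for each of three ROI seeds.
example_p = [0.004, 0.30, 0.012, 0.45, 0.02, 0.60]
decisions = holm_bonferroni(example_p)
print(decisions)
```

In this made-up example only the smallest value survives: 0.004 is below 0.05/6, but the next smallest, 0.012, exceeds 0.05/5, so the procedure stops there.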

### **RESULTS**

In the RV > IV comparisons, the PPI analysis revealed clusters bilaterally located in the anterior portion of the superior temporal gyri (STG, see **Table 2**; **Figure 1B**). This effect was observed only for the LIFGop3 ROI seed; RV > IV PPI-contrasts for the other two ROI seeds were not significant. Calculating the mean values of PPI-parameters within the obtained clusters pointed to a relative increase in connectivity in RV trials in comparison to IV trials.

In the IV > RV comparisons, no significant changes in functional connectivity were found for any of the selected ROI seeds. Since we had concluded in Slioussar et al. (2014) that IV trials were characterized by an increase in processing load in comparison to RV trials, and this conclusion relied not only on neuroimaging, but also on behavioral data (number of mistakes in different conditions), we undertook the following subsidiary analysis to reveal processing load effects. We took the number of mistakes committed by every participant in the IV trials as an individual measure of task difficulty. As we reported in Slioussar et al. (2014), participants made significantly more mistakes in the IV condition than in the RV condition (96 out of 735 vs. 22 out of 735 responses in total, or 13.1% vs. 3.0%, respectively). If we look at each participant separately, the number of mistakes in the IV trials varies from 1 out of 35, or 2.9%, to 9 out of 35, or 25.7%.

Then we submitted IV > RV contrasts of PPI parameters calculated for every participant to the second level group analysis. A one-sample *t*-test, as it is implemented in SPM8, was used with the percentage of mistakes committed by every participant as a variable of interest and estimates of individual IV > RV PPI contrasts as a dependent variable. The results were significant only for one ROI seed, LIFGop3, the same as in the RV > IV PPI analysis above. For this ROI seed, we observed a significant positive covariance between the number of mistakes in the IV trials and the difference in functional connectivity between the LIFG and the right ACC (BA 32; see **Table 3**; **Figure 1C**). Notably, when error rates are low, the difference is negative, i.e., connectivity between the LIFG and the ACC is greater in RV trials. As error rates grow, the difference approaches zero and then becomes positive, i.e., for participants who made more errors than the others, connectivity between the LIFG and the ACC is greater in the IV trials.

<sup>4</sup>http://www.nitrc.org/projects/rex/

**Table 2 | RV** *>* **IV PPI-contrast for a ROI seed in the LIFG (BA 44, LIFGop3).**

\*Significant clusters after Bonferroni–Holm correction.

BA, approximate Brodmann's area; L/R, left/right hemisphere; k, cluster size in voxels; STG, superior temporal gyrus.

**Table 3 | The effect of task difficulty in the IV** *>* **RV PPI-contrast for a ROI seed in the LIFG (BA 44, LIFGop3).**

\*Significant clusters after Bonferroni–Holm correction.

ACC, anterior cingulate cortex; BA, approximate Brodmann's area; L/R, left/right hemisphere; k, cluster size in voxels.
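The sign change in the LIFG–ACC connectivity difference as error rates grow can be pictured as a line with a negative intercept and a positive slope. The Python sketch below fits such a line by ordinary least squares. All numbers in it are SYNTHETIC and serve only to illustrate the shape of the relationship: the error percentages merely span the reported range (1 to 9 mistakes out of 35), the connectivity differences are generated from an arbitrary linear rule, and the name `fit_line` is hypothetical. Treating the second-level covariate analysis as a simple regression is a conceptual simplification.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# SYNTHETIC data: nine made-up participants with 1..9 mistakes out of 35 IV trials.
error_pct = [100.0 * k / 35 for k in range(1, 10)]
# SYNTHETIC IV > RV connectivity differences following an arbitrary linear rule.
ppi_diff = [-0.8 + 0.06 * e for e in error_pct]

slope, intercept = fit_line(error_pct, ppi_diff)
# Error rate at which the fitted IV > RV difference changes sign.
crossover_pct = -intercept / slope
print(round(slope, 3), round(intercept, 3), round(crossover_pct, 1))
```

Below the crossover point the fitted difference is negative (greater connectivity in RV trials), and above it the difference is positive (greater connectivity in IV trials).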

Comparisons involving real noun stimuli (RN > IN and IN > RN), as well as factorial analyses for verb and noun conditions, did not yield significant results.

### **DISCUSSION**

We believe that the most noteworthy outcome of the present study is that the connectivity analysis allowed us to dissociate regularity and processing difficulty effects and, as we hope to show below, gain a deeper understanding of their nature. Since we are going to compare our results to Stamatakis et al.'s (2005) and Tyler et al.'s (2005), let us start by highlighting some relevant differences between English and Russian verbs.

Stamatakis et al. (2005) and Tyler et al. (2005) looked at stimulus pairs like *stayed – stay* vs. *taught – teach.* In the regular pairs, the first stimulus was morphologically complex and the second was not, while in irregular pairs, both stimuli were morphologically simple. Obviously, the regular pattern also differs from irregular ones in terms of productivity and type frequency, and it is the morphological default (some authors argue that being the default pattern is a separate property that cannot be reduced to productivity and type frequency, e.g., Clahsen, 1999; Beretta et al., 2003). Behavioral results (error rates) were very similar for regular and irregular sets in this study: 5.1 and 4.3% respectively.

Due to the nature of the Russian language, in our study, all verb stimuli read or produced by the participants were morphologically complex: e.g., *nyr-ja-t'* 'to dive' *– nyr-ja-ju* (regular) and *mol-o-t*' 'to grind' *– mel-ju* (irregular). The difference in productivity and type frequency is the same as in English. Finally, there was a difference in error rates in our study, indicating that IVs were more difficult to process. Ideally, all three factors – morphological complexity, regularity and processing difficulty – must be assessed separately and then studied in more detail (for example, to see whether the role of productivity can be dissociated from the role of type frequency etc.). We are infinitely far from this goal now, but arguably, our study lets us make a small step toward it.

Firstly, we observed an increase in functional connectivity between the LIFG and temporal cortex, in particular, the left and right STG, in the RV > IV comparison. Stamatakis et al. (2005) reported similar findings. They saw a positive influence of LIFG activity on the activity in the *left* STG and MTG both for regular and irregular real verb trials. Given that the subtractive analysis of the data reported in Tyler et al. (2005) demonstrated that RVs activated the LIFG, bilateral STG and MTG significantly more than irregular ones in this study, the authors conclude that this indicates greater engagement of the fronto-temporal network in RV processing. Since two PPI studies gave similar results in this case, we suggest that this is an effect of regularity.

Stamatakis et al.'s (2005) study and the present study rely on very different languages: morphologically poor English with a clear-cut distinction between regular and IVs vs. morphologically rich Russian, with numerous verb classes and where the notion of regularity is difficult even to define. Experimental tasks were also different: a same/different judgment task for aurally presented stimuli and a production task for visually presented stimuli (which obviously involves a silent reading stage). The fact that our findings partly replicate Stamatakis et al.'s (2005) despite these major differences shows that the observed regularity effect is indeed robust and has cross-linguistic validity. Moreover, it is present both in comprehension and in production, which we consider good news because no major model addressing the problem of regularity defines this notion differently for production and comprehension. An important advantage of our study that strengthens this result is that we used a ROI-whole brain analysis, i.e., did not predefine the set of regions to be analyzed.

Why did Stamatakis et al. (2005) observe coactivation between the LIFG and the left-lateralized temporal brain network, while in our study, both left and right temporal areas were involved? Given the above-mentioned differences between the two studies, explanations can only be very tentative. This could result from task-related differences: for example, in contrast to passive listening, active word production probably involves self-monitoring associated with bilateral STG activation (e.g., Indefrey, 2011). Alternatively, based on the fact that in Tyler et al. (2005) RVs induced an increase of activation in both *left and right* STG and MTG, one could hypothesize that connectivity changes in the right-lateralized temporal language areas simply did not reach significance in Stamatakis et al. (2005).

In addition to the STG, we also observed an increase in connectivity between the LIFG and the putamen. Since this result did not reach significance after Bonferroni–Holm correction, we will refrain from interpreting it and will only point to some potentially relevant observations in the literature. Numerous studies show that this part of the basal ganglia plays a role in articulation (e.g., Brown et al., 2009; Price, 2010). Some authors also believe that the basal ganglia are part of the system underlying rule-based language processing (e.g., Pinker and Ullman, 2002), but this model is controversial (e.g., Longworth et al., 2005; Macoir et al., 2013). Our observations agree with this model, but could be explained without it. In our stimuli, we matched the length of infinitives, but the present tense forms of some IVs are shorter, so they might require less effort in terms of articulation.

Now let us turn to the results involving the ACC, which did not coincide in the two PPI studies. Stamatakis et al. (2005) found that ACC activity influenced fronto-temporal connectivity, and that the effect was significantly stronger for regulars than for irregulars.

They conclude that the ACC plays a monitoring role, "which, in the context of processing real regular inflected words, would reflect greater engagement of an integrated fronto-temporal language system. Morpho-phonological processes, such as the decomposition of regular inflected forms into stems and affixes, may place higher demands on this system, calling on additional resources" (p. 120). Since we did not observe similar results in our study, we hypothesize that this finding is due to the difference in morphological complexity between regular and IVs in English, which is absent in Russian. This hypothesis is very similar to Stamatakis et al.'s (2005) conclusions quoted above, but now we can dissociate morphological complexity from regularity (in the sense of defaultness and/or type frequency).

In our study, we observed covariance between the number of mistakes in the IV trials and functional connectivity changes between the LIFG and the right ACC in the IV vs. RV comparison (see **Figure 1C**). For participants who had low error rates, LIFG–ACC connectivity was greater during RV trials, while for participants who had high error rates, the opposite was true. We believe that we are dealing with two distinct effects here, and that the former is overshadowed by the latter as processing load increases. We do not have a definitive answer as to why LIFG–ACC connectivity may be greater for RVs. Both regular and irregular forms are morphologically complex in Russian and, if there is any rule-based processing system at all, both engage it (infinitival suffixes must be stripped and first person singular endings must be added). However, regular forms might engage this system more than irregular ones: it may also be activated for present tense stem formation. Further research is necessary to test this explanation, but, if it is correct, this would be an argument for the DR approach.

At the same time, LIFG–ACC connectivity increases in IV trials as the processing load they pose grows. This pattern of connectivity changes can be interpreted as a top–down general regulatory effect of the LIFG–ACC interaction, given the fact that the ACC is identified as an important part of the cognitive control network for the detection and resolution of processing conflicts (e.g., Carter and van Veen, 2007; Westerhausen et al., 2010). This effect completely overshadows the one described above when error rates are high. Let us try to formulate more precisely what might be going on. When an irregular form is produced successfully, the stem is simply taken from memory, which is the easiest option for the morphological processing system. But when somebody cannot find the right form and tries to derive it somehow, it is more taxing for the system than dealing with a regular form because the pattern is infrequent and unproductive. In this light, the absence of similar findings in Stamatakis et al.'s (2005) study is expected: different trial types did not differ significantly in terms of processing load in their experiment. This could be due to the fact that Stamatakis et al. (2005) examined comprehension, where one does not have to find or derive any forms. In general, passive comprehension might require more shallow processing than production, and low-status rules or morphological patterns associated with IVs in associative memory might get activated only in the latter case, but not in the former.

Let us briefly comment on the opposite results from Tyler et al. (2005) and Slioussar et al. (2014). The fronto-temporal language-related areas were activated more for regulars in the former and for irregulars in the latter study. We attributed our findings to processing difficulty (more details above in Section "Previous Studies Testing the SR and DR Approaches on Russian"), while Tyler et al. (2005) explained theirs by regularity. In the light of the Section "Discussion" above, we conclude that in both cases, the increased activity levels might reflect greater engagement of the morphological processing system (this does not contradict the conclusions made in these studies and only clarifies the picture). In English, regular forms rely on it more than irregular ones because the former are morphologically complex, while the latter are simplex and do not require any morphological processing at all. In Russian, all forms are morphologically complex, but when people cannot retrieve an irregular form or try to construct a form from a nonce verb, especially from an INV, the morphological processing system has to work harder.

Now, what do our conclusions mean for the DR and SR approaches? In the SR approach, only the frequency of a morphological pattern really matters. In this respect, regular stimuli had the same properties in both PPI studies, yet the results diverged. The canonical version of the DR approach postulates one default rule and argues that all other forms are stored in memory. Again, *prima facie* this does not predict any differences between regular stimuli in the two studies. One could go on to argue that Russian irregular stimuli must undergo morphological decomposition (at least to get rid of the infinitival affix), and some combination of morphological analysis and memory retrieval processes makes them more difficult than regular stimuli on a certain scale, while English irregular stimuli are the easiest on this scale because no morphological analysis is required at all. Potentially, hybrid models with several rules of different status such as the ones in Yang (2002) are better suited to account for the data. As we mentioned in Section "Previous Studies Testing the SR and DR Approaches on Russian," such models were proposed for Russian based on the results from behavioral experiments. In any case, it is clear that simplistic views must be discarded.

Further studies are needed to give more definitive answers to the questions above. In particular, the Russian verb system with its numerous classes has much more to offer than what we have used so far. In the present study, we compared verbs from the least frequent unproductive classes to verbs from the most frequent productive AJ class. However, Russian has other highly frequent productive classes. This might allow us to explore the nature of the effects we have observed so far in more detail: what (if anything) would be associated with the morphological default, with productivity, with type frequency, with the complexity of the morphological pattern (e.g., whether it involves stem and suffix alternations etc.)? This might eventually let us figure out what precisely stands behind the regularity effect. Then it will be clear whether it can be accounted for in terms of the DR or SR approach.

Now let us turn to the results for noun stimuli. The fact that the RN > IN comparison gave no significant results is not surprising, given that the main effect of (ir)regularity also did not reach significance for nouns in the factorial analysis in Slioussar et al. (2014). Most probably, this is because the irregular feature we had to select for our noun stimuli was rather minor – the Russian noun system is not very complex in this respect.

Finally, let us look at our data in the context of recent research arguing that fronto-temporal language brain regions are spatially and functionally distinct from the domain-general fronto-parietal multiple demand (MD) system (e.g., Duncan, 2010; Fedorenko et al., 2013; Fedorenko, 2014). In our study, an increase in connectivity between the LIFGop3 region located in one of the language-specific areas and the bilateral STG was driven by linguistic properties of the stimuli (regularity in the sense of defaultness, type frequency and/or productivity). At the same time, we observed how connectivity between this very same region and the right ACC, which is argued to be part of the domain-general cognitive control network, depends on the processing difficulty. As the discussion above shows, the source of this processing difficulty might also be language-specific, namely, it might be a morphological processing difficulty. However, it has an effect on response selection demands, so the cognitive control network must be invoked. In total, in contrast to recent functional connectivity studies arguing for the independence of language-related and domain-general cognitive control systems (Blank et al., 2014), our data demonstrate how these systems can be functionally integrated.

To summarize, the present PPI study allowed us to tease apart processing difficulty and regularity effects in the domain of inflectional morphology. We not only observed the processing difficulty effect we identified earlier in Slioussar et al. (2014), but were also able to find a novel effect of regularity, and gained a better understanding of these two effects by comparing our study to the only other published PPI study of inflectional morphology (Stamatakis et al., 2005). In Slioussar et al. (2014) some regularity-related differences in functional activity could be attenuated by the processing load effect, but the PPI analysis was sensitive enough to reveal such differences in functional connectivity. The present study makes us reconsider some findings from Stamatakis et al. (2005), Slioussar et al. (2014) and several other previous studies.

# **ACKNOWLEDGMENTS**

The study was partially supported by the grant "Integrative Physiology" from the Department of Physiological Studies of the Russian Academy of Sciences (to Maxim Kireev, Alexander D. Korotkov, and Svyatoslav V. Medvedev) and by the grant #0.38.518.2013 from St. Petersburg State University (to Natalia Slioussar and Tatiana V. Chernigovskaya). We are very grateful to the reviewers for their most valuable comments.

# **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnhum.2015.00036/abstract

# **REFERENCES**


*Acquisition*, eds A. Housen and M. Pierrard (Berlin: Mouton de Gruyter), 103–139.


*Material Russkogo Jazyka ('Mental Lexicon Structure. Development and Decay of the Verbal Inflectional Morphology: An Experimental Study of Russian')*. Thesis, St. Petersburg State University, St. Petersburg.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 September 2014; accepted: 14 January 2015; published online: 18 February 2015.*

*Citation: Kireev M, Slioussar N, Korotkov AD, Chernigovskaya TV and Medvedev SV (2015) Changes in functional connectivity within the fronto-temporal brain network induced by regular and irregular Russian verb production. Front. Hum. Neurosci. 9:36. doi: 10.3389/fnhum.2015.00036*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2015 Kireev, Slioussar, Korotkov, Chernigovskaya and Medvedev. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Take a stand on understanding: electrophysiological evidence for stem access in German complex verbs

# **Eva Smolka<sup>1</sup>\*, Matthias Gondan<sup>2</sup> and Frank Rösler <sup>3</sup>**

<sup>1</sup> Department of Linguistics, University of Konstanz, Konstanz, Germany

<sup>2</sup> Department of Psychology, University of Copenhagen, Copenhagen, Denmark

<sup>3</sup> Biological Psychology and Neuropsychology, University of Hamburg, Hamburg, Germany

### **Edited by:**

Minna Lehtonen, University of Helsinki, Finland

**Reviewed by:**

Markus J. Hofmann, University of Wuppertal, Germany Antje Lorenz, University of Münster, Germany

### **\*Correspondence:**

Eva Smolka, Department of Linguistics, University of Konstanz, Universitätsstrasse 10, 78457 Konstanz, Germany e-mail: eva.smolka@uni-konstanz.de

The lexical representation of complex words in Indo-European languages is generally assumed to depend on semantic compositionality. This study investigated whether semantically compositional and noncompositional derivations are accessed via their constituent units or as whole words. In an overt visual priming experiment (300 ms stimulus onset asynchrony, SOA), event-related potentials (ERPs) were recorded for verbs (e.g., ziehen, "pull") that were preceded by purely semantically related verbs (e.g., zerren, "drag"), by morphologically related and semantically compositional verbs (e.g., zuziehen, "pull together"), by morphologically related and semantically noncompositional verbs (e.g., erziehen, "educate"), by orthographically similar verbs (e.g., zielen, "aim"), or by unrelated verbs (e.g., tarnen, "mask"). Compared to the unrelated condition, which evoked an N400 effect with the largest amplitude at centro-parietal recording sites, the N400 was reduced in all other conditions. The rank order of N400 amplitudes turned out as follows: morphologically related and semantically compositional ≈ morphologically related and semantically noncompositional < purely semantically related < orthographically similar < unrelated. Surprisingly, morphologically related primes produced similar N400 modulations, irrespective of their semantic compositionality. The control conditions with orthographic similarity confirmed that these morphological effects were not the result of a simple form overlap between primes and targets. Our findings suggest that the lexical representation of German complex verbs refers to their base form, regardless of meaning compositionality. Theories of the lexical representation of German words need to incorporate this aspect of language processing in German.

**Keywords: event-related potentials, derivational morphology, morphological priming, semantic priming, form priming, stem access, complex verbs**

# **INTRODUCTION**

One intriguing question in psycholinguistic and neurolinguistic research is how morphologically complex words like *understand* are represented in lexical memory: via their base {stand} or via the whole form. Traditional means of studying the lexical representation of complex words have been (a) the use of overt priming conditions; and (b) the manipulation of semantic compositionality between morphologically related words. Overt priming conditions such as auditory or visual priming at long exposure durations (230–250 ms stimulus onset asynchrony (SOA) or longer) guarantee that the prime is consciously perceived. Under these conditions, semantic processing takes place and the meaning of the word can be retrieved. Overt priming can thus be used to tap into lexical memory.

The manipulation of semantic compositionality has been a further means to study lexical representations. For example, the meaning of the word *underdress* is semantically transparent (i.e., compositional), since it can be derived from the meaning of its morphemic constituents. The priming of a word like *dress* by *underdress* can thus be attributed to either morphological or semantic relatedness between the two words or both. By contrast, *stand* and *understand* are purely morphologically related, since the meaning of *understand* is semantically opaque and cannot be composed of the meaning of its parts. Any facilitation of the target *stand* by the word *understand* cannot be attributed to a meaning relation between the two words. Such facilitation would rather stress the morphological relatedness between the words, indicating that the two share some lexical representation in spite of their opaque meaning relation. In other words, the facilitation of *stand* by *understand* would indicate that *understand* is lexically represented via its base {stand}.

Behavioral findings in Indo-European languages such as English, French, and Dutch suggest that the lexical representation of complex words depends on semantic compositionality: Stems like *confess* were primed by semantically transparent derivations like *confessor*, but stems like *success* were not facilitated by morphologically related but semantically opaque derivations like *successor* (for cross-modal priming, see Marslen-Wilson et al., 1994; Longtin et al., 2003; for visual priming at long SOAs, see Feldman and Soltano, 1999; Rastle et al., 2000; Feldman et al., 2004). Similar to the stimuli used in the present study, also prefixed derivations like *distrust* primed their stems like *trust*, as well as other prefixed or suffixed derivations like *entrust* or *trustful*, though only if they were semantically transparent and not if they were semantically opaque (Marslen-Wilson et al., 1994; Feldman and Larabee, 2001; Meunier and Segui, 2002).

These findings were taken to indicate that overt priming triggers morphological decomposition as a high-level process, either following whole word access or constrained by semantic knowledge. Morphological models assume that semantically transparent words like *confessor* possess a lexical entry that corresponds to their base, such as {confess} and the suffix {or}, while semantically opaque words like *successor* must be represented in their full form (Rastle et al., 2000, 2004; Taft and Kougious, 2004; Diependaele et al., 2005; Meunier and Longtin, 2007; Marslen-Wilson et al., 2008; Taft and Nguyen-Hoan, 2010). By contrast, the distributed connectionist or *convergence of codes* view assumes that the above findings of morphological effects (whether or not priming occurs between morphologically related words) do not depend on explicit representations but rather on the degree of shared semantic and form similarity between the words (e.g., Plaut and Gonnerman, 2000; Gonnerman et al., 2007; Kielar and Joanisse, 2011). Indeed, under cross-modal priming conditions the priming effects of morphologically related (and hence form-related) prime-target pairs varied with the degree of their shared meaning (Gonnerman et al., 2007). Pairs with no or little shared meaning like *hardly-hard* showed no priming at all, those with moderate semantic similarity like *lately-late* showed medium effects, and those with high semantic similarity like *boldly-bold* showed the strongest priming effects. Most importantly, though, and regardless of whether morphological structure is assumed to be explicitly or implicitly represented, all of the above models assume that the semantic compositionality of a word determines its conscious processing and representation.

In contrast to overt priming, morphological priming without a semantic relation has been observed in English and French, but only when the participants were unaware of the prime. Under masked priming conditions (with visual prime presentations below 50 ms SOA), bases were primed not only by morphologically related and semantically transparent words (e.g., *successful*-*success*) but also by semantically opaque derivations (e.g., *successor*-*success*), by pseudoderivations (e.g., *corner*-*corn*), and by nonwords that comprise a stem and an affix (e.g., \**volter*-*volt*). The priming of the latter two types has been taken to indicate that any morpheme-like ending (e.g., *-er*) induces a segmentation process without differentiating between real morphological derivations (e.g., *successful* and *successor*), pseudoderivations (e.g., *corner*), or nonwords (e.g., \**volter*). Such a differentiation occurs only under lexical processing (e.g., Rastle et al., 2000, 2004; Longtin et al., 2003; Diependaele et al., 2005; Marslen-Wilson et al., 2008; McCormick et al., 2009).

Integrating these data on prelexical and lexical processing, recent models of morphological processing (e.g., Rastle et al., 2000, 2004; Longtin et al., 2003) assume that the word recognition process occurs in two stages: In the early prelexical stage, complex words are decomposed on a purely orthographic basis, independent of true morphological or semantic compositionality. In the second, lexical stage, the morphemic (or morphemelike) constituents are reappraised for semantic and syntactic information. As soon as semantic integration occurs, only semantic compositionality (but not the purely orthographically based segmentation process) affects the word recognition process.

However, since behavioral data (e.g., response times and errors) represent the endpoints of processing and decision making, the above theories of morpho-lexical processing leave some open questions concerning the processing and representation of morphologically complex words, in particular with respect to the time course of processing: Is there a sequential ordering of the different processing types? For example, does form processing occur prior to or simultaneously with morphological processing? Does semantic processing indeed occur alongside or after morphological processing? In this respect, electrophysiological evidence provides a useful means to answer these questions.

# **ELECTROPHYSIOLOGICAL EVIDENCE FOR MORPHOLOGICAL EFFECTS**

Event-related potentials (ERPs) derived from the electroencephalogram before, during, or after an event of interest have the advantage of revealing differences between the processes that intervene between input and output more directly than behavioral indices such as reaction time and error rate. The latter are necessarily omnibus measures in which stimulus- and response-related effects are integrated. In contrast, an ERP effect such as the N400 effect is primarily related to differences in processes that mediate between input and output and that are largely independent of stimulus-bound processes (e.g., perceptual discriminability) or response-bound processes (e.g., response frequency). Evidence accumulated over the last 30 years has revealed that the N400 effect is a sensitive measure of the effort required to process a word, and its amplitude may be interpreted as reflecting the ease of memory access. The easier such access is, due to contextual, morphological, or semantic priming, the smaller the N400 effect (Kutas and Federmeier, 2011).

With respect to morphological processing, ERPs may reveal whether morphological priming effects resemble form processing, as reflected in early negativities, or whether they follow semantic memory access, as reflected in N400 effects.

Similar to the behavioral studies, most ERP studies on morphological processing have applied repetition priming under masked or unmasked stimulus presentation. In the studies considered here, priming is concluded if the negative-going ERP amplitude in the latency range of 250 ms (N250) or 400 ms (N400) is attenuated relative to an unrelated baseline condition (in which prime and target are neither semantically nor form-related), that is, relative to the most pronounced negativity (for a review, see Kutas and Federmeier, 2011).<sup>1</sup> For a summary of morphological ERP effects induced by violation paradigms as compared to repetition priming, see Smolka et al. (2013). **Table 1** of the present study summarizes the ERP findings of priming effects generated by real morphological derivations as compared to pseudoderivations or stem homographs, and as compared to the effects of pure form or meaning relatedness.
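As a rough illustration of the priming criterion described above, the following Python sketch computes the mean amplitude of two toy waveforms in a fixed latency window and takes the related-minus-unrelated difference. The waveform values, the sampling grid, the 300–500 ms window, and the function names `mean_amplitude` and `priming_effect` are assumptions made for illustration; they are not taken from any of the studies reviewed here.

```python
def mean_amplitude(samples_uv, times_ms, window_ms):
    """Mean voltage (in microvolts) of all samples whose time lies inside window_ms."""
    start, end = window_ms
    inside = [v for v, t in zip(samples_uv, times_ms) if start <= t <= end]
    if not inside:
        raise ValueError("no samples inside the window")
    return sum(inside) / len(inside)


def priming_effect(related_uv, unrelated_uv, times_ms, window_ms):
    """Related minus unrelated mean amplitude in the window.

    A positive value means the related condition is less negative than the
    unrelated baseline, i.e., the negativity is attenuated.
    """
    return (mean_amplitude(related_uv, times_ms, window_ms)
            - mean_amplitude(unrelated_uv, times_ms, window_ms))


# Hypothetical toy waveforms sampled every 100 ms (invented values, in microvolts).
times_ms = [0, 100, 200, 300, 400, 500, 600]
unrelated = [0.0, 1.0, -1.0, -3.0, -5.0, -3.0, -1.0]
related = [0.0, 1.0, -1.0, -2.0, -3.0, -2.0, -1.0]

N400_WINDOW = (300, 500)  # assumed analysis window, not taken from the text
effect = priming_effect(related, unrelated, times_ms, N400_WINDOW)
print(f"N400 attenuation (related - unrelated): {effect:.2f} microvolts")
```

A positive difference indicates that the related condition is less negative than the unrelated baseline, which corresponds to an attenuated N400 in the sense used here.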

ERP studies using the masked visual priming paradigm with short prime presentations (below 50 ms) observed that real morphologically related (semantically transparent or identical) word pairs like *hunter-hunt* or *table-table* induced either an N250 attenuation alone (cf. Morris et al., 2008) or both N250 and N400 attenuations relative to the unrelated condition (cf. Holcomb and Grainger, 2006; Lavric et al., 2007; Morris et al., 2007, 2008, 2011, 2013). The variation of effects was more diverse regarding pseudocomplex word or nonword pairs of the *corner-corn* or \**cornity-corn* type or regarding form-related word pairs of the *scandal-scan* or \**teble-table* type, ranging from no effect in either condition (Morris et al., 2007), to N250 attenuations in both conditions (cf. Morris et al., 2008), to N250 alongside N400 attenuations in form-related pairs (cf. Holcomb and Grainger, 2006) or in both the pseudocomplex and form-related pairs (cf. Lavric et al., 2007; Morris et al., 2008, 2011, 2013).

Different results were obtained when the priming for morphologically complex words was compared with that of pseudocomplex or form-related words. While Morris et al. (2007) observed significantly more priming from morphologically related words than from either pseudocomplex or form-related words in both the N250 and N400 latency range, other studies by Morris et al. (2008, 2011, 2013) found no priming differences between these three types of complexity. Yet other studies revealed differential processing patterns during the early (N250) and later (N400) negativity. The similar N250 deflections by real morphologically and pseudomorphologically related word pairs were taken as evidence that all words undergo the same segmentation process in early visual word recognition. Similar N400 effects of pseudocomplex words and real complex words (Lavric et al., 2007; Morris et al., 2011) were interpreted to indicate a single mechanism with two stages (orthography-based morphological decomposition followed by semantic interpretation; see e.g., Meunier and Longtin, 2007; Lavric et al., 2011). By contrast, similar N400 effects of pseudocomplex and form-related words (Morris et al., 2008, 2011) were interpreted as evidence for a dual-route model that comprises two mechanisms of decomposition (one orthography-based plus one semantically based; see e.g., Diependaele et al., 2005; Holcomb and Grainger, 2006; Morris et al., 2013).

Since the present study focuses on lexical representations (and not on early visual word recognition), we are indifferent with respect to models on early visual word recognition. Most importantly, all models so far assume different processing outcomes for semantically transparent and opaque words at the lexical level, when semantic information is integrated (in the two-stage model, e.g., Lavric et al., 2011), or when shared representations operate at the morpho-semantic level (in the dual-route model, e.g., Morris et al., 2013), or when form and meaning codes overlap (in the connectionist model, e.g., Plaut and Gonnerman, 2000). We will now turn to review the ERP studies that examined lexical representation and processing.

Under overt priming conditions with either auditory or visual prime presentations at long SOAs, all the studies reviewed here observed N400 attenuations relative to the unrelated baseline for morphologically related and semantically transparent or inflected word pairs like *hunter-hunt* or *loca-loco* ("crazy woman"-"crazy man"), respectively (cf. Barber et al., 2002; Domínguez et al., 2004; Kielar and Joanisse, 2011; Lavric et al., 2011), as well as an additional early positivity for inflected word pairs (Domínguez et al., 2004). The picture was, again, more diverse for pseudocomplex word pairs of the *corner-corn* type or stem homographs of the *rata-rato* ("rat"-"time") type, ranging from no effect at all for pseudocomplex words (Kielar and Joanisse, 2011), to an early positivity for stem homographs (Domínguez et al., 2004), to N400 attenuations for pseudocomplex words or stem homographs (cf. Barber et al., 2002; Domínguez et al., 2004; Lavric et al., 2011), and an additional modulation of a late negativity for stem homographs (cf. Barber et al., 2002; Domínguez et al., 2004). In contrast to pseudocomplex words, purely form-related words did not reveal substantial effects relative to the unrelated condition (cf. Domínguez et al., 2004; Kielar and Joanisse, 2011), though an N400 attenuation was found as well (cf. Lavric et al., 2011).

The main interest of the above studies was to investigate the processing of different levels of word complexity. For example, Lavric et al. (2011) found that the N400 effect was largest when it was induced by morphologically related word pairs like *hunter-hunt*, smaller by pseudocomplex words like *corner-corn*, and smallest by purely form-related words like *brothel-broth*. They interpreted the differences in deflections in favor of a two-stage model of visual word recognition, with orthography-based morphological decomposition in the first stage, and validation by semantic information at a later stage.

By contrast, Kielar and Joanisse (2011) found evidence in favor of their convergence of codes view: They manipulated the semantic transparency between real morphological derivations by constructing a fully transparent condition of the *government-govern* type, a semi-transparent condition of the *dresser-dress* type, and a semantically opaque condition that comprised about two thirds real morphological derivations of the *apartment-apart* type and one third pseudomorphological derivations of the *corner-corn* type. They found similar N400 priming effects for semantically transparent and semi-transparent and no effect at

<sup>1</sup>The N400 effect was originally observed in sentences with semantically nonfitting words. Over the years researchers agreed that the N400 amplitude is also related to additional search processes in long-term memory when the system attempts to find a coherent interpretation for a word, a sentence, a numerical equation, etc. Thus the expression *N400 effect* or *N400 deflection* is usually used to address an increase of a negative-going amplitude in the latency range of 400 ms when stimuli induce semantic activity in memory (see e.g., Kutas and Federmeier, 2011; Rösler, 2011). A reduction of the N400-effect is therefore accepted as a sign of less activity in memory, as it is, for example, the case when a stimulus has been primed.




**Frontiers in Human Neuroscience www.frontiersin.org** February 2015 | Volume 9 | Article 62 | **134**


**Table 1 (continued), note (recovered fragments):** Priming is concluded if the negative-going ERP amplitude in the latency range of 250 ms (N250) and 400 ms (N400) is attenuated relative to an unrelated baseline condition, in which the largest negative-going amplitude is evoked. [...] interpreted as N250, if they occur within a negativity in the 200–300 ms post-target window, otherwise we refer to them as "early positivity".

all for semantically opaque pairs. In line with the distributed-connectionist or convergence of codes view, "morphological effects were graded in nature and modulated by phonological and semantic factors" (Kielar and Joanisse, 2011, p. 170). Since neither pure form similarity like *panel-pan* nor semantic associations like *sofa-couch* produced any significant effects, the authors concluded that the morphological effects could not be explained by pure form or meaning relatedness alone.

## **LEXICAL REPRESENTATION IN GERMAN**

To summarize, previous studies on lexical representation (using auditory or visual prime presentation at long SOAs) in English or French observed that semantic transparency plays a key role in lexical representation. These findings strongly contrast with our behavioral findings in German (Smolka et al., 2009, 2014): Under overt priming with either auditory or visual prime presentation (at long SOAs) complex verbs primed their base to the same extent regardless of whether they were semantically transparent (e.g., *aufstehen-stehen*, "stand up"-"stand") or semantically opaque (e.g., *verstehen-stehen*, "understand"-"stand"). Unlike the English and French findings, these findings suggest that lexical representation in German is independent of semantic compositionality: A complex verb like *understand* is not only segmented into {under} and {stand} during early visual (or auditory) word recognition but is also lexically represented via its base {stand}.

Given that there are hardly any studies of this issue in German, we seek to investigate it more fully by means of ERPs. Behavioral responses reflect the endpoint of multiple stages of the word recognition process as well as response preparation. ERPs provide the possibility to tap online into the different processes of morphological, semantic, and form-relatedness—all processes that are hard to detect by means of purely behavioral priming techniques.

The present study used German complex verbs to examine whether morphologically complex words are lexically represented via their base or via their whole form. German complex verbs provide the opportunity to manipulate the semantic transparency and opacity relating to the same base verb. For example, the complex verbs *ankommen* ("arrive"), *mitkommen* ("come along"), *zurückkommen* ("come back"), *nachkommen* ("follow"), *entkommen* ("escape"), *abkommen* ("digress"), *bekommen* ("get"), *verkommen* ("degenerate"), and *umkommen* ("perish") generate a wide range of meaning variation with respect to the same base verb *kommen* ("come").

The linguistic literature (e.g., Fleischer and Barz, 1992; Olsen, 1996) distinguishes between prefix and particle verbs. Both types comprise a simple verb and a prefix or a particle. Prefix and particle verbs differ in some syntactic and prosodic characteristics. However, these differences do not surface under the stimulus presentation of the present study (i.e., in citation form under visual presentation). Further, previous studies (using phoneme monitoring or priming) found similar effects by prefix and particle verbs in German (Drews et al., unpublished) and Dutch (Schriefers et al., 1991). Hence, the two types are not further differentiated here and subsumed under the general term "complex verbs".

Most importantly, prefix and particle verbs may be both transparently and opaquely related to the meaning of a base verb. For example, the prefix *ver-* occurs in the transparent prefix verb *verbleiben* ("remain") of the base verb *bleiben* ("stay"), and in the opaque derivation *verschwimmen* ("blur") of the base *schwimmen* ("swim"). Similarly, the particle *auf* ("up") may occur in a transparent derivation like *aufheben* ("lift") of the base verb *heben* ("lift") as well as in an opaque derivation like *aufhören* ("stop") of the verb *hören* ("hear").

Importantly, and different from previous studies in English (e.g., Rastle et al., 2000, 2004; Morris et al., 2007), all complex verbs used in this study share a morphological relationship established on etymological grounds with their base, even if they are not semantically compositional.

# **THE PRESENT STUDY**

The present study is closely modeled on the behavioral study of Smolka et al. (2009). In that study, response latencies and accuracies were measured under overt priming conditions with visual prime presentations at long SOAs (300 ms) to ensure that the experimental conditions were sensitive to semantic processing and tapped into lexical processing. Morphologically related primes were complex verb derivations that were either transparently (e.g., *mitkommen*, "come along") or opaquely (e.g., *umkommen*, "perish") related to their base verb (e.g., *kommen*, "come") that served as target. Contrary to the view that semantic meaning presides over conscious word processing, semantic transparency did not modulate the magnitude of morphological priming. In the first experiment, semantically transparent (*mitkommen*, "come along"), opaque (*umkommen*, "perish"), and identity (*kommen*, "come") primes facilitated the recognition of base verbs (*kommen*, "come") to the same extent. By contrast, purely semantically associated verbs (*nahen*, "approach") did not prime.

The second experiment examined the influence of form overlap on morphological processing and replaced the identity primes with form-related primes (*kämmen*, "comb"). Form relatedness hindered target recognition (*kommen*, "come"), whereas morphological relatedness facilitated target recognition, again regardless of semantic transparency. In that experiment, semantic associates (*nahen*, "approach") induced significant priming, though weaker in magnitude than that by morphologically related primes. In the third experiment, this time under prime-exposure durations of 1000 ms, priming from semantic associates was as strong as that by morphologically related primes; and accuracy (but not latency) data showed a small semantic transparency effect in favor of semantically transparent over opaque derivations.

To summarize, the three experiments demonstrated strong morphological priming that was (a) equivalent for semantically transparent and opaque complex verbs; (b) stronger than semantic priming (at SOA 300); and (c) different from form inhibition. These results were in line with previous behavioral experiments in German that observed equivalent priming from semantically transparent and opaque primes under overt priming conditions (e.g., Drews et al., unpublished; Schirmeier et al., 2004).

In the present study, we modeled our design on the second experiment of Smolka et al. (2009) and measured ERPs under overt visual priming conditions at 300 ms SOA. Priming was measured against the control condition with unrelated verb pairs like *tarnen-ziehen* ("mask"-"pull"). This condition was expected to yield the most negative potentials. As described above, we concluded priming if the ERP amplitude of a related condition was attenuated (in the latency range around 250 ms or around 400 ms) relative to this baseline condition, which shows neither form nor semantic relatedness between primes and targets. A condition with semantic associations between verb pairs like *zerren-ziehen* ("drag"-"pull") was used to measure the pure meaning relatedness between verbs. We hypothesized that this condition would induce an N400 attenuation if semantic associations between verbs are strong enough to trigger automatic spreading activation within a semantic network. Looking at previous results (see **Table 1**), it is possible, though, that semantic associations do not induce significant effects (cf. Kielar and Joanisse, 2011), while synonyms do (cf. Domínguez et al., 2004).

We induced morphological priming by using real morphological derivations that were either semantically transparent or opaque with respect to their base, such as *zuziehen-ziehen* ("pull together"-"pull") and *erziehen-ziehen* ("educate"-"pull"), respectively. As in all previous ERP studies using overt priming (Barber et al., 2002; Domínguez et al., 2004; Kielar and Joanisse, 2011; Lavric et al., 2011), we expected a strong N400 attenuation for semantically transparent derivations.

With respect to the semantically opaque condition, this is the first ERP study that used semantically opaque words that were real morphological derivations of their base. There is only one comparable study that used (partly) real morphological but semantically opaque derivations, and it did not find any priming effect in this condition (Kielar and Joanisse, 2011). According to the two-stage model or dual-route view that morphological structure and processing depend on semantic compositionality (i.e., the meaning relation between prime and target), we should not find any effect for semantically opaque words. If, however, German complex words are accessed and represented via their stem regardless of meaning compositionality, we will observe a priming effect in the form of an N400 attenuation in this condition, too.

Similarly, the connectionist or convergence of codes view assumes that semantic similarity plays a role in that "morphological effects vary continuously as a function of the degree of semantic and form similarity among words" (Kielar and Joanisse, 2011, p. 162). In the present study, the morphologically related primes all contain the complete target and thus share the same form overlap between prime and target. Hence, the only difference between morphologically related primes in the semantically transparent and opaque conditions is that the former have a strong meaning similarity with the target, while the latter show no or only little meaning similarity with the target. Semantically opaque words should thus induce either no priming at all (as was the case in the behavioral study by Gonnerman et al., 2007, and in the ERP study by Kielar and Joanisse, 2011) or, if semantically opaque words do induce priming, its magnitude should be significantly weaker than that by transparent words. In ERP terms, if the morphological effects were a combination of form and meaning overlap, we should find stronger effects, that is, more positive-going N400 amplitudes, for semantically transparent than for semantically opaque primes. If, however, our view holds that all German complex words access and activate their base, and if our previous behavioral findings (Smolka et al., 2009, 2014) generalize to electrophysiological data, we will find equivalent priming effects by semantically transparent and opaque derivations. Additionally, if our hypothesis holds that morphological regularities generalize beyond meaning relatedness, we should expect stronger morphological than semantic effects. In ERP terms, the N400 amplitudes will be more positive-going for the two morphological conditions than for the semantic condition.

Morphologically related pairs are always form-related as well. Hence, orthographically similar verbs like *zielen-ziehen* ("aim"- "pull") were used to measure the effects of form similarity between verbs. Previous overt priming studies so far revealed either a very small form effect (cf. Lavric et al., 2011) or none at all (Domínguez et al., 2004, 2006; Kielar and Joanisse, 2011). For this reason, we were indifferent as to whether we should expect a significant form effect relative to the baseline condition. Importantly, though, if our hypothesis holds that morphological structure in German generalizes beyond form, the amplitude in the form condition will be more similar to the unrelated condition and hence more negative than that of the morphologically related but semantically opaque condition (both representing form without meaning relatedness).

# **METHODS**

# **PARTICIPANTS**

Seventeen students of the Philipps-University, Marburg, took part in the experiment for course credit or payment. All participants were monolingual speakers of German, not dyslexic and had normal or corrected-to-normal vision. All participants were right-handed and gave their written informed consent.

# **MATERIALS**

# **Critical stimuli**

Thirty-six different base verbs were selected as critical targets; each target was combined with five primes. **Table 2** summarizes the stimulus characteristics of primes and targets (for the whole stimulus set, see Smolka et al., 2009). For illustration, we consider the example *ziehen* ("pull"). Three factors defined the prime-target relations: morphological, semantic, and form relatedness with the base verb. All morphological derivations held a prefix or particle and were, by definition, form-related to the target: (a) T, semantically transparent derivations of the base like *zuziehen* ("pull together"); (b) O, semantically opaque derivations of the base like *erziehen* ("educate"). All other primes comprised simple verbs that were morphologically unrelated to the target: (c) S, purely semantically related verbs like *zerren* ("drag"); (d) F, form-related verbs like *zielen* ("aim") where the onset or first syllable matched that of the target and where the rime differed from the target by a single grapheme (1 or 2 letters); all but two form-related primes were verbs; (e) U, unrelated verbs like *tarnen* ("mask") that were neither semantically nor form-related to the target.


**Table 2 | Stimulus characteristics of primes that were semantically related (S), morphologically related and semantically transparent (T), morphologically related and semantically opaque (O), form-related (F), or unrelated (U) to targets**.

Note. Mean lemma and word form frequencies, mean number of letters and mean rating scores (on a 7-point scale from 1 to 7); standard deviations in parentheses. All frequencies are from the CELEX database (Baayen et al., 1993), count is per million.

Semantically associated prime-target pairs had a (position-specific) mean letter overlap of 20% (SD = 17); form-related primes had a letter overlap of 70% (SD = 23) with the targets, and unrelated prime-target pairs shared 10% (SD = 13) of the letters. The primes of the two morphologically related conditions, by definition, contained the whole target words.
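The text does not spell out how the position-specific letter overlap was computed. One plausible reading, given here only as a sketch, counts the positions (aligned from the left) at which prime and target carry the same letter and divides by the length of the longer word. Whether the shared infinitive ending *-en* entered the count is likewise not stated, so the hypothetical function `position_overlap` below takes an optional `ignore_suffix` argument to show both variants; the resulting values are outputs of this sketch, not figures from the study.

```python
def strip_suffix(word, suffix):
    """Return word without the given suffix, if the word ends with it."""
    if suffix and word.endswith(suffix):
        return word[:-len(suffix)]
    return word


def position_overlap(prime, target, ignore_suffix=""):
    """Percentage of left-aligned positions at which both words share a letter.

    Assumed definition: matches divided by the length of the longer word.
    """
    prime = strip_suffix(prime, ignore_suffix)
    target = strip_suffix(target, ignore_suffix)
    matches = sum(p == t for p, t in zip(prime, target))
    return 100.0 * matches / max(len(prime), len(target))


# Example items taken from the text: form-related, semantic, and unrelated primes.
for prime in ["zielen", "zerren", "tarnen"]:
    full = position_overlap(prime, "ziehen")
    stem = position_overlap(prime, "ziehen", ignore_suffix="en")
    print(f"{prime}-ziehen: {full:.0f}% (whole word), {stem:.0f}% (without -en)")
```

Counting on whole infinitives inflates the overlap of every pair because all items end in *-en*, which is why the two variants are shown side by side.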

The meaning relatedness between primes and targets was tested in a previously conducted association test (for details, see Smolka et al., 2009). The five prime conditions of the same target were distributed across five lists according to a Latin square design. Fifty native speakers of German (who did not participate in the ERP experiment) rated on a 7-point scale from *completely unrelated* (1) to *highly related* (7) whether two verbs like *erziehen-ziehen* are meaning related.
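The Latin square distribution mentioned above can be sketched as a simple rotation: each list contains every target exactly once, and across the five lists each target appears in all five prime conditions. The rotation rule, the function name `latin_square_lists`, and the placeholder target names are assumptions for illustration; the actual assignment used in the rating study is not described in the text.

```python
CONDITIONS = ["S", "T", "O", "F", "U"]


def latin_square_lists(targets, conditions):
    """Rotate conditions over targets so that list k pairs target i
    with condition (i + k) mod n, where n is the number of conditions."""
    n = len(conditions)
    lists = [[] for _ in range(n)]
    for i, target in enumerate(targets):
        for k in range(n):
            lists[k].append((target, conditions[(i + k) % n]))
    return lists


# Hypothetical placeholder names standing in for the 36 base verbs.
targets = [f"verb{i:02d}" for i in range(1, 37)]
lists = latin_square_lists(targets, CONDITIONS)

for k, lst in enumerate(lists, start=1):
    counts = {c: sum(1 for _, cond in lst if cond == c) for c in CONDITIONS}
    print(f"list {k}: {len(lst)} pairs, per condition {counts}")
```

Because 36 is not divisible by 5, the conditions cannot be perfectly balanced within a list under this rotation; each condition occurs seven or eight times per list.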

The verb pairs in the S and T conditions were rated as being highly semantically related with mean ratings of 5.9 (SD = 0.63) and 5.1 (SD = 0.68), respectively. By contrast, verb pairs in the O, F, and U conditions were rated low in meaning relatedness with mean ratings of 2.8 (SD = 0.66), 1.8 (SD = 0.65), and 1.4 (SD = 0.35), respectively. A one-way ANOVA was performed on mean rating scores with items (*F*2) as random variables. The repeated measures factor Prime Type (S/T/O/F/U) was highly significant, *F*2(4, 139) = 414.03, *p* < 0.0001. Scheffé *post hoc* comparisons confirmed that the mean rating scores of the S, T, and O conditions significantly differed from each other as well as from the F and U conditions, while the latter two did not significantly differ. Most importantly, with respect to the morphological conditions, semantically transparent primes like *zuziehen* ("pull together") were rated as significantly more related in meaning to their target base *ziehen* ("pull") than semantically opaque primes like *erziehen* ("educate"), with mean ratings of 5.1 and 2.8, respectively.

### **Filler prime-target pairs**

Five hundred and forty filler pairs were added to the 180 critical pairs. All filler pairs were semantically unrelated and differed from the items of the critical set. All filler pairs had words as primes, which followed the same morphological composition as the experimental set: 216 primes were complex verbs and 324 were simple verbs.

Similar to the critical set, 180 filler pairs had pseudoverbs as targets that were form-related to the 36 base verbs of the critical set. All pseudoverbs followed the phonotactic constraints of the German language. For example, the pseudoverbs \**stehmen*, \**stehnen*, \**steben*, \**steken*, and \**stedern* were created to be form-related to the verb *stehlen*. The rest of the filler pairs had 180 verb targets and 180 pseudoverb targets.

### **Summary of the stimulus material**

To summarize, the whole material set comprised 720 prime-target pairs. Half of them had verbs, the other half had pseudoverbs as targets. Primes were always existing verbs: 288 (40%) were complex verbs and 432 (60%) were simple verbs. All primes and targets were presented in the infinitive (stem + -*en*), which is also the citation form in German.

The large number of fillers should diminish both facilitatory and inhibitory effects (Napps and Fowler, 1987) and prevent both expectancy and failed expectancy effects. Overall, the proportion of (a) critical prime-target pairs was reduced to 25%; (b) that of semantically related pairs to 10%; and (c) that of form-related prime-target pairs (both words and nonwords) to 15% of the entire material.
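The counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original materials: the variable names are illustrative, and two groupings are inferred rather than stated in the text, namely that the T and O primes are the complex critical primes and that "semantically related pairs" refers to the S and T conditions.

```python
# Counts taken from the text; groupings marked "assumed" are inferences of this sketch.
TARGETS = 36
CONDITIONS = ["S", "T", "O", "F", "U"]

critical_pairs = TARGETS * len(CONDITIONS)      # 36 targets x 5 prime conditions
filler_pairs = 540
total_pairs = critical_pairs + filler_pairs

# Primes: filler counts are reported; critical T and O primes are assumed complex,
# critical S, F and U primes are assumed simple.
complex_primes = 216 + TARGETS * 2
simple_primes = 324 + TARGETS * 3

# Targets: all critical targets are verbs; fillers add 180 verbs and 2 x 180 pseudoverbs.
verb_targets = critical_pairs + 180
pseudoverb_targets = 180 + 180


def share(count, total=total_pairs):
    """Return count as a percentage of the whole material set."""
    return 100 * count / total


proportions = {
    "critical": share(critical_pairs),
    "semantically_related": share(TARGETS * 2),  # assumed: S and T pairs
    "complex_primes": share(complex_primes),
    "simple_primes": share(simple_primes),
    "pseudoverb_targets": share(pseudoverb_targets),
}
print(proportions)
```

Under these assumptions the totals reproduce the reported 720 pairs, the 288/432 split of complex and simple primes, and the 25% and 10% proportions. The 15% figure for form-related pairs is not recomputed here because the text does not spell out which pairs it counts.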

### **APPARATUS**

Stimuli were presented on a 17-inch monitor, connected to an IBM-compatible personal computer. Stimulus presentation and data collection were controlled by the *Presentation* software developed by Neurobehavioral Systems.<sup>2</sup> Responses were recorded from the left and right "control" keys on a standard keyboard.

### **DESIGN**

Each participant saw all 36 simple verbs in all five priming conditions. Primes of the same target were rotated over ten blocks according to a Latin Square design, in such a way that the same target appeared in every second block. Likewise, the prime-target pairs of similar pseudoverb targets were distributed across the ten blocks. The remaining filler pairs were evenly allocated to the blocks, so that each block comprised equal numbers of complex and simple primes as well as verb and pseudoverb targets.

<sup>2</sup>http://www.neurobs.com/

In total, an experimental session comprised 720 prime-target pairs, presented in ten experimental blocks, with 72 prime-target pairs per block. Within blocks, prime-target pairs were randomized separately for each participant with the constraint that there were at most four adjacent word or nonword targets. There were 20 practice trials.
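One way to picture the rotation of critical pairs over blocks is the sketch below. It is an illustration under stated assumptions, not the allocation actually used: it assumes that targets alternate between even-numbered and odd-numbered blocks (so that each target recurs in every second block) and that the starting condition is shifted from target to target in Latin-square fashion. The function name `allocate` and the zero-based block indices are hypothetical.

```python
CONDITIONS = ["S", "T", "O", "F", "U"]
N_TARGETS = 36
N_BLOCKS = 10  # must equal 2 * len(CONDITIONS) in this sketch


def allocate(n_targets=N_TARGETS, n_blocks=N_BLOCKS, conditions=CONDITIONS):
    """Return {block: [(target, condition), ...]} for the critical pairs."""
    blocks = {b: [] for b in range(n_blocks)}
    for target in range(n_targets):
        # Assumed: even-numbered targets use blocks 0, 2, 4, ...; odd ones 1, 3, 5, ...
        parity = target % 2
        for k in range(len(conditions)):
            block = parity + 2 * k
            # Latin-square rotation: the starting condition shifts with the target.
            condition = conditions[(target // 2 + k) % len(conditions)]
            blocks[block].append((target, condition))
    return blocks


blocks = allocate()
print({b: len(pairs) for b, pairs in blocks.items()})
```

With these settings each block holds 18 critical pairs, and every target is seen once in each of the five prime conditions, always two blocks apart. The filler pairs and the within-block randomization are left out of the sketch.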

### **PROCEDURE**

Participants were tested individually in a dimly lit room and were seated at a viewing distance of about 60 cm from the screen. Each trial started with a fixation cross in the center of the screen for 1000 ms. Primes and targets were presented in the center of the screen, in white Sans Serif letters on a black background. Primes were presented in uppercase letters, point 22, targets in lowercase letters, point 26. The prime appeared for 200 ms, followed by a blank screen for 100 ms (SOA = 300 ms), after which the target appeared for 500 ms. A prompt (*"?"*) appeared 1000 ms after target-onset on the screen. The inter-trial interval was constant at 2000 ms.
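The timing of a single trial can be laid out as a short calculation. This is a minimal sketch using the durations given above; it assumes that the prime follows the fixation cross without a gap, which the text does not state explicitly, and the names are illustrative.

```python
FIXATION_MS = 1000
PRIME_MS = 200
BLANK_MS = 100
TARGET_MS = 500
PROMPT_AFTER_TARGET_ONSET_MS = 1000


def trial_timeline():
    """Return event onsets and offsets in ms relative to trial start.

    Assumes the prime appears immediately after the fixation cross.
    """
    prime_on = FIXATION_MS
    target_on = prime_on + PRIME_MS + BLANK_MS
    return {
        "fixation_on": 0,
        "prime_on": prime_on,
        "prime_off": prime_on + PRIME_MS,
        "target_on": target_on,
        "target_off": target_on + TARGET_MS,
        "prompt_on": target_on + PROMPT_AFTER_TARGET_ONSET_MS,
    }


timeline = trial_timeline()
# Stimulus onset asynchrony: prime onset to target onset.
soa = timeline["target_on"] - timeline["prime_on"]
print(timeline, soa)
```

The computed SOA equals the prime duration plus the blank interval, which matches the 300 ms reported above.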

Participants were asked to refrain from blinking and not to respond until after the prompt. Participants made lexical decisions to the targets, in other words, they responded whether the stimuli were existing words or not, and were instructed to respond as accurately as possible.<sup>3</sup> "Word" responses were made by pressing the right "control" keyboard key with the index finger of the right hand, "nonword" responses were made with the left hand on the left "control" key. During practice trials, participants received feedback on the accuracy of each response; during the experimental session, feedback was given only on incorrect responses.

The experiment lasted for about 1 h. Participants self-administered the breaks between the ten blocks, and took at least two longer breaks.

### **EEG RECORDING**

The EEG was recorded from 61 scalp electrodes using a cap in which Ag/AgCl inserts are fixated by individual electrode supports (System Falk Minow, Munich, Germany). All scalp electrodes were randomly referenced to the left or right earlobe during the recording and re-referenced offline to averaged earlobes; the left or right mastoid served as ground. Horizontal and vertical eye movements were monitored with appropriate electrode pairs. Impedances of all electrodes were kept below 5 kΩ. Two 32-channel amplifiers (SYNAMPS, NeuroScan) were used for EEG recording. Band pass was set from DC to 40 Hz and the sampling rate was 500 Hz. Prior to the beginning of each experimental block, a DC reset was manually initiated. DC drift was corrected according to the method suggested by Hennighausen et al. (1993).

**Figure 1.** Pooled electrodes corresponding to the 19 electrodes of the 10–20 system were used in the analyses of the EEG data. Each of the pooled electrodes comprised three adjacent electrodes, as follows: Fpz (Fp1, Fpz, Fp2), AFz (AF3, AFz, AF4), F5 (F3, F5, F7), Fz (F1, Fz, F2), F6 (F4, F6, F8), FC5 (FC3, FC5, FT7), FCz (FC1, FCz, FC2), FC6 (FC4, FC6, FT8), C5 (C3, C5, T7), Cz (C1, Cz, C2), C6 (C4, C6, T8), CP5 (CP3, CP5, TP7), CPz (CP1, CPz, CP2), CP6 (CP4, CP6, TP8), P5 (P3, P5, P7), Pz (P1, Pz, P2), P6 (P4, P6, P8), POz (PO3, POz, PO4), Oz (O1, Oz, O2).

Eye blinks and trials with other artifacts were removed by applying a threshold criterion (max. voltage step per sampling point >50 µV or absolute difference in a trial segment >100 µV). ERPs were extracted from the edited set of raw data by averaging single trials separately for subjects, electrodes, and experimental conditions. Post-stimulus epochs were baseline-adjusted to the average amplitude of a 100 ms epoch preceding the onset of the target word. Only segments with correct responses entered the analysis. We created a subset of electrodes resembling the 19 standard electrodes of the 10–20-system. Three adjacent electrodes were pooled for each of these "standard" electrodes (see **Figure 1**). The pooled 19 electrodes entered the statistical analysis.
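The rejection criterion, baseline adjustment, and electrode pooling can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: "absolute difference in a trial segment" is read here as the peak-to-peak range of the segment, pooling is taken to be a plain average of the three traces, and the function names are hypothetical.

```python
SAMPLING_RATE_HZ = 500
BASELINE_MS = 100
MAX_STEP_UV = 50.0    # maximum allowed voltage step between adjacent samples
MAX_RANGE_UV = 100.0  # maximum allowed range within a segment (assumed peak-to-peak)


def is_artifact(segment):
    """segment: voltages in microvolts for one trial and one channel."""
    steps = [abs(b - a) for a, b in zip(segment, segment[1:])]
    too_steep = max(steps, default=0.0) > MAX_STEP_UV
    too_wide = (max(segment) - min(segment)) > MAX_RANGE_UV
    return too_steep or too_wide


def baseline_correct(segment, n_baseline):
    """Subtract the mean of the first n_baseline samples (the pre-target epoch)."""
    baseline = sum(segment[:n_baseline]) / n_baseline
    return [v - baseline for v in segment]


def pool(channels):
    """Average equally long channel traces sample by sample (assumed pooling rule)."""
    return [sum(values) / len(values) for values in zip(*channels)]


# A 100 ms baseline at 500 Hz corresponds to 50 samples.
n_baseline = SAMPLING_RATE_HZ * BASELINE_MS // 1000
```

For example, `pool([f3, f5, f7])` would yield the trace of the pooled F5 electrode, where `f3`, `f5`, and `f7` stand for hypothetical per-channel averages of equal length.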

### **ANALYSIS OF EVENT-RELATED POTENTIALS**

Semantic priming effects were investigated to assess the sensitivity of the experimental setup. For this purpose, the ERPs to targets with semantically related primes were compared to unrelated primes (S vs. U) using standard *ad hoc* methods for ERP comparisons of two conditions. Semantic priming effects were assumed at those electrodes and time intervals at which S differed from U in pointwise *t* tests (α = 5% two-tailed) for an interval of at least 50 ms. Similar *ad hoc* analyses were performed to assess the influence of morphology (O vs. U) and form (F vs. U). The electrodes and locations at which semantic priming effects were observed were then used to define a region of interest (ROI) for the analyses of the interplay between semantic and morphological priming.

<sup>3</sup> Since response latencies were collected previously (see Experiment 2a in Smolka et al., 2009), we dispensed with the collection of latencies here so as to assure that the event-related potentials following target presentation were not confounded with brain potentials usually seen for response preparation.

Permutation tests were used to assess whether these *ad hoc* methods yielded robust results (Blair and Karniski, 1993). Permutation tests allow controlling the family-wise Type 1 error rate in multiple, possibly dependent significance tests (for a review, see Maris and Oostenveld, 2007). To this end we calculated a *t*max distribution for S – U using 10,000 permutations. In each permutation, the sign of S – U was selected at random for each participant, thereby simulating the null hypothesis in which *x* = S – U has the same probability as −*x* = U – S. We calculated the *t* values at each sampling point that fell into the ROI, and chose the maximum absolute value over all electrode clusters and time points (*t*max). The 95th percentile of this permutation distribution was selected as the critical *t*max. This means that the probability is 5% that any absolute *t*max value in the main analysis is above the critical *t*max value if the null hypothesis holds. Similar procedures were applied for investigating morphological and form-related priming.
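The sign-flip procedure described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study: the data layout (`diffs[p][k]` holding S – U for participant `p` at ROI point `k`), the function names, the fixed random seed, and the percentile indexing rule are all assumptions of the sketch.

```python
import math
import random


def t_value(xs):
    """One-sample t statistic of xs against zero."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    if var == 0:
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def t_max(diffs):
    """Maximum absolute t over all points (electrode cluster x sample) in the ROI."""
    n_points = len(diffs[0])
    return max(abs(t_value([row[k] for row in diffs])) for k in range(n_points))


def critical_t_max(diffs, n_permutations=10_000, alpha=0.05, seed=0):
    """Return the (1 - alpha) percentile of the sign-flip permutation distribution."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_permutations):
        # One random sign per participant, applied to all of that participant's points.
        signs = [rng.choice((-1, 1)) for _ in diffs]
        flipped = [[s * v for v in row] for s, row in zip(signs, diffs)]
        null.append(t_max(flipped))
    null.sort()
    index = min(len(null) - 1, math.ceil((1 - alpha) * len(null)) - 1)
    return null[index]
```

An observed effect would then count as significant at a given point if its absolute *t* value exceeds `critical_t_max(diffs)`. Because the same sign is applied to all points of a participant, the dependence between electrodes and time points is preserved in the permutation distribution.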

The primary question of this study is whether semantic transparency exerts priming on top of morphological priming (T vs. O). In order to avoid overly conservative correction due to the high number of partial tests in the permutation procedure, the permutation test that examined T – O was restricted to a ROI that was defined on the basis of the standard semantic priming effect, obtained by the comparison of semantically related to semantically unrelated primes, S – U. The ROI included those electrodes and time intervals at which S – U differed from zero in running *t* tests (α = 5% two-tailed) for at least 50 ms. Such a ROI approach is able to control the Type 1 error while preserving power: For electrodes and intervals where semantic priming effects are observed, that is, S ≠ U, it can be expected that T differs from O to a similar extent, if the same processes of semantic analyses are activated. Since we are interested in the differential effect of semantic priming on top of morphology, we will be testing the difference between T and O only at electrodes and intervals where semantic priming effects are visible (e.g., Gondan et al., 2007). Excluding other time points and electrodes from the analysis thereby reduces the set of partial tests that have to be controlled. The critical *t*max value is equally reduced, thereby increasing power to detect priming effects within the ROI.
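The duration criterion that defines the ROI (pointwise significance sustained for at least 50 ms) amounts to a run-length search over the p values of one electrode cluster. The sketch below is illustrative only: it assumes that 50 ms at the 500 Hz sampling rate corresponds to 25 consecutive samples, and the function name is hypothetical.

```python
SAMPLING_RATE_HZ = 500
MIN_DURATION_MS = 50


def significant_intervals(p_values, alpha=0.05, min_samples=None):
    """Return (start, end) sample indices, end exclusive, of runs with p < alpha
    that last at least min_samples consecutive samples."""
    if min_samples is None:
        # Assumed: 50 ms at 500 Hz is treated as 25 consecutive samples.
        min_samples = SAMPLING_RATE_HZ * MIN_DURATION_MS // 1000
    intervals = []
    start = None
    for i, p in enumerate(p_values):
        if p < alpha:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_samples:
                intervals.append((start, i))
            start = None
    # Close a run that extends to the end of the epoch.
    if start is not None and len(p_values) - start >= min_samples:
        intervals.append((start, len(p_values)))
    return intervals
```

Applied to the S – U comparison at each electrode cluster, the returned intervals would mark the samples that enter the ROI; shorter runs of significant samples are discarded.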

# **RESULTS**

### **SEMANTIC PRIMING**

**Figure 2** shows the grand averages of semantically related (S, *zerren-ziehen*, "drag"-"pull") and unrelated (U, *tarnen-ziehen*, "mask"-"pull") prime-target pairs. The curves start to deviate from each other at about 300 ms after stimulus onset with unrelated targets being more negative than associated targets on the central and posterior electrodes. The maximum difference is reached around 400–600 ms, indicating the typical attenuation of the N400 component by semantic associations. The upper panel in **Figure 6** provides the significant *t*- and permutation tests for this semantic effect.

### **MORPHOLOGICAL PRIMING**

Do morphologically related complex verbs prime their base? To calculate the priming induced by morphological relatedness, both morphological conditions (T and O) were compared with the unrelated condition U. **Figure 3** depicts the results. Each of the morphological conditions was far more positive going than the unrelated condition. The curves start to deviate in an early negativity, indicating an N250, followed by a positivity (P325), which again was followed by a strong N400 effect. The amplitude deviations between U and T as well as between U and O in the range of the N250, P325, and N400 components were significant for all electrode clusters and for both semantically transparent and opaque derivations. The second panel from the top in **Figure 6** provides the significant *t*- and permutation tests for the pure morphological effect (O vs. U).

Is there a semantic transparency effect in the lexical representation of morphologically related German words? To this end, we compared the ERPs of morphologically and semantically transparent word pairs (T, *zuziehen-ziehen*, "pull together"- "pull") to those of morphologically related, but semantically opaque word pairs (O, *erziehen-ziehen*, "educate"-"pull"). **Figure 3** shows the striking similarity of the two conditions: Most importantly, the priming effects of T and O were equivalent in amplitude. In line with this, the permutation test within the ROI defined by S – U did not reveal any significant difference between T and O. The third panel from the top in **Figure 6** demonstrates this resemblance between the two morphological conditions (T and O) in the *t*- and permutation tests.

We further tested whether morphological regularities generalize beyond meaning relatedness. If this is the case, we should find stronger morphological than semantic effects. Indeed, **Figure 4** shows the comparison between the conditions S and T, and indicates that the effects induced by morphologically and semantically related prime-target pairs were much stronger than the N400 effect produced by pure semantic associations.

### **FORM PRIMING**

To calculate whether form-relatedness affects target recognition, a condition with orthographically similar verbs (F, *zielen-ziehen*, "aim"-"pull") was compared with the unrelated baseline condition (U). As can be seen in **Figure 5**, this form effect starts at about 180 ms in a right frontal positivity (relative to the unrelated condition) that converges to an N250 effect and further extends to a weak frontal N400 effect. Form-related prime-target pairs typically induce the early positivity and N250 effects.

Importantly, this form effect significantly differed from the priming effect by morphologically related word pairs. While the form effect is right frontal, the morphological effect occurs at centro-parietal sites that characterize a typical N250 and N400 effect. The fourth panel from the top in **Figure 6** provides the significant *t*- and permutation tests comparing the two effects "form without meaning", that is, O vs. F.

Finally, we calculated the comparison (O – F) vs. (T – S). This comparison contrasts the effects of form-relatedness (without meaning) with those of meaning relatedness. The effect occurs left anterior, with the difference (O – F) more negative going than the amplitude of the comparison (T – S), indicating an extended N250 effect or an anterior positivity. The bottom panel in **Figure 6** provides the corresponding significant *t*- and permutation tests for this comparison.

# **DISCUSSION**

This study investigated the lexical representation of German complex verbs and compared the processing of morphological derivations that were either semantically transparent or opaque with respect to their base. Since effects of semantic transparency and semantic association are difficult to detect in either the masked or the long-term priming task, we used immediate repetition priming; and since semantic effects among morphological relatives tend to increase with SOA (for a review, see Raveh and Rueckl, 2000; Feldman and Prostko, 2002; Feldman et al., 2004), we used overt visual prime presentations at 300 ms SOA. We thus made sure that we are tapping into lexical processing. Our results were straightforward: We observed strong morphological priming effects in both conditions. Before we discuss these effects in more detail, though, we will first turn to inspect the semantic and form effects.

### **SEMANTIC PRIMING**

As hypothesized, we found a broad N400 effect with attenuated curves for semantically related verbs (*zerren-ziehen*, "drag"- "pull") relative to the unrelated verbs (*tarnen-ziehen*, "mask"- "pull"). This modulation of the N400 component is typical for semantic associations and it indicates that the semantic associations between verbs are strong enough to activate automatic spreading within a semantic network. In contrast to Kielar and Joanisse (2011) who observed no effect for semantic associations, our findings indicate that not only synonyms (cf. Domínguez et al., 2004) but also semantic associations are automatically activated within the semantic network.

Even though the N400 attenuation we found for semantic associations might be smaller than expected, one has to keep in mind that we are dealing with verb-verb pairs, which generally show smaller priming effects than noun-noun associations. While there are plenty of ERP studies measuring the effects of semantic association between nouns (e.g., Bentin et al., 1985), there are only few measuring the semantic relatedness between verbs (cf. Rösler et al., 2001; Smolka et al., 2013), so that there are only few studies for a direct comparison. With respect to verb pairs, we have repeatedly found a dissociation between the electrophysiological and the behavioral data: While the former always indicated strong semantic-priming effects in terms of N400 modulations (Rösler et al., 2001; Smolka et al., 2013), the latter generated both significant (cf. Exp. 2 and 3 in Smolka et al., 2009; Exp. 3 in Smolka et al., 2014) and nonsignificant priming effects (cf. Exp. 1 and 2 in Smolka et al., 2014; Exp. 1 in Smolka et al., 2009, 2013). This dissociation between electrophysiological and behavioral data suggests that ERPs represent a fine-grained means that makes it possible to measure subtle effects that do not surface under behavioral data collection.

Most importantly, the present N400 deflection by semantic associations proves that the experimental procedure in this experiment is sensitive to detecting semantic influences and tapping into lexical processing.

### **FORM PRIMING**

To control the effects of form similarity, we compared orthographically similar verbs like *zielen-ziehen* ("aim"-"pull") with the unrelated baseline condition. Orthographically similar primes induced a priming effect in terms of an early right frontal positivity that converges into an N250 effect and further extends to a frontal N400 effect. Form-related prime-target pairs typically induce the early positivity and N250 effects. This finding corresponds to previous masked priming studies that found anterior N250 and N400 effects (Holcomb and Grainger, 2006; Lavric et al., 2007; Morris et al., 2008, 2011, 2013) as well as to an overt priming study that observed an N400 attenuation effect for form priming (Lavric et al., 2011).

The early positivity is typical for form-related relative to unrelated prime-target pairs. The dual-route model, for example, assumes two parallel mechanisms (one orthography-based and one semantically based). Form-priming in terms of the N250 reflects the mapping of prelexical representations onto whole-word representations (specifically, a feed-forward prelexical morpho-orthographic segmentation that operates independently of lexical status and semantic transparency, see Morris et al., 2011), while later (N400) effects are thought to indicate the mapping of shared representations at the morphosemantic level (see e.g., Diependaele et al., 2005; Holcomb and Grainger, 2006; Morris et al., 2011, 2013). By contrast, the two-stage model assumes a single mechanism with two stages, an orthography-based morphological decomposition followed by semantic interpretation (e.g., Meunier and Longtin, 2007; Lavric et al., 2011).

Most importantly, the form condition in our study was more negative going than the morphologically related but semantically opaque condition. Since both conditions represent form similarity without meaning relatedness, it is interesting to note that the comparison of the two generates an N400 attenuation, which is typical for semantic effects (with the morphological modulation being more positive than the form condition). This indicates that even semantically opaque but morphologically related pairs are more strongly meaning related to their base than purely form-related pairs are. We will discuss this issue in more detail in the description of the model below.

Overall, we may conclude that the morphological effects we obtained with German complex verbs cannot be reduced to pure semantic and form relatedness between words.

### **MORPHOLOGICAL PRIMING**

To examine whether morphologically related complex verbs prime their base, we compared the two morphological conditions relative to the unrelated condition. Both curves were far more positive going than the unrelated condition, each producing an N250, followed by a P325, again followed by a strong N400 attenuation effect at all electrode clusters.

The strong N400 attenuation for semantically transparent derivations corresponds to the findings of all previous ERP studies using overt priming (Barber et al., 2002; Domínguez et al., 2004; Kielar and Joanisse, 2011; Lavric et al., 2011). In addition, we also found a strong N400 attenuation for semantically opaque derivations, which contrasts with a previous study using (partly) real morphological but semantically opaque derivations, which did not find any priming in this condition (Kielar and Joanisse, 2011). Our findings thus indicate that German complex words are accessed and represented via their stem regardless of meaning compositionality.

Moreover, we observed not only a strong N400 modulation by semantically opaque derivations but also that this N400 attenuation was as strong as that by semantically transparent derivations. That is, *erziehen* ("educate", semantically opaque) primed its base *ziehen* ("pull") to the same extent as *zuziehen* ("pull together", semantically transparent) did. This indicates that both derivations are accessed via their base regardless of their meaning relation to it.

The finding of equivalent priming from semantically transparent and opaque derivations corresponds to our previous behavioral findings (e.g., Smolka et al., 2014). Specifically, in the behavioral experiment using the same stimulus material and priming conditions as in this ERP study (Smolka et al., 2009), semantically transparent and opaque derivations yielded 43 ms and 40 ms priming effects, respectively (see also the summary of behavioral effects in Table 6 in Smolka et al., 2014). Altogether, these data indicate that semantically transparent and opaque derivations are lexically represented and processed in similar ways. We will discuss this issue in more detail in our proposed model of lexical representations (see below).

Finally, we asked whether morphological regularities generalize beyond meaning relatedness. Indeed, we found stronger N400 attenuation effects for morphological than semantic relatedness. This finding is particularly interesting, because the ratings of the association test indicated that semantic associates like *zerren* ("drag") were rated as significantly more related in meaning (5.9) to the target *ziehen* ("pull") than the morphologically related and semantically transparent primes (5.1) like *zuziehen* ("pull together").

Stronger priming for morphologically related and semantically transparent primes (i.e., in the T condition) than in the semantic condition can be readily explained by the convergence of codes view. Given that primes and targets in the T condition overlap both in form and meaning, the N400 should be more positive-going than with either orthographic (in the F condition) or semantic overlap (in the S condition) only. However, according to the same argument, the N400 amplitude in the O condition should be significantly less positive-going than in the T condition, since opaque primes share the form but no or little meaning with the target, but this was not the case.

With respect to the pure semantic effect, its occurrence is important since it indicates that the design of this study was sensitive to detecting semantic influences. The lack of semantic transparency effect in the morphological condition is thus not due to a general lack of semantic processing in this study.

A direct comparison of the present ERP data and the corresponding RT data from the study on which it was modeled (Smolka et al., 2009) reveals striking similarities. **Figure 7** provides all conditions for an easy overview. Targets in the unrelated condition showed slow RTs (532 ms) and the most negative-going N400 amplitude. This condition served as the baseline against which the priming effects were calculated. Form-related primes significantly inhibited responses (+16 ms) in the behavioral data and induced slightly more positive-going N250 and N400 amplitudes as compared with the unrelated condition. By contrast, the semantic associates yielded faster RTs (−21 ms) and a more positive-going N400 amplitude than the unrelated condition. This semantic effect was smaller than the morphological effects, that is, RTs were slower (≈20 ms) and ERPs were more negative-going than in the morphological conditions. Further, the two morphological conditions, T and O, yielded the strongest priming effects relative to the unrelated condition. This was evident in terms of the fastest RTs (−43 ms and −40 ms, respectively) and the most positive-going N400 amplitudes. Most importantly, neither the RTs nor the ERPs differed between the two morphological conditions. Finally, the morphologically related but semantically opaque condition showed significantly faster RTs (−56 ms) and far more positive-going N400 amplitudes than the form condition.
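The behavioral differences quoted above follow from simple subtraction, as the short sketch below shows. It uses only the baseline RT and the priming effects given in the text; the per-condition RTs it derives are implied values computed for illustration, not figures reported in the source, and the variable names are hypothetical.

```python
BASELINE_RT_MS = 532  # unrelated condition (U)

# RT differences relative to U, as reported in the text (negative = facilitation).
EFFECT_MS = {"S": -21, "T": -43, "O": -40, "F": 16}

# Implied mean RT per condition (derived here, not reported in the source).
implied_rt = {cond: BASELINE_RT_MS + diff for cond, diff in EFFECT_MS.items()}

# Opaque vs. form condition: difference between the two effects.
opaque_vs_form = EFFECT_MS["O"] - EFFECT_MS["F"]

# How much slower the semantic condition is than each morphological condition.
semantic_vs_morph = {cond: EFFECT_MS["S"] - EFFECT_MS[cond] for cond in ("T", "O")}

print(implied_rt, opaque_vs_form, semantic_vs_morph)
```

The difference between the opaque and the form condition comes out at −56 ms, and the semantic condition is 22 ms and 19 ms slower than the transparent and opaque conditions, in line with the "≈20 ms" stated above.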

If we summarize the behavioral (Smolka et al., 2009, 2014) and the electrophysiological data presented here, we may conclude that both types of data revealed strong morphological priming effects that were significantly larger than those induced by purely semantically related or form related complex verbs. This general convergence means that the morphological-priming mechanism involves a general cognitive phenomenon that can be captured by different methods. This renders our results even more robust: We have shown that complex verbs in German are accessed and processed via their stem, regardless of their meaning compositionality.

We thus provide evidence for the existence of a morphological dimension to lexical organization that cannot be reduced to formal or semantic relations between primes and targets. Most importantly, this indicates that morphological structure needs to be incorporated in the modeling of lexical representation in German.

Why is it that morphological processing and representation seem to be different in German compared to other Indo-European languages like English or French? In the following, we consider some possible factors that may affect language processing.

### **Affixation type**

One might argue that the strong morphological effects (without effects of semantic transparency) in our study arose from the use of prefixed (in contrast to suffixed) words. Indeed, only few overt priming studies (Marslen-Wilson et al., 1994; Feldman et al., 2002; Zwitserlood et al., 2005) used prefixed prime-target pairs that are similar to those in the present study. Nevertheless, they found priming from prefixed words only if they were semantically transparent (e.g., *disobey-obey* in English, *privole-volim* in Serbian, or *meebrengen-brengen* in Dutch), but not if they were semantically opaque (e.g., *restrain-strain*, *zavole-volim*, or *ombrengen-brengen*, respectively). Only prefixed verbs in German induced morphological priming from semantically opaque verbs (Drews et al., unpublished). We may thus conclude that the affixation type was not the critical factor that caused the morphological priming effects.

### **Productivity**

The productivity of verb derivations in German is extremely high. A single base verb may yield families of up to 150 complex (prefix or particle) verbs, all with different meanings ranging from truly transparent to truly opaque. For example, the German base *stehen* ("stand") has more than 100 prefixed derivations, while the same base *stand* in English possesses the prefixed derivations *understand* and *withstand* and about 20 phrasal verbs (cf. McCarthy et al., 2006). Furthermore, any complex verb is conjugated in exactly the same way as its base verb (i.e., with the same irregularities, if there are any) and thus keeps the link to its origin. Due to the high number of family members, German speakers may be more responsive to the base than English speakers are.

It is possible that the productivity of German verbs leads to a generalization of (morphological) form that becomes relatively independent of meaning relatedness, as is the case in root languages like Hebrew and Arabic. Indeed, some connectionist accounts suggest that whether one finds morphological priming without meaning relatedness depends on the morphological structure of the language as a whole (cf. Plaut and Gonnerman, 2000). In morphologically rich languages, the mappings between form and meaning are straightforward, so that morphological regularities will dominate language processing. Indeed, in the simulation of a morphologically rich language, priming effects extended to semantically opaque items as well (Plaut and Gonnerman, 2000). However, the network could not simulate equivalent priming effects for semantically transparent and opaque items, as we have found in German.

### **Particle separation**

German is a verb-second language with an SOV word order (e.g., Haider, 1985) and therefore separates the particle from its stem in finite forms, and places it at the end of the sentence. The particle, which complements the meaning of the complex verb, can thus occur many words after the stem, with an almost infinite amount of material—ranging from complex noun phrases to relative clauses—inserted in between the finite verb and its particle, as in *Der Bub hörte, nachdem er lauthals geschrien und mit den Beinen auf den Boden gestampft hatte, endlich auf/zu* (L: "The boy finally stopped/listened after he had screamed loudly and stamped with his feet"). It is possible that German readers/listeners are used to keeping more than one possible meaning active upon encountering a verb stem.

### **Morphological richness**

Interestingly, so far, strong morphological effects have been observed in Hebrew, Arabic, and German, providing evidence that lexical representation in these languages is guided by morphological structure. Indeed, like Semitic languages, German is a "morphologically rich" language among the Indo-European languages. Differences in morphological richness between Germanic languages such as English, Dutch, and German result from typological differences that emerged during language history (Roelcke, 1997). In synthetic languages like Proto-Germanic, morphology dominantly marked the grammatical relations (hence "morphologically rich"). In analytic languages, morphological markedness is reduced (hence "morphologically impoverished") and is replaced with syntax to mark grammatical relations, such as word order (De Vogelaer, 2007). In this sense, German is "morphologically richer" than other Indo-European languages, since it has kept morphological markers to indicate grammatical functions. For example, particles and prefixes of German complex verbs express the functions of adverbs of place, time, and manner in more analytic languages. Morphological richness—the use of morphology to express syntax—is a language characteristic that makes German more similar to Semitic languages like Hebrew and Arabic than to Indo-European languages.

We therefore stress the importance of cross-language and cross-linguistic evidence in building models of lexical representations. Most psycholinguistic models of lexical representations usually assume that what is true of one language is true of all. However, our results argue for cross-language differences in morphological processing and hence also in lexical representations. We assume that the features of German train native speakers to generalize the morphological form beyond the meaning of a particular whole-word derivation.

Most of the above-mentioned pre- and supralexical or connectionist models cannot incorporate the present findings in German, especially not those regarding opaque morphological effects. For example, the convergence of codes view can easily explain the priming effects in the transparent condition due to form-and-meaning overlap (i.e., with both form and meaning similarity with the target). However, we do not see how this approach can explain the occurrence of equally strong effects in the opaque condition, which shares form but no or little meaning with the target.

Another conceivable explanation is rooted in the type of associations triggered by primes and targets. Saussure (in Wunderli, 2013) distinguished between syntagmatic and paradigmatic associations. The former result from the different syntactic roles that words take in the same semantic context, such as *verb–noun*, *adjective–noun*, or *preposition–noun* combinations, as in *drink–coffee*, *red–car*, *lay–above*, *fall–down*. By contrast, paradigmatic associations result from the fact that distinct words that share similar meanings occur with the same set of other words. For example, *red* or *blue* co-occur with similar nouns like *flower*, *car*, *skirt*. Therefore, they have a high semantic similarity via these second-order associations.
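The difference between direct (first-order) co-occurrence and shared-context (second-order) similarity can be illustrated with a toy example. The sketch below is only an illustration: the miniature corpus, the function names, and the use of cosine similarity over co-occurrence vectors are assumptions made here, not the procedure of any study cited in this article.

```python
from collections import Counter
from itertools import combinations
import math

# Toy corpus (hypothetical): each inner list is one context window.
corpus = [
    ["red", "car"], ["blue", "car"],
    ["red", "flower"], ["blue", "flower"],
    ["red", "skirt"], ["blue", "skirt"],
    ["drink", "coffee"],
]

# First-order statistic: how often two words occur in the same window.
cooc = Counter()
for window in corpus:
    for a, b in combinations(sorted(set(window)), 2):
        cooc[(a, b)] += 1

vocab = sorted({word for window in corpus for word in window})


def first_order(a, b):
    """Direct co-occurrence count (a proxy for syntagmatic association)."""
    if a == b:
        return 0
    return cooc[tuple(sorted((a, b)))]


def context_vector(word):
    """Co-occurrence counts of `word` with every vocabulary item."""
    return [first_order(word, other) for other in vocab]


def second_order(a, b):
    """Cosine similarity of context vectors (a proxy for paradigmatic association)."""
    va, vb = context_vector(a), context_vector(b)
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm if norm else 0.0
```

In this toy corpus *red* and *blue* never occur together, so their first-order count is zero, yet they share all of their contexts, so their second-order similarity is maximal.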

For large text corpora, Rapp (2002) showed that first- and second-order statistical dependencies reflect the distinction between syntagmatic and paradigmatic associations, respectively. Further, a recent computational model of semantic access uses this distinction in terms of a direct association between words (due to Hebbian, syntagmatic, learning), or a large number of common associates (common, paradigmatic, contextual features) to successfully predict word activation levels (Hofmann et al., 2011; Hofmann and Jacobs, 2014). With respect to the present study, one could argue that opaque and transparent verbs differ in their associative status: opaque verbs may share paradigmatic contexts not with their base but with other derivations of their base, while transparent verbs share both a syntagmatic and a paradigmatic associative status with their base. However, future research is necessary to examine whether the syntagmatic/paradigmatic distinction can explain the similar activation of semantically transparent and opaque verbs in our study. For the time being we think that our data are best accommodated in a single-system model that allows for stem access regardless of regularity and semantic transparency. A short description is sketched below.

# **MODEL OF LEXICAL REPRESENTATION IN GERMAN**

In the following, we briefly describe the frequency-based model previously suggested by Smolka et al. (for details, see Smolka, 2005; Smolka et al., 2007b, 2009, 2013, 2014). Its main feature is that complex verbs, including regularly and irregularly inflected verbs as well as semantically transparent and opaque derivations, are segmented into stem and affixes and are lexically represented via their stems (and affixes).

The model assumes segmentation processes similar to those suggested by models of prelexical processing. We refer to these studies for a detailed description of the nature of early form-to-meaning mappings (cf. Diependaele et al., 2005; Marslen-Wilson et al., 2008; Crepaldi et al., 2010). Importantly, since morphemes are the smallest meaningful units, they emerge as the product of form-to-meaning mappings. In German, letter strings like *zuziehen* ("pull together") and *erziehen* ("educate") are segmented into their constituent morphemes regardless of meaning compositionality: *zu-*, *er-*, *zieh*, *-en*. This accounts for our finding of an N250 effect for all prefixed verbs in the morphological conditions of this study, which fits with the interpretation that N250 modulations indicate a "feedforward prelexical morpho-orthographic segmentation process that operates independently of lexical status and semantic transparency" (cf. Morris et al., 2011, p. 581).

Then the constituents activate their representations at the lexical level, so that both the transparent verb *zuziehen* and the opaque verb *erziehen* are lexically represented via their base {zieh} and affixes {zu}, {er}, and {en}, respectively. Since the target *ziehen* ("pull") activates the same lexical units {zieh} and {en}, its recognition is facilitated by the prior presentation of a complex verb with the same base. This accounts for our findings that the N400 attenuations induced by morphologically related words are independent of meaning compositionality. This also accounts for our finding that the facilitation in the form of N400 modulations by verbs sharing the same base is larger than that by semantically associated verbs holding a different base.
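The segmentation and stem-sharing steps described above can be sketched in a few lines. This is a deliberately naive illustration under stated assumptions: the affix inventories are limited to the morphemes mentioned in the text, the function names are hypothetical, and the comparison verb *schleppen* is chosen here for illustration and is not claimed to be a stimulus of the study.

```python
# Hypothetical affix inventories, limited to the morphemes named in the text.
PREFIXES = ("zu", "er")
SUFFIXES = ("en",)


def segment(verb):
    """Strip one known prefix and one known suffix, without consulting meaning."""
    prefix = next((p for p in PREFIXES if verb.startswith(p)), "")
    rest = verb[len(prefix):]
    suffix = next((s for s in SUFFIXES if rest.endswith(s)), "")
    stem = rest[:len(rest) - len(suffix)] if suffix else rest
    return prefix, stem, suffix


def lexical_units(verb):
    """Non-empty constituents that a verb activates at the lexical level."""
    return {unit for unit in segment(verb) if unit}


def shared_units(prime, target):
    """Lexical units activated by both the prime and the target."""
    return lexical_units(prime) & lexical_units(target)
```

Under these assumptions the transparent prime *zuziehen* and the opaque prime *erziehen* both share {zieh} and {en} with the target *ziehen*, whereas a verb with a different base shares only the suffix.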

Further, the finding that semantically opaque verbs induce the same amount of facilitation as transparent ones explains why both types of derivation induce an additional P325: Both types of derivations are lexically represented via the stem, just as the base verb is. Hence, the priming effect of the base corresponds to identity priming, which is typically reflected in positivities that precede the N400, such as the P325. For example, the P325 was found in repetition priming studies that used identical prime-target pairs like *table*-*table* (Holcomb and Grainger, 2006) or gender-inflected nouns like *bobo*-*boba* (cf. Domínguez et al., 2004); see also **Table 1**.

The finding that semantically opaque verbs induce the same amount of facilitation as transparent ones indicates that the stems were accessed before the meaning of the whole word, which contradicts the assumptions of a supralexical model (e.g., Giraudo and Grainger, 2000; Diependaele et al., 2005). This finding further contradicts the assumptions of distributed-connectionist approaches or the convergence of codes view, according to which semantically transparent words should always yield stronger effects than semantically opaque words (e.g., Rueckl et al., 1997; Plaut and Gonnerman, 2000; Kielar and Joanisse, 2011). These assume that morphological regularities emerge during visual word processing when orthographic codes are mapped onto meaning codes. During this mapping process, the strength of the semantic association is expected to affect the form-to-meaning mappings. Accordingly, semantically transparent derivations should always yield stronger priming effects than semantically opaque ones. However, this was not the case in the present study.

So far, we have explained how complex verbs are segmented (as indicated by the N250) and accessed via their stem regardless of meaning compositionality (as indicated by the P325 and the N400). How is the specific meaning of a complex word derived? If we are aware of the fact that even semantically transparent derivations yield specific idiosyncratic concepts from the meaning of the base and the function of the prefix, we may assume that transparent and opaque meanings are generated in similar manners. The very specific—more or less idiosyncratic—meaning of a complex word is activated by the lexical constituents that represent a word. For example, the stem-affix combination *zieh* ("pull") and *zu* ("together") will activate the transparent concept PULL TOGETHER, while the stem-affix combination *zieh* ("bind") and *er-*<sup>4</sup> will activate the opaque concept EDUCATE. Note that both concepts differ from the concept PULL of the single constituent.

In our frequency-based model, the specific meanings are selected by mechanisms that rely only on connections between lexical and conceptual units, choosing the most frequently activated concept upon the co-activation of the constituents. It is possible, though, to assume separate whole-word lemmas similar to "*superlemmas*" in idiom processing that are activated by the simultaneous activation of several constituents at the lexical level (e.g., Sprenger et al., 2006; Kuiper et al., 2007; Smolka et al., 2007a; Rabanus et al., 2008). Irrespective of how the specific meaning is activated following lexical access, our findings indicate that the complex verb is lexically accessed and processed via its stem.
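The selection mechanism described above, choosing the most frequently activated concept when constituents are co-activated, can be illustrated with a minimal sketch. All counts below are invented for illustration, and the data structure and function name are assumptions made here rather than part of the published model.

```python
# Hypothetical co-activation counts: how often a prefix-stem combination
# has been experienced together with a given concept. The numbers are invented.
experience = {
    (("zu", "zieh"), "PULL TOGETHER"): 40,
    (("zu", "zieh"), "PULL"): 5,
    (("er", "zieh"), "EDUCATE"): 55,
    (("er", "zieh"), "PULL"): 3,
}


def select_concept(prefix, stem):
    """Return the concept most frequently co-activated with this combination."""
    candidates = {
        concept: count
        for (units, concept), count in experience.items()
        if units == (prefix, stem)
    }
    if not candidates:
        return None  # no stored experience for this combination
    return max(candidates, key=candidates.get)
```

In this sketch the transparent and the opaque combination are handled by the same lookup; only the counts differ, which is the sense in which the selection relies on frequency rather than on meaning compositionality.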

In sum, our findings indicate that lexical representation in German refers to the base of a complex verb, regardless of meaning compositionality. This indicates that morphological structure represents an important aspect of language processing in German and must be incorporated in the lexical representation of German words.

# **AUTHOR NOTES**

This study was supported by the German Research Foundation (DFG), Grant For 254/2 awarded to Frank Rösler, and by the VolkswagenStiftung, Grant FP 561/11 awarded to Eva Smolka.

### **REFERENCES**


<sup>4</sup>It is under discussion whether German prefixes, which are bound morphemes, have a meaning of their own.

morpho-semantic influences in early word recognition. *Lang. Cogn. Process.* 20, 75–114. doi: 10.1080/01690960444000197


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 September 2014; accepted: 23 January 2015; published online: 26 February 2015*.

*Citation: Smolka E, Gondan M and Rösler F (2015) Take a stand on understanding: electrophysiological evidence for stem access in German complex verbs. Front. Hum. Neurosci. 9:62. doi: 10.3389/fnhum.2015.00062*

*This article was submitted to the journal Frontiers in Human Neuroscience*.

*Copyright © 2015 Smolka, Gondan and Rösler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms*.

# Must analysis of meaning follow analysis of form? A time course analysis

# *Laurie B. Feldman<sup>1,2\*</sup>, Petar Milin<sup>3,4</sup>, Kit W. Cho<sup>1</sup>, Fermín Moscoso del Prado Martín<sup>5</sup> and Patrick A. O'Connor<sup>1</sup>*

*<sup>1</sup> Department of Psychology, University at Albany, State University of New York, Albany, NY, USA, <sup>2</sup> Haskins Laboratories, New Haven, CT, USA, <sup>3</sup> Quantitative Linguistics Group of Harald Baayen, Eberhard Karls University, Tübingen, Germany, <sup>4</sup> Faculty of Philosophy, University of Novi Sad, Novi Sad, Serbia, <sup>5</sup> Department of Linguistics, University of California, Santa Barbara, Santa Barbara, CA, USA*

Many models of word recognition assume that processing proceeds sequentially from analysis of form to analysis of meaning. In the context of morphological processing, this implies that morphemes are processed as units of form prior to any influence of their meanings. Some interpret the apparent absence of differences in recognition latencies to targets (SNEAK) in form and semantically similar (sneaky-SNEAK) and in form similar and semantically dissimilar (sneaker-SNEAK) prime contexts at a stimulus onset asynchrony (SOA) of 48 ms as consistent with this claim. To determine the time course over which degree of semantic similarity between morphologically structured primes and their targets influences recognition in the forward masked priming variant of the lexical decision paradigm, we compared facilitation for the same targets after semantically similar and dissimilar primes across a range of SOAs (34–100 ms). The effect of shared semantics on recognition latency increased linearly with SOA when long SOAs were intermixed (Experiments 1A and 1B) and latencies were significantly faster after semantically similar than dissimilar primes at homogeneous SOAs of 48 ms (Experiment 2) and 34 ms (Experiment 3). Results limit the scope of form-then-semantics models of recognition and demonstrate that semantics influences even the very early stages of recognition. Finally, once general performance across trials has been accounted for, we fail to provide evidence for individual differences in morphological processing that can be linked to measures of reading proficiency.

### Keywords: reading proficiency, morphological processing, semantic transparency

# Introduction

Models of visual word recognition typically assume that some information about the form of a word must be available before access to the word's meaning is possible. In the absence of any additional knowledge about the word to be recognized, this assumption seems logical. Therefore, when applied to the domain of morphological processing, one might argue that a morpheme is processed as a unit of form prior to any influence of its meaning. This stronger claim is controversial because the classical linguistic position is that morphemes are both units of form and units of meaning. This contradiction is therefore worthy of further investigation.

### *Edited by:*

*Mirjana Bozic, University of Cambridge, UK*

### *Reviewed by:*

*Jens Bölte, Westfälische Wilhelms – Universität Münster, Germany Elisabeth Beyersmann, Aix-Marseille University, France*

### *\*Correspondence:*

*Laurie B. Feldman, Department of Psychology, University at Albany, State University of New York, SS 369, 1400 Washington Avenue, Albany, NY 12222, USA lfeldman@albany.edu*

*Received: 20 September 2014 Accepted: 15 February 2015 Published: 11 March 2015*

### *Citation:*

*Feldman LB, Milin P, Cho KW, Moscoso del Prado Martín F and O'Connor PA (2015) Must analysis of meaning follow analysis of form? A time course analysis. Front. Hum. Neurosci. 9:111. doi: 10.3389/fnhum.2015.00111*

Words that share a base morpheme (e.g., SHARP) tend to be similar in meaning as well as form (SHARPER, SHARPLY, SHARPEN, SHARPENER). Generally, however, words that look alike are not necessarily related in meaning (SHARK, SHARE, SHARD, HARP, TARP), and words that have similar meanings do not look alike (ACUTE, ASTUTE, CRISP, DISTINCT, CUNNING, INTELLIGENT). Therefore, morphologically related words represent a partial exception to the general claim that in language, form and meaning are related in a complex, and seemingly arbitrary fashion. Yet many would agree that, in the domain of word recognition, the meaning of a word can be informative about that word's form and vice versa (see for example, Bybee, 1985). Thus, in the broadest sense, meaning should provide a source of contextual information that could reduce uncertainty in early processing of form.

A framework for word reading and for morphological processing in particular, with an initial stage devoted to the orthographic properties of the input while remaining stubbornly independent of meaning, unnecessarily deprives that stage of a potentially useful source of information. In the present study we document how meaning and form interact continually when processing morphologically complex words, beginning with the earliest registration of input.

# Does Analysis of Meaning Follow Analysis of Form?

Among priming variants of the lexical decision paradigm, briefly presented primes [stimulus onset asynchronies (SOAs) < 60 ms] preceded by pattern masks are assumed to capture an early phase of processing (Forster et al., 2003). Under these conditions, similarity between prime and target benefits recognition as evidenced by reduced target decision latencies for similar pairs relative to unrelated controls (facilitation). According to a form-based account of early processing, target decision latencies should be faster both for prime-target pairs like sneaky-SNEAK or farmer-FARM (orthographically and semantically similar, often referred to as transparent), and for pairs like sneaker-SNEAK or corner-CORN (orthographically similar, semantically dissimilar, often referred to as opaque) relative to unrelated controls. Crucially, both types of related pairs should be equivalent because they are equally similar in form and semantics plays no role.

Previous studies using a masked priming manipulation report statistically equivalent facilitation for true (prefixed or suffixed) morphological derivations, and for primes that appear to be morphologically complex words but are not (-ER occurs as a suffix in English words such as FARMER but is a pseudosuffix in the word CORNER). Many studies have reported "morphological" facilitation that does not vary reliably with semantic similarity within a prime-target pair (Longtin et al., 2003; Rastle et al., 2004)<sup>1</sup>. Under the same conditions, it is difficult to document facilitation for word pairs like CORNEA-corn or BROTHEL-broth where the prime does not end in a sequence of letters that can function as a suffix (e.g., Rastle et al., 2000, 2004; Longtin et al., 2003; McCormick et al., 2009). Similarly, for non-word primes, facilitation following morphologically structured but not non-suffixed primes (Longtin and Meunier, 2005) is consistent with this account (but see Beyersmann et al., 2014, who showed facilitation following French non-suffixed (flexint-FLEX) as well as suffixed primes (flexent-FLEX) that was comparable in high-proficiency readers). Collectively, facilitation with pairs that appear to be morphologically structured like CORNER-corn provides the foundation for the claim that morphological facilitation in early visual word recognition is based only on orthographic structure and the potential to fully decompose a word and isolate its stem, without regard to the semantics of its morphemes. Complementarily, the absence of facilitation for pairs that are only partially decomposable (CORNEA-corn) serves as the foundation for the claim that the effect is morphological and not based only on similar orthographic form in prime and target.

The failure to find a difference (null effect) in magnitudes of facilitation for semantically similar and dissimilar pairs, like SNEAKY-sneak vs. SNEAKER-sneak, in individual experiments provides the foundation for the form-then-meaning account. Within this framework, the potential for successful decomposition determines morphological facilitation and semantic contributions do not arise until a later semantically informed stage that typically requires longer exposure durations of the prime (Feldman, 2000; Rastle et al., 2000; Taft and Kougious, 2004; Meunier and Longtin, 2007; Taft and Nguyen-Hoan, 2010). Accordingly, when semantic contributions are detected in tasks that purportedly tap early processing, they are attributed to feedback activation based on similarity at a morpho-semantic level that accrues fast enough to influence performance in a task that depends on an earlier phase of processing (e.g., Diependaele et al., 2005; Holcomb and Grainger, 2006; Morris et al., 2008, 2011, 2013). This account of semantic effects, however, differs from a 'supralexical' account (e.g., Giraudo and Grainger, 2001), where properties of morphemic constituents only become influential after activation of the full word. Finally, both accounts differ from those whose core claim is that form and meaning processes mutually shape each other. Whether one detects evidence of full word or of constituent processing depends on properties of the word that appears and its attributes and they tend to interact in a complex and non-linear manner (e.g., Kuperman et al., 2009).

In models of lexical access, statistically comparable magnitudes of facilitation for semantically similar and dissimilar pairs in individual experiments are taken as primary support for an early morpho-orthographic stage during which semantics plays no role (for a review see Rueckl and Aicher, 2008), although a meta-analytic review of the magnitudes of facilitation reveals an early semantic influence (Feldman et al., 2009). Across those studies that were proffered in support of the claim for a semantically blind process (Rastle and Davis, 2008) we reported that facilitation was significantly greater (10 ms) after semantically similar (transparent) than semantically dissimilar (opaque) morphologically related primes. This outcome attests to the role of semantics in a task that captures early processing where morphemes are purported to function as units of form (Feldman et al., 2009 but see Davis and Rastle, 2010). Further, meta-analysis contrasting semantically similar and semantically dissimilar primes demonstrates the risk of interpreting individual (null) findings. In our case it is the claim that parsability of a word's orthographic structure into stem (SNEAK) and potential affix (Y; ER) proceeds devoid of information about morphosemantic structure that we contest<sup>2</sup>.

<sup>1</sup>We eschew the terms transparent and opaque because while semantically dissimilar pairs like SNEAKER-SNEAK are true morphological relatives, that is not the case for pairs like CORNER-CORN. The morpho-orthographic account treats them as comparable even though there is no morpheme internal to CORNER whose semantics can be evaluated for transparency.

# Concurrent Access to the Semantic and Form Properties of Words in the Neurocognitive System

Challenges to the form-then-meaning assumption of processing are not limited to morphological models in word recognition tasks. Findings of near simultaneous access to the ortho-phonological and semantic properties of whole words are central to some current neurophysiological theories of lexical processing. For instance, Pulvermüller (2002) reports that when processing a word, the cortical subnetworks that code semantics rapidly fire when the subnetworks that encode orthographic and/or phonological forms of the words are activated. In Pulvermüller's (2002) view, the orthographic and semantic subnetworks form a single functional unit (i.e., a cell assembly in the Hebbian sense). In essence, concurrent access to the semantic and form properties of words seems not to be a peculiarity of masked priming in the lexical decision paradigm. Rather, it seems to be a general property of the neurocognitive system.

Analogous to the behavioral measures, primes with morphological or form similarity to the target typically show negative ERP amplitude in the latency range of 250 ms (N250) or 400 ms (N400) after target onset that is attenuated relative to an unrelated baseline condition (for a review of EEG findings see Smolka et al., 2015). Those ERP studies that report similar patterns for form similar, morphologically structured pairs, with and without semantic similarity, have been marshaled as evidence for a purely orthographic analysis of morphemes that operates at an early stage of visual word recognition. Revealingly, under these conditions relative to an unrelated prime-target pair, form and semantically similar pairs like FARMER-FARM typically generate either an N250 or both N250 and N400 attenuations (cf. Holcomb and Grainger, 2006; Lavric et al., 2007, 2012; Morris et al., 2007, 2008, 2011, 2013). By comparison, form similar and semantically unrelated albeit morphologically structured pairs like CORNER-CORN, and form similar but only partially structured pairs like CORNEA-CORN show less consistent results: no effect for either type, N250 attenuations for both types, or N250 concomitant with N400 attenuations in both types or only in the partially structured pairs (Holcomb and Grainger, 2006; Morris et al., 2007, 2008, 2011, 2013; Lavric et al., 2012).

At a minimum both the meta-analysis of the magnitudes of facilitation based on target decision time after semantically similar and semantically dissimilar morphologically related primes, as well as inconsistencies across ERP studies, highlight the risk of recruiting individual (null) findings as a justification for assigning form and meaning processes to distinct stages.

# Modeling the Time Course Over Which Form and Meaning Interact in Word Recognition

Several studies have investigated the role of a morpheme's semantic properties by holding form constant while manipulating the semantic similarity of a prime and target that share their base morpheme and then examining patterns of facilitation across long and short SOAs in the lexical decision task (Feldman and Soltano, 1999; Rastle et al., 2000, 2004; Feldman et al., 2002; Longtin et al., 2003; Diependaele et al., 2011). Most individual studies failed to observe reliable effects of semantics precisely when primes were masked and appeared at SOAs shorter than 60 ms<sup>3</sup>. However, one limitation in almost all of those studies was that different targets appeared with form-similar primes that did and did not preserve semantic similarity with the target. Consequently, differences between targets were confounded with semantic transparency. Nonetheless, a pattern begins to emerge suggesting that SOA may play a critical role in the detection of early semantic effects among morphological relatives. In Dutch, Diependaele et al. (2005) demonstrated a different time course for effects of semantically similar and dissimilar primes that were similar in form. Likewise in French with an incremental priming technique (Jacobs et al., 1995), facilitation arose with semantically and form similar primes at 40 ms while facilitation after semantically and form dissimilar primes was first evident only at a 67 ms prime duration. Typically, those manipulations of SOA are between experimental blocks (and often between subjects). Obviously, one can obtain a more detailed characterization of the time course of various types of facilitation if the SOA manipulation is within subjects, items, and experimental blocks. More specifically with these constraints, a joint analysis of responses across the different SOAs and prime types within a single regression model permits a more direct assessment and augments the potential to detect different time courses.
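The logic of a joint analysis across SOAs and prime types can be sketched with a toy calculation. The latencies below are synthetic and noise-free, invented purely for illustration; the variable and function names are assumptions, and the sketch uses an ordinary least-squares line rather than whatever regression model a full analysis would employ. In a balanced design, the slope of the similar-versus-dissimilar difference over SOA corresponds to the prime type by SOA interaction term of a single linear model.

```python
SOAS = [34, 48, 67, 84, 100]  # ms; the SOA values named for Experiments 1A and 1B

# Synthetic, noise-free mean latencies in ms (NOT data from the study).
rt_dissimilar = {soa: 620.0 for soa in SOAS}
rt_similar = {soa: 620.0 - 0.25 * soa for soa in SOAS}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Semantic effect at each SOA: advantage of similar over dissimilar primes.
semantic_effect = [rt_dissimilar[soa] - rt_similar[soa] for soa in SOAS]

# The slope estimates how fast the semantic effect grows per ms of SOA.
slope, intercept = fit_line(SOAS, semantic_effect)
```

With these invented values the semantic effect grows by 0.25 ms per millisecond of SOA. Sampling five SOAs rather than two is what makes it possible to check whether such growth is linear.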

A systematic comparison of facilitation across semantically similar (transparent) and dissimilar (opaque) prime types and across SOAs of 100 ms and shorter, while holding form similarity constant, is the primary objective of the current study. The rationale for sampling over a somewhat extended range of SOAs was to enhance interpolation by allowing for more precise diagnostics of possible non-linearities in patterns of facilitation<sup>4</sup>. In this regard, what we see as the main limitation of the studies enumerated above is that each considered at most two SOAs in the range before facilitation transitions from subliminal to conscious. Therefore, on the basis of those restricted data it is not possible to specify the time course over which facilitation emerges and/or disappears, or to differentiate linear from non-linear patterns within the semantic transparency by SOA interaction. The current design thus maximizes the potential to observe a non-linear relationship between SOA and facilitation due to semantic similarity between prime and target.

<sup>2</sup>We use the term "potential" because ER can be an affix (e.g., FARMER) but is not in the context of CORNER.

<sup>3</sup>The Feldman and Soltano study used unmasked SOAs of 48 and 250 ms.

<sup>4</sup>Extrapolation will be enhanced as well.

**Figure 1** illustrates four hypothetical but theoretically plausible profiles of the emergence of facilitation for semantically similar (black solid lines) and semantically dissimilar (gray solid lines) morphologically related pairs, in relation to unrelated pairs (dashed lines). Pattern (A) corresponds to an unlikely "full and instantaneous access" to the meaning as well as the form attributes of a lexical representation in the tradition of opening a lexical entry (Forster et al., 2003). Pattern (B) is a cascaded version of a sequential model where morphological effects are initially form-based and independent of semantics, with a gradually increasing semantic contribution. Within the form-then-meaning framework, where the underlying assumption is that words are decomposed and stems are initially processed in isolation and independently of their morphological context, accounts of semantic contributions sometimes introduce feedback activation based on similarity at a morpho-semantic level that emerges quickly enough to alter early processing. A similar solution has been proposed for form similar pairs with (e.g., Diependaele et al., 2005; Morris et al., 2008) and without (Hino et al., 2002; Pexman et al., 2008) shared morphology. The implication is that semantic effects are not evident at the earliest point at which visual input has been processed, and must await cascading or feedback activation from later in the processing hierarchy. Pattern (C), which we promote, depicts early access to both formal and semantic properties of the word, with a wider semantic neural assembly becoming progressively more activated, in line with the theoretical proposal of Pulvermüller (1999, 2001), Plaut and Gonnerman (2000), and Moscoso del Prado Martín (2007).
Here, semantically unrelated pairs that are fully decomposable like BROTHER-broth or CORNER-corn are more similar to pairs that are only partially decomposable like BROTHEL-broth or CORNEA-corn than to semantically similar pairs like BROTHY-broth or FARMER-farm (Milin et al., 2015). The implication is that effects of form and of semantic similarity operate concurrently and interdependently and that contributions increase even across very short SOAs in the 34–67 ms range. Pattern (D) represents the prediction of a purely sequential model in which access to semantic properties is blocked until some basic morphological processing, dependent only on word form, has been completed. This corresponds to models that posit an early morpho-orthographic segmentation stage that remains semantically blind and devoid of cascading semantics for some period of time, such as some readings of Rastle et al. (2004). It posits a discontinuity between discrete stages and thus fails to anticipate graded contributions of meaning. Note that all four patterns assume that decision latencies for the unrelated and semantically dissimilar condition remain relatively unchanged across SOAs shorter than 100 ms. Another version of (C), with no main effect of prime type, could result in a cross-over interaction by which an effect alternates between inhibition and facilitation. Finally, an alternative shape of (D) could show that the difference in similar and dissimilar facilitation is significant only in a particular band of SOA values, and not significant (above or) below that range.

Our goal in the present study is to document semantic influences in the early stages of morphological processing, searching as early as 34 ms. To obtain a more fine-grained characterization of the time-course of activation of the target by the prime, we examine facilitation patterns across a range of SOA values. Experiment 1A includes the three different SOAs of 34, 67, and 84 ms. Experiment 1B includes the SOAs of 48 and 100 ms. All SOAs are short enough to escape strategic processing (Neely, 1991) but vary enough to optimize detection of non-linear patterns of facilitation. Experiment 2 examines the single SOA of 48 ms presented to participants in combination with other SOAs and as a solo SOA. Experiment 3 focuses on a single SOA of 34 ms with consideration of individual differences and their relation to reading skill.

In each experiment, we compare facilitation after semantically similar and dissimilar primes that are forward masked, when both types of primes are highly similar in form to the same target. In earlier studies that have assessed early effects of semantics, different targets appeared with similar primes and with dissimilar primes. Although sets of targets were rigorously matched in those studies, unrelated decision latencies were slower for targets whose related prime context was semantically dissimilar as compared to similar. For example, in Feldman et al. (2009), target latencies (error rates) in the unrelated condition were 20 ms longer (2.8% greater) for dissimilar pairs like corner-CORN than for similar pairs like FARMER-FARM, and in Rastle et al. (2004) that difference was 23 ms (6.1%). Different unrelated baselines make it difficult to determine whether magnitudes of facilitation are comparable across targets whose related primes differ on semantic but not orthographic similarity. In the present study, because the same targets appeared with semantically dissimilar (SNEAKER-sneak) and similar (SNEAKY-sneak) primes we could eliminate any confounding between transparency and facilitation based on target attributes.

# Experiments 1A,B

To obtain a more fine-grained characterization of the time-course of activation of the same targets by different primes, we examine the effect of semantic transparency across five SOA values presented randomly within blocks of trials. This includes an SOA of 48 ms, the conventional duration at which to examine facilitation when primes are forward masked. Experiment 1A includes the three SOAs of 34, 67, and 84 ms. Experiment 1B includes the SOAs of 48 and 100 ms. We also incorporate a principal component analysis (PCA) and use PC scores as uncorrelated (orthogonal) predictors to offset differences between targets on classical measures of word recognition such as neighborhood size and frequency. Our primary focus is on violations of form-then-meaning processing as revealed by the time course over which evidence of early semantic processing emerges.

# Method

# Participants

One hundred and eight undergraduates participated in Experiment 1A and 86 in Experiment 1B. All were monolingual students at the University at Albany, and participated in partial fulfillment of the introductory psychology course requirements.

# Materials

Sixty-three stems were selected as critical word targets. Each appeared with a derivationally related or compound prime<sup>5</sup>. Three primes were created for each target word, and in a given experimental list, a unique third of the items was paired with semantically similar primes, a third with dissimilar primes, and a third with unrelated primes. The latter were formed from a different stem than their target. In the semantically similar condition, the meaning of the target (e.g., SNEAK, CAB) was retained in the prime (e.g., SNEAKY, CABSTAND). In the semantically dissimilar condition, primes (e.g., SNEAKER, CABBAGE) failed to retain the full meaning of the stem. The dissimilar condition included both semantically opaque primes that were related etymologically to the target (e.g., SNEAKER-SNEAK) as well as pseudomorphemic relatives (e.g., RATIFY-RAT). Unrelated primes (e.g., KEENEST, HEADSTAND) retained the final letter sequence (EST, STAND) of one of the related primes and had minimal

<sup>5</sup>The processing of compound words is subject to very similar debates about transparency as is morphological derivation (de Jong et al., 2000, 2002; Moscoso del Prado Martín et al., 2004a,b, 2005; Kuperman et al., 2008; Baayen et al., 2011). Specifically, transparency effects have been documented with compounds as well as morphological derivations and further, transparency can moderate effects of whole word and constituent frequency (Kuperman et al., 2009). Nonetheless, the claim that early processing is morpho-orthographic but semantically blind typically applies to patterns of facilitation after decomposition of derivations and generally ignores compounds. The most obvious justification is that in derivations, stems are privileged components when they are segmented from the morphemes with which they combine whereas the privileged status of one morpheme over another is less plausible in the context of unaffixed compounds because its two or more components contribute more equally to the meaning of the morphologically complex word. Stated generally, derivations and compounds behave more similarly when components are examined in relation to each other and appear most dissimilar when individual non-stem components are removed and stems are inspected in isolation.

letter overlap<sup>6</sup>. Five semantically similar primes, five semantically dissimilar primes and six unrelated primes were compounds.

The semantically similar and semantically dissimilar primes were closely matched on variables known to influence lexical decision latencies as well as normed single word lexical decision reaction time (Balota et al., 2007). These include length, logged Usenet frequencies in the HAL system (Lund and Burgess, 1996), orthographic neighborhood size, and phonological neighborhood size. In addition, similar and dissimilar primes did not differ on the number of sound, spelling, and sound plus spelling changes from prime to target. Critical stems recurred in full in (complex or compound) prime and in target (FIGLET-FIG; FIGMENT-FIG VS. ARCHWAY-ARCH; ARCHER-ARCH) on 75% of trials. For most pairs, the stem's spelling and pronunciation were retained fully in the prime (Widmann and Morris, 2009). Exceptions included final e deletion before some suffixes (SLIMY-SLIME) as well as other less systematic changes (PROVEN-PROOF; CELERY-CELL). Most important for our purposes, the number of instances of systematic and unsystematic mismatch was equalized across semantically similar and dissimilar prime types.

**Table 1** summarizes means and SDs for attributes of the 63 items that were included in the experiment. Five target items were eliminated from the dataset before analysis. These included two items whose primes were rated as similar, contrary to our initial classification (ABSENT, SEED). Another three items were removed to retain an equal number of items per prime condition in the multiple SOA experiments (PIG, FILL, SKIN). They were removed after the 48 ms SOA experiment.

Latent semantic analysis cosine values (Landauer et al., 1998) that capture semantic similarity based on the extent to which words appear in the same context and rating judgments based on a 7-point scale indicated that the meaning overlap between prime and target was always higher for semantically similar than for dissimilar pairs<sup>7</sup>. The LSA cosine values (SD) for semantically dissimilar [0.07 (0.20)] and similar [0.28 (0.09)] items were significantly different.

As in Feldman et al. (2009), we introduced many ID filler trials and concomitant list-wise semantic similarity so as to maximize evidence of morphological processing and the potential to detect

<sup>7</sup>Rating data appear in the Appendix.



*LEN, length in letters; log HAL, log HAL frequency; ON, orthographic neighbors; PN, phonological neighbors; FAM, morphological family size; LSA, latent semantic analysis; Rating, prime-target similarity rating.*

an interaction with semantic transparency in the forward masked primed lexical decision task (see Appendix A). Experimental lists with a high proportion of lexically identical (ID) prime-target filler trials (e.g., CRACKER–CRACKER) show semantic facilitation even when primes are forward masked and the SOA is brief (Bodner and Masson, 2003). Moreover, the inclusion of form-similar word–word ID and word–non-word quasi-ID trials to create a relatedness proportion of 75% significantly boosts semantic and morphological but not orthographic facilitation (Feldman and Basnight-Brown, 2008).

## Design

Across participants, all targets were preceded by semantically similar, dissimilar, and unrelated primes equally often. No target was repeated within a session. In Experiment 1A each participant responded to seven pairs in each condition created by the 3 prime type × 3 SOA design. In Experiment 1B each participant responded to 10 pairs in each condition created by the 3 prime type × 2 SOA design. Stimuli were counterbalanced such that across participants, all targets were presented with each prime approximately equally often, and no target was presented more than once to a participant.
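The counterbalancing described above can be sketched as a simple Latin-square rotation. This is a minimal illustration under stated assumptions: the target labels are placeholders, the rotation rule is one of several that would satisfy the constraints, and the crossing with SOA is left out. It is not the authors' list-construction procedure.

```python
# Hypothetical sketch of a three-list Latin-square rotation; the item
# labels and the rotation rule are assumptions, not the authors' lists.
PRIME_TYPES = ["similar", "dissimilar", "unrelated"]


def build_lists(targets):
    """Return one {target: prime_type} mapping per experimental list."""
    n_types = len(PRIME_TYPES)
    lists = []
    for list_index in range(n_types):
        assignment = {
            target: PRIME_TYPES[(item_index + list_index) % n_types]
            for item_index, target in enumerate(targets)
        }
        lists.append(assignment)
    return lists


targets = [f"target_{i:02d}" for i in range(63)]  # placeholder labels
lists = build_lists(targets)

# Within a list, a unique third of the targets goes with each prime type.
counts_per_list = [
    {p: sum(1 for v in lst.values() if v == p) for p in PRIME_TYPES}
    for lst in lists
]

# Across lists, every target meets every prime type exactly once.
types_per_target = {t: {lst[t] for lst in lists} for t in targets}
```

Because each participant sees only one list, no target is repeated within a session, while the rotation ensures that every target contributes to every prime condition across participants.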

In addition to the 63 critical items described above, 42 word–word pairs were included as filler stimuli. All of the word–word filler pairs had identical primes and targets (i.e., "identity" trials). Half of these were morphologically simple words. About one third included an affix and thus were complex. About one sixth were compounds. Each participant responded to 105 word target trials in total. In order to make the relation between high form overlap and target lexicality uninformative (cf., Rastle et al., 2004), 84 of the 105 word–nonword pairs contained the non-word target's form plus a frequent letter sequence as the ending (e.g., FRUGAL-FRUG) and 21 shared no letters in the same position (YEARBOOK-ANNON).
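The trial counts above can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the design; counting identity fillers together with the two related prime conditions as "form-overlapping" word trials is an assumption of this illustration, not a computation reported by the authors.

```python
# Counts taken from the design description; treating identity fillers as
# form-overlapping trials is an assumption of this sketch.
critical_targets = 63
related_critical = critical_targets * 2 // 3  # similar + dissimilar primes
identity_fillers = 42
word_trials = critical_targets + identity_fillers

nonword_trials = 105
nonword_form_overlap = 84
nonword_no_overlap = 21

# Share of trials with high prime-target form overlap, by target lexicality.
word_overlap_share = (related_critical + identity_fillers) / word_trials
nonword_overlap_share = nonword_form_overlap / nonword_trials

print(word_trials, word_overlap_share, nonword_overlap_share)
```

Under that reading, the share of form-overlapping primes is the same for word and non-word targets, which is what makes form overlap uninformative about the correct lexical decision.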

### Procedure

Each trial began with a 500 ms fixation mark (+) that appeared in the middle of the screen. An ISI of 48 ms occurred before the forward mask (number of # signs matched to prime length) that lasted 450 ms. The prime then appeared in lowercase letters for 34, 67, or 84 ms (Experiment 1A) or for 48 or 100 ms (Experiment 1B) and replaced the mask. The target was printed in uppercase letters and replaced the prime in the same position. Targets were visible for 3000 ms or until the participant made a response. The intertrial interval was 1000 ms. There was no mention of the primes in the instructions.

Items were presented in black 16-point font on a white background with E-Prime 2.0 (Psychology Software Tools, Inc.) on a PC-compatible computer with a Dell 17-inch LCD with a 60 Hz refresh rate. A different random order of prime-target pairs appeared for each participant. Participants made a lexical decision for each target by pressing the M key for words and the C key for non-words with their right and left index fingers, respectively. Participants responded to 12 practice trials before the experimental session, and the makeup of the practice stimuli mirrored that

<sup>6</sup>Under forward masked conditions in our experience, ON primes that fail to preserve the first letter with the target fail to produce facilitation. Therefore, occasional single letter overlap of an unrelated prime with the target is unlikely to alter the baseline relative to an unrelated prime that shares no letters.



of the stimuli in the main experiment. The study was approved by the Institutional Review Board of the University at Albany, State University of New York.
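On a 60 Hz display, a stimulus can only be shown for a whole number of refresh cycles of about 16.7 ms each. The sketch below maps the nominal prime durations onto the nearest number of frames. How the presentation software actually scheduled each duration is not stated in the text, so the nearest-frame rule and the resulting values are assumptions for illustration only.

```python
# Hypothetical mapping of nominal durations onto refresh cycles at 60 Hz.
# The nearest-frame rule is an assumption, not a documented setting.
REFRESH_HZ = 60.0
FRAME_MS = 1000.0 / REFRESH_HZ


def nearest_frames(duration_ms):
    """Closest whole number of refresh cycles (at least one)."""
    return max(1, round(duration_ms / FRAME_MS))


def realized_ms(duration_ms):
    """Duration actually displayed if the nearest frame count is used."""
    return nearest_frames(duration_ms) * FRAME_MS


nominal_soas = [34, 48, 67, 84, 100]
schedule = {soa: (nearest_frames(soa), realized_ms(soa)) for soa in nominal_soas}

for soa, (frames, shown) in schedule.items():
    print(f"nominal {soa:>3} ms -> {frames} frames ({shown:.1f} ms)")
```

Under this rule the five nominal SOAs correspond to two through six refresh cycles, so adjacent SOA conditions differ by a single frame.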

# Results and Discussion

Arithmetic means for prime type across the range of SOAs (34, 48, 67, 84, 100) are summarized in **Table 2**. For the analyses, correct latencies were transformed into their negative reciprocal (−1000/RT), to better approximate normality and homoscedasticity<sup>8</sup>. The results were analyzed using Generalized Additive Mixed Models (GAMM), with flexible treatment of random effect factors, as well as options for modeling non-linear interactions of covariates (cf., Wood, 2006, 2008)<sup>9</sup>.
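The −1000/RT transformation can be illustrated in a few lines. The latencies below are hypothetical values chosen for easy arithmetic; the function names are likewise only illustrative.

```python
def negative_reciprocal(rt_ms):
    """Transform a latency in ms to the -1000/RT scale."""
    if rt_ms <= 0:
        raise ValueError("latency must be positive")
    return -1000.0 / rt_ms


def back_to_ms(value):
    """Invert the transformation to recover the latency in ms."""
    return -1000.0 / value


latencies_ms = [500.0, 625.0, 800.0]  # hypothetical latencies
transformed = [negative_reciprocal(rt) for rt in latencies_ms]
recovered = [back_to_ms(v) for v in transformed]

print(transformed)
```

Because of the negative sign, the ordering of latencies is preserved: faster responses receive more negative values, so a negative coefficient on this scale still indicates facilitation. The reciprocal also compresses long latencies, which is what reduces the right skew typical of reaction-time distributions.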

# Principal Component Analysis

A set of target attributes documented to be relevant in word recognition, including log-transformed frequency counts reported in the HAL study (Burgess and Livesay, 1998), log-transformed SUBTLEX frequency per million words (Brysbaert and New, 2009), word length (in characters), and form-related neighborhood measures: number of orthographic neighbors (ON), number of phonological neighbors (PN), average distance to ONs (OLD20), and average distance to PNs (PLD20), were collected from the English Lexicon Project (Balota et al., 2007). Although each of these variables is useful to control, many of them are highly correlated. When they are included in analyses, this introduces a risk of multicollinearity, which was confirmed in the present study by a high condition number (κ = 49.94). To circumvent the problems associated with residualizing (see Wurm and Fisicaro, 2014), we applied a PCA, and then used PC scores as recombined and uncorrelated (orthogonal) predictors. Simply, principal component scores represent optimally weighted sums of the original set of variables with the goal of accounting for shared variance among related measures. An important feature is that they are particularly well suited for use in regression modeling (Dunteman, 1989).
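Why PC scores remove collinearity can be seen in the smallest possible case, two correlated predictors, where the PCA has a closed-form solution. The sketch below is a toy illustration, not the seven-variable analysis reported here: the frequency values are invented, the resulting components are not the PC1 and PC2 of the present study, and the condition number is computed with one common definition (square root of the ratio of the largest to the smallest eigenvalue of the correlation matrix), which may differ from the variant the authors used.

```python
import math


def standardize(values):
    """Return z-scores using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def correlation(xs, ys):
    """Pearson correlation of two equally long sequences."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


def two_variable_pca(xs, ys):
    """PCA of two standardized variables via the closed-form solution.

    The correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + r and
    1 - r, with eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
    """
    zx, zy = standardize(xs), standardize(ys)
    r = correlation(xs, ys)
    sum_scores = [(a + b) / math.sqrt(2) for a, b in zip(zx, zy)]
    diff_scores = [(a - b) / math.sqrt(2) for a, b in zip(zx, zy)]
    components = sorted(
        [(1 + r, sum_scores), (1 - r, diff_scores)],
        key=lambda pair: pair[0],
        reverse=True,
    )
    return r, components


# Invented log-frequency values for eight hypothetical words.
log_hal = [6.1, 7.4, 8.2, 9.0, 9.9, 10.5, 11.3, 12.0]
log_subtlex = [0.4, 1.1, 1.0, 1.9, 2.2, 2.1, 3.0, 3.3]

r, components = two_variable_pca(log_hal, log_subtlex)
(eig_a, scores_a), (eig_b, scores_b) = components

kappa = math.sqrt(eig_a / eig_b)  # one common condition-number definition
score_correlation = correlation(scores_a, scores_b)

print(round(r, 3), round(kappa, 2), round(score_correlation, 6))
```

The two original predictors are strongly correlated, yet the component scores are uncorrelated by construction, which is the property that makes them safe to enter jointly into a regression model.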

We kept only the first two principal components and their respective scores, as suggested by both the Kaiser–Guttman criterion (Horn, 1965) and the Scree-test (Cattell, 1966; Horn and Engstrom, 1979). These two components jointly explained about 77.5% of the variance that the full set of the seven original predictors explained. **Figure 2** shows the biplot of the two extracted principal components. The first principal component (PC1) captures neighborhood properties. The length, OLD and PLD neighborhood measures have high positive loadings while the ON and PN counts have negative loadings. The interpretation of PC1 is that a word with a high positive score would be longer, would have fewer neighbors (taking into account the negative loadings of ON and PN) and would be at a greater distance from those neighbors (since OLD and PLD both have positive loadings). Conversely, words with negative scores on PC1 would be shorter, with many neighbors in close proximity. In sum, words that occupy large and dense orthographic and phonological neighborhoods have negative scores on the first principal component and should be easier to recognize, whereas words that occupy more scattered and less densely populated neighborhoods have positive scores and should tend to be harder to recognize.
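The retention rule can be made concrete with a short sketch. The Kaiser–Guttman criterion keeps components whose eigenvalue exceeds 1, the variance of a single standardized variable. The eigenvalues below are invented for a seven-variable correlation matrix (they sum to 7) and are not the eigenvalues obtained in this study.

```python
def kaiser_retained(eigenvalues):
    """Components whose eigenvalue exceeds 1 (Kaiser-Guttman rule)."""
    return [ev for ev in eigenvalues if ev > 1.0]


def proportion_explained(eigenvalues, n_retained):
    """Share of total variance carried by the largest n_retained components."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:n_retained]) / sum(ordered)


# Invented eigenvalues for seven standardized predictors (sum = 7).
hypothetical_eigenvalues = [3.5, 1.9, 0.7, 0.4, 0.25, 0.15, 0.1]

retained = kaiser_retained(hypothetical_eigenvalues)
share = proportion_explained(hypothetical_eigenvalues, len(retained))

print(len(retained), round(share, 3))
```

With these illustrative values, two components pass the criterion and together carry roughly three quarters of the total variance, the same kind of outcome as the two-component solution reported above.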

The second principal component (PC2) captures frequency-related variables: HAL frequency and subtitle corpora frequency (SUBTLEX) both show very high positive loadings. Despite the fact that frequency, length, and various form-related neighborhood measures are highly collinear, with theoretically reasonable correlations (cf., Zipf, 1935; Baayen, 2001), the PCA orthogonalization yielded a frequency dimension and a neighborhood dimension that were uncorrelated (i.e., orthogonal). We thus pursued statistical modeling with these uncorrelated principal components as our main continuous predictors – form and frequency covariates.

<sup>8</sup>Box Cox transformations of the power function (Box and Cox, 1964) revealed that reciprocal transformations maximized normality for the data. Additionally, we used a negative reciprocal with a base of 1000 (−1000/RT), as advised by Baayen and Milin (2010). With negative values we preserve all effects in the expected direction, and with a base of 1000 we obtain a wider, more appropriate range of transformed values.

<sup>9</sup>We did not consider applying logistic additive mixed effect models to the error data due to very high accuracy, with fewer than 5% errors.

# Generalized Additive Mixed Modeling: Five SOAs

In order to examine semantically similar and dissimilar primes so as to compare their time course of facilitation, the results from the two multiple SOA Experiments (1A,B) were jointly analyzed using a single generalized additive mixed effect model, retaining reciprocally transformed RT latencies as the dependent variable. We considered fixed effects of type of prime (semantically similar, semantically dissimilar, unrelated), SOA (34, 48, 67, 84, 100), and the interactions between these variables. SOA was defined as an ordered factor; hence, we were considering its linear and non-linear terms (quadratic, cubic, and fourth order; i.e., number of ordered levels minus one), both as a main effect and in interaction with the type of prime.
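Treating a five-level factor as ordered amounts to recoding it into four mutually orthogonal trend contrasts (linear through fourth order) based on the rank of each level. The sketch below builds such contrasts by Gram–Schmidt orthogonalization of powers of the ranks 1 to 5. It is offered as an illustration of the general technique; the exact coding used by the authors' software is not given in the text.

```python
import math


def poly_contrasts(n_levels):
    """Orthonormal polynomial contrasts for equally spaced ordered levels.

    Returns n_levels - 1 vectors: linear, quadratic, cubic, and so on.
    """
    scores = list(range(1, n_levels + 1))  # ranks, not the SOA values in ms
    basis = []
    for degree in range(n_levels):
        v = [float(s ** degree) for s in scores]
        # Remove the part of v already captured by lower-order terms.
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b))
            v = [x - proj * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return basis[1:]  # drop the constant term


soa_levels = [34, 48, 67, 84, 100]
contrasts = poly_contrasts(len(soa_levels))
linear, quadratic, cubic, quartic = contrasts

for name, vec in zip(["linear", "quadratic", "cubic", "quartic"], contrasts):
    print(name, [round(x, 3) for x in vec])
```

Note that the contrasts depend only on rank order: the unequal spacing of the SOAs in milliseconds plays no role, which matches the description of SOA as rank-ordered.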

In addition to effects of SOA and prime type, the best model included additional non-linear effects called "smooth terms": a tensor product of the two principal components and random effects for both target and prime word items and participant identity. The analysis also revealed that the frequency-related principal component (PC2) required additional by-participant adjustments for the slope. The final model was refitted and we removed those absolute standardized residuals exceeding 2.5. In this model *R*<sup>2</sup> was 38%, on a final 8871 data points (after trimming). We describe the best model first and then elaborate on the contributions of smooth terms.
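The trimming step can be sketched as follows. The observed and fitted values are invented, and standardizing residuals by their sample mean and standard deviation is an assumption: mixed-model software may scale residuals somewhat differently.

```python
import math


def keep_mask(observed, fitted, cutoff=2.5):
    """True for data points whose absolute standardized residual <= cutoff."""
    residuals = [o - f for o, f in zip(observed, fitted)]
    n = len(residuals)
    mean = sum(residuals) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))
    return [abs((r - mean) / sd) <= cutoff for r in residuals]


# Invented example on the -1000/RT scale: 20 well-fitted points, 1 outlier.
fitted = [-1.6] * 21
observed = [f + e for f, e in zip(fitted, [0.02, -0.02] * 10 + [0.5])]

mask = keep_mask(observed, fitted)
trimmed = [o for o, keep in zip(observed, mask) if keep]

print(len(observed), len(trimmed))
```

In a full analysis the model would then be refitted to the trimmed data, so that a handful of extreme responses does not distort the estimated coefficients.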

The primary analysis is reported in **Table 3**. It revealed that both prime type and its interaction with SOA as a linear term were statistically significant. The main effect of SOA, again a linear term, was also weakly significant (*p* = 0.04). The main effect of prime type indicated that responses after similar (i.e., transparent) primes were faster than after unrelated primes (β = −0.0834, *p <* 0.0001), and responses after dissimilar primes were faster than after unrelated primes (β = −0.0337, *p* = 0.0006). Further, by contrasting the two related types of pairs we confirmed that targets' response latencies after semantically similar and dissimilar primes (SNEAKER-SNEAK vs. SNEAKY-SNEAK) were significantly different [Wald's test: χ<sup>2</sup>(1) = 25.674; *p <* 0.0001]<sup>10</sup>.
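Because the coefficients are on the −1000/RT scale, their size in milliseconds depends on where along the latency range they are evaluated. The sketch below converts the two prime-type coefficients into millisecond differences at a hypothetical unrelated-prime baseline of 600 ms. The baseline is an assumption for illustration, and the calculation ignores the SOA interaction, the smooth terms, and the random effects of the fitted model.

```python
def to_scale(rt_ms):
    """Latency in ms to the -1000/RT scale."""
    return -1000.0 / rt_ms


def to_ms(value):
    """Value on the -1000/RT scale back to latency in ms."""
    return -1000.0 / value


def facilitation_ms(baseline_ms, beta):
    """Speed-up in ms implied by adding beta at the given baseline."""
    return baseline_ms - to_ms(to_scale(baseline_ms) + beta)


BASELINE_MS = 600.0  # hypothetical baseline, not a value from Table 2

similar_gain = facilitation_ms(BASELINE_MS, -0.0834)
dissimilar_gain = facilitation_ms(BASELINE_MS, -0.0337)

print(round(similar_gain, 1), round(dissimilar_gain, 1))
```

At that assumed baseline, the coefficients correspond to facilitation on the order of 29 ms after similar primes and 12 ms after dissimilar primes; a slower baseline would yield larger differences for the same coefficients.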

More interestingly, the linear term for rank-ordered SOA interacted with the type of prime. This interaction is depicted in **Figure 3**. For similar (transparent) pairs, as SOA increased, decision latencies decreased linearly then appeared to stabilize at the longest two SOAs (84 and 100 ms). Informative is that for the dissimilar (opaque) pairs we observed a weaker and later decrease

<sup>10</sup>Modeling techniques standardly use a reference level for a categorical predictor (i.e., factor) to set the intercept, and then form comparison(s) for the remaining level(s) of that predictor. Thus, if there are more than two levels, non-referent levels are not directly compared. One statistically sound way to test for a specific difference is to use a test like Wald's. It tests the null hypothesis that a set of parameters is equal to some specified value (for details consult Fox and Weisberg, 2011). If the test does not reject the null hypothesis, this suggests that removing the parameters (which numerically define variables) from the model essentially would not harm model fit. For example, if a model has a three-level factor, one level will be selected as a reference and the differences to the other two will be tested appropriately. It is possible, however, that the Wald's test between the other two, non-reference levels comes out as non-significant. Then, it would be justified to pool those two levels together and to simplify the model, which will then have only one comparison: the one between the reference level and the combined levels, which have been tested as not significantly different.



in latencies as SOA increased. Specifically, a facilitation pattern starts to emerge from the second shortest SOA (48 ms), rather than at the shortest SOA (34 ms) as in the case of similar (transparent) pairs. For targets after unrelated primes (chalky-SNEAK), latencies slowly increase until the 67 ms SOA, where they become relatively stable. Although the analysis considered all linear, quadratic, cubic, and fourth order trends, only the linear trend reached significance. In the present study, we focus on whether the two coefficients of interest were statistically equivalent, that is, whether there were differences in the linear trends over SOAs for dissimilar vs. similar primes. Wald's test yielded a marginally significant difference [χ<sup>2</sup>(1) = 3.076, *p* = 0.08], suggesting that the decrease in latency for similar pairs tends to be steeper than for dissimilar pairs (see **Table 2**).
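A Wald test of whether two coefficients from the same model differ can be sketched with the standard one-degree-of-freedom formula: the squared difference divided by the variance of the difference, referred to a chi-square distribution. In the sketch below, the two coefficients are the prime-type estimates reported above, but the variances and covariance are invented, so the resulting statistic is not the one reported in the text. The p-value helper, by contrast, can be applied directly to a reported statistic.

```python
import math


def wald_chi2_difference(b1, b2, var1, var2, cov12=0.0):
    """Wald chi-square (1 df) for the null hypothesis b1 == b2."""
    var_diff = var1 + var2 - 2.0 * cov12
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    return (b1 - b2) ** 2 / var_diff


def chi2_sf_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Coefficients from the text; variances and covariance are hypothetical.
w = wald_chi2_difference(-0.0834, -0.0337, var1=1e-4, var2=1e-4, cov12=5e-5)
p_hypothetical = chi2_sf_df1(w)

# p-value implied by the reported statistic for the linear-trend contrast.
p_trend = chi2_sf_df1(3.076)

print(round(w, 3), round(p_trend, 3))
```

The covariance term matters because both coefficients are estimated against the same reference level (unrelated primes), so their estimates are typically correlated; ignoring it would misstate the uncertainty of the difference.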

Smooth terms are listed in part B of **Table 3**. The first row of part B reports the non-linear interaction of PC1 and PC2. Including the principal components in the analyses accounts for much of the variability among targets. **Figure 4** shows the fitted surface projected on the PC1–PC2 plane, where shorter response latencies are shown in green and longer latencies shade into yellow, orange, and then brown. Contour lines connect points on the surface that have the same latencies (that are the same height). This contour plot shows that response latencies tend to be long for words with large values on PC1 and low values on PC2. Simply stated, all else being equal, processing time increases for words that have fewer neighbors and are at a greater distance (positive values of PC1), especially when those words are low-frequency (negative values on PC2). The model also includes random intercepts for participants and items, both targets and primes. Finally, by-participant random slopes for PC2 also were statistically significant.

To summarize the analysis of target attributes, results show that a set of benchmark predictors, when reparameterized into two mutually independent principal components, entered into a strong non-linear interaction in predicting decision latencies. In principle, the frequency-related PC2 effect is in the expected facilitatory direction (i.e., negatively correlated with latencies), although it is modulated by the characteristics of the target's form-related neighborhood (PC1): words with few ONs, when scattered at a greater distance, showed the most attenuated effect of the frequency component (PC2). Several previous studies have reported that neighborhood density facilitates decision time (cf., Forster and Shen, 1996; Balota et al., 2004). In a study by Baayen et al. (2006), however, the neighborhood density effect disappeared when modeling allowed for a non-linear effect of word frequency. In the present study, therefore, we went a step further and tested for the interaction between the two composite predictors. The outcome demonstrates the interplay between a target's neighborhood density and word frequency effects, and that recognition can benefit from both. Stated succinctly, words with low-density neighborhoods benefit least from their frequency of occurrence. Finally, the contribution of the frequency-related PC2 benefitted from an additional by-participant adjustment, meaning that the influence of frequency was modulated both *generally*, by the target's neighborhoods (number of neighbors and neighborhood density), and *specifically*, by the differences between participants. This level of detail attests to the true complexity inherent to the dynamics of lexical processing and the excessive simplicity of models that treat all participants or all words as interchangeable.

### Analyses Targeting Exclusively Short SOAs

The primary analysis tested for effects of prime type at five SOAs and included PCAs. In response to reviewer comments, we also report two additional analyses, one restricted only to derivations and a second only to the shorter SOAs. However, we wish to emphasize that the joint consideration of PCA and priming outcomes across all SOAs is preferable to separate analyses at each SOA because only the former model makes use of the full dataset and its power. Furthermore in the full analysis, we reported a significant interaction between SOA and prime type. One consequence of restricting the range of SOAs *post hoc*, is an increase in the chance of a type II error (H0 is false but accepted), basically because correlations tend to be attenuated by reduced variability (e.g., Sackett et al., 2007). Finally, partitioning the data with knowledge of the contents of the partitions, and then applying a statistical procedure designed as a test for random partitions is, by definition, selection bias – a known violation in statistics.

The raw means in **Table 2** may mislead a reader into believing that there is no difference between dissimilar and similar facilitation at the shortest (34 ms) SOA<sup>11</sup>. However, the composite pattern across SOAs, participants and items reveals that the transparency effect is indeed present at 34 ms. Here, it is useful to remind the reader that our analysis of the time-course of facilitation treats SOA as a numerical, rank-ordered variable rather than a nominal one. This is important for two reasons. First, and most trivially, SOA is by its nature a numerical variable, and hence it should be treated as such; for instance, that 48 is bigger than 34 is an important component of the structure of the data, and this is wholly overlooked when analyzing multiple SOAs as unrelated nominal values. Second, and crucially, this enables us to exploit the power of non-linear regression to define the best-fitting line (or curve, if justified by the data) to account for the observed results<sup>12</sup>. To reiterate, if the critical interaction did significantly deviate from the linear trend that we observed in our analyses, it would have revealed itself in a higher order trend. Semantically similar pairs revealed no such interaction across the range of five SOAs.

Having expressed many concerns about *post hoc* partitioning of the data, at the request of reviewers we examine the pattern of facilitation at SOAs of 67 ms and shorter to determine if the longer SOAs are responsible for the SOA by transparency interaction and the difference between prime types. In addition, we report analyses excluding the small number of compound primes so as to restrict prime-target pairs to derivations, as did most of the previous studies on transparency.

Effects of semantic similarity with visible primes are incontrovertible and it is not impossible that some primes in some trials in the 84 and 100 ms SOA conditions were visible. Therefore here, we ask whether increases in the semantic similarity effect that we have documented with forward masked primes in Experiment 1 can be detected in the 34, 48, and 67 ms SOAs. Analysis showed that all significant effects in the analysis of the full dataset (five SOAs, from 34 to 100 ms) replicated almost perfectly. The only notable change was the weakening of facilitation for semantically dissimilar pairs. Most importantly, the difference between similar and dissimilar pairs across SOAs [5 SOAs: χ<sup>2</sup>(1) = 3.076, *p* = 0.08] remained reliable at the three shortest SOAs [χ<sup>2</sup>(1) = 15.856, *p* = 0.001].

### Analyses Targeted Exclusively at Derivations

As noted above, 8% (16/189) of the prime words were compounds rather than derivations. Therefore, to allay concerns that compounds could spuriously produce the early effect of semantic similarity between primes and targets, we ran two additional models: (a) removing only compound prime words (16 pairs) and (b) removing all targets that were paired with a compound in any of the prime conditions (nine targets, each with its three primes). In both analyses, similar to the previous *post hoc* analysis over the shortest SOAs (34–67 ms), the interaction of SOA with similar vs. dissimilar primes was robust [χ<sup>2</sup>(1) = 15.044, *p* = 0.0002].

It remains potentially informative to examine in more detail the pattern of facilitation at individual short SOAs because of claims that early processing relies on morpho-orthographic but not semantic properties of the prime, in which case differences between similar and dissimilar prime-target pairs (viz., semantic transparency effects) should not arise. That is our goal in Experiments 2 and 3, in each of which we concentrate power at a single SOA.

# Experiment 2

Across a range of five SOAs in Experiment 1, we observed that latencies to semantically related pairs decreased as SOA increased, with dissimilar pairs (opaque primes) showing this pattern later than similar pairs (transparent primes). In Experiment 2, we examine in more detail the pattern of facilitation at an SOA of 48 ms because these are the presentation conditions under which contention about early processing tapping not only into morpho-orthographic but also semantic properties of the prime has arisen (e.g., Amenta and Crepaldi, 2012). We continue to ask whether facilitation for semantically similar and dissimilar pairs differs. Further, to determine whether that finding depends on exposure to a single vs. multiple prime durations, we compare the findings in Experiment 2 to those from the 48 ms SOA in the multiple SOA design of Experiment 1.

# Method

# Participants

In Experiment 2 there were 84 participants from the same population as those in Experiment 1.

# Materials, Design, Procedure

With the exception that all materials appeared at the single SOA of 48 ms, all dimensions were identical to Experiment 1. The

<sup>11</sup>The ordering of latencies for means in **Table 2** indicates that semantically similar (SNEAKY-SNEAK) pairs were recognized faster than dissimilar pairs (SNEAKER-SNEAK), which in turn were faster than unrelated pairs (CHALKY-SNEAK). Considering only the mean RTs, this pattern is clearly visible at the intermediate SOAs but less evident at the shortest SOA. As explained above, we caution that these means are deceptive because they are not adjusted for systematic variability due to participant, target, prime or SOA, even though the experiments were designed with the deliberate goal of treating SOA as a numerical variable. Thus, we base our conclusion about differences between prime types as a function of semantic transparency on the more powerful modeling technique across SOAs, rather than on a comparison between arithmetic means in **Table 2**.

<sup>12</sup>Any apparent lack of a non-linear effect based on the means could be a simple consequence of noise. However, even if one did attribute the absence of a nonlinear effect to noise, that would not be sufficient to claim that a non-linearity is present so as to be able to claim that early processing is semantically blind and that semantic effects appear only later (**Figure 1D**).

items that were removed in Experiment 1 were again removed (PIG, FILL, SKIN, ABSENT, and SEED), for consistency.

# Results and Discussion

**Table 4** summarizes means of prime type by single vs. multiple SOAs (i.e., Experiment 2 vs. the 48 ms SOA data from Experiment 1). As in our previous analysis, latencies on correct trials were transformed into negative reciprocals (−1000/RT) and we used principal component scores to include effects of frequency and form-related neighborhood density. Then data were submitted to analysis using GAMMs.

Fixed effects of type of prime (semantically similar, semantically dissimilar, unrelated), number of SOAs (single vs. multiple), and the interactions between these variables, together with scores on two principal components (form-related PC1, and frequency-related PC2), constituted the full set of predictor variables. Prime and target items and participants were random effect terms. **Table 5** reports the final model that was obtained after removing absolute standardized residuals larger than 2.5 units. The estimated explained variance of this model was *R*<sup>2</sup> = 43%, on the remaining 11149 data points.

**Figure 5** represents the effect of prime type on response latencies at the 48 ms SOA. From **Table 5** we learn that similar pairs induced significant facilitation, compared not only to unrelated pairs (β = −0.0582, *p <* 0.01; for dissimilar relative to unrelated pairs, β = −0.01604, *p <* 0.07) but also to dissimilar pairs [χ<sup>2</sup>(1) = 21.992; *p <* 0.0001]. At the

TABLE 4 | Raw values of mean decision latency for targets after (form similar) semantically similar, dissimilar, and unrelated primes at 48 ms SOA, contrasting data from Experiment 1 with multiple SOAs, and Experiment 2 with 48 ms SOA only.


same time, the difference between unrelated and dissimilar pairs was only marginally significant at the 48 ms SOA (β = −0.01604, *p <* 0.07; see **Figure 5**). We can conclude that transparency effects are robust and generalize across single and multiple SOAs. At the same time, differences between unrelated and dissimilar pairs at 48 ms SOA are more reliable when modeled from an experimental setting with a single SOA than with a range of SOAs.

Finally, **Figure 6** shows a PC surface similar to the one in **Figure 4** (Experiment 1) for the 48 ms SOA when projected on the PC1–PC2 plane. In this case, the contour lines that connect surface points with the same latencies are slightly less wiggly and more stable. Nonetheless, the overall trends are quite comparable: longer response latencies for words with large values on PC1 and low values on PC2. As above, processing times are longer for low-frequency targets (PC2) with sparsely populated neighborhoods (PC1).

In sum, results based on the structure of the final model for the 48 ms SOA data support our claims based on the data collected over a range of SOAs. These include: (1) The main effect of the prime type: similar prime-target pairs contrast with both unrelated and dissimilar pairs, while the latter two do not differ reliably. (2) The contribution of random effects, including the need for fine-tuning with by-participant slope adjustments for the frequency-related PC2. (3) A reliable non-linear interaction of the two principal components (PC1: neighborhood density; PC2: frequency). (4) The effect of experimental setup with a single vs. multiple SOA contrast showing a small advantage for the pure 48 ms duration (*p* = 0.02). One point of modest divergence is that the difference between unrelated and dissimilar pairs at the 48 ms SOA is reliable with a single but not with multiple SOAs.

# Experiment 3

Experiment 1 revealed that the effect of semantic transparency was significant irrespective of SOA and, additionally, that the difference between similar and dissimilar pairs increased as SOA increased. Experiment 2 replicated the effect of semantic transparency at an SOA of 48 ms whether or not SOA varied during the course of the experimental session. Admittedly, in Experiment 1, when one considered only those data points at the shortest SOA (34 ms), the advantage of transparency based on the contrast of semantically similar and dissimilar prime-target pairs was statistically significant [Wald's test: χ<sup>2</sup>(1) = 25.674; *p <* 0.0001], but not substantial (a difference of 6.2 ms). Based on the 34 ms data from Experiment 1 alone, skeptics could argue that the presence of a transparency effect at the shortest SOA is only an artifact of the advanced (and treacherously deceptive) modeling technique. At the same time, restricting the analysis only to responses at the 34 ms SOA would cause a dramatic reduction in experimental power, such that a null result would be minimally informative. In particular, if the effect at the 34 ms SOA is as small as predicted by the regression (on the order of less than 10 ms, see **Figure 3**), any reduction in power would make it almost impossible to observe the semantic effect in question.

TABLE 5 | Generalized additive mixed model fitted to the lexical decision latencies for 48 ms SOA, reporting parametric coefficients (A), and non-linear terms, tensor products, and random effects (B) with effective degrees of freedom (edf), reference degrees of freedom (Ref. df), *F*, and *p*-values.

In Experiment 3, we also probe for an interaction of reading skill and morphological processing. One previous study reported that fast readers show greater effects of letter transposition within (*vioilnist-VIOLINIST*) than between (*violiinst-VIOLINIST*) morphemes while slower readers do not (Duñabeitia et al., 2014). Even more relevant to the present study is the claim that differences between semantically similar and dissimilar pairs presented for 48 ms with a forward mask in the lexical decision task depend on a participant's relative proficiency in spelling and vocabulary (Andrews and Lo, 2012, 2013). In addition to our basic design at a pure 34 ms SOA, in an attempt to ascertain influences of reading skill on morphological processing, we incorporated skill measures pertaining to vocabulary and spelling. In other respects, the prime-target materials and methods were identical to those in Experiment 2.

# Method

# Participants

In Experiment 3 there were 73 participants from the same population as those in Experiments 1 and 2 who had not participated in either of the previous experiments.

# Materials, Design, Procedure

With the exception that all materials appeared at the single SOA of 34 ms, the experimental setup was identical to that of Experiment 1. At the end of the experimental session, all participants completed a spelling dictation and a vocabulary test.

# Individual Difference Data

Two assessments of individual differences were introduced. The first was a spelling dictation test consisting of 15 items taken from Burt and Tate (2002). The second was a 30-item vocabulary test taken from Andrews and Lo (2013). Each item was presented with five response options from which participants had to select the response that best defined the given word. Materials for the spelling dictation and vocabulary tests appear in Appendices B and C, respectively.

# Results and Discussion

### Semantic Transparency

A generalized additive mixed effect model was fit to the reciprocally transformed correct RTs. This analysis revealed a main effect of prime type (with raw means for unrelated, semantically dissimilar, and semantically similar primes of 652, 639, and 622 ms, respectively).

Responses to semantically similar pairs, as well as to semantically dissimilar pairs, were significantly faster than to unrelated pairs (respectively: β = −0.0816, *p <* 0.0001; β = −0.0343, *p <* 0.002). Most crucially, similar pairs were significantly faster than dissimilar pairs [Wald's test: χ<sup>2</sup>(1) = 19.605; *p <* 0.0001]. Notice that the present outcome replicates what was previously predicted by the model in Experiment 1 (compare **Figures 3** and **7**).
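To make the scale of these effects concrete, the sketch below converts the raw condition means reported above into priming magnitudes and places them on a reciprocal scale. The exact form of the reciprocal transformation is not stated in this section; `-1000 / RT` is assumed here purely for illustration, and all function and variable names are hypothetical. Because the mean of transformed RTs is not the transform of the mean RT, the transformed differences should not be expected to reproduce the model coefficients.

```python
# Raw mean latencies (ms) reported for Experiment 3, by prime type.
raw_means_ms = {"unrelated": 652.0, "dissimilar": 639.0, "similar": 622.0}


def reciprocal_rt(rt_ms):
    """Assumed transformation (-1000 / RT): faster responses map to smaller values."""
    return -1000.0 / rt_ms


def facilitation(means, prime_type, baseline="unrelated"):
    """Priming magnitude in ms: baseline mean minus the mean for the given prime type."""
    return means[baseline] - means[prime_type]


transformed = {name: reciprocal_rt(value) for name, value in raw_means_ms.items()}

for prime_type in ("dissimilar", "similar"):
    print(prime_type,
          "| facilitation (ms):", facilitation(raw_means_ms, prime_type),
          "| difference on reciprocal scale:",
          round(transformed[prime_type] - transformed["unrelated"], 4))

# Transparency advantage: similar relative to dissimilar primes, in ms.
transparency_ms = raw_means_ms["dissimilar"] - raw_means_ms["similar"]
print("similar vs. dissimilar (ms):", transparency_ms)
```

On the assumed scale, negative differences correspond to faster responses, which matches the sign of the coefficients reported above.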

The tensor product in Experiment 3 appeared attenuated as compared with results from Experiments 1 and 2 (consult **Table 6** and **Figure 8**; note that for the tensor product of PC1 by PC2, *p* ≈ 0.05). Additionally, the by-participant adjustment for the frequency-related PC2 was non-significant, as was the by-prime intercept adjustment. Overall, the model at 34 ms is simpler: the main effect of prime type and the form and frequency PCs remained present, whereas the interaction between PC1 and PC2 was attenuated.

To probe for individual differences, we not only included two measures of reading skill (spelling proficiency and vocabulary), but also kept track of trial order, so as to maximize detection of individual variations. Thus trial order was entered into the model as a by-participant smooth factor for trials. It was highly significant (*F* = 4.777, *p <* 0.0001). **Figure 9** plots colored curves, one for each participant, representing how the participant's performance changes over the course of the experiment. These changes can be attributed to numerous factors, such as learning, fatigue, and changes in attention.

TABLE 6 | Generalized additive mixed model fitted to the lexical decision latencies for 34 ms SOA, reporting parametric coefficients (A), and non-linear terms, tensor products, and random effects (B) with effective degrees of freedom (edf), reference degrees of freedom (Ref. df), *F*, and *p*-values.

Inclusion of by-participant factor smooths highlights the inter-trial dependencies in the response latency time-series. Incorporating both random by-participant adjustments for the intercept and, additionally, explicitly handling the response latency time-series (i.e., autocorrelation) per participant allows one to test for systematic individual differences that can be attributed to spelling proficiency and vocabulary.
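The notion of inter-trial dependency can be illustrated with a lag-1 autocorrelation computed over one participant's latencies in trial order. This is only a descriptive sketch: the model handles such dependencies through by-participant factor smooths rather than through this statistic, and the latencies below are invented for illustration, not taken from the experiments.

```python
def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series has zero variance")
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


# HYPOTHETICAL latencies (ms) for one participant, in trial order (not study data).
drifting_rts = [610, 620, 635, 640, 655, 660, 672, 690]
print(round(lag_autocorrelation(drifting_rts), 3))
```

A value well above zero indicates that a response latency is predictable from the preceding trial, which is the kind of within-participant structure that would otherwise be left in the residuals.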

### Reading Proficiency and Morphological Processing

Rastle et al. (2004) claimed that early morphological processing is blind with respect to the semantic similarity of morphologically related prime and target, whereas Andrews and Lo (2012, 2013) claimed that readers who are more highly proficient in vocabulary than in spelling show effects of semantic similarity. In addition, Beyersmann et al. (2014) reported greater detrimental effects of partial morphological structure in non-words when proficiency was low. Nonetheless, Feldman et al. (2009, 2012) reported an effect of semantic similarity throughout their entire sample, regardless of reading skill. To explore the contribution of proficiency based on vocabulary and spelling dictation to effects of semantic transparency among morphologically related prime-target pairs, we compared four models that varied in their treatment of individual difference predictors.

The simplest model consisted of a dichotomized treatment of vocabulary (small vs. large) and spelling (low vs. high), similar to the methodology introduced by Andrews and Hersch (2010). To it we added our critical factor of prime type with three levels of prime-target relatedness (unrelated, dissimilar, and similar), and the random effect factor of target. As in Andrews and Hersch (2010), this model showed a strong main effect of dichotomized vocabulary (β = −0.0747, *p <* 0.0001), dichotomized spelling proficiency (β = −0.2212, *p <* 0.0001), and their interaction (β = 0.1916, *p <* 0.0001). The overall goodness of fit of this model, as expressed by Akaike's Information Criterion, was *AIC* = 2678.902.

A second model treated the two measures of individual differences in reading proficiency as continuous and possibly non-linear predictors and allowed for their interaction as well as including prime type and a random effect of target. This was a better model (*AIC* = 2202.487). The tensor product of vocabulary by spelling proficiency was highly significant (*edf >* 23,

More interesting was the change that emerged when we entered the simplest possible term for a random effect of participants; namely, an intercept adjustment. This model (third in the sequence) showed a significant effect of prime type including the essential difference between semantically dissimilar and similar pairs (dissimilar: β = −0.0313, *p <* 0.006; similar: β = −0.0733, *p <* 0.0001), and significant random effects of both items (*edf >* 44, *F* = 7.202, *p <* 0.0001) and participants (*edf >* 64, *F* = 20.819, *p <* 0.0001). In this analysis and unlike Beyersmann et al. (2014), prime type failed to interact with the proficiency measures. In fact, the tensor product of vocabulary by spelling completely vanished (*edf >* 4, *F* = 1.241, *p* = 0.29) while the goodness of fit dramatically improved (*AIC* = 1355.909).

Finally, we introduced a by-participant factor smooth for trials. Of all models, this model achieved the best goodness of fit (*AIC* = 1162.259). At the same time, however, the tensor product of the two predictors of individual difference showed an increased *p*-value (*p* = 0.42), indicating their complete irrelevance for the model's goodness-of-fit. Stated simply, the introduction of by-participant random variation over trials effectively outperformed our psychometric measures of individual differences in reading proficiency when predicting morphological processing and the role of early semantics.
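The model sequence can be summarized by differences in AIC relative to the best-fitting model. The sketch below uses the four AIC values reported above; the short model labels are our own shorthand, and the relative likelihood exp(−ΔAIC/2) is a standard summary that we add for illustration rather than a quantity reported in the analyses.

```python
import math

# AIC values reported for the four models in the sequence.
aic = {
    "1: dichotomized vocabulary and spelling": 2678.902,
    "2: continuous predictors with tensor product": 2202.487,
    "3: plus by-participant random intercepts": 1355.909,
    "4: plus by-participant factor smooths for trial": 1162.259,
}

best_name = min(aic, key=aic.get)
delta = {name: value - aic[best_name] for name, value in aic.items()}
# Relative likelihood of each model with respect to the best one: exp(-delta / 2).
relative_likelihood = {name: math.exp(-d / 2.0) for name, d in delta.items()}

for name in aic:
    print(f"{name}: delta AIC = {delta[name]:.3f}, "
          f"relative likelihood = {relative_likelihood[name]:.3g}")
```

Each step in the sequence lowers AIC, and the largest single drop accompanies the introduction of the by-participant random intercept.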

As mentioned above, measures of reading proficiency should capture systematic differences between readers, whereas by-participant adjustments for the intercept and/or the factor smooths for trials are, by definition, random effects. Unfortunately, the proficiency measures we relied on failed to decant systematic from unsystematic participant-related co-determinants of word processing. The implication is that although individual differences are bringing new and exciting questions and answers to lexical processing and related fields, there is reason for caution. Psychometric techniques can reveal robust indicators of systematic individual variations (e.g., Kuperman and Van Dyke, 2011; Van Dyke et al., 2014). They must be tested against unsystematic contributions such as behavioral variability during the course of the experiment, however.

# Combined Analysis of Experiments 2 and 3

Ignoring the issue of selection bias described above, and complying with the request of reviewers, we combined the data from the 48 ms and the 34 ms SOAs into one analysis in order to further document early semantic effects. As in the separate analyses for each experiment, we used reciprocally transformed RTs as the dependent variable and considered prime type (unrelated, dissimilar, and similar) and SOA (34 and 48 ms) as fixed factors, along with the tensor product of PC1 and PC2 as a smooth term and random effects of participants, primes, and targets. In line with previous analyses, the final model also included significant by-participant adjustments for the slope of the frequency-related PC2. All smooth terms were statistically significant.

Combined analysis of the data from Experiments 2 and 3 replicated the significant effect of prime type: both dissimilar and similar prime-target pairs had shorter response latencies than unrelated pairs (β = −0.0433, *p <* 0.0001 and β = −0.0796, *p <* 0.0001, respectively). Also consistent was the main difference between dissimilar and similar pairs [Wald's test: χ<sup>2</sup>(1) = 10.844; *p* = 0.001]. SOA (34 vs. 48 ms) did not reach significance as a main effect (β = 0.0131, *p* = 0.6757), nor did the interaction of SOA with type of prime; that is, increasing SOA did not reliably alter the difference between each of the prime types and the unrelated baseline (dissimilar: β = 0.0194, *p* = 0.0929; similar: β = 0.0211, *p* = 0.0678). Finally, across the two shortest SOAs, the interaction coefficients for the two form-related prime types matched closely (β = 0.0194 vs. β = 0.0211) and were statistically indistinguishable [Wald's test: χ<sup>2</sup>(1) = 0.021; *p* = 0.884].
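The *p*-values attached to the Wald statistics can be checked directly, assuming each statistic is referred to a chi-square distribution with one degree of freedom, as the χ<sup>2</sup>(1) notation indicates. The sketch below relies on the identity that the square of a standard normal variable follows that distribution; the function name and the condition labels are our own.

```python
import math


def chi2_sf_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    # If Z is standard normal, Z**2 is chi-square(1), so P(Z**2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


# Wald statistics reported in this article, all with 1 degree of freedom.
reported = {
    "Exp. 1, 34 ms: similar vs. dissimilar": 25.674,
    "Exp. 3: similar vs. dissimilar": 19.605,
    "Exps. 2 and 3 combined: similar vs. dissimilar": 10.844,
    "Exps. 2 and 3 combined: SOA slopes, dissimilar vs. similar": 0.021,
}

for label, statistic in reported.items():
    print(f"{label}: chi2(1) = {statistic}, p = {chi2_sf_df1(statistic):.4g}")
```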

# General Discussion

When the same targets were paired with semantically similar and dissimilar prime types, responses to semantically similar pairs were faster than to semantically dissimilar pairs and the latter differed only marginally from unrelated pairs. SOA in Experiment 1 was manipulated within an experimental block so as to enable us to track the time course of semantic contributions to morphological processing in a context where participants presumably apply the same processes to each trial. It could be argued that the presence of multiple SOAs within the same block, where some were consciously visible while others were only subliminal, induced strategic effects on lexical processing. However, a comparison of the data from the (pure) 48 ms SOA experiment with the 48 ms SOA data from the multiple SOA experiment failed to provide evidence for differences in the magnitude of facilitation. Instead, the only difference was that uncertainty as to when the target would appear in the multiple SOA experiment led to slower performance overall. To reiterate, multiple SOAs did not affect the magnitude of facilitation at a 48 ms SOA in any systematic way. The results of Experiment 3 confirm that the difference between semantically similar and semantically dissimilar morphologically related pairs was present and significant even at an SOA of 34 ms. In addition, the analysis combining the 34 and 48 ms SOA replicated the difference between semantically similar and dissimilar prime-target pairs. The difference increased between the SOAs of 34 and 48 ms too, but only marginally. Taken together, the model in **Figure 1B**, without a main effect of prime type, is not adequate.

Analysis of the short SOAs in Experiment 1 showed that the difference between semantically similar and dissimilar prime-target pairs increased with increasing SOA, whereas combining the 34 and 48 ms SOA data from Experiments 2 and 3 showed that the difference between semantically similar and dissimilar prime-target pairs was present in both and increased only marginally between the 34 and 48 ms SOA. We emphasize that the empirical contribution of our within-experiment manipulation of SOAs is its potential to better depict the time-course over which formal and semantic contributions to morphological processing arise. Although several studies have contrasted semantic and morphological effects (Bentin and Feldman, 1990; Feldman and Soltano, 1999; Rastle et al., 2000, 2008; Feldman et al., 2004) and have reported that semantic contributions increase with SOA, to date details of the pattern have not been thoroughly delineated. In part, this is because different targets appeared with similar and dissimilar primes so that disparities among target sets could not be cleanly differentiated from priming effects. Further, prior studies considered only one or two SOAs in the range before priming transitions from subliminal to conscious. For example, Rastle et al. (2000; **Table 2**) reported a main effect of semantic transparency that appeared to increase between the 43 and 72 ms SOA, but magnitudes of facilitation for semantically similar morphologically structured pairs were atypically large (45–60 ms) and baselines after unrelated primes varied widely across target types and SOAs. These factors made it difficult to interpret increasing transparency effects with increasing SOA as fundamentally semantic rather than idiosyncratic to particular targets. Consequently, from those data, one could not distinguish between the patterns represented in **Figures 1B–D**. The absence of detail was unfortunate given that the interaction of semantic transparency and SOA has become central to debates about models of morphological processing.

In this respect, a crucial innovation in the present study arises from considering SOA as the numerical variable that it naturally is. This enabled the explicit comparison of the multiple patterns of facilitation that correspond to different psychological theories. As we have seen, when the predictions of such theories are precisely described in the form of regression models, the results offer clear support for the model that is represented by a linear interaction. This method also offers a way of integrating the magnitudes of facilitation across SOAs. Furthermore, notice that the regression models have *predictive* value: one can interpolate and extrapolate the expected magnitudes of facilitation for other, not-observed, SOAs.
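The predictive use of a model in which SOA is numerical can be sketched with a deliberately simplified stand-in: an ordinary least-squares line relating SOA to the transparency advantage. The analyses reported here are mixed-effects regressions on transformed latencies, not this simple fit, and the SOA values and effect sizes below are hypothetical placeholders chosen only to show how interpolation and extrapolation work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all SOA values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(intercept, slope, soa_ms):
    """Expected transparency advantage (ms) at a given SOA under the fitted line."""
    return intercept + slope * soa_ms


# HYPOTHETICAL values for illustration only; these are not data from the experiments.
hypothetical_soas_ms = [30.0, 50.0, 70.0, 90.0]
hypothetical_advantage_ms = [5.0, 9.0, 13.0, 17.0]

intercept, slope = fit_line(hypothetical_soas_ms, hypothetical_advantage_ms)
print("interpolated at 60 ms:", predict(intercept, slope, 60.0))
print("extrapolated at 110 ms:", predict(intercept, slope, 110.0))
```

Treating SOA as a factor with discrete levels would permit neither prediction, because no value is defined between or beyond the tested levels.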

A model with a main effect of prime type as well as an interaction of prime type by SOA implies that early in the course of recognition there are contributions of both the formal and semantic aspects of morphological structure. Whereas the formal aspects behave less systematically at longer SOAs, semantic processing continues throughout a more extended period. This is consistent with a spread of activity across a word's neural assembly as occurs in 'full connectivity' types of models, such as those of Pulvermüller (1999, 2001), Plaut and Gonnerman (2000), or Moscoso del Prado Martín (2007).

The form-with-meaning account contrasts with models that assume that the formal aspects of morphological processing, i.e., stem-affix parsing, must be completed before access to the semantic properties can succeed. Form-then-meaning models, such as that proposed by Rastle et al. (2004) and Rastle and Davis (2008), would predict a non-linear pattern more similar to that illustrated in **Figure 1D**.

In summary, counter to the claims from the morpho-orthographic segmentation account, regression analyses allow us to document effects of semantic similarity not only at 48 ms SOA but also at 34 ms SOA. While an effect of semantic transparency earlier than the 48 ms SOA might not be compelling from the means of Experiment 1 alone, it was fully reliable in the GAMM and was replicated in Experiments 2 and 3. Thus, results failed to provide evidence for a qualitatively different and semantically blind style of processing at the earliest SOA.

# Reading Skill and Morphological Processing

Statistical predictors in complex models can obstruct each other's contribution by competing to account for the same bits of variation in a dependent variable. We believe that a similar characterization applies to the two measures of individual differences in reading skill that we examined in Experiment 3. Stated bluntly, skill contributions vanished when we introduced other, stronger predictors based on random differences between participants. To garner support for this claim, we examined a series of models with progressively more complex treatments of spelling proficiency and vocabulary as predictors of morphological processing and evaluated each in terms of Akaike's Information Criterion (AIC)<sup>13</sup>.

Our conclusion was that once the by-participant random variations were properly modeled, the psychometric measures of

Andrews and Lo (2013) handled the high intercorrelation between two standardized measures of individual differences of written language proficiency (ZSpell and ZVocab) by applying a principal component analysis (PCA). With this approach, in essence, the authors reparametrized the original variables into mutually independent (or orthogonal) predictors. This approach might be preferable to the more typical use of residuals of collinear predictors, but it is certainly not without problems (Wurm and Fisicaro, 2014). Crucially, PCA runs its component extraction in a sequential and greedy fashion; i.e., components are obtained one-by-one, each forced so as to explain a maximum of the unexplained variance, given the original set of variables. This "first served rule" grossly favors "first borns," leaving only "screes" for the remaining components [for further discussions about screes and the number of *true* principal components, consult seminal works by Cattell (1966) and Horn and Engstrom (1979)]. Crucially, this is the situation in Andrews and Lo's PCA: the first component captured 84% of the common variance, and the second component acquired the remaining 16%. In other words, with two variables that were subjected to PCA, the total variance is 2 (in standardized units where each of the initial variables has variance 1). The consequence is that in the analysis by Andrews and Lo, the second principal component carries a variance of only 0.32, that is, 32% of the variance of a single initial variable (ZSpell or ZVocab). The rest is consumed by the first principal component. Given the exceptionally high positive correlations (i.e., loadings) between the first principal component, on the one hand, and the original spelling and vocabulary measures, on the other, one cannot distinguish between the individual contributions of the two language proficiency measures in the linear modeling that ensues.

For the reasons enumerated above, in the present study we deliberately pursued a different methodology. In particular, we did not submit our original variables of written language proficiency to a PCA. The correlation between the two variables in the present case was relatively low (*r* = 0.35). Thus, the issue of collinearity was not urgent. Furthermore, we utilized only two measures of individual differences of written language proficiency and that allowed for explicit testing of their respective contributions as predictors when modeling reaction time latencies (this manner of handling of predictors in wide-range linear modeling has its own advantages; see Wurm and Fisicaro, 2014).
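The arithmetic behind the PCA argument can be made explicit. For two standardized variables with correlation *r*, the correlation matrix has eigenvalues 1 + |*r*| and 1 − |*r*|, which sum to the total variance of 2. The sketch below assumes that the PCA in question was run on the correlation matrix of the two standardized scores (an assumption on our part); the function names are hypothetical.

```python
def pca_two_standardized(r):
    """Eigenvalues and variance shares for the correlation matrix [[1, r], [r, 1]]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    eigenvalues = (1.0 + abs(r), 1.0 - abs(r))  # ordered from largest to smallest
    total_variance = 2.0  # two standardized variables, each with variance 1
    shares = tuple(ev / total_variance for ev in eigenvalues)
    return eigenvalues, shares


def correlation_from_first_share(first_share):
    """Absolute correlation implied by the variance share of the first component."""
    if not 0.5 <= first_share <= 1.0:
        raise ValueError("with two variables the first share lies in [0.5, 1]")
    return 2.0 * first_share - 1.0


# First-component share of 84%, as described above for Andrews and Lo (2013).
r_implied = correlation_from_first_share(0.84)
print("implied correlation:", round(r_implied, 2))
print("eigenvalues and shares:", pca_two_standardized(r_implied))

# Correlation between spelling and vocabulary in the present study.
print("eigenvalues and shares at r = 0.35:", pca_two_standardized(0.35))
```

Under these assumptions, the weaker correlation in the present sample leaves the second component with roughly a third of the total variance, so the two measures retain enough independent variation to be entered as separate predictors.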

Ultimately, we tested the two predictors by gradually tightening the specification of random effects related to variations across participants. In this way, progressively and explicitly and in a conservative manner, we have tested whether two measures of language proficiency could serve as valid indicators of "systematic" differences between individuals.

individual differences in reading proficiency contributed little to our understanding of morphological processing and the role of early semantics. Rather, the proficiency measures that were expected to systematically influence patterns of morphological facilitation were almost certainly random (noise). This outcome highlights a general concern about the fashion of incorporating psychometric measures of individual differences to model experimental data and demonstrates the necessity to incorporate detailed model criticism as a default (e.g., Ramscar et al., 2014; Van Dyke et al., 2014).

# Is Morphological Similarity Without Semantics Really Morphological?

When primes are forward masked and presented for very short SOAs, some have differentiated between primes whose morphological structure is partially decomposable into morphemes and primes with fully decomposable morphological structure and have argued that only words with a fully decomposable morphological structure can facilitate their targets. To be more precise, targets (CORN) that follow partially decomposable primes (a morpheme plus a non-morphemic letter string, as in CORNEA, where EA is not a morpheme) fail to differ from those that follow unrelated controls, whereas exhaustively decomposable primes (all letter strings can function as morphemes, as in CORNER, which appears to be composed of CORN + ER) purportedly facilitate recognition of the target word. According to a morpho-orthographic account, facilitation arises only when the morphological structure of the prime allows exhaustive segmentation into possible morphemes (Rastle et al., 2004; Rastle and Davis, 2008). Recent results challenge the claim for a morphologically informed orthographic process by showing significant and equivalent facilitation after word primes that are partially and fully decomposable into morphemes (Milin et al., 2015) as well as after partially and fully decomposable nonword primes (Beyersmann et al., 2014). If both fully decomposable (affixed, pseudo-affixed, and compound) words and partially decomposable words function comparably when primes are forward masked, then it becomes difficult to distinguish semantically dissimilar morphological from form-based processing.

In the present study we have demonstrated that even at an SOA of 34 ms, facilitation based on the appearance of a shared morpheme is weaker than facilitation based on semantic similarity in conjunction with the appearance of a shared morpheme. Collectively, results call into question a rigid differentiation between morpho-orthographic and morpho-semantic stages of processing.

# Conclusion

The overall trend documented in the present study replicates both the findings and the meta-analysis of Feldman et al. (2009) in that when targets were held constant, semantically similar prime-target pairs produce greater facilitation than semantically dissimilar, form similar pairs. The unique contribution of the present study was to track the time course over which semantic factors influence recognition when primes are forward masked. Across five SOAs that varied from 34 to 100 ms, when random effects due to items and participants were controlled, the time course of facilitation varied for form-similar prime-target pairs with and without semantic similarity. Finally, semantic transparency effects were reliable even at a uniform 34 ms SOA. These findings replicate and extend the results of Feldman et al. (2009).

<sup>13</sup>We point out that in our sample, the association between facilitation and reading and spelling skill, around which Andrews and Lo (2013) built their analyses and discussion, was absent. It is only fair to note, however, that our spelling measure entailed only dictation whereas theirs also included other measures. However, it is likely that types of spelling tests and the interrelation are not the real issue.

The opportunity to detect the linear increase of semantic transparency across SOAs underscores the value of concurrently treating subjects and items as random effects when analyzing latencies in repeated measures designs (Baayen et al., 2002, 2006, 2008; Forster and Masson, 2008). While an effect of semantic transparency in the earliest stage was not evident in the simple means of Experiment 1 that are reported in **Table 2**, consideration of the random effect structure and the systematicity of the relation between priming magnitudes and SOAs rendered this difference reliable in **Figure 2** based on a GAMM model. In fact in Experiment 3, the transparency effect was evident in the GAMM analysis even at an SOA of 34 ms. The outcome is consistent with a view of early lexical processing that entails extensive interaction between processes based on orthographic and semantic similarity.

Our data capture an early interaction of meaning with form. This interaction is noteworthy because it is inconsistent with a characterization of visual word recognition as a sequence of independent (morpho-orthographic then morpho-semantic) components or processes and highlights, instead, the dynamics of their interaction. Developments in cognitive neuroscience likewise are shifting away from an emphasis on independent brain regions and their function toward less localized networks with the potential for complex interactions at multiple scales. The interaction of semantics with form processing whose time course we have tracked may be representative of a style of processing in which traditionally conceived later processes influence purportedly earlier ones. With few exceptions (e.g., Plaut and Gonnerman, 2000; Moscoso del Prado Martín, 2007) models of visual word recognition typically posit independent and sometimes rate varying semantic and orthographic processes. A benefit early in the course of processing from semantic similarity between a morphological stem in isolation and in a morphologically complex prime word context is not easy to reconcile with models of word recognition that stipulate complete form analysis before analysis of meaning can begin. The outcome suggests that form and meaning properties of words or their constituents can be processed concurrently or otherwise influence each other. In essence, it challenges the universality of the form-then-meaning assumption within models of word recognition.

# Acknowledgments

The research reported here was supported by funds from the National Institute of Child Health and Development Grant HD-01994 to Haskins Laboratories and by Grants 179006 and 179033 from the Ministry of Education, Science and Technological Development of the Republic of Serbia.

# Supplementary Material

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnhum.2015.00111/abstract


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Feldman, Milin, Cho, Moscoso del Prado Martín and O'Connor. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Cascaded processing in written compound word production

Raymond Bertram<sup>1</sup>\*, Finn Egil Tønnessen<sup>2</sup>, Sven Strömqvist<sup>3</sup>, Jukka Hyönä<sup>1</sup> and Pekka Niemi<sup>1</sup>

<sup>1</sup> Department of Psychology, University of Turku, Turku, Finland, <sup>2</sup> Center for Reading Research and Department of Education, University of Stavanger, Stavanger, Norway, <sup>3</sup> Department of Linguistics and Phonetics, University of Lund, Lund, Sweden

In this study we investigated the intricate interplay between central linguistic processing and peripheral motor processes during typewriting. Participants had to typewrite two-constituent (noun-noun) Finnish compounds in response to picture presentation while their typing behavior was registered. As dependent measures we used writing onset time to assess what processes were completed before writing and inter-key intervals to assess what processes were going on during writing. It was found that writing onset time was determined by whole word frequency rather than constituent frequencies, indicating that compound words are retrieved as whole orthographic units before writing is initiated. In addition, we found that the length of the first syllable also affects writing onset time, indicating that the first syllable is fully prepared before writing commences. The inter-key interval results showed that linguistic planning is not fully ready before writing, but cascades into the motor execution phase. More specifically, inter-key intervals were largest at syllable and morpheme boundaries, supporting the view that additional linguistic planning takes place at these boundaries. Bigram and trigram frequency also affected inter-key intervals with shorter intervals corresponding to higher frequencies. This can be explained by stronger memory traces for frequently co-occurring letter sequences in the motor memory for typewriting. These frequency effects were even larger in the second than in the first constituent, indicating that low-level motor memory starts to become more important during the course of writing compound words. We discuss our results in the light of current models of morphological processing and written word production.

### Edited by:

Alina Leminen, Aarhus University, Denmark

### Reviewed by:

Cristina Baus, Universitat Pompeu Fabra, Spain; Vedran Dronjic, Carnegie Mellon University, USA

### \*Correspondence:

Raymond Bertram, Department of Psychology, University of Turku, Assistentinkatu 7, FIN-20014 Turku, Finland rayber@utu.fi

> Received: 03 December 2014 Accepted: 29 March 2015 Published: 21 April 2015

### Citation:

Bertram R, Tønnessen FE, Strömqvist S, Hyönä J and Niemi P (2015) Cascaded processing in written compound word production. Front. Hum. Neurosci. 9:207. doi: 10.3389/fnhum.2015.00207

Keywords: morphology, Finnish, compound words, writing, cascaded processing, linguistic processing, motor processes, syllable

# Introduction

The processing architecture underlying word production has for a long time been based on spoken language studies. More recently, the development of experimental on-line writing tools has generated studies that are concerned with written word production (e.g., Delattre et al., 2006; Sahel et al., 2008; Kandel et al., 2012; Baus et al., 2013). These studies typically address a number of questions that are related to the intertwinement of central linguistic processes and more peripheral motor processes. The main question here is to what extent linguistic units are planned before motor execution and to what extent during it.

Most studies concern the writing<sup>1</sup> of monomorphemic words and the evidence suggests that much of the planning is completed before motor execution (e.g., Baus et al., 2013). The current study is concerned with the writing of Finnish two-constituent noun-noun compounds (e.g., tennismaila "tennis racket"). Studies in language comprehension (e.g., Fiorentino and Poeppel, 2007) and spoken word production (e.g., Bien et al., 2005) have shown that the initial access of compounds may take place via the constituents, but there are also studies showing that it is mediated via whole-word representations (Janssen et al., 2008, 2014). The first issue is thus to investigate whether in written word production compounds are initially accessed as a whole unit (tennismaila) or via their constituent components (tennis and maila).

The second issue addressed in this study concerns the extent to which linguistic planning takes place during motor execution. Given that compounds are typically longer and linguistically more complex than monomorphemic words, it seems more challenging to have a detailed motor execution plan ready before writing them.

The current study investigates these issues by means of a picture-word elicitation paradigm. The introduction will first discuss studies that have investigated the amount of planning completed before production, followed by a discussion of studies that have investigated the amount of additional planning and processes that take place during writing. Finally, these issues will be linked to the model of written word production proposed by Kandel et al. (2011).

# Linguistic Planning before Production of Written and Spoken Words

A number of studies have investigated to what extent linguistic planning of monomorphemic words is completed before writing is initiated (e.g., Lambert et al., 2007; Baus et al., 2013; Roux et al., 2013). Typically these studies have investigated the effect a linguistic manipulation exerts on writing onset time (WOT). Baus et al. (2013) elicited monomorphemic words in Spanish by means of a picture naming paradigm and found that high-frequency words elicited shorter WOTs than low-frequency ones. Roux et al. (2013) manipulated the lexicality of letter strings by employing a French word/pseudoword-copying task and found that WOTs are much shorter for words than for pseudowords. Lambert et al. (2007) found both a lexicality and frequency effect in a French word/pseudoword-copying task. Taken together these results suggest that for monomorphemic words the whole orthographic representation is retrieved before motor execution and that the level of activation is determined by word frequency. Lambert et al. also found that WOTs are independent of the number of syllables for real words, but not for pseudowords. This led them to conclude that the syllabic structure of words is not analyzed in detail before writing, but that for pseudowords, which lack a whole-word orthographic representation, letter strings are chunked into syllables. However, it seems that for words at least the first syllable is fully prepared for motor production before writing. This claim is supported by a study in German by Will et al. (2004), who found that WOT is correlated with the length of the first syllable. Longer latencies for longer syllables indicate that all letters have been retrieved and handed over to the motor program before writing commences.

There are no studies of written word production that have investigated the effect of morphological complexity on WOT. That is, no written word production study has addressed the question whether morphologically complex words are initially retrieved via their morphemes, the whole-word form or both. However, a few studies in the other production modality, speech, have addressed this question. These studies show mixed results. Bien et al. (2005) investigated whether speech onset latencies were sensitive to constituent and/or compound frequency in a position-response association task. In this task participants first learned to associate each compound with a visually marked position on a computer screen, after which they had to produce the relevant compound in response to the appearance of the position mark. Compounds with high-frequency 1st or 2nd constituents elicited shorter response latencies than compounds with low-frequency 1st or 2nd constituents. The manipulation of the whole-word frequency had little effect on response latencies. Similarly, Koester and Schiller (2008) found that reading aloud Dutch compound words as primes (e.g., jaszak, "coat pocket") speeded the response to a subsequently presented picture of the first constituent (jas, "coat"), whereas form-related monomorphemic prime words (e.g., jasmijn, "jasmine") did not. Both studies support a decomposition account, which holds that compounds are initially retrieved via their constituents (see Zwitserlood et al., 2000, 2002 for additional evidence).

However, there are two studies that failed to find constituent effects. Janssen et al. (2008) found that production latencies in a picture naming task eliciting compounds in both Mandarin Chinese and English are a function of whole-word rather than constituent frequencies. Janssen et al. (2014) conducted a large-scale regression study on concatenated English compound words and found the same. In the latter study, a large number of potentially confounding variables were controlled, ruling out the possibility that the whole-word frequency effect was a result of methodological differences with other speech production studies. The effect for whole-word frequency thus remained reliable, whereas constituent frequency (or constituent family size) effects could not be found.

Janssen et al.'s (2014) results are not only different from those in other spoken word production studies, but also from those in several word comprehension studies. More specifically, constituent effects are reported by several masked priming (e.g., Duñabeitia et al., 2009), visual lexical decision (e.g., Fiorentino and Poeppel, 2007) and eye movement studies (e.g., Pollatsek et al., 2000), indicating that decomposition is involved in the processing of compound words. When Janssen et al. (2014) extracted lexical decision times from the English Lexicon Project (Balota et al., 2007) for the same compounds as in their production experiment, they found both surface and constituent frequency effects. Janssen et al. concluded that when compounds are explicitly available in the input (as in lexical decision or in the picture-word interference experiments), constituents are actively involved in lexical processing. In contrast, when compounds have to be retrieved from semantic memory without recent exposure, they are retrieved as holistic units. The authors propose that taken together the results support a dual route account, where the activation of the decomposed route depends on the nature of the input representation. The present study investigates whether the results of Janssen et al. (2014) showing holistic compound retrieval extend to written word production or whether compounds are decomposed before retrieval, as found in several other speech production and comprehension studies. In case of decomposition, it is possible that only the first constituent is retrieved before writing commences; in this case only a first constituent frequency effect will be observed in WOT.

<sup>1</sup>Note that with writing we refer to both handwriting and typewriting.

# Factors that Influence Written Word Production During Motor Execution

According to Damian (2003), central cognitive processes do not influence spoken word production once motor execution (i.e., articulation) has started. However, as Delattre et al. (2006) have argued, this is clearly not the case for motor execution during writing. According to them, there is more scope for cascaded processing in writing than in speaking, as writing (a) is a less practiced activity than speaking; (b) has evolved much later than speaking and (c) typically takes more time than speaking. Several studies indeed show that linguistic planning in general and morphological planning in particular take place during the motor execution phase of written word production. For example, in a handwriting study of Kandel et al. (2008), the inter-letter interval (ILI) between the root and the suffix in derivational suffixed words (e.g., boulette "small ball") was compared with the ILI at the same position in pseudosuffixed words (e.g., goélette "caravel"). It was found that ILIs prior to the suffix were longer for suffixed than pseudosuffixed words. This led the authors to conclude that the writing system anticipated the production of the suffix and that letters are grouped in linguistically motivated chunks. Kandel et al. (2012) replicated these findings and also showed that letter durations (the time it takes to write a letter) before morpheme boundaries are inflated in comparison to letter durations before pseudoboundaries. An interesting additional finding in this study was that the results were only obtained for suffixed but not for prefixed words.

A typewriting study by Sahel et al. (2008) investigated by means of a word copying task whether second constituent and/or whole word frequency predicted the inter-key intervals (IKIs) between the first constituent and second constituent of German compound words. They found that IKIs were affected by both and argued that these results lend support to a dual-route account, which postulates that whole-word and decomposition procedures run in parallel and interact with one another. However, given that the study did not consider WOT as a dependent measure, no conclusions about initial compound retrieval can be drawn. Weingarten et al. (2004) found that two-letter sequences (bigrams) at morpheme boundaries in German compounds elicited much longer IKIs than bigrams at pure syllable boundaries or intrasyllabic bigram transitions. Thus, in a word like Maiskolben ("corncob"), the IKI at the morpheme boundary between s and k is much longer than the IKI at the pure syllable boundary between l and b or the IKIs of all other intrasyllabic bigrams. In other words, a constituent boundary prolongs the writing of two adjacent letters, much more than any other factor. This is even the case when exactly the same bigrams are considered at different positions within words (intrasyllabic, syllable boundary, constituent boundary, see Weingarten et al., 2004). Taken together, the results imply that at least for suffixed and two-constituent compound words the second constituent morpheme is activated (or reactivated) at the morpheme boundary.

Weingarten et al. (2004) also report syllable-based effects during writing; the IKIs for the intersyllabic bigrams in their study were much longer than those for intrasyllabic bigrams. These syllable-based effects are also reported in Spanish and French (Kandel and Valdois, 2006; Kandel et al., 2006; Álvarez et al., 2009). Moreover, they are found with different inputs (visual words, auditory words, pictures), in different dependent measures (IKIs, ILIs, letter writing duration, gaze lifts) and with different populations (adults, children, bilinguals). In all these cases it is reported that writing slows down at or around the syllable boundary, indicating that the system prepares the production of upcoming syllables whilst writing. Kandel et al. (2006) note that the role of syllables is likely to be more prominent in languages with clear syllable structure. Finnish, the language of the current investigation, has a regular syllabic structure with clearly defined syllable boundaries, no ambisyllabicity and stress falling practically always on the first syllable. Thus, we may expect solid syllable effects for Finnish as well.

Apart from the impact of clear linguistic boundaries, certain letter combinations within or across such boundaries also may affect motor execution during writing. Weingarten et al. (2004) report that gemination, the doubling of vowels or consonants, leads to faster typing of the second letter in comparison to the second letter of letter sequences with different letters. This has an obvious explanation: the finger is already positioned on the target key when typing the second letter. This benefit is not self-evident in handwriting, where similar movements have to be made for writing the first letter and the second letter in geminate pairs. However, Kandel et al. (2014) reported shorter letter production times also for the second letter in a geminate pair (the second s in Lisser compared to the t in Lister) in handwriting. Moreover, they did not find the typical inflation effect for ILIs at syllable boundaries for words with gemination (Lisser). The available evidence thus suggests that also in handwriting there is some kind of motor preparation effect that speeds up the production of the second letter in the geminate pair. Interestingly, Kandel et al. (2014) found that this kind of motor preparation takes place at the expense of writing the initial letters of a word. More specifically, they found longer writing durations for the first three letters (Lis) in the geminate word (Lisser) than in the control word (Lister). This implies that gemination requires additional planning during motor execution which slows down the writing of the initial letters. All in all, the results led Kandel et al. (2014) to conclude that gemination annuls the syllable-by-syllable programming strategy.

Kandel et al. (2011) investigated the interaction between bigram frequency and syllable boundary in handwriting. For visual word recognition it has been argued that readers become sensitive to orthographic regularities like the co-occurrence of adjacent letters (bigrams, trigrams), such that frequently co-occurring letters develop stronger links and can be processed more quickly than less frequent sequences (Seidenberg, 1987; Treiman and Zukovski, 1988). As intrasyllabic letters co-occur more often than intersyllabic letters (Adams, 1981), the syllable effects reported in comprehension (e.g., Prinzmetal et al., 1986) and production studies (e.g., Kandel et al., 2006) may be bigram frequency effects in disguise. Doignon and Zagar (2005) showed that this is partly the case, as their syllable effects were attenuated for high-frequency bigrams at the syllable boundary. However, the fact that the syllable effect was not completely wiped out by high-frequency bigrams indicates that the syllable is a functional processing unit during visual word comprehension (see also Rapp, 1992). Similarly, Kandel et al. (2011) found that a relatively frequent bigram at the syllable boundary increases ILIs for children and adults alike, but not as much as would be expected on the basis of bigram frequency alone. That is, a high-frequency bigram at a syllable boundary is not written as fast as the same high-frequency bigram in intrasyllabic position.

# A Model for Written Word Production

To account for the findings presented above, Kandel et al. (2011) proposed a model of written word production that includes linguistic modules, a spelling module and motor modules (see **Figure 2**). The linguistic modules pertain to the activation of intentions and gearing up the semantic and syntactic system including semantic retrieval. The spelling module includes a number of abstract processing levels that are active in parallel. In the initial phase of the spelling module, the orthographic representation of the whole word is retrieved. This representation activates in turn syllables at the syllable level, which in turn activate letters at the letter level. The letter level also stores knowledge about letter co-occurrence (bigrams, trigrams) as well as knowledge about phoneme-grapheme correspondences. Subsequently letters will be transferred to the motor modules, where graphomotor planning for handwriting takes place, including the selection of allographs (e.g., uppercase or lower case), leading to the eventual production of letters. Note that this phase is different for typewriting, as for typewriting a series of hand and finger movements have to be programmed in standard keyboard space (for a more detailed description of written word production models, see Weingarten et al., 2004; Kandel et al., 2011; Purcell et al., 2011). Kandel et al.'s model is derived from the classic Van Galen (1991) model, but differs from it by adding a syllable level and an abstract letter level to the spelling module to account for the syllable and bigram/trigram effects found in several studies.

# Experiment

The current study investigated a number of issues. First, we asked whether retrieval of Finnish compound words (e.g., tennismaila "tennis racket") takes place via morphological constituents (tennis and maila) or whether retrieval is holistic in nature. To that end, compounds with varying constituent and whole-word frequencies were selected and these frequency variables were entered in the regression analyses as predictors for Writing Onset Time (WOT). It was assumed that if retrieval took place holistically, the whole-word frequency would predict WOTs, whereas constituent frequency effects would indicate decompositional retrieval. In order to investigate in more detail what is prepared before writing we entered a number of other variables as well. We anticipated that at least the length of the 1st syllable would affect WOT (cf. Weingarten et al., 2004).

In order to investigate how much linguistic planning goes on during writing, we extracted all the Inter-Key Intervals (IKIs) between subsequent letters and entered a number of variables as predictors in the regression analyses. In particular, we were interested to investigate to what extent certain linguistic transitions and bigram and trigram frequencies affected IKIs. In our compounds (e.g., tennismaila "tennis racket"), we distinguished four different types of transition: intrasyllabic no-boundary transitions (in our example te, en, ni, is, ma, ai, la), syllabic gemination transitions (in our example nn), pure syllabic transitions (in our example il) and morphosyllabic transitions (in our example sm). The impact of bigram frequency and syllable boundaries on IKIs was assessed in more detail in an additional IKI-analysis to examine whether effects depended on IKIs appearing in the first or second constituent. More specifically, in this way we assessed the time course of effects within words. Finally, we were interested in whether any of the effects were affected by participants' typing skills, so average typing speed was also added to both analyses. All variables will be described in more detail in the method section.
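The four transition types can be illustrated with a minimal sketch. It assumes the syllabification of each constituent is supplied by hand (no automatic Finnish syllabifier is implied), and the function name `classify_transitions` is hypothetical. Two further assumptions go beyond the text: doubled letters inside a syllable are labeled as no-boundary transitions, and a doubled letter straddling the constituent boundary is labeled morphosyllabic.

```python
def classify_transitions(syllables_c1, syllables_c2):
    """Label each letter-to-letter transition in a two-constituent compound.

    Labels: 'n' intrasyllabic (no boundary), 'g' geminate across a syllable
    boundary, 's' pure syllable boundary, 'm' morphosyllabic boundary.
    """
    syllables = syllables_c1 + syllables_c2
    word = "".join(syllables)

    # Index of the last letter of every syllable except the final one.
    syllable_ends, pos = set(), 0
    for syllable in syllables[:-1]:
        pos += len(syllable)
        syllable_ends.add(pos - 1)

    # Index of the last letter of the first constituent.
    morpheme_end = len("".join(syllables_c1)) - 1

    labels = []
    for i in range(len(word) - 1):
        if i == morpheme_end:
            label = "m"  # assumption: constituent boundary takes priority
        elif i in syllable_ends:
            label = "g" if word[i] == word[i + 1] else "s"
        else:
            label = "n"  # assumption: intrasyllabic doubles count as 'n'
        labels.append((word[i:i + 2], label))
    return labels


# The example compound from the text, syllabified as ten-nis + mai-la.
transitions = classify_transitions(["ten", "nis"], ["mai", "la"])
for bigram, label in transitions:
    print(bigram, label)
```

For tennismaila this reproduces the classification given in the paragraph above: nn is the geminate, il the pure syllabic transition, sm the morphosyllabic transition, and the remaining seven bigrams are intrasyllabic.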

# Method

### Participants

Eighteen undergraduate students of the University of Turku participated in the experiment. All were native speakers of Finnish, and had normal or corrected-to-normal vision.

### Apparatus

The program used for our experiment is called ScriptLog, invented by Strömqvist and Malmsten (1998) and further developed by Strömqvist et al. (2006). ScriptLog is a program with two windows, an elicitation (picture) window and an editor window. In the editor window the participant types the word that corresponds to the picture presented in the picture window. The program registers the production time of letters and words, typing errors and their corrections, inter-key intervals and writing onset time (among other things). In other words, it allows for the extraction of a multitude of measures that give a detailed insight into the writing process.
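As an illustration of the two timing measures mentioned here, the sketch below derives a writing onset time and inter-key intervals from a list of keystroke timestamps. The input format, the function name `writing_measures` and the example timestamps are assumptions made for illustration only; they do not reflect ScriptLog's actual log format or any data from the study.

```python
def writing_measures(picture_onset_ms, keystroke_times_ms):
    """Return (writing onset time, list of inter-key intervals) in ms.

    picture_onset_ms: time at which the picture appeared.
    keystroke_times_ms: timestamps of successive keystrokes, in typing order.
    """
    if not keystroke_times_ms:
        raise ValueError("at least one keystroke is required")
    # Writing onset time: picture presentation to first keystroke.
    wot = keystroke_times_ms[0] - picture_onset_ms
    # Inter-key intervals: time between each pair of successive keystrokes.
    ikis = [later - earlier
            for earlier, later in zip(keystroke_times_ms, keystroke_times_ms[1:])]
    return wot, ikis


# Invented timestamps for a four-letter word (hypothetical values).
wot, ikis = writing_measures(0, [1500, 1720, 1900, 2175])
print(wot, ikis)
```

A word of k letters yields one onset time and k - 1 intervals, which is why interval-level predictors such as individual bigram frequency can be entered per transition.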

### Materials

Before the experiment proper, we conducted a paper-and-pencil pretest to ensure that the pictures would elicit the intended compounds. In this test 15 native Finnish students wrote down the names of 50 preselected target pictures that supposedly would elicit compound words. They also rated the pictures' visual complexity (from 1, visually simple, to 5, visually complex) and typicality (how well does the picture correspond to your own mental representation of this item/object; from 1, not at all, to 5, perfect match). For the experiment proper, only those pictures were included that elicited the intended compound at least 73.3% of the time (average 94.4%) and had an average typicality rating of at least 3. These criteria allowed us to select 26 target pictures that elicited noun-noun compounds. The lexical statistics of these compounds were extracted from an unpublished computerized newspaper corpus of 22.7 million word forms, assessed with the help of the WordMill database program of Laine and Virtanen (1999). The 26 target compounds are listed in Supplementary Material. **Table 1** lists the average and the range of the ratings and variables that were included in the analyses. The experimental items were mixed with 26 filler items. These filler items were pictures that were intended to elicit monomorphemic words (e.g., lasi "glass," vasara "hammer," kana "chicken").
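The two selection criteria can be sketched as a simple filter. With 15 raters, a name agreement of 73.3% corresponds to 11 of 15 responses; this mapping is an arithmetic inference rather than something stated in the text. The function name `select_pictures`, the record layout and all values in the example data are invented for illustration.

```python
def select_pictures(pretest, min_correct=11, min_typicality=3.0):
    """Return targets that meet both pretest criteria.

    min_correct: minimum number of raters (out of 15) who produced the
    intended compound; 11/15 is about 73.3%.
    min_typicality: minimum mean typicality rating on the 1-5 scale.
    """
    selected = []
    for item in pretest:
        mean_typicality = sum(item["typicality"]) / len(item["typicality"])
        if item["n_intended"] >= min_correct and mean_typicality >= min_typicality:
            selected.append(item["target"])
    return selected


# Invented pretest records (hypothetical counts and ratings).
pretest = [
    {"target": "tennismaila", "n_intended": 15, "typicality": [5, 4, 5]},
    {"target": "hypothetical_item_a", "n_intended": 10, "typicality": [4, 4, 5]},
    {"target": "hypothetical_item_b", "n_intended": 13, "typicality": [2, 3, 2]},
]
selected = select_pictures(pretest)
print(selected)
```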

### Procedure

After instruction, participants were exposed to 52 pictures in the picture window (see **Figure 1**), the first four pictures being filler items eliciting monomorphemic words. After that, the pictures eliciting monomorphemic filler words and those eliciting target compound words were presented in random order, but such that no more than 3 compound items appeared after each other. The task was to write down what each picture represented. The participants started the experiment by pointing the mouse cursor to the "start"-button on the screen and clicking the left mouse button. After that, a picture appeared in the left window of the screen. In the right window, the editor window, the participant had to write down as quickly and accurately as possible what the picture represented. After this the mouse cursor had to be pointed to the "next"-button on the screen and the left mouse button had to be clicked again. This made the following picture appear and the same procedure was repeated until all 52 pictures were responded to. The experiment lasted approximately 10–15 min.

### Dependent Variables and Predictors

We used two written word production measures as the dependent variables in our analyses. The first was writing onset time (WOT), the time between picture presentation and the first keystroke; the second was the inter-key interval (IKI), the time between successive keystrokes. The independent variables included in all statistical models are listed in **Table 1** and include typicality (Typical, 1–5), visual complexity (VisCom, 1–5), number of syllables (4–5), log lemma frequency (LLemFreq), log 1st and 2nd constituent frequency (LFreq1c, LFreqc2), log bigram frequency (LBiFreq; for WOT average bigram frequency and for IKI individual bigram frequencies were entered in the model), log frequency of the initial (LIni3) and final trigram (LFin3), whole word and 1st and 2nd constituent length (LenWW, Len1c, Len2c), 1st and 2nd syllable length (LenSyl1, LenSyl2) and typing proficiency (TypingSpeed). For the latter variable, average writing time of the compounds was used as an approximation of typing proficiency. For IKI, the type of linguistic transition between two letters (LingTrans) was additionally entered into the analyses. This factor included four levels: no boundary n (N = 164), syllabic boundary s (N = 44), morphosyllabic boundary m (N = 26), and geminate g (N = 27). Finally, we reanalyzed the data for IKI (IKI\_2) to assess the time course of effects within words by including constituent as a factor (Const, 1–2). For these analyses, the morphosyllabic condition had to be excluded, as the boundary for this condition is exactly between the first and second constituent; we also excluded the geminate condition in order to obtain a purer comparison between the syllable boundary and no-boundary conditions. For all the models, we included participants and items as random effects; other variables did not improve the random effect structure. Variables with a high mutual correlation were decorrelated before entering into the statistical models (e.g., Len2c and LFreqc2 were highly correlated, so we used residualized Len2c, that is, Len2c from which the influence of LFreqc2 was partialled out). The fixed effects of the dependent measures are listed in the Supplementary Material.

### TABLE 1 | Properties of the target compounds and the participants (Typing Speed).

<sup>a</sup>All values scaled to one million.

<sup>b</sup>Length in characters.

<sup>c</sup>Scaled to one thousand.

<sup>d</sup>Rating scale from 1 to 5.

presentation of the picture in the left window, the participant types the picture name in the right window. In this example the participant writes the word tennismaila "tennis racket." Going with the mouse cursor to "next" and pressing a mouse button will make the next picture appear.
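The decorrelation step (partialling LFreqc2 out of Len2c) can be illustrated with a simple residualization. The sketch assumes a one-predictor ordinary least squares fit, which is one common way to do this; the text does not specify the exact procedure used. The function name `residualize` and the example values are invented for illustration.

```python
from statistics import mean


def residualize(y, x):
    """Return the residuals of y after a simple linear regression on x.

    The residuals are the part of y that is linearly unrelated to x, so they
    can stand in for y as a predictor that is uncorrelated with x.
    """
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Invented values: log 2nd-constituent frequency and 2nd-constituent length.
lfreq_c2 = [0.5, 1.0, 1.5, 2.0, 2.5]
len_2c = [7, 6, 6, 5, 4]
len_2c_resid = residualize(len_2c, lfreq_c2)

# By construction the residuals have zero mean and zero covariance with x.
covariance_sum = sum(r * (x - mean(lfreq_c2)) for r, x in zip(len_2c_resid, lfreq_c2))
print(len_2c_resid, covariance_sum)
```

Entering the residualized variable instead of the raw one means that any shared variance is attributed to the frequency predictor, which should be kept in mind when interpreting a length effect.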

# Results

The data were analyzed using linear mixed effects models with participants and items as crossed random effects, while making use of the lme4 package (Bates et al., 2013) for R statistical software (R Core Team, 2013). Separate models were fitted for both dependent measures. The measures were log-transformed in order to normalize their distributions. Trials in which the target word was misspelled, initially mistyped or not the intended compound were excluded before analyses (16% of trials). Values that were 3 SDs smaller or larger than the grand mean were excluded (1.5% of the trials for WOT and 1% of the trials for IKI). No further data trimming was done before analyses. We report models with the effects that retained statistical significance in the stepwise backward elimination procedure. More precisely, we first included all the predictors and subsequently removed the least predictive predictor in each round until we ended up with a model with only significant predictors, |t|> 1.96. We also made sure by model comparison that each predictor significantly improved the explanatory power of the model. The model specifications are presented in detail in Supplementary Material.
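The preprocessing described above (exclusion of values more than 3 SDs from the grand mean, and log transformation) can be sketched as follows. The order of the two steps is an assumption: here trimming is applied to the raw millisecond values and the log is taken afterwards, whereas the text does not state the scale on which trimming was done. The function name `trim_and_log` and the example values are invented, and the mixed-model fitting itself (done with lme4 in R) is not reproduced.

```python
import math
from statistics import mean, stdev


def trim_and_log(values_ms, n_sd=3.0):
    """Drop values more than n_sd SDs from the grand mean, then log-transform.

    Assumption: trimming is done on the raw scale, before the log transform.
    """
    grand_mean = mean(values_ms)
    sd = stdev(values_ms)
    kept = [v for v in values_ms if abs(v - grand_mean) <= n_sd * sd]
    return [math.log(v) for v in kept]


# Invented onset times in ms: twenty ordinary trials and one extreme trial.
raw_wot = [1400] * 10 + [1600] * 10 + [9000]
log_wot = trim_and_log(raw_wot)
print(len(raw_wot), len(log_wot))
```

Note that with very few observations no value can lie 3 SDs from the mean, so a criterion of this kind only has an effect when it is applied to a reasonably large set of trials.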

# WOT

There was a significant effect for LLemFreq, |t| > 2. The more frequent the word, the more quickly participants started to type. In addition, the effect for Typical was significant, |t| > 2. The more clearly a picture corresponded to participants' own mental representation of a given object, the shorter the WOT. Significant effects were also found for TypingSpeed and LenSyl1, both |t|s > 2. Faster typists initiated typing earlier than slower ones and longer first syllables elicited longer WOTs than short first syllables. Other variables did not make significant contributions to the model. For instance, neither the effects of LFreq1c and LFreqc2 (|t|s < 1.3, when entered alone) nor any interaction came close to significance<sup>2</sup>.

# IKI

There was a clear effect for LingTrans, with different IKIs for all four types of transitions, all ts > 2. The time between keystrokes was smallest in the case of geminates; it was significantly longer when there was no boundary, still longer at the syllable boundary and longest when there was a morphosyllabic boundary (g = 155 ms; n = 217 ms; s = 274 ms; m = 377 ms). The two other variables that affected IKIs were LBiFreq, t > 2, and LFin3, t > 2, with words including more frequent bigrams and more frequent final trigrams generating shorter IKIs than words with less frequent ones. The best model though included interactions between LingTrans and LBiFreq, with interactions between the following levels: m × s, m × g, n × s, and n × g. The interactions reflected that the effect for LBiFreq was larger for IKIs at morphemic boundaries and no boundaries than at pure syllable boundaries or for geminates. Separate analyses revealed that the effect for LBiFreq was significant for all transitions apart from gemination.

# IKI\_2

In order to assess the time course of effects within words, we reanalyzed the IKI-data by including constituent (Const, 1 or 2) as a factor, but for the no boundary and syllable boundary conditions only. For this measure there were clear main effects for LingTrans, LBiFreq, LFin3, and Const, all ts > 2. For the first three variables the effects were the same as in the initial IKI-analysis. The effect of Const indicated that IKIs were shorter in the second constituent than in the first constituent. The best model though included interactions between LingTrans and LBiFreq and between Const and LBiFreq, both ts > 2. The first interaction indicated again that the effect of LBiFreq was larger for IKIs at intrasyllabic positions than at syllable boundaries. The second interaction indicated that the effect of LBiFreq was larger for IKIs during second constituent writing than during first constituent writing. We further explored the latter interaction by separately analyzing the first and second constituent of the syllable boundary and the no boundary condition. These analyses showed that LBiFreq did not affect first constituent IKIs at syllable boundaries, t < 1, but had a significant effect on second constituent IKIs at syllable boundaries, t > 2. For the no-boundary IKIs the LBiFreq effect was significant for both constituents, both ts > 2, although it was larger for the second constituent.

# Discussion

The current study set out to investigate whether Finnish compounds are retrieved holistically or via constituents, while at the same time it investigated what linguistic planning takes place before and during motor execution when typewriting these compounds. To assess linguistic planning before writing, we used WOT as the dependent measure; for processes during writing, we opted for the IKI between the typing of two subsequent letters.

It was found that whole-word frequency rather than constituent frequency was a solid predictor for WOT, indicating that initial retrieval is holistic in nature. Moreover, it was found that picture typicality, typing speed and the length of the first syllable had an impact on WOT. The picture typicality effect indicates that less prototypical pictures require more processing resources to retrieve the correct semantic concept. The typing speed effect indicates that more skillful typists manage to activate their motor program more quickly than less skillful ones. Perhaps it also reflects that more skillful typists are faster in placing their right hand back to the keyboard keys after having clicked the mouse to start a new trial.

With respect to the first syllable length effect, it can be argued that if only the first phoneme had been prepared before writing, the length of the 1st syllable should not have mattered. Given that longer first syllables led to longer WOTs, it has to be concluded that the first syllable is fully prepared before writing is initiated, including the retrieval and activation of the motor program for all first syllable graphemes (see Weingarten et al., 2004, for a similar argumentation).

<sup>2</sup> Initially, the compounds were selected in such a way that there were 2 factorial manipulations included. The first manipulation concerned whole-word frequency, with 10 high-frequency compounds (on average 9 per million) and 10 low-frequency compounds (on average 1 per million). The second manipulation concerned 1st constituent frequency with 10 compounds having a high-frequency 1st constituent (79 per million) and 10 having a low-frequency one (on average 6 per million). For both manipulations other factors (frequencies, lengths) were matched. Similar to the regression analyses, the ANOVAs showed that the effect of whole-word frequency was highly significant (hf: 1406 ms vs. lf: 1723 ms, both ps < 0.05), whereas the effect of 1st constituent frequency did not even approach significance (hf: 1588 ms vs. lf: 1600 ms, both ts < 1).

The IKI results indicated that linguistic planning is not fully ready before writing, as linguistic boundaries clearly caused a delay during writing. More specifically, IKIs were longer for letter sequences around syllable and morphosyllabic boundaries than for intrasyllabic sequences. The IKIs were shortest for geminates, whereas morphosyllabic boundaries generated the longest IKIs. It thus seems that linguistic planning cascades into the actual motor execution phase and linguistic units need to be retrieved or reactivated whilst writing. Interestingly, bigram and trigram frequency also affected IKIs, even more so in the second part of the compound word than in the first part. Higher bigram frequencies led to shorter IKIs for intrasyllabic, syllabic and morphosyllabic letter sequences, but the bigram effect did not appear at syllable boundaries in the first constituent and was smaller for intrasyllabic sequences in the first than in the second constituent. Moreover, whereas the frequency of the initial trigram, always appearing in the first constituent, did not affect IKIs, higher frequencies of the final trigram, always located in the second constituent, clearly led to shorter IKIs.

# Retrieval of the Orthographic Representation

The picture-word written production task requires object identification and retrieving the semantic concept, after which an orthographic representation from the orthographic long-term memory store (O-LTM) can be retrieved (see Purcell et al., 2011). This retrieval process can take different shapes when noun-noun compound words are involved, as these compounds contain two words which have their own orthographic representations. Typically, the constituent words are more frequent than the compound word and are therefore likely candidates to be activated before the whole compound word. Several compound word studies in spoken language production suggest that constituents are involved at an early stage in word retrieval. For example, Bien et al. (2005) showed by a position-response association task that response latencies were predicted by constituent frequencies rather than whole-word frequency. Several picture-word interference studies showed priming of constituents (e.g., jas, "coat") by earlier presented compound words (e.g., jaszak, "coat pocket"), but not by orthographic controls (e.g., jasmijn, "jasmine"; Zwitserlood et al., 2000, 2002; Koester and Schiller, 2008). In addition, in reading comprehension constituent frequency effects are omnipresent in masked priming (e.g., Duñabeitia et al., 2009), visual lexical decision (e.g., Fiorentino and Poeppel, 2007) and eye movement studies (e.g., Pollatsek et al., 2000; White et al., 2008). All these studies thus show that constituents are involved in initial access/retrieval of compounds.

However, two studies in speech production on compounds using the picture naming task did not find any constituent effects (Janssen et al., 2008, 2014). On the contrary, these studies found that production latencies were predicted by whole-word frequency in both Mandarin Chinese and English. Using an equivalent to this task in written word production, we find exactly the same results as Janssen and associates. Thus, we also conclude that initial retrieval of compounds in production is holistic in nature. However, it may well be the case that this retrieval procedure is not written in stone. Janssen et al. argue that for all studies where constituent effects are found the compounds were visually presented. In the position-response association task of Bien et al. (2005), participants were exposed to compounds several times in the training phase, during which they learned to link a specific compound with a specific position. Similarly, in the picture interference paradigm compounds are first visually presented before they are produced (Zwitserlood et al., 2000, 2002; Koester and Schiller, 2008). In these paradigms one does not actually know whether the constituent effects are solely on the production side, or whether the initial visual presentations or perhaps the earlier production of the compound has triggered decompositional access and retrieval. In that sense one may say that a basic picture naming paradigm in which the compound words are not explicitly presented beforehand is a purer task to assess how compounds are produced. It seems that under these context-free circumstances holistic retrieval is the most likely procedure, in both spoken and written word production. However, we do agree with Janssen et al.'s (2014) conclusion that their results, together with the results of other compound studies where constituent effects are found, suggest a dual route system.
That is, we would also argue that both processing routes are at work during compound retrieval, whereby under context-free circumstances (picture naming) the whole-word route is the faster one to deliver. However, as soon as constituents receive some prior stimulation (picture interference, cueing paradigms) the decomposition route is boosted and will be involved in initial compound retrieval. We therefore predict that in written word production with, for instance, a (compound) word-copying paradigm (presenting the compound words instead of pictures in the elicitation window), onset latencies will be predicted by constituent frequencies as well. We leave it to further research to test this hypothesis.

# Cascaded Processing During Written Word Production

As in previous studies, we also found that during motor execution intervals between keystrokes are neither equal nor random, but dictated by a number of linguistic properties within the compound. The effects of gemination mimic the results of Weingarten et al. (2004), reflecting that it is fairly easy to strike the same button twice on a keyboard once the typist has sorted out that the word contains a double vowel or consonant at a certain position. However, the inflated IKIs at syllabic and morphosyllabic boundaries, as well as the impact of bigrams and trigrams (most prominent in the second part of the compound), cannot be ascribed to the keyboard configuration.

Kandel et al. (2011) proposed that the model of Van Galen (1991) should be extended with a syllable level, as there is ample evidence that the syllable is a functional processing unit during written word production, at least in languages with clear syllable structure (see **Figure 2**). The syllable effect (longer intervals for letter sequences at syllable boundaries than intrasyllabic letter sequences) that we found in Finnish adds to this body of evidence. Kandel et al. (2011) describe how a bisyllabic word like VILAIN is produced in handwriting. After activating the linguistic modules and the orthographic representation of the whole word, the syllable module is activated which informs the writing system about the syllabic structure of the word (VI + LAIN). The first syllable (VI) is then fed forward via the letter module to the motor module for production, while at the same time the next syllable (LAIN) is "activated on-line" (p. 1320). We presume that this implies that both syllables are fed forward to the letter level and that only the first syllable (VI) is then handed over to the motor modules. In a subsequent phase, while the first syllable is being produced, the next syllable (LAIN) is handed over to the motor modules. It also has to be assumed that handing over of the second syllable to the motor modules is not fully completed during the production of the first syllable, but that it spills over to some extent to the syllable boundary, hence the inflated inter-key intervals at this boundary.

The next question is at what level the bigram frequencies come into play. Kandel et al. (2011) suggest that letter co-occurrence information is stored at the letter level. This would mean that high-frequency bigrams reach activation threshold at this level, and are handed over to the motor modules, more quickly than low-frequency bigrams, probably by virtue of stronger activation links between the two letters in the bigram. For intrasyllabic bigrams this procedure seems very plausible, but one may ask how the frequency of intersyllabic bigrams can modulate IKIs if syllables are handed over to the motor modules one at a time. We think that it is likely that these effects also partly reside in the motor modules; that is, it is likely that procedural (finger) muscle memory is involved here, with more automatized behavior for frequently co-occurring letter sequences than for more rarely co-occurring sequences. To put it simply, the fingers are more used to typing sequences of letters that frequently co-occur, and typing such sequences is more automatized than typing infrequently occurring sequences. Higher bigram frequencies glue linguistic units like syllables together, even though their motoric encoding is sequential, by more quickly handing over a syllable to the motor modules. However, it should be noted that this only happens in the second constituent and bigram frequency effects are still larger for intrasyllabic than intersyllabic bigrams. Thus, it has to be concluded that procedural motor memory does not completely wipe out linguistically motivated processing (in this case syllable-based processing).
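To make the notion of letter co-occurrence frequency concrete, the following is a minimal sketch of how a log bigram frequency could be derived from a frequency-weighted word list. It assumes that each word's token frequency is added to every adjacent letter pair it contains and that the count is log-transformed after adding one; the three-word lexicon and its frequencies are invented, and the study's own corpus and transformation may differ.

```python
import math
from collections import Counter


def bigram_counts(word_freqs):
    """Sum word token frequencies over every adjacent letter pair."""
    counts = Counter()
    for word, freq in word_freqs.items():
        for i in range(len(word) - 1):
            counts[word[i:i + 2]] += freq
    return counts


def log_bigram_frequency(bigram, counts):
    """Log10 of the bigram count; adding 1 keeps unseen bigrams at 0 (a choice made here)."""
    return math.log10(counts[bigram] + 1)


# Hypothetical toy lexicon with invented frequencies.
toy_lexicon = {"lumi": 50, "ukko": 20, "lumiukko": 5}
counts = bigram_counts(toy_lexicon)

within_first = log_bigram_frequency("lu", counts)   # occurs in "lumi" and "lumiukko"
across_parts = log_bigram_frequency("iu", counts)   # occurs only across the boundary
```

In this toy lexicon the bigram that spans the constituent boundary is rarer than the bigrams inside either constituent, which illustrates why boundary position and bigram frequency need to be entered as separate predictors.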

One may also wonder why bigram and trigram effects are stronger toward the end of the word than in the beginning. As noted above, at syllable boundaries bigram frequency does not affect IKIs in the first constituent but only in the second. Moreover, the effect for intrasyllabic bigrams is also stronger in the second than in the first constituent, and even trigram frequency only exerts an effect in the second constituent. We think that this reflects that the motor program needs some warming-up during the typing of a long compound word. That is, initially typing is more linguistically motivated (hence the lack of a bigram frequency effect at the first syllable boundary), but upon arrival at the second constituent, low-level automatisms start guiding the processing.

A subsequent question that needs to be asked is to what extent morphological encoding takes place during written compound word production. The longer IKIs at the morphosyllabic boundaries in comparison to the pure syllable boundaries suggest that there is at least some morphological influence during writing. This is confirmed by similar findings of Weingarten et al. (2004) in German and by Kandel et al. (2012) in French. However, it is unclear whether the boundary effect implies (late) activation of the first constituent at the constituent boundary, whether it indicates that the second constituent is retrieved at the boundary, or both. Kandel et al. (2012), at least, suggest that their handwriting model should be further extended with a morphemic level located between the word and the syllable level. However, if syllable effects are observed before morpheme effects, as observed in our study as well as by Kandel et al. (2012), one may wonder whether the morphemic level should be above the syllable level. In addition, when the morpheme boundary does not coincide with the syllable boundary, as in the Kandel et al. (2012) study (e.g., pruneau, syllabified as pru.neau, with morphological structure prun/eau), the question is how syllable structure is going to be recovered after it is first violated by dividing the word into morphemes. In sum, one can say that morphological structure has an impact on on-line written word production (see also Sahel et al., 2008), but the current empirical evidence does not allow conclusions to be drawn about how morphology should be incorporated in a model of written word production.

Finally, two additional points have to be made. First, it needs to be noted that the phonological level is not included in current models of written word production. Yet, it is undoubtedly the case that phonological representations get activated during retrieval and that phonological rehearsal takes place during the writing process. Second, even though we have argued for a cascaded processing architecture, it is likely that the processing system is to some extent interactive as well. For one thing, since morphemes can be subsyllabic, syllabic, and multisyllabic, it is likely that we need an interactive model to capture the reality of the processes going on during writing. We leave it to further studies to address these issues in more detail.

### Concluding Remarks

The current study showed that typewriting is an intricate interplay between central linguistic processing and peripheral motor processes. Compound words seem to be retrieved as whole orthographic units and the first syllable is fully prepared before writing commences. However, linguistic planning is not fully ready before writing, but cascades into the motor execution phase where additional planning is needed. In terms of the model by Kandel et al. (2011), one could say that graphemes beyond the first syllable are handed over to the motor system only during or after the production of the first syllable. In addition, we showed that letter co-occurrence also plays a role in written word production, suggesting the involvement of automatized routines of motor memory.

# Author Contributions

RB, JH, and PN were involved in conducting and analysing the experiment as well as in designing and writing the study; FT and SS were involved in designing and writing the study. SS also developed the program Scriptlog by which the study was conducted. All authors approved the final version of the current submission. We thank the two reviewers for their helpful suggestions.

# Acknowledgments

This study was financially supported by the Norwegian Research Council and the Reading Centre at the University of Stavanger.

# Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnhum.2015.00207/abstract


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Bertram, Tønnessen, Strömqvist, Hyönä and Niemi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evidence for morphological composition in compound words using MEG

### Teon L. Brooks <sup>1</sup> \* and Daniela Cid de Garcia<sup>2</sup> \*

<sup>1</sup> Department of Psychology, New York University, New York, NY, USA, <sup>2</sup> Department of Anglo-Germanic Languages, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil

Psycholinguistic and electrophysiological studies of lexical processing show convergent evidence for morpheme-based lexical access for morphologically complex words that involves early decomposition into their constituent morphemes followed by some combinatorial operation. Considering that both semantically transparent (e.g., sailboat) and semantically opaque (e.g., bootleg) compounds undergo morphological decomposition during the earlier stages of lexical processing, subsequent combinatorial operations should account for the difference in the contribution of the constituent morphemes to the meaning of these different word types. In this study we use magnetoencephalography (MEG) to pinpoint the neural bases of this combinatorial stage in English compound word recognition. MEG data were acquired while participants performed a word naming task in which three word types, transparent compounds (e.g., roadside), opaque compounds (e.g., butterfly), and morphologically simple words (e.g., brothel) were contrasted in a partial-repetition priming paradigm where the word of interest was primed by one of its constituent morphemes. Analysis of onset latency revealed shorter latencies to name compound words than simplex words when primed, further supporting a stage of morphological decomposition in lexical access. An analysis of the associated MEG activity uncovered a region of interest implicated in morphological composition, the Left Anterior Temporal Lobe (LATL). Only transparent compounds showed increased activity in this area from 250 to 470 ms. Previous studies using sentences and phrases have highlighted the role of LATL in performing computations for basic combinatorial operations. Results are in tune with decomposition models for morpheme accessibility early in processing and suggest that semantics play a role in combining the meanings of morphemes when their composition is transparent to the overall word meaning.

Keywords: compounds, MEG, left anterior temporal lobe (LATL), word naming, morphology, semantic transparency, morphological decomposition, morphological composition

### Edited by:

Alina Leminen, Aarhus University, Denmark

### Reviewed by:

Joanna Morris, Hampshire College, USA; Guglielmo Lucchese, Freie Universität Berlin, Germany

### \*Correspondence:

Teon L. Brooks, Department of Psychology, New York University, 6 Washington Place, 2nd Floor, New York, NY 10003, USA, teon@nyu.edu; Daniela Cid de Garcia, Department of Anglo-Germanic Languages, Federal University of Rio de Janeiro, Av. Horácio Macedo, 2151, Sala D204, Cidade Universitária, CEP 21941-917 Rio de Janeiro, Brazil, cid.daniela@gmail.com

> Received: 20 September 2014 Accepted: 02 April 2015 Published: 28 April 2015

### Citation:

Brooks TL and Cid de Garcia D (2015) Evidence for morphological composition in compound words using MEG. Front. Hum. Neurosci. 9:215. doi: 10.3389/fnhum.2015.00215

# 1. Introduction

Some words are simple and some words are not. This, at first, sounds like a very trivial tautology, but the controversy over whether multi-morphemic words are simply stored in whole word form (Butterworth, 1983; Giraudo and Grainger, 2001) or always constructed from their morphemic parts (Taft, 2004) has been entertaining, provocative, and contentious in the field of lexical processing for the last 40 years. A comprehensive model of how words are both stored and retrieved requires an understanding of how form and meaning are connected, and how this connection unfolds in time in natural speech.

The potential contrast between whole-word storage and morpheme storage was first discussed in the classic affix-stripping model (Taft and Forster, 1975), which proposed that lexical access involves access to the stem of morphologically complex words. This study demonstrated that pseudo-complex words with real stems (e.g., de-juvenate) took longer to reject in a lexical decision task (and were often selected incorrectly as words) than pseudo-complex words with real prefixes and nonexistent stems (e.g., de-pertoire). This was taken as evidence that the morphemes were accessed prior to lexical access and that they contribute to the retrieval of the lexical item in memory. With various priming paradigms, evidence has accumulated in favor of morpheme accessibility during lexical access (Marslen-Wilson et al., 1994; Rastle and Davis, 2003; Taft, 2004). This has given rise to processing models where morphological decomposition is an automatic and necessary stage in processing for complex words (Rastle et al., 2004). Recent studies (Fiorentino et al., 2014; Semenza and Luzzatti, 2014) have looked at the stages following decomposition to see how morpheme meaning is integrated into the meaning of the complex word. Results from electrophysiology (Fiorentino et al., 2014) revealed a greater negativity for lexicalized compounds (e.g., teacup) and novel compounds (e.g., tombnote) compared to mono-morphemic words in a time window of 275–400 ms, positing a stage where morpheme meanings are combined in English compounds. These psychological models make clear predictions as to the stages and time-course of lexical access, but currently, there is a lack of evidence for the anchoring of these stages to particular areas of the brain. This study seeks to identify an area responsible for the composition of morpheme meanings.
Research from the picture naming literature (Dohmes et al., 2004) suggests that there should be greater activation at this stage in processing for semantically transparent complex words since they exhibit greater conceptual activation and lemma competition in addition to the effect of morphological overlap. Therefore, this area should be sensitive only to the composition within complex words whose morpheme meanings have a semantically transparent relationship to the overall meaning, as compared to complex words whose morphemes do not share a semantic relationship with the overall meaning (opaque words).

One way to look at the lexical processing of complex words is to see if activating morphological structure can modulate the accessibility of a complex word. Some cross-modal priming studies (Marslen-Wilson et al., 1994) have shown that priming in lexical decision between words that shared a stem only occurred when the prime and target had related meanings (e.g., departure primed depart but department did not) while other studies (Zwitserlood, 1994) using partial-repetition priming found that priming did not depend on a semantic relationship between the prime and target. However, studies using masked priming, a subliminal priming paradigm where a prime word is preceded by a forward mask and followed by the target word (Forster and Davis, 1984), found that when manipulating semantic transparency, facilitation effects occurred for complex words regardless of whether the prime and target share the same morphological root (Longtin et al., 2003; Rastle et al., 2004; Fiorentino and Poeppel, 2007; McCormick et al., 2008). These effects did not appear for the morphologically simple words (e.g., brothel). Faster lexical decision times were found for complex words that can be segmented into existing morphemes, which means that masked prime/unmasked target pairs with no semantic relationship like corner-corn and bootleg-boot speeded recognition of the target words with magnitudes indistinguishable from pairs with a semantic relationship like cleaner-clean and teacup-tea.

Since it is generally agreed that morphological decomposition is performed for every complex word that can be exhaustively parsed into existing morphemes, research on visual word recognition should shift its focus from decomposition to the subsequent mechanisms engaged to activate the actual meaning of a complex target word. Meunier and Longtin (2007) suggested that word activation comes into play in stages, which include at least one early stage for morphological decomposition and a later stage for semantic integration of the morphological pieces. Fiorentino et al. (2014) presented evidence for a morpheme-based route for word activation that includes decomposition into morphological constituents and combinatorial processes operating on these representations. Since previous studies have shown that early decomposition triggered by morphological structure happens automatically for transparent and opaque words, the difference between these two word types may manifest itself during a later stage of combinatorial operations.

Another way to look at lexical processing of complex words is to look at how form is mapped onto meaning. This is critical in processing morphologically complex words in order to disentangle how the brain perceives transparent ones from how it perceives opaque ones. This can be investigated by looking at how morpheme meanings are composed in the brain. There are models for a general binding mechanism in sentence building (Friederici et al., 2000) and in basic composition of noun phrases (Bemis and Pylkkänen, 2011) that implicate the left anterior temporal lobe (LATL) in the composition of words into phrases. In a minimum composition paradigm, Bemis and Pylkkänen (2011) found that two composable items in an adjective-noun phrase (e.g., red boat) evoked more activation in the LATL, at roughly 225 ms, than two non-composable items (e.g., xkq boat, a random letter string and word). This was taken as evidence that the most basic of combinatorial processing is supported by the LATL. Within complex words, there is a special subclass of words that have a parallel structure to noun phrases known as compound words. Compound words have the unique property of being composed of only free morphemes (stand-alone words). Compound words also vary along the dimension of semantic transparency, the degree to which the combination of morpheme meanings corresponds to the overall word meaning. This means we can vary the contribution of the morphemes to the composition of the meaning. These properties make compound words a great candidate for investigating morphological composition within complex words since they can provide an analogous structure to work done at the phrase level. These parallels give rise to the LATL as a candidate region for composition within a word and this provides an interesting basis for studying effects of intralexical semantic composition as an analog to composition at the phrase level.

Thus, semantically transparent compound words (e.g., mailbox) should elicit greater activity in this region than simple words since their meanings are derived from the composition of their morphemic parts, whereas semantically opaque compounds (e.g., bootleg) should not elicit greater activity since there is no relationship between their parts and meanings. In sum, a model of complex word recognition would require at least these two stages of processing: parsing into basic units (decomposition), and the composition of these word forms into a complex meaning. To unpack these stages, we propose using two types of priming paradigms: partial-repetition priming (e.g., ROAD-roadside), similar to the paradigms used in masked priming studies, which will be used to investigate the decomposition effects in compounds, and full-repetition priming (e.g., ROADSIDE-roadside), which will be used to investigate the composition effects of their morphemes. The primes of the repetition priming condition were used to evaluate the composition effect in the absence of a behavioral response. In this respect, the method of analysis is analogous to that adopted by Zweig and Pylkkänen (2009), in which the authors directly compare complex (derived) words, thus aiming to find decomposition effects that are not dependent on priming. This study uses a word naming production task to investigate these stages involved in lexical processing since it provides comparable effects to lexical decision tasks (Neely, 1991) and does not require filler trials. This task was done while brain activity was recorded using MEG to investigate whether there is an area within the left temporal lobe that is responsible for morphological composition. This study contributes to the work of characterizing the neural bases of lexical processing of complex words by providing evidence for composition within compound words, while linking it to their neural correlates.
Given the prior literature, we expect to find evidence of decomposition for compound words but not for simplex words. This would be a finding that fits in with the visual word recognition literature, specifically the masked priming literature, where there are facilitatory effects when priming morphologically complex words but not morphologically simple words. However, we do not expect to find this overall benefit of morphological complexity in composition. Since composition of meaning is semantically governed, we expect to find composition effects on brain activity only for transparent compounds.

# 2. Materials and Methods

# 2.1. Participants

Eighteen right-handed native speakers of English, ranging in age from 18 to 30 years and with normal or corrected vision, gave informed consent and participated in this experiment. The study was approved by the University Committee on Activities Involving Human Subjects (UCAIHS) of New York University. The MEG data from three participants were excluded due to the large number of trial rejections caused by noise interference (>25%). Details for rejection are described in the procedure.

# 2.2. Material

All stimuli consisted of English bi-morphemic compounds (e.g., teacup) and morphologically simple nouns (e.g., spinach), matched for length and surface frequency. We manipulated semantic transparency, including fully semantically transparent words (e.g., teacup), in which both constituent morphemes have a semantic relationship to the meaning of the whole compound, and fully semantically opaque words (e.g., hogwash), in which neither of the constituent morphemes has a semantic relationship to the compound meaning.

A total of 311 English compounds were compiled from previous studies (Juhasz et al., 2003; Fiorentino and Poeppel, 2007; Fiorentino and Fund-Reznicek, 2009; Drieghe et al., 2010) and categorized in terms of semantic transparency by means of a semantic relatedness task conducted using the Amazon Mechanical Turk tool. In this task, 20 participants were asked to judge, on a 1–7 scale, how much each constituent of the compounds related to the whole word. On the scale, 1 corresponded to unrelated and 7 corresponded to very related. Each participant was randomly presented with one of the constituents of each compound. Compounds were classified as semantically opaque (henceforth opaque) if the sum of the scores of their constituents was within the interval 2–6, and as semantically transparent (henceforth transparent) if the sum was within the interval 10–14. For example, the opaque compound deadline received a summed rating of 3.76, with dead contributing a transparency rating of 1.44 and line contributing a rating of 2.32. Similarly, the transparent compound dollhouse received a summed rating of 11.79, with doll contributing a transparency rating of 6.47 and house contributing a rating of 5.32. Sixty compounds were selected for each word type. This method of semantic transparency norming was consistent with the methods used in the prior studies mentioned above. The morphologically simple words (henceforth simplex; e.g., spinach) were pooled from Rastle et al. (2004) and the English Lexicon Project, selecting the words coded as having only one morpheme (Balota et al., 2007). The simplex words (e.g., brothel) were selected to have a non-morphological form relationship to their primes (e.g., broth). Also, these words were constrained and selected such that the simple word could not be broken into smaller parts without creating illegal morphemes.
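The classification rule can be written out as a short sketch. The thresholds and the two example compounds come from the text; the function name and the "unclassified" label for sums that fall between the two intervals are illustrative assumptions, since the text does not say how such compounds were labeled.

```python
def classify_transparency(first_rating, second_rating):
    """Classify a compound from the mean 1-7 relatedness ratings of its two constituents."""
    total = first_rating + second_rating
    if 2 <= total <= 6:
        return "opaque"
    if 10 <= total <= 14:
        return "transparent"
    # Sums strictly between 6 and 10 fit neither interval; this label is hypothetical.
    return "unclassified"


# Mean constituent ratings reported in the text.
deadline = classify_transparency(1.44, 2.32)   # summed rating 3.76
dollhouse = classify_transparency(6.47, 5.32)  # summed rating 11.79
```

Because each constituent is rated on a 1–7 scale, the summed score ranges from 2 to 14, so the two intervals cover the bottom and top thirds of the possible range.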

# 2.3. Design

The three different word types were contrasted in two priming conditions: full repetition and partial (constituent) repetition (see **Table 1**). For the repetition priming condition, the same compound was used as prime and target (e.g., TEACUP-teacup). For the partial-repetition priming, we used the first constituent of the compound as the prime (e.g., TEA-teacup). For the simplex condition, the non-morphological related form was used as the constituent in the partial-repetition priming condition (e.g., SPIN-spinach). These two priming conditions were paired with control conditions in which the prime had no semantic relationship to the target (e.g., DOORBELL-teacup; DOOR-teacup).

### TABLE 1 | Design matrix.

### 2.4. Procedure

All participants read all the items in all conditions (720 total), which were divided into three lists of 240 words and randomized within each list. The order of presentation of the lists was counterbalanced between subjects. The experimental task was word naming: subjects were presented with word pairs, and they were asked to read out loud the second word of each pair. Stimuli were presented in 30-point white Courier font on a gray background using PsychToolbox (Brainard, 1997). Each trial began with the presentation of a fixation cross, followed by the prime, then the target. Each of these visual stimuli was presented for 300 ms, followed by a 300 ms blank (see **Figure 1**). We recorded the onset latency to speech and the utterance from each subject for behavioral analysis.
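The trial timing described above can be laid out as a simple schedule. This is a minimal sketch that assumes the three events follow one another with no additional jitter; the function and constant names are illustrative and do not come from the original experiment scripts.

```python
EVENT_MS = 300   # duration of each visual event, as stated in the text
BLANK_MS = 300   # duration of the blank that follows each event


def trial_timeline(events=("fixation", "prime", "target")):
    """Return (label, onset_ms, offset_ms) for each event in one trial."""
    timeline = []
    t = 0
    for label in events:
        timeline.append((label, t, t + EVENT_MS))
        t += EVENT_MS + BLANK_MS
    return timeline


for label, onset, offset in trial_timeline():
    print(f"{label:8s} {onset:5d}-{offset:5d} ms")
```

Under this reading, the target appears 600 ms after prime onset.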

Before the experiment, the head shape of each participant was digitized using the Polhemus Fastscan system, along with five head position indicator points, which were used to co-register the head position with respect to the MEG sensors during acquisition. Electromagnets attached to these points were localized once the participants were lying within the MEG sensor array, allowing for co-registration of the head and sensor coordinate systems. The head shape was used during the analysis to co-register the head to the participants' MRIs. For half of the participants, MRIs were not provided; therefore, we scaled the common reference brain that is provided in FreeSurfer to fit the size of these participants' heads.

During the experiment, participants remained lying in a magnetically shielded room as their brain response was monitored by the MEG gradiometers. The experimental items were projected onto a screen so the participant could read and perform the task. The MEG data were collected using an axial whole-head gradiometer system with 157 channels and three reference channels (Kanazawa Institute of Technology, Nonoichi, Japan). The recording was conducted in direct current mode, that is, without a high-pass filter, and with a 300 Hz low-pass filter and a 60 Hz notch filter.

# 2.5. Analysis

We examined onset latency, the reaction time to naming the word, to evaluate the effects of morphological decomposition, following Fiorentino and Poeppel (2007). Since reaction time is sensitive to lexical properties of words (Fiorentino and Poeppel, 2007), compound words should be processed faster than simplex words when primed, due to residual activation of previously activated morphemes. A non-decompositional account predicts no differences due to word structure, provided the words are correctly matched for relevant whole-word properties. Thus, onset latency can be used to determine whether or not there is a decomposition effect. The behavioral data were analyzed using traditional analysis of variance for the Word Type by Partial-Repetition priming interaction model. Partial-repetition priming in lexical decision tasks has been used to demonstrate the accessibility of morphemes within complex words (Rastle et al., 2004). Similar behavioral effects have also been found using word naming (see Neely, 1991 for a comparative review of lexical decision and word naming). Therefore, evidence of decomposition effects can be observed in onset latency, the reaction time to speak. Prior research led to the prediction that priming should shorten onset latency more for the compounds than for their simplex counterparts, since segmentation into morphemes leads to faster access to the complex word.
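The predicted Word Type by Priming interaction can be expressed as a difference between priming effects. The sketch below is illustrative only: the function names are assumptions, and the latencies are hypothetical values chosen to show the arithmetic, not data from the study.

```python
def priming_effect(control_ms, primed_ms):
    """Facilitation in ms: positive when the primed target is named faster."""
    return control_ms - primed_ms


def interaction_contrast(compound, simplex):
    """Difference in priming between two word types.

    Each argument is a (control_ms, primed_ms) pair of mean onset latencies.
    """
    return priming_effect(*compound) - priming_effect(*simplex)


# Hypothetical mean onset latencies, for illustration only.
compound_means = (560.0, 520.0)   # e.g., DOOR-teacup vs. TEA-teacup
simplex_means = (560.0, 550.0)    # e.g., control prime vs. SPIN-spinach
print(interaction_contrast(compound_means, simplex_means))  # 30.0
```

A decompositional account predicts a positive contrast; a non-decompositional account predicts a contrast near zero.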

After brain data acquisition, we applied the Continuously Adjusted Least-Squares Method (Adachi et al., 2001), a noise reduction procedure in the MEG160 software (Yokogawa Electric Corporation and Eagle Technology Corporation, Tokyo, Japan) that subtracts noise from the MEG gradiometers based on noise measurements at the reference channels positioned away from the head. The data were bandpass filtered between 1 and 40 Hz using an IIR filter. The recording of the whole experiment was segmented into epochs of interest, from 200 ms before to 600 ms after the visual display of the prime word. We rejected trials in which the maximal peak-to-peak amplitude exceeded the limit of 4000 fT, and we equalized the trial counts so that each condition and word type contributed an equal number of trials for proper comparison. The average percentage of all trials rejected across subjects was 1.9%, and per word type: 1.3% for opaque, 2.2% for simplex, 1.8% for transparent. Sensor channels were marked as bad and discarded for each subject if the channel's peak-to-peak rejection exceeded 10%.
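The rejection and equalization steps can be illustrated with a small, library-free sketch. It assumes that an epoch is a list of channels, each a list of samples in fT, and that equalization drops surplus trials at random; the text does not state how the trials to drop were chosen, and the function names are hypothetical.

```python
import random

PEAK_TO_PEAK_LIMIT_FT = 4000.0  # rejection threshold given in the text


def peak_to_peak(channel):
    """Range of one channel's samples within an epoch."""
    return max(channel) - min(channel)


def keep_epoch(epoch):
    """Keep an epoch only if no channel exceeds the peak-to-peak limit."""
    return max(peak_to_peak(ch) for ch in epoch) <= PEAK_TO_PEAK_LIMIT_FT


def equalize(epochs_by_condition, seed=0):
    """Assumption: randomly drop epochs so every condition keeps the same count."""
    rng = random.Random(seed)
    n = min(len(epochs) for epochs in epochs_by_condition.values())
    return {cond: rng.sample(epochs, n)
            for cond, epochs in epochs_by_condition.items()}
```

In practice these steps would be carried out with the MEG analysis software used for the study; the sketch only makes the logic of the two criteria explicit.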

A noise-covariance matrix was computed for each participant using an automated model selection procedure (Engemann and Gramfort, 2015) on a random selection of baseline epochs (120 epochs) from −200 ms to the onset of the presentation of the fixation cross. For participants with MRIs, cortical reconstructions were generated using FreeSurfer resulting in a source space of 5124 vertices (CorTechs Labs Inc., La Jolla, CA and MGH/HMS/MIT Athinoula A. Martinos Center for Biomedical Imaging, Charleston, MA). A boundary-element model (BEM) method was used to model activity at each vertex to calculate a forward solution. An inverse solution was generated using this forward model and noise-covariance matrix, and was computed with a fixed-orientation constraint requiring dipole sources to be normal to the cortical surface. The sensor data for each subject was then projected into their individual source space using a cortically-constrained minimum norm estimate (all analyses were conducted using MNE-Python: Gramfort et al., 2013, 2014) resulting in noise-normalized dynamic statistical parameter maps (dSPMs: Dale et al., 2000).

For this analysis, our design (**Table 2**) reduces to the simple comparison between compounds (e.g., TEACUP) and simplex words (e.g., SPINACH) of the same size that served as primes in the repetition condition (e.g., TEACUP-teacup) described above in the Design section. Since, for this analysis, we use neurophysiological data related to the silent reading of the words that served as primes, there is no behavioral data for these words. By these means we also avoid artifacts associated with voluntary movements that can compromise the analysis of the effects of interest to the study (Hansen et al., 2010).

We examined the neural activity localized in the entire left temporal lobe. This region was selected based on composition effects found with sentences (Friederici et al., 2000) or adjective-noun phrases (Bemis and Pylkkänen, 2011). To verify whether there was increased activity for compounds in this area, a t-test was performed on the residual activation of each compound word type (opaque, transparent) after removing the activation from the simplex control word, from 100 to 600 ms after stimulus onset. A p-value map of the brain was generated for the time series, and spatiotemporal clusters were identified as contiguous space-time points that had a p-value of less than 0.05 and a duration of at least 10 ms. The t-values were summed for those points within the cluster that met these criteria. Then, a non-parametric permutation test was performed, first by shuffling the word type labels, then by calculating the clusters formed by the new labels. A distribution generated from 10,000 permutations was used to calculate the significance level of the observed cluster. The corrected p-value was determined from the percentage of clusters that were larger than the original computed cluster (Maris and Oostenveld, 2007). These tests were computed using Eelbrain, a statistical analysis package for MEG data (https://pythonhosted.org/eelbrain/).
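The logic of the cluster-based permutation test can be sketched in plain Python. This is a simplified illustration rather than the Eelbrain implementation: it clusters over time only (no spatial adjacency), uses a fixed t threshold in place of a p-value map, considers only positive clusters, and implements label shuffling as a random sign flip of each subject's compound-minus-simplex difference. All function names and parameters are assumptions.

```python
import math
import random


def t_values(diffs):
    """One-sample t statistic at each time point; diffs is subjects x time."""
    n = len(diffs)
    out = []
    for samples in zip(*diffs):
        mean = sum(samples) / n
        var = sum((x - mean) ** 2 for x in samples) / (n - 1)
        # Degenerate zero-variance points are assigned t = 0 in this sketch.
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def cluster_sums(t, threshold, min_len):
    """Sum t over each contiguous supra-threshold run of at least min_len samples."""
    sums, run = [], []
    for value in list(t) + [float("-inf")]:   # sentinel closes a final run
        if value > threshold:
            run.append(value)
        else:
            if len(run) >= min_len:
                sums.append(sum(run))
            run = []
    return sums


def cluster_p_value(diffs, threshold, min_len, n_perm=1000, seed=0):
    """Return the largest observed cluster sum and its permutation p-value."""
    rng = random.Random(seed)
    observed = max(cluster_sums(t_values(diffs), threshold, min_len), default=0.0)
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            sign = rng.choice((-1.0, 1.0))    # swap condition labels for this subject
            flipped.append([sign * x for x in row])
        null_max = max(cluster_sums(t_values(flipped), threshold, min_len), default=0.0)
        if null_max >= observed:
            exceed += 1
    return observed, exceed / n_perm
```

Here `min_len` is expressed in samples, so the 10 ms duration criterion would need to be converted using the sampling rate of the data.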

# 3. Results

# 3.1. Morphological Decomposition

Behaviorally, we found a significant effect of partial-repetition priming [F(1, 17) = 25.91, p < 0.001], but most critically an interaction of word type by priming [F(2, 17) = 9.24, p < 0.001] (**Figure 2**). This effect shows that there is a greater facilitation in word naming for compound words than for morphologically simple words when primed. In the planned comparisons, reliable differences were found between opaque compounds and simplex words [F(1, 17) = 5.93, p < 0.03], and transparent compounds and simplex words [F(1, 17) = 14.46, p < 0.005] but not between transparent and opaque compounds [F(1, 17) = 2.84, p > 0.1]. These results show that even in word production, there is sensitivity to morphological structure above and beyond orthographic and phonological overlap, but this stage of processing is not sensitive to the meaning of the morphemes in relationship to the compound word, which is consistent with the prior literature on morphological decomposition (Rastle et al., 2004; McCormick et al., 2008).

# 3.2. Morphological Composition

Results reveal reliable effects of greater activation for transparent compounds when compared with their simplex controls within the temporal lobe. There were two significant clusters associated with this difference: the first cluster was localized to the anterior middle temporal gyrus from 250 to 470 ms (Σt = 4552.3, p < 0.05, **Figure 3**), and a second cluster of activity was localized to the posterior superior temporal gyrus from 430 to 600 ms (Σt = 5654, p < 0.05, **Figure 4**). However, there were no reliable clusters found for the difference of opaque compounds and simplex words within the temporal lobe.

# 4. Discussion

Analyses of the different word types in isolation revealed consistent evidence that there is a difference in how simplex and complex words are processed in the brain. The behavioral results confirmed that there is a stage in lexical access that is sensitive to the morphological forms within complex words and demonstrated that these effects can also be observed in other testing modalities, namely, word naming. The onset latency interaction effect, in which compound words were faster to produce than morphologically simple words when primed by their constituent morpheme, is largely consistent with the results within the masked priming literature on word recognition, and gives further evidence that there is a decomposition stage in lexical access where complex words are parsed into their morphemes (Rastle et al., 2004; Taft, 2004; Morris et al., 2007; McCormick et al., 2008; Fiorentino and Fund-Reznicek, 2009). The parsing operation occurs independently of the semantic relationship between constituent morphemes and their complex word. Since early activation of constituents via morphological decomposition happens irrespective of semantic transparency, what differentiates transparent and opaque compounds must happen during a later stage of morphemic composition. The increased activity found for transparent compounds in the anterior temporal lobe from 250 to 470 ms provides evidence for a stage in lexical access where the meanings of the morphemes play a part in accessing the overall meaning of the word. Bemis and Pylkkänen (2011) show combinatorial effects in the LATL for adjectival words at around 225 ms after the critical word is presented. The difference in timing could be explained by the different time points at which we time lock the onset of the stimulus. In Bemis and Pylkkänen (2011), the onset coincides with the onset of the noun boat in the phrase red boat, whereas in our study the critical stimulus is the entire compound sailboat.

The increased activation in the posterior temporal lobe for transparent compounds from 430 to 600 ms that follows the activity in the LATL is consistent with the fact that this region is involved in lexical retrieval (Hickok and Poeppel, 2007; Lau et al., 2008). Lau et al. (2008) proposed that the posterior region of the temporal lobe is the best candidate for the lexical storage of words. Since the LATL is responsible for composing the meaning of the constituent morphemes, the posterior temporal lobe would be responsible for retrieving information from its stored lexico-semantic representation. This region is also engaged in sound-to-meaning transformation (Binder et al., 2000), which would include the retrieval of phonological information. This study is in tune with decomposition models from visual word recognition literature and provides the neural basis for a stage in lexical access involved in the composition of meaning within compound words, thus helping to disentangle cognitive processes that are indistinct when reaction time is the only measure. Bridging results from psycholinguistic research with MEG recordings of brain activity, the emerging results suggest that the recognition of compounds involves distinct stages: a decomposition stage that is independent of semantics, and a composition stage that is governed by semantics. We showed that the course of activation varies in terms of word complexity and semantic transparency.

FIGURE 4 | Transparent vs. simplex difference in Posterior Superior Temporal Gyrus (pSTG).

# Author Contributions

Authors TB and DC share first-authorship as they have both equally contributed to the paper.

# Funding

This work is supported by the National Science Foundation under Grant No. BCS-0843969, and by the NYU Abu Dhabi Research Council under Grant No. G1001 from the NYUAD Institute, New York University Abu Dhabi. The work of TB was supported by the National Science Foundation Graduate Research Fellowship under DGE-1342536. The work of DC was supported by the Coordination for the Improvement of Higher Education Personnel and the Fulbright Commission under the Mutual Educational Exchange Act, sponsored by The United States of America Department of State, Bureau of Educational and Cultural Affairs.

# Acknowledgments

We would like to thank Alec Marantz for his support and guidance with this project. We would also like to thank Masha Westerlund and Phoebe Gaston for providing critical feedback for this paper. We would also like to thank Jeff Walker of the NYU MEG Lab for his help while running participants.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Brooks and Cid de Garcia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Insights from letter position dyslexia on morphological decomposition in reading

### Naama Friedmann<sup>1</sup> \*, Aviah Gvion<sup>1,2,3</sup> and Roni Nisim<sup>1</sup>

*<sup>1</sup> Language and Brain Lab, Tel Aviv University, Tel Aviv, Israel, <sup>2</sup> Reuth Medical and Rehabilitation Center, Tel Aviv, Israel, <sup>3</sup> Communication Sciences and Disorders, Ono Academic College, Tel Aviv, Israel*

### Edited by:

*Harald Clahsen, Potsdam Research Institute for Multilingualism, Germany*

### Reviewed by:

*Davide Crepaldi, University of Milano-Bicocca, Italy; Stavroula Stavrakaki, Aristotle University of Thessaloniki, Greece*

### \*Correspondence:

*Naama Friedmann, Language and Brain Lab, School of Education and Sagol School of Neuroscience, Tel Aviv University, Ramat Aviv, Tel Aviv 69978, Israel naamafr@post.tau.ac.il*

> Received: *16 October 2014* Accepted: *02 March 2015* Published: *03 July 2015*

### Citation:

*Friedmann N, Gvion A and Nisim R (2015) Insights from letter position dyslexia on morphological decomposition in reading. Front. Hum. Neurosci. 9:143. doi: 10.3389/fnhum.2015.00143*

We explored morphological decomposition in reading, the locus in the reading process in which it takes place and its nature, comparing different types of morphemes. We addressed these questions through the analysis of letter position errors in readers with letter position dyslexia (LPD). LPD is a selective impairment to letter position encoding in the early stage of word reading, which results in letter migrations (such as reading "cloud" for "could"). We used the fact that migrations in LPD occur mainly in word-interior letters, whereas exterior letters rarely migrate. The rationale was that if morphological decomposition occurs prior to letter position encoding and strips off affixes, word-interior letters adjacent to an affix (e.g., the "n" in signs) would become exterior following affix-stripping and hence exhibit fewer migrations. We tested 11 Hebrew readers with developmental LPD and 1 with acquired LPD in 6 experiments of reading aloud, lexical decision, and comprehension, at the single word and sentence levels (compared with 25 age-matched control participants). The LPD participants read a total of 12,496 migratable words. We examined migrations next to inflectional, derivational, or bound function morphemes compared with migrations of exterior letters. The results were that root letters adjacent to inflectional and derivational morphemes were treated like middle letters, and migrated frequently, whereas root letters adjacent to bound function morphemes patterned with exterior letters, and almost never migrated. Given that LPD is a pre-lexical deficit, these results indicate that morphological decomposition takes place in an early, pre-lexical stage. The finding that morphologically complex nonwords showed the same patterns indicates that this decomposition is structurally, rather than lexically, driven. We suggest that letter position encoding takes place before morphological analysis, but in some cases, as with bound function morphemes, the complex word is re-analyzed as two separate words. In this reanalysis, letter positions in each constituent word are encoded separately, and hence the exterior letters of the root are treated as exterior and do not migrate.

Keywords: morphological decomposition, Hebrew, letter position, inflection, derivation, letter position dyslexia, acquired dyslexia, developmental dyslexia

# Introduction

Major questions in the study of morphological processing are whether and when morphological decomposition takes place during reading. Since the seminal work of Taft and Forster (1975), many researchers assume that words are represented in a decomposed form in the orthographic input lexicon. If this is so, then in order to identify a word in the lexicon, morphological decomposition is required. Debates remain as to whether this decomposition is obligatory or whether words can still be accessed as wholes: Taft and Forster (1975) supported an obligatory decomposition account (see also Taft, 2004; Taft and Ardasinski, 2006; Rastle and Davis, 2008), whereas other models advocated a dual-access view whereby morphologically complex words can also be stored as wholes in the lexicon and decomposition occurs only in certain conditions (e.g., Schreuder and Baayen, 1995; Baayen et al., 1997; Diependaele et al., 2009, 2013). Additional work has revolved around the question of the nature of the morphological decomposition: whether it is guided by purely structural, morphological-orthographic considerations (Longtin et al., 2003; Rastle et al., 2004; Longtin and Meunier, 2005; Rastle and Davis, 2008; Beyersmann et al., 2011; Crepaldi et al., 2014) or rather consults the lexicon or lexical-semantics (Giraudo and Grainger, 2000, 2001).

In the current study, we ask when and how this morphological decomposition takes place using letter position encoding. Specifically, we ask about the interaction between letter position encoding and morphological decomposition, and their relative order. Until now, studies that asked questions about morphological decomposition by using letter transpositions examined priming in normal readers. The basic idea of many of these studies was to compare the priming of primes created from existing words by transposition within the stem to primes created by transposition across morpheme boundaries. A difference in the priming effect of the two conditions would indicate that morphological decomposition occurs early. These studies yielded inconsistent results (Christianson et al., 2005; Duñabeitia et al., 2007; Grainger and Ziegler, 2011; Rueckl and Rimzhim, 2011; Masserang and Pollatsek, 2012; see Sanchez-Gutierrez and Rastle, 2013; Taft and Nillsen, 2013, and Amenta and Crepaldi, 2012, for review and discussion). A different way to look at morphological decomposition through transpositions was created by Beyersmann et al. (2011; see also Beyersmann et al., 2013). Their idea was to examine priming from a morphologically complex nonword created from a transposed stem and a suffix to the stem. Their findings, indicating priming in such stimuli, point to morphological decomposition. Finally, in a recent study, Taft and Nillsen (2013), who also used priming in normal reading, took advantage of the fact that primes in which the exterior letters are transposed provide a smaller priming effect than primes with middle transposition. They compared transpositions at the exterior letters of the stem (which would be exterior letters following decomposition) to transpositions in the middle of the stem: comparing, for example, disrpove and disporve, respectively. Their results were that even when the prime was a nonword, when it could be decomposed to a lexical stem and existing affix (e.g., unprove), it primed a word with the same stem, indicating early morphological decomposition. No difference was found in the priming of exterior and middle transpositions, which the authors explained by saying that the reduced effect of initial letters is purely perceptual and hence this was not observed once the initial letters of the stem were not perceptually initial.

In the current study we looked at morphological decomposition through letter position from a novel perspective: that of the reading pattern of individuals with letter position dyslexia (LPD), a dyslexia that specifically affects letter position encoding. The rationale is the following: LPD affects an early, pre-lexical stage of orthographic-visual analysis (for these model components, cf. Ellis and Young, 1996; Coltheart et al., 2001; Jackson and Coltheart, 2001; Friedmann and Coltheart, in press). Therefore, whether or not LPD is affected by the morphological structure of the target word can inform us about morphological processing taking place in this early stage.

Previous studies have already examined the interaction of morphological decomposition and peripheral dyslexias, that is, dyslexias in the pre-lexical orthographic-visual analysis stage. Reznick and Friedmann (2009) tested the effect of morphology on the reading of 7 Hebrew readers who had word-based neglect dyslexia (neglexia) following stroke. Neglexia is a reading deficit in which letters on one side of the word are neglected, causing substitutions, omissions and additions of letters on the neglected side (Caramazza and Hillis, 1990; Ellis and Young, 1996; Haywood and Coltheart, 2001; Vallar et al., 2010). Readers with left neglexia may read stop, unclear, and cars as "top," "clear" and "bars," respectively. Readers with right neglexia would read boot, liver, and corner as "book," "live/lived" and "corn." Reznick and Friedmann found that the reading of the neglexic patients was affected by the morphological structure of the target words: affixes were neglected significantly more than root letters. This pattern was especially evident in letter omission errors: whereas affixes on the neglected side were often omitted, root letters were never omitted. This effect was purely structural and was not affected by lexical properties of the root and the target word. The interpretation was that morphological decomposition affects reading already in the orthographic-visual analysis stage, and without feedback from the lexical stages: it requires three root letters, and does not stop shifting attention to the neglected side until three root letters are found.

A similar effect of morphology on peripheral dyslexia was found in the reading errors of 10 individuals with developmental attentional dyslexia (Friedmann et al., 2010b). The typical error in attentional dyslexia is the migration of letters between neighboring words. Friedmann et al. found that these errors occurred more often in affix morphemes than in the root.

Neglexia and attentional dyslexia both stem from a prelexical deficit at the orthographic-visual analyzer: neglexia affects attention shift to the neglected side of the word and attentional dyslexia affects binding of letters to words. Therefore, the findings of both studies serve as additional evidence that morphological decomposition indeed occurs very early in the course of word reading, before lexical access.

The current study assessed a different function of the orthographic-visual analysis stage, which possibly functions at an earlier stage than letter-to-word binding<sup>1</sup>: that of letter position encoding. We asked whether the morphological structure of the target words affects letter position errors in LPD, to find out whether letter position encoding precedes or follows morphological decomposition. We further asked whether all types of morphemes behave similarly or whether they exhibit different patterns with respect to decomposition. LPD is characterized by letter position errors in reading (e.g., trail → trial, smile → slime, cloud → could) that occur mainly in middle letters (Friedmann and Gvion, 2001, 2005; Friedmann and Rahamim, 2007, 2014; Friedmann et al., 2010a; Friedmann and Haddad-Hanna, 2012, 2014; Kohnen et al., 2012; Kezilas et al., 2014). This dyslexia results from a selective impairment in letter position encoding in the early, pre-lexical stages of visual analysis of the written word.

We used the fact that individuals with LPD make transpositions in middle letters but almost never in the first or final letters. The idea was that if the morphologically-complex word is decomposed to its morphemes prior to the stage at which letter position errors occur, then the exterior letter of the base morpheme that is adjacent to an affix and therefore appears as a middle letter in the complex word, may become an exterior letter when stripped of the affix. For example, in a word like signs, the "n" is a middle letter, but if the plural affix −s is stripped off the base before the stage in which letter position errors occur, then the "n" becomes exterior and hence would not migrate.

Namely, if two conditions are both fulfilled, that is, morphological decomposition occurs before letter position encoding and this decomposition actually creates two separate morphemes, then letter position errors are not expected to occur in base letters on the edge of an affix (or are expected to occur at a rate similar to that of exterior letters). If, however, letter position encoding (and hence, letter position errors) occurs prior to the early morphological decomposition, then at the level at which letter position errors occur, the first letter of the second morpheme is still in middle position, and would have a similar fate to other middle letters. In this case, it will show the same transposition rate as middle letters. To examine this question and to compare various types of morphemes, we used Hebrew, a morphologically-rich language.
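The two competing predictions can be made concrete with a small sketch. It uses the English suffix example from the text (signs) rather than Hebrew, and the function names and the simple "first or last letter is exterior" rule are assumptions made for illustration.

```python
def is_exterior(index, length):
    """A letter is exterior if it is the first or last letter of the string."""
    return index == 0 or index == length - 1


def predicts_migration(word, affix_len, letter_index, decompose_first):
    """Predict whether a base letter sits in a migration-prone (interior) position.

    word            -- written word with a suffix, e.g. "signs"
    affix_len       -- number of suffix letters, e.g. 1 for "-s"
    letter_index    -- index of the base letter of interest within the word
    decompose_first -- True if affix stripping precedes letter position encoding
    """
    if decompose_first:
        base = word[: len(word) - affix_len]
        return not is_exterior(letter_index, len(base))
    return not is_exterior(letter_index, len(word))


# The "n" in "signs" is at index 3.
print(predicts_migration("signs", 1, 3, decompose_first=True))   # False
print(predicts_migration("signs", 1, 3, decompose_first=False))  # True
```

If decomposition comes first, the "n" is the final letter of the stripped base and is predicted not to migrate; if position encoding comes first, it is still a middle letter of the whole word and is predicted to migrate like other middle letters.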

# Morphology in Hebrew

Hebrew is a Semitic language, read from right to left. It is an alphabetic script in which not all vowels are represented orthographically. Hebrew words are built from a 3-letter root and a derivational template and/or inflectional morphology. Verbs, nouns, adjectives, and prepositions can inflect for gender, number, and possessor/genitive. Verbs also inflect for tense and person. Derivational templates exist for verbs, nouns, and adjectives. The nominal template for nouns and adjectives is called "mishkal" and the verbal template for verbs is called "binyan" (Arad, 2005; Arad and Shlonsky, 2008). Inflectional and derivational morphemes may be vowels or consonants. The morphological structure of Hebrew words was consistently shown to affect word reading. For example, in a line of priming studies and oral reading in rapid serial visual presentation, Frost et al. (1997, 2000a,b), and Velan and Frost (2007, 2009, 2011) showed that Hebrew words prime visual recognition of other words that share their roots (more than other orthographically similar primes).

As shown in Appendix A, **Table A1**, the morphological inflections and the derivational templates may appear before, in the middle, or at the end of the word. Many of them occur in more than one position in the word. One can think of the morphemes in Hebrew as a template consisting of consonants and vowels, with three empty slots for consonants, in which the root letters are inserted. All 22 Hebrew letters can function as root letters, and 12 letters can also be part of an inflectional or derivational affix. Some letters can serve as inflectional or derivational affixes only at the beginning of the word, but not at its end (e.g., ), whereas other letters can appear as affixes before, within, and after the root (e.g., ), or both before and after the root (e.g., , ). Some morphemes are single letters, whereas others are two letters. There is another type of morpheme in Hebrew, which we term "bound function morpheme." These are 7 function words that appear in English as separate words (the, that, and, in, to, as, from). In Hebrew they appear as a single letter ( , , , , , , , respectively) that is bound to the beginning of the word, and appears as part of the word (theword, andappears). Bound function morphemes always precede the word<sup>2</sup>. In this study we compared inflectional, derivational, and bound function morphemes, assessing whether they are stripped off the words early enough so as to make the adjacent root letters behave like exterior letters.

# Participants and Background Tests

# Background Description of the 12 Participants with LPD

The participants were 11 individuals with developmental LPD and one woman with acquired LPD following brain damage. Galia, the participant with acquired dyslexia, was a 54-year-old woman. She was a teacher and a PhD student with 20 years of education. She had a sudden onset of seizures with herpes encephalitis 13 months before our testing. CT demonstrated a small hypodense area in the right temporal lobe. Her reading was impaired, showing clear and selective LPD. Her speech and naming abilities were normal. Her writing was impaired, with mild graphemic buffer dysgraphia. She participated only in Experiment 1, in which the participants read aloud 500 migratable words. The background details of the developmental LPD participants, who were all school students, 5 females and 6 males, are summarized in **Table 1**.

<sup>1</sup>Whereas letter identification and letter position encoding have to occur in the first stage of orthographic-visual analysis, letter-to-word binding may occur slightly later, in the graphemic input buffer, which holds the products of letter identification and position stages, a stage that can hold more than a single written word at a time.

<sup>2</sup>Most of these bound function morphemes have a full-word counterpart that appears as a stand-alone function word. Talmy Givón (1971) made the famous claim "Today's morphology is yesterday's syntax," according to which in many languages bound morphemes arise historically from free lexical morphemes. The same is true for Hebrew bound function words. Historically, most of these morphemes started out as the full independent form and then their phonologically reduced form emerged, written as a bound prefix. For example, "min," from, became "mi-," "kemo," like, became "ke-," and "el," to, became the attached "le-" (Hardy, 2014 and cf. Pat-El, 2012, for a discussion of the relation between the full independent relativizer "asher" and the bound clitic "she-"). Given that most full forms still exist alongside the bound morphemes, this may contribute to the perception of such bound function morphemes by Hebrew speakers as separate words.

TABLE 1 | Background description of the participants with developmental LPD.

# Testing to Establish LPD and for Inclusion in the Study

Each of the participants with LPD was selected to participate in this study on the basis of migration errors within words in reading aloud and in silent reading, alongside intact word production. This screening testing included two tasks of reading aloud: the TILTAN screening test of oral reading of 136 single words of various types, and a test of oral reading of 232 migratable words. To establish that the migrations that the participants made in reading indeed resulted from a deficit in letter position encoding and not in the speech production stages, we also used tasks of reading without oral production: a test of migratable word comprehension, and tests of oral production without reading: picture naming and migratable word repetition. We only included participants who made migrations in reading aloud and in comprehension and who had no migrations in oral word production.

### Screening Tests

The TILTAN reading screening test (Friedmann and Gvion, 2003) includes 136 single Hebrew words of various types that were constructed so that they are sensitive to various types of dyslexia. Most importantly for our study, 65 of the words in the test are sensitive to LPD, as they are migratable words: words for which a transposition of middle letters can create another existing word. All the words in the test are sensitive to left neglect dyslexia at the word level, as each of them, when read with a neglect error on the left side (omission or substitution of letters), can yield another existing word (such as snow, which can be read as "know" or "now" following a left letter substitution or omission, respectively); 104 of the words are sensitive to right neglect, as neglect errors on their right side create other existing words. The test also includes words for identifying surface dyslexia<sup>3</sup>: potentiophones and words that are parallel to irregular words in English; abstract words, function words, and morphologically complex words, for identifying deep dyslexia (and phonological output buffer dyslexia); words with many orthographic neighbors for identifying visual dyslexia; and words for which migrations, substitutions, omissions, or additions of a vowel letter create other existing words for identifying vowel dyslexia (Khentov-Kraus and Friedmann, 2011).

For individuals who made significantly more migration errors than controls, without other dyslexias, who were therefore suspected to have LPD, we further administered an additional reading aloud test of 232 migratable words.

The 232 migratable words oral reading test includes 232 Hebrew words in which migration of middle letters creates another existing word (such as cloud-could, parties-pirates, casual-causal). The 232 migratable words had 4–7 letters (M = 4.9, SD = 0.9). In 87 of these words a middle migration that involves a vowel letter and a consonant letter creates another existing word, and in 163 words a middle migration that involves only consonant letters creates another word.
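The notion of a migratable word can be illustrated with a minimal sketch. It assumes a simplified definition (the first and last letters stay in place and the middle letters are rearranged in any way), and it uses the English example pairs given above with a tiny toy lexicon; the function names and the lexicon are hypothetical and are not part of the test materials.

```python
from collections import Counter


def middle_migration_counterparts(word, lexicon):
    """Return lexicon words obtained from `word` by rearranging only its middle letters."""
    if len(word) < 4:
        # Fewer than two middle letters: nothing can migrate.
        return set()
    middle = Counter(word[1:-1])
    return {
        other
        for other in lexicon
        if other != word
        and len(other) == len(word)
        and other[0] == word[0]          # exterior letters stay in place
        and other[-1] == word[-1]
        and Counter(other[1:-1]) == middle  # same middle letters, different order
    }


def is_migratable(word, lexicon):
    """A word is migratable if at least one middle-letter rearrangement is a word."""
    return bool(middle_migration_counterparts(word, lexicon))


# Toy lexicon (hypothetical); a real stimulus list would rely on a full lexicon.
TOY_LEXICON = {"cloud", "could", "parties", "pirates", "casual", "causal", "table"}

for w in ("cloud", "parties", "casual", "table"):
    print(w, sorted(middle_migration_counterparts(w, TOY_LEXICON)))
```

In this sketch a word such as "table" is not migratable because no other entry in the toy lexicon shares its exterior letters and middle-letter set.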

To establish that the impairment is at the early stage of orthographic-visual analysis rather than in the output stages, we also tested reading comprehension of migratable words, picture naming, and the repetition of 20 migratable words. The rationale was that if the deficit is at the orthographic-visual analysis stage, not only reading aloud but also comprehension of migratable words would be impaired and indicate transpositions of middle letters, but picture naming and repetition should not be affected. An output deficit should show the opposite pattern, with good comprehension of written migratable words when no reading aloud is required, and poor oral production in picture naming and repetition.

Reading comprehension of migratable words was tested using 50 triads of written words. Each triad included a target migratable word, a word that is semantically related to it and a word that is semantically associated to the transposition counterpart of the target word. The participant was requested to choose the word that was related to the target word. For example, the target word , dogs, in which a transposition creates the word , cables, appeared with the words animals and television.
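The triad design can be sketched as a small data structure. This is only an illustration under our own naming assumptions (`Triad`, `score_choice`, and the field names are hypothetical); it uses the English glosses of the example above rather than the Hebrew items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triad:
    target: str          # written migratable word (gloss)
    related: str         # word semantically related to the target (correct choice)
    migration_foil: str  # word related to the target's transposition counterpart


def score_choice(triad, chosen):
    """Classify a response as correct or as consistent with a letter migration."""
    if chosen == triad.related:
        return "correct"
    if chosen == triad.migration_foil:
        return "migration"
    raise ValueError(f"{chosen!r} is not one of the two response options")


# Example from the text: 'dogs' (transposition counterpart: 'cables').
example = Triad(target="dogs", related="animals", migration_foil="television")
print(score_choice(example, "television"))
```

Choosing the foil suggests that the target was read as its transposition counterpart even though no oral response was required.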

Naming was tested using a picture naming task of 100 color object pictures (SHEMESH, Biran and Friedmann, 2004); repetition was tested using a task of repetition of 20 migratable words.

<sup>3</sup>In Hebrew, due to the under-specification of vowels in the orthography, and to the fact that there are 9 letters that have an ambiguous conversion to phonemes, 13 homophonic letters, and lexical stress that is not marked in the orthography, there are actually no regular words. Therefore, all words in the screening tests were irregular, but for the detection and identification of surface dyslexia, we used the two types of words that are most sensitive to surface dyslexia: potentiophones, i.e., words whose reading via grapheme-to-phoneme conversion creates another existing word, like now, which can be read via grapheme-to-phoneme conversion as "know" (Friedmann and Lukov, 2008); and words that are parallel to irregular words in English, i.e., words with silent letters or with a letter that can be converted via the sublexical route into two or more different phonemes, where in the target word the letter is converted to its less frequent phoneme.


TABLE 2 | Number of errors of the various types in the TILTAN oral reading screening test.

<sup>∗</sup>*Significantly more errors than age-matched control group (p* < *0.05).*

# Results in the Screening Tests of LPD Participants Included in the Study

We selected only participants who had significantly more letter migration errors on the three tasks of migratable word reading than age-matched skilled readers (TILTAN norms, Friedmann and Gvion, 2003), using Crawford and Garthwaite's (2002) t-test for the comparison between an individual and a control group, and who performed normally and migrations-free in picture naming and repetition.

**Table 2** summarizes the participants' reading performance (number of errors of each type) in the TILTAN reading screening test. **Table 3** summarizes their performance in the 232 migratable words oral reading test and in reading comprehension of the 50 migratable words.

In reading aloud, as can be seen in **Tables 2** and **3**, the prominent error of all the participants was letter migrations within words, and each of them made significantly more migration errors compared to age-matched controls, whereas other types of reading errors were relatively few<sup>4</sup> .

The comprehension of migratable words, which involved only silent reading, also indicated that the participants had LPD, as each of them made significantly more errors than the controls in this test.

TABLE 3 | Percentage of migrations in oral reading of 232 migratable words and errors in a task of comprehension of 50 written migratable words.


<sup>∗</sup>*Significantly more errors than age-matched control group (p* < *0.05).*

Unlike their impaired oral and silent reading, characterized by migration errors, the participants' naming and migratable word repetition was normal, and none of the participants made migration errors in naming or in repetition. **Table 4** summarizes their performance in the picture naming and repetition tasks.

This pattern of results shows that indeed the source of the migration errors of the 12 participants lies in the encoding of letter position in the orthographic visual analyzer.

<sup>4</sup>As the screening test reading (**Table 2**) indicates, six of the participants with developmental LPD also made surface-dyslexia-like errors, resulting from reading via grapheme-to-phoneme conversion instead of via the lexical route, at a higher rate than expected for their age. These errors do not necessarily mean that these participants have surface dyslexia on top of their LPD, but could rather stem from their insufficient exposure to reading because of the reading difficulties, which results in insufficient entries in the orthographic input lexicon, forcing them to read through the sublexical route. It might also be that given the pre-lexical deficit in the orthographic-visual analyzer, the representations in their orthographic input lexicon are abnormal.



## Control Group

The control group included 25 age-matched skilled readers (9 females and 16 males) without any reading impairments, as tested by the TILTAN reading screening test. Ten of them were age-matched to the 5–6th grade participants with LPD (mean age = 11.6, SD = 0.5); ten were age-matched to the 7–8th grade participants with LPD (five matched to the 7th grade participant and five to the 8th grade participant, mean age = 13.8, SD = 0.9); and five were 12th graders, age-matched to the older individual with LPD (mean age = 18.4, SD = 0.4). In all the data tables below, YO was compared to the 12th grade group, OR and BR to the 7–8th grade group, and the rest were compared to the 5–6th grade control group. These control participants were tested in all the reading tests that were administered to the LPD participants, described in the following sections.

# General Method

The experimental study of morphology in LPD included six experiments that tested reading, lexical decision, and comprehension of migratable words at two levels: single words, and sentences that include migratable words.

# Procedure

During the testing sessions, every response that differed from the target was transcribed by the experimenter, and words read correctly were scored with a plus sign. All the sessions were audio-recorded and two judges listened to the recordings after the sessions, and the transcription from the session was checked and corrected or completed using the recordings.

The words and sentences in the various experiments were presented to each participant on the desk, printed on a white page. In the oral reading tasks, the participant was requested to read aloud as accurately as possible; in the lexical decision and comprehension tasks the participant was requested to perform the task without reading the words aloud. No time limit was imposed during testing, and no response-contingent feedback was given by the experimenter, only general encouragement. The participants were told that they could pause or stop the session whenever they needed a break. Each participant was tested individually in a quiet room in two to three sessions of 1–2 h. The Ethics Committees of Tel Aviv University and the Ministry of Education approved the experimental protocol.

# Data Analysis

The results were analyzed on the group level as well as for each individual participant. We compared the performance at the group level between two conditions using t-test for correlated samples (after we established that the data of the LPD participants on the inflectional, derivational, bound, and exterior conditions did not depart from normality, as the skewness and kurtosis of each of them did not significantly differ from 0).

At the individual level, performance in different structures was compared using chi-square tests. To compare the performance of each experimental participant to her/his age-matched control group, we used Crawford and Howell's (1998; Crawford and Garthwaite, 2002) t-test. An alpha level of 0.05 was used.
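For readers unfamiliar with the single-case comparison, the following sketch computes the Crawford and Howell (1998) modified t statistic as it is commonly stated: the case's score minus the control mean, divided by the control sample SD times sqrt((n + 1)/n), with n − 1 degrees of freedom. The scores below are invented for illustration and are not data from this study; the function name is hypothetical.

```python
import math
import statistics


def crawford_howell_t(case_score, control_scores):
    """Modified t statistic comparing one case to a small control sample.

    Returns (t, degrees_of_freedom). The p value would be read from a
    t distribution with n - 1 degrees of freedom.
    """
    n = len(control_scores)
    if n < 2:
        raise ValueError("at least two control scores are required")
    mean = statistics.mean(control_scores)
    sd = statistics.stdev(control_scores)  # sample SD (n - 1 in the denominator)
    t = (case_score - mean) / (sd * math.sqrt((n + 1) / n))
    return t, n - 1


# Hypothetical numbers of migration errors: five controls and one case.
controls = [2, 3, 1, 4, 2]
t_value, df = crawford_howell_t(10, controls)
print(f"t({df}) = {t_value:.2f}")
```

The correction factor sqrt((n + 1)/n) treats the control group as a sample rather than a population, which matters for control groups as small as five or ten readers.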

# General Materials: Stimuli Structure

Across all 6 experiments, we examined three types of morphemes: Inflectional, derivational, and bound function morphemes (conditions 1–3 below). In all cases, we examined the rate of transpositions of root letters AB that were adjacent to the tested morpheme, compared with a control condition in which the root letters AB were exterior<sup>5</sup> (example 4, with and without an affix in the irrelevant side).

In 1–4 below, ABX represent the three consonant root letters. In all the target words, the two letters to be migrated, A and B, were always adjacent to each other, and the transposition of the letters AB created an existing word with the sequence BA in the relevant side.

Condition 1: Inflectional morphology



Condition 3: Bound function morpheme in the beginning: [bound function morpheme]**AB**X

Condition 4: Exterior letter migration, with no morpheme on the relevant side


<sup>5</sup>We selected this exterior control condition because we were interested in whether the migrations adjacent to a morpheme behave like exterior migrations. A middle letter migration control condition, which involves migration of middle letters within the root and not adjacent to morphemes, is impossible in Hebrew because Hebrew words are based on 3-letter roots. Interior letter migrations require at least 4 letters, but 4 letter words inevitably include affixes. Therefore, middle letter control items that involve migration of two letters of the root and do not include affixes are impossible (or are limited to loan words that do not have the Semitic morphological structure).

We followed several principles when creating the lists of words of the various types. We used the same root across conditions where possible: in most cases (72% of the roots) the same root was used in all 4 conditions or in 3 of them, except when the root does not naturally appear with some of the morpheme types. That way, in many cases it was exactly the same root and the same two letters that migrated in the compared conditions. For example, the 3-letter root , bxr (and here x is the IPA transcription of the velar fricative consonant represented by the letter , not a variable), which has a transposition counterpart , xbr, appeared in the inflectional condition with an inflectional prefix (and affix) as (tbxri, you-will-choose), with the transposition counterpart (txbri, you-will-connect); in the derivational condition, with a derivational prefix, as (mbxr, selection), with the transposition counterpart (mxbr, connector or connects); in the bound function morpheme condition as (hbxorh, the-girl), with the transposition counterpart (hxborh, the-group or the-bound); and in the exterior transposition condition as (bxrh, selected-fem), with the transposition counterpart (xbrh, girlfriend or company). In each of these four conditions the relevant transposition involves the first two root letters.
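The relation between each target and its transposition counterpart in the example above can be sketched as follows, using the Latin transliterations given in the text. The function name is hypothetical, and the sketch assumes that the two root letters appear adjacently exactly once in the word; an affix letter identical to a root letter would require a proper morphological parse.

```python
def transpose_adjacent_root_letters(word, root, first_index=0):
    """Swap two adjacent root letters inside a transliterated word.

    `first_index` is the 0-based position within the root of the first
    migrating letter (0 for root letters 1-2, 1 for root letters 2-3).
    """
    pair = root[first_index:first_index + 2]
    if len(pair) != 2:
        raise ValueError("first_index must point at two adjacent root letters")
    pos = word.find(pair)
    if pos == -1:
        raise ValueError("the two root letters are not adjacent in this word")
    return word[:pos] + pair[1] + pair[0] + word[pos + 2:]


ROOT = "bxr"
# (condition, transliterated target) pairs taken from the example in the text.
TARGETS = [
    ("inflectional", "tbxri"),
    ("derivational", "mbxr"),
    ("bound function", "hbxorh"),
    ("exterior", "bxrh"),
]

for condition, target in TARGETS:
    print(condition, target, "->", transpose_adjacent_root_letters(target, ROOT))
```

In all four conditions the same two letters (b and x) exchange places; only the material to their left differs.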

The derivational morphemes were morphemes of verbal and nominal templates, the inflectional morphemes were morphemes of person, gender, number, tense, and possessive pronoun suffixes.

The Hebrew bound function morphemes always appear before the root, and so they did in the stimuli. We used the 7 bound function morphemes, in a way that they always formed a syntactically licit combination with the word they were bound to (e.g., the determiner "the" and the preposition "in" were always added to a noun or an adjective but not to a verb).

In the bare root control condition, we used the 3-letter root itself, when it was an existing word. In the morphologically complex exterior letter migration condition, we used the root and an additional affix that appeared on the side opposite to the expected migration: if the expected exterior migration was at the beginning, in letters 1 and 2 of the root, the affix was added at the end of the word, and if the expected migration was at the end, the affix was added at the beginning of the word, before the root. The morphologically complex control condition also included vowel letters inside the words, but not between the migrating letters, which were always adjacent. The longer control stimuli were used so that exterior letter migration would be tested in words of the same length as the words in the morphological conditions.

In Hebrew, five letters have different forms in middle and final position, so in order to avoid the effect of letter form on position encoding (see Friedmann and Gvion, 2005), these letters did not appear in any of the morphological conditions when the migration of the second and third root letters was tested, either as the second or as the third root letter.

Morphologically complex words were classified into the various conditions according to the type of morpheme that was adjacent to the site of expected migration. Namely, if a word started with an inflectional prefix and ended with a derivational suffix, it was considered part of the inflectional condition if the relevant migration was adjacent to the prefix (root letters 1 and 2), and part of the derivational condition if the relevant migration was adjacent to the suffix (root letters 2 and 3). In some of the word lists there were a few words that had a potential for migrations both of the first and second root letters and of the second and third root letters. In these cases, we included these items in the totals of both conditions, and analyzed the errors according to the errors each participant made. The words of the various conditions were presented in a semi-random order, making sure that no more than two words of the same condition appeared consecutively, and that words of the same root (and even words with the same root letters in a different order) never appeared consecutively.
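The ordering constraints just described can be made concrete with a small sketch. It is not the procedure actually used to build the lists, which the text does not specify; it simply checks the two stated constraints and, as one possible approach, reshuffles until they hold. The item list, root strings, and function names are placeholders chosen for illustration.

```python
import random


def is_valid_order(items):
    """Check the two constraints on a list of (condition, root) items."""
    for i in range(1, len(items)):
        prev_cond, prev_root = items[i - 1]
        cond, root = items[i]
        # Same root letters in any order must never be adjacent.
        if sorted(root) == sorted(prev_root):
            return False
        # No more than two consecutive items of the same condition.
        if i >= 2 and cond == prev_cond == items[i - 2][0]:
            return False
    return True


def semi_random_order(items, rng, max_tries=10000):
    """Rejection sampling: shuffle until the constraints are satisfied."""
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if is_valid_order(order):
            return order
    raise RuntimeError("no valid order found; raise max_tries or relax constraints")


# Placeholder stimuli: (condition, root) pairs, not the actual word list.
ITEMS = [
    ("inflectional", "bxr"), ("derivational", "bxr"),
    ("bound", "xbr"), ("exterior", "xbr"),
    ("inflectional", "gdl"), ("bound", "gdl"),
    ("derivational", "ktb"), ("exterior", "ktb"),
    ("inflectional", "lmd"), ("exterior", "lmd"),
    ("derivational", "spr"), ("bound", "spr"),
]

for condition, root in semi_random_order(ITEMS, random.Random(0)):
    print(condition, root)
```

Rejection sampling is adequate for a toy list like this one; for a 500-item list with many shared roots, a constructive procedure would probably be more practical.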

To assess the effect of the morphological structure of the word on the rate of migrations, we compared the rate of migrations of the two root letters (AB) in each of the three types of morphemes to the exterior migration<sup>6</sup>, and between the various morpheme types. Error scoring referred only to transposition errors and ignored surface dyslexia-like errors, so that words that were read with surface dyslexia-like errors but without transposition errors were counted as correct responses.

We used three types of tasks: oral reading, lexical decision, and written word comprehension. Because we had initially thought that some morphological analyses may occur only within a sentence context, we examined each task both at the single word level and at the sentence level, with a total of six experiments. (As you will see below, this worry was unwarranted, as morphological decomposition occurred even at the single word level.) The group with LPD read a total of 8679 morphologically complex migratable words in the 6 experiments. Together with the initial lists of migratable words that each participant read in the screening stage, each participant read 1136 migratable words, so our results are based on a total of 12,496 migratable words that the LPD group read.

# Experiment 1: Oral Reading of Single Morphologically-Complex Words

# Method and Material

Each participant was presented with a list of 500 words and was requested to read them aloud as accurately as possible. The word list included:

**116** words with initial **bound function morphemes** (**Table 5**, condition 1);

**104** words with **inflectional morphology** adjacent to the migrating letters: 52 in the beginning, 55 in the end (three of the 104 words included "relevant affixes" on both sides: in these words, both a migration of the root letters adjacent to an inflectional prefix and a migration near an inflectional suffix created an existing word) (**Table 5**, conditions 2a and 2b);

**109** words with **derivational morphology** adjacent to the migrating letters: 56 in the beginning, 57 in the end (4 of the

<sup>6</sup>We could not compare these migrations to migrations in the middle of the root because Hebrew roots are generally 3-letter roots, so there is no way for a migration to involve two letters in the middle of the root, and hence every migration of middle letters is on the edge of a suffix.



words included both a derivational prefix and a derivational suffix, each of which was adjacent to migrating root letters) (**Table 5**, conditions 3a and 3b).

The control items were **98** monomorphemic and **115** morphologically complex words in which a migration of two adjacent exterior letters created another word. The monomorphemic words included 39 words in which a transposition of the first two letters creates an existing word, 26 words in which a transposition of the last two letters creates an existing word, and 30 in which both the first-second transposition and final-penultimate transpositions create existing words (42 of the morphologically complex words served both in the exterior migration condition and in one of the morpheme conditions, when one side of the word allowed for an exterior migration and the other for migration adjacent to a morpheme); the morphologically complex words were matched in length to the words in the experimental conditions. In the morphologically complex control words the affixes were always on the other side of the words than the expected transposition. They included 56 words with a suffix, in which a transposition of first two letters creates an existing word, and 59 words with a prefix, where a transposition of last two letters creates an existing word (see **Table 5**, conditions 4a and 4b). The words in the different conditions did not differ in frequency [F(3, 499) = 1.56, p = 0.20].
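As a quick consistency check on the list composition, the sketch below recomputes the condition sizes and the 500-word total from the counts stated in the text; the variable names are ours. It assumes, as the text indicates, that words relevant on both sides are counted once per condition and that the 42 shared words are counted once in the list.

```python
# Counts as stated in the Method and Material section.
bound = 116
inflectional = 52 + 55 - 3   # prefix side + suffix side - words relevant on both sides
derivational = 56 + 57 - 4   # same logic for derivational affixes
control_monomorphemic = 98
control_complex = 56 + 59    # suffixed (initial migration) + prefixed (final migration)
shared = 42                  # complex words serving as both control and morpheme items

total = (bound + inflectional + derivational
         + control_monomorphemic + control_complex - shared)

print(inflectional, derivational, control_complex, total)
```

Under these assumptions the condition sizes come out as 104, 109, and 115, and the list contains 500 distinct words.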

# Results

### Participants with Developmental LPD

The results, summarized in **Table 6**, indicated that the rate of transposition errors crucially depended upon whether the transposing letters were adjacent to an inflectional or derivational morpheme, or to a bound function morpheme: Whereas transpositions were abundant for all participants near inflectional and derivational morphemes, they were very scarce near bound function morphemes. Transpositions near bound function morphemes occurred at a low rate that was similar to the rate of exterior letter migrations. This pattern held at the group level and for each of the individual participants. At the group level, transposition errors occurred significantly more often in letters adjacent to inflectional [t(10) = 7.22, p < 0.001, d = 2.6] and derivational [t(10) = 8.09, p < 0.001, d = 2.9] morphemes

TABLE 6 | Percentage migrations in oral reading of single words according to the type of morpheme adjacent to the migration site.


<sup>∗</sup>*Significantly more transposition errors compared with age-matched control group (p* < *0.05).*

than in letters adjacent to bound function morphemes, with no difference between the inflectional and derivational conditions. Similarly, transposition errors occurred significantly more often in letters adjacent to inflectional [t(10) = 6.90, p < 0.0001, d = 2.7] and derivational [t(10) = 7.78, p < 0.0001, d = 3.0] morphemes than in the exterior letters (with no difference in the rate of exterior letter migrations between the two control conditions—the bare root condition of 3-letter words and the longer morphologically-complex control condition—t(100) = 1.52, p = 0.13). Importantly, the rate of transpositions edging a bound function morpheme did not differ from the rate of transpositions of exterior letters.

The same tendency was found for each of the individual participants. All individuals made significantly (p ≤ 0.01) fewer transpositions near bound function morphemes than near inflectional and derivational morphemes (except for MR's inflectional vs. bound comparison, which was in the same direction but not significant). Similarly, each participant made significantly fewer transpositions in exterior letters than near inflectional (p ≤ 0.001) (apart from MR) and derivational (p < 0.05) morphemes. The number of transpositions near inflectional and derivational morphemes did not differ for any individual participant, nor did the bound and exterior letter conditions (p > 0.05).

Another finding sheds light on the early morphological analysis that occurred in the reading of our LPD participants: In total, across all 11 developmental LPD participants in reading all 500 migratable words, there were 58 exterior letter migrations in which two consonant letters transposed (1% of the words they read). None of these involved a root letter transposing with a letter that belonged to the bound function morpheme (or, in fact, any non-root morpheme). Even if we take only words in which an exterior transposition creates an existing word, there were 34 words (a total of 408 target words for all LPD participants) that started with a bound function morpheme, in which a transposition of the letter of the function morpheme and the first letter of the root could create an existing word (e.g., OBDK, vebadak, and-checked, that could create BODK, bodek, checks). However, this error occurred only once—only one participant made one such exterior migration across a morpheme boundary. This supports the conclusion that letter position errors occurred later than the morphological decomposition of the function morpheme from the word to which it was bound.

Additional analyses that explored decompositions and letter position errors in words in which the same letter can function in two different morphological roles are reported in Appendix B.

### The Woman with Acquired LPD

Similarly to the participants with developmental LPD, Galia (see **Table 6**) also made transposition errors mainly adjacent to inflectional (27.3% errors) and derivational (26.9% errors) morphemes. She made only a few transpositions near bound function morphemes (3 errors) and in exterior letters (5 errors). Her transpositions near bound function morphemes were significantly fewer than near inflectional and derivational morphemes (χ<sup>2</sup> = 18.63, p < 0.0001; χ<sup>2</sup> = 17.6, p < 0.0001, respectively). Similarly, her transpositions in exterior letters were significantly fewer than her transpositions near inflectional or derivational morphemes (χ<sup>2</sup> = 59.55, p < 0.0001; χ<sup>2</sup> = 55.65, p < 0.0001, respectively), with no significant difference between the inflectional and derivational conditions (χ<sup>2</sup> = 0.005, p = 0.94). Importantly, she made similar rates of migrations in exterior letters and near bound function morphemes, χ<sup>2</sup> = 0.55, p = 0.46. The two control exterior-migration conditions (bare root and longer words) did not differ, χ<sup>2</sup> = 0.39, p = 0.53.

# Interim Summary: Transpositions and Morphological Structure in Reading Aloud of Single Words

Both the developmental and the acquired LPD participants made significantly more transposition errors near inflectional and derivational morphemes than near bound function morphemes, and their transpositions near bound function morphemes were as scarce as exterior transpositions. No differences were found between the rate of transpositions near inflectional and derivational morphemes. These results indicate that some form of very early morphological decomposition applies to bound function morphemes, at the same time or before letter position encoding takes place. As a result of this early analysis, the bound function morpheme is stripped off the base word, so that the letters at the edge of the word that are adjacent to the bound function morpheme are treated as exterior letters, and hence, very few transpositions occur in them.

# Experiment 2: Oral Reading of Migratable Words in Sentences

Experiment 1 indicated that when words are presented in isolation, there is an effect of early morphological decomposition on migrations in oral reading. Experiment 2 tested the effect of morphology on migrations in oral reading of migratable words in sentences.

# Materials and Methods

The target words were migratable words in which a transposition of root letters could occur adjacent to inflectional, derivational, or bound function morphemes. The test included 30 sentences: the **inflectional condition** included 8 sentences with a word that allowed for a lexical transposition next to an inflectional morpheme (example 5); the **derivational condition** included 7 sentences with a word that allowed for a lexical transposition next to a derivational morpheme (example 6); and the **bound function word condition** included 15 sentences with a word that allowed for a lexical transposition next to a bound function morpheme (example 7, one of the items in the bound function morpheme condition was later excluded from the analysis because many of the participants read the target word with an irrelevant vowel error). Examples (5)-(7) demonstrate sentences of the three conditions, followed by the result of the expected transposition in parentheses.

(5) **Inflectional condition:**

: She occasionally hosts (is-late)


The migratable words in the different conditions did not differ in frequency [F(2, 27) = 0.75, p = 0.50]. The sentences of the various conditions were presented in random order. We constructed the sentences in a way that both the target word and the word that results from the transposition would be syntactically, semantically, and pragmatically plausible in the sentence.

Error analysis focused solely on migrations in the relevant target words. We removed from the analyses 7 sentences in which


TABLE 7 | Migration errors in oral reading of migratable words in sentences (Percentage migrations out of the words presented in the relevant condition).

<sup>∗</sup>*Significantly more errors than age-matched control group, p* < *0.05.*

*<sup>a</sup>In this task EL did not perform differently from the matched controls, but he performed significantly worse than matched controls in most of the other tasks. In general, all participants performed below the control participants in all or most of the tasks reported here.*

the participant made an irrelevant (non-transposition) error on the target word.

### Results

The results of the reading aloud of migratable words within sentences are summarized in **Table 7**. As in Experiment 1, in sentence context as well, the LPD participants made significantly fewer transpositions near bound function morphemes than near inflectional morphemes, t(10) = 7.11, p < 0.0001, d = 2.0, and significantly fewer transpositions near bound function morphemes than near derivational morphemes, t(10) = 2.71, p = 0.02, d = 1.1. There was no difference between the migration error rates in the inflectional and the derivational conditions. Each of the participants showed the same pattern, with no differences between the inflectional and derivational conditions (p > 0.05) but with more transpositions adjacent to inflectional and derivational morphemes than adjacent to bound function morphemes (p < 0.05). Due to the relatively small number of items, this difference reached significance at the individual level only for four LPD participants.

# Experiment 3: Lexical Decision of Single Words

After we established the clear effect of the morphological structure of the target word on the rate of transpositions on the edge of the root in oral reading, we moved to assess whether the same effect is present also in reading tasks that do not involve reading aloud. Experiments 3 and 4 tested migrations in a lexical decision task at the single word and sentence level respectively.

### Materials and Methods

The stimuli list for lexical decision included 105 items: 59 pseudowords and 46 non-migratable real words. The pseudowords included: 20 pseudowords derived from real words by transpositions of the root letters next to an inflectional morpheme (**Table 8**, examples 1a and 1b); 20 pseudowords derived from real words by transpositions of the root letters next to a derivational morpheme (**Table 8**, examples 2a and 2b); and 19 pseudowords derived from a transposition of exterior letters (**Table 8**, examples 3a and 3b). (This task did not include bound function morphemes because we realized it would be unnatural to request the participant to circle words when a word with a bound function morpheme, parallel to, for example, "thatmorning," could be considered as two words.) The side of the transposition in the pseudowords (left or right) was controlled: in each condition, half of the expected migrations were on the right and half on the left (9 on the left and 10 on the right in the exterior condition). The items were presented in random order on paper, and the participants were requested to circle only the real words, without reading aloud.

### Results

The results of the lexical decision task, summarized in **Table 9**, exposed the same pattern: there were significantly more transpositions next to inflectional and derivational morphemes than exterior transpositions [t(10) = 4.59, p = 0.001, d = 0.9; t(10) = 5.96, p = 0.0001, d = 1.4, for inflectional and derivational morphemes, respectively]. Inflectional and derivational morphemes did not differ significantly.
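The group comparisons reported here are within-participant contrasts with 10 degrees of freedom, which is what a paired t-test over 11 participants would yield. A minimal sketch of such a test follows. The error rates in the example are invented for illustration and are not the study's data, and the effect size computed here (mean difference divided by the standard deviation of the differences) is only one common convention; the source does not state which formula was used for d.

```python
import math
import statistics


def paired_t_test(cond_a: list[float], cond_b: list[float]) -> tuple[float, int, float]:
    """Return (t, degrees of freedom, effect size) for paired observations.

    The effect size is the mean difference divided by the sample standard
    deviation of the differences (an assumed convention, see the lead-in).
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    effect_size = mean_diff / sd_diff
    return t_value, n - 1, effect_size


# Invented percentage error rates for 11 hypothetical participants.
near_inflection = [30, 25, 40, 35, 20, 45, 30, 25, 35, 40, 30]
exterior = [5, 10, 5, 0, 10, 15, 5, 5, 10, 5, 0]
t_value, df, effect_size = paired_t_test(near_inflection, exterior)
print(f"t({df}) = {t_value:.2f}, d = {effect_size:.2f}")
```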

Each of the individual participants showed this pattern of more errors on pseudowords that involved transposition next to an inflectional/derivational morpheme than on pseudowords derived by transpositions of exterior letters. This difference was significant or approached significance (p ≤ 0.08) for six of the participants. The inflectional and derivational conditions did not differ significantly, either at the group level or for any individual participant.

TABLE 9 | Percentage errors in lexical decision of migratable nonwords at the word level.

<sup>∗</sup>*Significantly more errors than age-matched control group (p* < *0.05).*

Thus, as in the oral reading experiments (Experiments 1 and 2), individual and group level analyses indicate that lexical decision is also vulnerable to migrations when the pseudoword is derived by transposing letters next to inflectional and derivational morphemes, whereas exterior transpositions are rare. This suggests that letters on the edge of the root, adjacent to an inflectional/derivational morpheme, are considered middle letters by the position encoding procedure.

# Experiment 4: Lexical Decision of Migratable Nonwords in Sentences

To further test the effect of sentential context on migratability of letters near morphemes, we administered a lexical decision test using migratable words incorporated in sentences. Presenting the transposition errors within sentences allowed us to also include transpositions next to bound function morphemes, which we could not use in the single word lexical decision task.

### Materials and Methods

A total of 64 sentences were presented to each participant: 8 sentences with a pseudoword that was formed by transposing the root letters near an inflectional morpheme (example 8); 7 sentences with a pseudoword formed by transposing the root letters near a derivational morpheme (example 9); and 15 sentences with a pseudoword formed by transposing the root letters near a bound function morpheme (example 10). The 9 control sentences included a pseudoword that was formed from a real word by the substitution of a single letter (example 11) and that could not form any existing word following transposition. Twenty-five length-matched sentences written correctly were presented as fillers.
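The constraint on the control items, namely that no transposition of a control pseudoword yields an existing word, can be expressed as a simple check. The sketch below assumes a lexicon given as a set of strings and, as a simplification, considers only transpositions of adjacent letters. The toy English lexicon and the function names are hypothetical, and the source does not describe how the items were actually screened.

```python
def adjacent_swaps(string: str) -> set[str]:
    """All distinct strings obtained by swapping two adjacent letters."""
    swaps = {
        string[:i] + string[i + 1] + string[i] + string[i + 2:]
        for i in range(len(string) - 1)
    }
    swaps.discard(string)  # swapping identical letters changes nothing
    return swaps


def is_migratable(string: str, lexicon: set[str]) -> bool:
    """True if some adjacent-letter swap turns the string into a lexicon entry."""
    return any(candidate in lexicon for candidate in adjacent_swaps(string))


# Toy lexicon, for illustration only.
toy_lexicon = {"form", "from", "trial", "trail", "table"}

print(is_migratable("form", toy_lexicon))   # "from" is in the toy lexicon
print(is_migratable("tible", toy_lexicon))  # a substitution-type control item
```

A control item in the sense described above would be one for which such a check fails for every transposition considered.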

The participants were requested to read each sentence silently and to judge whether the words in the sentence are written correctly or not.

	- Po**dv**t (Po**vd**t) o**d**e**v**et (o**v**e**d**et, work-3rd-sg-fem-present) Danny's mother pseudoword (works) in the kindergarten

### Results

The lexical decision task in which the transposed pseudowords were incorporated into sentences yielded results similar to those of Experiments 1–3 (see **Table 10**). As **Table 10** demonstrates, whereas the LPD participants were able to detect transpositions on the edge of a bound function morpheme quite well, at a level that was not significantly different from the letter substitution control condition, they were less likely to identify transpositions near an inflectional or a derivational morpheme. They detected significantly fewer errors next to an inflectional morpheme than next to a bound function morpheme, t(10) = 6.11, p < 0.001, d = 1.6, and fewer errors next to a derivational morpheme than next to a bound function morpheme, t(10) = 2.49, p = 0.03, d = 0.9. The inflectional and derivational morpheme conditions did not differ significantly from each other. The inflectional and derivational conditions were both significantly poorer than the control letter substitution condition, t(10) = 9.38, p < 0.0001, d = 2.9; t(10) = 3.63, p = 0.005, d = 1.5, respectively.

*\*Significantly more errors than age-matched control group (p* < *0.05).*

At the individual participant level the pattern was similar: all but one participant performed more poorly on the inflectional and derivational conditions (combined) than on the bound function morpheme condition, a difference that was significant for 3 of the participants. There were no differences between the inflectional and derivational conditions for any of the LPD participants.

# Experiment 5: Comprehension of Single Written Migratable Words

Another task we used to examine whether the effect of different morphemes on migrations occurred also in silent reading was a comprehension task. Again, we tested word comprehension in a single word task (Experiment 5) and in words incorporated in sentences (Experiment 6).

### Materials and Methods

We tested the comprehension of 60 migratable words using a word association task. Each migratable word was presented as part of a triad that included, in addition to the target migratable word, a pair of words: one semantically related to the target word, the other semantically related to a transposition error in the target word (examples 12–15). The participants were requested to choose the word that is semantically related to the target word, without reading the target word aloud. Again, the target migratable words in the test were of four types: 15 words with a potential for lexical transposition near an inflectional morpheme (12); 15 words with a potential for lexical transposition near a derivational morpheme (13); 15 words with a potential for lexical transposition near a bound function morpheme (14); and 15 words with a potential for lexical transposition that involved exterior letters (15). In this task too, the inflectional, derivational, and exterior conditions included both words in which the transposition could occur on the left or adjacent to a relevant morpheme on the left of the word, and words in which the transposition was expected on the right. The target words of the various conditions did not differ in frequency [F(3, 56) = 1.47, p = 0.23].
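The frequency check reported above has 3 and 56 degrees of freedom, as expected for a one-way analysis of variance over four conditions of 15 words each. A minimal sketch of that computation follows. The frequency values are invented placeholders, not the study's data, and the source does not specify the frequency measure or the software used.

```python
import statistics


def one_way_anova(groups: list[list[float]]) -> tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variability of the observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder "frequencies": four conditions with 15 items each.
conditions = [
    [float((7 * i + 3 * j) % 11 + 1) for i in range(15)] for j in range(4)
]
f_stat, df_between, df_within = one_way_anova(conditions)
print(df_between, df_within)  # 3 56
```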



(15) exterior migration: rotten (near) – too ripe / not far

### Results

The performance of each participant in each condition is presented in **Table 11**. In general, the performance of the LPD participants in this task was relatively good compared to the other tasks, possibly because this was the only task that explicitly presented two options, which may have prompted a more deliberate attempt to read the words letter by letter to avoid transpositions. Still, in this task too, the participants performed more poorly on triads that involved transposition near inflection and derivation than on triads that involved transposition near bound function morphemes or transposition of exterior letters.

Each of the inflectional and derivational conditions separately yielded significantly poorer performance compared with exterior migration, t(10) = 2.44, p = 0.03, d = 0.9, t(10) = 3.03, p = 0.01, d = 1.4, respectively. Both inflectional and derivational conditions were each poorer than the bound function morpheme condition, a comparison that was significant for the derivational condition, t(10) = 2.31, p = 0.04, d = 1.2, and for the initial inflectional and derivational conditions combined, t(10) = 2.42, p = 0.03. As in the previous experiments, the inflectional and derivational conditions did not differ at the group level, or for any of the individuals with LPD, and neither did the bound and the exterior conditions.

*\*Significantly more errors than age-matched control group (p* < *0.05).*

# Experiment 6: Comprehension of Migratable Words in Sentences

### Materials and Methods

The last experiment tested comprehension of migratable words of the various morphological structures in a more natural task in which the migratable words were incorporated into sentences. The sentences were constructed so that both the target word and the result of the transposition error were plausible in the given sentential context. The participants were requested to read each sentence silently and then to paraphrase it. We assessed whether the paraphrase reflected the target word or its transposition.

The test included 30 sentences, each with a migratable word. We compared the performance on 15 sentences with a word in which the transposition occurred adjacent to an inflectional (10 sentences) or derivational (5 sentences) morpheme, see examples (16) and (17), with 15 sentences with a word in which the transposition occurred adjacent to a bound function morpheme (18). The different conditions did not differ in frequency [F(2, 27) = 0.16, p = 0.80].



Sentences whose paraphrases indicated that the participant read the target word incorrectly but with an irrelevant (nonmigration) error type were excluded from the analysis (16 such sentences were removed in total).

### Results

The comprehension of the migratable words in sentences, summarized in **Table 12**, again indicated that transpositions occurred significantly more often adjacent to inflectional and derivational morphemes than adjacent to bound function morphemes, t(10) = 7.90, p < 0.001, d = 4.0. This pattern held also for each of the participants individually, and was significant for five of them.

# Letter Position Errors and Morphology in LPD: Interim Summary of Experiments 1–6

The pattern that the LPD participants demonstrated was consistent across the six tasks: they made very few migrations adjacent to bound function morphemes, at a rate that was similar to the low rate of exterior letter migrations, indicating they treated letters adjacent to bound function morphemes practically as exterior letters. They made significantly more migrations adjacent to inflectional and derivational morphemes. Their error rates in the various conditions in the six experiments are summarized in **Figure 1**.

One possible alternative explanation for the difference between letter position errors in words with bound function morphemes and words with inflectional/derivational morphemes is that bound function morphemes appear only word-initially, whereas inflectional/derivational affixes appear both word-initially and word-finally (and sometimes even word-internally). However, when we compared only initial affixes, the differences between bound function affixes and inflectional/derivational affixes survived in each of the 4 experiments that included a bound function morpheme: there were significantly more transposition errors near initial inflection and derivation morphemes than near bound function morphemes, in Experiment 1, t(10) = 10.86, p < 0.0001; Experiment 2, t(10) = 3.42, p = 0.006; Experiment 4, t(10) = 3.52, p = 0.005; and Experiment 6, t(10) = 4.62, p = 0.0009.

<sup>∗</sup>*Significantly more errors than age-matched control group (p* < *0.05).*

# Similar Findings from Normal Reading

Throughout Experiments 1–6, we had individuals with normal reading perform the same reading tasks as the LPD participants. They did not make many errors, but we were curious to see whether the few migration errors that occur in normal reading are affected by the morphological structure of the target word.

# Participants with Normal Reading

The participants we analyze in this section are 40 skilled readers, all Hebrew native speakers without any reading impairments according to the TILTAN reading screening test (Friedmann and Gvion, 2003). Twenty-five of them served as age-matched controls in Experiments 1–6 and were described above in the section reporting the control participants (Section Control Group). They were tested in all 6 experiments, in the same conditions as the individuals with LPD, with stimuli presented "over the desk" for unlimited time.

Because this type of presentation yielded very few migrations in the control participants, we also added another group of 15 skilled readers, in more challenging reading conditions of limited exposure times of 300 and 100 ms. These 15 additional participants were 20–63 years old (M = 38.6 years, SD = 14.3), with 12–21 years of education (M = 15.3 years, SD = 2.4). They were tested with the word-level reading aloud, lexical decision, and comprehension tasks described in Experiments 1, 3, and 5.

Average percentage migrations.

# Procedure for the Short Exposure Presentations

For the short exposure tests, the target words from Experiments 1, 3, and 5 were presented on a computer screen for a limited time. The words for each of the three experiments (oral reading, lexical decision, comprehension) were presented in three separate blocks. Each participant saw the same 665 migratable words twice, a week apart; the words in each block were presented in a different order in the two sessions. Because most migratable words appeared in both orders in the word list (if SOFTIM appeared in the list, so did SOTFIM), and the words appeared in a different order in each session, remembering the words from the first session could not affect performance. In the first session all words were presented for 300 ms (without masking). The second session, a week later, presented the same words, in a different order, for 100 ms.

In Experiment 1, the participants were requested to read each word aloud. In Experiment 3, they were requested to say, for each presented stimulus, whether it was an existing word. In Experiment 5, the participants were requested to explain each word in their own words.

# Results: Migrations and Morphology in Normal Reading

The results of the individuals with normal reading, summarized in **Table 13**, show that the error rate in all conditions was rather small, but still an interaction of the rates of migrations with morphological structure could be detected for normal reading as well.

TABLE 13 | Normal reading of migratable words according to the type of morpheme adjacent to the transposition site.

*Average percentage errors (SD).*

# Adults in 100 ms Exposure

Not surprisingly, the condition that yielded the most migrations was the shortest exposure time. In reading aloud, significantly more migrations occurred near inflection or derivation than near bound function morphemes, t(14) = 2.54, p = 0.02, t(14) = 3.85, p = 0.002, respectively. Similarly, significantly more migrations occurred near inflectional or derivational morphemes than in exterior letters, t(14) = 3.71, p = 0.002, and t(14) = 3.88, p = 0.002, respectively. In the lexical decision and the comprehension tasks a similar pattern was evinced, although only the difference between the inflection and exterior letter conditions was significant, t(14) = 2.76, p = 0.02, t(14) = 2.81, p = 0.02, in lexical decision and in the comprehension task, respectively.

# Adults in 300 ms Exposure

In the longer exposure condition the pattern was similar: migrations occurred more often adjacent to inflectional and derivational morphemes than adjacent to bound function morphemes and exterior letters. These differences reached significance only in the comparisons between derivation and bound function morphemes, t(14) = 3.83, p = 0.002, and between inflection and exterior letters conditions, t(14) = 3.6, p = 0.003.

In the lexical decision and comprehension tasks too, more migrations occurred in the letters near inflection and derivational morphemes than in letters near bound function morphemes and exterior letters, but most of these differences did not reach significance [the only significant difference was the one between the derivation and exterior conditions, t(14) = 2.43, p = 0.03].

# Children and Adolescents in Unlimited Presentation

The unlimited presentation yielded even fewer migrations, but the same pattern persisted, although only a few of the comparisons were significant, due to the ceiling effect. In reading aloud of single words, significant differences were found between derivational and bound function morphemes in the 5–6th and 7–8th graders [t(9) = 6.15, p = 0.0002; t(9) = 2.90, p = 0.02, respectively]. There were also differences that approached significance between inflection and bound function morphemes in the 7–9th graders and the 12th graders [t(9) = 2.15, p = 0.057; t(9) = 2.37, p = 0.08, respectively]. Significant differences were also found between inflection and exterior conditions in the 5–6th graders, 7–8th graders, and the 12th graders [t(9) = 3.03, p = 0.01; t(9) = 2.55, p = 0.03; t(9) = 3.76, p = 0.02] and between derivational and exterior letter conditions in the 5–6th graders [t(9) = 6.0, p = 0.0002] and in the 7–8th graders [t(9) = 3.26, p = 0.01]. No differences were found between the bound and the exterior letters conditions.

The pattern of migration errors in reading aloud of migratable words within sentences was similar to that manifested in single word reading. Letters adjacent to inflectional and derivational morphemes yielded more errors than letters adjacent to bound function morphemes. Given the relatively small number of migrations, only the comparisons between inflection and bound function morphemes in the 7–8th graders and derivation and bound in the 5–6 and 7–8th graders reached significance, t(9) = 2.5, p = 0.003; t(9) = 3.2, p = 0.001; t(9) = 2.33, p = 0.04, respectively.

The same tendency was found in lexical decision of single words, where only the difference between derivational and exterior letter conditions in the 5–6th graders reached significance, t(9) = 2.22, p = 0.05. In the lexical decision of nonwords in sentences the same tendency emerged, without significant differences. In comprehension of single words, significant differences were found only in the performance of the 5–6th graders, between derivational and bound function morphemes and between derivational and exterior letters [both comparisons yielded t(9) = 2.24, p = 0.05]. Finally, in comprehension of words within sentences, the 7–8th graders made significantly more errors in the inflectional and derivational condition compared to the bound condition, t(9) = 1, p = 0.02.

# Discussion

This study examined the nature of early morphological decomposition in reading via testing letter position errors that individuals with LPD make in words of various morphological structures. The study was based on the well-established finding that in LPD almost only middle letters migrate whereas exterior letters are less prone to errors. We used this fact to ask whether morphological decomposition occurs prior to letter position encoding: we reasoned that if words are decomposed to their roots and morphological affixes, then letters that used to be internal in the visually perceived complex word become exterior following decomposition (such as in the case of the English word signs, where the letter n is internal in the complex word, but is exterior in the base sign). Thus, if such word-internal base-exterior letters do not migrate, this can indicate that morphological decomposition affects letter position encoding, and hence, precedes it. We compared three types of morphological affixes: inflectional, derivational, and bound function morphemes.

# The Ordering of Morphological Analysis and Letter Position Encoding

The assessment of the effect of morphology on letter position errors in LPD indicated that morphological decomposition follows letter position encoding for inflectional and derivational morphology. This makes sense: it is hard to imagine how morphological analysis of a morphologically complex word can proceed before the order of the letters is encoded (after all, -ment is a suffix, but -nemt is not). We reached this conclusion on the basis of the finding that letter position errors occurred in the root letters adjacent to inflectional and derivational affixes even when morphological decomposition would make these letters exterior and hence less liable to migrations. Namely, letter position errors occurred prior to the analysis of the inflection and derivation in morphologically complex words.

The results also clearly indicated that letter position errors are sensitive to the morphological structure of the target word: whereas the participants made migration errors on the letters that were adjacent to inflectional and derivational morphemes, treating them as middle letters, they made almost no migration errors on root-exterior letters that were adjacent to a bound function word (namely, when they encountered a letter string composed of a bound function morpheme and a word, parallel to the-art in English, they almost never made exterior errors in the word base that would lead to reading it as "the-rat").

We suggest that these results can be explained if one distinguishes between morphological analysis and morphological decomposition. The results are consistent with the following model: the first stages of word reading involve letter identification and letter position encoding. Then, an early, prelexical, morphological analysis takes place, whereby the morphological structure of the word, including inflection, derivation, and bound function morphemes, is analyzed. This analysis is structural in nature, non-lexical, and it relies on knowledge of existing inflectional and derivational templates.

This analysis is enough for morphologically complex words that include inflectional and derivational morphology to access the next stages of reading: the orthographic input lexicon and the sublexical route. Letter position encoding occurs prior to this morphological analysis and hence letter errors affect letters of the root even if they are adjacent to inflectional or derivational morphemes. We assume that in inflectional and derivational morphemes, the system encodes letter position for the whole complex word, and then during the morphological analysis, when the three letters of the root are extracted, they receive their letter position within the root directly as part of the analysis: if the word is XKRh (she-researched), where the root is XKR and the h is the feminine singular affix, it is enough to encode the position of the letters to know that K is the second letter of the root. The same with derivational prefixes like m in the word mXKR (research). In this case, again, the morphological analyser that analyses the m as the derivational prefix and XKR as the root can already assign the letter position of the root letters and hence, again, K will be encoded as the second letter of the root. A letter position encoding error would therefore affect the position of all middle letters in the morphologically complex word, including the letters on the verge of the inflectional and derivational morphemes.

The story is different when this early morphological analysis detects that the letter string cannot be analyzed as including a root and inflectional and derivational morphemes, as is the case in words with bound function morphemes like the-art (which, in English and many other languages, visually appear as two words). In this case, the string is decomposed into the two constituents (the function word the and the word art), and then letter position encoding should take place again on the two constituents (or at least on the base constituent, because the position of the bound function morpheme is already encoded as first). This might be because, once the string is decomposed, the letter positions of the base word (which could be morphologically complex in itself) have changed and need to be re-encoded (a letter that was second prior to decomposition now becomes the first letter in the base)<sup>7</sup>. When this happens, and letter position encoding is applied to the decomposed words, letter position errors again occur only in middle positions of the constituent words, and hence the exterior letters of the constituent words do not migrate<sup>8</sup>. We summarize and exemplify our proposed model in **Figure 2** (and see Appendix C for a transcribed example and a parallel example in English).
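The proposed ordering can be illustrated with a toy sketch. It is not an implementation of the model in Figure 2: it assumes that the morphological parse is already given (as the length of a bound function prefix, if any), it treats only transpositions of adjacent letters, and it uses the English parallel the-art mentioned above. All function names are hypothetical.

```python
def middle_swaps(string: str) -> list[tuple[int, int]]:
    """Index pairs of adjacent letters that are both non-exterior."""
    return [(i, i + 1) for i in range(1, len(string) - 2)]


def vulnerable_swaps(string: str, bound_prefix_len: int = 0) -> list[tuple[int, int]]:
    """Adjacent swaps predicted to be vulnerable to migration.

    bound_prefix_len == 0: no bound function morpheme was detected, so the
    positions encoded over the whole string are kept (the inflectional and
    derivational case). Otherwise the bound prefix is stripped and letter
    position encoding is applied again to the base alone.
    """
    base = string[bound_prefix_len:]
    return [
        (i + bound_prefix_len, j + bound_prefix_len)
        for i, j in middle_swaps(base)
    ]


def swap(string: str, i: int, j: int) -> str:
    """Return the string with the letters at indices i and j exchanged."""
    letters = list(string)
    letters[i], letters[j] = letters[j], letters[i]
    return "".join(letters)


# Counterfactual whole-string encoding: the a-r swap ("therat") is possible.
print([swap("theart", i, j) for i, j in vulnerable_swaps("theart")])
# After decomposition into "the" + "art", "a" is exterior in the base,
# so no swap within the base is predicted to be vulnerable.
print(vulnerable_swaps("theart", bound_prefix_len=3))
```

Under these assumptions, the sketch reproduces the qualitative prediction discussed above: a base-initial letter that is visually word-internal is protected only when the string is decomposed before position encoding is applied to the base.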

Of interest is also the finding that there were practically no transpositions across a function morpheme boundary: the letter of the function morpheme almost never transposed with the letters of the root, which further corroborates the conclusion that morphological decomposition of bound morphemes occurs prior to letter position encoding.

The results showing the morphological effects on letter position errors were consistent across the 6 experiments, on words presented in isolation and within sentences, and were evinced both in the reading of the participants with LPD and in the reading of the skilled readers, who made far fewer errors, but with the same patterns.

# Morphological Analysis is Prelexical

Our results also suggest some further insights as to the nature and locus in the reading process in which morphological analysis occurs: they suggest, like many previous studies, including studies on morphological analysis in peripheral dyslexias, that the morphological analysis does not rely on lexical considerations but rather on a structural analysis of the words.

This conclusion is supported by three findings in the current study: firstly, LPD is a deficit at the letter position encoding function in the early, pre-lexical stage of orthographic-visual analysis. The fact that morphological structure affects letter position errors, at least in the case of bound function morphemes, suggests that morphological analysis occurs in this early stage of orthographic-visual analysis.

<sup>7</sup>An important distinction, which might underlie the reason why words that appear with a bound function morpheme need to be decomposed and return for reencoding of letter positions, is the distinction between words and roots. Whereas derivational and inflectional morphemes in Hebrew (and in Semitic languages in general) appear with a root, bound function morphemes appear with a word. This word, in turn, may be morphologically complex in itself. Morphological analysis of morphologically complex words that are composed of a root and inflectional and derivational affixes is enough to allow access to the lexicon, because the identification of the derivational template and inflections provides information about the slots of the three consonants of the root and their order. Such analysis is impossible when a morphologically complex word appears bound to a bound function word. In this case, it seems that to apply morphological analysis and identify letter positions within the root, the word needs to be decomposed and stripped of the bound morpheme, and then fed to the process again, for reencoding of letter position, followed by the iteration of the morphological analysis stage.

<sup>8</sup>Interestingly, Taft and Nillsen (2013) did not find any difference between priming of prefixed words in which stem-initial letters transposed and those in which stem-interior letters transposed (as was the case in our study with inflectional and derivational prefixes). They explained this finding in that although morphological decomposition is prelexical, the initial letters are less prone to transpositions due to their perceptual salience. The results of our study indicate that there actually can exist cases in which the morphological structure is decomposed in a way that stem-initial letters that are not perceptually initial are resistant to transpositions, hence supporting the idea of early morphological decomposition, but suggesting that the effect of morphology on letter position encoding differs between different morpheme types.

Secondly, there were words that started with a bound function morpheme but could structurally be analyzed as starting with a verbal derivational affix, although the root does not exist with this derivational affix (see Appendix B). Such words were analyzed as starting with a derivational affix, as indicated by the higher rate of migrations adjacent to their first letter. The fact that the analysis created a non-existing word indicates that the analysis was structurally, rather than lexically driven (in line with previous studies such as Longtin et al., 2003; Rastle et al., 2004; Longtin and Meunier, 2005; Rastle and Davis, 2008; Reznick and Friedmann, 2009; Beyersmann et al., 2011; Crepaldi et al., 2014), supporting the conclusion that morphological analysis takes place in an early, pre-lexical stage.

This conclusion of pre-lexical morphological analysis is also supported by the finding that nonwords showed exactly the same morphological effect as words: Experiments 3 and 4 showed that even in morphologically complex nonwords, in which both the whole nonword and its root did not exist, there were far fewer migration errors adjacent to a bound function morpheme than adjacent to inflectional and derivational morphemes<sup>9</sup>.

Such pre-lexical, structurally-based analysis takes place both when reading whole sentences and when morphologically complex words are presented in isolation.

### A Note on Accounts for Developmental Dyslexia

Given that 11 of the participants had developmental LPD, the results also shed light on the source of developmental LPD, and, more specifically, shed light on what cannot be the source of developmental LPD.

Firstly, as in many other cases of developmental LPD (Friedmann and Rahamim, 2007), all of the participants had perfect repetition of migratable words, indicating good phonological production, and all of them had normal picture naming, indicating good lexical retrieval processes. This demonstrates that neither phonological impairment nor lexical impairment can account for their dyslexia (cf. Castles and Coltheart, 2004; Castles and Friedmann, 2014). Rather, they showed a selective deficit in letter position encoding.

Furthermore, some studies ascribe developmental dyslexia to impaired morphology (Shu et al., 2006). Here we actually saw the exact opposite: the preserved morphological ability of our participants with LPD modulated dyslexic errors and protected the letter position dyslexics from making errors in one of the conditions. Additionally, in their reading aloud and in their word repetition they did not make morphological errors: they did not substitute or omit morphological affixes. Thus, developmental dyslexia, or at least developmental LPD, does not originate in a morphological impairment.

Finally, and this applies to both developmental and acquired LPD, some accounts for letter migrations provide visually-based explanations for the relative immunity of exterior letters to letter migration. The current results suggest that this cannot be the whole story, because stem-exterior letters may be immune to migrations even when they visually appear word-interiorly. We saw that first letters of the root, when appearing right after a bound function morpheme, very rarely migrate, and their migration rate is comparable to that of first letters of the root that are also visually exterior. This suggests that morphological-orthographic processing also contributes to the relative immunity of exterior letters.

<sup>9</sup>Whereas words like the-art in Hebrew appear orthographically as one word but can be decomposed structurally based on knowledge of morphology and without any contribution of lexical knowledge, word-word compounds that occur in some other languages may require a different treatment. In languages like German and Italian, compounds may be created from two or more words combined (Kirschfruchtsaft, tostapane, see a recent special issue on compounds, Semenza and Luzzatti, 2014). In these cases, the decomposition of compounds into their constituting words is probably lexically-based rather than purely structural.

Therefore, the results of the current study show that developmental LPD does not stem from a phonological, lexical, morphological, or visual impairment. These results thus are also inconsistent with general claims that do not distinguish between different types of developmental dyslexia, which suggest that one of these factors is the source of developmental dyslexia in general.

# Acknowledgments

This research was supported by the Israel Science Foundation (grant no. 1066/14), US-Israel Binational Science Foundation (BSF grant 2011314), and by the Australian Research Council Centre of Excellence for Cognition and its Disorders (CE110001021).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Friedmann, Gvion and Nisim. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Appendix A

TABLE A1 | Examples for Hebrew words with the root ספר, SPR, with inflectional, derivational, and bound function morphemes. The root meanings relate to stories, numbers, and hair-cutting.


*The root appears in purple, inflectional morphology in orange, derivational in blue, bound function morphemes in cherry red.*

# Appendix B

# What Ambiguous Letters Tell Us About Morphological Decomposition?

We saw that bound function morphemes behave differently from other morphemes in their effect on letter position errors. As a next step we asked whether it is the letter itself that is identified as a "function letter" by the morphological mechanisms in the orthographic-visual analyzer, or whether a first-pass analysis of the whole word is done. For this aim, we made two additional analyses that took advantage of the fact that some Hebrew letters can represent both bound function morphemes and inflectional or derivational morphemes, and some can be both bound function morphemes and root letters.

# Analysis of Letters That Can Function as Bound Function Morphemes or as Inflectional/Derivational Morphemes

The letters ה, מ, ל (corresponding to the function words ha-, me-, le-), when appearing as the first letter of the word, form the bound function morphemes "the," "from," and "for/to" respectively, but they can also function as inflectional and derivational morphemes when they appear as the first letter of the word, before the root. When they function as inflectional/derivational morphemes, they are part of the morphological structure of the word, which often includes other letters in the middle or end of the word that form part of its morphological structure.

We examined the rate of transposition errors near the three ambiguous letters when they functioned as bound function morphemes, and compared them to words in which they functioned as inflectional or derivational morphemes. We also compared the rate of transpositions near these letters when they functioned as bound function morphemes to the rate of transpositions near non-ambiguous bound function morphemes (ב, כ, ו, ש; be-, ke-, ve-, she-), and the rate of transpositions near these letters when they functioned as inflectional and derivational morphemes to the rate of transpositions near non-ambiguous inflectional or derivational morphemes (א, י, ת, נ; a, i, t, n).

This analysis revealed that when the first letter functioned as an inflectional or a derivational morpheme, and was part of a derivational/inflectional structure, the LPD participants made three times more transpositions (13.2%) near it than when it functioned as a bound function morpheme (4%), a difference that was significant, χ<sup>2</sup> = 22.13, p < 0.0001. The letters that can function both as bound and as inflectional/derivational morphemes showed similar transposition rates to the non-ambiguous bound function letters (3%) when they functioned as bound (4%) (χ<sup>2</sup> = 0.29, p = 0.59), and similar to the non-ambiguous inflectional/derivational letters (12.2%) when they functioned as inflectional/derivational (13.2%) (χ<sup>2</sup> = 0.23, p = 0.63). These findings suggest that the difference is not only based on the identification of the first letter as one of a list of "function letters" but is rather based on a wider morphological analysis that looks at the structure of the whole word and its properties.

# Analysis of Letters That Can Function as Bound Function Morphemes or as Root Letters

All letters that are part of morphological affixes can also function as root letters. We analyzed the letters ב and ש (corresponding to the function words be-, in, and she-, that), the two letters for which we had enough instances in which they appeared both as bound function morphemes and as first letters of the root, followed by at least 3 consonant letters.

We compared the rate of transposition errors near the two ambiguous letters when they functioned as bound function morphemes, and when they functioned as the first letter of the root. In this analysis we only included words in which a structural-morphological analysis can identify the role of the first letter: we therefore selected words with at least 4 letters in which the first letter was the ambiguous ב/ש, the 2nd and 3rd letters were consonants, and their transposition created another existing word (17 words in which ב/ש functioned as a root letter and 21 words in which they functioned as bound function morphemes).

For example, the word BXRTM (baxartem, you-pl-chose) starts with the ambiguous letter B, which functions in this word as a root letter. A structural analysis of this string would suggest that the first letter is a root letter because the last two letters form a suffix, so the three consonants of the root need to include the first letter. In contrast, in the word BSPR (ba-sha'ar, in-the-gate), the letter B functions as a bound function morpheme. In this word, there are three consonant letters after the first letter, none of which could be affixes, indicating that the first letter is a prefix, a bound function morpheme.

The results, again, indicated that a full morphological analysis of the word is done, and not only identification of the first letter as one of a list of bound function morpheme letters. Twice as many migrations occurred in the (2nd and 3rd) letters that are adjacent to the ambiguous first letter when the first letter served as a root letter than when it served as a bound function morpheme. An average of 1.7 migrations occurred (for the 11 developmental LPD participants combined) when the first letter (ב/ש) was a root letter, very similar to the rate of errors in words in which the first letter was unambiguously a root letter (1.6), and only 0.7 migrations when it was a bound function morpheme.

This suggests that the morphological analysis takes into account the whole structure of the word: the existence of three consonants that could function as the root and the existence of other letters that can function as derivational or inflectional morphemes.

Finally, a further interesting finding relates to words that structurally could be analyzed as words (verbs) in a derivational template, but for which lexical knowledge indicates that they are a composition of a bound function morpheme and an existing word, whereas the derivational form is a non-word. (For example, the word meaning "the-prisoner," which carries the bound determiner, is cast in the derivational template for causative verbs, but no such verb exists in this template.) Thus, had the analysis been guided by lexical considerations, such words would be analyzed as a bound function morpheme+word, leading to a reduced migration rate. But in fact, they were structurally analyzed as one derivationally complex word, as signaled by the fact that migrations occurred after the first letter. There were 45% migration errors on this word, exactly like the average rate of migrations in real verbs in this template, whereas there were only 3.8% errors in the rest of the words in the list that started with the bound article and did not conform to an existing verbal template.

These analyses indicate that the morphological parsing takes into account possible morphological templates and affixes, and analyzes the whole word. If three consonant letters are followed by letter(s) that are recognized to be a suffix, the first letter is analyzed as a root letter. If, however, a consonant letter that can be a bound function morpheme is followed by a morphological structure that includes three root letters in a known template, it is analyzed as a function morpheme, and hence is subject to decomposition and stripping off from the following word. Notice that such analysis can occur early, structurally, and without any access to the lexicon, but rather be based solely on knowledge of the existing templates and affixes, their position within the word, and the demand for three root consonant letters.
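The two structural rules described above can be illustrated with a small toy sketch. It is only an illustration under strong assumptions: the letter inventories, the suffix list, and the function name `classify_first_letter` are hypothetical placeholders in Latin transliteration, template matching is reduced to a simple length check, and nothing here is the authors' model or stimulus set. The point it tries to convey is that the decision can be made without consulting any lexicon.

```python
# Toy sketch: every inventory below is a hypothetical placeholder written in
# Latin transliteration; it is not the authors' stimulus set or model.
FUNCTION_LETTERS = {"B", "S", "V"}  # letters that may be bound function morphemes (assumed subset)
KNOWN_SUFFIXES = ("TM", "H")        # suffix strings treated as recognizable (assumed)
ROOT_SIZE = 3                       # demand for three root consonant letters


def classify_first_letter(word: str) -> str:
    """Classify the first letter of a transliterated consonant string.

    Returns "root" or "function". No lexicon is consulted: the decision uses
    only letter inventories, affix position, and the three-consonant demand.
    """
    first, rest = word[0], word[1:]
    if first not in FUNCTION_LETTERS:
        return "root"
    # Rule 1: exactly three consonant letters followed by a recognized suffix
    # means the first letter is needed to complete the root.
    for suffix in KNOWN_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) == ROOT_SIZE:
            return "root"
    # Rule 2 (simplified): the remaining letters can still supply three root
    # consonants, so the first letter is stripped as a bound function morpheme.
    if len(rest) >= ROOT_SIZE:
        return "function"
    return "root"


print(classify_first_letter("BXRTM"))  # root
print(classify_first_letter("BSPR"))   # function
print(classify_first_letter("VSPTH"))  # function
```

The transliterated strings are the ones given in the appendices; a fuller sketch would need the actual inventory of templates and affixes, which is not listed in the text.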

# Appendix C

The differential information processing of a derivationally complex word and a word with a bound function morpheme.

The Hebrew examples: a. משפט, MSPT (sentence or trial), derived from the derivational template MXXX and the root SPT. b. ושפטה, VSPTH (and-judged-fem), in which the bound function morpheme "and," V-, is attached to the morphologically complex word she-judged, SPTH, which is itself composed of the root SPT and the feminine singular past inflection -H.

A parallel example in English would be the following. Notice, however, that it cannot reflect the whole story of Semitic languages like Hebrew, because there is no notion of the letters of the root and their order, nor is there a single letter that functions as a bound function prefix:

# Evidence from neglect dyslexia for morphological decomposition at the early stages of orthographic-visual analysis

### Julia Reznick and Naama Friedmann\*

*Language and Brain Lab, Tel Aviv University, Tel Aviv, Israel*

This study examined whether and how the morphological structure of written words affects reading in word-based neglect dyslexia (neglexia), and what can be learned about morphological decomposition in reading from the effect of morphology on neglexia. The oral reading of 7 Hebrew-speaking participants with acquired neglexia at the word level—6 with left neglexia and 1 with right neglexia—was evaluated. The main finding was that the morphological role of the letters on the neglected side of the word affected neglect errors: When an affix appeared on the neglected side, it was neglected significantly more often than when the neglected side was part of the root; root letters on the neglected side were never omitted, whereas affixes were. Perceptual effects of length and final letter form were found for words with an affix on the neglected side, but not for words in which a root letter appeared in the neglected side. Semantic and lexical factors did not affect the participants' reading and error pattern, and neglect errors did not preserve the morpho-lexical characteristics of the target words. These findings indicate that an early morphological decomposition of words to their root and affixes occurs before access to the lexicon and to semantics, at the orthographic-visual analysis stage, and that the effects did not result from lexical feedback. The same effects of morphological structure on reading were manifested by the participants with left- and right-sided neglexia. Since neglexia is a deficit at the orthographic-visual analysis level, the effect of morphology on reading patterns in neglexia further supports that morphological decomposition occurs in the orthographic-visual analysis stage, prelexically, and that the search for the three letters of the root in Hebrew is a trigger for attention shift in neglexia.

Keywords: morphology, morphological decomposition, reading, neglect dyslexia, Hebrew

# 1. Introduction

One of the intriguing questions in the cognitive psychology and neuropsychology of reading relates to how we read words like "segmentation," "absolutely," "smiling," or "kangaroos." If such morphologically complex words are represented in the orthographic lexicon in a decomposed form, access to the lexicon should use morphologically decomposed codes. To allow for such access, a pre-lexical stage of morphological decomposition is required.

Word-based neglect dyslexia (neglexia), a reading deficit in which letters on one side of the word are neglected, provides an interesting opportunity to examine the process of morphological decomposition. Because neglexia occurs at the stage of the orthographic-visual analysis of words, an effect of the morphological structure of words would indicate that such early morphological decomposition occurs at the stage of orthographic-visual analysis, and would enable the examination of the characteristics of this early morphological decomposition.

### Edited by:

*Minna Lehtonen, University of Helsinki, Finland*

### Reviewed by:

*Alessio Toraldo, University of Pavia, Italy*

*Cristina Burani, Institute of Cognitive Sciences and Technologies, Consiglio Nazionale delle Ricerche, Italy*

### \*Correspondence:

*Naama Friedmann, Language and Brain Lab, School of Education and Sagol School of Neuroscience, Tel Aviv University, Ramat Aviv, Tel Aviv 69978, Israel naamafr@post.tau.ac.il*

> Received: *21 October 2014* Accepted: *27 August 2015* Published: *15 October 2015*

### Citation:

*Reznick J and Friedmann N (2015) Evidence from neglect dyslexia for morphological decomposition at the early stages of orthographic-visual analysis. Front. Hum. Neurosci. 9:497. doi: 10.3389/fnhum.2015.00497*

# 1.1. Morphological Representation and Processing of Written Words

The first stage of the reading process is a stage of visual-orthographic analysis, according to the model we assume here, the dual route model for word reading (Morton and Patterson, 1980; Newcombe and Marshall, 1981; Coltheart, 1984, 1985; Marshall, 1984; Coltheart et al., 1993, 2001; Ellis and Young, 1996; Jackson and Coltheart, 2001). This first stage is responsible for recognizing the abstract identity of the letters in the word, for encoding the relative position of letters in the word, and for binding the letters to the words they appear in. The output of the orthographic-visual analysis then enters the orthographic input lexicon, possibly through an orthographic input buffer<sup>1</sup>. The orthographic input lexicon contains the written form of words, and reading proceeds by a search for a word in this lexicon that matches the input information regarding the identity and position of the letters. The information from the orthographic-visual analyzer is also transferred to the other reading route, the sublexical route, which is based on grapheme-to-phoneme conversion and enables the reading of unfamiliar words and of non-words.

There are three main types of approaches to the way in which morphologically complex words are represented in the orthographic input lexicon, from which different approaches are derived for explaining morphological decomposition at the pre-lexical stage.

According to one approach, no morphological decomposition of morphologically complex words occurs pre-lexically (e.g., Manelis and Tharp, 1977; Lukatela et al., 1980, 1987; Butterworth, 1983; Giraudo and Grainger, 2000, 2001). Nonetheless, some of the researchers who hold this full-listing view suggest that morphology does act as an organizing factor of lexical representations in the lexicon (Lukatela et al., 1980, 1987), or alternatively, that morphological decomposition occurs at a postlexical stage (Giraudo and Grainger, 2000, 2001). There are also researchers who completely reject the relevance of morphology to the processing and representation of written words, and claim that the morphological effects that have been found in studies are no more than an expression of the ensemble of associations that exist between words (Seidenberg and McClelland, 1989).

According to the opposite approach, morphological decomposition of morphologically complex words is a necessary part of the process of accessing their lexical representations (e.g., Taft and Forster, 1975; Rastle et al., 2004; Taft and Kougious, 2004; Longtin and Meunier, 2005; Crepaldi et al., 2010, and see Amenta and Crepaldi, 2012, for a review). According to one of these models, words are stripped of their affixes pre-lexically and the stem is used as a lexical unit of access (Affix-Stripping Model, ASM, Taft and Forster, 1975; Taft, 1979, 1981). Another model that postulates obligatory morphological decomposition suggests that word access occurs through the activation of the morphemes that the word is composed of (the Interactive Activation Model, IAM, Taft, 1994).

An intermediate approach, the dual-access approach, postulates that the lexical units of access can be either morphemes and/or whole words (Baayen et al., 1997; Diependaele et al., 2009). Whereas some assume there to be a parallel activation of both the whole-word and the morpheme routes (e.g., Meta Model, Schreuder and Baayen, 1995), others determine the method of access (one route or both in parallel) according to the characteristics and morphological structure of the target word (Augmented Addressed Morphology Model, AAM, Laudanna and Burani, 1985; Burani and Caramazza, 1987; Caramazza et al., 1988; Chialant and Caramazza, 1995; Traficante and Burani, 2003). According to the AAM, both the whole word units and the morpheme units are used to access the lexicon, in which the words are stored in a morphologically decomposed form (at least the regularly inflected words). Thus, according to this approach, morphological decomposition is optional.

A further debate relates to whether early morphological decomposition relies solely on structural, morpho-orthographic pre-lexical analysis (identification of units that enable morphological decomposition) or whether it is based on lexical information (e.g., whether a certain combination of morphemes forms an existing word; see also Meunier and Longtin, 2007).

Whereas most studies of morphological decomposition asked these questions of whether decomposition is obligatory and what its nature is through the assessment of normal reading, mainly using priming tests, the current study approaches these questions from a novel perspective: that of reading in peripheral dyslexia. We examine whether morphological decomposition occurs in the process of lexical access and when it occurs, by studying the effect of the morphological structure of words on reading in neglect dyslexia (neglexia). Given that neglexia is a deficit at the pre-lexical stages of reading, if the morphological structure is found to affect reading in neglexia, this will provide evidence for morphological decomposition, and locate it before the lexicon. We will also assess whether this morphological decomposition is affected by lexical and semantic factors and what guides this early decomposition. This study was conducted in Hebrew, a morphologically rich language, and the following section surveys what is known about the effect of morphology on reading in Hebrew.

# 1.2. Representation and Processing of Morphologically Complex Words in Hebrew

Hebrew is a Semitic language with an alphabetic orthography, read from right to left. As a language with Semitic morphology, most Hebrew words are composed of a tri-consonantal root and affixes. Verbs, nouns, adjectives, and prepositions can include inflectional morphology, and inflect for gender, number, and possessor/genitive; verbs also inflect for tense and person. As for derivational morphology, verbs, nouns, and adjectives are created from a root and a template: verbs are formed in a verbal template called "binyan" (Arad, 2005; Arad and Shlonsky, 2008), nouns and adjectives are inserted into a nominal template ("mishkal"). The inflectional and derivational morphemes may be vowels or consonants. They are not only linearly added to the beginning or end of the root, but may be interwoven, with the root and affixes appearing alternately. The vowels and consonants of one morpheme (word pattern) can appear between the letters of another morpheme (the root), so the letters of the root can be non-adjacent. Thus, affix letters can appear before the root, in the middle of the root, or after it, namely, in the beginning, middle, or end of the word, and often in several positions in the same word (see **Table 1** for examples).

<sup>1</sup>According to some approaches (cf., Sternberg and Friedmann, 2007, 2009) the output of the orthographic-visual analyzer is held in a short term graphemic memory component, the orthographic input buffer, until it is transferred to the orthographic input lexicon and the sublexical route.

All letters in Hebrew can be part of the root; 12 letters can also serve as part of an inflectional or derivational affix, whereas 10 other letters cannot be part of any affix. Some letters can serve as affixes only in the beginning of the word (e.g., ל, א), and other letters can appear as affixes before, within, and after the root (e.g., י, ת), or both before and after the root (e.g., ה)<sup>2</sup>.

In languages with an alphabetic orthography and a linear morphology, the organization of the lexicon reflects, among other things, the orthographic similarity between the words. In Hebrew, the words are thought to be organized according to their morphological structure in the lexicon (Frost et al., 2005; Frost, 2012), and hence, words like מצלמה (mCLMh, maclema, camera)<sup>3</sup> and יצטלם (iCtLM, yictalem, will-be-photographed), which share a root (CLM), are thought to be represented adjacently in the lexicon, even though they are not very similar orthographically (see also the words תספורת and ספריך in the bottom of **Table 1**).

Findings from normal reading of Hebrew, mainly from studies by Avital Deutsch, Ram Frost, and their colleagues (e.g., Frost et al., 1997; Deutsch et al., 1998, 2000) indicate that the root morpheme mediates access to words in the lexicon, as words prime other words with the same root, regardless of semantic relation, and more so than orthographically similar words. Nouns prime nouns with the same root. For verbs, both the root, and the verbal template show priming effects, suggesting that the affix also has a mediating role in lexical access (Deutsch et al., 1998). Even a root that is not an existing word in itself mediates the identification of words that are derived from it (Frost et al., 1997). Morphologically complex non-words that are composed of an existing root and a verbal template also undergo decomposition (Deutsch et al., 1998). Additional findings indicate that the speed of decomposition is similar when the root's consonants are joined or dispersed (Feldman et al., 1995; Frost et al., 1997), providing evidence of the non-linear nature of word scanning in Hebrew.

Morphological decomposition in Hebrew is disrupted in the case of defective roots, which do not include three consonants. The addition of a random consonant to these verbs, which creates a pseudo-root, re-establishes morphological decomposition (Frost et al., 2000a), indicating that the decomposition mechanism in Hebrew does not require an existing root to decompose the verb to its constituents. This finding clarifies that morphological decomposition is guided by the word's structure and not by lexical factors such as whether the root exists in the lexicon.

In Hebrew, there are many words that are morphologically related but not semantically related. Bentin and Feldman (1990), Frost et al. (1997), and Frost et al. (2000b) used this fact to show that morphological effects can occur in the absence of semantic relations between the words in Hebrew. Frost et al. (1997) used a masked priming task and found that priming effects for morphologically related words were almost identical for semantically related and unrelated words. Bentin and Feldman (1990) used delayed repetition priming at long lags, and reached similar conclusions. They compared semantically related pairs (with and without morphological relation) and morphologically related pairs (with and without semantic relation), and showed that words that share the root but are unrelated semantically show significant repetition effects even at long lags, whereas semantic associations showed priming only at short lags. Frost et al. (2000b) used a cross-modal priming task and also found a strong morphological effect beyond the semantic and phonological relations between words. Morphological priming occurred in their task even when there was morphological (both are derived from the same root), but no semantic relation between the prime and the target. Frost et al. (2000b) concluded that morphological priming cannot be accounted for by semantic and phonological factors alone. The broader implications of their study are that the source of the priming effect reflects morphological processes that are not constrained by semantic factors. Furthermore, the results pertain to the lexical organization of words in Hebrew, and probably other Semitic languages: these results suggest that words are organized by a morphological dimension.

It is interesting to compare these conclusions from Hebrew to conclusions drawn from non-Semitic languages like English and Italian. Some studies (e.g., Marslen-Wilson et al., 1994) found evidence for morphological decomposition of semantically transparent forms, but not of semantically opaque ones. In other studies (e.g., Feldman and Soltano, 1999), morphological facilitation was insensitive to semantic transparency in early stages of reading, and semantics became relevant later. Yet other studies of English report, like Hebrew, a non-semantic morphological priming effect. For example, Kempley and Morton (1982) found this effect in long term priming of spoken words presented in noise. They found a strong facilitation from words inflectionally related to the test word (e.g., reflect/reflected). Importantly, there was no facilitation from semantically related words that were not morphologically related, in words with irregular inflection (e.g., lost/loses), suggesting that the facilitation was morphological rather than semantic.

<sup>2</sup>There are also seven bound morphemes in Hebrew, which are represented each by a single letter that is linearly affixed to the beginning of words, parallel to the, that, and, in, from, such as, and to in English. We do not test or discuss this type of morphology in the current paper (see Friedmann et al., 2015, in this research topic for findings regarding the morphological analysis of these prefixes in reading).

<sup>3</sup> In all the graphemic transcriptions throughout this article, root letters appear in capital letters and the rest of the letters are in lower case. The Hebrew words do not include this distinction in the orthography.


TABLE 1 | Examples for inflected and derived words in Hebrew for the root ספר, SPR. The root appears in purple, inflectional morphology in orange, derivational in turquoise. The root meanings relate to stories, numbers, and hair-cutting.

*Some of the words have additional readings, we chose the main ones for simplicity.*

Hence, studies on normal reading of morphologically complex words in Hebrew indicate that this morphological decomposition is a non-semantic, structural process, which extracts the roots from nouns and verbs, and applies even for morphologically complex non-words. In this study, we will examine the stage at which morphological decomposition occurs by studying the effect of morphological structure on the reading of people with a pre-lexical deficit in visual-orthographic analysis—neglexia.

# 1.3. Neglexia

Neglect dyslexia is a type of dyslexia in which one side of the stimulus is neglected, usually the left side. The literature reports neglect dyslexia at the word level and at the text level (de Lacy Costello and Warrington, 1987; Patterson and Wilson, 1990; Haywood and Coltheart, 2001; Friedmann and Nachman-Katz, 2004; Nachman-Katz and Friedmann, 2007; Vallar et al., 2010; Friedmann et al., 2011). This study focuses on acquired neglect dyslexia at the word level, which we term neglexia. Neglexia is manifested in neglect errors in word reading, i.e., omissions, substitutions, and additions of letters, on one side of the target word. Neglexia belongs to the group of peripheral dyslexias, caused by a deficit at the early, pre-lexical stages of orthographic-visual analysis of written words (Caramazza and Hillis, 1990; Riddoch, 1990; Ellis and Young, 1996; Haywood and Coltheart, 2001).

# 1.3.1. The Effect of Morphology on Reading in Neglexia

Although many studies explored in depth many aspects of neglexia (see, for example, Ellis et al., 1987; Riddoch, 1990; Ellis et al., 1993; Haywood and Coltheart, 2001), only a few studies evaluated the role of morphology in neglexia, and neglexia is often thought to be affected by spatial, rather than morphological, factors. For example, Caramazza and Hillis (1990) concluded that "the representation computed at the level of the grapheme description does not contain morphological structure" (p. 420). However, the performance of NG, the participant with right neglect they describe in that article (summarized in their Table 11, p. 420), was actually affected by the morphological structure of the target words. She made significantly more errors on the right side in words that end with suffixes (222/383, 58%) than in words in which the same stems appeared on the right side (with no affixes) (122/383, 32%; χ<sup>2</sup> = 52.77, p < 0.0001).
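The comparison reported above for NG can be checked with a short calculation. The sketch below assumes that the reported statistic is a Pearson chi-square on the 2×2 table of errors versus correct responses, without continuity correction; the text does not state which variant was used, but the uncorrected statistic reproduces the reported value. The function and variable names are illustrative only.

```python
def pearson_chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Counts reported in the text for NG: right-side errors out of 383 items per condition.
suffixed_errors, suffixed_total = 222, 383  # words ending with suffixes
stem_errors, stem_total = 122, 383          # same stems with no affix

chi2 = pearson_chi_square_2x2(
    suffixed_errors, suffixed_total - suffixed_errors,
    stem_errors, stem_total - stem_errors,
)

print(round(100 * suffixed_errors / suffixed_total))  # 58
print(round(100 * stem_errors / stem_total))          # 32
print(round(chi2, 2))                                  # 52.77
```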

Arduino et al. (2002) examined the effect of two morphological measures on oral reading in neglexia: lexical frequency of the words' morphological components and the morphological complexity of the target non-word. They found that some (but not all) of the participants were affected by the frequency of the root and the suffix, reading words in which the morphological components were of high frequency better than words with the same frequency in which the morphological components had lower frequency. Similarly, some (but not all) of the participants read morphologically complex non-words that included a real root and a real suffix better than morphologically simple non-words. These findings (and see also Vallar et al., 2010, for a review) indicate that the morphological structure of the target word affects the reading of some individuals with neglexia. Arduino et al. (2002, 2003) and Marelli et al. (2013) discuss the morphological effect in neglexia and suggest that it results from an interaction of lexical knowledge with the residual perceptual analysis of the neglected portion of the stimulus that is available to the neglexic reader.

In the current study we aim to further explore, using this effect of morphological structure on reading in neglexia, the stage at which morphological decomposition occurs, the mechanism by which neglect errors are affected by the morphological structure, and the nature of morphological decomposition at the early stage of reading. The general rationale was that given that neglexia is a very early deficit in the process of single word reading, then if the morphological structure of the target word affects reading in neglexia, which could not be ascribed to lexical feedback, this would indicate that morphological decomposition occurs at an early stage of the reading process. We will further explore the nature of the effect of morphology by examining whether perceptual effects such as word length and letter forms are sensitive to morphology, which would establish the early stage at which this effect occurs. We will then assess the extent to which lexical and semantic factors modulate the effect of morphology on neglect errors. We will do so by assessing the morphological effects on neglect errors in pseudo roots and pseudo affixes. Namely, we will test the rates of neglect errors of components that can, structurally, be roots/affixes in the target word, but are not real roots/affixes, and compare them to real roots and affixes. We will also examine whether the erroneous responses preserve the semantic or morpho-lexical features of the target word. If these lexical and semantic factors do not have an effect on neglect errors, this would further support the notion that morphological decomposition is active during the early stage of visual-orthographic analysis, and would rule out a mechanism according to which morphology affects neglect errors by way of feedback from later, lexical, stages.

# 2. Method

# 2.1. Participants

Seven individuals with neglexia at the word level following brain damage participated in this study (**Table 2**). All participants had acquired neglexia, as diagnosed using standard language tests (the Hebrew versions of the WAB, Kertesz, 1982; Hebrew version by Soroker, 1997; or the ILAT, Shechther, 1965) conducted when they were admitted to the rehabilitation centers. Six of them had left-sided neglexia, and one had right-sided neglexia. None of the participants had syntactic or morphological problems (according to the WAB and the ILAT). Five of the participants were native speakers of Hebrew (one of them was bilingual), and two participants (T. and K.) had been living in Israel and speaking and reading Hebrew for over 40 years at the time of their stroke. As shown in **Table 2**, some of the participants had a general visuo-spatial neglect, as assessed by the Behavioural Inattention Test (BIT, Wilson et al., 1987), and some also had neglect at the text/sentence level.

# 2.2. Procedure and Material

The participants read aloud a list of single words that end or start with derivational or inflectional affixes (Tiltan Test for Neglexia, Friedmann and Gvion, 2003), with no time limit. If the participant gave several responses for the same target word, only the first response was included in the analysis. Importantly, the words in the list were selected so that a left and/or right sided neglect error on each of these words creates other existing words. The words were presented to the participants as a list, one above the other, in the middle of an A4 white page. Different participants read different numbers of words that were relevant for further analyses, ranging between 88 and 163 words (these differences resulted from some of the patients not being available for more than one meeting, and from differences in their severity of impairment and degree of frustration). Across the list, the same root appeared only once (except for one root that appeared in three morphological templates), and the morphological inflections and derivations of the target words varied so that the same morphological template (derivational + inflectional) repeated four times at most, and most of the morphological templates appeared only once or twice in the list. The protocol was approved by the Tel Aviv University Ethics committee (Department of Psychology), and the participants signed written informed consent forms, which were read and explained to them.

# 2.3. Data Analysis

# 2.3.1. Potential for Lexical Errors

Neglect dyslexia causes letter omissions, letter substitutions, and letter additions in the neglected side. Because it is often the case that individuals with acquired peripheral dyslexias provide mainly lexical responses, the word list was created so that an omission, substitution, or addition of letters on the left or on the right of each of the target words would create existing words.

As will be reported in the Results, most of our participants' neglect error responses (91%) were indeed existing words. Therefore, each of the analyses was calculated out of the set of words that could be created by a neglect error of the relevant type. For example, for the participants with left neglect, the word šorek has lexical potential for omission, substitution, and addition (שורק → שור, שורש, שורקת; šorek → šor/šoreš/šoreket), namely, each of these error types could create an existing word; the word tarnegolim has lexical potential for omission and substitution (תרנגולים → תרנגול, תרנגולות; tarnegolim → tarnegol/tarnegolot) but not for addition, namely, no existing word could result from an addition of a letter to the left of this target word; the word nafsik only has the lexical potential for substitution (נפסיק → נפסיד; nafsik → nafsid). Thus, each analysis was calculated out of the words that had the relevant lexical potential: omissions were calculated only out of the total number of words that allowed for an omission that would create an existing word, and the same for substitutions and additions. Therefore, in the analysis of the total number of words with a lexical potential for omission, words like šorek and tarnegolim were included, but not the word nafsik.

The potential word sets also took into account the neglect point of each participant (e.g., for participants who tended to only neglect the final letter in 4–5 letter words, the potential sets were created accordingly, for words that differ in the final letter only). Potential words that produced infrequently used words were not included (see Section 3.7.4 for the relative frequency of the target word and the lexical error responses).
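The counting rule described above can be made concrete with a minimal sketch. The lexical potentials of the three words are taken from the examples in the text; the responses, the dictionary layout, and the function name `error_rates` are hypothetical and serve only to illustrate how each error type gets its own denominator.

```python
from collections import Counter

# Lexical potential of each target word, as given in the examples in the text.
LEXICAL_POTENTIAL = {
    "šorek": {"omission", "substitution", "addition"},
    "tarnegolim": {"omission", "substitution"},
    "nafsik": {"substitution"},
}

# Hypothetical responses of one reader: target -> error type (None = read correctly).
responses = {
    "šorek": "omission",
    "tarnegolim": None,
    "nafsik": "substitution",
}

def error_rates(potential, observed):
    """Return {error_type: (errors, eligible_words, rate)}.

    Each error type is counted only out of the words for which that
    error type could create an existing word.
    """
    eligible = Counter()
    errors = Counter()
    for word, types in potential.items():
        for error_type in types:
            eligible[error_type] += 1
        made = observed.get(word)
        if made is not None and made in types:
            errors[made] += 1
    return {t: (errors[t], eligible[t], errors[t] / eligible[t]) for t in eligible}

rates = error_rates(LEXICAL_POTENTIAL, responses)
for error_type in sorted(rates):
    made, total, rate = rates[error_type]
    print(f"{error_type}: {made}/{total} = {rate:.2f}")
```

In this toy data set nafsik contributes to the substitution denominator only, so an omission rate is computed out of two words rather than three.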

# 2.3.2. Real Morphological Components vs. Potentially-morphological Components

A component that can be used as a morpheme can be a real morpheme, namely, function as part of the affix in the target word (like –er in dancer in English), or can be potentially morphological, namely, include the letters and be placed in a position in the word that could function as an affix in some words, but not be part of the affix in the target word (like –er in corner). To determine whether a component that can be used as a morpheme has a real morphological role or a potentially morphological role in the specific target word, a list of the relevant words was presented to 10 linguists and psycholinguists who are native speakers of Hebrew. Only words for which the agreement rate with respect to the status of the affix was higher than 70% were included in the analysis comparing real and potential morphological role.

# 2.4. Statistical Analyses

A comparison between conditions for each participant individually was performed using chi-squared (χ<sup>2</sup>) tests or Fisher tests, according to the number of items compared. In all of the tables in the paper, the chi-square values are reported using the χ<sup>2</sup> and p-values, and the Fisher's exact probability test is presented with a p-value. A comparison of the error types at the group level was performed using t-tests, reported with a t-value. The logistic regression coefficients (B-values) are reported, and the binomial tests are presented using z statistics. All tests were conducted with α = 0.05. A non-significant difference was defined as a trend when 0.05 < p ≤ 0.1.
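As an illustration of the individual-level comparisons, the following sketch computes both tests for a 2 × 2 table of errors and correct responses in two conditions. The counts are hypothetical. The text does not state the cutoff used to choose between the two tests, so the rule used here (Fisher's test when any expected cell count is below 5) is a common rule of thumb and an assumption, as are all function names.

```python
from math import comb, erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def chi_square_p_df1(stat):
    """Upper-tail p-value of a chi-squared statistic with one degree of freedom."""
    return erfc(sqrt(stat / 2))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher p-value: sum over all tables no more probable than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x errors in the first condition.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def compare_conditions(a, b, c, d):
    """Assumed selection rule: Fisher's test if any expected count is below 5."""
    n = a + b + c + d
    expected = [r * k / n for r in (a + b, c + d) for k in (a + c, b + d)]
    if min(expected) < 5:
        return "fisher", fisher_exact_two_sided(a, b, c, d)
    return "chi2", chi_square_p_df1(chi_square_2x2(a, b, c, d))

# Hypothetical counts: 12 errors / 28 correct in one condition, 3 / 37 in the other.
method, p = compare_conditions(12, 28, 3, 37)
print(method, round(p, 4))
```

Only the standard library is used, so the p-value for the chi-squared statistic relies on the identity between a one-degree-of-freedom chi-squared variable and a squared standard normal variable.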

# 3. Results

The same analyses were done for the 6 participants with left neglexia and for the participant with right neglexia. We will first present the analyses and findings from the participants with left-sided neglexia in Sections 3.1–3.7, and then, in Section 3.8, the findings from the participant with right-sided neglexia.

# 3.1. Reading Accuracy and Error Types

The participants with left-sided neglexia had between 15% and 57% left-sided neglect errors when reading the word lists, with a group mean of 26% errors (**Table 3**). Almost all the errors the participants made were neglect errors, namely, errors of omission, substitution, or addition of letters on the left of the word, and none of the participants had more than two non-neglect errors, that is, errors that were not confined to the left of the word. Such non-neglect errors amounted to only 1.1% of the total number of words the participants read, supporting the participants' diagnosis of left neglexia. The eight non-neglect errors were excluded from further analyses.

TABLE 3 | Left-sided neglect errors: number and rate of left-neglect errors compared with other non-left errors out of all words presented, and the rate of lexical responses out of the neglect responses of each participant.

Most of the neglect error responses of the participants with left-sided neglexia (91%) were existing words. The neglect errors yielded significantly more lexical than non-lexical (non-word) responses both at the individual level (χ <sup>2</sup> ≥ 37.29, p ≤ 0.001) and at the group level (z = −11.39, p < 0.0001). Only one participant (Z.), who had the highest rate of neglect errors (57% of the words he read), produced more than two non-lexical responses. As a result, we calculated the rate of each type of error out of the target words with a lexical potential of the relevant type. For example, left sided letter omissions were calculated out of the number of words the participant read for which a left letter omission could create an existing word (see Methods Section).
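The group-level z statistic can be illustrated with a minimal sketch of a binomial test under the normal approximation, testing whether the share of non-lexical responses differs from chance (0.5). The counts below are hypothetical and are not the study's data, and the function name is an assumption; the sketch only shows the form of the computation.

```python
from math import erfc, sqrt

def binomial_z(successes, n, p0=0.5):
    """Normal approximation to the binomial test.

    Returns the z statistic and the two-sided p-value for observing
    `successes` out of `n` trials when the null probability is `p0`.
    """
    z = (successes - n * p0) / sqrt(n * p0 * (1 - p0))
    p_two_sided = erfc(abs(z) / sqrt(2))
    return z, p_two_sided

# Hypothetical pooled counts: 9 non-lexical responses out of 100 neglect errors.
z, p = binomial_z(9, 100)
print(f"z = {z:.2f}, p = {p:.2g}")
```

A negative z here simply reflects that the counted category (non-lexical responses) is rarer than expected under the null proportion.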

The neglect errors the participants made included letter omissions (e.g., שורק → שור ;ŠoRQ→ŠOR; šorek→šor), letter substitutions (e.g., שורק→שורש ;ŠoRQ→ŠoRŠ; šorek→šoreš), and letter additions (e.g., שורק→שורקת ;ŠoRQ→ŠoRQt; šorek→šoreket). Although the participants made a larger number of substitution errors (see **Table 4**), this is a result of the number of words in the list that allowed for lexical substitution errors compared with lexical omissions or additions. When the errors of the various types are calculated as rates out of the number of words in which such an error would create an existing word, the rate of omissions, substitutions, and additions becomes similar (**Table 4**). There were similar rates of the various neglect error types at the group level [t(5) ≤ 1.04, p ≥ 0.53]. Similarly, at the individual level, except for T. and C., the analysis of the rates of the three types of neglect errors yielded no significant differences between the different error types (p ≥ 0.08). T. had significantly more substitutions than omissions (p = 0.008) and made only one omission error. C. had significantly more omissions than substitutions (χ <sup>2</sup> = 4.48, p = 0.03). **Table 4** presents the distribution of neglect errors of the three types out of the lexical potential for each type.

# 3.2. The Effect of Morphology on Reading: Root vs. Affix

The first analysis of the role of morphology in reading in neglexia assessed the rate of neglect errors as a function of the morphological status of the left side of the word. Throughout the article, we will use the term "affix" to refer to non-root letters that are part of the nominal or verbal derivational pattern morpheme, or part of an inflectional morpheme. These could occur as an infix, suffix, prefix, or a combination thereof. For the analysis of left-sided neglexia we will use the term "affix" for non-root morphemes that appear on the left side of the word.

We compared the rate of neglect errors (letter omission, substitution, and addition) in words that end (left side) in a root letter (including real and potential roots, see Section 3.7.2) with words that end in an affix (real or potential, see Section 2.3.2). As shown in **Table 5**, all the participants neglected more letters belonging to affixes than root letters. This difference was significant at the group level and for four of the individual participants.

TABLE 5 | Neglect of a root letter in words ending with a root letter and neglect of an affix letter in words ending with an affix.

*In this table and in all of the following tables, the boldface in the comparison column marks a significant difference.*

To rule out a confound of length effect that may have modulated the morphological effect (words ending with a root letter had 3–5 letters, M = 4.1 letters, whereas the words ending with an affix had 4–8 letters, M = 5.2 letters), we compared neglect errors only in 4- and 5-letter words ending with a root or with an affix. In this analysis too, there were significantly more neglect errors in words ending with an affix: for 4-letter words, there were 13% errors in words ending in a root letter and 29% errors in words ending in an affix. For 5-letter words, the rates were 12 and 24%, respectively. In 4- and 5- letter words analyzed together, the left letter was neglected significantly more often when it belonged to an affix (27%) than when it belonged to the root (13%), t(5) = 2.09, p = 0.04. Thus, the morphological role effect in left-sided neglexia is a real effect and cannot be explained by the length effect.
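The group-level comparison can be sketched as a paired t statistic over per-participant error rates. The reported t(5) with six participants is consistent with a paired comparison, but that reading is an assumption, and the rates below are hypothetical values chosen only to make the example run.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical neglect error rates for six readers in length-matched words.
affix_final = [0.30, 0.25, 0.40, 0.20, 0.28, 0.19]
root_final = [0.15, 0.10, 0.20, 0.12, 0.13, 0.08]

t, df = paired_t(affix_final, root_final)
print(f"t({df}) = {t:.2f}")
```

Because each reader contributes one rate per condition, differences in overall severity between readers cancel out in the per-participant differences.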

In conclusion, the reading of participants with neglexia was found to be affected by the morphological role of the left side of the target word: significantly more neglect errors occurred when the left side of the word was part of an affix than when it was part of the root.

# 3.3. Does the Morphological Effect Result from Morphological Decomposition of the Target Word?

A question that arises from these findings is whether letters that are part of the affix are just recognized as letters that can, in general, have a morphological role in some words, or whether, for each word, a morphological analysis of the target word is made that identifies the root and template/inflection, and then the letter is treated as an affix letter when it can be part of the affix in the specific target word, at least according to a structural analysis of the word.

A way to decide between these possibilities comes from the fact that in Hebrew all the letters that can serve as part of an affix can also be part of the root. We used this property of Hebrew to compare two possible explanations: one according to which there is no decomposition but only a list of affix letters, and another according to which the target word undergoes morphological decomposition. We did so by comparing the neglect of the same letters in two roles. Specifically, we compared letters that can take an affix role in some words, when they function as an affix and when they function as the third letter of the root. To do this, we compared neglect error rates in words ending with the letters m (ם) and n (ן) when they function as an affix (e.g., in the word ספרתם, SPRtm, safartem, count-past-2nd-mas-pl, where the m serves as part of the inflection) and when they function as a root letter (e.g., in the word אחלום, aXLoM, axlom, dream-future-1st-sg, where the m serves as the third root letter). Lexical knowledge is not required to identify the letter in the two words as part of the affix or as part of the root: the structure of the word and its derivational templates and inflections indicate whether it is (structurally) a root or an affix letter.

As shown in **Table 6**, this comparison indicated that the participants with neglexia neglected the exact same letters in exactly the same linear position significantly more often when, taking into account the structure of the whole word, these letters functioned structurally as affixes in the target words than when they were part of the root. All the participants showed this pattern, which was significant for B. and Z.

Thus, this comparison indicates that neglect is influenced by the morphological role of the letter in the target word: a root letter or an affix letter, and not by a list of letters that could function as an affix and are thus deleted regardless of their role in the target word. It suggests that an analysis of the structure of the whole word is done, probably on the basis of information about templates and affixes in Hebrew and the search for three consonant letters to serve as a root. This, in turn, indicates that an early morphological analysis of the whole word occurs prior to the stage at which letters are neglected.

# 3.4. The Effect of Morphology on Different Types of Neglect Errors: No Omissions of Root Letters

An analysis of the different types of neglect errors in words ending with a root letter and in words ending with an affix, summarized in **Table 7**, showed that the morphological status affected different neglect errors differently. In target words ending with a root letter, there were significantly fewer omissions than substitutions and additions. For words ending with an affix, no significant difference was found between the rates of the different types of neglect errors.

TABLE 6 | Neglect errors (omissions and substitutions) in the left letters m and n when they appear as part of the affix and as part of the root.

Furthermore, the morphological role affected omissions and substitutions, but not additions: omissions and substitutions occurred more often in words ending with an affix than in words ending with a root letter. For addition errors, no significant difference was found between the two types of words.

The most striking difference between root and affix letters was thus found in the rate of omissions. Why are omissions so sensitive to the morphological status of the letters in the neglected side? In Hebrew, most words are constructed from 3-letter roots and affixes; the root carries most of the meaning of the word and is probably the unit stored in the orthographic input lexicon. We believe that the sensitivity to morphology results from this fact. The results suggest that orthographic-visual analysis is directed by a search for three letters of the root, and the orthographic-visual analyzer refuses, as it were, to stop before it identifies three root letters. This creates the situation in which root letters on the neglected side are almost never omitted. In the reading of all the words ending with a root letter with a potential for omission, across all participants, only a single omission of a root letter was made. It seems that the visual analyzer does not stop shifting attention to the left until three consonant letters that could form the root have been identified.

This pattern also has a direct effect on whether or not the neglect response keeps the length (number of letters) of the target word. In a general analysis across all word types, none of the participants tended to preserve word length: only 33% of the responses preserved the length of the target word. There were more neglect errors that did not preserve word length than neglect errors that preserved word length (a binomial analysis that pooled all the responses of the participants together, z = −4.61, p < 0.0001). This is related to the finding that, as shown in **Table 4**, letter omissions and additions, which changed the length of the word, also occurred, and not only substitutions that preserved word length. Once the preservation of word length is analyzed (see the bottom of **Table 7**), with a separate analysis of words ending with a root letter and with an affix, one can see that there were almost no responses that shortened the word length when the target word ended with a root letter, whereas for words ending with an affix, no significant difference was found between the rates of neglect errors shortening, elongating, or keeping the original word length.

# 3.5. Interim Summary: The Effect of Morphology on Reading in Neglexia

The morphological role of the neglected side of the word has a crucial effect on reading in neglexia: letters on the left side of the word are neglected more often when they function as an affix in the target word than when they function as root letters. This effect is a result of the morphological analysis of the target word and identification of the role of each letter in the target word, as the same letters can sometimes be treated as affixes, and be neglected, or as root letters, and be retained, according to the morphological structure of the target word. The morphological structure is analyzed as a whole, based on knowledge of the morphological structure of Hebrew words, and hence, of possible structures in which the root letters are inserted: the derivational and inflectional templates. The morphological role of the letter mainly affects omission and substitution errors. This indicates that the orthographic-visual analyzer is actively searching for the three root letters. Until these root letters have been detected, attention shifting continues, and these letters are not omitted. When the three root letters are identified, there is no longer a difference between words ending with an affix and words ending with a root, and letter additions occur in both word types to a similar extent.

TABLE 7 | The rate of different types of neglect errors in words ending with a root letter vs. words ending with an affix.

*The analysis summarized in this table includes only words with the relevant lexical potential for each type of error, only lexical neglect errors, and excluding errors that occurred after the first or second letter.*

# 3.6. Perceptual Effects in Reading in Neglexia are Modulated by Morphological Structure

The finding that the morphological structure of the word affects reading in neglexia, which is a pre-lexical impairment, already points to a pre-lexical morphological decomposition. To further examine the locus of morphological decomposition, we examined the effect of perceptual factors, length effect, and final letter-form effect, on the reading of participants with neglexia.

The rationale was that if these perceptual effects differentially affect words that end in a root letter and words that end in an affix, morphological decomposition occurs very early, at the stage in which these perceptual effects apply. We evaluated the existence of these effects for words of all morphological types together, and then moved to assess whether these perceptual effects affect roots and affixes to the same degree.

# 3.6.1. Length Effect is Modulated by Morphological Status

To evaluate the effect of the number of letters in the word on reading, we compared the error rates in words of different lengths: 3 letters, 4 letters, 5 letters, and 6–8 letters. In this analysis, all types of neglect errors were included in the calculation of number of errors, including non-lexical responses. As shown in **Table 8**, four participants showed significantly more errors in longer words, an effect that was significant at the group level too, as indicated by pairwise comparisons as well as a significant linear contrast [F(1, 5) = 15.25, p = 0.01], showing a linear increase in the error rates with the increase in word length.

*The numbers in superscript indicate the lengths that were found to be significantly different. For example, for participant B., a significant difference in the error rates was found between 3 letter words and words with 6–8 letters.*

Importantly, when the calculation of length effect was done separately for words ending with a root letter and words ending with an affix, a different picture emerged. For words ending with an affix, there were more neglect errors in 6–8 letter words than in 5-letter words, whereas for words ending in a root letter, there was no difference in error rates between words of different lengths. In order to assess the effects of word length and word category on subjects' error rates, a logistic regression with a two-way interaction (Word Category × Length) was calculated. This interaction was significant (Wald = 6.31, df = 2, p = 0.04), meaning that word length affected subjects' error rates differentially according to word category. Namely, once the word ended with a letter that was part of the root, the error rate did not increase when the word became longer. Further analysis revealed that this interaction was due to the difference in error rates between 6–8 letter words and 5-letter words for words ending with an affix (Wald = 5.14, df = 1, p = 0.02).
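For a single-degree-of-freedom contrast of this kind, the interaction term of a saturated logistic model over a 2 × 2 design has a closed form: it is the difference between the two log odds ratios, and its variance is the sum of the reciprocals of all eight cell counts. The sketch below uses that closed form rather than a fitted regression, with hypothetical counts and hypothetical labels ("short", "long"); it is not the model specification used in the study and requires that no cell be empty.

```python
from math import erfc, log, sqrt

def interaction_wald(cells):
    """Wald test of the Category x Length interaction in a saturated 2 x 2 logistic model.

    `cells` maps (category, length) -> (errors, correct); all counts must be > 0.
    Returns the interaction coefficient B, the Wald chi-squared (df = 1), and its p-value.
    """
    def log_odds(category, length):
        errors, correct = cells[(category, length)]
        return log(errors / correct)

    # Length effect on the log-odds scale within each word category.
    affix_effect = log_odds("affix", "long") - log_odds("affix", "short")
    root_effect = log_odds("root", "long") - log_odds("root", "short")
    b = affix_effect - root_effect
    se = sqrt(sum(1 / errors + 1 / correct for errors, correct in cells.values()))
    wald = (b / se) ** 2
    p = erfc(sqrt(wald / 2))
    return b, wald, p

# Hypothetical (errors, correct) counts.
cells = {
    ("affix", "short"): (12, 38),
    ("affix", "long"): (24, 26),
    ("root", "short"): (6, 44),
    ("root", "long"): (6, 44),
}
b, wald, p = interaction_wald(cells)
print(f"B = {b:.3f}, Wald = {wald:.2f}, p = {p:.3f}")
```

A positive B in this layout means that the increase in errors with length is larger for affix-final words than for root-final words.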

Relatedly, the presence of a prefix (on the right-hand side of the word) in words ending with a root letter did not raise the neglect error rate in comparison with words without a prefix (משקל vs. שקל; **m**ŠQL vs. ŠQL; miškal vs. šekel), both at the individual level (p ≥ 0.13) and at the group level (t(5) = 1.3, p = 0.12). This finding indicates that the prefix letter is identified as such and is not counted as a root letter.

In summary, words ending with a root letter did not show a length effect, whereas words ending with an affix did show a length effect for 5-letter and 6–8 letter words.

# 3.6.2. Final Letter Form Effect is Modulated by Morphological Status

Hebrew has five letters that change their form according to their position in the word. When they appear in the final (leftmost) position in the word, they bear a different form than when they appear in any other position. These letters have the form פצכמנ in the beginning or middle of the word, and ףץךםן in final position (Friedmann and Gvion, 2005). To assess the effect of the letter form (final vs. non-final) on reading, we compared words ending with a final-form letter with words ending with a letter that does not change its form at the end of the word (from here on "non-final letters").

All of the participants except B. had more neglect errors in words ending with a non-final letter than in words ending with a final letter. This difference was significant for H., Z., and C. (p ≤ 0.03). At the group level, there were more neglect errors in words ending with a non-final letter than in words ending with a final letter (t(5) = 2.06, p = 0.04)<sup>4</sup> .

<sup>4</sup> In Hebrew, six letters protrude beyond the writing line: 5 protrude downwards (ן, ך, ק, ץ, ף), and one upwards (ל). This visual salience did not seem to have an effect on neglect errors. Whereas all the participants made fewer neglect errors in words ending with a protruding letter, at the individual and group level, this was

Similarly to the length effect, the effect of final letter forms on neglect errors was modulated by morphology. Whereas when all the target words are analyzed together, significantly more neglect errors were made in words ending with a non-final letter than in words ending with a final letter, the analysis by morphological status showed that the final letter effect was found in words ending with an affix but not in words ending with a root letter. For words ending with a root letter, no significant difference was found between words ending with final and non-final letters, both at the individual level (p ≥ 0.35) and at the group level (t(5) = 0.97, p = 0.18). In contrast, for words ending with an affix, the group (without B., who showed a reverse trend) made significantly more neglect errors in words ending with a non-final letter than in words ending with a final letter, t(4) = 2.28, p = 0.04. This effect held for each of the individual participants, except B., but was significant only for C. (p = 0.05).

# 3.6.3. Interim Summary: Morphological Structure Affects the Manifestation of Perceptual Effects

Whereas in the calculation of all test words, length and final letter effects were found, these perceptual factors did not affect the reading of words ending with a root letter, only words ending with an affix. Different patterns were also found with respect to neglect errors of different types (omission, substitution, and addition) for the words ending in a root letter vs. words ending in an affix, indicating the greater resilience of words ending with a root letter in comparison to words ending with an affix. The finding that these perceptual effects show differential behavior for words ending in root and affix letters indicates that morphological decomposition occurs very early, at the orthographic-visual perception stage in which the perceptual effects apply.

# 3.7. Does Morphological Decomposition Occur before Access to the Lexicon and to Meaning?

If morphological decomposition is indeed implemented in an early, pre-lexical stage, before the access to the lexicon and to meaning, and without feedback from the lexical stages, we would not expect semantic and lexical variables to affect the reading of the participants with neglexia. We thus examined whether various lexical and semantic factors affect their reading and the manifestation of the morphological effects on their neglect errors. Absence of such effects would support pre-lexical morphological decomposition.

# 3.7.1. Words for Which a Structural Non-lexical Morphological Decomposition Creates a Lexically Incorrect Analysis

One way to examine whether the morphological decomposition occurs at a stage at which lexical factors already play a role, or whether it is guided by purely structural characteristics of the target word, is by examining the reading of words that "trick" or mislead a pre-lexical structural analysis. We used words ending with an affix letter that an early structural morphological decomposition, ignorant of lexical knowledge, would analyze as a root letter. For this analysis we used words that have a defective root of only two letters and a consonantal affix, which could be taken by structural non-lexical analysis to be the third consonant. The rationale was the following: to know that in this specific word there are only two root letters and the final letter is an affix letter, one needs to access the lexicon. Otherwise, a preliminary structural morphological decomposition would take the final consonant to be the third root consonant. Thus, such defective roots offer a way to find out whether the morphological analysis and its effect on neglect errors take into account lexical considerations. If these words behave like words ending with a root, and include fewer omissions than words ending with an affix, this will indicate that the morphological analysis at this stage is structural, and is not guided by lexical considerations, namely, that the morphological analysis that affects neglect errors is pre-lexical.

For example, the word מילון (MiLon, milon, dictionary) is derived from the word מילה (MiLh, mila, word) plus the derivational affix -ון (-on). However, this knowledge, and the relation between word and dictionary, only exist in lexical and semantic stages. Structurally, because the base only has two consonant letters, this word could be analyzed as a word with a 3-consonant root, if the affixal -n is taken to be the third root consonant. To allow for a comparison between words with defective and 3-letter roots, we used words with similar frequencies (M = 4.3, SD = 1.04, for the defective root words, and M = 4.2, SD = 1.16, for the other words we tested, which included three letter roots).

The results were that the participants with neglexia treated these words as if they ended in a root letter, namely, they did not use the information in the lexicon about this word, which would have caused them to treat it as ending with an affix. Each of the participants made fewer neglect errors in these "unclear" words than in words with three root letters clearly ending with an affix, and this difference was significant for B. and C. (p ≤ 0.04). Furthermore, these "tricky" words behaved like the words that end with a root letter: all the participants showed similar neglect error rates for the "tricky" words and for words ending with a root letter, p ≥ 0.25 (and B. even showed marginally significantly fewer errors in the tricky words compared with the root-ending words). And so did all of them as a group, t(5) = 1.04, p = 0.17.

Therefore, we can conclude that morphological decomposition at this stage is structural rather than lexical-semantic: it treats words with only two root letters and a final consonant affix letter like three-consonant root words, considers the left letter to be a root letter rather than an affix letter, and hence does not neglect it. These results also indicate that the morphological effect is a result of morphological analysis of the whole target word rather than a different, simple, treatment of letters that belong to a list of "morphological letters." These results thus indicate that the morphological analysis is structural and can occur without information from the lexical level.

# 3.7.2. Does the Lexicality of the Root Affect Decomposition?

the result of most of the protruding letters being final-form letters, rather than their visual salience. When controlling for the final-form variable and the morphological letter variable, the visually-salient protruding letters are no longer more resilient to neglect errors than the other letters.

Another way of examining whether morphological decomposition occurs before the lexicon and whether it is influenced by the lexicon and semantics is by examining whether the decomposition occurs only when a productive root (i.e., a root that acts as a root in additional semantically-related words) is identified or whether it occurs in every case in which the word structure enables the identification of three consonant letters that can serve as root letters. To examine this, we compared the neglect error rate in words in which the left letter is part of a real productive root with the error rate in words in which the left letter is part of a consonant sequence that is structurally the root but is not a real productive root.

We defined a sequence of consonants as a productive root if the target word was a 3-consonantal verb, or if there was a 3-consonantal verb or an action noun derived from the same root and semantically related to the target word. E.g., the word שתיל (ŠTiL, štil, seedling) includes a real productive root, because its root, ŠTL, serves in the verb שתל (ŠTL, šatal, planted), which is semantically related to it.

No significant difference was found between the neglect error rates in words ending with a productive root letter and in words ending with a potential root letter, at the individual level (p ≥ 0.24) and at the group level (t(5) = 0.24, p = 0.41). Thus, words in which three consonants can structurally serve as a root, even if they are not real productive roots, are morphologically decomposed just like words with a meaningful productive root.

# 3.7.3. Does It Matter if the Affix Letter Really Functions as an Affix in the Target Word?

A similar comparison was conducted for affixes. We analyzed words ending with an affix letter, comparing words ending with a real affix and words ending with a potential affix. A word was defined as ending with a real affix if it included a real 3-letter root or stem that was joined to the affix, and the root/stem was semantically related to the affixed word (e.g., dancer in English). A word was defined as ending with a potential affix if it included three letters with the potential to act as a root that were joined to letters with the potential to be an affix, but the root/stem was not semantically related to the affixed word (e.g., corner in English).

In this comparison too, no significant difference was found between words ending with a real affix (96/278) and words ending with a potential affix (4/19), at the individual level (p ≥ 0.22) and at the group level (t(5) = 1.71, p = 0.07).

These comparisons, at the root and at the affix levels, provide evidence that there is no lexical-semantic effect on the morphological analysis that affects neglect errors, and that this preliminary morphological decomposition does not take the existence of a real root or the semantic relationship between the decomposed word and the target word into account.

# 3.7.4. No Clear Frequency Effect

Another way to evaluate lexical effects on reading was by assessing whether word frequency, which is clearly a lexical factor, affected reading accuracy and neglect errors. We evaluated the relative frequency of the target and response words, as well as the correlation between the target word frequency and the success in reading it.

To examine the relative frequency of the target words and the erroneous responses the participants provided, we presented 30 skilled readers, native speakers of Hebrew, with pairs of words that included the target word and the erroneous response word. The judges were asked to mark the more frequently used word of the two or to mark both of them if they felt that the words had similar frequency. To include only target-response pairs for which there was a clear frequency difference, the target word was defined as more frequent if the ratio [number of judges who chose the target as more frequent / (2 × number of judges who chose the response as more frequent + number of judges who judged the words as similar)] was at least 1.5. The response word was defined as more frequent in the same way, namely if [response / (2 × target + similar)] was at least 1.5.
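The classification rule above can be sketched in a few lines of Python. This is only an illustration of the stated ratio criterion: the function name `classify_pair`, the label strings, the handling of a zero denominator, and the vote counts in the usage lines are all assumptions made for the example and are not taken from the study's data.

```python
def classify_pair(target_votes: int, response_votes: int,
                  similar_votes: int, threshold: float = 1.5) -> str:
    """Label a target-response pair by judged relative frequency.

    Implements the ratio described in the text:
        chosen / (2 * other + similar) >= threshold
    Returns "target", "response", or "unclear" (hypothetical labels).
    """
    def ratio(chosen: int, other: int) -> float:
        denominator = 2 * other + similar_votes
        if denominator == 0:
            # Assumption: a unanimous choice counts as a clear difference.
            return float("inf") if chosen > 0 else 0.0
        return chosen / denominator

    if ratio(target_votes, response_votes) >= threshold:
        return "target"
    if ratio(response_votes, target_votes) >= threshold:
        return "response"
    return "unclear"


# Hypothetical vote counts for 30 judges (not data from the study).
print(classify_pair(24, 2, 4))    # 24 / (2*2 + 4) = 3.0
print(classify_pair(3, 20, 7))    # 20 / (2*3 + 7) is about 1.54
print(classify_pair(10, 10, 10))  # neither ratio reaches 1.5
```

Because the competing choice is weighted twice in the denominator, the two ratios cannot both reach 1.5 for the same pair, so the order of the two checks does not change the outcome.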

To examine the relation between frequency and the participants' performance, the frequencies of the target words were collected through the judgments of 30 native Hebrew speakers. In this judgment, the judges rated the frequency of the word on a 7-point scale from "very rare" to "very frequent."

In the analysis of the relative frequency of the target and response, the participants' performance was characterized by mixed trends. Two of the participants, H. and Z., had a significantly higher percentage of erroneous responses that were more frequent than the target words (p ≤ 0.04), three participants showed no significant difference between the two types of responses, and one participant, T., had a significantly higher percentage of erroneous responses that were less frequent than the target words (p = 0.02).

To examine the effect of frequency on accuracy, we ran a logistic regression with error rates as the dependent variable and word frequency as the independent variable. K's error rate was found to be dependent on word frequency (B = −0.49, p = 0.03). B's error rate was marginally dependent on word frequency (B = −0.39, p = 0.06). The other four participants did not show dependence between error rate and word frequency (−0.20 ≤ B ≤ 0.06, p ≥ 0.33).

# 3.7.5. No Semantic Effects

Another analysis we used to examine whether lexical-semantic factors affect neglect errors focused on the semantic relation between the response and the target word.

# **3.7.5.1. Semantically related and unrelated responses**

We compared neglect errors that result in words semantically related to the target word (e.g., ילדים → ילד, ILDim → ILD, boys → boy) and neglect errors that result in words with no semantic relation to the target word (e.g., ריבה → ריב, RIBH → RIB, jam → quarrel). The analyses were performed on words ending with an affix letter (real or potentially morphological affix).

No significant difference was found between neglect errors that created words semantically related to the target words and neglect errors which were not semantically related to the target words, at the individual level and at the group level [t(5) = 1.7, p = 0.07]. Namely, there was no effect of the semantics of the target word on the erroneous response produced.

# **3.7.5.2. No preservation of morpho-lexical features**

We also examined whether the neglect errors preserved morpholexical features of the target word, such as the lexical category and gender. Preservation of these features can provide evidence that higher processing occurs prior to morphological decomposition, because to know the lexical category and gender of a written word, the reader has to access the syntactic lexicon (Friedmann and Biran, 2003; Biran and Friedmann, 2012). Preservation of morphosyntactic properties of the target word would thus provide evidence that such access to lexical stages has occurred prior to the morphological decomposition, and hence, would indicate that the morphological decomposition is post-lexical.

The analysis in this section only included words for which neglect errors of any type had both the potential for creating a word that preserves the relevant feature and a word that does not preserve this feature (e.g., one of the words in the analysis of lexical category preservation was the noun משק, MŠQ, which could be read with a neglect error as another noun, משקל, mŠQL or as a verb, משקר, mŠQR). We then compared the rate of errors that preserved the relevant feature and errors that did not<sup>5</sup>.

No significant difference was found between neglect errors that preserved the lexical category (noun, verb, adjective) and neglect errors that did not preserve the lexical category, at the individual level (χ<sup>2</sup> ≤ 2.89, p ≥ 0.13) and at the group level (z = 0.58, p = 0.72).

As for the gender feature, in Hebrew there are two grammatical genders, masculine and feminine, both for animate and for inanimate nouns. Adjectives and verbs also inflect for one of the two genders. We tested whether neglect responses preserved the gender or the gender inflection of nouns, adjectives, and verbs. The results indicated that there was no tendency to preserve the gender of the target word in the response, and in fact four of the participants even had a smaller percentage of neglect errors that preserved the gender feature than neglect errors that did not preserve this feature, and for C. this difference was significant (χ <sup>2</sup> = 5.33, p = 0.02). For K. no difference was found between the two types of neglect errors. Thus, these findings indicate that there is no tendency to preserve lexical categories or gender inflection in neglect errors.

# **3.7.5.3. Derivational vs. inflectional errors**

Some studies of Hebrew normal reading suggested that some types of morphemes are decomposed but others are not (Deutsch et al., 1998; Frost et al., 2000b, for example, demonstrated differences between verbal and nominal templates). We examined this issue by comparing neglect errors that reflect inflection processes and neglect errors that reflect derivation processes.

In an analysis of the errors that took into account for each target word the lexical potential for derivational and inflectional errors, no significant difference was found between derivational omissions and inflectional omissions either at the individual level (p ≥ 0.06) or at the group level [t(5) = −0.36, p = 0.63]. In the analysis of substitution errors, also no significant difference was found between derivational substitutions and inflectional substitutions both at the group level [t(5) = 0.45, p = 0.33] and at the individual level, at which none of the participants showed a significant difference between the two types of substitutions (p ≥ 0.45), except for B. (p = 0.04). Similarly, in the analysis of addition errors, no significant difference was found between derivational additions and inflectional additions at the group level [t(5) = −0.13, p = 0.55], and at the individual level, at which none of the participants showed a significant difference between the two types of additions (p ≥ 0.36), except for C. (p = 0.04). Thus, the distinction between derivational and inflectional morphology did not have an effect on the participants' performance, and it seems that both types of morphemes are decomposed at the pre-lexical morphological decomposition stage.

# 3.7.6. Interim Summary: Morphological Decomposition is Structural and Prelexical

The findings in this section indicate that lexical and semantic factors do not affect the neglect pattern of the participants with neglexia. These results indicate that neglect errors occur before written words undergo lexical and semantic processing, and without feedback from these stages.

Indeed, we know that the lexicon affects reading in neglexia in general—a word like artichoke is likely to be read correctly, because no other word exists that results from an omission or substitution of the left letter of the word, and hence, access to the lexicon with the partial information about the letters would activate a single word—artichoke, and the word would be read correctly, unlike the word rice, for example, which could be read as nice, ice, price etc.

However, such lexical considerations could not be the source of the pattern of morphological structure effect that we see here: the words that end with a root letter and the words that end with an affix letter showed different error patterns even though both were selected such that neglect errors would create in each of them existing words. Furthermore, we saw the morphological effects even in pseudo-roots and in defective 2-letter roots that were treated by the structural morphological analysis as 3 letter roots, namely, where there was no lexical support from the constituents.

Therefore, we suggest that the morphological effect results from an earlier stage, of a non-lexical non-semantic preliminary morphological decomposition, that is guided by the morphological structure of the target word and affects the attention shift itself. A relevant metaphor would be a city in which all streets have 5-letter flower names. When one sees a street sign in this city, which is partly covered by a traffic light pole, and hence only sees four letters, he will move his head to see the fifth letter. This is parallel to the shift of attention to access the third letter of the root. If this sign is too far and hence looks blurry, then the lexicon can be helpful if only some of the letters are more easily identified: if the reader, after moving his head sees "?aisy" the lexicon would help and activate the word "daisy."

<sup>5</sup> In determining the sets of possible lexical neglect errors for each word for this analysis, we had to give homographs a special treatment. Homographic words can have different potentials for a neglect error that results in an existing word. For example, the word אהבה (AHVh, ahava) means both the abstract noun love, and the verb love-past-3rd-fem-sg. Thus, a neglect error that changes אהבה to the verb אהבו (AHVo, ahavu, love-past-3rd-pl) can be analyzed in two different ways, depending on the meaning of the target homograph. If we consider ahava as a noun, the substitution is derivational, whereas if we take it to be a verb, the substitution is inflectional. To determine which of the meanings to use in these cases, we collected the judgments of 50 native Hebrew speakers on the relative frequency of the meanings of each homograph. In cases in which there was an agreement of over 95% between judges on which meaning was more frequent, we used the meaning they agreed on. In cases in which the agreement rate was below 95%, we only used potential words that were common to all of the meanings. Homographic target words that were ambiguous between preserving and non-preserving feature were not included in the morpho-lexical feature preservation analysis.

# 3.8. The Effect of Morphology on Reading in Right-sided Neglexia

The reading of R., the participant with right-sided neglexia, was also significantly affected by the morphological status of the neglected side: R. made significantly more neglect errors in words in which the beginning (the right side) was an affix<sup>6</sup> (15/24, 63%) than in words that began with a root letter (7/22, 32%; χ <sup>2</sup> = 4.33, p = 0.04).

Similarly to the participants with left-sided neglexia, R. made significantly fewer omissions in words beginning with a root letter (5/21, 24%) than in words beginning with an affix (12/22, 55%; χ <sup>2</sup> = 4.25, p = 0.04). Moreover, and also similarly to the participants with left-sided neglexia, whereas for words beginning with an affix, significantly more omissions were made than substitution errors (p = 0.001), for words beginning with a root letter, no significant difference was found between the rates of various types of neglect errors (p ≥ 0.21).
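As a worked check of the comparisons reported for R., the sketch below computes a Pearson chi-square statistic for a 2×2 table of error counts. It assumes that the reported values are uncorrected Pearson chi-squares with one degree of freedom; the text does not name the exact test variant, so this is an inference from the fact that the counts given above yield matching values. The function name `pearson_chi2_2x2` is chosen for this illustration only.

```python
import math


def pearson_chi2_2x2(errors_a: int, total_a: int,
                     errors_b: int, total_b: int) -> tuple:
    """Uncorrected Pearson chi-square for two error rates (2x2 table).

    Rows are the two word types; columns are error / no error.
    Returns (chi2, p), with p taken from the chi-square distribution
    with 1 degree of freedom.
    """
    a, b = errors_a, total_a - errors_a
    c, d = errors_b, total_b - errors_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Counts reported in the text for R.
all_errors = pearson_chi2_2x2(15, 24, 7, 22)  # affix-initial vs. root-initial
omissions = pearson_chi2_2x2(12, 22, 5, 21)   # omissions only
print(round(all_errors[0], 2), round(omissions[0], 2))
```

Under this assumption the two statistics round to 4.33 and 4.25, in line with the values reported above.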

Similarly to the findings on left-sided neglexia, R.'s reading was not affected by lexical and semantic factors, suggesting that morphological decomposition occurs prior to access to the lexicon and to meaning also in right-sided neglexia.

# 3.8.1. Real vs. Potential Root

No significant difference was found in the rate of neglect errors between words beginning with a real root letter (6/17) and words beginning with a potential root letter (1/5; p = 0.48).

# 3.8.2. Frequency

No significant correlation was found between the target words' frequency and R.'s success in reading them (B = −0.24, p = 0.27). There was no tendency to produce an error that is more frequent than the target word. In fact, R. made significantly more errors that were less frequent than the target word (38%) than errors that were more frequent than the target (7%), p = 0.005.

# 3.8.3. Semantically Related vs. Semantically Unrelated

No significant difference was found between affix neglect errors that created a response semantically related to the target word (9/29) and affix neglect errors that were semantically unrelated to the target word (5/23; χ <sup>2</sup> = 0.56, p = 0.45).

# 3.8.4. Derivational vs. Inflectional Errors

No significant difference was found between the rate of derivational neglect errors (7/23) and inflectional neglect errors (2/9; p = 0.64).

# 3.8.5. Preservation of Morpho-lexical Features (Lexical Category and Tense)

There was no significant difference between neglect errors that preserved the lexical category of the target word and neglect errors that did not preserve this feature (χ <sup>2</sup> = 2.89, p = 0.13). Additionally, for right-sided neglexia, we examined the preservation of a morphological feature that appears in the right side of the word—the tense inflection. R. made significantly more neglect errors that changed the tense inflection (8/10) than neglect errors that preserved the tense inflection of the target word (2/10; p = 0.01).

In summary, the performance of the participant with right-sided neglexia was consistent with the findings from left-sided neglexia in relation to the effect of the morphological structure of the word on reading performance and to the characteristics of this effect: words beginning with an affix letter were more susceptible to neglect errors than words beginning with a root letter, and the morphological effect on reading was not affected by lexical or semantic factors, a finding that also locates the morphological effect on reading in right-sided neglexia as occurring during visual-orthographic analysis, and pre-lexically.

# 4. Discussion

This study explored morphological decomposition in reading, its nature and where in the process of word reading it occurs. These questions were explored through the analysis of neglect errors in the reading of seven Hebrew-readers with neglexia and the effect of the morphological structure of the target words on their reading. The main findings of this study are:


Frontiers in Human Neuroscience | www.frontiersin.org October 2015 | Volume 9 | Article 497 |

<sup>6</sup>Among the target words there was only one word beginning with a potential affix, and thus we could not compare words beginning with a real affix and a potential affix. The word was removed from the calculations, thus, the category of words beginning with an affix only includes words beginning with a real affix.

<sup>7</sup>Recall that we created the stimuli so that a neglect error in each word can create another existing word. This applied both to words ending in affix letters and to words ending with root letters. Therefore, the larger rate of neglect errors in affixes cannot be ascribed to lexical completion or support from the lexical stage, as both kinds of words would receive equal support from the lexical level. As explained in Section 2.3.1, the analysis of the omission errors, for example, included only words in which the omission of the final letter, be it an affix or a root letter, creates another existing word.


Taken together, these findings indicate that a preliminary structural morphological decomposition occurs at the orthographic-visual analysis stage and is not affected by lexical factors. We will now discuss the location and nature of the morphological decomposition at the early stages of visual-orthographic analysis and the nature of the effect morphology has on reading in neglexia in light of these findings.

# 4.1. The Stage at Which Early Morphological Decomposition Takes Place

The results indicate that morphological decomposition occurs prelexically. The first clue for the pre-lexical application of the preliminary morphological decomposition comes from the main finding of this study: that the morphological structure of the target word had a clear effect on reading in neglexia: affixes were neglected significantly more often than root letters in the neglected side. Given that neglexia is a deficit at the pre-lexical visual-orthographic analysis stage (Caramazza and Hillis, 1990; Riddoch, 1990; Ellis and Young, 1996; Haywood and Coltheart, 2001; Vallar et al., 2010), the effect of morphology on reading in neglexia indicates that initial morphological analysis takes place at the orthographic-visual analysis stage.

Another clue for the stage at which the initial morphological decomposition is performed comes from the differential effect that perceptual factors (length and letter form) have on the neglect of affixes and root letters. These perceptual factors affected words ending with an affix but not words ending with a root letter. This finding also supports the idea that morphological decomposition occurs early, at the orthographic-visual analysis stage, at which perceptual factors are relevant.

Our findings also provide evidence that this prelexical decomposition is not affected by lexical and semantic factors from later stages, and that the effect on attention shift to the neglected side is not lexical. Most importantly, no difference was found between real roots and structurally-possible roots, and no difference was found between affixes that served as real affixes in the target word and potential affixes (like –er in corner); words with defective 2-letter roots ending with an affix consonant letter did not differ from words with three letter roots. These findings indicate that the decomposition is not guided by the lexicon.

In addition, there was no effect of the semantics of the target word on the erroneous response produced and no preference for errors that are semantically related to the target word. No difference was found between neglect errors that involved a derivational change and neglect errors that involved an inflectional change. Furthermore, neglect errors also did not preserve the morpho-lexical features of the words, such as lexical category, gender, and tense inflection. These findings indicate that lexical and semantic information and information on morpho-lexical features of the word are not yet accessible during this early stage of morphological analysis, and thus, that this decomposition occurs at a pre-lexical stage, without lexical feedback. This early morphological decomposition may take place in the orthographic-visual analyzer itself or in an orthographic input buffer that is holding all the information coming from the orthographic-visual analyzer until it is transferred to the lexical and sublexical routes.

These findings join studies from Hebrew concerning the active role of morphology in the lexical access of written words and the organization of the mental lexicon in this language, in the reading of skilled readers without dyslexia, and the centrality of the root in these processes and representations (Frost and Bentin, 1992; Katz and Frost, 1992; Frost et al., 1997, 2005; Deutsch et al., 1998, 2000).

Is early morphological decomposition part of visual analysis only in languages like Hebrew, where morphology plays a dominant role? Findings from normal reading in other languages also indicate that a preliminary morphological decomposition occurs before lexical access (Rastle et al., 2000, 2004; Longtin et al., 2003; Longtin and Meunier, 2005; Meunier and Longtin, 2007; Rastle and Davis, 2008; Beyersmann et al., 2011, 2013; Crepaldi et al., 2014) and that morphological structure even affects the reading of pseudowords, which are clearly not stored in the lexicon (Burani et al., 2006; Traficante et al., 2011).

Work on English by Rastle and Coltheart (2000), Rastle and Davis (2008), and McCormick et al. (2008) emphasizes that morphological decomposition is a pre-lexical phenomenon that already operates at a very early stage of processing of complex words, and is based on orthographic analysis alone, regardless of lexical, semantic, or syntactic characteristics of the target word and its constituents. According to Meunier and Longtin (2007), the analysis that occurs at a preliminary stage of the processing of morphologically complex words is morpho-orthographic, and at the next stages of the reading process, information from higher processing stages is taken into consideration.

The fact that we studied the effect of morphology on reading using morphologically complex words that are constructed from a root, a derivational template, and an inflection allowed us to examine the effect of morphological decomposition that is independent of lexical contributions. To decompose a morphologically complex word in Hebrew, no access to a list of existing roots is needed. Decomposition can rely exclusively on the known structure of derivational templates, inflections, and the placeholders of the roots. This is probably what enabled us to see the very early effect of morphology on neglexia. In contrast, the decomposition of compounds, for example, is crucially dependent upon access to the words composing the compound, because structural knowledge cannot suffice, for example, to know where to segment cowboy in English (or Bauerngartenmischung, Wiesenblumensamen, or Sauerkirschsaft in German). This explains the different findings of studies such as Mozer and Behrmann (1990), Behrmann et al. (1990), and Marelli et al. (2013), who studied compounds. Marelli et al., for example, found that the compound reading of their participants with neglexia was affected by two lexical variables: the type of compound (existing/non-existing) and the location of the head of the compound (right/left). They explained their findings in terms of the effect of the lexical information on the visual processing. Thus, whereas morphological decomposition occurs pre-lexically and is guided by the orthographic structure of the word (also according to Marelli, see for example Amenta et al., 2015), the analysis of compounds requires later stages and access to the lexicon, as compounds cannot be segmented solely based on a structural-orthographic analysis of the target word.

Furthermore, Arduino et al. (2002, 2003) found that the lexicality of the target affects the reading of some of their patients with neglect dyslexia. Such lexical contribution seems to be in effect after the stage into which we tapped in the current study: namely, when the information about the letters on the left side of the word is degraded, and this information is transmitted to the orthographic input lexicon, the lexicon can retrieve words that fit the partial information<sup>8</sup>. This happens, we believe, later than the effect that we described in the current study, where the attention shift has been affected by the morphological structure of the word, even before the lexicon was accessed.

# 4.2. The Nature of the Early Morphological Decomposition

What is the nature and mechanism of this prelexical morphological decomposition? One can consider two options: one is that the preliminary structural decomposition is based on identification of derivational templates and inflectional morphemes that are stored in the prelexical morphological analyzer; the other option is that the prelexical morphological analyzer holds a list of existing roots and the decomposition is based on the identification of an existing root in the target word.

Our findings indicate that the morphological analysis is based on a structural analysis of the morphological structure of the word, and does not rely on a list of existing roots (in line with Rastle and Coltheart, 2000; Rastle and Davis, 2008 and many others). Rather, the results suggest that it relies on information about morphological templates and affixes and their positions within the word. This conclusion is based on the finding that the early morphological decomposition is not sensitive to whether the root that can be structurally extracted from the target word is an existing root.

Further support for the structural nature of root extraction is that in words with defective roots composed of only two consonant root letters that end with an affix letter, the early analyzer mistook the final affix letter to be a third root letter. These findings indicate that lexical considerations and the existence of the root are not the basis for the early morphological decomposition. Theoretical considerations also disfavor an analysis at the orthographic-visual analyzer stage that is based on a list of existing roots, because such an assumption is not parsimonious and actually turns the visual analyzer into a lexicon.

The results indicate that the decomposition is guided by structural principles involving a search for three letters that can function as root letters structurally and not necessarily for an existing root. This finding corresponds with previous evidence concerning the structural quality of the process (Bentin and Feldman, 1990; Frost et al., 2000a,b; see also Rastle et al., 2004, for English, and Davis and Rastle, 2010 for a discussion).

Another indication for the way the structural decomposition is done comes from our finding that the presence of a prefix in words ending with a root letter did not raise the rate of neglect errors in left-sided neglexia in comparison with words without a prefix (e.g., mŠKL vs. ŠKL). Namely, the prefix letter is identified as an affix and is not counted as a root letter, and the search for three root letters continues. This mechanism led to similar neglect error rates in words of different lengths (3, 4, and 5 letters) ending with a root letter. As long as the first letters are identified as possible affixes, the morphological analyzer keeps shifting attention to the left until it identifies a three-letter root. For this procedure to occur, the morphological analyzer should have information about the possible affixes, and where in the word they can appear in their affix role.

Theoretically, this identification of "affix letters" can act in two different ways: letters that sometimes function as part of an affix may be identified as "morphological letters" and be neglected regardless of their role in the target word. Namely, the orthographic-visual analyzer may hold a list of letters that can be part of affixes, and these letters would be neglected even if they are part of the root in the target word. Alternatively, morphological decomposition may take place, according to some structural guidelines (looking for three letters of the root, taking into consideration which letters can play a morphological role of affixes), and then the letter would be judged according to its structural role in the target word and neglected only when it may structurally belong to an affix in the target word. Hebrew provides an excellent opportunity to decide between these two possibilities, as each letter that can be part of an affix can also be part of the root. For example, the final letter m can function as an affix in certain words (as part of the plural affix, or as the 3rd person plural possessive), and can thus be defined as a "morphological letter." But this letter can also serve as part of the 3-consonant root. The findings of this study showed unequivocally that neglect errors took into account the morphological role of the letter in the target word. Namely, the letter was omitted only when it was part of an affix in the target word (structurally, though not necessarily lexically), whereas when it was (structurally, though not necessarily lexically) part of the root, it was not omitted. These findings indicate that the effect of morphology is not due to the orthographic-visual analyzer keeping a list of possibly-morphological letters, which are treated differently than root letters. Rather, these results indicate that the effect of morphology on neglexia is based on a morphological decomposition of the entire word, according to knowledge of inflectional and derivational templates and affixes and of the structure of Hebrew morphologically complex words. This analysis takes into account all the letters in the word and the complete morphological structure, and the structural role of each letter in the target word. Thus, an early, structural, morphological analysis already occurs before the neglect errors are made, leading to the neglect of letters in the neglected side only when they are analyzed structurally as an affix in the target word.

<sup>8</sup>For this reason all the target words in the current study were selected so that a neglect error would create another existing word.

The picture that emerges from these findings and considerations is that during visual-orthographic analysis, the analyzer searches for three consonant letters that can function as the root letters. This search algorithm is based on the recognition of letters that have the potential to function as affixes, and where in the word they function as affixes (see also Crepaldi et al., 2010 for evidence from normal reading that the position of the affix in the word is taken into account, and discussion of this issue in Amenta and Crepaldi, 2012). If the affix letter appears in the relevant position within the target word, the morphological analyzer assumes it is part of an affix, and continues the search for three root letters. This is also the mechanism that protects root letters on the neglected side from omissions in neglexia.

# 4.3. Neglexia and the Root

Reading errors in neglexia result from a deficit in attention allocation to one of the sides of the word. It is known that the spatial and visual framework can affect reading in neglexia. The current study showed that the morphological structure of the target word also affects reading in neglexia, as it modulates the allocation of attention to letters on the neglected side of the target word.

The morphological structure of the Hebrew language and orthography dictates the structure of the orthographic input lexicon, which is organized according to roots (Frost et al., 1997, 2005; Deutsch et al., 1998; Frost, 2012). This lexical organization, in turn, dictates the role of the orthographic visual analyzer—to extract the root that will enable access to the entry in the orthographic lexicon. Because of the important role of the root in lexical access, Hebrew readers, including Hebrew readers with neglexia, search for the letters of the potential root, and this search is a trigger for continued attention shift in neglexia.

The results suggest that morphological decomposition occurs pre-lexically, analyzing and identifying the template, affixes, and the possible root letters according to the structure of the target word. The analyzer identifies root letters and keeps them from omission. An attentional spotlight runs across the word, from right to left, in search of three root letters, and the attention shift in our neglexic participants was guided by this quest.

This quest for the three root letters also explains the finding that length affected words ending with an affix but not words ending with a root letter. When words ended with a consonant that was part of the root, the length of the word did not matter, and neglect errors did not occur more frequently in longer words. This is in contrast to words ending with an affix, for which a significant length effect was found. This indicates that as long as the quest for the three-letter root is not completed, attention shift to the left does not end, regardless of the word length. If the word includes an affix at its end (i.e., on the left), after the three root letters, the spotlight will stop once the three root letters have been identified, and the final affix letters will be neglected. By contrast, if an affix or even several affix letters appear in the word before all the root letters have been identified, and the word ends in a root letter, the spotlight will continue searching and reach the left end to collect the three root letters, no matter how long the word is.

In this view, the effect of morphology on neglexia occurs very early, with the morphological structure directly affecting attention shift. The spotlight does not cease to shift attention to the left until the three root letters are identified. Once three root letters have been found, the spotlight is not "motivated" to search any further, and, given the attentional limitations affecting the left side, it stops, resulting in a neglect error.
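The stopping rule described above can be illustrated with a minimal sketch. It is only a toy model under stated assumptions, not the authors' implementation: each letter is assumed to arrive already labeled with its structural role, the word is given in reading order (index 0 is the rightmost letter), and the names `spotlight_scan`, `ROOT`, `AFFIX`, and the placeholder symbols `R1`, `a1`, etc. are hypothetical. The sketch does not model error probabilities; it only marks which letters lie beyond the point where the spotlight would stop.

```python
from typing import List, Tuple

ROOT = "root"    # hypothetical label: letter analyzed structurally as a root letter
AFFIX = "affix"  # hypothetical label: letter analyzed structurally as an affix letter

# A letter is (symbol, structural_role). Words are lists in reading order,
# so index 0 is the rightmost letter of a Hebrew word.
Letter = Tuple[str, str]


def spotlight_scan(
    word: List[Letter], root_size: int = 3
) -> Tuple[List[Letter], List[Letter]]:
    """Return (attended, at_risk) letters for a right-to-left scan.

    The scan continues until `root_size` root letters have been found.
    Letters after that point (further to the left) are returned as at_risk.
    """
    roots_found = 0
    for i, (_, role) in enumerate(word):
        if role == ROOT:
            roots_found += 1
        if roots_found == root_size:
            # Root complete: the spotlight stops at this letter.
            return word[: i + 1], word[i + 1:]
    # Root never completed before the left edge: the whole word was scanned.
    return word, []


# Word ending (on the left) with affix letters after the three root letters.
affix_final = [("R1", ROOT), ("R2", ROOT), ("R3", ROOT), ("a1", AFFIX), ("a2", AFFIX)]
# Word of the same length ending (on the left) with a root letter.
root_final = [("a1", AFFIX), ("R1", ROOT), ("a2", AFFIX), ("R2", ROOT), ("R3", ROOT)]

print(spotlight_scan(affix_final)[1])  # final affix letters left unattended
print(spotlight_scan(root_final)[1])   # nothing left unattended
```

In this toy model, adding more affix letters before a final root letter never produces at-risk letters, whereas every affix letter added after the third root letter does. This mirrors, in simplified form, the pattern reported above in which length mattered only for words ending with an affix.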

This is in line with findings on the effect of the syntactic structure of sentences on reading in text-based neglect dyslexia. In a study of the reading of sentences with different degrees of obligatoriness of the left component in the sentence, Friedmann et al. (2011) demonstrated that the syntactic structure of the sentence determined whether or not the readers kept shifting their attention toward the left side of the sentence, so that syntax served as a trigger for attention shift to the left of the sentence. A similar effect on neglect errors was also found in two-word compounds in Hebrew, where the right word included a morpho-phonological indication of the existence of another word on the left. This morpho-phonological indication increased the attention shift to the left word and reduced omissions of the left word (Friedmann and Gvion, 2014).

Quite similarly, at the word level, the current study shows that morphology serves as a trigger for attention shifting, and the visual analyzer continues to shift attention to the left side of the word until it identifies the three root letters.

# Acknowledgments

We are grateful to Aviah Gvion, Dror Dotan, Daniel Reznik, Dana Rusou, Inbar Trinczer, and Shira Freedman for their comments on this paper. This study was supported by the Israel Science Foundation (1066/14) and by the ARC Centre of Excellence in Cognition and its Disorders (CCD), Macquarie University.

# References


Kertesz, A. (1982). Western Aphasia Battery. Orlando: Grune & Stratton.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Reznick and Friedmann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
