The role of the frontal cortex in memory: an investigation of the Von Restorff effect

Elhalal, Anat; Davelaar, Eddy J.; Usher, Marius

doi:10.3389/fnhum.2014.00410

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 27 June 2014
Sec. Cognitive Neuroscience
Volume 8 - 2014 | https://doi.org/10.3389/fnhum.2014.00410

Anat Elhalal¹*

Eddy J. Davelaar¹

Marius Usher²,³

¹Department of Psychological Sciences, Birkbeck College, University of London, London, UK
²School of Psychological Science and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
³Department of Experimental Psychology, University of Oxford and Wadham College, University of Oxford, Oxford, UK

Evidence from neuropsychology and neuroimaging indicates that the prefrontal cortex (PFC) plays an important role in human memory. Although frontal patients are able to form new memories, these memories appear qualitatively different from those of controls in that they lack distinctiveness. Neuroimaging studies of memory indicate activation in the PFC under deep encoding conditions, and under conditions of semantic elaboration. Based on these results, we hypothesize that the PFC enhances memory by extracting differences and commonalities in the studied material. To test this hypothesis, we carried out an experimental investigation of the relationship between PFC-dependent factors and semantic factors associated with common and specific features of words. These experiments were performed using free recall of word lists with healthy adults, exploiting the correlation between PFC function and fluid intelligence. As predicted, a correlation was found between fluid intelligence and the Von Restorff effect (better memory for semantic isolates, e.g., the isolate “cat” within category members of “fruit”). Moreover, memory for the semantic isolate was found to depend on the isolate's serial position. The isolate item tends to be recalled first, in comparison to non-isolates, suggesting that the process interacts with short-term memory. These results are captured within a computational model of free recall, which includes a PFC mechanism that is sensitive to both commonality and distinctiveness, sustaining a trade-off between the two.

Introduction

Free recall of word lists is a central experimental paradigm that has driven research on the nature of memory encoding and retrieval processes and their neural substrate (Craik and Lockhart, 1972; Tulving et al., 1994; for a review, see Davelaar and Raaijmakers, 2012). Early memory research has shown that a major factor in enhancing memory performance is the depth of encoding: one remembers more words when attending to semantic relations between the words than when attending to their sound (Craik and Lockhart, 1972). This research also showed the importance of semantic relations between list words, both in the encoding and retrieval processes. For example, many studies have demonstrated enhanced memorability for semantically related words (e.g., Glanzer and Schwartz, 1971; Greene and Crowder, 1984; Davelaar et al., 2006), and others have shown that semantic clustering takes place spontaneously in the free-recall of categorized lists (Bousfield, 1953).

A series of neuropsychological and imaging studies indicate that semantic memory enhancement involves the prefrontal cortex (Moscovitch, 1994; Tulving et al., 1994; Gershberg and Shimamura, 1995; Baldo et al., 2002; Kishiyama et al., 2009; Løvstad et al., 2012). A standardized neuropsychological test which is utilized to measure the frontal lobes' contribution to memory is the California Verbal Learning Test (CVLT-II, Delis et al., 2000). CVLT involves several memory tests including a free recall test of a categorized list. In two different studies using CVLT (Baldo et al., 2002; Alexander et al., 2003) frontal lobe patients showed semantic clustering well below healthy control participants. Converging evidence for frontal lobe involvement in semantic clustering comes from ageing studies. For example, older people produce less clustered output on the CVLT test (norm tables in Delis et al., 2000). Ageing has many shared deficits with frontal lobe impairments, and it was suggested that frontal cortex decline in elderly people is responsible for these impairments (for a detailed list: Haarmann et al., 2005).

One important strategy for utilizing the relation between words to enhance memory performance (which is central to our investigation here) is to look for distinctiveness. For example, Hunt and Lamb (2001) reported that isolate words presented within a list of related words (all of which belong to a single category that is different from the isolate's) have a higher probability of recall in a free recall test. An example is the isolate word “Hour” presented within the animal list “Bear, Pig, Elephant, Deer, Cat, Mouse, Cow, Tiger, Horse, Lion, Rat.” This manipulation belongs to a family of experiments, usually called the “Von Restorff” (VR) paradigm, after Von Restorff, who initially designed it (Von Restorff, 1933). It has also been proposed that the mechanism that mediates the effects of novelty and distinctiveness is related to frontal cortex functions (Fabiani et al., 1998; Daffner et al., 2000; Ranganath and Rainer, 2003; for a detailed overview, see Kishiyama et al., 2009).

More recently, a number of neuropsychological investigations have provided evidence to support the frontal mediation of novelty-based encoding in memory. In particular, studies by Knight and colleagues (Kishiyama et al., 2009; Løvstad et al., 2012) have shown that the prefrontal cortex modulates the Von Restorff effect. Kishiyama et al. (2009) tested 16 patients with unilateral PFC damage (9 left, 7 right) on a Von Restorff paradigm for recognition memory of object images. Whereas the age-matched control group exhibited a novelty advantage in recollection, the patients did not show any novelty advantage. In a follow-up study, Løvstad et al. (2012) observed that patients with orbitofrontal or lateral prefrontal lesions exhibited a reduction in the novelty-induced P3 response over frontal electrodes. The lateral PFC patients showed sustained slow wave activity compared to the orbitofrontal patients, suggesting that the two areas are instrumental in novelty processing and are partially differentiated in their contributions. This conclusion is consistent with a number of theories that attribute several memory processes to the PFC, including the dynamic encoding of both similarities and distinctiveness in the material (Shimamura et al., 1995; Fletcher et al., 2000; Shimamura, 2000; Frith et al., 2004).

The central aim of our study is to examine the factors that contribute to the memory enhancements related to semantic structure and semantic isolates (VR-effect), and their frontal mediation. Research on the VR-effect has concentrated mainly on the improved memory of items that are “physically” distinctive from other items in the experiment (different in color, sound, font, etc.). However, there is evidence that somewhat different processes are involved in the processing of semantic and physical distinctiveness (Fabiani and Donchin, 1995). Surprisingly few experiments have investigated the nature of the semantic Von Restorff effect, and specifically how the serial position of an isolate word affects its enhanced memorability in free recall tests. One would expect the isolate advantage in free recall to depend on the serial position of the isolate in the list. In the extreme case, when the isolate is the first item in the list, it is not yet distinctive, as the participant does not know what comes next. In later serial positions, a few related words have already been presented and thus the isolate word is more distinctive.

We start with an experimental investigation of the correlations between a number of semantic effects in memory recall (in particular VR and semantic clustering) and fluid intelligence, a measure associated with frontal function. The results indicate the presence of individual differences (fluid intelligence), previously associated with the frontal lobes (Duncan et al., 1995, 1996, 2000), that mediate both VR and clustering effects. As both temporal clustering (CVLT) and novelty-based encoding were found to be deficient in patients with frontal lesions, we will take it as our working hypothesis that they are both mediated by a frontal mechanism. This hypothesis is obviously tentative at this stage (see General Discussion). Based on this, we developed a neurocomputational model that makes specific parametric predictions for VR-effects and semantic clustering in free recall. Finally, two experiments were carried out that confirmed these predictions.

Experiment 1: A Correlational Study of the Relationship Between Fluid Intelligence and Semantic Factors in Free Recall

Experiment 1 was designed to explore the role of the frontal cortex in semantically related memory functions. In order to investigate this with healthy young adults, a correlational approach was adopted. A correlation between frontal cortex activity and performance on fluid intelligence tests was previously demonstrated in tasks such as the Cattell Culture Fair Test (Duncan et al., 1995, 1996, 2000). Therefore, participants' performance in this test was measured and translated into IQ scores. The participants carried out a free recall test of different types of lists to measure memory components which are believed to be related to the frontal cortex. Types of lists included semantic isolates (Von Restorff), lists of categorized words, lists from the DRM paradigm, and lists designed to measure proactive interference (PI).

Investigations of frontal patients show that semantic clustering is mediated by the frontal cortex, as frontal lobe patients frequently show semantic clustering well below healthy control participants (Gershberg and Shimamura, 1995; Baldo et al., 2002). In the DRM paradigm (Deese, 1959; Roediger and McDermott, 1995), participants are presented with lists of words, all strong associates of a non-presented target word¹. False memory for the target word is measured. Melo et al. (1999) showed initial evidence of frontal patients producing more false memories than controls on the DRM paradigm, in agreement with the hypothesis that relates utilizing common features between words in memory to PFC functions. It was previously found that frontal patients show enhanced sensitivity to proactive interference (for example, Shimamura et al., 1995). However, other scholars have claimed that it is the release from proactive interference that is the main deficit in frontal patients (Moscovitch, 1982).

Methods

Participants

The study was carried out as part of an undergraduate practical class, in which students participate in an experiment in order to produce their own empirical data and subsequently analyze it. Eighty-one students at Birkbeck College took part in the experiment, with ages ranging from 21 to 50 and a mean of 32. Since the focus of the experiment was semantic effects, non-native English speakers were excluded from the sample: thirty-two participants did not meet the criterion of having lived in an English-speaking country from the age of 12 or younger. Two more participants were excluded because their measured IQ score was 60. This score was assumed not to represent the true IQ of university students, indicating that they did not genuinely engage with the test. Data from the remaining 47 participants were analyzed.

Design

This experiment was of a correlational design. Measured variables were IQ and 5 different types of semantic memory scores.

Materials

The Cattell test was used to measure IQ.

Thirty lists of 12–15 words each were used to measure specific memory functions. All lists involved semantically related words. Categories were chosen to have no overlap to minimize between-lists effects. All categorized lists (except those of the original DRM task) were created using the Van Overschelde et al. (2004) norms.

DRM lists. Six lists were taken from the DRM paradigm (Roediger and McDermott, 1995) from which 15 words were presented; these were all strong associates of a word that was not presented. For example, the words: hot, snow, warm, winter, ice, wet, frigid, chilly, heat, weather, freeze, air, shiver, arctic, frost were presented, which are all associated with the word cold which was not presented. The number of times the non-presented words were falsely recalled was counted. The following DRM lists were used: feelings, colors, food, furniture, weather, medicine (from Roediger and McDermott, 1995).

Von Restorff lists. Six categories were chosen from the Van Overschelde et al. (2004) norms. A list of 11 words was created from each category, and a 12th word from an unrelated category (the isolate) was inserted as the 1st, 6th, or 10th word of the list. For example: Beer, Vodka, Rum, Whiskey, Tequila, Gin, Liquor, Scotch, Martini, House, Bourbon, Daiquiri.
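The construction of such a list can be sketched in a few lines of Python. This is only an illustration: the helper name `make_vr_list` is hypothetical, the words are those of the example above, and serial positions are counted from 1 as in the text.

```python
def make_vr_list(category_words, isolate, position):
    """Return a new list in which `isolate` occupies the 1-based `position`."""
    if not 1 <= position <= len(category_words) + 1:
        raise ValueError("position must lie within the resulting list")
    words = list(category_words)          # copy, so the input is not modified
    words.insert(position - 1, isolate)   # convert to a 0-based index
    return words

drinks = ["Beer", "Vodka", "Rum", "Whiskey", "Tequila", "Gin",
          "Liquor", "Scotch", "Martini", "Bourbon", "Daiquiri"]

# The isolate "House" as the 10th word, as in the example list.
vr_list = make_vr_list(drinks, "House", 10)
```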

Categorized word lists (blocked). Six lists of 12 words were created, with words taken from three different categories (from the Van Overschelde et al. (2004) norms, four words from each category). First the words from the first category were presented, then from the second, then third. For example: Eagle, Robin, Hawk, Crow, Priest, Pope, Bishop, Nun, Barbie, Ball, Puzzle, Lego.

Categorized word lists (cyclical). Six lists of 12 words were created, with words taken from three different categories (from the Van Overschelde et al. (2004) norms, four words from each category). First, a single word from the first category was presented, then a single word from the second, then from the third, and so on. For example: Rose, Ballet, Window, Tulip, Tango, Door, Lily, Salsa, Wall, Iris, Waltz, Floor.
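The difference between the blocked and the cyclical presentation orders can be made concrete with a short sketch. The function names are hypothetical, and the cyclical version assumes that all categories contribute the same number of words, as they do here.

```python
def blocked_order(categories):
    """All words of the first category, then the second, then the third."""
    return [word for category in categories for word in category]

def cyclical_order(categories):
    """One word from each category in turn (equal-sized categories assumed)."""
    return [word for group in zip(*categories) for word in group]

categories = [
    ["Rose", "Tulip", "Lily", "Iris"],
    ["Ballet", "Tango", "Salsa", "Waltz"],
    ["Window", "Door", "Wall", "Floor"],
]

blocked = blocked_order(categories)
cyclical = cyclical_order(categories)
```

With these three categories, `cyclical` reproduces the order of the example list above.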

Proactive interference lists. Six lists of 12 words were constructed. Every two consecutive lists were with words from the same category.

The memory test was created using the E-Prime environment for psychological testing (pack 1.1).

Procedure

The experiment was carried out in a classroom. Participants were instructed to remain silent at all times to avoid interfering with other participants. Participants first performed the Cattell test, followed by the memory test. Participants were told they would be presented with lists of words which they should read subvocally, and thereafter try to remember as many of them as possible in any order.

The memory test consisted of 30 lists of words. Each word was presented for 1.5 s. After each list a recall box appeared and participants typed in as many words as they remembered. Instructions were to press “Enter” after every word, which cleared the recall box for a new word. Pressing “Enter” on an empty box was the agreed sign for no more recalled words. That led to a screen asking them to “press any key to move to the next list.” Participants were given 60 s to recall words, after which the program moved to the same screen automatically. Participants were first briefed on this procedure in a class presentation, which was followed by individual practice prior to the experiment. The whole experiment took under 1 h: the Cattell test lasted about 25 min (including instructions), after which participants moved to a computer room and proceeded with the memory test for 30 min.

Results

IQ scores were calculated based on Cattell test results. Mean IQ for the sample was 107, with a standard deviation of 13. This is above the population mean, as one would expect from a sample of university students.

Memory for isolate words (Von Restorff effects)

IQ was correlated with the total number of words recalled from VR lists: r₍₄₅₎ = 0.335, p < 0.05. In addition, IQ was correlated with different measures of the isolation effect. The correlation between IQ and the number of isolates recalled was r₍₄₅₎ = 0.407, p < 0.005. To show that the impact of IQ is specific to the isolates, the number of isolates recalled was divided by the total number of list words recalled. The resulting measure was correlated with IQ as well: r₍₄₅₎ = 0.336, p < 0.05. Isolates appeared in three serial positions in VR lists: 1st, 6th, and 10th. When the number of isolates recalled from each serial position was considered separately, only the correlation with the 6th position (divided by the total number of words recalled) reached significance: r₍₄₅₎ = 0.403, p < 0.005. For the 1st and 10th positions, correlations were lower and non-significant [for the 1st: r₍₄₅₎ = 0.198, p = 0.181, NS; for the 10th: r₍₄₅₎ = −0.032, p = 0.828, NS]. A further measure of the VR-effect was calculated: the number of times a word is recalled from each of the three serial positions when it is an isolate, compared to when it is not an isolate (this could be found from the other VR lists in which isolates appeared in different serial positions). A significant correlation was again found for the 6th serial position only: r₍₄₅₎ = 0.303, p < 0.05 [for the 1st: r₍₄₅₎ = 0.197, p = 0.185, NS; for the 10th: r₍₄₅₎ = −0.068, p = 0.651, NS]. Probabilities of recall and first recall are presented in Figure 1.
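The measures used in this analysis can be sketched as follows. This is a minimal illustration, not the analysis code of the study: the function names are hypothetical, and the conversion from r to a t statistic is the standard one for a Pearson correlation with n − 2 degrees of freedom.

```python
import math

def relative_isolate_score(isolates_recalled, total_recalled):
    """Isolates recalled as a proportion of all list words recalled."""
    return isolates_recalled / total_recalled if total_recalled else 0.0

def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, df):
    """t statistic for a correlation r with df = n - 2 degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))

# t value corresponding to the reported correlation between IQ and the
# number of isolates recalled (r = 0.407, 45 degrees of freedom).
t_isolates = t_from_r(0.407, 45)
```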

Figure 1. Probabilities of recall (A) and first recall (B) for the Von Restorff lists by the isolate serial position in the list: green for 1, blue for 6 and red for 10.

The results show a higher probability of recall for isolate items at late serial positions, and higher probabilities of first recall for all serial positions tested. The serial position curves are noisy. This could be attributed to the experimental design, which was optimized for correlation analysis; hence there was no counterbalancing or randomization in list presentation. In addition, only two lists were presented per participant per isolate serial position, which may have contaminated the data with word-specific effects. The probabilities of recall and first recall split by the IQ median score are presented in Figure 2. It can be seen that higher-IQ individuals tend to show more advantage for the isolates in comparison to lower-IQ individuals, as verified by the correlation analysis.

Figure 2. Probabilities of recall (left panels) and first recall (right panels), for the Von Restorff lists by the isolate serial position in the list, split by IQ median score (green for higher-IQ individuals, blue for lower-IQ individuals).

Memory for categorized lists: blocked and non-blocked

IQ was correlated with the total number of words recalled from categorized lists: r₍₄₅₎ = 0.335, p < 0.05. In addition, IQ was correlated with different measures of categorization. Categorization factors were calculated in accordance with the CVLT test method (Stricker et al., 2002). Essentially, the number of pairs of same category words that were recalled consecutively was counted and corrected by the number of such pairs expected to occur by chance. High correlations with IQ were found both for the blocked lists [r₍₄₅₎ = 0.464, p < 0.05] and the non-blocked lists [r₍₄₅₎ = 0.367, p < 0.05].
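A chance-corrected clustering score of this kind can be sketched as below. The chance term is an assumption on our part: we use a list-based correction of the form (r − 1)(m − 1)/(N − 1), where r is the number of words recalled, m the number of words per category and N the list length, which may differ in detail from the scoring actually applied. The function name and the example recall sequence are hypothetical.

```python
def clustering_score(recalled, category_of, words_per_category, list_length):
    """Observed minus chance-expected same-category adjacent recall pairs."""
    observed = sum(
        1 for first, second in zip(recalled, recalled[1:])
        if category_of[first] == category_of[second]
    )
    r = len(recalled)
    if r < 2:
        return 0.0
    # Assumed list-based chance correction (see the note above).
    expected = (r - 1) * (words_per_category - 1) / (list_length - 1)
    return observed - expected

category_of = {
    "Eagle": "bird", "Robin": "bird", "Hawk": "bird", "Crow": "bird",
    "Priest": "clergy", "Pope": "clergy", "Bishop": "clergy", "Nun": "clergy",
    "Barbie": "toy", "Ball": "toy", "Puzzle": "toy", "Lego": "toy",
}

# Hypothetical recall output containing two same-category adjacent pairs.
score = clustering_score(["Eagle", "Robin", "Pope", "Nun", "Lego"],
                         category_of, words_per_category=4, list_length=12)
```

A positive score indicates more same-category transitions than the chance term predicts; a score near zero indicates no clustering.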

Probabilities of recall for the blocked and non-blocked lists of categorized words, split by the median IQ score, are presented in Figure 3. This shows an overall advantage for higher-IQ individuals in both conditions (see also Supplement, Figure S1, for first-recall probability, showing that higher-IQ participants have an increased tendency to start recall with the first item of the last category, a novelty effect).

Figure 3. Probabilities of recall for blocked categorized lists (left) and non-blocked categorized lists (right), split by IQ median score (green for higher-IQ individuals, blue for lower-IQ individuals).

Counter to predictions, no correlations were found with any measures of the proactive interference lists, nor the DRM lists. This could be due to the homogeneity of the experimental sample, which may have produced less differentiation than between healthy subjects and patients.

Finally, we report in the Supplement (Figure S2) temporal order effects (lag-CRP; Howard and Kahana, 2002) for randomized and categorized lists (see Discussion).

Discussion

Correlations were found between fluid-IQ and semantic clustering, as well as IQ and the Von Restorff effect for mid-list items. The correlation between IQ and semantic clustering in healthy adults supports similar results found in frontal patients and neuroimaging (Gershberg and Shimamura, 1995; Savage et al., 2001; Baldo et al., 2002). It was expected that a relationship would be established between the PFC and sensitivity to novelty, as well as distinctiveness. This is based on results from other paradigms (Ranganath and Rainer, 2003; Kishiyama et al., 2009; Løvstad et al., 2012), and supported by the newly found correlation between the Von Restorff effect and frontal functions (for mid-list isolates). In agreement with patients and neuroimaging data, correlations were also found between semantic clustering and IQ. Counter to predictions, however, no correlations were found with proactive interference or release from proactive interference.

In order to better understand the mechanism by which the PFC utilizes the relationship between words in a list to enhance both semantic clustering and the memory of semantic isolates, we developed (in the next section) a computational model. This was based on the activation-buffer model that was previously used to account for data in immediate-free-recall (IFR) and in the continuous distractor task, and the dissociations between them (Davelaar et al., 2005, 2006).

Like in SAM (Raaijmakers and Shiffrin, 1980), this model includes a short-term activation buffer, so that the last words in the list would still be active in such a buffer at the time of recall. The frontal mechanism interacts with the buffer function to enhance encoding of semantic relationships between words. The neuropsychological literature described above and the experimental results presented here support the assumption that there are additional memory processes to those implementing the buffer which are mediated by the PFC. While the buffer only maintains some information about previously presented items in an active state, these additional mechanisms play a role in enhancing the encoding of semantic relationships (both similarities and differences) as illustrated by semantic clustering and the VR-effect. We label this model the “categorization-activation-novelty model” (CAN) to reflect the functional role of its components.

The present model does not address one important aspect of memory encoding and retrieval: the changing context. This has been demonstrated to be critical in accounting for order effects in free recall paradigms at different temporal scales (Howard and Kahana, 2002; Polyn and Kahana, 2008; Polyn et al., 2009; see also Figure S2 in the supplement for data from Experiment 1, showing an interaction between lag-recency and semantic relation). This important component will need to be integrated into the present model in future studies (see General Discussion).

Simulation Study: The CAN Model for the Role of the PFC in Memory

We assume that the frontal mechanism has two main components: a fast learning categorization layer and a novelty detection mechanism (see also Grossberg, 1978a,b). The CAN model is comprised of four components: the frontal mechanism, a short term activation buffer, a layer of semantic features and a context mechanism.

Activation Buffer and the Semantic Layer

The activation buffer is a lexical layer (each unit corresponds to a word), with recurrent excitation and global lateral inhibition (Haarmann and Usher, 2001; Davelaar et al., 2005). This leads to 3–4 items being co-active during the list presentation, allowing the model to account for serial position functions (both recency and primacy) in IFR (Davelaar et al., 2005). Since we focused here on semantic relations between the words in the list, we included for each word a semantic representation in a semantic layer whose units correspond to the semantic features of the word. Each lexical unit is thus connected with the semantic units that correspond to its “semantic representation” (see Figure 4). Words from the same category were assumed to have shared semantic representations. Finally, we assumed the existence of a list-context representation (Anderson and Bower, 1972). During list presentation, this representation becomes connected with the activated units in the semantic and the lexical layers.

Figure 4. Schematic diagram of the encoding stage of the model. The top layer is the lexical layer, the middle is the semantic features layer. Context is represented by a single unit (list context) in the bottom-left, and the categorization layer (component of the PFC) is illustrated in the bottom-right. During list presentation, lexical units receive input. Activation spreads to the semantic features layer and further to the categorization layer. Episodic encoding by Hebbian learning takes place between the context and the lexical, semantic and categorization layers. In addition, fast Hebbian learning takes place between the categorization layer and the semantic layer.

Categorization Layer

The categorization layer is comprised of multiple units with strong competition and initial random connectivity to the semantic features. During the encoding phase, when a member of a certain category gets input, its semantic representation in the semantic layer becomes active. The activation sent to the categorization layer causes a categorization unit (or a few units in some cases) to win the competition in this layer. Hebbian learning then takes place between the semantic features layer and the categorization unit, hence the categorization unit is dynamically learning a new category online. The more category members presented in the list, the stronger the connections become between the category's shared features and the categorization unit. However, if an item from a different category is presented, its features activate a different categorization unit (due to the initial random connections). The new categorization unit then learns the second category. At retrieval, activation sent from the categorization layer to the semantic features layer enhances recall of category members consecutively, leading to semantic clustering.
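The online category learning described above can be illustrated with a simplified sketch. It replaces the competitive dynamics with a hard winner-take-all choice, and all sizes, weight ranges and the learning rate are assumptions chosen for illustration rather than values from the model.

```python
import numpy as np

def present_item(features, weights, learning_rate=0.5):
    """One categorization step with a hard winner-take-all competition.

    The unit receiving the largest input wins and strengthens its
    connections to the currently active features (Hebbian learning).
    """
    net_input = weights @ features
    winner = int(np.argmax(net_input))
    weights[winner] += learning_rate * features   # updated in place
    return winner

rng = np.random.default_rng(0)
n_features, n_units = 45, 10                      # illustrative sizes
weights = rng.uniform(0.0, 0.1, size=(n_units, n_features))  # weak random start

category = np.zeros(n_features)
category[:5] = 1.0                                # five shared category features

# Four members of the same category are presented in succession.
winners = [present_item(category, weights) for _ in range(4)]
```

Because the winning unit's connections to the shared features grow with every presentation, the same unit keeps winning for further members of that category. Which unit responds to a second category with non-overlapping features depends, in this sketch, only on the random initial weights.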

Novelty Detection and Encoding

A second frontal component detects novelty in the input by comparing the predicted (a leaky integrated value of its history; see Detailed Description) activation values in the semantic features layer to the current ones. When a large difference is detected, a “surprise” signal is produced by the novelty mechanism which boosts both the effect of the sensory input (this can be viewed as enhanced attention) and the learning rate between the categorization layer and the semantic features layer. This mechanism enables simulation of the Von-Restorff effect through the stronger encoding of an isolate. It also enables simulating the dependency of the Von-Restorff effect on the isolate's serial position as found in the experiment above. When the isolate is in the first serial position, there is still no prediction, and therefore no advantage in encoding. When a few category members have already been presented before the isolate, the frontal mechanism detects the isolate's novelty and enhances activation and learning. Therefore, the isolate can be more easily encoded and later recalled. It is predicted that the closer the isolate is to the end of the list, the higher the probability that its enhanced activation would enable it to survive the competition in the lexical layer until the end of the simulation. Therefore, isolates nearer the end of the list have a better encoding advantage.
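The logic of the novelty mechanism can be sketched at the level of whole items rather than time steps. The sketch rests on several assumptions that are not taken from the model: the prediction is updated once per item, the decay and threshold values are arbitrary, feature vectors are binary, and no surprise is signalled while the prediction is still empty.

```python
import numpy as np

def surprise_flags(items, decay=0.5, threshold=0.6):
    """Flag items whose features deviate strongly from the leaky prediction."""
    prediction = np.zeros_like(items[0], dtype=float)
    flags = []
    for features in items:
        has_prediction = bool(prediction.any())   # empty before the first item
        mismatch = float(np.max(np.abs(features - prediction)))
        flags.append(has_prediction and mismatch > threshold)
        # Leaky integration of the feature history.
        prediction = decay * prediction + (1.0 - decay) * features
    return flags

member = np.array([1.0, 1.0, 0.0, 0.0])    # features shared by the category
isolate = np.array([0.0, 0.0, 1.0, 1.0])   # features of the isolate

mid_list = surprise_flags([member, member, member, isolate, member])
first_position = surprise_flags([isolate, member, member])
```

In this toy run the isolate is flagged when it follows three category members, but not when it opens the list, because no prediction exists at that point.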

The combination of the categorization and novelty detection mechanisms enables the model to simulate differences in semantic clustering between frontal patients and healthy controls, as well as the PFC sensitivity to novelty, the dependency of the Von-Restorff effect on the serial position of the isolate and the correlation found between PFC function and the magnitude of Von-Restorff effect. The model components are illustrated in Figure 4.

Retrieval

At the retrieval stage, activation is sent from the context to the lexical, semantic and categorization layers. Every item that crosses the retrieval threshold is retrieved and subsequently inhibited. The PFC has an additional role in monitoring retrieval: the categorization layer is reset following the occurrence of a certain number of unsuccessful retrieval attempts.

Detailed Description

Encoding

Encoding a list of words is simulated by activating lexical units one at a time by an input. Activation spreads from the lexical units to the semantic features layer through a predefined connectivity matrix. Activation then spreads from the semantic features layer to the categorization layer through random initial connections. Connections between categorization units (which are active above a learning threshold) and active semantic features are strengthened. In addition, the activation in the semantic features layer is constrained by an adaptation component. When novel input is presented, the difference between the semantic features' leaky integrated value (or activation history, which is actually the adaptation component value) and the current activation is likely to cross a threshold. As a consequence, the input to the lexical layer (as well as the learning rate between the categorization layer and the semantic features) is increased (the “surprise” effect).

The model is applied to simulate free recall of categorized lists, of normal subjects and frontal patients. At the encoding stage, 12 lexical units get activated (or 16 in a CVLT simulation) for T time steps each. The following events take place at each time-step:

– The adaptation of the semantic features layer is updated

– If the absolute difference between the semantic layer activation and the adaptation value crosses a threshold, the input to the lexical items and the learning rate between the semantic features and the categorization layers are raised for the current item

– The lexical layer activation is updated

– The semantic features layer activation is updated

– The categorization layer activation is updated

– The connections between the three layers: lexical, semantic features and categorization to the context, are updated

– If the activation of one (or more) categorization unit(s) crosses the learning threshold, the connections between this unit(s) and the semantic features are strengthened

The lexical layer (activation buffer). The lexical layer is comprised of 50 units, out of which 12 serve as list items. During a simulation of list presentation, lexical units are activated sequentially by clamping each unit to a “sensory” input for T time steps. Units are connected to themselves via self-excitation to maintain activity after input offset. They are also negatively connected to all other units in the layer, which creates global inhibition. The balance between the magnitude of the self-excitation and the global inhibition creates a limited-capacity buffer. Units that are active above threshold in the lexical layer are considered to be the short-term activation buffer. When a new item is activated, its activation causes the activation of one of the previous items to diminish through the global inhibition. The new item would eventually replace the weakest item in the buffer. Buffer dynamics are described by the following equation:

\begin{array}{l} x_{i} (t) = λ x_{i} (t - 1) + (1 - λ) [α F (x_{i} (t - 1)) - β_{x} Σ_{j} F (x_{j} (t - 1)) \\ + I_{i} (t) + ξ Σ_{j} W_{ij}^{xy} F (y_{j} (t - 1)) + N (0, 0.2)] \end{array}

Where F(x) = x/(1 + x) for x > 0, and 0 otherwise (O'Reilly and Munakata, 2000), and x_i(t) is the activation of lexical item i at time t. All activations are set to zero at the beginning of a simulation. The parameters are as follows: λ is a time constant, controlling the amount of change in each time step; α is the self-excitation parameter; β_x is the global inhibition parameter; and I_i(t) is the sensory input to unit i at time t, which is clamped to each list unit for T time steps. Each lexical unit also receives activation from the semantic layer, ξΣW^xy_ijF(y_j(t − 1)), where y_j denotes the activations of the semantic layer, W^xy is the connectivity matrix (the superscript xy stands for the connections from the semantic features, y, to the lexical items, x; all connections are symmetric), and ξ is the parameter scaling that activation. N(0, 0.2) is a zero-mean Gaussian noise term. The activation in the layer is not allowed to become negative. The activation of the lexical units during the presentation of a list of 12 words is shown in Figure 5.
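A minimal sketch of the buffer update may help to read the equation. It is not the simulation code of the study: all parameter values are illustrative assumptions, the 0.2 of the noise term is read as a standard deviation, self-excitation is evaluated at the previous time step so that the update is explicit, inhibition is taken from all other units as the text describes, and the semantic layer is held at zero for simplicity.

```python
import numpy as np

def F(x):
    """Output function: x / (1 + x) for x > 0, and 0 otherwise."""
    x = np.maximum(np.asarray(x, dtype=float), 0.0)
    return x / (1.0 + x)

def lexical_step(x, y, W_xy, sensory, rng,
                 lam=0.98, alpha=2.0, beta_x=0.2, xi=0.1, noise_sd=0.2):
    """One explicit update of the lexical layer from its state at t - 1."""
    fx = F(x)
    inhibition = beta_x * (fx.sum() - fx)             # from all other units
    semantic = xi * (W_xy @ F(y))                     # input from semantic features
    noise = rng.normal(0.0, noise_sd, size=x.shape)   # assumed standard deviation
    drive = alpha * fx - inhibition + sensory + semantic + noise
    return np.maximum(lam * x + (1.0 - lam) * drive, 0.0)  # never negative

rng = np.random.default_rng(1)
n_lexical, n_semantic, n_items, T = 50, 45, 12, 200   # T is an assumed value
W_xy = (rng.random((n_lexical, n_semantic)) < 0.16).astype(float)
x = np.zeros(n_lexical)
y = np.zeros(n_semantic)           # semantic layer held at zero in this sketch

for item in range(n_items):
    sensory = np.zeros(n_lexical)
    sensory[item] = 0.33           # hypothetical input strength
    for _ in range(T):
        x = lexical_step(x, y, W_xy, sensory, rng)
```

How many items remain co-active at the end of the list depends on the balance between `alpha` and `beta_x`; the values above are placeholders and would need to be tuned to reproduce a buffer of 3–4 items.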

FIGURE 5

Figure 5. The dynamics of the lexical layer during the simulation of a 12-item list. The activation of each item is shown as a differently colored curve. It can be seen that the first three activated items reach higher activation values, because they face less competition from previously presented items.

The semantic features layer. The semantic features layer comprises 45 units, which are reciprocally connected with the lexical units through the connectivity matrix W^yx. Competition in the semantic features layer is mediated by global inhibition, as in the lexical layer. Semantic features receive activation from the lexical layer (through W^yx), as well as input from the frontal mechanism (categorization layer and the “surprise” signal). The semantic features activation is described by the following equation:

\begin{array}{l} y_{i} (t) = λ y_{i} (t - 1) + (1 - λ) [- β_{y} Σ_{j} F (y_{j} (t - 1)) \\ + χ Σ_{j} W_{ij}^{yx} F (x_{j} (t - 1)) - κ a_{i} (t)] \end{array}

Where y_i(t) is the activation of the semantic feature i at time t, λ and F(y) are the same as described above, β_y is the global inhibition parameter and χΣW^yx_ijF(x_j(t - 1)) is the activation the semantic features receive from the lexical layer.

The term a_i represents the adaptation in the semantic features layer, scaled by the parameter κ.

Its value is updated in every time step, according to the semantic features activation:

a_{i} (t) = λ_{a} a_{i} (t - 1) + (1 - λ_{a}) F (y_{i} (t - 1))

In addition, the activation in the layer is not allowed to become negative.
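The two updates above (adaptation, then semantic feature activation) can be sketched as follows. This is an illustration only: the function names, the toy connectivity matrix, the parameter values, and the choice to sum inhibition over the other units only are assumptions, and the frontal input to the semantic layer is omitted, as it is in the equation.

```python
import numpy as np

def F(x):
    """Squashing function: x / (1 + x) for x > 0, and 0 otherwise."""
    x = np.maximum(np.asarray(x, dtype=float), 0.0)
    return x / (1.0 + x)

def adaptation_step(a_prev, y_prev, lam_a):
    """a_i(t) = lam_a * a_i(t - 1) + (1 - lam_a) * F(y_i(t - 1))."""
    return lam_a * a_prev + (1.0 - lam_a) * F(y_prev)

def semantic_step(y_prev, x_prev, a, W_yx, lam, beta_y, chi, kappa):
    """One update of the semantic features layer during encoding.

    y_prev : feature activations at t - 1, shape (n_features,)
    x_prev : lexical activations at t - 1, shape (n_lexical,)
    a      : adaptation a_i(t), shape (n_features,)
    W_yx   : connectivity, shape (n_features, n_lexical)
    """
    fy = F(y_prev)
    inhibition = beta_y * (fy.sum() - fy)      # assumption: other units only
    lexical_drive = chi * (W_yx @ F(x_prev))   # input from the lexical layer
    drive = -inhibition + lexical_drive - kappa * a
    # Activations are not allowed to become negative.
    return np.maximum(lam * y_prev + (1.0 - lam) * drive, 0.0)

# Toy example with 3 features and 2 lexical units (hypothetical values).
W_yx = np.array([[1.0, 0.0],
                 [1.0, 1.0],
                 [0.0, 1.0]])
y = np.zeros(3)
a = np.zeros(3)
x = np.array([1.0, 0.0])  # only the first lexical unit is active
a = adaptation_step(a, y, lam_a=0.9)
y = semantic_step(y, x, a, W_yx, lam=0.9, beta_y=0.2, chi=1.0, kappa=0.5)
```

In this toy case only the two features connected to the active lexical unit receive drive. Over repeated steps the adaptation term would gradually reduce their activation.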

The connectivity matrix between the lexical and the semantic layers. Each lexical unit is connected via the connectivity matrix W^xy with a number of semantic units. These connections are symmetric: W^xy_ij = W^yx_ji. Since the lexical layer corresponds to a localist representation of items, the pattern of W^xy corresponds to a distributed representation of each item in terms of semantic features. W^xy is defined at the beginning of each simulation. To simulate a list of unrelated items, each element of the matrix is set to one with a probability of 16%, which means that on average each lexical unit is represented by 6–7 semantic features. To simulate a list of words which belong to the same semantic category, a set of features (usually 5) is set to one for all of these words, in addition to randomly selected distinctive features. See Figure S3 (in the supplement) for a connectivity matrix of a lexicon in which words belong to 4 different categories, each represented by a set of 5 features.
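One way to construct such a matrix is sketched below. The function name `build_lexicon`, its arguments, and the convention that category k occupies the k-th block of shared features are assumptions made for illustration. The text does not say whether distinctive features may fall inside another category's block, and this sketch does not prevent it.

```python
import numpy as np

def build_lexicon(n_lexical=50, n_features=45, p_connect=0.16,
                  categories=None, n_shared=5, rng=None):
    """Return a binary W_xy of shape (n_lexical, n_features).

    categories : list of lists of lexical indices. Category k is given the
                 shared feature block [k * n_shared, (k + 1) * n_shared).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Randomly selected distinctive features for every lexical unit.
    W_xy = (rng.random((n_lexical, n_features)) < p_connect).astype(float)
    # Shared features are switched on for all members of a category.
    for k, members in enumerate(categories or []):
        shared = slice(k * n_shared, (k + 1) * n_shared)
        W_xy[members, shared] = 1.0
    return W_xy

# Four categories of four items each, as in a CVLT-like lexicon.
cats = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
W_xy = build_lexicon(categories=cats, rng=np.random.default_rng(0))
W_yx = W_xy.T  # connections are symmetric: W^xy_ij = W^yx_ji
```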

The following two model components are the categorization mechanism and the novelty detection mechanism, which we associate with the PFC.

The categorization mechanism. The categorization layer consists of 10 units. Their activation is initialized to zero at the beginning of the simulation. The dynamics of the categorization units' activation are described by the following equation:

\begin{array}{l} z_{i} (t) = λ z_{i} (t - 1) + (1 - λ) [- β_{Z} Σ_{j} F (z_{j} (t - 1)) \\ + ϕ Σ_{j} W_{ij}^{zy} F (y_{j} (t - 1))] \end{array}

Where z_i(t) is the activation of categorization unit i at time t, λ is a time constant controlling the amount of change in each time step, and β_Z is the global inhibition parameter. ϕΣW^zy_ijF(y_j(t − 1)) is the activation sent from the semantic features layer, where W^zy is the connectivity matrix between the categorization layer and the semantic features layer.

The categorization layer is reciprocally connected with the semantic features layer. The connections are initialized to random values, uniformly distributed between 0 and 1: W^yz(t = 1) ∼ U(0,1). Hebbian learning takes place between the semantic features and those categorization units whose activation crosses a learning threshold:

W^{yz} = W^{yz} + δ F(z)^{T} F(y)

The connections between each categorization unit and all features are normalized after each update.
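The learning rule and the subsequent normalization can be sketched as below. Several details are assumptions, since the text does not specify them: the threshold is applied to the raw activation z, the normalization is to unit Euclidean length per categorization unit, and W^yz is stored with one row per categorization unit. The name `hebbian_update` and all numerical values are hypothetical.

```python
import numpy as np

def F(x):
    """Squashing function: x / (1 + x) for x > 0, and 0 otherwise."""
    x = np.maximum(np.asarray(x, dtype=float), 0.0)
    return x / (1.0 + x)

def hebbian_update(W_yz, z, y, delta, learn_threshold):
    """Hebbian step followed by row normalization.

    W_yz : shape (n_categorization, n_features)
    z    : categorization activations, shape (n_categorization,)
    y    : semantic feature activations, shape (n_features,)
    """
    z = np.asarray(z, dtype=float)
    # Only units above the learning threshold take part in learning.
    fz = F(z) * (z > learn_threshold)
    W_new = W_yz + delta * np.outer(fz, F(y))
    # Normalize the connections of each categorization unit (assumed L2 norm).
    norms = np.linalg.norm(W_new, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    return W_new / norms

rng = np.random.default_rng(0)
W0 = rng.random((10, 45))   # W^yz(t = 1) drawn from U(0, 1)
z = np.zeros(10)
z[3] = 1.0                  # one active categorization unit
y = np.zeros(45)
y[:5] = 1.0                 # features of the current category
W1 = hebbian_update(W0, z, y, delta=0.5, learn_threshold=0.5)
```

After the update, the active unit devotes a larger share of its weight to the active features. The rows of inactive units are only rescaled.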

The novelty (distinctiveness) mechanism: enhancing attention and encoding of novel stimuli. At each time step, the novelty mechanism monitors the difference between the current activation in the semantic features layer and the predicted activation, which is defined identically to the adaptation term a(t). If this difference crosses a threshold, attention to the stimuli (or studied material) is enhanced by increasing the input parameter (to the lexical items layer) to I_s. In addition, the learning rate between the categorization layer and the semantic features layer is increased to δ_s.

When simulating a frontal patient, no learning takes place between the categorization layer and the semantic features layer, and stimulus novelty is not detected.
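The gating logic of the novelty mechanism, including the simulated frontal patient, might look like the sketch below. The text does not say how the difference is aggregated across features. Taking the largest mismatch between F(y) and a is an assumption, as are the function name `novelty_gate`, the baseline values `I_base` and `delta_base`, and all numbers used.

```python
import numpy as np

def F(x):
    """Squashing function: x / (1 + x) for x > 0, and 0 otherwise."""
    x = np.maximum(np.asarray(x, dtype=float), 0.0)
    return x / (1.0 + x)

def novelty_gate(y, a, threshold, I_base, I_s, delta_base, delta_s,
                 frontal_intact=True):
    """Return (input strength, learning rate) for the current time step."""
    if not frontal_intact:
        # Simulated frontal patient: no novelty detection and no learning.
        return I_base, 0.0
    # Assumption: surprise is the largest mismatch between the current
    # feature activation and the prediction held in the adaptation term.
    surprise = float(np.max(F(y) - a))
    if surprise > threshold:
        return I_s, delta_s
    return I_base, delta_base

params = dict(threshold=0.3, I_base=1.0, I_s=2.0, delta_base=0.1, delta_s=0.5)
novel = novelty_gate(np.array([2.0, 0.0]), np.zeros(2), **params)
familiar = novelty_gate(np.array([2.0, 0.0]), np.array([0.6, 0.0]), **params)
patient = novelty_gate(np.array([2.0, 0.0]), np.zeros(2),
                       frontal_intact=False, **params)
```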

The context “layer.” Context is represented by a single unit in this model, and can be seen as a “list context.” Episodic Hebbian learning takes place between the context and the lexical items layer, the semantic features layer and the categorization layer:

\begin{array}{l} C x_{i} = C x_{i} + F (x_{i} (t)) \\ C y_{i} = C y_{i} + F (y_{i} (t)) \\ C z_{i} = C z_{i} + F (z_{i} (t)) \end{array}

Where Cx, Cy, and Cz are the (vectors of) magnitudes of the connections of the context to the different layers, respectively.
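The context learning rule is a plain accumulation of squashed activations, which a short sketch makes concrete. The function name `context_update` and the toy values are assumptions.

```python
import numpy as np

def F(x):
    """Squashing function: x / (1 + x) for x > 0, and 0 otherwise."""
    x = np.maximum(np.asarray(x, dtype=float), 0.0)
    return x / (1.0 + x)

def context_update(Cx, Cy, Cz, x, y, z):
    """Strengthen context connections in proportion to current activity."""
    return Cx + F(x), Cy + F(y), Cz + F(z)

# A unit that stays active for two time steps accumulates twice the strength.
Cx, Cy, Cz = np.zeros(2), np.zeros(3), np.zeros(1)
x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0, 0.0])
z = np.array([3.0])
for _ in range(2):
    Cx, Cy, Cz = context_update(Cx, Cy, Cz, x, y, z)
```

Because the connections only grow while a unit is active, items that remain in the buffer longer end up more strongly linked to the list context.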

Retrieval

At retrieval, the categorization layer activates the semantic features layer. In addition, the context activates the semantic features, the lexical items and the categorization layers. When the activation of a lexical item crosses a retrieval threshold, it is considered to be recalled. Its activation is reset after recall. If more than 2 unsuccessful retrievals have been made (repetitions and intrusions counted together), the activation of the categorization layer is reset.

The following steps take place for TR time steps:

– The adaptation of the semantic features layer is updated

– The lexical layer activation is updated

– The semantic features layer activation is updated

– The categorization layer activation is updated

– If the activation of a lexical unit crosses the retrieval threshold, it is considered to be recalled, and its activation is then reset

– If more than two unsuccessful retrievals occurred, the categorization layer is reset

The lexical layer (activation buffer). The dynamics of the lexical layer in retrieval are similar to those in encoding, aside from the activation received from the context. The term μ_xCx_i(t − 1) is added to the equation.

The semantic features layer. Two additional terms are added to the dynamics of the semantic features layer in retrieval: the activation received from the context (μ_yCy_i(t − 1)) and that received from the categorization layer (ϖΣW^yz_ijF(z_j)). When a lexical item's activation exceeds the retrieval threshold, it is considered to be recalled. A recall is successful if a list item is recalled for the first time, and unsuccessful if an item is repeated (repetition) or if an item which was not activated during encoding becomes active during retrieval (intrusion). When an item is recalled, its activation is inhibited.

The categorization layer (and connectivity matrix to the semantic features layer). The dynamics of the categorization layer in retrieval are similar to those in encoding, aside from the activation received from the context. The term μ_zCz_i(t − 1) is added to the equation. Furthermore, the value of the parameter multiplying the activation received from the semantic features layer (ϕ) is reduced. The rationale is that, during retrieval, it is more important for the PFC to be affected by successful/unsuccessful retrievals than by the current activation of the semantic features layer.

In addition to the described dynamics, the unsuccessful retrievals are monitored. This is modeled as a symbolic algorithm rather than a neural network, since this process is outside the scope of the current modeling attempt. When the number of unsuccessful retrievals (repetitions or intrusions) exceeds 2, the activation of the categorization unit(s) that were active during the recalls is inhibited for the rest of the simulation. In addition, the activity of the whole categorization layer is reset. A different categorization unit will then become active due to the activation received from the context.
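The symbolic monitoring step can be sketched as a small bookkeeping class. The class name `RetrievalMonitor`, its interface, and the decision to restart the failure counter after a reset are assumptions, since the text does not specify them. The inhibition and reset of the categorization layer would be carried out by the caller whenever `reset_needed` is true.

```python
class RetrievalMonitor:
    """Track recalls and signal when the categorization layer should reset."""

    def __init__(self, list_items, max_failures=2):
        self.list_items = set(list_items)
        self.max_failures = max_failures
        self.recalled = []
        self.failures = 0

    def register(self, item):
        """Classify a retrieved item and return (label, reset_needed)."""
        if item not in self.list_items:
            label = "intrusion"
        elif item in self.recalled:
            label = "repetition"
        else:
            label = "correct"
            self.recalled.append(item)
        if label != "correct":
            self.failures += 1
        # A reset is triggered once failures exceed the limit.
        reset_needed = self.failures > self.max_failures
        if reset_needed:
            self.failures = 0  # assumption: counting restarts after a reset
        return label, reset_needed

monitor = RetrievalMonitor(["a1", "a2", "b1"])
outcomes = [monitor.register(w) for w in ["a1", "a1", "x9", "a2", "a1"]]
```

In this toy sequence the third unsuccessful retrieval (the second repetition of "a1") is the one that triggers the reset.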

Simulation Results

The model was applied to simulate semantic clustering in free recall as well as the experimental Von-Restorff results presented earlier. In particular, the difference between frontal patients and normal subjects is explored. To simulate a frontal patient, the frontal mechanisms are disabled; hence, no learning takes place between the categorization layer and the semantic features layer, and novelty is not monitored.

Modeling clustering of a categorized list

As part of the CVLT task (Delis et al., 2000), participants are asked to memorize a list of 16 words from four different categories, which are presented in a random order. The CAN model is used to simulate the differences in semantic clustering between frontal patients and normal subjects. The CVLT includes a few repetitions of encoding and retrieval of the list. Only the first trial is simulated here.

In the simulation, the lexicon for this task is constructed of four categories, with four items in each category. These 16 items are presented in a random order with no two consecutive items from the same category, similar to the CVLT paradigm. Presentation order changes with every run of the simulation (see Figure S3 in Supplement). In the Supplement we show a detailed illustration of the semantic representations of list items and the operation of the categorization units during encoding and retrieval.
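A presentation order with this constraint can be generated by simple rejection sampling, as in the sketch below. The function name `cvlt_like_order` and the sampling strategy are assumptions. The text does not describe how the orders were actually generated.

```python
import random

def cvlt_like_order(n_categories=4, n_per_category=4, rng=None,
                    max_tries=10000):
    """Shuffle (category, exemplar) pairs until no two neighbors share a category."""
    rng = rng or random.Random()
    items = [(c, k) for c in range(n_categories)
             for k in range(n_per_category)]
    for _ in range(max_tries):
        rng.shuffle(items)
        # Accept the shuffle only if adjacent items come from different categories.
        if all(a[0] != b[0] for a, b in zip(items, items[1:])):
            return list(items)
    raise RuntimeError("no valid presentation order found")

order = cvlt_like_order(rng=random.Random(1))
```

Calling the function once per simulation run gives a fresh order each time, matching the statement that the presentation order changes with every run.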

Both the model without the frontal mechanisms and the one which included them were able to recall the CVLT-like lists (see Figure 6), the latter remembering consistently more words, except in the recency positions, which are mediated by the lexical buffer. In the model without the frontal mechanisms, recency is due to the items which were still active in the lexical layer at the time of recall. Adding the frontal mechanisms created a preference for category members of the most active items, lowering the recall of item number 14.

FIGURE 6

Figure 6. Probability of recall of a CVLT-like list. Red: model without the frontal mechanisms; blue: model which includes the frontal mechanisms. The frontal mechanisms enhance recall of a categorized list.

The semantic clustering index was calculated in a similar way to the CVLT test (Delis et al., 2000; Stricker et al., 2002). The output of the model without the frontal mechanisms was less categorized than that of the model with the frontal mechanisms: 2.8 (±1.8) in comparison with 5.4 (±2.2) (p < 0.001). For example, in a single trial in which the presented list was a1, d1, c1, d2, c2, d3, a2, b1, a3, b2, c3, a4, b3, d4, b4, c4 (letters represent categories, numbers represent category exemplars), the model with the frontal mechanisms produced the following output: b4, c4, d4, c2, c4, c3, d3, d2, d4, c2, d3, a2, a1 (the first recalled items are the recency items and are therefore not categorized). In contrast, the model without the frontal mechanisms produced: c4, b4, d4, c3, a4, a1, c4, d3, d1.
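To make the notion of clustering concrete, the sketch below counts adjacent recalls that share a category in the two example outputs. This raw count is only a simple proxy and is not necessarily the index used here or in the CVLT scoring. The function name is hypothetical.

```python
def adjacent_same_category_pairs(recalls):
    """Count neighboring recalls whose category letter is the same."""
    return sum(1 for a, b in zip(recalls, recalls[1:]) if a[0] == b[0])

with_frontal = "b4 c4 d4 c2 c4 c3 d3 d2 d4 c2 d3 a2 a1".split()
without_frontal = "c4 b4 d4 c3 a4 a1 c4 d3 d1".split()

print(adjacent_same_category_pairs(with_frontal))     # 5
print(adjacent_same_category_pairs(without_frontal))  # 2
```

On these two example sequences the count is higher for the model with the frontal mechanisms, in line with the direction of the reported difference.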

Modeling the VR-effect

The experimental results indicate that while isolate items have an advantage in late serial positions, they are less well remembered when they appear at the beginning of the list. The CAN model is applied to simulate these data. The lexicon for this task is constructed of two categories. All studied items except one belong to a single category, while the isolate item belongs to a different category. See the supplement for detailed illustrations of the encoding and retrieval phases.

During the presentation of a category, a categorization unit becomes associated with it. When an isolate is activated, a different categorization unit (not the one that learned the category of the other list items) usually becomes active and becomes associated with the isolate. The items activated after the isolate belong to the first category, which was already learned by the former categorization unit. That categorization unit is reactivated owing to the learning that took place before the isolate presentation. The activation in the categorization layer can be seen in the top panel of Figure 7. For example, in the middle panel the isolate appears in serial position 7 (at around t = 180). Categorization unit number 5 becomes associated with the main category, while categorization unit number 2 (and, to some extent, unit 8) becomes associated with the isolate.

FIGURE 7

Figure 7. The categorization layer in a Von Restorff simulation. In both panels, the isolate appears in serial position 1 (left), 7 (middle), and 9 (right). An isolate at position 7 is activated during time steps 150–175, and an isolate in serial position 9 is activated during time steps 200–225. Warmer colors represent higher activations. Top: The activation in the categorization layer, in a simulation which includes the frontal mechanisms. Bottom: The connectivity matrix between the semantic features and the categorization layer. It can be seen that one categorization unit becomes associated with semantic features 1–5 (which represent the first category) and a different categorization unit becomes associated with semantic features 6–10 (which represent the isolate category).

The learning that takes place between the categorization layer and the semantic layer is demonstrated by the changes in the connectivity matrix between them (see bottom panel of Figure 7). It can be seen that different categorization units become associated with the main category items and the isolate item, except when the isolate appears in serial position 1.

The model was used to simulate the Von-Restorff paradigm with isolates in the 1st, 7th, and 9th serial positions. Serial position curves are presented in Figure 8 for the model without (red) and with (blue) the frontal mechanisms. The frontal mechanisms make it possible to remember the isolates. The advantage of the isolate is larger closer to the end of the list, as found in Experiment 1. Without the frontal mechanisms, the isolate is poorly remembered at all serial positions. These results are a good qualitative fit to the experimental results, especially regarding the relative difference between the two simulations. However, the model results show some advantage for the isolate in serial position 7 (a Von Restorff effect), which is not present in the data.

FIGURE 8

Figure 8. Probabilities of recall for the model without (red) and with (blue) the frontal mechanisms, simulating Von-Restorff lists with the isolate in the 1st (left), 7th (middle), and 9th (right) position. It can be seen that the isolate is better recalled in the model with the PFC mechanism and that this advantage depends on its serial position, as predicted.

Modeling the Von Restorff effect: additional baseline

The CAN model replicates the well-known result that words in semantically related lists have a higher probability of recall than words in unrelated lists (for example, Glanzer and Schwartz, 1971; Greene and Crowder, 1984, and more recently, Davelaar et al., 2006). If words from a semantically related list are better remembered, one could predict a disadvantage for the isolate item. We have modeled the higher probability of recall for an isolate, i.e., the Von Restorff effect, as a “surprise,” or novelty/distinctiveness, signal which enhances memorability. This advantage, however, is predicted to depend on the serial position within the list, because at the start of the list the isolate is not yet distinctive. It can thus be expected that in earlier serial positions (especially if first in the list) the isolate is less likely to be retrieved, whereas in later serial positions (especially close to the end of the list, i.e., recency items) this tendency is inverted, giving the isolate a higher probability of recall.

In previous experiments with semantic isolates, memory for an isolate word was compared to memory of a word of the same category as the rest of the list, in the same serial position (e.g., Hunt and Lamb, 2001).

To simulate a same-category list, semantic features 1–5 were selected for all items. Results are presented in Figure 9.

FIGURE 9

Figure 9. Probabilities of recall and first recall for the model with the frontal mechanisms for Von-Restorff simulations, with isolates at the 1st (left), 7th (middle), and 9th (right) serial positions. Blue: Von-Restorff list; red: same-category list baseline.

First, the simulation shows that isolate items (star on blue lines) are better remembered than their within-list neighbors and than the same word in a same-category list, but only at middle and recency positions (this can be seen both in total recall, Figure 9 left, and in first-recall probability, Figure 9 right panel). This effect is stronger at recency positions due to the buffer. When the isolate is at the first position, on the other hand, it is remembered less frequently than its neighbor from position 2, or than the same word in a single-category list.

To summarize, the model predicts that at the beginning of a list, the probability of recalling the critical word when presented within an isolate list (Von Restorff condition) will be lower than the probability of recalling the same word within a same-category list (since it is not yet distinctive or surprising). Toward the end of the list this is expected to be inverted: the probability of recalling the critical word within an isolate list (Von Restorff condition) is higher than the probability of recalling the same word in a same-category list. An experiment was constructed to test this prediction.

Experiment 2: Memory for Semantic Isolates in IFR

The Von Restorff effect has been studied in the semantic domain by Hunt and Lamb (2001) and by Fabiani and Donchin (1995). However, little attention was given to the serial position of the isolate word within the list. The current experiment was constructed to further investigate the relationship between novelty and serial position found in the previous experiment and in the model. An additional goal is to verify the model's prediction by comparing the novelty effect in different serial positions against two possible baselines: a list of unrelated words, and a list of related (same-category) words.

Therefore, two control conditions were included in the current semantic-isolates free recall experiment: a semantically related list and a semantically unrelated list. In order to compare the effect of the different types of lists as accurately as possible without confounds (due to the specific word chosen) the same critical word was embedded in three types of word lists: (1) a different category list (isolate condition), (2) a same category list, and (3) a list of unrelated words. For example, to create an isolate list, a critical word “Hour” was presented within the “animal” list: “Bear, Pig, Elephant, Deer, Cat, Mouse, Cow, Tiger, Horse, Lion, Rat.” To create a “same” category list, the word “Hour” was presented within the “time units” list: “Year, Decade, Second, Day, Century, Week, Millisecond, Minute, Month, Nanosecond, Millennium.” To create an unrelated list, “Hour” was presented within a list of randomly selected words, such as: “Prospect, Velvet, Account, Advance, Madam, Payment, Hunter, Pursuit, Circle, Clothing, Safety.”
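The construction of the three list types for one critical word can be sketched as follows, using the example words given above. The function name `build_list`, the chosen serial position, and keeping the fillers in the order quoted are assumptions made for illustration. The actual randomization procedure is not described in this passage.

```python
def build_list(critical, fillers, position):
    """Insert the critical word at a 1-based serial position among the fillers."""
    if not 1 <= position <= len(fillers) + 1:
        raise ValueError("position is outside the list")
    words = list(fillers)
    words.insert(position - 1, critical)
    return words

animals = ["Bear", "Pig", "Elephant", "Deer", "Cat", "Mouse",
           "Cow", "Tiger", "Horse", "Lion", "Rat"]
time_units = ["Year", "Decade", "Second", "Day", "Century", "Week",
              "Millisecond", "Minute", "Month", "Nanosecond", "Millennium"]
unrelated = ["Prospect", "Velvet", "Account", "Advance", "Madam", "Payment",
             "Hunter", "Pursuit", "Circle", "Clothing", "Safety"]

position = 9  # hypothetical serial position for the critical word
conditions = {
    "isolate": build_list("Hour", animals, position),
    "same": build_list("Hour", time_units, position),
    "unrelated": build_list("Hour", unrelated, position),
}
```

Holding the critical word and its serial position fixed across the three lists is what allows recall of that word to be compared between conditions without a word-specific confound.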

To avoid presenting a critical word twice to one person, this comparison was performed between participants for each critical word. However, each participant contributed to all conditions, with conditions counterbalanced across critical words. In addition, the contribution of frontal mechanisms to this effect was investigated. In order to test this with healthy young adults, a correlational approach was adopted. As previously, all participants completed a fluid intelligence test (Cattell and Cattell, 1963), and correlations were analyzed between the Von Restorff variables and IQ. The following experiment was replicated twice, with small methodological changes.