In paintings, the viewer’s eye is easily caught by human figures, especially faces. Although gaze behavior during picture viewing is affected by physically salient visual features, cognitive factors, such as the given task, are also important (Buswell, 1935; Yarbus, 1967; DeAngelus and Pelz, 2009). Moreover, the viewer’s internal cognitive plans or strategies may guide the gaze in different ways. In art schools and art history classes, future artists and art experts are trained to pay attention, beyond the figurative elements, to other aspects of the paintings, e.g., the historical context, different painting styles, and the composition of objects, forms, and color. Thus, artists and art experts are expected to view paintings differently from laypersons.
Differences in gaze behavior can be studied by analyzing fixations and saccades. Fixations are the periods when the eyes are relatively stable and visual information is gathered, while saccades are fast ballistic eye-movements which bring the fovea from one fixation point to another. The idea of expert cognitive strategies has prompted several studies on comparison of the eye-movements of experts vs. laypersons in different areas of expertise. For example, experienced radiologists were found to apply a “global” analysis of mammography images in detecting breast cancer; the expertise was considered to arise as a shift from detailed scanning to a holistic, gestalt-like perception (Kundel et al., 2008). Expert chess players, on the other hand, fixated beside the chess pieces and at the center of the board, whereas novices fixated more often directly at the piece they needed to recognize (Bilalic et al., 2011). The particular eye-movement behavior of the chess experts was accompanied by bilateral brain activation, in contrast to only left-hemisphere activation in the novices, and the authors suggested the right hemisphere activation to be linked to holistic processing of the stimuli.
Expertise is also reflected in holistic processing in subjects viewing art. Nodine et al. (1993) showed that untrained viewers fixate more on central and foreground figures, whereas art-trained viewers spend more time looking at background features, consistent with the idea that untrained viewers focus more on individual objects and art-trained viewers more on the relationships among the pictorial elements. Accordingly, Kapoula and Lestocart (2006) suggested that experts scan a larger surface of a painting than do laypersons. Vogt (1999) and Vogt and Magnussen (2007) provided further evidence for different viewing strategies of art-trained and untrained subjects by showing that untrained viewers spend more time on areas with recognizable objects and human features than do artists. However, differences in gaze patterns are less obvious between the groups. Illes (2008) and Kristjanson and Antes (1989) reported great individual variability in the fixation durations of both experts and laypersons viewing paintings. The artists made longer fixations while viewing familiar paintings, whereas the non-artists’ fixations were longer while viewing unfamiliar paintings (Kristjanson and Antes, 1989).
Expertise affects not only the viewing strategies, but also the viewers’ art preferences. Representational art depicts elements that are easily recognized by most people, whereas with increasing level of abstraction the recognizable elements disappear. Non-professional art viewers prefer representational over abstract paintings. They also give higher scores on an affective scale to representational rather than abstract paintings (Uusitalo et al., 2009). Art education and frequency of visits to art galleries were linked to a tendency for positive ratings of abstract art (Furnham and Walker, 2001; Uusitalo et al., 2009). When subjects viewed post-impressionist paintings and their manipulated “abstract” versions whose content could not be identified, non-experts and industrial design students preferred the original paintings over the abstract ones, while the ratings of senior art school students did not differ significantly between the original and abstract versions (Hekkert and van Wieringen, 1996). Similarly, Illes (2008) found a clear preference for figurative paintings and dispreference for non-figurative ones in laypersons, but not in artists or experts.
An interesting, but largely overlooked, question is the relationship between the cognitive and bodily measures of experiencing art. De Jong (1972) compared the esthetic likings and skin conductances between three groups: students of art history, students of art, and non-experts. While the non-experts’ likings differed from those of the experts-in-training, it was not possible to differentiate between “beautiful” and “ugly” paintings by means of skin conductance. Self-reported evaluations of valence and skin conductance responses evoked by viewing emotional pictures did not correlate with each other and were associated with activation of different brain areas (Anders et al., 2004).
Taken together, experts’ viewing strategies and esthetic appreciations seem to differ from those of laypersons. However, many of the earlier studies suffer either from a small number of subjects (Yarbus, 1967; Nodine et al., 1993; Zangemeister et al., 1995), a small number of paintings (Zangemeister et al., 1995; Smith et al., 2006), or a lack of professional categorization of paintings into abstract and representational groups (Zangemeister et al., 1995; Uusitalo et al., 2009). One of the motivations for the present study was to replicate and expand earlier results on art expertise by investigating a larger number of subjects and paintings. By increasing the number of paintings, we were further able to group the paintings into subcategories along the continuum from representational to abstract. In our study, two of the authors, both experts on art history and esthetics, selected and categorized the paintings. Attention was also paid to specifying the group of experts. As discussed by Vogt and Magnussen (2007), the expertise that painters acquire by training to produce figurative art may be supported by special perceptual information-processing strategies. As these strategies are not necessarily typical for all artists or experts on art, the studied groups should be carefully defined. In the present study, the expertise was specified as the subjects’ knowledge on art history.
We investigated whether the expertise acquired by professional studies in art history affects esthetic judgments and gaze patterns of subjects viewing digitized images of paintings. Specifically, we were interested to analyze whether the continuum from representational to abstract paintings (five categories) would be reflected in these measures. As non-experts tend to dislike abstract paintings (see above), we hypothesized that the increasing level of abstraction would gradually decrease the esthetic judgments of laypersons, while those of experts would not change. Further, as laypersons spend more time looking at the figurative elements (Vogt, 1999; Vogt and Magnussen, 2007), we examined whether the increasing level of abstraction and disintegration of the figurative elements would affect differently the fixation parameters of the experts compared with those of the laypersons. To have a broader view, we also studied the effect of expertise on emotional reactions to the paintings by collecting self-reported evaluations of positive/negative feelings evoked by the painting and measuring electrodermal reactivity.
Materials and Methods
Half of the subjects (n = 20) were experts on art history who had been studying art history as a major subject, or esthetics as a major and art history as a minor subject, at the University of Helsinki. Thus, viewing and evaluating paintings had formed an important part of their training. The laypersons’ group (n = 20) consisted of university students or graduates with no visual art studies or hobbies. In addition to the educational background, the groups were matched by gender and age. Both groups included 17 females and 3 males. The mean age was 30.2 years (range 24–49) in the expert group and 29.8 years (21–43) in the control group. All subjects had normal or corrected-to-normal vision (4/20 experts and 5/20 laypersons wore eyeglasses, and 5/20 and 3/20, respectively, had contact lenses). All subjects signed an informed consent form before the experiment. The study had prior approval by the Ethics Committee of the Hospital District of Helsinki and Uusimaa.
Stimuli and Experimental Paradigm
The stimuli consisted of 35 fine art paintings by renowned artists representing different styles in the Western tradition of painting from the sixteenth century up to the 1980s (Table 1). The paintings were downloaded from the ARTstor digital library (http://www.artstor.org/index.shtml) and selected to represent different subject categories and a continuum from representational to abstract art. The representational–abstract continuum had five categories: (I) representational paintings, (II) less representational paintings where the subject matter can be well-recognized despite less details than in category I due to style, technique used etc., (III) paintings in which the subject matter is difficult to understand, at least at first sight, (IV) almost abstract paintings where the style approaches full abstraction, with only a few identifiable details or with details that are difficult to recognize, and (V) abstract paintings. Each category had seven paintings. All paintings, except those in the abstract category, depicted human beings, landscapes, or urban scenes.
Digitized copies of the paintings were projected (Mitsubishi Electric HC6000) at a resolution of 1024 × 768 pixels onto a screen (112 cm wide, 100 cm high) about 2.5 m in front of the subject. Because of the different formats of the artworks and some free movement of the subject, the paintings, each filling the screen in either the horizontal or the vertical direction, were seen at visual angles of 19–25° horizontally and 15–23° vertically.
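The upper bounds of the reported visual angles can be checked from the screen size and viewing distance. The following is a minimal sketch, assuming the viewer sits on the central axis of the screen and that the standard relation θ = 2·atan(s / 2d) applies; the function name is illustrative and not part of the original analysis.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) of an extent viewed centrally from a given distance."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Screen dimensions (112 cm x 100 cm) and approximate distance (2.5 m) from the text.
horizontal_deg = visual_angle_deg(112.0, 250.0)
vertical_deg = visual_angle_deg(100.0, 250.0)
print(f"{horizontal_deg:.1f} deg x {vertical_deg:.1f} deg")
```

For a painting that fills the screen, this gives roughly 25° by 23°, in line with the maxima of the reported ranges; narrower paintings and shifts in the subject's position account for the smaller values.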
The Presentation software (Neurobehavioral Systems, Inc.) was used to control stimulus presentation. The software ran on a stimulus PC, which was connected to the eye-tracking PC to provide correct timing.
The experiment consisted of two viewing sequences (Parts 1 and 2) of all paintings; thus each painting was displayed twice. Subsequent to each painting, the subject had to answer questions on a printed questionnaire (see below). We used a presentation time of 10 s in Part 1 and a presentation time of 30 s in Part 2 since it is known that 10 s is sufficient to obtain an overview of a picture while 30 s is the average observation duration for an esthetic judgment when unlimited time is given (Locher et al., 2007). In the Metropolitan Museum of Art, visitors typically view paintings for less than 30 s, with a median of 17 s (Smith and Smith, 2001). In Part 1, following the 10 s of image presentation, the subjects had 30 s to answer the questions. After that, a sound indicated the end of the answering period, and the subjects had to switch their gaze back to the screen. In Part 2, subsequent to the 30-s presentation, a 25-s period was given for answering. To avoid confusion, paintings were numbered sequentially (1–35), and the respective number was printed on the questionnaire and shown on the screen during the answering period. The full experiment, including preparation time, lasted on average 1.5 h.
The paintings were shown in a fixed pre-randomized order to one half of the subjects, and in the reverse order to the other half. Both Part 1 and Part 2 began with the presentation of the same five “rehearsal” paintings – one from each category – always shown first. The rehearsal paintings were presented to acquaint the subjects with the duration of picture presentation and the time for answering the questions, as well as to give an overview of the different image categories for facilitating the subsequent ratings.
Each 30-s presentation period in Part 2, except for the five “rehearsal” paintings, was accompanied by auditory information (presented via two loudspeakers) about the painting. The rehearsal paintings were used as control pictures to separate the effect of information. For half of the paintings, the information given in Part 2 consisted of neutral facts. For example, for Gauguin’s Pool, Martinique, the subjects heard that “Gauguin made the painting while staying on this Caribbean island, and the painting is from the painter’s early impressionist period; hence the ambiance has been conveyed with small, distinguishable brush strokes and with pure colors.” For the other half of the paintings, some “tabloid-type,” emotion-evoking details were added. For example, for Macke’s Separation, the story went: “Macke was influenced by the avant-garde movement and expressionism, and he was combining these styles in his work. He was called to join the army in the beginning of the World War I and died just few weeks later at the age of 27 years.”
In Part 1, subjects had to first indicate if they had seen the picture before. During Parts 1 and 2, subsequent to each picture presentation, participants had to answer a questionnaire asking for esthetic evaluation (“In your opinion, is this painting a good work of art?” Scale: 1 – not at all good, 5 – very good) and the emotions evoked (“In your estimate, what is the quality of emotion evoked by this painting?” Scale: −2 very negative, +2 very positive).
Gaze patterns were measured using a semi-portable, video-based iViewX HED4 eye-tracking device (SensoMotoric Instruments, Teltow, Germany). The sampling rate of the eye tracker was 50 Hz, the spatial tracking resolution was <0.1°, and the gaze-position accuracy was better than 1°. The system was attached to a cap – thus allowing small head and body movements while the subject was sitting on a sofa – and connected to the eye-tracking PC, and from there via serial connection to the stimulus PC. Before Part 1, a 9-point gaze calibration was performed. When needed (in 30% of the subjects), the calibration was repeated after the rehearsal pictures or before Part 2.
Changes in electrodermal activity (EDA) were measured between two electrodes attached to the index and ring fingers of the subject’s non-dominant hand. A small amount of conductive paste was put between the electrodes and the skin. Low (0.5 V) DC voltage was applied between the two electrodes, and the conductance of the body in between them was measured. Sensors were connected to a ME6000 (Mega Electronics Ltd., Finland) data logger, which sampled the EDA at 1000 Hz. Offline, the data were transferred to MegaWin analysis software (Mega Electronics Ltd., Finland), which was used for handling and exporting the data.
First, the HED4 eye tracker calculated the gaze position in the scene video coordinates. In an offline analysis with Matlab software (Natick, MA, USA), we determined the position of the projected painting in each frame of the video. We used the scale invariant feature transform (SIFT; Lowe, 1999) to extract salient key points from images and matched them to find corresponding points between images automatically. Even though SIFT features are designed to be robust to changes in lighting, straightforward matching of SIFT features between the original images and the video failed because projection and subsequent video imaging changed the images too much. To solve this problem, we picked a reference frame in the video for each painting, matched the reference frame to the painting image manually, and then matched the video frames to the reference video frame by using the SIFT feature. Finally, we mapped the eye-tracking data from each video frame to the painting image via the reference video frame. The accuracy error of the transformed data points was less than 30 pixels in the scale of the original image, and the error rate was inspected manually in several data sets. This transformation was necessary because we allowed the subjects to view the images without restricting their head movements.
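The two-step mapping (video frame to reference frame, reference frame to painting) can be illustrated as a composition of planar transforms. This is only a sketch: the text does not state which transform model was fitted to the matched SIFT points, so the use of 3 × 3 homographies is an assumption here, and the matrices below are hypothetical placeholders rather than estimated values.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul3(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3 x 3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_homography(h: Matrix, point: Tuple[float, float]) -> Tuple[float, float]:
    """Map a 2D point through a homography using homogeneous coordinates."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


# Hypothetical transforms: in practice each would be estimated from matched
# key points (frame -> reference automatically, reference -> painting manually).
frame_to_reference: Matrix = [[1.0, 0.0, -12.0], [0.0, 1.0, 5.0], [0.0, 0.0, 1.0]]
reference_to_painting: Matrix = [[2.0, 0.0, -100.0], [0.0, 2.0, -60.0], [0.0, 0.0, 1.0]]

# Compose once per video frame, then map every gaze sample of that frame.
frame_to_painting = matmul3(reference_to_painting, frame_to_reference)
gaze_in_painting = apply_homography(frame_to_painting, (200.0, 150.0))
```

Composing the two matrices first means that each gaze sample needs only one transform, and the manually matched reference frame is reused for all frames of the same painting.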
We then imported the raw eye-tracking data to OGAMA software (Voßkühler et al., 2008) for event detection and for preparation of statistical analysis. Detection of fixations was based on the dispersion-threshold-identification (I-DT) algorithm (Salvucci and Goldberg, 2000), with a dispersion radius of 1° and a minimum fixation length of 80 ms. Gaps between the fixations were classified as saccades.
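The dispersion-threshold idea can be sketched as follows. This is an illustrative implementation in the spirit of Salvucci and Goldberg (2000), not the OGAMA code: it assumes gaze samples in degrees at 50 Hz (20 ms per sample) and measures dispersion as (max x − min x) + (max y − min y). How OGAMA translates its 1° setting into a threshold is not specified in the text, so `max_dispersion_deg` is a stand-in parameter.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x_deg, y_deg)


def dispersion(window: Sequence[Sample]) -> float:
    """Dispersion of a window: horizontal extent plus vertical extent."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def idt_fixations(samples: Sequence[Sample],
                  sample_interval_ms: float = 20.0,
                  max_dispersion_deg: float = 1.0,
                  min_duration_ms: float = 80.0
                  ) -> List[Tuple[int, int, float, float, float]]:
    """Return (first_index, last_index, centroid_x, centroid_y, duration_ms) per fixation."""
    min_len = max(1, math.ceil(min_duration_ms / sample_interval_ms))
    fixations = []
    i, n = 0, len(samples)
    while i + min_len <= n:
        j = i + min_len  # window covers samples[i:j]
        if dispersion(samples[i:j]) <= max_dispersion_deg:
            # Grow the window until the next sample would exceed the threshold.
            while j < n and dispersion(samples[i:j + 1]) <= max_dispersion_deg:
                j += 1
            window = samples[i:j]
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            fixations.append((i, j - 1, cx, cy, len(window) * sample_interval_ms))
            i = j
        else:
            i += 1  # slide past a sample that belongs to a saccade or noise
    return fixations


# Synthetic example: two stable clusters separated by one in-flight sample.
samples = ([(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.0, 0.0)]
           + [(5.0, 5.0)]
           + [(10.0, 10.0), (10.1, 10.0), (10.0, 10.1),
              (10.1, 10.1), (10.0, 10.0), (10.1, 10.0)])
fixations = idt_fixations(samples)
```

With a 50 Hz tracker, the 80 ms minimum corresponds to four consecutive samples, so any stable period shorter than that is left unclassified between fixations.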
The average fixation duration, average fixation count, and total length of the scanpath (sum of all saccades) were then computed for each subject and painting. Furthermore, region-of-interest (ROI) analysis was conducted to determine the total fixation duration on each predefined ROI. The ROIs, drawn manually, included heads and faces on paintings depicting human characters. ROI analysis was performed for Categories I–IV of the representational–abstract continuum; Category V was excluded because, by definition, no human figures were depicted there.
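The three summary measures can be computed per subject and painting along the following lines. This is a minimal sketch, assuming that each fixation is given as (x, y, duration) and that the scanpath length is the sum of straight-line distances between consecutive fixation positions; the function name is illustrative.

```python
import math
from typing import Sequence, Tuple

Fixation = Tuple[float, float, float]  # (x_deg, y_deg, duration_ms)


def scanpath_measures(fixations: Sequence[Fixation]) -> Tuple[int, float, float]:
    """Return (fixation count, mean fixation duration in ms, scanpath length in deg)."""
    count = len(fixations)
    mean_duration = sum(f[2] for f in fixations) / count if count else 0.0
    # Each saccade is approximated by the segment joining successive fixations.
    length = sum(math.dist(a[:2], b[:2])
                 for a, b in zip(fixations, fixations[1:]))
    return count, mean_duration, length


example = [(0.0, 0.0, 200.0), (3.0, 4.0, 300.0), (3.0, 0.0, 100.0)]
count, mean_duration, length = scanpath_measures(example)
```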
Furthermore, scanpaths were analyzed as spatial and temporal sequences with the ScanMatch method (Cristino et al., 2010). For the spatial alignment of the sequences, an 8 × 8 substitution matrix was created by dividing the screen into 64 sectors, each 128 × 96 pixels in size, and the gap penalty was set to 0. The small gap penalty value was chosen as it “benefits the global alignment of the sequences” (Cristino et al., 2010, p. 695). In addition, temporal binning was applied with a bin size of 100 ms. Thus, in the sequence, a fixation of 100 ms was counted only once, while a fixation of 300 ms was counted three times.
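The spatial and temporal binning can be sketched as the encoding step below, which turns a list of fixations into a sequence of sector indices. It is not the ScanMatch toolbox code: the function name is illustrative, and the handling of partial bins (rounding down, with at least one entry per fixation) is an assumption, since the text only gives the 100 ms and 300 ms cases.

```python
from typing import List, Sequence, Tuple

Fixation = Tuple[float, float, float]  # (x_px, y_px, duration_ms)


def encode_scanpath(fixations: Sequence[Fixation],
                    width_px: int = 1024, height_px: int = 768,
                    cols: int = 8, rows: int = 8,
                    bin_ms: float = 100.0) -> List[int]:
    """Encode fixations as sector indices (0..63), repeated once per temporal bin."""
    sector_w = width_px / cols    # 128 px
    sector_h = height_px / rows   # 96 px
    sequence: List[int] = []
    for x, y, duration in fixations:
        # Clamp to the grid so samples on the border stay in the last sector.
        col = min(cols - 1, max(0, int(x // sector_w)))
        row = min(rows - 1, max(0, int(y // sector_h)))
        repeats = max(1, int(duration // bin_ms))  # assumption: floor, minimum 1
        sequence.extend([row * cols + col] * repeats)
    return sequence


encoded = encode_scanpath([(0.0, 0.0, 100.0),
                           (130.0, 100.0, 300.0),
                           (1023.0, 767.0, 80.0)])
```

Two encoded sequences of this kind are then what a global alignment procedure compares, with the substitution matrix scoring how close two sectors are on the screen.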
Whenever appropriate, the statistical analyses were carried out using repeated measures ANOVA (SPSS 14.0.1). Greenhouse–Geisser correction for F and P values was used if the sphericity assumption was violated.
As expected, experts were familiar with a larger number (7.5 ± 4.6; mean ± SD) of the 35 paintings than laypersons (1.1 ± 2.1). In other words, the paintings got more “Yes, I have seen it before” answers from experts than laypersons (Mann–Whitney Test, U = 260, z = −4.4, P < 0.001).
In general, the esthetic ratings were higher in Part 2 than in Part 1 (Wilcoxon signed ranks test, z = −2.02, P = 0.043 for both groups). The same was true for the “rehearsal” pictures without audio (experts: z = −2.8, P = 0.003; laypersons: z = −2.9, P = 0.002).
As shown in Figure 1A, the level of abstraction affected the esthetic judgments differently for both groups of participants. The grades of the laypersons were highest in the representational Category I and lowest in the abstract Category V (Table 2) [Friedman’s ANOVA, Part 1: χ2(4) = 41.0, P < 0.001, Category I vs. V: P < 0.001 sig = 0.0125; Part 2: χ2(4) = 44.0, P < 0.001; Category I vs. V: P < 0.001 sig = 0.0125]. In Part 1, the judgments of the experts were not affected at all by the abstraction level [Friedman’s ANOVA, χ2(4) = 6.8, P = 0.15] and Part 2 showed a slight effect to the opposite direction [Friedman’s ANOVA, χ2(4) = 9.7, P = 0.046]. Accordingly, the abstraction level only affected the emotional evaluations of laypersons [Friedman’s ANOVA, Part 1: χ2(4) = 17.1, P = 0.002; Part 2: χ2(4) = 21.3, P < 0.001]: representational paintings evoked the most positive emotions (Part 2, Category I: 0.58 ± 0.14; mean ± SE), but they became less positive and even negative with growing abstraction (Part 2, Category V: −0.15 ± 0.11), whereas experts’ grades did not change [Part 1: χ2(4) = 2.4, P = 0.7; Part 2: χ2(4) = 5.9, P < 0.2; Figure 1B].
Figure 1. Mean esthetic judgments (A) and emotional ratings (B) of the paintings in Part 2 decrease from representational (I) toward the abstract (V) category in laypersons but not in experts. The error bars represent SE.
Figure 2 shows how for both experts and laypersons, the number of fixations (main effect of Part, F1,38 = 93.0; P < 0.001) and the length of the scanpath (F1,38 = 37.8; P < 0.001) decreased from Part 1 to Part 2 (first 10 s). The mean duration of fixations increased from Part 1 to Part 2 (F1,38 = 56.8; P < 0.001).
Figure 2. Gaze patterns relative to the abstraction level of the paintings for Part 1 (left) and for the first 10 s of Part 2 (right). Average number of fixations (top row), mean duration of fixations (mid row), and total length of the scanpath (bottom row) for experts and laypersons. The error bars represent SE.
Generally, the gaze patterns were affected by the level of abstraction in both laypersons and experts (Figures 2 and 3; Table 3). In both groups, the mean duration of fixations decreased (main effect of Category for both Part 1 and 2, P < 0.001) and the length of the scanpath increased (main effect of Category for both Part 1 and 2, P < 0.001) from representational toward the more abstract categories, with no group differences. The number of fixations also increased in both groups (main effect of Category for both Part 1 and 2, P < 0.001); this increase was stronger for laypersons in Part 2, as evidenced by the Group by Category contrast interaction for Category I vs. V (F1,38 = 8.7; P = 0.005).
Figure 3. Gaze patterns relative to the abstraction level of the paintings for Part 2 (all 30 s). Average number of fixations (top row), mean duration of fixations (mid row), and total length of the scanpath (bottom row) for experts and laypersons. The error bars represent SE.
Table 3. Results of statistical analyses for number and duration of fixations and length of scanpaths for Part 1 and Part 2.
The fixations were longest for paintings depicting human beings, with no differences between experts and laypersons (main effect of Category F3,114 = 47.1; P < 0.001; P < 0.001 in comparison with landscapes, urban sceneries as well as abstract paintings).
From the paintings depicting human beings, ROIs including the heads and faces were selected for further analysis. In Part 1, the (total) fixation duration for faces was 12% longer in laypersons than in experts (Figure 4; F1,38 = 4.4, P = 0.042). It was generally longest in the representational Category I, which also separated laypersons from experts (main effect of Category F2.4,90 = 30.8, P < 0.001, Group by Category interaction F2.4,90 = 30.8, P = 0.021, Category II: F1,38 = 6.6, P = 0.014), whereas toward the more abstract categories the group differences disappeared. In Part 2, the fixation durations did not differ between the groups.
Figure 4. Mean total fixation times (as percentage of viewing time) on head ROIs for Part 1 (A) and Part 2 (B). * denotes statistically significant difference for groups in Category II.
To compare the scanning strategies between the groups, we calculated the average distances of the fixations from the center of the paintings. In Part 1, the distance was larger for experts than laypersons (6.8° ± 0.15°; mean ± SE vs. 6.3° ± 0.12° respectively; t = 3.0; P = 0.005; see Figure 5); in Part 2, the distances did not differ between the groups.
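The distance measure can be illustrated with the short sketch below. It assumes fixation positions expressed in degrees relative to a common coordinate frame and an unweighted mean over fixations; whether the original analysis weighted fixations by duration is not stated, and the function name is illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x_deg, y_deg)


def mean_distance_from_center(fixations: Sequence[Point],
                              center: Point = (0.0, 0.0)) -> float:
    """Unweighted mean Euclidean distance of fixation positions from the image center."""
    if not fixations:
        raise ValueError("at least one fixation is required")
    return sum(math.dist(f, center) for f in fixations) / len(fixations)


mean_distance = mean_distance_from_center([(3.0, 4.0), (0.0, 0.0), (-6.0, 8.0)])
```

A larger value indicates that the viewer's fixations are spread farther from the middle of the painting, which is the sense in which the experts' value exceeded that of the laypersons in Part 1.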
Figure 5. Examples of fixations and scanpaths for laypersons and experts on Boudin’s (left) and Gris’ (right) paintings in Part 1 (top row) and Part 2 (bottom row). In Part 1, on Boudin’s painting the layperson #1 concentrates more on the center of the painting, whereas the expert #1 views the picture more widely. This difference is not seen between the layperson #2 and expert #2 for Gris’ painting, thereby illustrating the large variability between subjects and/or paintings. In Part 2 the differences between laypersons and experts tend to disappear.
Moreover, to examine the similarity of the scanpaths in the two groups, we compared for each picture all scanpaths pairwise, separately for each group, using the ScanMatch algorithm (Cristino et al., 2010). Mean similarity indices per picture showed that the scanpaths were more similar in Part 2 than in Part 1 (main effect of Part, F1,56 = 55.1, P < 0.001). In Part 1, the scanpaths were more similar in the layperson group than the expert group (Part by Group interaction, F1,56 = 10.5, P = 0.002, t-test between the groups in Part 1: t = −2.9, P = 0.006). When Part 2 was divided into three consecutive sequences of 10 s, a main effect for Sequence (F2,16 = 19.6; P < 0.001) indicated a higher similarity in each group during the first 10 s than the rest of the viewing time. No differences were observed regarding the similarities of scanpaths between the categories.
Electrodermal reactivity (the difference between maximum and minimum EDA values) was not affected by the level of abstraction in either group, neither in Part 1 nor in Part 2.
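The reactivity measure defined above reduces to a range over the samples recorded during one presentation. A minimal sketch, assuming a conductance trace in nanoSiemens stored as a list and a known onset time for each painting; the function name and windowing arguments are illustrative.

```python
from typing import Sequence


def electrodermal_reactivity(trace_ns: Sequence[float],
                             onset_s: float,
                             duration_s: float,
                             sampling_hz: float = 1000.0) -> float:
    """Maximum minus minimum EDA value within one presentation window."""
    start = int(round(onset_s * sampling_hz))
    stop = start + int(round(duration_s * sampling_hz))
    window = trace_ns[start:stop]
    if not window:
        raise ValueError("presentation window contains no samples")
    return max(window) - min(window)


# Toy trace sampled at 1 Hz for readability; the recordings used 1000 Hz.
toy_trace = [5.0, 5.0, 7.0, 9.0, 6.0, 5.0]
reactivity = electrodermal_reactivity(toy_trace, onset_s=1.0, duration_s=4.0,
                                      sampling_hz=1.0)
```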
Electrodermal reactivity was larger in both groups during Part 2 – when either neutral or tabloid-type information was given – than during Part 1 (Figure 6; main effect of Part, F1,32 = 46.8, P < 0.001), and the change from Part 1 to Part 2 was larger (25.9 vs. 13.0 nS) for laypersons in comparison with the experts (interaction of Part by Group, F1,32 = 5.2, P < 0.029). Furthermore, the tabloid-type information tended to have stronger effect on laypersons than experts (three-way interaction of Part by Group by Type, F1,32 = 5.4, P = 0.026).
Figure 6. Effect of the type of information given in Part 2 on the magnitude of the electrodermal reactivity (nS, nanoSiemens; N, neutral information; T, tabloid-type information).
We examined whether and how expertise in art history would affect the self-reported esthetic and emotional ratings, eye-movements, and EDA during viewing of paintings. We were interested in how the continuum from representational to abstract paintings is reflected in these measures. As expected, the abstraction level affected the ratings of laypersons and experts differently. Esthetic judgments and emotional valence decreased with increasing abstraction level for laypersons, but not for experts. Contrary to the cognitive ratings, however, the abstraction level affected the number and duration of fixations as well as the length of the scanpath in both groups. Nevertheless, in Part 1, the fixation duration on the face areas, the distance of the fixations from the center of the picture, and the similarity of scanpaths differed between the groups and thereby indicated different viewing strategies.
The abstraction level affected both the number and duration of fixations. For the most representational category of paintings, the number of fixations was smallest and the fixation durations were respectively longest, whereas paintings of the most abstract category elicited more fixations with shorter duration. This finding is compatible with the idea that, in representational paintings, the eyes fixate longer on the figurative details than in abstract paintings where the figurative elements are lacking and the subject keeps searching for them. The paintings with human figures evoked the longest fixations in both groups. The ROI analysis of Part 1 revealed that for the most representational paintings depicting humans, laypersons had longer fixations than experts on the face and head areas, whereas for the more abstract categories the group differences disappeared. These results support the notion that while human figures are strongly salient in attracting the gaze, their effect can be inhibited by expert viewing strategies (Vogt, 1999; Vogt and Magnussen, 2007). However, the group differences were seen only in Part 1. In Part 2, the fixation durations were similar for both groups also in the representational paintings, most likely because the longer viewing time allowed subjects to concentrate on the details.
Several factors tend to keep the gaze focused on the center of the screen. First, between the displays, while subjects were answering the questionnaires, an image number was displayed in the center of the screen, which may have drawn the gaze toward the center at the beginning of the display of the next picture. Second, subjects have a general tendency to fixate the middle of the screen irrespective of the distribution of the image features (see Tatler, 2007). Third, in art, main figurative elements often appear in a central position (Locher et al., 2007; Tyler, 2007). However, the larger distance of fixations from the image center observed for experts indicates that expertise can inhibit the center-viewing tendency. This interpretation is in line with earlier suggestions that eye movements of experts cover wider areas of paintings than those of laypersons (Kapoula and Lestocart, 2006), or that experts generally rely more on global than on local viewing strategies compared with non-experts (Zangemeister et al., 1995). While laypersons concentrate on the details of the picture, experts also examine the spatial construction while evaluating the esthetics of the painting (Kapoula et al., 2008). However, in Part 2, the fixation distances from the center were similar in both groups. This result may reflect the combined effect of the longer viewing time, which allowed concentration on the details, and the given information, which guided the gaze similarly in both groups (Richardson et al., 2007). Nevertheless, we want to emphasize that in this kind of investigation the numbers of subjects and stimuli play an important role, since the viewing behavior varies considerably according to the artwork and the viewer, as illustrated in Figure 5.
The scanpaths in Part 1 were more similar within the layperson group than within the expert group. Despite the strong effect of low-level visual saliency (Koch and Ullman, 1985; Itti and Koch, 2001) in guiding saccades, semantically meaningful features attract fixations (Yarbus, 1967; Nyström and Holmqvist, 2008). We argue that, during the first viewing, both the social (human figures) and non-social saliency guided the gaze of the laypersons in a similar way, whereas the experts were using their individual training- and expertise-related strategies in scanning the pictures, resulting in a top-down inhibition of social (figurative) cues. The disappearance of the difference between similarity indices in Part 2 can be explained by the audio information that guided the viewing process similarly in both groups (Richardson et al., 2007). Interestingly, during the first 10 s of Part 2, the scanpaths were more similar within both groups than during the remaining periods. This finding agrees with earlier suggestions that early viewing is guided more by low-level processes before a stronger involvement of individual strategies comes into play (Tatler et al., 2005). Thus, with longer viewing time, the consistency of fixation locations between observers decreases. It is possible that the combined effect of the audio information and (social and non-social) saliency factors was more powerful at the beginning of the viewing period, after which the scanpaths became more individual.
The two parts in the present study are not directly comparable because of the longer duration and the additional provided information in Part 2. Some differences between the parts are discussed below.
Esthetic judgments of the paintings were higher in Part 2 than in Part 1. At least repetition, longer viewing time, and information given about the paintings have to be considered as possible contributing factors to this difference. However, repetition of images of real-world scenes has been shown to lower rather than increase preference ratings (Biederman and Vessel, 2006). Regarding the effects of viewing time, the results diverge. Locher et al. (2007) found that a longer viewing time (100 ms vs. unlimited time, mean 32.5 s) raises the pleasingness of art stimuli, whereas Smith et al. (2006) found no such effect (viewing times varying between 1, 5, 30, and 60 s). Moreover, Smith et al. (2006), by either showing or omitting painting captions, noted that the information about a painting did not affect the ratings of viewers (mixed group of art-trained and lay viewers). It is improbable that the information raised the ratings in the present experiment, since the ratings were also higher in Part 2 for the “rehearsal” pictures, which were not accompanied by auditory information. Thus, we argue that the ratings rose in Part 2 mainly because of the longer viewing time.
The eye-movement parameters also differed between the two parts of the experiment: in both the expert and layperson groups, the duration of fixations increased from Part 1 to Part 2 (first 10 s in Part 2), while the number of fixations decreased. Even though the longer viewing time was controlled for, as both parts were analyzed for the first 10 s, the subjects knew that the viewing time was longer in Part 2 than in Part 1, and they could thus take more time to examine the details, as reflected in the longer fixations.
Finally, the electrodermal reactivity was stronger in both groups in Part 2 than in Part 1. The information given during Part 2 affected the EDA values of the laypersons more than those of the experts, suggesting that the laypersons were more susceptible to the information given. This is understandable, as they knew less about the paintings and painters in advance.
The digitized pictures projected on the screen obviously did not have all qualities (e.g., texture, size) of the original paintings. The reproduction type (original or digital reproduction) does not, however, affect the targets of fixation (Kapoula et al., 2008). On the other hand, original paintings viewed in a gallery are rated as more pleasant and interesting than their slide or computer reproductions, both by art experts and by laypersons (Locher et al., 2001). In our study, the experts gave on average higher esthetic and emotional ratings than did the laypersons. This difference cannot be caused by the format of the paintings, which was the same for both groups; rather, it reflects the low ratings that the laypersons gave to the more abstract paintings.
In conclusion, we found that expertise in art history strongly influences the cognitive measures, but hardly any of the psychophysiological measures, of subjects experiencing art. Esthetic judgments and emotional valence ratings given by the laypersons depended on the level of abstraction, being more positive for the representational than for the abstract categories, whereas those of the experts did not show this tendency. Although the gaze patterns of both groups were similarly affected by the level of abstraction, expertise was reflected in the viewing strategies, e.g., in where the subjects were looking. This result agrees with the global viewing strategies previously detected in various expert groups.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
This study was supported by the Academy of Finland (National Centers of Excellence Program 2006–2011), the aivoAALTO project of the Aalto University, Aalto MIDE programme (project UI-ART), ERC Advanced Grant #232946 to Riitta Hari, and European Commission grant (FP7-PEOPLE-2009-IEF, EyeLevel 254638) to Sebastian Pannasch. We thank Cathy Nangini for valuable suggestions on earlier versions of the manuscript and Ville Renvall for support in the experiment preparation. Image credits for Figure 5: Eugène Boudin (French, 1824–1898), Villefranche, c. 1891. Oil on panel, 16 1/16″ × 12 7/8″ (40.8 cm × 32.7 cm). © Sterling and Francine Clark Art Institute, Williamstown, MA, USA, 1955.547. Juan Gris (José Victoriano González Pérez, Spanish, 1887–1927), Still Life before an Open Window, Place Ravignan, 1915. Oil on canvas, 45 5/8″ × 35″ (115.9 cm × 88.9 cm). Philadelphia Museum of Art, Philadelphia, PA, USA, The Louise and Walter Arensberg Collection, 1950, 1950-134-95.
References
Bilalic, M., Kiesel, A., Pohl, C., Erb, M., and Grodd, W. (2011). It takes two – skilled recognition of objects engages lateral areas in both hemispheres. PLoS ONE 6, e16202. doi: 10.1371/journal.pone.0016202
Illes, A. (2008). “Behind the beholder’s eye – searching for ‘expertness’ in gazing patterns,” in Proceedings of the 20th Biennial Congress of the International Association of Empirical Aesthetics, ed. K. S. Bordens (Chicago, IL: Indiana University-Purdue University Fort Wayne), 35–37.
Kapoula, Z., Yang, Q., Vernet, M., and Bucci, M. P. (2008). “2D-3D space perception in F. Bacon’s and Piero della Francesca’s paintings: eye movement studies,” in Proceedings of the 20th Biennial Congress of the International Association of Empirical Aesthetics, ed. K. S. Bordens (Chicago, IL: Indiana University-Purdue University Fort Wayne), 75–78.
Kundel, H. L., Nodine, C. F., Krupinski, E. A., and Mello-Thoms, C. (2008). Using gaze-tracking data and mixture distribution analysis to support a holistic model for the detection of cancers on mammograms. Acad. Radiol. 15, 881–886.
Locher, P. J., Smith, J. K., and Smith, L. F. (2001). The influence of presentation format and viewer training in the visual arts on the perception of pictorial and aesthetic qualities of paintings. Perception 30, 449–465.
Salvucci, D. D., and Goldberg, J. H. (2000). “Identifying fixations and saccades in eye-tracking protocols,” in Proceedings of the Eye Tracking Research and Applications Symposium, ed. A. T. Duchowski (New York, NY: ACM Press), 71–78.
Uusitalo, L., Simola, J., and Kuisma, J. (2009). “Perception of abstract and representative visual art,” in Proceedings of AIMAC, 10th Conference of the International Association of Arts and Cultural Management, Dallas, TX.
Voßkühler, A., Nordmeier, V., Kuchinke, L., and Jacobs, A. M. (2008). OGAMA – open gaze and mouse analyzer: open source software designed to analyze eye and mouse movements in slideshow study designs. Behav. Res. Methods 40, 1150–1162.