Edited by: Eric D. Ragan, Texas A&M University, USA
Reviewed by: Guillaume Moreau, Ecole Centrale de Nantes, France; Eric Hodgson, Miami University, USA
Specialty section: This article was submitted to Virtual Environments, a section of the journal Frontiers in ICT
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
One potential application for virtual environments (VEs) is the training of spatial knowledge. A critical question is what features the VE should have in order to facilitate this training. Previous research has shown that people rely on environmental features, such as sockets and wall decorations, when learning object locations. The aim of this study is to explore the effect of varied environmental feature fidelity of VEs, the use of self-avatars, and the level of immersion on object location learning and recall. Following a between-subjects experimental design, participants were asked to learn the location of three identical objects by navigating one of the three environments: a physical laboratory or low and high detail VE replicas of this laboratory. Participants who experienced the VEs could use either a head-mounted display (HMD) or a desktop computer. Half of the participants learning in the HMD and desktop systems were assigned a virtual body. Participants were then asked to place physical versions of the three objects in the physical laboratory in the same configuration. We tracked participant movement, measured object placement, and administered a questionnaire related to aspects of the experience. HMD learning resulted in statistically significantly higher performance than desktop learning. Results indicate that, when learning in low detail VEs, there is no difference in performance between participants using HMD and desktop systems. Overall, providing the participant with a virtual body had a negative impact on performance. Preliminary inspection of navigation data indicates that spatial learning strategies are different in systems with varying levels of immersion.
If virtual environments (VEs) are going to be used for spatial training, it is critical to understand how people explore and perceive surrounding space and the objects contained in it. However, the question of how humans learn and recall locations within an environment remains unanswered, since contrasting results have been reported. Previous findings suggest that the human brain could combine mechanisms based on geometric properties of the environment as well as self-motion. Moreover, it is not clear whether these strategies are the same when encountering real and virtual environments.
Previous research has looked at the effect of landmark configuration on search behavior. Spetch et al. (
Waller et al. (
Hartley et al. (
There is also evidence for spatial updating of egocentric representations (Wang and Simons,
If humans generate a cognitive map based on experience, awareness of the syntheticness of a computer-generated environment and the way it is explored may have an impact on spatial encoding and recall. Considering object location learning and recall as the crucial and most elementary form of training, this study explores the impact of environmental features, self-avatar, and immersion on spatial memory by asking participants to learn and recall the location of three identical objects.
When training in a VE, it is important to have an understanding of the technological variables that can be sacrificed without degrading learning effectiveness transfer to the real world (Waller et al.,
Although geometric fidelity of a space can be reproduced using basic 3D objects, such as planes, spheres, or cubes, high feature fidelity is not always achievable or may result in the development of computationally expensive systems. Previous studies have assessed the impact of rendering style on distance perception accuracy in virtual replicas of concurrently occupied VEs (Interrante et al.,
Based on previous results, in our study, we directly compare performance resulting from learning object locations in concurrently occupied virtual and real environments. We explore learning and recall of multiple external object locations as subjective measures of spatial perception. Our research focuses on understanding which cues are necessary for the design of virtual spaces that will ensure the optimal transfer of spatial knowledge to the real world. It is the purpose of this study to explore object location learning and recall in VEs with varying levels of feature fidelity.
Slater and Usoh (
The term
In this study, we aimed to explore the effect of level of immersion, the presence or absence of a virtual body, and the role of environmental features on object location memory. We compared placement accuracy when object locations were learnt in the real world and object locations were learnt in two distinct virtual replicas of the environment: a high detail 3D scan, where color, environmental, and geometric features are available, and a low-detail non-photorealistic replica of the shape of the room, where only geometric features were accessible. Participants learnt the position of three identical objects in one of the three environments as shown in Figure
Participants observed the VEs and learnt object positions in different systems following a 2 × 2 × 2 design, with fidelity (high detail, low detail) as a within-subjects factor and avatar (body, no body) and level of immersion (HMD, desktop learning system) as between-subjects factors. Real world learning in the real environment with physical objects was treated as an additional learning system. Table
VE detail / Learning system | Desktop body | Desktop no body | HMD body | HMD no body | Real world |
---|---|---|---|---|---|
Low detail VE | Desktop body, low detail VE | Desktop no body, low detail VE | HMD body, low detail VE | HMD no body, low detail VE | – |
High detail VE | Desktop body, high detail VE | Desktop no body, high detail VE | HMD body, high detail VE | HMD no body, high detail VE | – |
Real environment | – | – | – | – | Real world |
Participants in the learning system conditions with a virtual body were assigned a single point tracking avatar model based on head tracking. In other words, a fixed mannequin was placed underneath the participant’s head position, with no other reference points or animated movements. Participants learning in the real world and in the HMD learning system conditions were able to explore the space by physically walking around the room. Participants learning in the desktop system condition were able to navigate the room by using keyboard and mouse control, to change position and view, respectively. All participants completed the learning stage in one of the three learning systems and then placed the physical objects in the real world (see “
We hypothesized that providing optic flow information, natural locomotion, and access to idiothetic cues in an HMD would promote higher similarity with real world learning in terms of placement accuracy and navigation. Previous results have indicated that training in a virtual environment of relatively low fidelity allows people to develop useful representations of large-scale navigable space (Waller et al.,
The experiment was conducted in a lab at University College London. The laboratory consisted of a 6 m long × 4 m wide × 3 m high open space. The high detail VE consisted of a high fidelity 3D laser scan point cloud of the room with textures derived from photographs, rendered with a GPU-based point cloud renderer. 3D scanning was performed with a Faro Focus 3D S120 laser scanner. The low detail VE was modeled using diffuse shaded planes to reproduce the geometric shape of the laboratory. Figure
Head tracking and object positional data were logged with a NaturalPoint OptiTrack motion capture system using twelve Flex 3 cameras and retroreflective markers, at a sampling rate of 60 Hz. The measured mean tracking error was 3 mm. A 27″ Dell U2713HM monitor and an Oculus Rift Development Kit 2 (DK2) were used as displays for the desktop and HMD learning conditions, respectively. High fidelity single point tracking virtual avatars, based on head tracking, were used in the corresponding desktop body and HMD body conditions. Female and male avatar models were obtained from the Rocketbox® Library (Havok,
A total of 20 participants (9 females, 11 males; average age 26 years, SD = 5.3) were recruited from the student and staff population at University College London. All participants signed a consent form and the study was approved by the University College London Research Ethics Committee (project ID: 6708/002). Participants were paid £10 for participation. They were assigned to the different experimental conditions based on individual results for a standard spatial ability test to avoid any possible bias between groups (Bodner and Guay,
The experimental task consisted of two phases, before and during the lab session. Table
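Phase | Step | Stage |
---|---|---|
Before lab session | Online consent | |
Before lab session | Background questionnaire | |
Before lab session | Spatial ability test | |
During lab session | Consent | |
During lab session | Trial 1 | Learning stage (high detail VE, low detail VE, or real world) |
During lab session | Trial 1 | Recall stage (real world) |
During lab session | Trial 1 | Questionnaire |
During lab session | Trial 2 | Learning stage (high detail VE, low detail VE, or real world) |
During lab session | Trial 2 | Recall stage (real world) |
During lab session | Trial 2 | Questionnaire |
During lab session | Debrief interview | |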
During the lab session, participants were asked to sign a paper copy of the consent form and asked to read an information sheet with written instructions describing the experimental task. Participants were asked to switch off their mobile phones and were introduced into the lab. No practice trials were done, and participants were not given feedback on their performance throughout the experiment.
The experimental task consisted of two trials, each with a learning and a recall stage. The learning stage involved viewing the three objects, either as physical objects in the real room or as virtual objects in the low or high detail VE, under one of the three learning system conditions: real world, desktop, or HMD. In the recall stage, participants were asked to place the three physical objects in the real room as they remembered them from the learning stage. No further information was given, and participants were asked to try their best if they were in doubt as to where an object's original position was. There was no time limit for the learning and recall stages, and participants were able to freely navigate the environment. Participants could navigate through all objects of the environment, but not through the environment boundaries. An experimenter was present at all times during the experimental task to manage cables and provide guidance on the different experimental stages.
Participants learning in the HMD and desktop learning systems (16 participants) performed the two trials, each corresponding to one of the two versions of the VE in the learning stage: high detail and low detail. Participants experienced the two VEs in different orders, ensuring that the two possible combinations were tested equally. Participants learning in the real world (4 participants) performed the same trial twice, always learning in the real room. In each trial, and for each participant, all three objects were randomly arranged on a conceptual 5 × 5 grid, avoiding straight line configurations. Participants could not see the grid in the environment and were asked to ignore retroreflective markers on the stools, which were used to track and identify the stools for data collection.
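The random arrangement described above can be illustrated with a minimal sketch. It assumes that "straight line configurations" means any three collinear grid cells (rows, columns, diagonals, and other slopes); the study does not state the exact rule, and the function names are hypothetical rather than taken from the experiment code.

```python
import random

GRID_SIZE = 5  # conceptual 5 x 5 grid described in the text


def is_collinear(a, b, c):
    """Return True if three grid cells lie on one straight line."""
    # The 2D cross product of (b - a) and (c - a) is zero for collinear points.
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]) == 0


def sample_arrangement(rng):
    """Draw three distinct grid cells, rejecting straight-line layouts."""
    cells = [(col, row) for col in range(GRID_SIZE) for row in range(GRID_SIZE)]
    while True:
        candidate = rng.sample(cells, 3)  # three distinct cells
        if not is_collinear(*candidate):
            return candidate


# Hypothetical usage: one arrangement per trial, seeded for reproducibility.
arrangement = sample_arrangement(random.Random(1))
```

Rejection sampling is used here because most three-cell layouts on a 5 × 5 grid are not collinear, so few draws are discarded.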
After each trial, participants were asked to complete a short online questionnaire measuring examination, confidence, difficulty, movement, application, and observation (see Table
Variable | Question | Likert scale range |
---|---|---|
Examination | The learning environment allowed me to closely examine the objects | 1: poorly–5: very well |
Confidence | I am confident that I performed the task well | 1: unconfident–5: confident |
Difficulty | The placement task was… | 1: easy–5: difficult |
Movement | I could move around the learning environment as I wanted | 1: disagree–5: agree |
Application | I could directly apply what I learned in the learning environment when placing the objects in the real room | 1: disagree–5: agree |
Observation | The learning environment allowed me to naturally observe and learn the object positions | 1: disagree–5: agree |
Tracked object placement data were used to calculate the Euclidean distance, referred to as placement error, between object positions as placed by participants in the recall stage and original object positions. Figure
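As a rough illustration of the placement error measure, the sketch below computes the Euclidean distance between a placed and an original position. It assumes floor-plane coordinates in metres and that each placed object has already been paired with its original location (for example through the tracked marker identity); the function names and coordinates are hypothetical and not taken from the study.

```python
import math


def placement_error(placed, original):
    """Euclidean distance between a placed and an original object position."""
    return math.dist(placed, original)


def mean_placement_error(placed_positions, original_positions):
    """Average placement error over already-paired objects."""
    if len(placed_positions) != len(original_positions):
        raise ValueError("positions must be paired one to one")
    errors = [placement_error(p, o)
              for p, o in zip(placed_positions, original_positions)]
    return sum(errors) / len(errors)


# Made-up coordinates, for illustration only.
example_error = placement_error((0.0, 0.0), (3.0, 4.0))  # 5.0
```

Because the three objects were identical, a different pairing rule (such as choosing the assignment with the smallest total distance) would yield different errors; the sketch leaves that choice to the caller.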
A three-way mixed ANOVA with fidelity (high detail, low detail) as a within-subjects factor and avatar (body, no body) and level of immersion (HMD, desktop learning system) as between-subjects factors was run. There were no outliers in the data, as assessed by inspection of a box plot. There was homogeneity of variances for both high detail placement errors (
Statistical significance of simple main effects was accepted at a Bonferroni-adjusted alpha level of 0.025. There was a statistically significant simple main effect of avatar for the low detail environment,
A Kruskal–Wallis H test showed that there was an overall statistically significant difference in placement error between the different learning systems, χ2(2) = 56.452,
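For readers unfamiliar with the Kruskal–Wallis H test, the following is a minimal pure-Python sketch of the statistic with the standard tie correction. It is not the analysis code used in the study; the function names are hypothetical, and the closed-form p-value helper applies only to the three-group case (two degrees of freedom).

```python
import math
from collections import Counter


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Tie correction; it is zero (undefined H) if all values are identical.
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n_total ** 3 - n_total)
    return h / correction


def p_value_two_df(h):
    """Chi-square upper tail for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)
```

With three groups, H is compared against a chi-square distribution with two degrees of freedom, which matches the χ2(2) notation used in the text.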
A one-way between-subjects ANOVA was performed on questionnaire responses for desktop body, desktop no body, HMD body, and HMD no body learning system conditions, for high and low detail VEs. Results show a large number of mixed significant interactions with no overarching trend due to the limited number of repetitions.
Tracking results, shown in Figure
A one-way between-subjects ANOVA was conducted to compare the effect of learning system on the percentage of time spent navigating inside the conceptual 5 × 5 object grid in desktop, HMD, and real-world learning system conditions. There was a significant effect of learning system on percentage time spent navigating inside the conceptual 5 × 5 object grid at the
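The percentage of time spent inside the object grid can be sketched as follows. Because tracking was logged at a fixed sampling rate, the share of samples inside the grid approximates the share of time. The grid bounds, coordinate frame, and function name are assumptions made for illustration, since the text does not give the grid's position in room coordinates.

```python
def percent_time_inside(samples, x_min, x_max, y_min, y_max):
    """Percentage of fixed-rate (x, y) samples inside a rectangular region."""
    samples = list(samples)
    if not samples:
        raise ValueError("no tracking samples")
    inside = sum(
        1 for x, y in samples
        if x_min <= x <= x_max and y_min <= y <= y_max
    )
    return 100.0 * inside / len(samples)
```

Each participant would contribute one percentage, and those per-participant values would then be the dependent variable in the ANOVA.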
To further illustrate differences in navigation strategies, we created cluster heat maps of the time spent in each region of the room for each of the system conditions: Desktop (left), HMD (middle), and real world (right), shown in Figure
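One simple way to obtain the data behind such a heat map is to bin head positions into square floor cells and convert sample counts to seconds. The sketch below is a plain 2D histogram, which may differ from the clustering used for the published figure; the cell size and function name are hypothetical, and the default rate follows the 60 Hz sampling reported above.

```python
from collections import Counter


def dwell_time_map(samples, cell_size, sample_rate_hz=60.0):
    """Seconds spent in each square floor cell, keyed by (column, row)."""
    counts = Counter()
    for x, y in samples:
        # Floor division maps a position to the index of its cell.
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return {cell: n / sample_rate_hz for cell, n in counts.items()}
```

The resulting dictionary can be passed to any plotting library to colour each cell by dwell time.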
This study analyses object location memory transfer from VR to the real world. It extends previous work on spatial perception in VEs (Ellis and Menges,
Our results show that HMD learning resulted in statistically significantly higher performance than desktop learning. Our analysis suggests that availability of environmental features in VEs can enhance object location memory under certain setups. The overall negative effect of the self-avatar indicates that single point tracked virtual bodies may not be sufficient to increase performance in this experimental task. Specifically, the use of a self-avatar in HMD body learning impaired placement accuracy. Single point tracking caused the virtual self-avatar to appear in front of the participant's real body if they leaned forward, partially occluding some of the available environmental features. The degradation in performance might therefore have arisen because the virtual body hid features that the participant could otherwise have attended to, which in turn might have forced a change to a different strategy for learning one or more object placements. Moreover, the lack of motion fidelity provided by single point virtual bodies might interfere with presence in VEs.
The results on navigation strategies seem promising. Similar to participants learning in the real world, participants learning in the HMD system mainly navigated areas within the boundaries of the conceptual 5 × 5 object grid, whereas participants learning in the desktop system primarily explored areas outside the boundaries of the conceptual 5 × 5 object grid. This may suggest that, when learning object locations in less immersive systems, users navigate toward the environment boundaries to obtain more global views of the scene. In addition, the range of areas of the room accessed by participants learning in the desktop system was wider, along both the X and Y axes, than the range accessed by participants learning in the real world and HMD system. Although differences in navigation in systems with varying levels of immersion have been reported (Ruddle et al.,
One of the limitations of the work presented here is the relatively low number of participants. A larger population sample is needed to further validate our results as well as to explore the effect of more complex self-avatars with higher motion fidelity on spatial memory. It would also allow us to analyze navigation trajectories in more detail, exploring the regions visited by participants in relation to the object locations and features of the environment. Other experimental tasks comparing object location memory in systems with varying levels of immersion are required to confirm whether our results are generalizable.
In this paper, we present a study on object location memory. The experimental task involves several judgments, including distance estimation, and it is not clear exactly what strategies participants use to learn object locations (Hartley et al.,
We believe that the main outcomes of this study could be generalized to other spatial learning scenarios and assist experts in the design of training simulations related to spatial memory, where trainees are required to remember component or tool locations as part of the task. Overall, our results indicate that HMD training resembles real world training more closely than desktop training does, as reflected in higher object location memory accuracy. However, desktop training applications can be suitable and offer acceptable results when precise location learning accuracy is not required. Regarding self-avatars, our results suggest that a low fidelity avatar representation can degrade object location memory. In our experimental task, this observation is particularly important when the training transfer takes place from a low fidelity VE, where only basic geometric cues are available, to the real world equivalent.
MM-L wrote the code, ran the study, analyzed the data, and wrote the paper. AS supervised MM-L through each stage of the work including planning and running the study, analyzing the data, and writing the paper.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.