THE VIRTUAL TOURIST: COGNITIVE STRATEGIES AND DIFFERENCES IN NAVIGATION AND MAP USE WHILE EXPLORING AN IMAGINARY CITY

This paper, submitted for the Workshop/Theme session on Virtual & Augmented Reality: Technology, Design & Human Factors, organized by ISPRS Working Group IV/9, explores the research field opened by experiments in virtual environments from a multidisciplinary approach. At the recently established Cognitive Cartography Lab, Eötvös University, Budapest, we designed an experiment to study and better understand the role of visuospatial displays in spatial cognition, in particular the cognitive conditions of navigation in an imaginary city with a map. Below we present some preliminary results based on our experiments recording the spatial behaviour of 62 subjects, including their verbal reactions and eye tracking data collected during the sessions. We measured the wayfinding behaviour of participants after an active or passive learning phase. The analysis of the accumulated data suggested no significant differences in the efficiency of spatial problem solving between the groups of subjects. On further investigation we found that although salient visual cues attracted the attention of the participants, they could not benefit from this knowledge of landmarks in the actual navigational tasks. Despite the lack of group differences, the low number of instances of getting lost in such a complex, large-scale virtual environment suggests that participants could solve the navigational tasks rather efficiently, most probably by using different cognitive strategies. The project was part of an educational development plan and was supported by the Student Talent Grant of Eötvös University. It was designed by a multidisciplinary research group including university students and offered them the opportunity to collaborate, cross disciplinary borders and develop their profile while contributing to front-line scientific research.

Figure 1. Street views of the imaginary town ‘Szegvár’: the navigational environment constructed for the virtual tourist

* Corresponding author, ORCID ID: 0000-0002-6447-1499

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-4, 2018 ISPRS TC IV Mid-term Symposium “3D Spatial Information Science – The Engine of Change”, 1–5 October 2018, Delft, The Netherlands. This contribution has been peer-reviewed. https://doi.org/10.5194/isprs-archives-XLII-4-631-2018 | © Authors 2018. CC BY 4.0 License.


INTRODUCTION
Although human navigation in both natural and virtual environments is based on multiple factors, external cues such as landmarks play a major role. In highly complex spaces, verbal instructions and/or visuospatial displays, first of all maps, assist the human cognitive system as external cognitive tools. Spatial behaviour, wayfinding and navigation are also influenced by learning modes: active learning, meaning active exploration and decision making, and passive learning, meaning being a passive observer of the environment.
Previous psychological studies suggest that active learning leads to better spatial performance since it facilitates the cognitive processes that develop a spatial representation of the environment, while the passive learning mode tends to lead to poorer spatial knowledge acquisition. Since the extension of the field by theories of cartographic communication, cartography has expanded into cognitive realms, and the rather ambiguous concept of the 'cognitive map' has been firmly established in the discipline as a mental representation, in most cases compared to real, cartographic maps as material or virtual representations. The study of cartographic visualization as a cognitive process emerged as a field between the traditional academic fields of psychology, cognitive science, geography and cartography. In the last two decades or so, new experimental technologies (e.g. neuroimaging, eye tracking), virtual and augmented reality (VR and AR), and ever-developing informatics (computational analytics, network technology, visualization) have brought revolutionary insights into brain processes. At the same time, the need for a more holistic and realistic study of cognitive processes, first of all spatial cognition, has resulted in multi- and transdisciplinary approaches.
The project introduced below represents this new approach to solving some pressing problems in different disciplines, notably psychology and cognitive science on one side and geography, geo-visualization and geoinformatics on the other. Traditional or virtual maps have long been used in psychological experiments as stimuli, and cartographers have been experimenting with maps and graphic representations of spatial knowledge (Fabrikant and Lobben 2009; Çöltekin et al. 2017).
In our opinion, a better understanding of spatial cognition processes and the design of more effective and cognitively relevant graphic, visuospatial displays assisting human spatial problem solving are not separate issues, but aspects of the same processes. To solve these pressing theoretical and practical problems, we need an integrated approach.

HUMAN SPATIAL KNOWLEDGE AND NAVIGATIONAL STRATEGIES
Human spatial behaviour and cognitive strategies during navigation, such as how to get ice-cream or how to find our favourite bookstore in our hometown, depend on the complexity or the geometrical structure of the town (Lynch, 1960), the goal of navigation (Dogu and Erkip, 2000) or the type of spatial information one has acquired (Montello, 1988). Therefore, humans might use different strategies depending on whether they are in a familiar or in a novel, large-scale environment (Ruddle et al. 1999; Montello, 1988; Herman and Siegel, 1978). In this way, how we navigate to our pet's veterinarian may differ from how we try to find the best souvenir for our friends during a weekend trip in a new place. Furthermore, in many cases we cannot rely on our internal cognitive representations of space, and we have been using external supporting aids for millennia (Török and Török, 2018). From the first maps to modern-day GPS, cognitive tools help us in orienting, wayfinding and navigating in space.
Siegel and White (1975) proposed that spatial knowledge acquisition follows a hierarchical structure. First, when we encounter a novel environment, we acquire landmark knowledge (Thorndyke and Hayes-Roth, 1982). It contains distinctive elements of the environment or scenes stored in memory (Montello, 1988; Mallot, 2000; Zimmer, 2004), like a water tower or a Ferris wheel.
The next level in the hierarchy is route knowledge. It consists of landmarks, places, the sequential turns or directions attached to them, and the route connections between them (Gale et al. 1990; Golledge 2003). Route knowledge enables us to follow an already known path from one point to a destination, and contains sequential elements of directions, like a turn to the right and then a turn to the left at a given landmark (Chrastil and Warren, 2011; Thorndyke and Hayes-Roth, 1982).
Finally, if we gain an integrated knowledge of the environment over time, by experience or by external aids such as maps, we can acquire the highest or most advanced level of spatial representation, survey knowledge (Hart and Moore 1973; Siegel and White, 1975; Thorndyke and Hayes-Roth, 1982). This internal representation of a large-scale, geographic environment is called a 'cognitive map' (Tolman, 1948). Survey knowledge supports effective navigation and route planning better than route knowledge does (Golledge et al., 2000; Siegel and White, 1975; Münzer et al. 2006), because it provides an external reference frame which leads to an overview of the spatial layout, and this enables us to take shortcuts or to navigate along unfamiliar routes.
The high importance of reference frames in human navigation is underlined by recent neuropsychological research showing their dependence on viewpoint (Török et al. 2014). Furthermore, this integrated spatial representation provides metric information, such as the relationships between different locations and landmarks (Péruch and Wilson, 2004). The type of spatial knowledge which can be acquired in a novel environment depends on the goals of learning, on the effort invested, or on the structure of the environment (Allen, 1999; Chrastil and Warren, 2011).
Furthermore, as mentioned above, humans tend to use different graphic representations to facilitate their navigation: spatial knowledge can be refined by using overview methods, such as maps (Golledge et al. 2000; Lloyd, 1989; Thorndyke and Hayes-Roth, 1982).
However, not every type of navigation assistance contributes to the same aspects of spatial knowledge. We can make a distinction between static and dynamic spatial representations. The former category refers to classic paper-based, north-oriented maps with an allocentric (or geographic) reference frame, while the latter refers here to novel navigational systems such as GPS-based systems and mobile apps. These latter systems can track and visualize the spatial location of the user and offer a forward-up (egocentric) orientation, which means that they rotate according to the user's heading (Münzer et al. 2006; Willis et al. 2009).
In earlier studies comparing the performance of individuals in wayfinding tasks, that is, goal-directed spatial behaviour in which we need to execute a purposeful movement to reach a specific destination (Allen, 1999), it was found that classic, map-based navigation generally leads to better acquisition of survey knowledge. Individuals using a map were better at route distance estimation and at pointing, or had a better, less fragmented cognitive map of the environment (Willis et al. 2009; Darken, 1998). In contrast, groups who relied on dynamic representations such as mobile apps tended to be worse in tasks where survey knowledge would have been required, and their performance suggested the acquisition of route knowledge only.
These results may be caused by the difference in the required cognitive processes: a static map seems to support spatial learning in a more active way than a dynamic system does. First, it offers a stable, allocentric reference frame (Klatzky, 1998) and therefore represents the spatial relationships between different spatial elements. Secondly, if we use maps for wayfinding purposes, we need to actively process the spatial information in order to extract the wanted route from the map (Münzer et al. 2006). Survey knowledge is supported by maps since they provide a well-structured, holistic model of the spatial relations, and therefore the acquisition of spatial information can be more coherent (Zimmer 2004; Thorndyke and Hayes-Roth, 1982). Thus individuals doing map-based navigation do not need much experience to build up a memory representation of the spatial layout (Lloyd, 1986).
Münzer and his colleagues (2006) propose that map-based navigation also fulfils the presuppositions of spatial learning, namely it provides a survey model of the environment. On the other hand, with a conventional orientation, route information extracted from the map requires mental rotation to fit a personal viewpoint of the environment.
In most of the experimental studies the researchers used the term active navigation to refer to physically active, self-governed navigation, accompanied by self-motion and all of its sensory-motor consequences (Gaunet et al. 2001; Larish and Andersen 1995; Herman and Siegel 1978). The term passive, in contrast, refers to the lack of self-motion, that is, to every case in which the person is just an observer of a video, a slide show, or somebody else's navigation. This issue has also been investigated in virtual environments (VE), where even more ambiguous results were obtained (Wilson and Péruch 2004; Wilson et al. 1997). A virtual environment is a convenient tool to investigate spatial learning because spatial factors can be controlled, the layout complexity can be varied, and the routes and actions can be continuously measured during a session (Péruch and Gaunet, 1970; Carassa et al. 2002). Moreover, its visual spatial characteristics are similar to those of real-world navigation (Richardson et al. 1999). Possible explanations of the ambiguous results include the problem of directed navigation in non-immersive VR, the lack of proprioceptive information, or the lack of active decision making during exploration.
In their study Carassa et al. (2002) found that active decision making leads to better spatial representation. Their participants had to navigate inside a virtual building, but while one group had to passively follow an avatar, the other group could make route decisions. They found that active decision making positively influenced the participants' mental representation of the building. However, contrary to these results, Wilson and Péruch (2002) found that passive observers were more accurate at pointing to the targets when they were sitting next to an active navigator and watched their movements. Furthermore, there are studies which did not find any difference between the learning methods (Wilson, 1999; Wilson et al. 1997).

THE 'VIRTUAL TOURIST' EXPERIMENT
In this study, we investigated the differences between active and passive spatial learning in a large-scale virtual environment, a town, by comparing the navigational performance of one free exploration group and two guided exploration groups. While the first group could freely and actively interact with the environment, the other two groups had to follow pre-determined routes. Furthermore, all participants could use a static, north-oriented map during navigation.
Since maps positively influence the acquisition of survey knowledge, and active exploration might also help the development of an inner spatial representation, our research hypothesis was that this benefit would appear not only in better wayfinding performance in terms of time and accuracy, but also in more frequent map use. More precisely, we supposed that if the active exploration group interacted with the VE in a more inclusive way, they should use the map more often than the other two groups.

The VR gamification experiment design
In our research paradigm the participants sitting in a chair in front of a 21" LCD screen navigated in an imaginary city.
Adopting the approach of gamification, participants were first guided through the streets of the unknown town to explore its attractions. Next in the storyline they were given some free time, and finally, while standing in front of the cathedral at the main square (salient landmark), they were instructed to find the locations of different specialty shops in the town by remembering the verbal descriptions and, whenever it was needed, using the map.
Clues in the spatial narrative, that is, the verbal route description, included relevant directional (in both geocentric and egocentric reference frames), landmark and/or beacon information. The statistical analysis in progress includes general linear mixed-effects modelling of the proportion and frequency of map use (and spot) throughout the different target-finding tasks; task order and gender are used as independent variables.

Constructing the map of an imaginary city
On the upper half of the vertically placed monitor, research subjects could see the 3D reconstruction of the fantasy town 'Szegvár'. Below the interactive environment, in the lower part of the screen, a static city map was presented. It was derived from an old instructional topographic map (Figure 2): in the period when topographic maps were secret, that map could be used to train general map reading without restrictions. We took advantage of this old map of an imaginary Hungarian town, realistic but still unknown to all. To construct the map for the VR experiment we used only a smaller, central part (~1×1 km) of the original map sheet.
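The split-screen layout suggests a simple area-of-interest (AOI) assignment for gaze samples. The following is a minimal sketch only, not the procedure used in the study: it assumes the portrait 1080×1920 resolution reported for the display, and it assumes that the boundary between the 3D view and the map lies exactly at half the screen height, which the text does not state. The function name `classify_gaze` is hypothetical.

```python
SCREEN_W, SCREEN_H = 1080, 1920  # portrait display resolution reported in the paper
SPLIT_Y = SCREEN_H // 2          # ASSUMPTION: town/map boundary at mid-height


def classify_gaze(x, y):
    """Assign a gaze sample (in pixels) to an area of interest.

    Screen convention: origin at the top-left corner, y grows downward,
    so the 3D town occupies the smaller y values and the map the larger ones.
    """
    if not (0 <= x < SCREEN_W and 0 <= y < SCREEN_H):
        return "offscreen"
    return "town" if y < SPLIT_Y else "map"
```

With this convention a sample near the top of the screen is labelled as town and one near the bottom as map; the true boundary would have to be taken from the actual screen layout.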
The buildings were visualized as contiguous blocks, because separate buildings were not relevant to the experiment. Vegetation cover was shown only outside the town, and only for groups of trees. The road network was marked in white, except the main roads, which were indicated by yellow lines. The map represented all the roads of the town and marked the shops and the landmark buildings by conventional signs. Of the street names, the map showed only those of the four major streets.

Constructing the 3D town model
The digital map was used to build up a coherent 3D environment, which was programmed in the virtual reality game software Unity 4 (4.6.7f1) and finally presented on an LCD screen with a resolution of 1080×1920 pixels.
Considering the complexity of the environment, we carefully set the physical extension of the 3D model. With constant walking speed (which was fixed and the same for everyone), the town could have been walked around on the major roads in approximately five minutes. The town did not include any global landmarks visible from everywhere, and the target locations could not be seen from the starting point.
The town had four major roads (Figure 2) encircling the inner town. The other major cityscape elements were its four squares: one was the town centre, and the other three were located in the northern, southern and eastern parts of the town. A unique building serving as a salient landmark (the Cathedral, the Library, the Fine Art Museum, and the Cartography Museum) was placed on each square. Otherwise, the houses of the town were similar to each other in style, as we used only five different house fronts to increase the need to use labels and street signs. The facades of the souvenir shops in the streets were selected from a sample of seven different building fronts, and the town had altogether forty-one (41) shops. The facades of the shops were different from those of the houses; moreover, large labels showing their names identified them. Similar boards were visible on the unique buildings on the squares, too. Finally, the street names also appeared on buildings at the corners, on white boards with black calligraphy. To make the town more familiar for Hungarian research subjects, the streets, whose names are not shown on topographic maps, were named after famous Hungarian scientists and artists. The blocks in the town consisted of three different building types: houses, shops, and unique buildings on the squares.

Apparatus and stimuli
The participants sat in front of the screen at a distance of approximately 55 centimetres. To navigate in the virtual environment they could use a keyboard placed in front of them. Before participants entered the VR environment, we calibrated the remote eye tracking device (EyeTribe), which was placed under the screen to track and record eye movements during the entire navigation experiment. The low-cost instrument (with the default sampling rate of 30 Hz) and the EyeTribe software work with an average accuracy of less than 0.5° of visual angle. As our previous experiments with GazeTracker showed (Török and Bérces, 3013), it is well suited to recording and studying fixations using a normal laptop.
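To give a feel for what these figures mean on screen, the sketch below converts the stated 0.5° accuracy at a 55 cm viewing distance into an on-screen size, using the standard relation s = 2·d·tan(θ/2). It is an illustration under assumptions, not part of the original analysis: the pixel density is derived by assuming that the 21" figure is the diagonal of the visible area and that pixels are square, and the helper names are hypothetical.

```python
import math


def angle_to_cm(angle_deg, distance_cm):
    """On-screen extent (cm) subtended by a visual angle at a viewing distance."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


def pixels_per_cm(diagonal_inch, width_px, height_px):
    """Pixel density, ASSUMING square pixels and that the diagonal
    refers to the visible screen area."""
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_px / (diagonal_inch * 2.54)


# Values taken from the text: 0.5 deg accuracy, 55 cm distance,
# 21" screen at 1080x1920 pixels, 30 Hz sampling rate.
error_cm = angle_to_cm(0.5, 55)
error_px = error_cm * pixels_per_cm(21, 1080, 1920)
sample_interval_ms = 1000 / 30  # time between two gaze samples

print(f"{error_cm:.2f} cm, {error_px:.1f} px, {sample_interval_ms:.1f} ms")
```

Under these assumptions the stated accuracy corresponds to roughly 0.48 cm, or about 20 pixels, on the screen, and consecutive gaze samples are about 33 ms apart.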

Guided navigation with verbal instructions
Sessions started at the main square, at the centre of the town, and ended there. The free exploration group explored the town freely for five minutes, then they had to return to the town centre. The time was measured by the experimenter. If they had trouble navigating back to the starting point, they got a hint about the right direction, but only after two minutes of wandering. Participants in the other, so-called guided exploration group were told which direction they should follow. They heard the verbal directions (e.g. "turn right at the end of Rákóczi street") in smaller information chunks to keep memory load low. The given route followed a circle in the town, including a section of the main roads and the four squares. This guided tour lasted five minutes, but the participants could spend as much time as they wished consulting the map to identify their location. The time spent on the map task never exceeded a few minutes.
Participants in the third group (guided exploration with route representation) could see an extra line, drawn in purple on the map, which marked the route they had to follow. They also received verbal instructions on what route they should follow during exploration.
Table 1. The three groups of participants

Search task and route planning
In the navigation task we selected five different shops as target points. These were at approximately the same distance from each other (a 1.20-minute walk on the optimal path) and from the town centre (0.50 minute).
Then we created a 5×5 matrix which contained the possible routes from one shop to another, and described each route verbally. Each participant heard four different route descriptions from the matrix. In sum we used nine different combinations of the routes, so each route sequence was used at least two times. The route plans offered the optimal way to the shops, considering the distance and the complexity (number of turns) of the routes.
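The structure of such a route matrix can be sketched as follows. This is only an illustration under assumptions: the shop labels `S1` to `S5` and the placeholder description strings are hypothetical, the diagonal of the matrix (a shop to itself) is assumed to be empty, and the sketch covers shop-to-shop legs only, since the text does not specify how the first leg from the town centre was represented.

```python
from itertools import permutations

shops = ["S1", "S2", "S3", "S4", "S5"]  # hypothetical labels for the five target shops

# Off-diagonal cells of the 5x5 matrix: one entry per ordered pair of distinct shops.
# The strings are placeholders standing in for the verbal route descriptions.
routes = {
    (start, goal): f"route description {start} -> {goal}"
    for start, goal in permutations(shops, 2)
}


def legs(sequence):
    """Split a sequence of visited shops into consecutive (start, goal) legs."""
    return list(zip(sequence, sequence[1:]))


# Example: look up the descriptions needed for one hypothetical visiting order.
order = ["S2", "S5", "S1", "S3"]
descriptions = [routes[leg] for leg in legs(order)]
```

A 5×5 matrix with an empty diagonal holds 20 ordered shop-to-shop routes, so the nine route combinations used in the experiment draw on a subset of these cells.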
Participants had to navigate to four different shops with the help of the verbal instructions, the route descriptions. A route description consisted of 4-6 different chunks of information selected from seven possible categories. These were the following:
- allocentric direction (North/South/East/West)
- egocentric direction (left/right/ahead)
- street name
- main street name which appeared on the map
- square name
- intersection
- order in numbers (e.g. "at the third street...")

Preliminary analysis of data
To investigate the difference in the frequency of fixations between the map and the town we analysed our data with a general linear model in R v3.4.3 (R Core Team, 2018). Our expectation was to find a difference between the proportions of fixations on the map and on the town. The number of fixations on the map should be higher when a person received a route description containing a landmark that could be seen on the map. We also expected the Condition factor to have an effect on Relative Fixation, because a better cognitive map should lead to fewer occasions of map use in navigation.
We found a significant main effect of Location (p < .001), meaning that the participants looked more frequently at the town than at the map. The Location*MapCue interaction was also significant (p = .027), which indicates that the participants looked more at the map when they had a route description with a salient visual landmark on the map. However, there was no difference between the three exploration groups in the number of fixations, as indicated by the non-significant main effect of Condition (p = .720). The main effect of Route and the other interactions were not significant (all ps > .409).
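The dependent variable of this analysis can be illustrated with a short sketch. The paper does not define Relative Fixation explicitly; the code below assumes it is the share of fixations falling on each location (map or town) within a trial, and the function name and label strings are hypothetical.

```python
from collections import Counter


def relative_fixation(labels):
    """Share of fixations on the map and on the town within one trial.

    `labels` is a sequence of per-fixation location labels; anything other
    than "map" or "town" (e.g. off-screen fixations) is ignored.
    ASSUMPTION: relative fixation = count per location / count on both.
    """
    counts = Counter(label for label in labels if label in ("map", "town"))
    total = counts["map"] + counts["town"]
    if total == 0:
        return {"map": 0.0, "town": 0.0}
    return {location: counts[location] / total for location in ("map", "town")}


# Hypothetical trial: three fixations on the town, one on the map.
trial = ["town", "town", "map", "town"]
shares = relative_fixation(trial)  # {"map": 0.25, "town": 0.75}
```

Per-trial shares of this kind, together with the factors Location, MapCue, Condition and Route, would form the rows of the data table passed to the linear model.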

CONCLUSIONS
In our study, we investigated active and passive spatial knowledge construction in navigational tasks in a large-scale virtual environment. Along with the interactive 3D representation of the town, our virtual tourists were provided with a cartographic representation of the environment, a north-oriented, static city map, to foster survey knowledge acquisition. Since the virtual environment (VE) does not allow sufficient proprioceptive information, our main focus was on the cognitive features of active and passive spatial learning.
We measured the performance of the three groups by the proportion of fixation time on the map, the time they spent on wayfinding, and their success in finding the target locations. Curiously, we found no significant difference in performance between the three groups (free exploration, guided exploration, and guided exploration with route line). According to our results, people could navigate to the targets, and they spent a similar amount of time on wayfinding (path options). Moreover, there was also no difference between the groups in the proportion of fixations on the map, either during the exploration-training phase or in the actual navigational task phase. We found a difference in the proportion of fixations on the map only in cases when the participants heard a route description with a visual cue referring to a street sign or a square.
However, it must be emphasized that attention influences spatial learning abilities (Chrastil and Warren, 2012). On the other hand, there is no consistent evidence explaining how directing attention to different parts of an environment would influence spatial knowledge acquisition. Wilson and Péruch (2002) instructed half of their participants to pay attention to the spatial layout, and the other half had to focus on objects. They found an advantage for the first group in map drawing, but in pointing and judging distance there was no difference between the two groups.
In our experiment, the absence of increased fixation on the map during the learning phase seems to indicate that participants did not differ in the attention paid to the map. This is in line with our result of no group difference in wayfinding performance. However, it is remarkable that the guided exploration group with the route line, who had a salient line marking the route, did not pay more attention to the map. This may indicate that they failed to locate their position on the map.
Despite the high potential to solve spatial problems with visuospatial displays, if we do not know where we are on the map we do not use them for navigational purposes.

Figure 2. The maps used in the experiment. The map was based on an old, instructional topographical map of an imaginary region but was re-designed in the style of Google Maps. The military educational map, made and used in the Socialist era (1959), was actually a compilation of maps of different regions in Hungary, made to give an ecologically valid representation of different types of terrain using the actual set of conventional signs.

Figure 3. The layout of the VR experiment on screen

Figure 4. The relative frequency of fixations on the map (left) vs town (right)