ARE CITY FEATURES INFLUENCING THE BEHAVIOR OF PHOTOGRAPHERS? AN ANALYSIS OF GEO-REFERENCED PHOTOS SHOOTING ORIENTATION

Every day, millions of social media users upload information as texts, pictures or likes. These posts are nowadays mainly uploaded from smartphones, which automatically add complementary information such as the device's location and orientation. This additional material is valuable for public services and can reinforce the knowledge provided by traditional methods. This study examines this additional material to observe the influence of city features on public behavior. A semi-automated workflow is introduced to combine two large datasets: FlickR geo-referenced photos (with their shooting orientation) and the OpenStreetMap street network. The study is conducted in the city of Lausanne, Switzerland. The workflow offers a novel approach to download, filter, process and visualize large, cluttered datasets. The investigation showed a significant difference between South- and North-facing photo orientations, with a South dominance. Furthermore, the photographs' orientation appears to be related to the street network or to city elements (such as remarkable buildings or fountains) only at a local scale; no connection was established at a larger scale. These results can be useful in urban planning to diagnose how a public place is used by its users (residents, tourists, etc.). An improved diagnosis provides better knowledge of a public space's remarkable elements (whether attractive or unsightly), easing decisions on the conservation or transformation of these elements. Other applications are also outlined, notably in the tourism sector and in landscape preservation.


INTRODUCTION
Nowadays data holds a prominent place in all sectors of our everyday life. Data is used in numerous contexts such as finding the shortest path for road navigation, suggesting new movies to watch, classifying user profiles for targeted advertisement, or even creating art. This trend is based on a drastic increase in computational and storage capacity (Hilbert & López, 2011). Furthermore, the transition to Web 2.0 has transformed the relation between users and the Internet (O'Reilly, 2007): users shifted from a consumer role to an active contributor role, notably through activities such as blogging or social networking. A large additional volume of data is therefore produced by numerous users, making it possible to discover new patterns (imperceptible with less data) that can be used to develop new applications.
The possibility of collecting a considerable, cheap, experience-based volume of data quickly became essential for experts in various fields. By producing these data, non-expert users increasingly act as partners alongside experts, and are even seen as sensors, a view reinforced by the development of public participatory geographic information systems (PPGIS) and volunteered geographic information (VGI) (Goodchild, 2007). The data delivered by these projects is actively or passively produced by the participants (Zhang, 2019). On the one hand, active participation (where users execute a requested task) provides accurate, up-to-date contributions to a specific issue. However, renewing and consolidating the motivation of users is an important challenge in these projects, notably because participants have diverse motivations (Lotfian et al., 2020). In addition, because the participation is active, participants may deliberately select the information they want to provide. On the other hand, passive participation frequently involves data analytics, i.e. resolving issues using information not published for this specific purpose. In this case, participants are not always aware that their data is used for other purposes. However, this method has the benefit of being indirect, supporting the collection of latent information (considered not worth sharing by the users, or consciously withheld).
In the context of urban planning, this latent information is crucial for municipalities, allowing a better understanding of how a place is perceived (or used) by its inhabitants and visitors. Urban experts can employ these data to challenge their insights or to take the sensitivity of the public into account. Nevertheless, this latent (or tacit) information is acknowledged as arduous to record with traditional participatory mediums (transect walks, on-site workshops, public hearings, online surveys, etc.). Social media concurrently supports the elaboration of stories, atmospheres, and dialogues conveyed by passive, non-verbal information (Loukis & Charalabidis, 2015). These social networks could thus provide a medium to recreate this geo-tacit knowledge.
Photos consistently hold a central position in social media. Several platforms are even designed around them, such as Snapchat, Instagram or FlickR. Images stored on these social networks provide a valuable source of information for collecting geo-tacit knowledge. The author of a photo is indeed influenced by several features such as human subjects (friends, strangers), attractive aspects (buildings, squares, fountains), repulsive elements (public garbage, rubbish) or city characteristics (water areas, street orientations, topography). Therefore, a city's structural elements could affect the way photographs are taken, and then posted on social media.
In this exploratory study, we aim to address the relation between a city's structural elements and the characteristics of photos from social networks, in order to generate relevant geo-tacit knowledge for urban professionals. This article is structured as follows. The next section presents how social media are and can be used in (urban) governance. Following this section, the motivation for selecting FlickR photos for this study is presented, alongside the description of a semi-automatic method to scrape, organize, and visualize information from several sources (crowdsourced information, social networks). Then, results for the municipality of Lausanne, Switzerland are illustrated, discussed and critiqued. Finally, future implications of this method for cities are established, in addition to suggestions to improve the process.

RELATED WORK
Every day, millions of social media users willingly share information online through texts, pictures, or likes. This information, considered worth sharing by these users with their network, is also valuable for other entities such as private companies or public services. This gold mine of data provides a massive, up-to-date amount of varied material that can be explored and exploited as input by third-party actors. Public services have been using these methods for several years, notably in emergency and natural disaster management (Kongthon et al., 2014; Wang & Ye, 2018), planning and e-governance (Evans-Cowley, 2010; Magro, 2012), tourism (Sinclair et al., 2018; Zeng & Gerritsen, 2014) or epidemic outbreaks (Jia & Liu, 2020; Shahid et al., 2020).
Nowadays social media applications are mainly used on smartphones; FlickR is no exception. FlickR is a social network that allows users to upload photos and share them with a broad public. In 2017, half of the photos uploaded on the platform were taken via a smartphone (Perez, 2017). Photographs taken with smartphones are valuable for researchers since the images automatically contain data about the location and the camera's specifications, including the shooting orientation. Furthermore, the sensors have been shown to be precise enough to be studied (Foltête et al., 2020). However, only a few studies have investigated shooting orientation (Cao & O'Halloran, 2015; Ingensand et al., 2018; Tenerelli et al., 2017).
In their study, Ingensand et al. (2018) highlighted that the pictures retrieved from the Canton of Vaud, Switzerland are mainly located near road segments, i.e. easily accessible places. Roads, or linear features, also designated as paths, are one of the five structural elements of a city (Lynch, 1960). In addition, in his book Lynch emphasizes the significant position that these paths hold in people's mental image of the city. The paths can be analyzed to better understand how cities are designed, notably regarding their spatial order (Boeing, 2019). This spatial order can be assessed and visualized through the orientation of streets.
In this study, we aim to analyze the relation between street orientations and photos' shooting orientations. To analyze this connection, the method described by Boeing (2019) is applied to the orientations of geo-referenced photographs. Since the street plays a significant role in people's image of the city, our main hypothesis is that there is a significant relation between road orientation and photos' shooting orientation. The workflow implemented to process these two types of data is then reiterated to explore the influence of other structural elements of the city on the photos' shooting orientations.

FlickR Selection
Several reasons led to selecting the FlickR platform as the principal data source for this study. First, FlickR is essentially based on photographs, and despite the decline of its user base over the last few years (notably due to the introduction of Instagram), the community is still actively posting millions of photos. Second, the platform is more oriented toward photography than toward selfies or personal photos. Third, the application programming interface (API) provided by the social network is well documented and well established in the scientific community. Last, only a handful of social media platforms still share the exchangeable image file format data (EXIF, i.e. information on the date, equipment, position, and more) of the posted pictures; FlickR is one of them.

Study Area
This study was carried out in Switzerland, in the city of Lausanne (140 000 inhabitants). The city is located on the north shore of Lake Geneva (Figure 1). One specificity of this municipality is that its administrative limits are divided into two areas, but most of the city is located within the main part. The stretch of water between the north and the south shore in that area is about twelve kilometers wide. The elevation of the city descends toward the south from 929 m to 372 m. Across the lake, the western edge of the Alps is visible, with peaks up to 2300 m. The accompanying figure illustrates the view of Lake Geneva and the Alps from the southern tip of the city.

Data Collection
One specific challenge is the combination of diverse data sources. To search, download and assemble this heterogeneous information, Python scripts were implemented. The full workflow is depicted in Figure 2.
FlickR. As a first step, the relevant FlickR photos and attached information (location, shooting orientation, etc.) were identified. This step reduces the volume of data to handle. For that purpose, a bounding box around the selected city was constructed using two OpenStreetMap (OSM) tools: Nominatim and Overpass. The first one provides an identifier (osm_id) from a textual location, and the latter uses this input to define a bounding box in which data specified by keywords will be downloaded. Once the bounding box around the city limits was identified, the photos inside this polygon were downloaded via FlickR's API ("flickr.photos.search"). Once this step was accomplished, each photograph was handled through three alternatives: (1) the EXIF information was gathered directly from the "flickr.photos.getExif" API; (2) if the EXIF could not be recovered directly, the original photo was downloaded to check whether non-deleted EXIF information existed; (3) if neither method provided results, the photograph was dropped. This method identified close to 64 000 photos and recovered 49 000 photographs with location information from the EXIF data.
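The FlickR collection step can be sketched as follows. This is a minimal illustration, not the scripts used in the study: the function names, the bounding-box values and the two EXIF fetchers passed to `recover_exif` are hypothetical, and no network request is made. The request parameters (`bbox`, `has_geo`, `extras`, `per_page`, `page`) are those documented for "flickr.photos.search".

```python
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, bbox, page=1, per_page=250):
    """Build a flickr.photos.search URL for geotagged photos in a bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
    Bounding-box queries return at most 250 results per page.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (min_lon < max_lon and min_lat < max_lat):
        raise ValueError("bbox must be (min_lon, min_lat, max_lon, max_lat)")
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "bbox": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "has_geo": 1,
        "extras": "geo,date_taken",
        "per_page": per_page,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


def recover_exif(photo_id, fetch_api_exif, fetch_original_exif):
    """Apply the three alternatives described in the text.

    (1) try the EXIF returned by the API, (2) fall back to the EXIF embedded
    in the original file, (3) return None so that the caller drops the photo.
    Both fetchers are hypothetical callables supplied by the caller.
    """
    for fetch in (fetch_api_exif, fetch_original_exif):
        exif = fetch(photo_id)
        if exif:
            return exif
    return None


# Illustrative, approximate bounding box around Lausanne (an assumption).
url = build_search_url("YOUR_API_KEY", (6.54, 46.50, 6.72, 46.60))
```

Injecting the two fetchers keeps the fallback logic separate from the network code, so the order of the alternatives can be checked without calling the API.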
OpenStreetMap's buildings and roads. An analogous process was employed for OSM buildings. The ID of the city was recovered from Nominatim. Then, Overpass was used to download the data specific to this ID. A total of 10 900 buildings were downloaded. A different method was employed for the road data. Roads are stored in OSM as segments. Therefore, the download and the simplification of the road network were challenging. An external library, OSMnx, was employed. This library downloads a road dataset from a place name and automatically simplifies the network by reducing its nodes and removing circular elements, which are not considered in this study because no orientation can be estimated for them (Boeing, 2017). However, only the main part of the city was retrieved with this method. Therefore, the road network was downloaded via a bounding box around the city's limits, to be filtered afterwards.

Data Filtering and Transformation
FlickR. Because the photos collected in the previous step were numerous and their metadata often incomplete, a method was implemented to filter the relevant photos. The presence of geolocation and photo orientation within the metadata was required, which vastly reduced the amount of data to process. Following this selection, only the photographs located within the exact spatial extent of the city and outside of building footprints were kept, which corresponds to 8% of the global dataset (around 3750 photos).
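The filtering rules above can be sketched in a few lines. This is a simplified illustration under stated assumptions: photos are plain dictionaries with hypothetical keys `lon`, `lat` and `direction`, polygons are lists of (lon, lat) vertices, and a planar ray-casting test stands in for whatever spatial library the study actually used.

```python
def point_in_polygon(lon, lat, polygon):
    """Planar ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def filter_photos(photos, city_polygon, building_polygons):
    """Keep photos with location and orientation, inside the city, outside buildings."""
    kept = []
    for photo in photos:
        lon, lat = photo.get("lon"), photo.get("lat")
        if lon is None or lat is None or photo.get("direction") is None:
            continue  # missing geolocation or shooting orientation
        if not point_in_polygon(lon, lat, city_polygon):
            continue  # outside the city's spatial extent
        if any(point_in_polygon(lon, lat, b) for b in building_polygons):
            continue  # located inside a building footprint
        kept.append(photo)
    return kept


# Toy data in arbitrary planar units (hypothetical, for illustration only).
city = [(0, 0), (10, 0), (10, 10), (0, 10)]
buildings = [[(4, 4), (6, 4), (6, 6), (4, 6)]]
photos = [
    {"id": "a", "lon": 1, "lat": 1, "direction": 90.0},
    {"id": "b", "lon": 5, "lat": 5, "direction": 10.0},
    {"id": "c", "lon": 11, "lat": 1, "direction": 45.0},
    {"id": "d", "lon": 2, "lat": 2, "direction": None},
]
kept_ids = [p["id"] for p in filter_photos(photos, city, buildings)]
```

With roughly 10 900 building polygons, a spatial index would normally replace the linear scan over `building_polygons`; the scan is kept here only for brevity.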
OpenStreetMap's roads. The roads' overall orientation was first computed. The OSMnx library, used to retrieve road segments, also provides a module to calculate these orientations. This module has already been employed to explore urban forms in another study (Boeing, 2019). From the raw OSM network data, segments tagged as "footway", "service" or "steps" were excluded because they needlessly complicate the street network. Thereafter, the road segments located within the city's spatial extent were kept. A total of 19 000 road segments were collected for the study area.
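As an illustration of what computing a segment orientation involves, the sketch below applies the standard initial-bearing formula to the two end points of a segment and skips the excluded tags. It does not use the library's own module; the segment dictionaries and their keys (`highway`, `start`, `end`) are hypothetical.

```python
import math

EXCLUDED_HIGHWAY_TAGS = {"footway", "service", "steps"}


def compass_bearing(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0


def segment_bearings(segments):
    """Bearings of all segments whose highway tag is not excluded."""
    bearings = []
    for segment in segments:
        if segment["highway"] in EXCLUDED_HIGHWAY_TAGS:
            continue
        (lat1, lon1), (lat2, lon2) = segment["start"], segment["end"]
        bearings.append(compass_bearing(lat1, lon1, lat2, lon2))
    return bearings


# Two toy segments: a residential street heading north and an excluded footway.
segments = [
    {"highway": "residential", "start": (46.50, 6.60), "end": (46.60, 6.60)},
    {"highway": "footway", "start": (46.50, 6.60), "end": (46.50, 6.70)},
]
street_bearings = segment_bearings(segments)
```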

Data Visualization
Following these steps, two cleaned datasets were ready for visualization. To visualize these data, we used the concept of spatial order (in terms of street section orientations), represented on a 360° dial divided into 60 sections (Boeing, 2019). A Python script was implemented to ease this visualization.
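The binning behind such a dial can be sketched as below. Two choices are assumptions rather than details given in the text: the bins are centered so that the first one straddles north, and each street is counted in both directions (a street has no single heading, unlike a photo).

```python
def polar_bin_counts(bearings, n_bins=60):
    """Count bearings (degrees clockwise from north) in n_bins equal sectors.

    Bins are shifted by half a bin width so that bin 0 is centered on north.
    """
    width = 360.0 / n_bins
    counts = [0] * n_bins
    for bearing in bearings:
        index = int(((bearing % 360.0) + width / 2.0) // width) % n_bins
        counts[index] += 1
    return counts


def add_reverse_bearings(bearings):
    """Duplicate each street bearing with its opposite direction."""
    return list(bearings) + [(b + 180.0) % 360.0 for b in bearings]


photo_counts = polar_bin_counts([0, 2.9, 357.5, 90, 180])
street_counts = polar_bin_counts(add_reverse_bearings([10, 350]))
```

The resulting counts can then be drawn as bars on a polar axis, for instance with matplotlib's polar projection, with north at the top and angles increasing clockwise.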

The detailed comparison between OSM's street orientations and FlickR's shooting orientations is depicted in Figure 3. The results illustrate two distinct axes (West-North-West / East-South-East and North-East-North / South-West-South) on the OSM roads graph. This distribution was expected due to the significant elevation differences observed along an NEN-SWS axis within the city. Lausanne's municipality is structured into levels, where most of the streets run either perpendicular to the slope or parallel to it (connecting the levels). Regarding the orientations of the FlickR photos, a dominance of the South direction can be observed (1750 are facing North and 2150 South). However, no visual relation with the road network can be observed. Figure 3 (bottom) shows the normalized densities used to statistically compare the two distributions of orientation angles. For each integer angle, the number of occurrences was counted over the entire dataset, and this count was then divided by the maximal count. The resulting distributions are non-parametric, non-linear and multimodal. Therefore, to measure the association between the two distributions, a maximal information coefficient (MIC) was computed (Reshef et al., 2011). This method supports the identification of two-variable relationships in large and noisy data. A MIC close to zero indicates no relationship, and a MIC close to one indicates a clear connection. The application of the MIC confirmed the visual observation that the two distributions are statistically independent (MIC = 0.0821).
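The normalization described above (count per integer angle, divided by the maximal count) can be sketched as follows. The North/South split is an assumption, since the text does not state how the two halves were delimited: here a photo faces North when its orientation lies strictly within 90° of north. The MIC itself is not computed in this sketch.

```python
from collections import Counter


def normalized_density(angles):
    """Occurrences per integer degree (0..359), divided by the maximal count."""
    if not angles:
        raise ValueError("at least one angle is required")
    counts = Counter(int(round(a)) % 360 for a in angles)
    peak = max(counts.values())
    return [counts.get(degree, 0) / peak for degree in range(360)]


def north_south_counts(angles):
    """Count orientations in the northern and southern half of the dial.

    Assumption: exactly 90 or 270 degrees counts as neither North nor South.
    """
    north = south = 0
    for angle in angles:
        angle = angle % 360.0
        if angle < 90.0 or angle > 270.0:
            north += 1
        elif 90.0 < angle < 270.0:
            south += 1
    return north, south


density = normalized_density([10, 10, 10.2, 200])
halves = north_south_counts([0, 45, 180, 200, 350, 90])
```

Two such 360-value vectors, one for streets and one for photos, are the inputs that an association measure such as the MIC would compare.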

Temporal Evolution
Figure 4: Visualization of photographs' shooting orientation depicted on a polar graph. Each graph represents one of the four seasons.
The distribution of FlickR's shooting orientations according to the four seasons is presented in Figure 4. The data suggest that these four polar distributions differ. The winter and fall seasons show a dominance of pictures taken toward the south, and the summer season has a normal distribution. Photographers therefore appear to behave differently according to the season. Furthermore, among the filtered photographs that contain information on shooting orientation, winter is represented by only half as many photos as each of the other seasons (around 1000 images for spring, 1000 for summer, and 1000 for fall, but only 500 for winter).
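Grouping the shooting orientations by season can be sketched as below. The season boundaries are an assumption (meteorological seasons of the northern hemisphere, by calendar month), as the text does not state how they were delimited; the photo dictionaries and their keys are hypothetical, with dates in the "YYYY-MM-DD HH:MM:SS" form used by FlickR's date-taken field.

```python
from collections import defaultdict
from datetime import datetime

# Assumption: meteorological seasons, northern hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def season_of(date_taken):
    """Season of a date string formatted as 'YYYY-MM-DD HH:MM:SS'."""
    month = datetime.strptime(date_taken, "%Y-%m-%d %H:%M:%S").month
    return SEASON_BY_MONTH[month]


def group_by_season(photos):
    """Map each season to the list of shooting orientations taken in it."""
    groups = defaultdict(list)
    for photo in photos:
        groups[season_of(photo["date_taken"])].append(photo["direction"])
    return dict(groups)


seasonal = group_by_season([
    {"date_taken": "2019-01-15 12:00:00", "direction": 180.0},
    {"date_taken": "2019-12-24 16:45:00", "direction": 170.0},
    {"date_taken": "2019-07-01 09:30:00", "direction": 20.0},
])
```

Each list of orientations can then be passed to the same polar visualization as the full dataset.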

Scaling Down
The last preliminary analysis conducted on this dataset focuses on scaling down the area to the district scale (or lower). For this example, we selected a historical square in the center of Lausanne (Palud square). We extracted a polygon that encapsulates the area in order to retrieve only the photographs within the square. A total of 36 photos were identified with this method. After separating these photos into another dataset, the same visualization was applied. The results are depicted in Figure 5. This square, in addition to being remarkable in the municipality's history, has an interesting layout. The square is stretched along a North-West / South-East axis, with the city hall located to the South-West (see Figure 5). The principal attraction of the square is a fountain located to the South-East, in front of a building that contains an animated clock. The square is also connected to four pedestrian-friendly streets oriented North, East, South, and North-West. The polar representation corresponds fairly well to the layout of the square. The major axes are present in the shooting orientations, with a substantial share of the orientations facing the city hall (7 photos) or the fountain (10 photos).

Figure 5: Visualization of the photographs taken on Palud square, Lausanne represented on a polar graph. The polar graph represents the volume of photos taken every 6°. The values range from 0 to 3 photographs. The map of the square is present in the background. Four photos illustrating specific orientations are also depicted.

City Features and Photo Shooting Orientations
At a large scale, the photo orientation appears not to be related to the city's spatial order. This suggests that, at this scale, the behavior of photographers is not affected by the orientation of the streets. However, the inclination of the photos towards the southern direction (seen notably for winter and fall) matches the topographic structure of the city. The photographs are shot in the direction of the slope and of the Alps on the other side of Lake Geneva. From a cognitive perspective, a combination of these two elements tends to form clear, open geographical spaces that have been acknowledged to contribute to a landscape's aesthetic (Coeterier, 1996). Such features are, therefore, more likely to be photographed, ultimately affecting the behavior of the photographers.
This view-taking pattern develops differently according to the period of the year: each season shows a distinct distribution. Over a year, several heterogeneous populations occupy the city environment. Summer is the tourist season, with its longer and warmer days, its terraces and a population occupying outdoor spaces. Winter is characterized by fewer tourists (or by winter sports enthusiasts) and by colder days that drive the population indoors. These seasons affect the behavior of the population, and presumably their approach to taking photographs.
City features appear to have a powerful impact on photo shooting behavior at a lower scale. The example of Palud square in Lausanne illustrates photograph orientations directed towards structural elements of the square, such as the connecting streets or the fountain. This relationship is valuable because meaningful viewpoints or elements can be identified with this method. Once collected, this information can be used in urban planning to define, for instance, which features of a location are the most attractive or, conversely, the most irrelevant. Strategies to redesign a district, with features to enhance or conserve, can be built on these data. Furthermore, the position of elements concealing the landscape, such as construction cranes, scaffolding, urban furniture or even new buildings, could be determined through this information.

Limits of Crowdsourced Data
The major challenge of this study concerns the data, their collection and their transformation. Retrieved from OSM, the street network was extensive, with no fewer than 32 000 segments to process and analyze. The data stored in OSM is at once complex, cluttered and incomplete. Several tags were identified and deleted (footway, service, steps) due to redundancy in the orientation or to irrelevance, such as parking driveways. The libraries used to transform this dataset were not straightforward to use, and issues with node connections still exist in the network. Moreover, we noted the absence of several buildings in the OSM data. This gap implies that a few photos taken inside buildings were not dropped when the dataset was decluttered.
Concerning FlickR photos, we observed that the API "flickr.photos.search" returns duplicate photographs beyond 4 000 results. Therefore, no more than 4 000 photos can be downloaded with one request. A method based on the date of the photo was therefore implemented to bypass this restriction. Another finding is that the vast majority of the photographs containing an orientation were taken with Apple devices; this information was absent for other mobile phone manufacturers. Last, after a manual review of some of the filtered FlickR photos, we still observed photographs clearly taken within buildings (see Figure 6), although the GPS coordinates of these photographs are located outside buildings. This defect can be explained by the inconsistency of the GPS, which could have kept the last known outdoor location of the device. Three of the nine photographs appear to be irrelevant, and were therefore inadequately filtered.
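The text does not detail the date-based method, so the following is only one plausible sketch of it: the date range is halved recursively until each window holds no more than 4 000 photos, and each window can then be queried separately. The `count_photos` callable is hypothetical; in practice it could wrap "flickr.photos.search" with its `min_taken_date` and `max_taken_date` parameters and read the total reported in the response.

```python
from datetime import datetime, timedelta

MAX_RESULTS_PER_QUERY = 4000


def split_into_windows(start, end, count_photos,
                       limit=MAX_RESULTS_PER_QUERY,
                       min_span=timedelta(hours=1)):
    """Halve [start, end) recursively until each window holds <= limit photos.

    count_photos(start, end) is a caller-supplied (hypothetical) function that
    returns the number of photos taken in the half-open interval [start, end).
    Windows shorter than min_span are returned as-is to guarantee termination.
    """
    total = count_photos(start, end)
    if total == 0:
        return []
    if total <= limit or (end - start) <= min_span:
        return [(start, end)]
    middle = start + (end - start) / 2
    return (split_into_windows(start, middle, count_photos, limit, min_span)
            + split_into_windows(middle, end, count_photos, limit, min_span))


# Toy stand-in for the API: one photo per hour over 10 000 hours.
first = datetime(2015, 1, 1)
fake_dates = [first + timedelta(hours=h) for h in range(10000)]


def fake_count(start, end):
    return sum(1 for d in fake_dates if start <= d < end)


windows = split_into_windows(first, first + timedelta(hours=10000), fake_count)
```

Halving adapts to uneven activity: busy periods are cut into many short windows, while quiet periods remain a single request.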

Future Work
The semi-automated method implemented in this study has shown promising results. Several challenges still need to be addressed, in addition to evaluating the consistency of the results. From these results, several areas for improvement can be investigated:
(1) Recovering more data: the proportion of photos with the required characteristics (presence of location information and outside of buildings) is low, notably for the small-scale analysis. This small number of photos increases the weight of outliers. Therefore, collecting more data can improve the robustness of the method.
(2) Studying another city: the present analysis is based on only one study area. To consolidate the results presented here, the same analysis can be conducted on other cities in Switzerland or elsewhere. Cities with different structural features can be considered with this semi-automated method to identify which features affect the behavior of photographers the most.
(3) Improving the filtering: numerous photographs have unrelated subjects (selfies, macro shots, building interiors, etc.), and these irrelevant data bias the distribution of the photos' orientations. Therefore, a filtering method based on tag generation via image processing tools can be implemented. This kind of method is already in use on crowdsourced data (Lotfian et al., 2019).
(4) Scaling down: the shooting orientations of the photographs from Palud square appear to accurately record the structural elements of the area. Other places within the city could be studied similarly. However, this focus on a smaller scale implies basic prior knowledge of the location (position, layout, significance of the place). To avoid this preliminary investigation, a regular grid can be overlaid on the city, and the distribution of the photographs' orientations can be calculated for each grid cell.
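The regular grid suggested in point (4) can be sketched as follows. All names, the cell size and the photo keys are hypothetical, and cells are defined in the same units as the coordinates for simplicity; with real data, a projected coordinate system would give square cells in metres.

```python
import math
from collections import defaultdict


def grid_cell(x, y, origin_x, origin_y, cell_size):
    """(row, column) of the grid cell containing the point (x, y)."""
    column = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return row, column


def orientations_by_cell(photos, origin_x, origin_y, cell_size, n_bins=60):
    """One orientation histogram (n_bins sectors centered on north) per cell."""
    width = 360.0 / n_bins
    histograms = defaultdict(lambda: [0] * n_bins)
    for photo in photos:
        cell = grid_cell(photo["x"], photo["y"], origin_x, origin_y, cell_size)
        index = int(((photo["direction"] % 360.0) + width / 2.0) // width) % n_bins
        histograms[cell][index] += 1
    return dict(histograms)


# Toy photos in arbitrary planar units, on a grid of unit cells.
per_cell = orientations_by_cell(
    [
        {"x": 0.5, "y": 0.5, "direction": 180.0},
        {"x": 0.7, "y": 0.2, "direction": 181.0},
        {"x": 2.5, "y": 1.5, "direction": 90.0},
    ],
    origin_x=0.0, origin_y=0.0, cell_size=1.0,
)
```

Cells holding only a handful of photos would need to be flagged or merged, since a small number of photos increases the weight of outliers, as noted in point (1).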

CONCLUSIONS
In this exploratory study, we were able to combine VGI data with geo-referenced photographs from social media to investigate how city features could affect the behavior of enthusiast photographers. Despite inconsistencies in the data, the implemented semi-automated method shows encouraging results in the analysis and the visualization of these various datasets. Distinct patterns can be observed at a local scale, showing an influence of the local layout on the behavior of social media photographers. At a larger scale, the photos' orientation predominantly faces the natural features surrounding the city; however, no relation was demonstrated with the road network.
Several applications could benefit from the results. In urban planning, the orientation of the photos at a small scale can assist urban experts in diagnosing how a public place is used by its users (inhabitants, tourists). An improved diagnosis will provide better knowledge of a public space's remarkable elements (whether attractive or unsightly), easing decisions on the conservation or transformation of these elements. From the FlickR data, a photographer profile can be outlined based on the photos taken. This categorization can assist experts in identifying which types of profile use a specific public space, and when they use it. Applications for the tourism sector can also be highlighted, such as the suggestion of scenic walks passing by the most appealing (most photographed) features of the city. A ranking of tourist attractions can be drawn up by considering how often a city's attractions recur in the photos (a specific ranking can be created for locals or foreign tourists, or even according to the weather conditions). Another application concerns landscape preservation. Natural features appear to have a significant role in the photographs. This method can therefore be employed to assist state entities in identifying places that should be preserved or, based on the public's photo-shooting activity, in investigating elements that show unusual activity.