AN OVERVIEW OF GEOINFORMATICS STATE-OF-THE-ART TECHNIQUES FOR LANDSLIDE MONITORING AND MAPPING

Natural hazards such as landslides, whether driven by meteorological or seismic processes, are constantly shaping Earth’s surface. A large percentage of slope failures also cause huge human and economic losses. As the problem is complex by nature, proper mitigation and prevention strategies are not straightforward to implement. One important step in the right direction is the integration of different fields; accordingly, this work provides a general overview of the approaches and techniques that are adopted and integrated for landslide monitoring and mapping, as both activities are important in risk prevention strategies. A detailed landslide inventory is important because it provides correct information about the phenomena, suitable for further modelling and analysis and for implementing suitable mitigation measures. On the other hand, timely monitoring of active landslides can provide invaluable insights that may be sufficient for reducing damage. Therefore, this work discusses popular methods that use remotely sensed datasets, with a particular focus on the use of machine learning in landslide detection, susceptibility modelling and early-warning systems. Moreover, it reviews how Citizen Science is adopted by scholars to provide valuable landslide-specific information, as well as a couple of well-known platforms for Volunteered Geographic Information that have the potential to contribute to, and be used in, landslide studies. In addition to providing an overview of the most popular techniques, this paper aims to highlight the importance of implementing interdisciplinary approaches.


INTRODUCTION
Earthquake- and rainfall-triggered landslides are global natural hazards that directly affect lives and the environment, as well as various economic aspects. The importance of mapping and monitoring both landslide-prone areas and already known landslides, whether through ground-, air- or spaceborne techniques, is highlighted in numerous studies and integrated in many risk mitigation strategies. However, in recent years several trends have emerged from separate fields that tend to converge depending on the research problem.
On the one hand, free and open-source Earth Observation (EO) datasets and techniques have blended naturally into Geographical Information Systems (GIS) and have found new applications (e.g. disaster mapping and land cover change).
On the other hand, Volunteered Geographic Information (VGI) has emerged and proved to be an invaluable data source in the disaster response domain as well. There are many practical examples of using OpenStreetMap data and volunteer collaborative mapping for risk management, relief and recovery strategies. The fusion of these different data gathering and processing methods has provided information on diverse aspects and has contributed even to landslide studies.
In addition, the use of Artificial Intelligence (AI), and especially Machine Learning (ML), has increased in recent years in the field of Earth Observation. ML has been used for image classification, cloud detection and removal, enhancing the spatial resolution of satellite imagery, and many other tasks. Naturally, scholars and decision-makers have adopted machine learning techniques in their workflows and strategies for processing remotely sensed data in geohazard studies. Such implementations are already applied to landslide detection and mapping, and to landslide susceptibility and hazard mapping.
Even though crowd-sourced data collection campaigns are often used in the disaster domain, both for risk mapping and for disaster response, very few landslide-specific platforms and applications are currently operational and well known. In this paper, we present those few desktop and mobile-based applications and catalogues for collecting landslide-related geospatial information. We then present and discuss some current applications of AI to VGI datasets for the recognition and detection of landslides. Moreover, we discuss how UAV datasets could also be obtained in a citizen science manner through VGI collaborative platforms.
As all of these topics are currently very active, the number of scientific publications on landslide-focused EO, AI and citizen science applications has increased tremendously in the last few years. This paper aims to provide a general overview of the state-of-the-art open methods and techniques for monitoring and mapping landslides.
The paper not only discusses the state-of-the-art geoinformatics techniques for landslide mapping and monitoring but also highlights the importance of combining contributions from the different domains (Figure 1), because hardly any risk-related problem has a single aspect. Moreover, bringing together expertise from different fields is a must in the hazard domain. Interdisciplinary approaches not only bring a deeper understanding of the problem but also improve the methodologies, which yield more accurate and more time- and cost-efficient results for better mitigating the landslide hazard. In-depth reviews of landslide investigations using remote sensing have already been presented (Mondini et al., 2021; Scaioni et al., 2014); thus, Section 2 only briefly highlights some of the more traditional remote sensing approaches for mapping and monitoring landslides. More attention is paid to machine learning applications (Section 3), while Section 4 presents the VGI approaches contributing to the field. Finally, Section 5 concludes with a general discussion of the material presented.

LANDSLIDE MAPPING AND MONITORING: LANDSLIDE PREVENTION
Landslide mapping and monitoring are parts of the general goal of landslide prevention, which involves landslide detection (Galli et al., 2008), the modelling of landslide susceptibility (Guzzetti et al., 2006) and the final hazard assessment.
The detection of a landslide is essential for planning an immediate disaster response; moreover, historical landslide inventories are needed to prepare landslide susceptibility and hazard maps. Susceptibility is the spatial probability of occurrence of landslides. Susceptibility maps, combined with a model of landslide probability in time, allow the assessment of landslide hazard, which in turn leads to forecasting or to the issuing of early warnings.
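The combination of spatial and temporal probability described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that landslide occurrence in time follows a Poisson model and that the spatial and temporal probabilities are independent; neither assumption comes from the text, and the function name and example values are hypothetical.

```python
import math


def hazard_probability(susceptibility, mean_annual_rate, years):
    """Probability of at least one landslide in a map unit within `years`.

    susceptibility   : spatial probability of occurrence, in [0, 1]
    mean_annual_rate : assumed mean number of triggering events per year
    years            : length of the forecasting window

    Assumption: temporal occurrence is Poisson and independent of the
    spatial probability, so the two probabilities are simply multiplied.
    """
    if not 0.0 <= susceptibility <= 1.0:
        raise ValueError("susceptibility must lie in [0, 1]")
    if mean_annual_rate < 0 or years < 0:
        raise ValueError("rate and years must be non-negative")
    temporal_probability = 1.0 - math.exp(-mean_annual_rate * years)
    return susceptibility * temporal_probability


if __name__ == "__main__":
    # Hypothetical map unit: susceptibility 0.6, one event every 20 years,
    # evaluated over a 10-year window.
    print(round(hazard_probability(0.6, 0.05, 10), 3))
```

Operational hazard models are considerably richer (magnitude, trigger-specific models), so this only shows how the two ingredients enter the estimate.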

Input data in landslide prevention.
In landslide detection, multitemporal satellite imagery is the most rapid and typically the cheapest data source; aerial photographs, LiDAR and InSAR can also be integrated in the process. Optical or multispectral images are used to estimate the location and the size of a landslide, while LiDAR and InSAR can be used to estimate the deformation field. Digital Terrain Models can also be integrated in landslide detection.
Besides the above data, susceptibility modelling requires data about the local scenario. In a rather recent paper, Reichenbach et al. (2018) provide a complete discussion of the scientific history of susceptibility modelling. They analyse 565 papers from peer-reviewed international journals, published from 1983 to 2016, in order to classify the thematic variables and statistical approaches used in modelling. For the input data, the outcome is quite dramatic: 596 different thematic variables are used across the papers, but 445 of them are used in just one paper; in each paper susceptibility is modelled using from 2 to 22 variables. The authors classify the thematic variables into 23 classes, which are then grouped into 5 thematic clusters: morphological (37.2%), land cover (18.3%), geological (16.1%), hydrological (12.8%), and 'other', for example precipitation.
Estimates of landslide hazard and early warning systems can be implemented at several levels of complexity: they always require the evaluation and modelling of all the triggering phenomena, and in the case of monitoring active slow landslides, maps of deformation in time are also needed (Krkac et al., 2016). Guzzetti et al. (2020) provide an extensive review of several systems that have been implemented and are presently used at the local, national and global scale.

Landslide detection and mapping
An updated and exhaustive landslide inventory is an important dataset for modelling landslide risk and susceptibility levels (Guzzetti et al., 2012). In-situ campaigns are still used for updating landslide databases. However, mapping landslides with satellite-based datasets reduces the time and cost of such tasks, especially with the use of freely available images from NASA's Landsat and ESA's Sentinel missions. Given the medium spatial resolution and relatively high revisit frequency of these images, scholars mainly implement change detection approaches that rely on the presence of vegetation to map landslide extents after the failure, using both multispectral and radar imagery (e.g., Plank et al., 2016; Scheip and Wegmann, 2021). However, using only optical datasets has its limitations, especially under persistent cloud cover, which obstructs timely detection.
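The vegetation-based change detection mentioned above can be sketched with a simple NDVI difference between a pre-event and a post-event image. This is a minimal illustration, not the method of the cited studies: plain Python lists stand in for co-registered raster bands, and the threshold `min_drop` is a hypothetical value that would need local tuning.

```python
def ndvi(red, nir):
    """Normalised Difference Vegetation Index of a single pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def vegetation_loss_mask(pre_red, pre_nir, post_red, post_nir, min_drop=0.3):
    """Flag pixels whose NDVI fell by at least `min_drop` between two dates.

    All inputs are co-registered rasters given as lists of rows.
    `min_drop` is a hypothetical threshold, not a recommended value.
    """
    mask = []
    for r0_row, n0_row, r1_row, n1_row in zip(pre_red, pre_nir, post_red, post_nir):
        mask.append([
            ndvi(r0, n0) - ndvi(r1, n1) >= min_drop
            for r0, n0, r1, n1 in zip(r0_row, n0_row, r1_row, n1_row)
        ])
    return mask


# Synthetic 2x2 reflectances: only the top-right pixel loses its vegetation.
pre_red = [[0.05, 0.05], [0.05, 0.05]]
pre_nir = [[0.45, 0.45], [0.45, 0.45]]
post_red = [[0.05, 0.25], [0.05, 0.05]]
post_nir = [[0.45, 0.30], [0.45, 0.45]]

candidate_mask = vegetation_loss_mask(pre_red, pre_nir, post_red, post_nir)
```

Flagged pixels are only candidates: vegetation loss can also result from harvesting, fires or seasonal effects, so a slope or terrain filter is usually applied afterwards.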
To overcome this, SAR imagery, which is almost unaffected by weather conditions, is commonly used. SAR allows its signal properties to be exploited: amplitude (single- and multi-polarisation) and phase, through changes in coherence or bitemporal interferometry (Mondini et al., 2021). Ground-based and aerial photogrammetry is also a widely used approach for mapping and further analysing landslides through geological means (Scaioni et al., 2014, 2018).
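For the amplitude component, a common way to express change between two co-registered SAR acquisitions is the log-ratio of backscatter. The sketch below assumes amplitude (not intensity) values, ignores speckle filtering, which real workflows require, and uses a hypothetical 3 dB threshold.

```python
import math


def log_ratio_db(pre_amplitude, post_amplitude, eps=1e-6):
    """Backscatter change in decibels between two amplitude values.

    20*log10 is used because the inputs are amplitudes; `eps` avoids
    division by zero and log of zero.
    """
    return 20.0 * math.log10((post_amplitude + eps) / (pre_amplitude + eps))


def amplitude_change_mask(pre, post, threshold_db=3.0):
    """Flag pixels whose backscatter changed by at least `threshold_db`.

    `threshold_db` is a hypothetical value chosen for illustration only.
    """
    return [
        [abs(log_ratio_db(a, b)) >= threshold_db for a, b in zip(pre_row, post_row)]
        for pre_row, post_row in zip(pre, post)
    ]


# Synthetic 2x2 amplitudes: the right-hand column changes after the event.
pre = [[0.2, 0.2], [0.2, 0.2]]
post = [[0.2, 0.5], [0.2, 0.1]]
sar_mask = amplitude_change_mask(pre, post)
```

Both increases and decreases in backscatter are flagged, since a failure can roughen or smooth the surface depending on the setting.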

Landslide monitoring
Ground-based landslide monitoring is a key component for the proper implementation of early-warning systems. However, the required sensors cannot be installed everywhere, and these are typical cases where remote sensing landslide monitoring makes an important contribution. Depending on the data available, the most popular techniques for monitoring landslides (slow and rapid) are: Digital Image Correlation (DIC) from optical images, which provides displacement in 2D space (Mazzanti et al., 2020); 3D reconstruction and comparison from photogrammetric datasets (Scaioni et al., 2014); and Differential SAR Interferometry and Persistent Scatterer Interferometry (PSInSAR) methods (Ferretti et al., 2001). The last two methods yield more than satisfactory results at sub-pixel scale for slow movements; however, they are limited by the presence of vegetation.

STATE OF THE ART OF MACHINE LEARNING FOR LANDSLIDE PREVENTION
The present section is structured in two parts: an introduction to machine learning is followed by the discussion on its application in landslide prevention and protection.

Machine learning
In Machine Learning (ML; Bishop, 2006; Hastie et al., 2011; Murphy, 2012; James et al., 2013), models are built to represent relationships between input data, or observations, and target variables, or unknowns, when these relationships cannot be described by simple parametric models or are at least partly unknown. The simplest application is binary classification, for example spam recognition in emails; multi-class classification, ranking problems and predictions are more complex applications: for example, an ML algorithm can be implemented to detect potential diseases from tomographies and other medical data of patients. From a historical point of view, the driving applications for machine learning were computer vision, speech recognition, natural language processing and medical analysis, as well as other specific scientific applications, as in physics (Baldi et al., 2014). The application of machine learning to geophysical or geological problems started about three decades ago (Van Der Baan and Jutten, 2000; Bergen et al., 2019; Reichstein et al., 2019), in studies relevant to the solid Earth as well as the oceans and atmosphere. Classification problems are the typical applications in such disciplines: for example, Dowla et al. (1990) used artificial neural networks to discriminate between signals from natural earthquakes and underground nuclear explosions. The application of machine learning to landslide prevention (Ma et al., 2020) started later but has grown very rapidly in the last two decades: the analysis of landslide susceptibility is a typical application. In this section, the main machine learning algorithms are briefly summarized, then the major applications in landslide prevention are presented.
In general, ML algorithms map an input dataset y into a target, or label, vector x. The two main classes of ML algorithms are supervised and unsupervised (Jordan et al., 2015). Supervised algorithms require a training dataset of known pairs of y and x. The training data are used to fit the algorithm. If correctly trained, the algorithm can be successfully applied to new, unexplored data. Unsupervised algorithms adopt learning approaches that do not depend on the availability of training data: they are needed when no or few training data are available, for example in exploratory research. Many different ML algorithms have been proposed, both supervised and unsupervised: the choice of a proper algorithm depends on the specific application, the availability of training data, the size of the dataset and the characteristics of the target vector (continuous or discrete).
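The supervised case can be sketched with one of the simplest possible classifiers, a nearest-centroid rule: training pairs of inputs y and labels x are used to fit one centroid per class, and new inputs receive the label of the closest centroid. The features (slope in degrees, NDVI) and all values below are hypothetical and chosen only to make the example readable.

```python
def train_nearest_centroid(samples, labels):
    """Supervised training: average the feature vectors of each labelled class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical training pairs: y = (slope in degrees, NDVI), x = class label.
training_inputs = [(35.0, 0.2), (40.0, 0.1), (10.0, 0.7), (5.0, 0.8)]
training_labels = ["landslide", "landslide", "stable", "stable"]

centroids = train_nearest_centroid(training_inputs, training_labels)
steep_bare = predict(centroids, (30.0, 0.3))
gentle_vegetated = predict(centroids, (8.0, 0.6))
```

In practice features would be rescaled before computing distances, since slope and NDVI have very different ranges. An unsupervised algorithm would receive only `training_inputs`, without `training_labels`, and would have to find the groups itself.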
Images are often processed with ML algorithms; this is particularly true in geophysical or geological applications, where the analysis always involves digital maps and images. Image analysis can be either pixel-oriented or object-oriented. In the first approach, pixels are investigated individually on the basis of their spectral signature. Object-based algorithms first group pixels into objects, sometimes also called superpixels; the grouping is based on the analysis of spectral, spatial and contextual characteristics. The objects are then investigated according to the specific application. Traditionally, classification was obtained by pixel-oriented approaches, as in Danneels et al. (2007). Over time, object-oriented algorithms have proved to be more effective; this is particularly true with the new generation of high-resolution images (Hussain et al., 2013; Martha et al., 2012). Stumpf and Kerle (2011) [...]
Pham et al. (2020) investigate the use of CNN. Specifically, an algorithm named Moth Flame Optimization is explored to train the CNN. The approach is tested in a mountainous area of Lai Chau province, Vietnam. In this study, too, several statistical indices are considered, with excellent results. Finally, Fang et al. (2021) implement four ELM, whose components are CNN and RNN combined with SVM and LR; at the same time, they analyse the correlation between different geomorphological parameters and landslide susceptibility. The ELM results are very encouraging compared with the results provided by the individual algorithms.

ML for landslide forecasting and early warning.
Landslide early warning systems based on ML are a relatively recent object of investigation: landslide forecasting requires the combined analysis of susceptibility and triggering effects, typically rainfall, groundwater levels and earthquakes. Krkac et al. (2016) predict the motion of the Kostanjek landslide (in the northern part of the city of Zagreb, Croatia) by applying RF to data from GNSS monitoring stations, rainfall and groundwater levels; the prediction of the landslide velocity and its variation in time is reported to be very accurate.

Paradhan et al. (2019) analyse landslides triggered by rainfall in Busan, Korea, a city of about 780 km² in a mountainous area; an ANN is trained and applied to estimate rainfall thresholds for early warning. Similar research is carried out by Sang et al. (2019), who compare a genetic algorithm back-propagation neural network with a genetic algorithm support vector machine. Shiluo and Niu (2018) model the dynamics of a landslide in Zigui County, China, by considering six triggering effects, all related to hydrological and meteorological aspects. A long short-term memory neural network provides the best results, although the accuracies are quite low and the outputs are ill-conditioned. Finally, Kuradusenge et al. (2020) examine the case study of the Ngororero district, Rwanda. They apply LR and RF to several thematic variables, including rainfall, to predict landslides: LR provides the best results, with an error (incorrect predictions) of less than 4%.
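To make the notion of a rainfall threshold for early warning concrete, the sketch below checks a rainfall event against a power-law intensity-duration curve, I = a * D^(-b). This is an assumed, generic threshold form used here as a stand-in; the cited studies estimate their thresholds with ML models, and the coefficients `a` and `b` below are hypothetical placeholders, not calibrated values.

```python
def exceeds_rainfall_threshold(rain_mm, duration_h, a=10.0, b=0.5):
    """Return True if the mean rainfall intensity exceeds I = a * D**(-b).

    rain_mm    : cumulated rainfall of the event, in millimetres
    duration_h : duration of the event, in hours
    a, b       : hypothetical coefficients; real ones are fitted to local
                 records of rainfall events with and without landslides
    """
    if duration_h <= 0:
        raise ValueError("duration must be positive")
    mean_intensity = rain_mm / duration_h          # mm/h
    threshold_intensity = a * duration_h ** (-b)   # mm/h
    return mean_intensity > threshold_intensity


# Two hypothetical 24-hour events.
heavy_event_warning = exceeds_rainfall_threshold(rain_mm=60.0, duration_h=24.0)
light_event_warning = exceeds_rainfall_threshold(rain_mm=20.0, duration_h=24.0)
```

A learned model replaces the fixed curve with a decision boundary estimated from data, possibly using additional inputs such as antecedent rainfall or groundwater level.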

Final considerations on ML for landslide prevention
During the last two decades, supervised ML algorithms have been widely studied and tested in landslide detection and susceptibility modelling; more recently, research on forecasting and early warning has started. In detection and susceptibility modelling, the significant class imbalance between the non-landslide and landslide classes is a problem in training and could bias the classification (Yordanov and Brovelli, 2020a,b). In susceptibility modelling, the choice of the model is the main methodological problem: the input thematic variables and their use vary greatly according to the local scenario and the investigated landslide. Therefore, only local models can be tuned, and no global guidelines can be given. In detection and susceptibility modelling, deep learning and nested ensemble algorithms appear to be the most actively studied subjects.
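One common way to limit the bias caused by class imbalance is to weight each class inversely to its frequency during training. The sketch below computes such weights with the widely used heuristic w_c = N / (K * n_c), where N is the number of samples, K the number of classes and n_c the size of class c. It is an illustration of the general idea, not the procedure of the cited studies, and the label counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency class weights: w_c = N / (K * n_c)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical training set: 95 non-landslide pixels (0) and 5 landslide pixels (1).
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)

# With these weights both classes contribute equally to a weighted loss.
weighted_totals = {label: weights[label] * labels.count(label) for label in weights}
```

Alternatives with the same purpose include undersampling the majority class or oversampling the minority class before training.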
Landslide early warning systems are complex and have been implemented and tested at various spatial scales around the world. Research on the application of machine learning to them is still in progress, particularly for the analysis of triggering phenomena.

CITIZEN SCIENCE
Just as ML has become popular in the landslide domain, so has the contribution of citizen science (CitSci) to the geospatial field, along with the attempts of scholars to "harvest" it for landslide-specific studies. A widely adopted term in geospatial studies for crowdsourced knowledge with an added geospatial reference (Schade et al., 2013) is Volunteered Geographic Information (VGI), introduced by Goodchild (2007), who stresses the benefits of motivating individuals to volunteer, especially in emergency situations. This paper does not discuss in detail the definitions of crowdsourcing and citizen science (Brovelli et al., 2020), nor the discussions about data uncertainty (Fonte et al., 2017) and integrity (Juhász et al., 2020). However, it should be highlighted that VGI participation is usually focused on a common aim, with all individuals collaborating on that specific outcome (Lee et al., 2020). A notable example of a VGI project is OpenStreetMap, which was initiated with the aim of generating and distributing free geographic data (OpenStreetMap contributors, 2017).
Scholars have highlighted that the disaster domain could benefit greatly from crowdsourced contributions (Glantz and Ramírez, 2018; Goodchild, 2007) in different phases of disaster management (Lee et al., 2020), from a level where citizens mainly contribute by gathering data to a more experienced level where they take part in defining the problem, gathering information and analysing it (Haklay, 2013; Kocaman et al., 2018). Interestingly, the inclusion of citizens at different levels appears to coincide with the implementation of CitSci over time; moreover, the initial usage of crowdsourced information in the disaster domain is correlated with the mass adoption of mobile technologies and social networks. At that stage, scholars were mainly exploiting citizens as sensors, with applications to earthquakes (Earle, 2010; Shan et al., 2012) and floods (Kouadio and Douvinet, 2015). On the other hand, the technology is also used for early warning systems (Marchezini et al., 2017, 2018), crisis mapping (Norheim-Hagtun and Meier, 2010), building resilience (Cieslik et al., 2019) and landslide susceptibility mapping (Rohan et al., 2021).
In the landslide domain in particular, however, the incorporation of CitSci is mostly restricted to gathering landslide data for populating inventories. An example of such a project at the global level is the Cooperative Open Online Landslide Repository (Juang et al., 2019), part of the Global Landslide Catalogue (Kirschbaum et al., 2010). Examples at the national level are the Great Britain National Landslide Database (GBNLD) (Pennington et al., 2015) and the Italian IdroGeo platform, incorporated in the national landslide inventory (Iadanza et al., 2021). A review of CitSci applications for landslide data collection reveals two main types of crowdsourced data collection: passive and active.

Passive crowdsourcing
Passive crowdsourcing refers to the approach where scholars "mine" specific information that has already been published or made available; in other words, an individual has unknowingly contributed to a specific field. Examples of such data sources are news portals, social media (e.g., Twitter and Flickr) and search engines (some of which allow alerts to be set for specific keywords). The approach is used in studies at the local level, but also at the national and global levels (e.g., GBNLD and GLD). Among its advantages are that the data is streamed almost constantly (Ghermandi and Sinclair, 2019) and comes at no cost. On the other hand, the data can be biased, wrong or too general, so additional research time is required to make it useful. For example, a news report or microblog could reveal the approximate location of a landslide, but verification from another source or a field investigation could be necessary to determine its exact location.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVI-4/W2-2021. FOSS4G 2021 - Academic Track, 27 September-2 October 2021, Buenos Aires, Argentina
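The keyword-based mining of already published text can be sketched as a simple filter. The keyword list, the post structure and the example texts below are all hypothetical; the example deliberately includes a false positive to illustrate why the harvested data still needs verification.

```python
import re

# Hypothetical keyword list; a real campaign would be multilingual and broader.
LANDSLIDE_TERMS = ("landslide", "mudslide", "rockfall", "debris flow")

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in LANDSLIDE_TERMS) + r")s?\b",
    re.IGNORECASE,
)


def candidate_reports(posts):
    """Keep posts that mention a keyword; they still require manual checking."""
    return [post for post in posts if _PATTERN.search(post["text"])]


# Hypothetical posts harvested from news feeds or social media.
posts = [
    {"id": 1, "text": "Road closed after a landslide near the village"},
    {"id": 2, "text": "The election ended in a landslide victory"},
    {"id": 3, "text": "Sunny weather expected for the weekend"},
]

candidate_ids = [post["id"] for post in candidate_reports(posts)]
```

Post 2 passes the filter although it is unrelated to slope failures, which is the kind of bias mentioned above; location extraction and expert validation would follow this step.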

Active crowdsourcing
In contrast to passive CitSci, active crowdsourcing occurs when individuals participate towards a specified outcome; in the spatial domain it coincides with the definition of VGI. In active participation it is important to design the surveys properly, according to the proficiency level of the individuals. The target group may involve professionals, participants with little or no knowledge of the topic, or both. When only professionals are targeted, a VGI project is easier to design, since participants need less guidance and information; however, this may lead to a smaller group of participants and thus an insufficient amount of data. When non-professionals are involved, additional effort should be made to provide them with sufficient and straightforward informational materials prior to the data collection. For example, when the aim is landslide mapping, these materials should include a clear and brief description of landslides, types of material and movement, landslide parts, triggering factors, etc.
Depending on the type of data provision and the tools used, active crowdsourcing can be further divided into two approaches: desktop and mobile. They differ mainly in the means used for reporting the data. In landslide data gathering, the fields to be filled in usually require the same type of information and follow standard geological survey questionnaires. The information requested mainly covers the location of the landslide, movement type, displaced material, date of the event, damage, mitigation measures and a photo of the landslide. The mandatory information and any other required fields are usually based on the survey designer's experience and choice.
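The fields listed above can be sketched as a simple record type. The field names, and the choice of making only the location mandatory, are assumptions made for this illustration; as noted, each survey designer decides which fields are required.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import List, Optional


@dataclass
class LandslideReport:
    """One crowdsourced landslide report (hypothetical schema)."""
    latitude: float                             # mandatory in this sketch
    longitude: float                            # mandatory in this sketch
    event_date: Optional[date] = None
    movement_type: Optional[str] = None         # e.g. "slide", "fall", "flow"
    displaced_material: Optional[str] = None    # e.g. "rock", "debris", "earth"
    damage: Optional[str] = None
    mitigation_measures: Optional[str] = None
    photo_paths: List[str] = field(default_factory=list)

    def __post_init__(self):
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")

    def missing_fields(self):
        """Names of the optional fields the contributor left empty."""
        return [f.name for f in fields(self) if getattr(self, f.name) in (None, [])]


# Hypothetical report with only part of the form filled in.
report = LandslideReport(45.8, 9.1, event_date=date(2021, 5, 1), movement_type="slide")
still_missing = report.missing_fields()
```

Listing the missing fields lets a form or a validator prompt the contributor, or an expert, for the remaining information.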

Desktop-based.
This approach applies when users are asked to report landslides they know of, usually by filling in a form on a dedicated web portal. Such a report can be made regardless of the user's current location. Using a dedicated webpage for reporting has several benefits: the user does not need to be at the location of the landslide, as long as the provided information has a certain level of accuracy; and the project designer has more possibilities to integrate teaching materials (purely text-based or even interactive) useful for non-professionals. On the downside, the geographical location of the event is not always precise enough, and further effort is needed to determine the exact one. This approach is adopted when the aim is to populate the inventory of a large region, as implemented for GLD, GBNLD and IdroGeo.

Mobile (field) based.
This technique relies mainly on the relatively recent advancement of mobile phone technologies and the integration of various sensors in the devices. Scholars adopt mobile applications mainly because of the built-in GPS, compass, camera and Internet connection, which allow users to provide data with relatively high accuracy directly from the location of a landslide. An option for offline data collection is often present, because landslides tend to occur in locations where mobile signal can be sparse. The survey applications usually keep the user interface simple and interactive, which allows straightforward data compilation and helps keep users interested in the task. The provided information is commonly then made visible or accessible (upon validation) in a webGIS platform or online database. In the case of GBNLD, in addition to the online "report a landslide" form, a parallel mobile application for active mobile crowdsourcing was developed (myHAZ, https://vct.myhaz.app/), incorporating tools for data reporting, management and exploration. Many other VGI applications for landslide reporting have been developed in recent years (Choi et al., 2018; Hennig et al., 2020; Jacobs et al., 2019; Kocaman and Gokceoglu, 2019; Žabota and Kobal, 2020). However, during the literature review for this manuscript, we noted that several applications presented a couple of years ago appeared, upon verification, to be discontinued and no longer maintained at the time of writing. This is a discouraging trend related to the lifetime of the projects supporting such efforts. In fact, it was pointed out by Irwin (2018) as a limitation of CitSci.

Data validation
Naturally, the provided data should be verified before its inclusion in an inventory. Most currently available applications and forms have a validation step that aims to verify whether a report is correct, whether it is a duplicate, or whether it provides additional information on an already existing record. Validation is mainly carried out by experts and, when uncertainties are present, additional sources are suggested for redundancy (Kirschbaum et al., 2010).
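The duplicate check mentioned above can be partly automated before expert review. The sketch below flags a new report as a possible duplicate when it lies close in space and time to an existing record; the distance and time tolerances, the dictionary layout and the coordinates are hypothetical, and the final decision would remain with the validator.

```python
import math
from datetime import date

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_possible_duplicate(new, existing, max_km=0.5, max_days=3):
    """True if two reports are within `max_km` and `max_days` of each other.

    Both tolerances are hypothetical and depend on the positional accuracy
    of the reports and on how event dates are recorded.
    """
    close = haversine_km(new["lat"], new["lon"], existing["lat"], existing["lon"]) <= max_km
    recent = abs((new["date"] - existing["date"]).days) <= max_days
    return close and recent


# Hypothetical inventory record and two incoming reports.
record = {"lat": 45.800, "lon": 9.100, "date": date(2021, 5, 1)}
nearby = {"lat": 45.801, "lon": 9.101, "date": date(2021, 5, 2)}
far_away = {"lat": 46.000, "lon": 9.100, "date": date(2021, 5, 2)}

nearby_flag = is_possible_duplicate(nearby, record)
far_flag = is_possible_duplicate(far_away, record)
```

A flagged report is not discarded automatically, since it may provide additional information on the existing record.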
More recently, Can et al. (2019) trained a CNN and implemented the model in a webGIS (Can et al., 2020) for landslide image classification, as part of the validation process for crowdsourced landslide images. They highlight that, due to the limitations of the training datasets, the automated results still need to be manually validated; nevertheless, the work shows a promising interdisciplinary integration.

Potential platforms for landslide VGI
In the following, we briefly highlight the potential of OpenStreetMap (OSM) and OpenAerialMap (OAM, a platform for volunteered UAV images) for landslide mapping.

4.5.1 OpenStreetMap. The data richness of OSM relies on its contributors and on the assigned tags, which describe the semantics of the map features. According to the OSM Wiki page (https://wiki.openstreetmap.org/wiki/Tag:natural%3Dlandslide), tags related to landslides already exist: natural=landslide for general annotation, while the type of landslide can be assigned as landslide=*, where the value is the specific type. However, tagging landslides does not seem to be of particular interest, since the general tag has been used only around 4,000 times. A restriction on using OSM could be that, when users contribute based on satellite imagery, the imagery is not always up to date, so a landslide may not be visible because of the age of the imagery. However, incorporating UAV imagery may provide more recent high-resolution data.
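Features carrying the natural=landslide tag can be retrieved from OSM with an Overpass QL query. The sketch below only builds the query text for a bounding box; the helper name and the example bounding box are hypothetical, and sending the query to an Overpass API endpoint is left out so that the example runs without network access.

```python
def landslide_overpass_query(south, west, north, east, timeout_s=25):
    """Build an Overpass QL query for natural=landslide in a bounding box.

    Overpass expects the bounding box as (south, west, north, east) in
    decimal degrees. `out center;` returns one representative point for
    ways and relations.
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        "(\n"
        f'  node["natural"="landslide"]({bbox});\n'
        f'  way["natural"="landslide"]({bbox});\n'
        f'  relation["natural"="landslide"]({bbox});\n'
        ");\n"
        "out center;"
    )


# Hypothetical area of interest.
query = landslide_overpass_query(45.0, 9.0, 46.0, 10.0)
print(query)
```

The same pattern could be extended with the landslide=* key to filter by landslide type, where contributors have provided it.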

4.5.2 OpenAerialMap. OAM is a platform where users can explore and contribute Creative Commons UAV imagery (https://openaerialmap.org/, made by the Humanitarian OpenStreetMap Team). It allows citizens to share high-resolution orthophotos without restriction on the topic. Landslide imagery shared in this way can then be mapped in OSM and used by professionals. OpenDroneMap (https://www.opendronemap.org/, a FOSS tool for processing aerial images) has an integrated option to upload the output orthophoto directly into OAM.

CONCLUSION
In this work we have presented a general overview of the best practices for landslide mapping and monitoring from the perspective of Earth Observation, Machine Learning and Citizen Science. Each discipline individually has its own advantages and disadvantages when applied in the landslide domain, but integrating expertise from different fields brings, first of all, a better understanding of the problem, of the actions that need to be taken and of the tools for carrying them out. For example, EO (whether ground-, air- or satellite-borne) can provide huge amounts of data, ML provides the means for processing them accurately, and engineering geology can provide the right problem definition and interpretation of the outcome. In addition, crowdsourced information can be beneficial for correct interpretation and can bring additional value that would refine the landslide-defining algorithms.