COMPARING MODEL PERFORMANCE METRICS FOR LANDSLIDE SUSCEPTIBILITY MAPPING
Keywords: Landslide, Susceptibility map, Validation, Logistic Regression, Random Forest
Abstract. Landslides are among the most widespread hazards in the world and can occur in many locations under different triggering factors. They are therefore also among the most studied hazards. While the mechanism of an event is well understood, forecasting the location and time of the next event remains difficult. Scholars are nevertheless putting great effort into modelling the phenomenon with various tools, and susceptibility mapping is one of the initial and key steps in hazard assessment. While much effort goes into producing such maps, less goes into evaluating them. The current work analyses the behaviour of two validation metrics: the Receiver Operating Characteristic (ROC) curve and the Precision-Recall Curve (PRC). The former is widely used in susceptibility modelling, whereas the latter is seldom used. However, scholars have highlighted a drawback of the ROC: it does not account for class imbalance and can therefore provide unreliable results on imbalanced datasets. The PRC, which does not exhibit this flaw, has been proposed as an alternative. To test the performance of both metrics, they were applied to three susceptibility models produced using the Statistical Index, Logistic Regression and Random Forest methods for the area of Val Tartano, Northern Italy. The results show that the two metrics behave similarly on balanced datasets, whereas on imbalanced datasets the PRC depicts model performance more precisely.
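The contrast between the two metrics under class imbalance can be illustrated with a minimal Python sketch. It is not the procedure used in this work: the scores are invented toy values, the helper names `roc_auc` and `average_precision` are hypothetical, and average precision is used here as one common single-number summary of the PRC. The sketch keeps the same landslide (positive) scores and simply repeats the non-landslide (negative) scores ten times to create imbalance.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive sample.
    Assumes no positive shares a score with a negative."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy susceptibility scores (assumed values, not data from the study).
pos_scores = [0.9, 0.7, 0.4]   # landslide cells
neg_scores = [0.8, 0.5, 0.2]   # non-landslide cells


def make_dataset(neg_repeat):
    """Build labels and scores with the negatives repeated neg_repeat times."""
    negs = neg_scores * neg_repeat
    labels = [1] * len(pos_scores) + [0] * len(negs)
    return labels, pos_scores + negs


bal_labels, bal_scores = make_dataset(1)    # 3 positives : 3 negatives
imb_labels, imb_scores = make_dataset(10)   # 3 positives : 30 negatives

auc_bal = roc_auc(bal_labels, bal_scores)
auc_imb = roc_auc(imb_labels, imb_scores)
ap_bal = average_precision(bal_labels, bal_scores)
ap_imb = average_precision(imb_labels, imb_scores)

print(f"balanced:   ROC AUC = {auc_bal:.3f}, AP = {ap_bal:.3f}")
print(f"imbalanced: ROC AUC = {auc_imb:.3f}, AP = {ap_imb:.3f}")
```

In this toy case the ROC AUC is identical for both datasets, because it depends only on how positives rank against negatives, while the average precision drops once the negatives outnumber the positives. This matches the qualitative behaviour described in the abstract, though real susceptibility models would need to be evaluated on actual inventory data.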