LANDSLIDE SUSCEPTIBILITY MAPPING IN THE MUNICIPALITY OF OUDKA, NORTHERN MOROCCO: A COMPARISON BETWEEN LOGISTIC REGRESSION AND ARTIFICIAL NEURAL NETWORK MODELS

The Rif is among the areas of Morocco most susceptible to landslides because its relatively young relief is marked by much stronger dynamics than other regions. These landslides are one of the most serious problems on many levels: social, economic and environmental. The increase in the frequency and impact of landslides over the past decade has demonstrated the need for an in-depth study of these phenomena that identifies the areas susceptible to landslides. The main objective of this study is to identify the optimal method for mapping the areas susceptible to landslides in the municipality of Oudka. This area was marked by the largest landslide in the region, caused by heavy rainfall in 2013. Two statistical methods, (i) logistic regression (LR) and (ii) artificial neural networks (ANN), were used to create a landslide susceptibility map. Producing this map first required mapping old landslides from aerial photography, the geological map and GPS field surveys. A total of 105 landslides were mapped from these sources. 50% of this database was used for model building and 50% for validation. Eight independent landslide factors were used to detect the most sensitive areas: altitude, slope, aspect, distance to faults, distance to streams, distance to roads, lithology and vegetation index (NDVI). The results of the landslide susceptibility analysis were verified using success and prediction rates. The success rate (AUC = 0.918) and the prediction rate (AUC = 0.901) of the LR model are higher than those of the ANN model (success rate AUC = 0.886; prediction rate AUC = 0.877). These results indicate that the logistic regression (LR) model is the better model for determining landslide susceptibility in the study area.


INTRODUCTION
Landslides are considered to be the most common geological disaster, causing loss of human life and damage to the economy (Bui et al., 2012; Shahabi et al., 2014). This phenomenon occurs when natural or man-made slopes become unstable owing to geological, hydrological and geomorphological conditions, heavy rainfall, seismic movements, volcanic eruptions, and human activities that destabilize slopes (Soeters and van Westen 1996; Bai et al. 2014; Tasoglu, Citiroglu, and Mekik 2016). In Morocco, landslides are a recurrent problem throughout most of the Rif and, to a lesser extent, the Middle Atlas. This region has mountainous topography and is frequently subjected to heavy precipitation. The dynamics associated with the formation of the Rif chain (Alpine tectonics) are accompanied by instabilities mainly related to tectonic movements. The construction of major infrastructure (roads, highways, etc.) is a triggering factor that favors landslides. Landslides cause many economic losses affecting populations, infrastructure and other property.
To remedy this, it is necessary to predict the areas that are susceptible to landslides. To analyze landslide susceptibility in the municipality of Oudka, Taounate, two models were applied and verified: logistic regression and artificial neural networks. RStudio and ArcGIS software were used for spatial data management and model building. Several susceptibility studies have applied different models: frequency ratio (Lee and Pradhan 2007; Umar et al. 2014; Hong et al. 2016b), weights of evidence (Pourghasemi et al. 2012a, 2012b), logistic regression (Xu et al. 2012b; Devkota et al. 2013; Park et al. 2013), support vector machine (SVM) (Yilmaz 2010b; Peng et al. 2014; Tien Bui et al. 2015; Hong et al. 2016b), fuzzy logic (Akgun et al. 2012; Sharma et al. 2013; Zhu et al. 2014; Shahabi et al. 2015), decision tree (Nefeslioglu et al. 2010; Tien Bui et al. 2012; Hong et al. 2015) and artificial neural network (ANN) (Nefeslioglu et al. 2008; Poudyal et al. 2010; Zare et al. 2013; Nourani et al. 2014; Tien Bui et al. 2015; Dou et al. 2015).
Logistic regression (LR) and artificial neural network (ANN) methods are considered to be the two most commonly used methods for assessing the probability of landslide occurrence at medium and regional scales (Dou et al., 2018). In our study area, no landslide susceptibility study has yet applied logistic regression or artificial neural network models. In this context, this study aims to produce landslide susceptibility maps using logistic regression and artificial neural network models based on eight factors drawn from topographic, hydrologic, land use, geology and human activity data. These maps were validated and verified using the relative operating characteristic (ROC) curve, including success and prediction rates.

STUDY AREA
The study area is located in northern Morocco, in the northwest of the province of Taounate; it is one of the areas most exposed to landslides in Morocco. The commune of Oudka lies between longitudes 4°42'11.40"W and 4°56'53.70"W and latitudes 34°42'35.05"N and 34°42'52.43"N, and covers an area of 89 km² (Fig. 1). This area, a continuation of the Rif chain, is characterized by mountainous terrain with no plains except near the wadi Aoulai along the western boundary of the commune. Jbel Oudka is considered the most important mountain of the province of Taounate; its altitude reaches 1600 m. This mountain has a substantial vegetation cover, notably the Oudka forest. In the commune of Oudka, the olive tree occupies most of the arboreal surface area (92%), followed by fig and other crops. Of the lakes in the commune, the most important is Afrat N'joum, which lies in the north of Oudka and has an area of 13,000 m². The average annual temperature in the region is between 15 °C and 16 °C. The average maximum temperature of the hottest month is around 34.2 °C and the average minimum of the coldest month is 0.5 °C. Across the whole pre-Rif area, the average extreme thermal amplitude is between 30 °C and 32 °C, which corresponds to a semi-continental climate.

Image data
A Landsat 8 OLI image was downloaded from the USGS web page and pre-processed by layer stacking of bands 2, 3, 4, 5, 6 and 7. Landsat scenes collected along the same satellite path were mosaicked into a single image. Atmospheric correction was not necessary because the images were taken on the same calendar date (Song, Lee, and Seto 2001).

Landslide inventory map
In our study area, data on old landslides were obtained from GPS field surveys, the geological map, and processing of the Landsat 8 OLI image.
Several studies have shown that the best calculation model is one in which the ratio of landslide to non-landslide points is equal to 1 (Bai et al. 2010). A total of 105 landslide polygons and 43 randomly sampled polygons of stable terrain, mapped from different sources, were converted at a resolution of 30 m into 8911 cells for landslide areas and 9005 cells for stable (landslide-free) areas. Using R, these cells were randomly divided into two subsets: half of the cells were used to build the landslide susceptibility model, while the other half was used to validate it.
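The random halving described above can be sketched as follows. The study performed this step in R; Python is used here purely for illustration. The function name `split_cells`, the fixed seed, the placeholder cell identifiers, and the choice to split the landslide and stable cells separately are all assumptions of this sketch, not details given in the text.

```python
import random

def split_cells(cell_ids, train_fraction=0.5, seed=42):
    """Randomly split cell identifiers into training and validation subsets."""
    rng = random.Random(seed)      # fixed seed (assumed) so the split is reproducible
    shuffled = list(cell_ids)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Cell counts come from the text; the identifiers are hypothetical placeholders.
landslide_cells = [("landslide", i) for i in range(8911)]
stable_cells = [("stable", i) for i in range(9005)]

# Assumption: each class is halved separately so both subsets stay balanced.
ls_train, ls_valid = split_cells(landslide_cells)
st_train, st_valid = split_cells(stable_cells)

train_cells = ls_train + st_train
valid_cells = ls_valid + st_valid
```

Splitting each class on its own keeps the landslide to non-landslide ratio close to 1 in both subsets, which matches the balance recommended above.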

Conditioning factor data set preparation
In this study, we divided the conditioning factors into five datasets: topographic, hydrologic, land use, geology and human activity. The landslide conditioning factors from these datasets were extracted from different sources and stored in the spatial database using a spatial analysis tool (ArcGIS software) with a pixel size of 30 m. We applied the frequency ratio (FR). In its formula, Di is the landslide area in the i-th category, Ai is the area of the i-th category for a given parameter, and N is the number of categories of the parameter.
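As an illustration of the frequency ratio, the sketch below assumes the commonly used form FR_i = (D_i / A_i) / (sum of D over the N categories / sum of A over the N categories), which is consistent with the symbols defined above but is an assumption of this sketch, not a formula confirmed by the text. The function name `frequency_ratio` and the example areas are hypothetical.

```python
def frequency_ratio(landslide_area, class_area):
    """Return FR_i = (D_i / A_i) / (sum(D) / sum(A)) for each category i."""
    if len(landslide_area) != len(class_area):
        raise ValueError("both lists need one entry per category")
    # Landslide density over the whole parameter (all N categories together).
    overall_density = sum(landslide_area) / sum(class_area)
    return [(d / a) / overall_density
            for d, a in zip(landslide_area, class_area)]

# Hypothetical areas (in cells) for N = 3 categories of a single factor.
D = [10, 30, 60]      # landslide area in each category
A = [500, 300, 200]   # total area of each category
fr = frequency_ratio(D, A)
```

Under this form, a value above 1 marks a category in which landslides are denser than in the study area as a whole, and a value below 1 marks a category in which they are sparser.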
The different geological components of the Oudka commune with their frequency ratio values can be found in Tab. 2.
Tab. 2: Frequency ratio values of the different geological components of the Oudka commune

METHODOLOGY
Figure 3 presents a diagram of the methodology used.
To apply and verify the landslide susceptibility models, the study area was randomly divided into two parts, the first for building the models and the second for validation. Areas where landslides had occurred were detected by aerial photography, geological map data, and GPS field surveys.
Topographic, hydrologic, land use, geology and human activity databases were constructed for the analysis. From these databases, eight factors were extracted. Using the detected landslides and the calculated or extracted factors, two methods of landslide analysis were applied: logistic regression and artificial neural networks. Both were implemented in RStudio. Finally, the results of the analysis were verified using the relative operating characteristic (ROC) curve, including success and prediction rates (van Westen, Rengers and Soeters, 2003). The probability p of landslide occurrence in the study area was calculated from the spatial database.
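For the logistic regression case, the probability p can be illustrated with the standard logistic form p = 1 / (1 + exp(-z)), where z = b0 + b1·x1 + ... + bn·xn. This form is assumed here because it is the usual definition of logistic regression; the function name, the intercept, the coefficients and the factor values below are hypothetical and are not the values fitted in the study.

```python
import math

def landslide_probability(intercept, coefficients, factors):
    """Logistic model: p = 1 / (1 + exp(-z)), z = b0 + sum(b_i * x_i)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, factors))
    # Two algebraically equal branches avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical cell: slope of 30 degrees and 250 m distance to the nearest fault.
# The coefficients are placeholders, not the fitted values of the LR model.
p = landslide_probability(-1.0, [0.05, -0.002], [30.0, 250.0])
```

A negative coefficient, such as the one attached to the distance in this example, lowers p as the factor value grows.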

Artificial neural network model
An ANN is a computational mechanism that can acquire, represent, and compute a mapping of information from one multivariate space to another using a data set representing that mapping. The purpose of an ANN is to build a model of the data-generating process so that the network can generalize and predict outputs from inputs that it has not previously seen (Lee, Ryu, and Kim 2007). The multi-layer perceptron (MLP) neural network, described by Rumelhart and McClelland (1986), is one of the most widely used ANNs. The MLP consists of three layers (input, hidden, and output layers) and can identify relationships that are non-linear in nature (Pijanowski et al. 2002). MLP networks are trained by error-correction learning, which means that the desired response of the system must be known; training relies on the back-propagation (BP) algorithm. The S-shaped sigmoid function, the same logistic function that underlies logistic regression, is used as the transfer function. The collective effect on each hidden node is summarized by taking the scalar product of the values of the input nodes and their corresponding interconnection weights. Once the net effect on a hidden node is determined, the activation of that node is calculated using the transfer function (sigmoid function) to obtain a result between 0 and 1.
The BP algorithm randomly selects the initial weights. Then, the difference between the expected and calculated output values across all observations is summarized using the mean square error. After all observations have been presented to the network, the weights are modified according to a generalized delta rule (Rumelhart and McClelland 1986). This process of feeding signals forward and back-propagating errors is repeated iteratively until the error stabilizes at a low level (Pijanowski et al. 2002).
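The feed-forward and back-propagation cycle described above can be sketched in a few lines. This is a minimal illustration with one hidden layer and batch updates, not the network trained in the study; the function names, the learning rate, the number of hidden nodes, the number of epochs and the toy data are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Feed one observation forward; the last weight in each row is a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
              for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def train(X, y, n_hidden=3, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(X[0])
    # Initial weights are selected at random.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []                      # mean square error per epoch
    for _ in range(epochs):
        grad_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        grad_out = [0.0] * (n_hidden + 1)
        squared_error = 0.0
        for x, target in zip(X, y):
            hidden, output = forward(x, w_hidden, w_out)
            error = target - output
            squared_error += error ** 2
            # Delta rule: error times the derivative of the sigmoid.
            delta_out = error * output * (1.0 - output)
            for j, h in enumerate(hidden + [1.0]):
                grad_out[j] += delta_out * h
            for j, h in enumerate(hidden):
                delta_h = delta_out * w_out[j] * h * (1.0 - h)
                for i, v in enumerate(x + [1.0]):
                    grad_hidden[j][i] += delta_h * v
        # Weights change only after all observations have been presented.
        for j in range(n_hidden + 1):
            w_out[j] += rate * grad_out[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                w_hidden[j][i] += rate * grad_hidden[j][i]
        history.append(squared_error / len(X))
    return w_hidden, w_out, history

# Toy data (hypothetical): two inputs, target is 1 unless both inputs are 0.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0.0, 1.0, 1.0, 1.0]
w_hidden, w_out, history = train(X, y)
```

The `history` list records the mean square error after each pass, which is the quantity that is expected to stabilize at a low level as training proceeds.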

Results of landslide susceptibility models
The database, containing one dependent variable (landslide) and the eight independent variables (altitude, slope, aspect, distance to faults, distance to streams, distance to roads, NDVI and geology), was randomly divided in R into two parts, the first to build the models and the second for validation.
Multicollinearity among the independent variables was checked with the variance inflation factor (VIF). The resulting VIF values, shown in Tab. 3, are all less than 4, indicating that there is no collinearity problem. Any landslide factor with a VIF value greater than 4 would have been excluded from the logistic regression model.
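A VIF screening of this kind can be sketched as follows. The sketch relies on the identity that the VIF of each variable equals the corresponding diagonal element of the inverse of the correlation matrix of the predictors. The helper names and the sample columns are hypothetical; only the threshold of 4 comes from the text.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long data columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: perfectly collinear variables")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF_j is the j-th diagonal element of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical sample values for two predictors (not data from the study).
slope = [1.0, 2.0, 3.0, 4.0, 5.0]
altitude = [2.0, 1.0, 4.0, 3.0, 5.0]
values = vif([slope, altitude])
keep = [v <= 4.0 for v in values]   # threshold of 4, as used in the text
```

With only two predictors the result reduces to 1 / (1 - r²), where r is their correlation coefficient, which gives a quick way to check the function by hand.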

Tab. 3: Multicollinearity diagnosis indexes for independent variables
To evaluate the goodness of fit on the training dataset, the Hosmer and Lemeshow test was used; it indicated that the logistic regression (LR) model was statistically significant and predictive.
The relationship between conditioning factors and landslides based on logistic regression (LR) is illustrated in Tab. 4.

Tab. 4: Coefficients of the LR model
From Tab. 4, distance to faults, distance to streams and NDVI are negatively related to landslide risk, i.e. when the values of these factors increase, the risk of landslides decreases. The landslide susceptibility map (Fig. 4) was produced from the weights given in Tab. 4. The very high risk area is the area where the probability of a landslide is high (P > 0.8).
Fig. 4: Landslide susceptibility map produced from the LR model.
Compared with statistical methods, neural networks make it possible to define classes taking into account their distribution in the corresponding domain of each data source (Zhou 1999).
After the training data (dependent and independent variables) were fed into the multilayer perceptron (MLP) neural network model in R, a network architecture was constructed with eight neurons in the input layer, two hidden layers of four and two neurons, and one neuron in the output layer (Fig. 5). After the learning phase (training and testing) was completed and the network objective was reached, the data from the study area were introduced into the network to estimate landslide susceptibility.
After the susceptibility values were obtained, the final landslide susceptibility map was produced (Fig. 6). The very high risk area is the area where the probability of a landslide is high (P > 0.8).
Fig. 6: Landslide susceptibility map produced from the artificial neural network model.

VALIDATION AND
The landslide susceptibility maps produced by the two statistical models (logistic regression and artificial neural network) were divided into five classes. The accuracy of these maps was evaluated by calculating the relative operating characteristic (ROC) and the percentage of observed landslide points falling in each susceptibility category (Nandi and Shakoor 2010).
The area under the ROC curve (AUC) represents the quality of the probabilistic model, that is, its ability to predict whether or not an event occurs (Yesilnacar and Topal 2005).
The ideal model has the curve with the largest AUC; the AUC ranges between 0.5 and 1. An AUC close to 0.5 indicates inaccuracy (Fawcett 2006).
An AUC of 1 indicates a perfect prediction. In this study, all landslide susceptibility models were validated using the success rate and prediction rate methods.
The success rate results were obtained by comparing the landslide susceptibility maps with the landslides in the training dataset, while the prediction rate results were evaluated in R using the validation dataset, which is independent of the data used to build the models.
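The AUC behind both rates can be illustrated with a short sketch. It uses the rank interpretation of the AUC: the share of (landslide, stable) cell pairs in which the landslide cell receives the higher score, with ties counted as one half. The function name and the example scores are hypothetical; the study computed its ROC curves in R.

```python
def auc(scores, labels):
    """AUC as the share of (landslide, stable) pairs that are ranked correctly."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("both landslide and stable cells are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5        # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))

# Hypothetical susceptibility scores; label 1 = landslide cell, 0 = stable cell.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)   # 8 of the 9 pairs are ranked correctly
```

Applied to the training cells, this quantity corresponds to the success rate; applied to the independent validation cells, it corresponds to the prediction rate.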
The ROC curves of this study are illustrated in Fig. 7. These results indicate that the LR model is the better model for determining landslide susceptibility in the study area. The landslide susceptibility maps were verified against landslides covering 4463 pixels of the municipality of Oudka. These landslides were not used in the construction of the models. The landslide susceptibility maps of the two models were divided into five categories of the landslide susceptibility index (LSI) (Fig. 8): very low (0 < LSI ≤ 0.2), low (0.2 < LSI ≤ 0.4), medium (0.4 < LSI ≤ 0.6), high (0.6 < LSI ≤ 0.8) and very high (LSI > 0.8). Overlaying the verification landslides (4463 pixels) on the susceptibility maps produced by the LR and ANN models allowed us to determine the percentages of test landslide points falling into the different susceptibility categories (Fig. 8). In the very low susceptibility class, we found just 1% of the observed landslides for the LR method and 3% for the ANN method. In the high and very high susceptibility classes, by contrast, we found 85% and 68% of the observed landslides for the LR and ANN methods, respectively. Comparing the results of the LR and ANN analyses, we determined that the LR method performed better than the ANN method.
The LR method is therefore the better approach for assessing landslide susceptibility in the commune of Oudka.
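The five-class breakdown used in this comparison can be sketched as below. The class limits follow the LSI intervals given in the text; the function names and the sample LSI values are hypothetical and do not reproduce the study's pixels.

```python
CLASS_NAMES = ["very low", "low", "medium", "high", "very high"]

def susceptibility_class(lsi):
    """Map an LSI value to one of the five classes defined in the text."""
    if lsi <= 0.2:
        return "very low"
    if lsi <= 0.4:
        return "low"
    if lsi <= 0.6:
        return "medium"
    if lsi <= 0.8:
        return "high"
    return "very high"

def class_percentages(lsi_values):
    """Percentage of verification landslide pixels that fall in each class."""
    counts = {name: 0 for name in CLASS_NAMES}
    for value in lsi_values:
        counts[susceptibility_class(value)] += 1
    total = len(lsi_values)
    return {name: 100.0 * count / total for name, count in counts.items()}

# Hypothetical LSI values read at ten verification landslide pixels.
sample = [0.05, 0.2, 0.35, 0.55, 0.7, 0.75, 0.85, 0.9, 0.95, 0.99]
percentages = class_percentages(sample)
high_or_above = percentages["high"] + percentages["very high"]
```

Summing the "high" and "very high" shares, as in `high_or_above`, gives the kind of figure reported above for each model.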

DISCUSSION AND CONCLUSIONS
The commune of Oudka is an extension of the Rif chain and is characterized by mountainous terrain with a substantial vegetation cover, notably the Oudka forest. The commune is frequently subject to landslides.
Recently, many landslides have occurred in this area. The largest is the Tissoufa landslide in northern Oudka, caused by the heavy rains of 2013, which completely destroyed five buildings and cut the road RP 5302. Field surveys, aerial photography interpretation and data analysis allowed the factors causing landslides to be identified.
In this study, the LR and ANN models were used to analyze landslide susceptibility and create landslide susceptibility maps that are useful to local authorities when choosing appropriate locations for implementing land use plans and environmental protection (Ozdemir and Altural, 2013). Based on the results of both models, we identified LR as the better model, with the highest predictive power for the study area.
Any landslide susceptibility analysis assumes a level of susceptibility above which an active landslide is expected to occur. If only areas of high to very high susceptibility were considered at risk, LR would provide the best results. If landslides were also expected in areas of at least moderate susceptibility, the LR model would still yield better results. Lower percentages of landslides were observed in areas of low susceptibility.
Fig and other crops are also cultivated. The municipality of Oudka is known nationally for its high rainfall. Between 1977 and 2018, the Jbel Oudka station recorded an average annual rainfall of 1455 mm. The territory of the commune of Oudka is part of the producing area of the Ouergha watershed; it is crossed by several tributaries of the oued Aoulai, such as oued Elil, oued Elmaleh and oued Assenou. Several lakes are found in this territory, especially in the Oudka forest.

Fig. 3: Flow diagram showing the methodology

Fig. 5: Neural network architecture. The final weights and biases are shown in the following table (Tab. 5):

Fig. 7: ROC curve evaluation of the LR and ANN models: (a) success rate curves and (b) prediction rate curves. The AUC values obtained from the susceptibility maps show that the LR model gave the highest success rate (AUC = 0.918) and the best prediction rate (AUC = 0.901), compared with the ANN model, which gave a lower success rate (AUC = 0.886) and prediction rate (AUC = 0.877).

Fig. 8: Percentages of test landslide points falling into different susceptibility categories using LR and ANN.