LANDSLIDES EXTRACTION FROM DIVERSE REMOTE SENSING DATA SOURCES USING SEMANTIC REASONING SCHEME

Using high-resolution satellite imagery to detect, analyse and extract landslides automatically provides increasingly strong support for rapid response after a disaster. This requires the formulation of procedures and knowledge that encapsulate the content of the disaster area in the images. The object-oriented approach has proved useful in solving this issue by partitioning land-cover parcels into objects and classifying them on the basis of expert rules. Since the landslide information present in the images is often complex, an extraction procedure based on the object-oriented approach should primarily consider the semantic aspects of the data. In this paper, we propose a scheme for recognizing landslides by using an object-oriented analysis technique and a semantic reasoning model on high spatial resolution optical imagery. Three case regions with different data sources are presented to evaluate its practicality. The procedure is designed as follows: first, the Gray Level Co-occurrence Matrix (GLCM) is used to extract texture features after the image explanation. Spectral features, shape features and thematic features are then derived for semiautomatic landslide recognition. A semantic reasoning model is used afterwards to refine the classification results by representing expert knowledge as first-order logic (FOL) rules. The experimental results are essentially consistent with the experts’ field interpretation, which demonstrates the feasibility and accuracy of the proposed approach. The results also show that the scheme generalizes well across diverse data sources.


INTRODUCTION
As major natural hazards, landslides not only pose severe threats to human life, property, the natural environment, constructed facilities and infrastructure (Nadim et al., 2006; Lacasse et al., 2009; Klose et al., 2014), but also have relevant indirect impacts on society and various induced economic effects (Scaioni et al., 2014). Routine quantification of landslide occurrence is essential to the assessment, mitigation and management of landslide risk (Metternicht et al., 2005). Since landslides usually leave visible marks on the territory, visual interpretation of optical images is beneficial to landslide identification, which is important for landslide mapping, landslide inventories, and landslide hazard assessment (Guzzetti et al., 2012; Cheng et al., 2013). Conventional methods for delineating landslides on imagery through field survey and manual interpretation are tedious, error-prone, and inherently subjective (Malamud et al., 2004; Albrecht et al., 2010).
The object-oriented approach (Aksoy et al., 2012) was introduced to realize automated landslide extraction and image analysis by grouping pixels into objects that share highly relevant characteristics (Benz et al., 2004). Using this approach to detect landslides means segmenting images based on a series of criteria related to spectral, shape, color and contextual information, among others (Blaschke, 2010). However, landslides usually have spectral characteristics similar to those of roads, river sand, or barren rock, so landslide objects created mainly from spectral and size criteria may not be consistent with the real landslide areas (Martha et al., 2011). To detect landslides unambiguously, it is critical to establish comprehensive expert system rules by combining the assessment of spectral, spatial, morphological, and contextual parameters (Navulur, 2006). This process of obtaining useful spatial and thematic information on the objects by using human knowledge and experience is defined as the extraction of image semantics (Forestier et al., 2012).
The lack of consistency between the content (i.e. objects) extracted from images and its interpretation is called the semantic gap (Smeulders et al., 2000). A number of studies with different methodologies based on imagery and the object-oriented approach have attempted to reduce the semantic gap (Forestier et al., 2012; Belgiu et al., 2014; Rejichi et al., 2015; Luo et al., 2016). Among these studies (see also Oliva-Santos et al., 2014; Hudelot et al., 2003; Liu et al., 2007; Arvor et al., 2013), the most frequently used method is to apply ontologies, usually defined as formal, explicit specifications of a shared conceptualization (Gruber, 1993), to formalize image interpretation knowledge for developing automated image classification procedures. Most of these ontologies are developed using the Web Ontology Language (OWL) specifications (Motik et al., 2008). OWL is based on Description Logics (DL), through which statements can be automatically tested by a reasoner (Tsarkov et al., 2006). Nevertheless, OWL decidability is achieved at the price of losing expressiveness, so OWL reasoners are unable to cope with more expressive ontologies (Álvez et al., 2012).
Additionally, understanding the causes and triggering mechanisms is important for landslide risk assessment and mitigation (Lacasse et al., 2009). The landslide causes are the reasons that a landslide occurred in a location at a specific time, which can be considered as the factors that made the slope unstable and vulnerable to failure, and a trigger is a single event that finally initiates a landslide. Landslide causes fall into two primary categories: natural occurrence and human activities. Water, seismic activity and volcanic activity are the three major triggering mechanisms in the natural occurrence category.
Since understanding the landslide causes and triggering mechanisms can help determine the landslide-affected areas and develop landslide prediction models, it is reasonable to assume that modelling the landslide causes and realizing semantic reasoning based on them can assist landslide extraction from imagery. On the other hand, first-order logic (FOL) (Lifschitz et al., 2008) is a well-known and expressive formalism, whereas DLs can be regarded as decidable subsets of it. Considering that the semantic formalization of landslide causes can be complex and expressive, using FOL provers in a hybrid tool may save effort in developing algorithms and reasoners for various DLs (Tsarkov et al., 2003) and, more importantly, may provide a useful means of dealing with the complex DLs that represent the knowledge of landslide causes.
In this paper, we propose a framework for landslide extraction using semantic reasoning on the basis of the object-oriented approach and FOL. The framework has two potential benefits. On the one hand, we intend to create a transferable and automated method for landslide extraction that increases the classification accuracy and the generality across diverse data sources. On the other hand, a model that captures the landslide causes and triggering mechanisms in FOL can provide a computer-readable description of this knowledge, which may be useful for other relevant research in the future.
The rest of this paper is organized as follows: after describing the study areas and relevant data in Section 2, the paper deals with the object-oriented landslide extraction approach and gives a short analysis of the preliminary classification results in Section 3. Section 4 first models three types of landslides triggered by different physical causes in the form of FOL, and then puts forward a method to realize semantic reasoning by using these models in Prover9 (McCune, 2007). The study is summarized in Section 5.

STUDY AREAS AND USED DATA SETS
Three regions are used as the study areas in this paper: Dujiangyan, affected by the Wenchuan earthquake; Baoxing, affected by the Ya'an earthquake; and the Neiliu railway landslide area (Figure 1). Information on the images of these regions is shown in Table 1.
Data set A: The Wenchuan earthquake on May 12, 2008 was the largest seismic event in China in the last half century and caused massive fatalities and injuries. The earthquake triggered around 60,000 landslides (Gorum et al., 2011) and a considerable number of consequent secondary hazards. In this paper, we choose Hongkou County, Dujiangyan City as the study area, which is about 20 kilometres from the earthquake's epicentre. An IKONOS image covering about 53 km² of Hongkou County, acquired on 28 June 2008, is used in the experiment.
Data set B: The second study area is Baoxing County, Ya'an City, in Sichuan Province, China. This area was severely damaged by the 2013 Ya'an earthquake. The data used in this research are UAV images that cover the whole area of Baoxing County. The images (size: 5616 × 3744 pixels) were acquired from a height of 500 m using a ZC-5 UAV (TopRS) with a Canon EOS 5D Mark II camera (35.5132 mm focal length) onboard. The UAV was launched on April 21, 2013, one day after the occurrence of the Ya'an earthquake. We chose a study region (2410 × 1122 pixels) from the mosaicked images after preprocessing.
Data set C: On August 2, 2013, a landslide event occurred along the Neijiang-Liupanshui railway, and the railway was cut off by this event. SPOT6 images of this area were acquired on October 12, 2013. Based on these images, a 15 m DEM was created by stereo-image matching with manual editing.

Table 1 columns: Sensor; Acquisition time; Resolution; DEM data (first row: Wenchuan Earthquake).

The initial landslide extraction is conducted based on the selected features and segmentation with a supervised classification method.

Pre-processing
In order to eliminate or correct image distortion caused by radiometric errors, radiometric correction is applied. After that, the ortho-image is obtained using SRTM DEM data or the DEM generated from stereo images of the study areas. Pansharpening is also adopted to improve the spatial resolution of the multispectral images while retaining their spectral information. In this research, we use the Gram-Schmidt spectral sharpening method to fuse the multispectral image with the higher-resolution panchromatic image and generate a four-band pansharpened high-resolution multispectral image, which reduces uncertainty and minimizes redundancy (Sun, 2013). The multispectral images (resolution: 4 m for IKONOS, 6 m for SPOT 6) and panchromatic images (resolution: 1 m for IKONOS, 1.5 m for SPOT 6) are fused at the pixel level using the Gram-Schmidt spectral sharpening algorithm in ENVI 5.1. After the pre-processing, the ortho-images (IKONOS, UAV, and SPOT6) of the three study areas are obtained.

Object-oriented approach
An object-oriented approach has two major steps: image segmentation and image object classification.
In this paper, we use the multi-resolution segmentation algorithm, a bottom-up region-merging technique, to obtain suitable objects. The features used as criteria to classify these objects include the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Water Index (NDWI) (McFeeters, 1996), contrast, brightness, density, the DEM and the Gray Level Co-occurrence Matrix (GLCM), among others.
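The two spectral indices named above can be illustrated with a minimal sketch. It assumes the usual normalized-difference definitions, NDVI = (NIR - Red) / (NIR + Red) and the McFeeters (1996) NDWI = (Green - NIR) / (Green + NIR), applied to per-object mean band values. The function names and the reflectance values are hypothetical and are not taken from the study data.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index of one object."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """Normalized Difference Water Index in the McFeeters (1996) form."""
    return normalized_difference(green, nir)


# Hypothetical mean band values of three image objects (illustration only).
objects = {
    "forest": {"green": 0.08, "red": 0.05, "nir": 0.45},
    "river": {"green": 0.10, "red": 0.06, "nir": 0.03},
    "scarp": {"green": 0.20, "red": 0.24, "nir": 0.28},
}

for name, bands in objects.items():
    print(name,
          round(ndvi(bands["nir"], bands["red"]), 3),
          round(ndwi(bands["green"], bands["nir"]), 3))
```

In this toy example the vegetated object has a high NDVI, the water object has a positive NDWI, and the bare scarp is low on both, which is the kind of separation the rule set relies on.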

Segmentation:
Image segmentation, the first step in the object-oriented approach, is crucial to the detection extent and classification accuracy (Rau et al., 2014). It is the basis of object-oriented analysis and has a direct impact on the subsequent analysis. Many segmentation algorithms exist for processing remote sensing images. A comprehensive review of these algorithms can be found in Dey et al. (2010). In this paper, we use Multi-Resolution Image Segmentation (MRIS; Benz et al., 2004), a region-growing procedure that merges objects upwards from the pixel level based on a user-specified balance of shape and spectral measures (Parker, 2013). The segmentation algorithm is implemented in the Definiens eCognition software (eCognition, 2014a).
Additionally, the scale parameter and the composition of homogeneity are selected to distinguish the heterogeneity of different objects. The composition of homogeneity includes color (spectral bands) and shape (compactness and smoothness). Through a trial-and-error approach, neighbouring objects are merged into near-desirable objects according to their spectral and shape characteristics.
Before segmentation, edge detection is used to aid the segmentation and retain the shape of roads and other objects with obvious boundaries. For example, in Data set A, we choose different scale parameters (30, 50, and 100). The multi-spectral bands (blue, green, red and near-infrared (NIR)) are equally weighted with a value of one, and the edge detection filter is assigned a weight of five. The shape criterion is weighted 0.2 and the compactness is weighted 0.6. The parameters of the three data sets are shown in Table 2.

Following previous studies (… et al., 2011; Rau et al., 2014), we use the following object features for extraction:
a. Because rock and soil slide away, the regions left after landslides have no vegetation cover. NDVI can discriminate vegetation from non-vegetation. Because the study areas include water, NDWI is used to remove the influence of water.
b. Brightness is the weighted average of the image intensity for each object (eCognition, 2014b). Landslides have a higher intensity than other objects. Hence, we can filter out bare soil or vegetation from landslide areas by using brightness.
c. To remove shadow from the background, contrast is also taken into account. In addition, features such as homogeneity, mean and density of the GLCM, and the DEM are used to assist the extraction of landslides as well.
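The GLCM texture features mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a single horizontal offset of one pixel, a symmetric and normalized matrix, an image already quantized to a small number of gray levels, and the common Haralick-style definitions of contrast and homogeneity. The helper names are hypothetical, and eCognition's own implementation may differ in offsets and normalization.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[value / total for value in row] for row in counts]


def glcm_contrast(p):
    """Sum of p(i, j) * (i - j)^2 over all gray level pairs."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def glcm_homogeneity(p):
    """Sum of p(i, j) / (1 + (i - j)^2) over all gray level pairs."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))


# Small 4-level test image (illustration only, not study data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
print(round(glcm_contrast(p), 4), round(glcm_homogeneity(p), 4))
```

Smooth objects concentrate their co-occurrence mass near the diagonal, giving low contrast and high homogeneity, whereas rough debris surfaces tend to do the opposite.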

Classification:
After an image is segmented, its objects can be detected and classified using a rule set. Firstly, we use the selected features to build a rule set that extracts vegetation and water. Then, appropriate samples and features are chosen to classify landslides and the other land types using a supervised classification method, the nearest neighbor method.
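The nearest neighbor step can be illustrated with a minimal sketch: each unlabeled object receives the class of the closest training sample in feature space. The two features (NDVI and a brightness value scaled to 0..1), the training samples and the function name are assumptions made for this example; they are not the samples or the feature space used in the experiments.

```python
import math


def nearest_neighbor(sample, training):
    """Return the label of the training sample closest to `sample`.

    `training` is a list of (feature_vector, label) pairs; all feature
    vectors are assumed to be scaled to comparable ranges.
    """
    best_label, best_distance = None, math.inf
    for features, label in training:
        distance = math.dist(sample, features)  # Euclidean distance
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


# Hypothetical training samples: (NDVI, scaled brightness) -> class.
training = [
    ((0.75, 0.30), "vegetation"),
    ((0.05, 0.80), "landslide"),
    ((0.02, 0.45), "bareland"),
]

print(nearest_neighbor((0.08, 0.72), training))
print(nearest_neighbor((0.70, 0.35), training))
```

Because the decision depends only on distances, classes with similar feature values (for example landslides and bare land) can easily be confused, which is consistent with the misclassifications reported for the initial extraction.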
By observing the images, we classify the regions into seven types: vegetation, water, building, road, shadow, landslide and bareland. In Data set B there are no obvious roads, so we use "other land" instead of road and building. Furthermore, we use the NDVI to extract vegetation regions from the images and treat landslides as non-vegetation regions. Then we calculate the NDWI to distinguish water. After that, based on the samples from the three regions and the features we chose, we obtain the classification results through the nearest neighbor method (Figure 2). Because our aim is to extract landslides, the other land types are not considered in the accuracy analysis. Figure 3 shows the initial landslide extraction results. The results show that some barelands, buildings and roads are classified as landslides because of their similar spectral characteristics. As test samples, 600 landslide sample points in Data set A, 250 in Data set B and 150 in Data set C are chosen, and the same number of non-landslide sample points is selected in each data set. The overall accuracy, Kappa coefficient and user's and producer's accuracies are shown in Table 3.
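The accuracy measures named above can be computed from a two-class confusion matrix, as in the following sketch. It uses the standard definitions of overall accuracy, Cohen's Kappa, and producer's and user's accuracy for the landslide class. The counts in the example are hypothetical and do not reproduce Table 3.

```python
def accuracy_report(tp, fn, fp, tn):
    """Accuracy measures for a landslide / non-landslide confusion matrix.

    tp: landslide reference points classified as landslide
    fn: landslide reference points classified as non-landslide
    fp: non-landslide reference points classified as landslide
    tn: non-landslide reference points classified as non-landslide
    """
    n = tp + fn + fp + tn
    overall = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    kappa = (overall - expected) / (1 - expected) if expected != 1 else 0.0
    producers = tp / (tp + fn) if (tp + fn) else 0.0  # 1 - omission error
    users = tp / (tp + fp) if (tp + fp) else 0.0      # 1 - commission error
    return {"overall": overall, "kappa": kappa,
            "producers": producers, "users": users}


# Hypothetical counts for 100 landslide and 100 non-landslide test points.
report = accuracy_report(tp=80, fn=20, fp=10, tn=90)
for name, value in report.items():
    print(name, round(value, 4))
```

With equal numbers of landslide and non-landslide test points, as in the sampling design described above, the chance agreement is 0.5, so the Kappa coefficient is simply twice the overall accuracy minus one.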
As discussed above, barelands, buildings and roads have similar spectral characteristics, which means that separating them from each other by relying only on optical methods is a difficult task. Hence, additional information that can be used to distinguish between these objects is vital. In the following section, we introduce a means of using FOL to represent the knowledge of landslide causes and triggering mechanisms as additional information for the initially extracted landslides.

Modelling Landslides in FOL
Among all the physical causes of landslides, strong earthquakes are the prime triggering factors (Keefer, 1984). The major type of landslides in two of our three study areas (Hongkou County and Baoxing County) belongs to this case. Another primary cause of landslides is slope saturation by water, which may occur through intense rainfall, snowmelt, changes in ground-water levels, and surface-water level changes along coastlines, earth dams, and in the banks of lakes, reservoirs, canals, and rivers (USGS, 2008). In our case, the Neiliu railway landslide was essentially triggered by strong rainfall.
Apart from triggering serious coseismic landslides, strong earthquakes also lead to increased post-seismic slope instability over a long period of time, which leaves slopes highly susceptible to future landslides under heavy rainfall conditions. For example, a strong rainfall event occurred four months after the Wenchuan earthquake, induced 969 new landslides and enlarged 169 existing landslides (Tang et al., 2011).
Based on the study areas and the above analysis, we attempt to model three kinds of landslides that are strongly related to these two physical causes, earthquake and intense rainfall:
a. Landslide triggered by earthquake;
b. Landslide triggered by intense rainfall;
c. Landslide triggered by intense rainfall after an earthquake.

Huang (2015) suggests that four main factors increase the susceptibility of a certain area to earthquake-induced landslides: distance from the seismic fault, slope profile type, slope angle and elevation. Based on these assumptions, the semantics of landslides triggered by earthquake can be defined through a translation into the following first-order formula:

$$
\forall o, \forall l, \forall d \, \Big( \mathrm{Object}(o) \land \mathrm{Location}(o, l) \land \mathrm{DEM}(d) \land \exists f \, \big( \mathrm{distance\_to\_fault}(l, f) \le \mathrm{certainDistance} \big) \land \big( \mathrm{has\_elevation}(l, d) = \mathrm{certainValue} \lor \mathrm{slope\_angle}(l, d) = \mathrm{certainDegree} \lor \exists p \, ( \mathrm{slope\_profile}(l, p) = \mathrm{certainType} ) \big) \rightarrow \mathrm{seismic\_landslide}(o) \Big)
$$

It states that an object obtained from an image that is located near the seismic fault and has a certain elevation value, a certain slope angle, or a certain type of slope profile can be an earthquake-induced landslide. Hong et al. (2007) reported, based on empirical assumptions, that elevation, vegetation and soil type govern the susceptibility to rainfall-triggered landslides. Their analysis can be expressed as follows:

$$
\forall o, \forall l, \forall d \, \Big( \mathrm{Object}(o) \land \mathrm{Location}(o, l) \land \mathrm{DEM}(d) \land \exists r \, \big( \mathrm{has\_elevation}(l, d) = \mathrm{certainValue} \land \mathrm{receive\_more\_rainfall}(d, r) \big) \land \big( \exists s \, ( \mathrm{soil\_type}(l, s) = \mathrm{certainType} ) \lor \exists v \, ( \mathrm{vegetation}(l, v) = \mathrm{certainType} ) \big) \rightarrow \mathrm{rainfall\_landslide}(o) \Big)
$$
It expresses that an object in a high-elevation area is more likely to receive greater amounts of rainfall, and if it has a certain soil type or a certain land cover (e.g. bare land), it can be a rainfall-triggered landslide, even if no earthquake happened before the rainfall event. Tang et al. (2009) consider that abundant loose debris and numerous extension cracks were induced on hill slopes after the Wenchuan earthquake, and that this debris and these cracks led to rainfall-triggered landslides during subsequent heavy rains. Other factors that influence the distribution of post-seismic landslides are the proximity of active faults and major rivers and the lithology, especially Silurian slates and phyllites; these landslides mostly occurred at elevations between 900 m and 1500 m (Tang et al., 2011). The semantics under these factors can be expressed as follows:

The values of elevation, slope angle, and slope profile can be derived from the DEM, while the distance from the seismic fault is calculated from a vector map, and other information such as the slope profile or soil type can be retrieved from a database. After the assumptions and the goal are input and the search is started in Prover9, a proof is found, indicating that the object is not a seismic landslide.
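As an illustration of how a slope angle can be derived from a DEM, the following sketch uses central finite differences on a regular elevation grid. This is one common choice and an assumption of this example, not necessarily the method used to obtain the values in this study. The function name and the small grid are hypothetical; the 15 m cell size matches the DEM resolution of Data set C.

```python
import math


def slope_degrees(dem, row, col, cell_size):
    """Slope angle in degrees at an interior DEM cell (central differences).

    dem: 2D list of elevations in metres; cell_size: grid spacing in metres.
    """
    if not (0 < row < len(dem) - 1 and 0 < col < len(dem[0]) - 1):
        raise ValueError("slope is only computed for interior cells")
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Hypothetical 3 x 3 DEM rising 15 m per cell towards the east.
dem = [
    [900.0, 915.0, 930.0],
    [900.0, 915.0, 930.0],
    [900.0, 915.0, 930.0],
]
print(round(slope_degrees(dem, 1, 1, cell_size=15.0), 2))
```

A value computed this way can then be compared with a chosen threshold to decide whether the slope-angle condition of a rule holds for the location of an object.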
We have tested the first two landslide models, i.e. landslide triggered by earthquake and landslide triggered by intense rainfall, with the proposed methods on the three data sources. In other words, we translate the samples from the preliminary classification results obtained in Section 3.2 into FOL and input them into Prover9. The results are then used to improve the initial landslide extraction. The whole scheme proposed in this work is shown in Figure 4. Figure 5 and Table 4 show the results and the accuracy analysis after using semantic reasoning to improve the initially extracted landslide objects.
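The translation of a sample into Prover9 input can be sketched as follows. This is a simplified illustration, not the rule base used in the experiments: the predicate names, the single propositionalized rule and all threshold values are hypothetical. Because Prover9 does not evaluate numeric comparisons, the thresholds are checked in Python and only the resulting ground facts are written out in Prover9's `formulas(assumptions)` / `formulas(goals)` syntax.

```python
# Hypothetical, simplified rule for earthquake-triggered landslides.
RULE = ("all x ((object(x) & near_fault(x) & "
        "(high_elevation(x) | steep_slope(x) | susceptible_profile(x))) "
        "-> seismic_landslide(x)).")


def object_facts(distance_to_fault_km, elevation_m, slope_deg,
                 max_fault_km=10.0, min_elevation_m=1000.0,
                 min_slope_deg=30.0):
    """Evaluate the numeric conditions; all thresholds are assumptions."""
    return {
        "near_fault": distance_to_fault_km <= max_fault_km,
        "high_elevation": elevation_m >= min_elevation_m,
        "steep_slope": slope_deg >= min_slope_deg,
    }


def prover9_input(obj_id, facts, goal="seismic_landslide"):
    """Build the text of a Prover9 input file for one image object."""
    lines = ["formulas(assumptions).", "  " + RULE, f"  object({obj_id})."]
    for predicate, holds in sorted(facts.items()):
        sign = "" if holds else "-"  # '-' is negation in Prover9
        lines.append(f"  {sign}{predicate}({obj_id}).")
    lines += ["end_of_list.", "", "formulas(goals).",
              f"  {goal}({obj_id}).", "end_of_list."]
    return "\n".join(lines) + "\n"


print(prover9_input("obj1", object_facts(3.0, 800.0, 35.0)))
```

The generated text can be saved to a file and passed to Prover9, for example with `prover9 -f obj1.in`. Producing one such file per object in a loop is a straightforward way to script the translation step.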
The model of landslides triggered by intense rainfall after a strong earthquake has not been verified in this research, since we do not have images taken after the September rainstorm event in Wenchuan. In addition, as Prover9 can only process one input at a time, processing a large volume of samples can be very time-consuming. Hence, we still need to find a way to improve the efficiency and the degree of automation. The methods in the work of Tsarkov (2004) or Álvez (2012) may provide potential solutions.

Figure 1. The locations and images of the study areas

IMAGE PROCESSING AND INITIAL LANDSLIDE EXTRACTION
In this section, the object-oriented approach is used to realize the initial landslide extraction. Firstly, the images of the three regions are pre-processed, including radiometric correction, orthorectification and pansharpening, before classification and landslide extraction. Then, segmentation and feature selection are implemented as the first and second steps of the object-oriented approach.

Figure 2. Classification results based on object-oriented

Figure 4. The proposed framework of the landslide extraction based on object-oriented approach and semantic reasoning

Table 1. Study areas and data information

Table 3. The accuracy analysis of initial landslide extraction

Table 4. The accuracy analysis of the landslide extraction after using semantic reasoning

CONCLUSIONS
The extraction of landslides from images based on the object-oriented approach has proved to be an effective tool in disaster management, risk assessment and mitigation. Although remote sensing technology has developed greatly, landslide extraction still relies heavily on manual interpretation and the knowledge of domain experts. We present a relatively new perspective on landslide extraction that uses FOL to express the knowledge of landslide triggering factors. The intention is to create a transferable means of landslide extraction that depends less on manual intervention and can be applied to heterogeneous data sources. Based on the data from three study areas and literature reviews, we have modelled three types of landslides in FOL according to different scenarios. To evaluate the method, we use eCognition to prepare preliminary landslide objects from the images. The initial classification results show that the overall accuracies of Data set A, Data set B and Data set C are 82.33%, 76.80% and 81.67% respectively. We then test the feasibility of our model by choosing object samples as inputs to the FOL theorem prover Prover9. With the help of semantic reasoning, the final overall accuracies of Data set A, Data set B and Data set C reach 89.58%, 84.00% and 88.33% respectively. However, the methods presented in this paper are still at an early stage, not only because the occurrence of landslides is a complex function of various natural and human factors, but also because FOL theorem provers cannot be used directly. These two main challenges point out the directions of our future work.

REFERENCES
Aksoy, B., and Ercanoglu, M., 2012. Landslide identification and classification by object-based image analysis and fuzzy logic: an example from the Azdavay region (Kastamonu, Turkey). Computers & Geosciences, 38(1), pp. 87-98.