GEOSTATISTICAL ANALYSIS FOR LANDSLIDE PREDICTION IN TRANSILVANIA, ROMANIA

The management of spatial data by means of Geographic Information Systems (GIS) plays an essential role in applying the latest achievements in the Geomatics domain. Geomatics offers the possibility to share, compare and exchange data between researchers and users in unambiguous and accessible ways, supporting map production and user-friendly technologies for communicating results. The geodesist's contribution to landslide monitoring projects that aim to develop early-warning systems or risk maps is sometimes not adequately appreciated, and the geodesist is seen only as a supplier of measured geometric data. In fact, the geodesist contributes significantly through the ability to model dynamic systems, such as strategic constructions (dams, tall buildings etc.) or landslides, and to process and interpret data. This study focuses on using geomorphological characteristics to detect the changes and effects of landsliding with the ArcGIS 10.1 extension Geostatistical Analyst. One of the main uses of geostatistics is to predict the values of a sampled variable over the whole area of interest, which is referred to as spatial prediction or spatial interpolation. The extension allows a surface to be created from measurements taken over an area where collecting information at every possible location would be impossible, and it helps in understanding the qualitative and quantitative aspects of the data. By making it possible to predict and model spatial phenomena based on statistics, and by incorporating powerful exploration tools, ArcGIS Geostatistical Analyst effectively bridges the gap between geostatistics and geographic information system (GIS) analysis.


INTRODUCTION
Geostatistics is a branch of statistics specializing in the analysis and interpretation of geographically referenced data. In other words, geostatistics consists of statistical techniques adapted to spatial data. The discipline has developed gradually and has been applied over time in many areas of the world. It was originally created to estimate the probable distribution and volume of existing resources, such as ore grades in mining. Over the years, geostatistics has been applied in various fields such as petroleum geology, hydrology, meteorology, geography, forestry, oceanography and geochemistry.
Geostatistics is closely linked to interpolation, that is, to estimating values at unsampled locations from the available data under an appropriate hypothesis. It implies more than data interpolation, however, since it is used to predict values based on sampled variables in an area of interest, a method known as spatial prediction. Accurate results can be obtained with the ArcGIS Geostatistical Analyst extension, which provides a set of tools for exploring spatial data and generating surfaces by statistical methods. This extension allows users to create surfaces from data measured over an area where collecting information at every individual location would be impossible. The present paper proposes a comparative analysis of the main methods for predicting landslides in the Transylvanian area, Romania, using the soil erodibility factor K.

THE PRINCIPLES OF GEOSTATISTICAL ANALYSIS
Geostatistical analysis is a form of surface interpolation, part of spatial analysis, and relies on spatial autocorrelation. Geostatistical techniques focus on the spatial structure of a variable, that is, on the mathematical relation between its value measured at a given point and the values of the same variable measured at other points at certain distances from the first. The first law of geography, formulated by Waldo Tobler, states that everything is related to everything else, but near things are more related than distant things.
Geostatistical analysis uses sample points taken at different locations and creates (interpolates) a continuous surface. The sample points are measurements of a phenomenon, e.g. radiation leakage from a nuclear power plant, an oil spill, depth to the water table or altitude. This type of analysis creates a surface by using the values at the measured locations to predict values for every location in the area.
The Geostatistical Analyst extension provides two groups of interpolation techniques: deterministic and geostatistical. The deterministic techniques are based on mathematical functions for interpolation, while the geostatistical techniques are based on both mathematical and statistical methods and can be used to create surfaces and to assess the uncertainty of the predictions.
To create surfaces, both groups of methods rely on the similarity of nearby sample points. In addition to the interpolation techniques, Geostatistical Analyst provides supporting tools. These tools allow users to explore the data and understand it better, in order to create the best surface from the available information.

The deterministic interpolation technique
As mentioned earlier, this technique uses mathematical functions. The surface most commonly generated with this technique is the digital terrain model based on elevation, but other measurements (at ground level, below ground or in the atmosphere) can also be used to generate continuous surfaces. A major challenge in much GIS modelling is to generate the most accurate surface from the sample data and to characterize the error and variability of the predicted surface. The newly generated surfaces are then used for modelling and analysis in 3D views. By understanding the quality of the data, users can improve the usefulness and purpose of the GIS modelling. This is the role of Geostatistical Analyst. The technique can be divided into two groups: global and local.
Global techniques calculate predictions from the entire dataset, while local techniques calculate predictions from the measured points within neighbourhoods, which are smaller areas within the larger study area. Geostatistical Analyst provides the global polynomial as the global interpolator, and Inverse Distance Weighted (IDW), local polynomial and radial basis functions as local interpolators.
A deterministic interpolation can either force the resulting surface to pass through the measured values or not. An interpolation technique that predicts a value identical to the measured value at a sampled location is called an exact interpolator. An inexact interpolator predicts a value that differs from the measured value. The latter can be used to avoid sharp peaks or troughs in the resulting surface. IDW and radial basis functions are exact interpolators, while global and local polynomials are inexact.
The deterministic methods that can be applied are therefore: Inverse Distance Weighted (IDW), Global Polynomial, Local Polynomial and Radial Basis Functions.
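As an illustration of an exact local interpolator, the following is a minimal sketch of IDW in plain Python. It is not the Geostatistical Analyst implementation; the function name `idw_predict`, the power parameter default and the sample values are hypothetical and chosen only for the example.

```python
import math


def idw_predict(samples, x, y, power=2.0):
    """Predict a value at (x, y) from (xi, yi, value) samples using IDW.

    Each sample is weighted by 1 / distance**power, so nearby points
    influence the prediction more than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            # Exact interpolator: at a sampled location, return the measurement.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical erodibility measurements at the corners of a 10 x 10 square.
samples = [
    (0.0, 0.0, 0.20),
    (10.0, 0.0, 0.30),
    (0.0, 10.0, 0.40),
    (10.0, 10.0, 0.50),
]

at_sample = idw_predict(samples, 0.0, 0.0)   # reproduces the measured value
at_centre = idw_predict(samples, 5.0, 5.0)   # equal distances: plain average
```

At the centre of the square all four samples are equally distant, so the prediction reduces to their arithmetic mean. A larger `power` would make the surface follow the nearest sample more closely.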

The geostatistical interpolation technique
Unlike the deterministic interpolation technique, the geostatistical technique assumes that all values in the study area are the result of a random process. A random process does not mean that all events are independent, as they are with each throw of a coin. Geostatistics is based on random processes with dependence. For example, throw three coins and record whether each is heads or tails. The fourth coin is not thrown but placed according to a rule: if the second and third coins are heads, the fourth is the same as the first; otherwise, the fourth is different from the first.
In a spatial or temporal context, such dependence is called autocorrelation (Figure 1).
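The coin rule can be checked by enumeration. The short sketch below (the names `fourth_coin`, `outcomes` and `heads_share` are hypothetical) lists all eight outcomes of the three throws and the fourth coin that the rule implies. It suggests that the fourth coin, taken alone, still shows heads in half of the cases, even though it is fully determined by the coins before it.

```python
from itertools import product


def fourth_coin(first, second, third):
    """Apply the dependence rule: copy the first coin only if coins 2 and 3 are heads."""
    if second == "H" and third == "H":
        return first
    return "T" if first == "H" else "H"


# Map every possible outcome of the three throws to the resulting fourth coin.
outcomes = {throw: fourth_coin(*throw) for throw in product("HT", repeat=3)}

# Share of outcomes in which the fourth coin shows heads.
heads_share = sum(1 for coin in outcomes.values() if coin == "H") / len(outcomes)

for throw, coin in outcomes.items():
    print("".join(throw), "->", coin)
```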

CASE STUDY
This case study analyses the influence of the erodibility factor on a region in Transylvania, Romania, and estimates the areas susceptible to landsliding. To carry out the study, the interpolation techniques discussed previously were applied; the resulting surfaces were then compared and the one that gave the best prediction was chosen.
The erodibility factor K is a quantitative description of the inherent erodibility of a particular type of soil. It is also a measure of the susceptibility of soil particles to detachment and transport by precipitation and landslides. For a specific type of soil, the erodibility factor is the erosion rate per unit of erosion index measured on a standard plot. This factor reflects the fact that different types of soil erode at different rates when the other factors affecting erosion (infiltration rate, permeability, total water capacity, dispersion, rain splash, abrasion) are the same. Texture is the main factor affecting erodibility, but structure and organic matter also contribute. The erodibility of the soil varies between 0.02 and 0.
2) Data analysis uses statistical methods such as frequency distribution graphs. This exploration is carried out through Exploratory Spatial Data Analysis (ESDA), which allows the data to be examined in different ways. Before the surface is created, ESDA provides tools for a better understanding of the investigated phenomenon, so that the best decisions about the data can be made. Each tool provides a view in which the data can be explored and manipulated. The views are interlinked, so that items selected in one view appear selected in the others. The available views are: Histogram (Figure 4), Normal QQ Plot (Figure 5), Voronoi Map (Figure 6), Trend Analysis, Semivariogram/Covariance Cloud and Crosscovariance Cloud.
3) Defining the model means choosing the surface interpolation method used to estimate values where no measurements have been made. This step is carried out with the Geostatistical Wizard, which is based on the techniques described above: deterministic methods, which do not allow the prediction error to be assessed and do not require a statistical evaluation of the data, and statistical methods, which evaluate the prediction error and assume that the data come from a stationary statistical process or have a normal distribution. The surfaces resulting from this process may be prediction maps, standard error maps, quantile maps or probability maps.
Comparing surfaces allows conclusions to be drawn about how accurate one surface is relative to the others created. The two geostatistical layers being compared can be created using two different models, or using the same model with different parameters. In the first case, the comparison shows which method best fits the dataset; in the second case, it shows the effect of different input parameters on the surface created by a model. Generally, the best model is the one with the mean standardized error closest to 0, the smallest root-mean-square prediction error, the average standard error closest to the root-mean-square prediction error, and the root-mean-square standardized error closest to 1. The surfaces are thus compared and, by elimination, the model that remains is the one that best suits the factor used and contains the fewest errors in the interpolation process.
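The comparison criteria can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the function name `cross_validation_summary`, the dictionary keys and the input values are hypothetical, the error is taken as predicted minus measured, and the average standard error is computed as the root of the mean squared standard error, which is an assumption about the definition rather than something stated in this paper.

```python
import math


def cross_validation_summary(measured, predicted, std_errors):
    """Summarize prediction errors for one model.

    measured, predicted and std_errors are equal-length lists holding, for
    each sample point, the measured value, the predicted value and the
    prediction standard error reported by the model.
    """
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        # Should be close to 0 for an unbiased model.
        "mean_standardized": sum(standardized) / n,
        # Should be as small as possible.
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        # Should be close to the RMSE (assumed definition, see lead-in).
        "average_standard_error": math.sqrt(sum(s * s for s in std_errors) / n),
        # Should be close to 1.
        "rms_standardized": math.sqrt(sum(z * z for z in standardized) / n),
    }


# Hypothetical cross-validation output for four sample points.
measured = [0.20, 0.30, 0.40, 0.50]
predicted = [0.25, 0.25, 0.45, 0.45]
std_errors = [0.05, 0.05, 0.05, 0.05]

summary = cross_validation_summary(measured, predicted, std_errors)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Computing the same summary for each candidate model and comparing the four values side by side mirrors the elimination procedure described above.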

CONCLUSIONS
Soil erosion modelling is quite complicated because landslides vary spatially and temporally depending on several factors and on the interactions between them. It is necessary to determine both the estimation and the prediction for the unknown locations. In this study, the erodibility factor K was analysed. Assessing erodibility is an important step in understanding both the quality of the soil and its susceptibility to erosion, and in predicting soil erosion. This study shows that combining geostatistics with GIS mapping is an important and useful tool for studying spatial changes in the environmental sciences.
Figure 2. Workflow of geostatistical analysis for landslide prediction in Transilvania, Romania
1) Data representation consists of converting information into spatial information by means of Geographic Information Systems. A sample of points surveyed at different locations in the Transylvania area, Romania (Figure 3) is used. The attribute data of the sample points consist of erodibility factors, point heights and terrain stratifications.

Figure 4. Histogram
a) Prediction maps: estimate values at locations where measurements have not been carried out.
b) Standard error maps: show the distribution of prediction errors for a given surface. The error tends to be higher in areas with little or no sample data.
c) Quantile maps: show the values below which the real values fall with a specified probability.
d) Probability maps: show the chance that the actual value at a particular location is greater than a threshold value. The probability of exceeding the threshold is determined by the predicted values, the error distribution and the specified threshold value.
4) Diagnosis consists of statistical tests, such as cross-validation, for assessing the quality of the estimation. Before the final surface is produced, it should be known how well the model predicts values at unknown locations. Cross-validation and validation help in reaching a conclusion about which model offers the best predictions. Cross-validation uses all the data to estimate the autocorrelation model. It then removes each data location in turn, predicts the value at that location from the remaining data, and compares the prediction to the measured value. Validation first removes part of the data, and then uses the remaining data to develop the trend and autocorrelation models that will be used for the prediction. In both methods, the graphs and summary statistics used for the diagnosis are the same: prediction, prediction error (predicted minus measured), standardized error (error divided by the kriging standard error) and normal QQ plot (standardized errors against the standard normal distribution).
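The remove-and-predict loop of cross-validation can be sketched as follows. This is an illustrative sketch, not the Geostatistical Analyst procedure: an inverse distance weighted predictor stands in for the fitted model (so no autocorrelation model is estimated here), and the names `idw`, `leave_one_out` and the sample values are hypothetical.

```python
import math


def idw(samples, x, y, power=2.0):
    """Stand-in predictor: inverse distance weighted average of the samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def leave_one_out(samples, predictor):
    """Remove each sample in turn, predict it from the rest, return the errors."""
    errors = []
    for i, (x, y, measured) in enumerate(samples):
        remaining = samples[:i] + samples[i + 1:]
        errors.append(predictor(remaining, x, y) - measured)  # predicted - measured
    return errors


# Hypothetical erodibility measurements at the corners of a 10 x 10 square.
samples = [
    (0.0, 0.0, 0.20),
    (10.0, 0.0, 0.30),
    (0.0, 10.0, 0.40),
    (10.0, 10.0, 0.50),
]

errors = leave_one_out(samples, idw)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print("errors:", [round(e, 3) for e in errors])
print("root-mean-square error:", round(rmse, 3))
```

Passing the predictor as an argument keeps the loop independent of the interpolation method, so the same function could be reused to compare several candidate models on the same data.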

Figure 9. Comparative study between Geostatistical Analyst techniques
The deterministic interpolation techniques discussed so far are based on the surrounding measured values or on specific mathematical formulas that determine the smoothness of the resulting surface. A second family of interpolation methods consists of geostatistical methods, which are based on statistical models that include autocorrelation (the statistical relations between measured points). These techniques not only produce prediction surfaces, but can also provide measures of the uncertainty or accuracy of the predictions. Geostatistics is often treated as synonymous with kriging, which is a statistical version of interpolation. Kriging involves two tasks: quantifying the spatial structure of the data and making predictions. Quantifying the structure, known as variography, consists of fitting a spatial dependence model to the data. To predict an unknown value at a specific location, kriging uses the model fitted in variography, the spatial configuration of the data, and the measured values of the sample points around the prediction location. The following techniques fall into this category: Ordinary Kriging, Simple Kriging, Universal Kriging, Indicator Kriging, Probability Kriging, Disjunctive Kriging and Cokriging.
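The first step of variography, before any model is fitted, is usually an empirical semivariogram: half the average squared difference between pairs of measurements, grouped by separation distance. The sketch below illustrates that calculation only; the function name `empirical_semivariogram`, the binning by a fixed `lag_width` and the transect values are hypothetical, and fitting a model to the result is not shown.

```python
import math
from collections import defaultdict


def empirical_semivariogram(samples, lag_width):
    """Return {lag_bin: semivariance} for (x, y, value) samples.

    Pairs are grouped into bins of width lag_width by their separation
    distance; the semivariance of a bin is half the mean squared difference
    of the values of the pairs that fall into it.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            distance = math.hypot(xi - xj, yi - yj)
            lag_bin = int(distance // lag_width)
            sums[lag_bin] += 0.5 * (zi - zj) ** 2
            counts[lag_bin] += 1
    return {lag_bin: sums[lag_bin] / counts[lag_bin] for lag_bin in sorted(counts)}


# Hypothetical transect: values rise steadily along the x axis.
transect = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 3.0), (3.0, 0.0, 4.0)]

semivariogram = empirical_semivariogram(transect, lag_width=1.0)
for lag_bin, gamma in semivariogram.items():
    print(f"lag bin {lag_bin}: semivariance {gamma}")
```

In this example the semivariance grows with the lag, which is the signature of autocorrelated data: points close together are more alike than points far apart.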