Using Multivariate Adaptive Regression Spline and Artificial Neural Network to Simulate Urbanization in Mumbai, India

Land use change (LUC) models used for modelling urban growth differ in structure and performance. Local models divide the data into separate subsets and fit a distinct model to each subset, whereas global models are fitted using all the available data. Non-parametric models are data driven and usually have no fixed model structure, or the structure is unknown before the modelling process; parametric models are model driven and have a fixed structure before the modelling process. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model, multivariate adaptive regression spline (MARS), with a global parametric model, artificial neural network (ANN), to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used the receiver operating characteristic (ROC) to compare the power of the two models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhood, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS in simulating urban areas in Mumbai, India.
* mdelavar@ut.ac.ir


INTRODUCTION
Unprecedented urban growth, one of the most common forms of land use change (LUC), especially in developing countries, has taken the control of metropolitan areas out of the hands of urban policy makers and planners (Tayyebi et al., 2011; Pijanowski et al., 2010). For example, urbanization has led to the expansion of urban areas into previously open areas that were originally natural areas and agricultural lands (Pijanowski et al., 2014). As a result, the disturbance of agricultural and forest areas has affected the food security of human populations and reduced the richness of biodiversity, respectively (Tayyebi and Pijanowski, 2014). At a global scale, extensive conversion from vegetation to agriculture mainly occurred on shrub land lacking forage production during the 1970s and into the mid-1980s (Armesto et al., 2009). However, after the mid-1980s, agricultural growth occurred with some degree of intensification in those areas more suitable for agriculture. Some recovery of forests from shrub lands and abandoned agricultural land has occurred recently (Kolmogoro, 1956).
In recent years, LUC modeling has come to be considered a significant issue because of the effects LUC can have on the environment and human life. Many factors are involved in LUC, including demographic factors (e.g., population growth), economic factors (e.g., gross domestic product), bio-physical parameters (e.g., elevation and soil), institutional issues (e.g., policies) and cultural affairs. LUC causes changes in climate (Watson et al., 2000), the economy (Long et al., 2007), food security (Godfray et al., 2010) and the water cycle (Tayyebi et al., 2015), which are threats to human life and well-being. Therefore, it is essential to examine this phenomenon. To study LUC, information about LUC drivers is needed. These drivers operate at different temporal and spatial scales and act in a non-linear manner (Veldkamp and Lambin, 2001), so precise and advanced techniques are required to model LUC (Tayyebi et al., 2014). Given the complexities of LUC, using data mining techniques to uncover the hidden patterns in land use data can help to understand this process better. In a general classification, data mining models can be divided into two main groups: 1) global parametric models, and 2) local non-parametric models. Global parametric models use all the available data for LUC modeling (Theobald and Hobbs, 1998), and their model structure is fixed before modeling. The artificial neural network (ANN), widely used by LUC modelers, is a global parametric model. In contrast, local non-parametric models divide the data into subsets and then apply the modeling procedure to each subset, and their model structure is not fixed before modeling. Multivariate adaptive regression spline (MARS) is a local non-parametric model introduced in the literature as a method that provides the best fit for a given dataset. Although global parametric models are used more often than local non-parametric models, few studies have compared the two kinds of model. So, in this study, we compared MARS with ANN. Relative Operating Characteristic (ROC) was used to compare the accuracy of the two models.
In this study, Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling urbanization. These images were obtained from the United States Geological Survey (USGS) portal.

MARS
MARS (Friedman, 1991) is a non-parametric model that divides data into various partitions and formulates the relationship between independent and dependent spatial drivers (Tayyebi and Pijanowski, 2014). This relationship is established using piecewise polynomial functions called basis functions (Friedman, 1991). In contrast to other non-linear models, which fit only one set of coefficients to the data, MARS fits separate piecewise polynomial functions to each region and creates a separate set of coefficients (Tayyebi et al., 2014). Furthermore, complex non-linear interactions between spatial drivers of LUC can also be specified. The general model of MARS has the form of Eq. (1) (Friedman, 1991):

$$Y = \alpha_0 + \sum_{m=1}^{M} \alpha_m B_m(X) + e \tag{1}$$

where M = the number of basis functions (sub-regions), m = the index of a basis function, e = the error term, \alpha = the basis function coefficients (with \alpha_0 the constant term), X = the independent variables, and Y = the land use classes. B_m are the basis functions, which can be represented as (Friedman, 1991):

$$B_m(X) = \prod_{i=1}^{N} \left[ S_{i,m} \left( X_{v(i,m)} - t_{i,m} \right) \right]_+^{q} \tag{2}$$

where N = the interaction order of the mth basis function, S_{i,m} = ±1, X_{v(i,m)} = the vth variable with 1 ≤ v(i,m) ≤ k, k = the total number of spatial drivers, t_{i,m} = a knot location of the spatial drivers, and q = the power of the basis function. N can be specified by the user according to prior knowledge about the application. When q = 1, simple linear splines are selected. The subscript '+' is defined as follows (Friedman, 1991):

$$[z]_+^{q} = \begin{cases} z^{q}, & z > 0 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

The objective of MARS is to minimize the sum of the squared errors in order to determine the basis function coefficients. MARS may over-fit the data in the training phase by adding dispensable basis functions to the model. To avoid this over-fitting, the basis functions with the least contribution are eliminated by using generalized cross validation (GCV) (Friedman and Silverman, 1989):

$$\mathrm{GCV}(M) = \frac{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2}{\left( 1 - \frac{C(M)}{n} \right)^2} \tag{4}$$

where n = the total number of observations, y = the response variable, and f = the function estimated by MARS. The aim of MARS is to minimize the GCV, and the best model is the one with the lowest GCV. C(M) is as follows (Friedman, 1991):

$$C(M) = (M + 1) + dM \tag{5}$$

where d = the cost for each basis function and M = the total number of basis functions.
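The basis functions and the GCV criterion described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study: the helper names `hinge` and `gcv`, the synthetic data, the fixed knot at 4.0, the penalty d = 3 and the complexity term C(M) = (M + 1) + dM are all assumptions made for illustration, and the knot is given rather than searched for as MARS would do.

```python
import numpy as np

def hinge(x, knot, sign=1, q=1):
    """Truncated power basis [sign * (x - knot)]_+^q (q = 1 gives a linear spline)."""
    return np.maximum(sign * (x - knot), 0.0) ** q

def gcv(y, y_hat, n_basis, d=3.0):
    """Generalized cross validation, assuming C(M) = (M + 1) + d * M."""
    n = len(y)
    c = (n_basis + 1) + d * n_basis
    mse = np.mean((y - y_hat) ** 2)
    return mse / (1.0 - c / n) ** 2

# Synthetic one-driver example with a single (assumed) knot at 4.0.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 200)
y = 1.0 + 2.0 * hinge(x, 4.0, +1) - 0.5 * hinge(x, 4.0, -1) + rng.normal(0.0, 0.1, 200)

# Design matrix: constant term plus a mirrored pair of hinge functions.
B = np.column_stack([np.ones_like(x), hinge(x, 4.0, +1), hinge(x, 4.0, -1)])

# Least-squares estimate of the basis function coefficients.
alpha, *_ = np.linalg.lstsq(B, y, rcond=None)
y_hat = B @ alpha
score = gcv(y, y_hat, n_basis=2)
print(alpha, score)
```

In a full MARS fit, candidate knots would be added in a forward pass and the basis functions with the smallest contribution removed in a backward pass, keeping the model with the lowest GCV.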

Artificial Neural Network
The artificial neural network is one of the most common global parametric models applied to model LUC (Pijanowski et al., 2002, 2010 and 2014). A multi-layer perceptron (MLP) is designed to identify the unknown relation between the spatial drivers of LUC and the land use classes. The designed MLP consists of three layers (an input layer and an output layer with one hidden layer). The neural network used in the model has 10 input nodes, 21 nodes in the hidden layer and 1 node in the output layer (Kolmogoro, 1956). The ANN used the delta method for adjusting the error between nodes.
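The 10-21-1 architecture described above can be sketched in NumPy as follows. This is an illustrative sketch only: the synthetic inputs and labels, the sigmoid activations, the learning rate and the weight initialisation are assumptions, and the "delta method" is read here as the generalized delta rule (backpropagation), which the text does not confirm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 10, 21, 1   # layer sizes stated in the text

# Synthetic stand-in for the driver values and change / no-change labels.
X = rng.normal(size=(500, n_in))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float).reshape(-1, 1)

W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, n_out))
b2 = np.zeros(n_out)

lr = 0.5              # assumed learning rate
mse_history = []
for cycle in range(500):
    # Forward pass through the hidden and output layers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    err = out - y
    mse_history.append(float(np.mean(err ** 2)))

    # Delta terms: error scaled by the sigmoid derivative, propagated backwards.
    delta_out = err * out * (1.0 - out)
    delta_hid = (delta_out @ W2.T) * h * (1.0 - h)

    # Full-batch gradient descent updates.
    W2 -= lr * (h.T @ delta_out) / len(X)
    b2 -= lr * delta_out.mean(axis=0)
    W1 -= lr * (X.T @ delta_hid) / len(X)
    b1 -= lr * delta_hid.mean(axis=0)

print(mse_history[0], mse_history[-1])
```

Plotting `mse_history` against the cycle number gives a training-error curve of the kind usually reported for an ANN training run.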

Relative Operating Characteristic
ROC eliminates the limitation of defining a single threshold for the problem by using a range of different thresholds. After the thresholds are defined, each of them is applied to the suitability map (a map showing the suitability of change for each cell, which varies from low to high). The values greater than the threshold in the suitability map are set to 1 (meaning LUC in the given pixel) and the other values are set to zero (meaning no LUC in the given pixel), which yields a map showing the membership of each cell to either change or no-change. The result is compared with the real LUC map. Then, a 2×2 contingency table is calculated for each threshold based on Table 2.

The Area Under the ROC Curve (AURC) is obtained by integrating the ROC curve over all thresholds.

Table 1. List and date of spatial predictors in the study area

86%, respectively. 60% of the entire data was used for the training run and the rest of the data was used to simulate the urban pattern in 2010. Factors considered as drivers of change between 1991 and 2010 are listed in Table 1. All of the land use drivers considered as inputs to the MARS model were prepared in a GIS environment.

MARS
Table 3 presents the varying effect of each variable on urban gain. For example, the CBD effect is negative for distances less than 19,988.42 m and positive for the other intervals. The likelihood of urban gain increased sharply for distances between 0 and 19,988.42 m, and declined slowly (smaller coefficient) for distances between 19,988.42 m and 35,000 m (Figure 2). Similarly, Figure 2 shows the effect of the other LUC drivers on urban gain. In addition, the coefficient of each LUC driver is given in Table 3. MARS found one knot (around 19,988.42 m), and hence two sub-regions (Figure 2), for distance to central business district (Table 1). Stratified random sampling was used to extract 60% of the data for training and 40% for testing.
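The 60%/40% stratified random split mentioned above can be sketched as follows. The function name `stratified_split`, the seed and the synthetic label vector are hypothetical; the sketch only shows one common way to keep the change / no-change proportions equal in the training and testing sets.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.6, seed=0):
    """Return sorted (train_idx, test_idx) with class proportions preserved."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_parts, test_parts = [], []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)   # cells belonging to this class
        rng.shuffle(idx)                      # random order within the class
        n_train = int(round(train_fraction * len(idx)))
        train_parts.append(idx[:n_train])
        test_parts.append(idx[n_train:])
    return np.sort(np.concatenate(train_parts)), np.sort(np.concatenate(test_parts))

# Hypothetical labels: 100 change cells (1) and 900 no-change cells (0).
labels = np.array([1] * 100 + [0] * 900)
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```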

ANN
The ANN reached a mean squared error of 0.04241 after 500 training cycles. Figure 3 shows the training error for the ANN.

ROC
The performance of the MARS and ANN models was evaluated using ROC. The area under the ROC curve was 94.77% and 95.36% for the MARS and ANN models, respectively (Figure 4).
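The thresholding procedure behind these ROC values can be sketched as below. The function names, the synthetic suitability values and the use of the trapezoidal rule for the area are assumptions for illustration; the sketch uses the standard false positive rate and true positive rate as the two axes and is not the code used to produce the figures reported here.

```python
import numpy as np

def roc_points(suitability, reference, thresholds):
    """False positive rate (X) and true positive rate (Y) for each threshold."""
    xs, ys = [], []
    for t in thresholds:
        predicted = suitability > t              # 1 = simulated change
        tp = np.sum(predicted & reference)
        fp = np.sum(predicted & ~reference)
        fn = np.sum(~predicted & reference)
        tn = np.sum(~predicted & ~reference)
        xs.append(fp / (fp + tn))
        ys.append(tp / (tp + fn))
    return np.array(xs), np.array(ys)

def auc_trapezoid(xs, ys):
    """Area under the curve by the trapezoidal rule (xs must be non-decreasing)."""
    return float(np.sum(np.diff(xs) * (ys[:-1] + ys[1:]) / 2.0))

# Thresholds from high to low, so the curve runs from (0, 0) to (1, 1).
thresholds = np.r_[np.linspace(1.0, 0.0, 101), -np.inf]

# Synthetic suitability map and reference change map (both classes present).
rng = np.random.default_rng(0)
reference = rng.random(1000) < 0.3
suitability = np.clip(0.5 * reference + 0.6 * rng.random(1000), 0.0, 1.0)

xs, ys = roc_points(suitability, reference, thresholds)
auc = auc_trapezoid(xs, ys)
print(auc)
```

A random suitability map would give an area close to 0.5 and a perfect one an area of 1, which is the scale on which the two models are compared.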

CONCLUSION
This study presented and compared two models, MARS and ANN, to simulate urbanization in the city of Mumbai, India. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to bodies of water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhood and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS in simulating urban gain.
True negative (TN) indicates cells which are forecasted as non-change and are actually non-change in the reference map. False positive (FP) indicates cells which are forecasted as change but are actually non-change in the reference map. False negative (FN) indicates cells which are forecasted as non-change but are actually change in the reference map. Finally, true positive (TP) indicates cells which are forecasted as change and are actually change in the reference map. After generating the contingency tables, we calculated X_t and Y_t for different thresholds t to plot the ROC curve according to Eq. (6):

$$X_t = \frac{FP_t}{FP_t + TN_t}, \qquad Y_t = \frac{TP_t}{TP_t + FN_t} \tag{6}$$

Figure 2. Basis functions of MARS for significant drivers

Figure 3. Training run for ANN

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-1/W5, 2015 International Conference on Sensors & Models in Remote Sensing & Photogrammetry, 23-25 Nov 2015, Kish Island, Iran

Table 3. Coefficients, variables and knots of MARS in the study area