CALIBRATING CELLULAR AUTOMATA OF LAND USE/COVER CHANGE MODELS USING A GENETIC ALGORITHM

Spatially explicit land use / land cover change (LUCC) models aim to simulate patterns of change on the landscape. To simulate landscape structure, most computational LUCC models use a cellular automaton to replicate land use / cover patches. Model evaluation is generally based on comparing the location of simulated changes with the true locations, but landscape metrics can also be used to assess landscape structure. As model complexity increases, so does the need for better calibration and assessment techniques. In this study, we applied a genetic algorithm tool to optimize the parameters of a cellular automaton in order to simulate deforestation in a region of the Brazilian Amazon. We found that the genetic algorithm was able to calibrate the model to simulate a more realistic landscape in terms of connectivity. The results also show that more realistic simulated landscapes are often obtained at the expense of location coincidence. However, when considering processes such as the impacts of fragmentation on biodiversity, the simulation of a more realistic landscape structure should be preferred over spatial coincidence performance.


INTRODUCTION
Spatially explicit land use / land cover change (LUCC) models aim to simulate patterns of change on the landscape (Paegelow et al., 2013). Many of these models follow an inductive, pattern-based approach, in which LUCC is modelled empirically: the spatial distribution and rate of past LUCC are used to develop a mathematical model that estimates the change potential as a function of a set of explanatory spatial variables and the expected amount of change (Paegelow and Olmedo, 2005; Mas et al., 2014). In prospective modelling, allocation procedures are used to place the projected amount of change in the most likely locations. To simulate landscape structure and fragmentation patterns, most computational LUCC models use a cellular automaton (CA) intended to replicate the land use/cover (LUC) patches. The assessment of model performance is generally based on the spatial coincidence between a simulated map and an observed LUC map for the same date. It does not evaluate the ability of the model to simulate the landscape pattern, for instance the size, shape and distribution of patches. Landscape metrics can be used to assess simulated landscape structure (Mas et al., 2012). As model complexity increases, so does the need for better calibration techniques. This study applies a genetic algorithm to optimize the parameters of a cellular automaton in order to simulate deforestation in a region of the Brazilian Amazon.

MATERIALS
Dinamica EGO freeware (hereafter DINAMICA) is an environmental modeling platform for the design of space-time models. It has been applied in a variety of studies, such as modeling tropical deforestation (Soares-Filho et al., 2001, 2002, 2006; Cuevas and Mas, 2008), urban growth (Almeida et al., 2005), fire regimes (Silvestrini et al., 2011) and landscape patterns (Pe'er et al., 2013; Soares-Filho et al., 2003), among others. We chose it for its flexibility and computing efficiency (Mas et al., 2014). We used a portion of one of the 12 case-study areas from Soares-Filho et al. (2013), comprising a TM-Landsat scene map from the PRODES project (INPE, 2011). The study area is located in the State of Pará along the road between Santarém and Cuiabá. Deforestation maps for the years 1997 and 2001 were rasterized at a resolution of 250 m. As spatial drivers of deforestation, we selected only three variables from the dataset (distance to previously deforested land, proximity to roads, and elevation) in order to keep the model simple and easy to interpret.

METHODS
DINAMICA uses transition probability maps based on a Bayesian method of conditional probability known as the weights of evidence method. These probability maps are used to simulate landscape dynamics, with Markov matrices projecting the quantity of change and a cellular automata (CA) approach reproducing spatial patterns (Soares-Filho et al., 2002, 2010). Two complementary CA are available: the Expander, which simulates the expansion of previously formed patches, and the Patcher, which generates new patches through a seeding mechanism. The behaviour of the CA is controlled by four main parameters: the mean patch size, the patch size variance, the isometry and the prune factor. Increasing the mean patch size leads to simulated maps with a less fragmented landscape; increasing the patch size variance leads to a landscape that is more diverse in terms of patch size. Setting the isometry value greater than one produces more isometric patches. Increasing the prune factor allows simulated changes to occur in less likely areas. With a prune factor of one, the model becomes almost deterministic, that is, changes are restricted to the areas with the highest change probability (Soares-Filho et al., 2002; Mas et al., 2014).
DINAMICA also has a genetic algorithm tool, which has been used to calibrate the weights of evidence (Soares-Filho et al., 2013). Genetic algorithms are based on the Darwinian mechanisms of evolution and attempt to mimic the natural evolution of a population by selection, combination and mutation (random changes in genes). Each chromosome is composed of genes that define its characteristics. In computational models, a chromosome is a string of numbers encoding the parameters that the genetic algorithm attempts to optimize. First, a population of chromosomes is created randomly. Subsequently, the genetic algorithm generates new individuals from the existing ones through selection, crossover and mutation.
The parents are selected from the best chromosomes according to a fitness criterion. To increase genetic heterogeneity, mutation and crossover operators randomly change or interchange some genes in a chromosome sequence. A new generation is created by copying the most successful individuals, by crossing them over and by mutating some chromosome sequences. These processes generate a new population of chromosomes with a greater chance of including a near-optimum solution to the problem. This evolution process iterates until the fitness-stopping criterion is satisfied. Genetic algorithms have been shown to be able to optimize multi-parameter functions (Wang, 1997). In the present study, the genetic algorithm optimizes the four main parameters of the Patcher CA: the mean patch size, the patch size variance, the isometry and the prune factor.
The forest maps of 1997 and 2001 were overlaid in order to map deforested and conserved forest areas. This deforestation map was used to compute a Markov matrix, the weights of evidence and a probability map based on the three explanatory variables (see appendix). Deforestation was then simulated from 1997 to 2001 using the 1997 map as the initial LUC map, the Markov matrix to calculate the expected annual deforested area, and the set of weights of evidence to compute the probability of change. The distance to previously deforested areas is a dynamic variable, that is, it is recomputed at each annual iteration of the simulation. It is worth noting that, in the present case, the training (or calibration) period is the same as the simulation period because the objective of the study is to fit the model, not to test its prospective ability.
The CA calibration aimed at fitting the parameters so as to simulate a landscape similar to the true landscape with regard to the size of the deforestation patches and their general spatial distribution (e.g. avoiding a concentration of simulated changes in the most likely areas, near previously deforested areas, when the true landscape also shows a small amount of change in remote areas). We calculated three indices that depict the landscape characteristics: the mean area of the deforestation patches (DPMA), the standard deviation of the area of the deforestation patches (DPASD), and the mean distance to deforested patches in remote areas (MDFPr), which is the mean distance of forest cells to deforested areas taking into account only the cells located farther away than the average distance. Fitness was assessed through the difference between the indices of the true change map and those of the simulated change map. The fitness criterion was computed as the weighted sum of three fitness components (equation 1).
$$F = w_1\,\delta_{DPMA} + w_2\,\delta_{DPASD} + w_3\,\delta_{MDFPr} \qquad (1)$$
where $w_1$, $w_2$, $w_3$ = weighting factors, $\delta_{DPMA}$ = absolute value of the difference in the index DPMA between the simulated and true maps, $\delta_{DPASD}$ = absolute value of the difference in the index DPASD between the simulated and true maps, $\delta_{MDFPr}$ = absolute value of the difference in the index MDFPr between the simulated and true maps.
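As an illustration only, the three indices and the weighted-sum fitness criterion could be computed along the following lines. This is a minimal sketch in plain Python, not the DINAMICA implementation. The function names, the use of 4-cell connectivity to delineate patches, the population standard deviation, the Euclidean distance in cell units and the unit weights are all assumptions that the text does not specify.

```python
from collections import deque
from math import hypot
from statistics import mean, pstdev


def patch_areas(change, cell_area=1.0):
    """Areas of the patches of changed cells in a 2D boolean grid.

    Assumption: patches are delineated with 4-cell connectivity.
    """
    rows, cols = len(change), len(change[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not change[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one patch.
            size, queue = 0, deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and change[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(size * cell_area)
    return areas


def remote_mean_distance(deforested, cell_size=1.0):
    """MDFPr: mean distance of forest cells to the nearest deforested cell,
    keeping only the cells farther away than the average distance.

    Assumes at least one deforested cell and one forest cell (brute force,
    suitable for small grids only).
    """
    cleared = [(r, c) for r, row in enumerate(deforested)
               for c, v in enumerate(row) if v]
    dists = [min(hypot(r - y, c - x) for y, x in cleared) * cell_size
             for r, row in enumerate(deforested)
             for c, v in enumerate(row) if not v]
    average = mean(dists)
    remote = [d for d in dists if d > average]
    return mean(remote) if remote else average


def landscape_indices(change, deforested):
    """Return the three indices for one map of change."""
    areas = patch_areas(change)
    return {"DPMA": mean(areas),
            "DPASD": pstdev(areas),  # assumption: population standard deviation
            "MDFPr": remote_mean_distance(deforested)}


def fitness(simulated, observed, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of absolute index differences (lower is better)."""
    keys = ("DPMA", "DPASD", "MDFPr")
    return sum(w * abs(simulated[k] - observed[k])
               for w, k in zip(weights, keys))
```

With a 250-m raster, passing `cell_area=6.25` to `patch_areas` would express patch areas in hectares, and `cell_size=250` would express distances in metres.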
The final simulated map was assessed by visual inspection and by computing the spatial coincidence between the true and simulated changes during 1997-2001.
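The genetic algorithm procedure described in this section (random initial population, selection of the best chromosomes, crossover, mutation, and copying of the most successful individuals) can be sketched as follows. This is a generic, simplified illustration rather than the DINAMICA genetic algorithm tool: the function `genetic_search`, its parameters, the uniform crossover, the fixed number of generations used in place of a fitness-stopping criterion, and the parameter bounds are all hypothetical. The `evaluate` argument stands for any function that runs a simulation with the four CA parameters and returns the fitness value to minimize.

```python
import random


def genetic_search(evaluate, bounds, pop_size=20, generations=30,
                   elite=2, mutation_rate=0.2, seed=0):
    """Minimize evaluate(chromosome) with a simple genetic algorithm.

    bounds: list of (low, high) pairs, one per gene (parameter).
    """
    rng = random.Random(seed)

    # Initial population: chromosomes drawn at random within the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]

    for _ in range(generations):
        ranked = sorted(population, key=evaluate)
        # Selection: parents come from the better half of the population.
        parents = ranked[:max(2, pop_size // 2)]
        # Elitism: the most successful individuals are copied unchanged.
        children = [list(c) for c in ranked[:elite]]
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene is taken from one of the parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: some genes are redrawn at random.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = children

    return min(population, key=evaluate)


# Hypothetical bounds for the four Patcher parameters, in this order:
# mean patch size, patch size variance, isometry, prune factor.
BOUNDS = [(1.0, 50.0), (0.0, 100.0), (0.5, 2.0), (1.0, 20.0)]


def toy_fitness(chromosome, target=(10.0, 5.0, 1.5, 3.0)):
    """Stand-in objective; a real run would simulate and compare indices."""
    return sum(abs(g - t) for g, t in zip(chromosome, target))


best = genetic_search(toy_fitness, BOUNDS)
```

Because the best individuals are carried over unchanged, the best fitness in the population cannot get worse from one generation to the next. In a real calibration each call to `evaluate` would involve a full simulation run, so the population size and number of generations determine the computing cost.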

RESULTS
Figures 1, 2 and 3 present the forest in 1997, the forest in 2001 and the areas deforested during the period 1997-2001, respectively. Figure 4 shows the map of deforestation probability according to the weights of evidence based on the three explanatory variables (maps of the explanatory variables are in the appendix). It can be observed that change presents a clear spatial pattern (patches) and occurred mainly in high-probability areas, but also in less likely areas (e.g. remote areas). Figure 5 shows the simulated LUCC map from the model whose CA was calibrated by the genetic algorithm. For comparison, Figure 6 shows the simulated LUCC map obtained without CA, by allocating the change to the cells with the highest probability. The map obtained using the CA is more realistic in terms of landscape structure because it presents deforestation patches of broadly the same size and distribution as the true map of change (Figure 3). However, the unrealistic map obtained without CA has a higher coincidence with the true map (29%) than the more realistic map obtained with the CA calibrated by the genetic algorithm (22%).
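The baseline without CA and the coincidence measure can be illustrated with the short sketch below. It is hypothetical on two points that the text does not specify: the function names, and the definition of coincidence, taken here as the share of truly changed cells that are also simulated as changed. Other definitions would give different percentages.

```python
def threshold_allocation(probability, candidates, n_changes):
    """Allocate change to the n_changes candidate cells with the highest
    probability (no cellular automaton, no patch formation)."""
    cells = [(probability[r][c], r, c)
             for r, row in enumerate(candidates)
             for c, is_candidate in enumerate(row) if is_candidate]
    # Sort by probability only, highest first; ties keep their scan order.
    cells.sort(key=lambda cell: cell[0], reverse=True)
    chosen = {(r, c) for _, r, c in cells[:n_changes]}
    return [[(r, c) in chosen for c in range(len(row))]
            for r, row in enumerate(candidates)]


def coincidence(simulated, observed):
    """Assumed definition: fraction of observed change cells that are also
    simulated as change. Requires at least one observed change cell."""
    hits = sum(1 for sim_row, obs_row in zip(simulated, observed)
               for s, o in zip(sim_row, obs_row) if s and o)
    total = sum(1 for obs_row in observed for o in obs_row if o)
    return hits / total


# Tiny illustrative grids (made-up values, not the study data).
probability = [[0.9, 0.1],
               [0.8, 0.2]]
forest_1997 = [[True, True],
               [True, True]]
observed_change = [[True, True],
                   [False, False]]

simulated_change = threshold_allocation(probability, forest_1997, n_changes=2)
score = coincidence(simulated_change, observed_change)
```

In this toy case the two cells with the highest probability are selected, one of which matches an observed change cell.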

CONCLUSION
We found that the genetic algorithm was able to calibrate the CA of the model to simulate a more realistic landscape in terms of connectivity. Different spatial patterns can be observed depending on the human activities contributing to deforestation, such as cattle ranching, shifting cultivation, commercial agriculture or logging (Anwar and Stein, 2015; Lorena and Lambin, 2007). This approach can be used to calibrate LUCC models and other types of models that aim to simulate landscape patterns. The results also show that more realistic simulated landscapes are often obtained at the expense of location coincidence. However, when LUCC modelling is used to assess processes such as the impacts of fragmentation on biodiversity, the simulation of a more realistic landscape structure and change dynamics should be preferred over spatial coincidence performance (Malanson et al., 2007; Mas et al., 2012).