Combined use of remote sensing and spatial modelling: When surface water impacts buffalo (syncerus caffer caffer) movements in savanna environments

: In semi-arid savannas, the availability of surface water constrains movements and space-use of wild animals. To accurately model their movements in relation to water selection at a landscape scale, innovative methods have to be developed to i) better discriminate water bodies in space while characterizing their seasonal occurrences and ii) integrate this information in a spatially-explicit model to simulate animal movements according to surface water availability. In this study, we propose to combine satellite remote sensing (SRS) and spatial modelling in the case of the African buffalo ( Syncerus caffer caffer) movements at the periphery of Hwange National Park (Zimbabwe). An existing classification method of satellite Sentinel-2 time-series images has been adapted to produce monthly surface water maps at 10 meters spatial resolution. The resulting water maps have then been integrated into a spatialized mechanistic movement model based on a collective motion of self-propelled individuals to simulate buffalo movements in response to surface water. The use of spectral indices derived from Sentinel-2 in combination with the short-wave infrared (SWIR) band in a Random Forest (RF) classifier provided robust results with a mean Kappa index, over the time series, of 0.87 (max = 0.98, min = 0.65). The results highlighted strong space and time variabilities of water availability in the study area. The mechanistic movement model showed a positive and significant correlation between observations/simulations movements and space-use of buffalo’s herds (Spearman r = 0.69, p-value < 10 e -114 ) despite overestimating the presence of buffalo individuals at proximity of the surface water.


INTRODUCTION
In semi-arid environments such as southern African savannas, the availability of surface water constrains movements, distributions and space-use of wild animals (Chamaillé-Jammes et al., 2016). Having the capacities to monitor, through space and time, surface water availability at a landscape scale can potentially enable the characterization of wild animal movements in relation to this natural resource. The simulated distribution of wildlife in space and time resulting from the modelling of the relationship between an animal species and its water requirements could then be used to address human/wildlife coexistence related issues such as * Corresponding author competition for resources inside/outside protected areas (Young et al., 2005), crop or livestock destruction by wildlife (Valls-Fox, 2015), and risk of pathogen transmission between wild and domesticated species Miguel et al., 2013). The advent of satellite telemetry using global positioning system (GPS) allows to determine temporal and spatial position of animals in a given area with high precision, temporal accuracy and position updates available in rapid frequency 24 hours a day (Cagnacci et al., 2010). This breakthrough in technology enabled to better apprehend how and why animals move (Kays et al., 2015). Combining this technology with satellite remote sensing (SRS) generates opportunities for studies such as natural resource suitability mapping (Remelgado et al., 2018) or speciesenvironment interactions mapping (Sheeren et al., 2014). Indeed, SRS provides an array of tools and methodologies to discriminate environmental variables (e.g., surface water) at different spatial and time scales in areas with partial or no in-situ data coverage (Alsdorf et al., 2007). This is particularly true in the current context of increasing number and variety of SRS sensors (Paganini et al., 2018). For example, several studies have been combining GPS telemetry data with SRS in savanna environments to investigate the relationship between resource gradients and overlap between wild and domestic herbivores (Zengeya et al., 2015) or to assess the impact of small-scale ephemeral water sources on wildlife (Naidoo et al., 2020), greatly expanding our understanding of ecological functioning in relation to animal movement as a result. Since 2015, Sentinel-2 satellites provide 10m spatial resolution SRS images with a revisit frequency of 5 days that can potentially be combined with GPS telemetry data to conduct landscape scale ecological analysis. Applications and studies in the field of ecology using this technology need to be further developed in conjunction with spatial modelling.
Spatial models of animal movement taking into account biotic and abiotic drivers as well as behavioral mechanisms have been developed in recent years (Moorcroft, 2012;Westley et al., 2018). Mechanistic modelling approaches can take into account fine-scale ecological processes (e.g., environmental changes and animal responses) that underlie ecosystem functions (i.e., watering behavior of a focal species) and incorporates changes in ecosystem properties (e.g., inter-species competition for water resources) in response to changes in the environment (e.g., climate and water resource changes) (Rastetter et al., 2003). Models that describe the collective motion of groups of selfpropelled agents (Gregoire et al., 2003;Huepe and Aldana, 2008) can simulate herd dynamics easier than hard-to-calibrate individual-based models. Such 'swarm' models are parsimonious as they use few parameters (i.e., speed, alignment, cohesion) to mimic a group of individuals (Eriksson et al., 2010;Gregoire et al., 2003;Vicsek et al., 1995) and are a way to control the amount of self-organization within a herd of a specific species (i.e., the degree of alignment and cohesion of the individuals' headings). However, dynamic animal movement models that combine SRS with GPS telemetry in order to specifically characterize speciesenvironment interactions in space and time at a landscape scale are lacking. Indeed, SRS derived environmental data are rarely used in combination with spatial modelling although the understanding of animal movement and their associated ecological mechanisms could benefit from such approaches (Neumann et al., 2015;Rumiano et al., 2020).
Thus, the objectives of this study are two-fold: i) developing a method to map surface water at a landscape scale accounting for seasonal variations in a savanna type area near the Hwange National Park (Zimbabwe) using Sentinel-2 satellite images, and ii) integrating the resulting surface water maps in a spatialized mechanistic animal movement model, with the example of the African buffalo (Syncerus caffer caffer), a keystone species for conservation and production systems in southern African interfaces (Cornélis et al., 2014).

Study area
Our study area is located North West of Zimbabwe in the Matabeleland North Province (18°37' S, 26°52' E) ( Figure 1). More specifically, it lies at the northern periphery of Hwange National Park (HNP), within the Sikumi Forest Area (SFA) that is under the management of the Forestry Commission of Zimbabwe since 1968 and covers an area of approximately 200 km² sharing an open boundary with HNP (14650 km²). In this ecosystem, wildlife coexists with human activities such as cattle herding, firewood and thatching grass harvesting and tourism (Valls-Fox et al., 2018). Human settlements and agricultural fields are located only a few hundred meters away from the unfenced SFA boundaries (Guerbois et al., 2013). The vegetation of the area can be characterized as semi-arid wooded savannas with patches of grassland. Surface water is naturally provided by pans and springs, most of which dry-up during the dry season (May to September). Solar powered pumping stations are also present in the area and ensure year-round water availability. Annual rainfall approximates 600 mm per year in average with an inter-annual variability coefficient of 25 % between 1928-2005(Chamaillé-Jammes et al., 2006. However, drought severity and inconsistency of rainfall increased in the area during the twentieth century (Chamaillé-Jammes et al., 2007).

Data
Telemetry data: 8 buffalo individuals have been monitored in the area from April 2010 to April 2014 by ultra-high frequency (UHF) collars (manufactured by African wildlife Tracking) set with a 1 hour frequency signal (Miguel, 2012;Valls Fox, 2015). Three groups of respectively three individuals (from April 20 th 2010 to August 18 th 2011), four individuals (from November 14 th 2011 to September 9 th 2013) and four individuals (from March 12 th 2013 to April 15 th 201) have been constituted. Each group represents buffaloes that are present at the same time in the same area ( Figure 1).
Remote sensing data: 24 Sentinel-2 satellite images of a complete year, corresponding to one image per month for the two tiles (T35KNV & T35KMV), necessary to spatially cover the entire area, have been downloaded in level 1C (Top Of Atmosphere reflectance and orthorectified images) via the Copernicus Open Access Hub. As no Sentinel-2 images were produced at the time of the telemetry data acquisition, we have chosen images from the year 2018 which is representative of the annual rainfall precipitation measured via Tropical Applications of Meteorology using SATellite data and ground-based observations (TAMSAT) compared to the years were the telemetry data have been collected. Only the images with less than 10% of cloud cover have been considered. As no images were cloud free for the month of February 2018, the series was completed by two images from February 2019, one per tile.
Reference polygons derived from image interpretation: For each Sentinel-2 image and each land-use types to be classified ("surface water" and "other"), a set of 100 reference polygons have been evenly vectorised over the study area.
Surface water ground truth data: These data consist in GPS coordinates locating surface water collected on the field during previous studies conducted in the area (Guerbois, 2012;Miguel, 2012; Valls Fox, 2015) ( Figure 1).

Methodology
The methodology is structured in separate phases ( Figure 2).
Figure 2. Flowchart combining remote sensing data with telemetry data to model the focal species movements

Mapping the surface water
Pre-treatment: The Sen2Cor v2.8 application (Sen2Cor, European Space Agency) has been used to apply atmospheric corrections, thus transforming L1C images to level L2A (Top Of Canopy) images. The 20 meters spatial resolution spectral bands have been resampled by bilinear interpolation to 10 meters spatial resolution before being projected to the WGS84/UTM35S projection system and clipped to the study zone spatial extent. Following (Du et al., 2016), the modified normal difference water index (MNDWI) and the normalized difference water index (NDWI) have been calculated and stacked with Sentinel-2 shortwave infrared (SWIR) band. At the end of the pre-treatment, 24 three-layer rasters (NDVI, MNDWI, SWIR), 12 (one per month) for each of the two tiles covering the study area, composed the image corpus used in the supervised classification process.

Classification:
The reference polygons (c.f. 2.2) have been used to clip the 24 pre-treated multi-layer raster stacks to create training and validation raster samples. These raster samples were then randomly selected with a 50/50 ratio towards training and validation and used in the random forest (RF) classifier (Breiman, 2001). The 50/50 ratio has been chosen as it allows a more reliable comparison between training and validation samples than a ratio with a lower proportion of validation samples (Mercier et al., 2018). RF algorithm was chosen because of its advantages of simple parametrization, reliable and rapid execution in processing time of large volume of variables and data and its proven efficiency in satellite image landcover classification (Pelletier et al., 2016). The RF algorithm has then been applied on all the 24 pre-processed multi-layer rasters to obtain a classification at 10 meters of spatial resolution.
Post-classification: For each classified raster image, the pixels classified as 'water' have been vectorised to allow the manually removal of the noise pixels (false positives). As the water surfaces reach their maximum spatial extents in March, when the peak precipitation occurs, the two derived classification images of the month of March (one per tile) have been selected to map the maximum water extent in the area. The resulted vector layers of the month of March have then been used as a template to mask all of the noise pixels present in the 11 other months of the year vector layers.
Surface water classification validation: The surface water ground truth data (c.f. 2.2) were used to validate the classification when being located directly on a surface water polygon or within a 100m buffer area around the surface water polygon. Reference polygons derived from image interpretation (c.f. 2.2) have been used as training and validation references to apply a crossvalidation on two classification accuracy indicators (i.e. overall accuracy (OA) and Kappa index) and test the robustness and stability of the classification method. 50 iterations of classification using randomly selected reference polygons were performed to run the cross-validation.

Processing telemetry data
Behavioural metrics calculation: In-situ telemetry data (c.f. 2.2) have been used to calculate the movement's speed of buffaloes. The speed value gathering 75% of the values of the speed distribution observed within the three buffalo groups (v1, v2, v3 = 0.48, 0.45, 0.46 km/h resp.) determines the distance v0 that buffaloes are able to cover in one model time step (10 minutes) in the following modelling section. In addition, the median distance between individuals of a same group has been calculated and mean/median daily distances covered between water points by buffalo have been calculated for validation.
Identification of behavioural phases. African buffalo drink water daily (Cornélis et al., 2014). The telemetry in-situ data have been used in accordance to correlate the speed and the probability for individuals to be near the surface water every hour over a period of 24 hours for the entire duration of the telemetry data measurement (Figure 3). As a result, two distinct phases were identified: a watering phase (from 9am to 7pm) and a free wandering phase (from to 7pm to 9am). (1) (2) Figure 3. Mean probabilities of the observed buffalo to be nearby (< 100m, shaded zone) surface water as a function of the time of the day (blue line), superimposed to the median speed of the observed buffalo (red dashed line), the median interdistance (brown line) and the phi values (green line)

Modelling the buffalo movements in space and time
Choice of the modelling language: The domain specific language Ocelet has been used to build the animal movement model (Degenne and Lo Seen, 2016). This language has the capacity to integrate spatial entities in vector and raster format and create relations between them to simulate spatio-temporal dynamics. The developed spatial model is composed of three main interacting spatial entities: (i) the buffalo individuals, (ii) the herd, (iii) the surface water.
Animal modelling approach: To model buffalo movements in space and time, a model of collective motion of self-propelled individuals (Gregoire et al., 2003) has been chosen, as it is parsimonious and mimics a wide range of movements. Derived from the Vicsek model (Vicsek et al., 1995) in which individuals interact at short distances, the model induces an overall cohesion of a population of individuals through space and time (Gregoire et al., 2003). Hence, the model highlights specific properties: no leader in the herd, noisy environment and/or communications, local interactions. In the model, buffalo move at discrete time steps by a fixed distance v0, their direction defined for each time step t as an angle : where controls the herd alignment that corresponds to the sum of individual's speed vectors ⃗ ( ≠ ), while controls the herd cohesion expressed as the sum of the vectors ⃗ ⃗⃗ that link two individuals i and j, and  the noise that represents the uncertainty with which the direction of each individual is influenced by neighbouring individuals ( being a random angle, comprised between - and ). The cohesion force ⃗ ⃗⃗ (Gregoire et al., 2003) between each pair of individuals and is expressed as follows: where ⃗⃗ ⃗⃗ represents the unit vector along the segment going from individual i to individual j within a defined distance of interaction r0 and rij between individuals i and j. ⃗ ⃗⃗ is defined by several parameters ( Calibration: To control the animal movement modeled we used two integrated indices calculated at each timestep t (Figure 3). The first one is the Phi order parameter (φ) that summarizes the averaged alignment of the herd: where N is the total number of individuals. The second indicator is the median interdistance that reflects the averaged cohesion of the herd. For the simulated data, φ and interdistance values have been calculated from four randomly selected individuals within the modeled herd of 200 individuals to level with the observed data where four individuals make up the herd at most (cf. 2.2). The absolute differences between the observed and simulated values of and interdistance have been calculated. We have then chosen the parameters tryptic ( , and ) minimizing the difference between observations and simulations for both behavioral phases. The interdistance distributions being nonnormal, the Kullback-Leibler (KL) divergence (Kullback and Leibler, 1951)  the centroid derived from observed individuals. Spatial density rasters of the centroids have been computed using a quadratic kernel shape from planar distances with a search radius of 500m at a 10m spatial resolution. The model being stochastic, 50 iterations for each of the three herd groups, considering the entirety of their respective time periods (c.f. 2.2), have been conducted for the simulation and used to derive a final simulated median raster. Concerning the observed data, the same method of density calculation have been used for each of the three groups, also considering the entirety of their respective time periods, before deriving the final observed median raster. In the end, the simulated median raster has been subtracted to the observed median raster to measure quantitatively and spatially their differences. Spearman correlation coefficients have also been calculated from 1000 iterations of 1000 randomly selected sample pixels on the observed and simulated median rasters.

Monthly surface water maps
In total, 290 ponds have been identified through the classification of Sentinel-2 images time series, highlighting strong seasonal patterns of water spatial distribution and availability, with only 24 ponds detected in August, the driest month of the season, and 17 water ponds that have been detected every month of the time series, indicating that 94% of the surface water depend on the season. The mean OA value of the time series, both tiles combined, is 0.93 (min 0.82max 0.99) and the mean kappa index value is 0.87 (min 0.65max 0.98), with temporal and spatial fluctuations (Figure 4). Kappa index and OA values are higher for the KMV tile than for the KNV tile (Figure 1) during the dry season (May to September) but lower during the wet season (November to April) (Figure 4). For the validation of the water classification with the use of the observed data, 85% of the GPS points referencing the presence of surface water (c.f. 2.2) have been detected when applying a buffer of 100m around the polygon classified as surface water and 60% have been detected without applying a buffer.

Calibration results
For the free wandering phase (c.f. 2.3.2), has been set to 60, at 40 and  at 0.2 (Table 1). For the watering phase (c.f. 2.3.2), the value of α has been set to 90, confirming the initial assumption that the weight of the alignment would be more pronounced during the watering phase when all the individuals take the direction of the closest surface water.

Results of modelling buffalo movements in relation with surface water
The model is stochastic as each buffalo individuals can choose a random direction following an angle from 0° to 360° at the beginning of every free wandering phases (c.f. 2.3.2). As a result, each simulation produced a specific centroid trajectory of 200 buffalo individuals that can then be compared to the observed centroid trajectory of 4 individuals for the entire observed time period or over a different time period ( Figure 5). We observe that the area covered by simulated centroid trajectories is comparable in size to the area covered by the observed centroid trajectory although simulated centroid trajectories tend to extend further. The shape of simulated centroid and observed centroid trajectories follow the same general pattern. We note different round trips made within the area covered by the different centroid trajectories as well as recurrent use of specific surface water locations.
Figure 5. Observed and simulated herd' centroids trajectories comparison for a period of one month. The observed trajectory is symbolized by the graduated red line (from light red that symbolizes the beginning of the period to dark red that symbolizes the ending of the period). The black dot points represents the simulated herd' centroids trajectory.
Overall, the model tends to overestimate the presence of buffalo near water ponds and underestimate their presence in peripheral areas ( Figure 6A). Even if overestimated, validation results demonstrate the model capacity to simulate the movement of buffaloes towards the surface water. Indeed, simulated and observed median density rasters were significantly correlated (Spearman r = 0.69, p-value < 10 e -114 ). Most of the differences between the densities are small ( Figure 6B). The model however, fails in reproducing the densities observed outside the proximity of surface water ponds ( Figure 6A), explaining the differences between the observed and simulated densities for the pixel's density values superior to 0.25 ( Figure 6B). During the freedivagation phase, buffalo may take random paths away from their territory before turning around and heading back to the nearby surface water. This feature of the model explains why the territory covered by buffalo in the simulations is larger than that observed ( Figure 6A).

Mapping the surface water via SRS in savanna
Detecting surface water in semi-arid savanna using SRS at a landscape scale remains challenging due to surface water seasonality dynamics, landscape heterogeneity, presence of shades, and variety in surface water area sizes and morphologies (Moser et al., 2014). However, increase availability of free medium-resolution satellite sensors such as Sentinel-2 provides potentialities to characterize, via supervised classification of combined MNDWI and NDWI indices, surface water presence and dynamics at landscape scale (Du et al., 2016). Even if most studies focusing on buffalo movements only use in-situ observations of surface water (Zvidzai et al., 2013), SRS is increasingly used (Naidoo et al., 2020) and can be a valuable asset in areas that are difficult to access and where it is almost impossible to collect in-situ data. The surface water classification methodology developed in this study is efficient (c.f. 3.1) but may be limited by the spatial resolution of the SRS images used for input. Indeed, the use of satellite optical sensors such as Sentinel-2 images can show its limit when trying to detect the small ponds (surface<1,000 m²) or the surface water that may be hidden by the vegetation. The use of very-high SRS images in combination with hydrologic modelling (Soti et al., 2010) or time series of medium spatial resolution could be an improvement.

The mechanistic animal movement model
The mechanistic movement model, even if it requires significant development and implementation costs, is less dependent of a correlation between ecological processes and environment properties than an empirical model (Gaucherel, 2018). By mathematically simulating interactions and mutual constraints among ecological processes, mechanistic models improve the ecological realism and extrapolation to different environments of a given model (Kearney and Porter, 2009). By using a swarm model to mechanistically model buffalo herd movements, the knowledge of individual behaviours is reduced but the potential to develop animal movement models in area where in-situ data are lacking or expensive to collect is increased. It is important to keep in mind that the model developed in this study somehow neglects individual characteristics as only their interactions with neighbours are considered. As a result, interaction rules between individuals, mostly quantitative, can generate the same statistical variables leading to redundancy and model similarity (Eriksson et al., 2010). In this particular instance, agent-based modelling can provide alternative approaches but usually implies greater complexity in design (i.e., more rules, quantitative parameter estimation, complex sensitivity analyses) for tuning the model (Schulze et al., 2017), is much less tractable than mechanistic equation-based models and has a lower reproducibility potential.

Limits of the designed model
Only eight buffalo individuals have been monitored by telemetry and, at best, only four individuals were simultaneously recorded within the same area at the same time, thus partially reproducing the dynamics of a herd. Indeed, a buffalo herd is composed of at least 200 individuals in our study area (Miguel et al., 2017). Given the few individuals used to calibrate buffalo herd behaviour, proven dynamics such as fission-fusion within buffalo herds    (Rumiano et al., 2020). Indeed, the temporal structuration of the model in two behavioural phases (cf. 2.3.2) translates an over-simplification of buffalo ecological functioning. For example, times when buffalo are feeding in between the two behavioural phases have not been taken into account, leading to an underestimation of the presence of buffalo in areas located at the periphery of surface water. On the other hand, the trends of the model to overestimate the presence of buffalo at proximity of detected surface water may be due to the quality of SRS-derived surface water maps. Indeed, all the surface water have not been detected due to their small size, vegetation covering and potential draining at the time of satellite image acquisition, de facto reducing the choice of surface water locations that buffalo can reach in simulations compared to what happens in reality.

Perspectives
Perspectives of this first modelling study of buffalo movements in semi-arid savanna using SRS include the integration of other environmental variables (e.g. browsing areas, vegetation structure, …) and human infrastructures (e.g. agricultural fields, roads, …) to simulate more realistic buffalo movements. By adding more key factors influencing the buffalo's movements to the model, the latter could potentially be adapted to the study of contacts between wildlife and domesticated species at the interface between communal and protected areas. The present study provides an original modelling framework allowing the integration of SRS-derived environmental variables to address complex questions on disease propagation, ecological interactions between species or animal management.

CONCLUSION
The ecological and animal movement model developed in this study demonstrated how a mechanistic model can be spatialized and combined with remote sensing data to simulate buffaloes' movements in relation with surface water availability at a landscape scale. For the first time to our knowledge, we proposed to model buffalo at the individual and collective scales in heterogeneous environments by the use of a parsimonious swarm model. This simple and replicable framework can be considered as an alternative to the existing modelling tools in the understanding of animal movement in regard to water selection in several ecological contexts and environments.