ABANDONED AGRICULTURAL LAND IDENTIFICATION USING OBJECT-BASED APPROACH AND SENTINEL DATA IN THE DANUBIAN LOWLAND, SLOVAKIA

Farmland abandonment is a widespread phenomenon in many parts of the world, especially in the countries of Central and Eastern Europe. Following the collapse of socialism, large areas of agricultural land there were left uncultivated, state support and markets for agriculture disappeared, and land reforms resulted in massive transfers of land ownership. Remote sensing and geographic information systems provide powerful tools for the identification and analysis of abandoned agricultural land (AAL) at various spatial and temporal scales. Here we present an approach to AAL extraction from Sentinel-1 and Sentinel-2 images provided within the European Copernicus programme. This study aims to investigate and map the spatial distribution of AAL in the foothills of the Little Carpathians and in the Danubian Lowland, Slovakia. The presented case study demonstrates that Sentinel images and object-based image analysis can be used to identify AAL, which may improve the transfer of scientific knowledge to local agri-environmental monitoring and management.


INTRODUCTION
The alarming extent of farmland abandonment in Central and Eastern Europe, which results in the continuous growth of forest and shrubs outside actively managed forests, must be taken into account in the global carbon storage and cycle (Regulation EU 2018/841). Abandoned agricultural land (AAL) can be defined as land void of any activities associated with agricultural production until it becomes overgrown by other than agricultural vegetation. The following classes are distinguished:
-AAL1: the initial stage of abandoned agricultural land, overgrown by herbaceous formations (> 90%) with a height of 0.5-1.5 m, e.g. Calamagrostis, Festuca, Galium, Tanacetum, Achillea.
-AAL2: a more advanced stage of abandoned agricultural land, fully overgrown by grasses, broad-leaved herbs and shrubs with a canopy closure > 20% and a height of up to 3 m, e.g. Rosa, Prunus, Crataegus, Cornus.
Such a distinction between AAL classes is necessary for two reasons. It indicates:
-the stage of overgrowth (abandonment) of the agricultural land,
-the biomass potential of the AAL classes.
Problems emerge with the operative and efficient acquisition of information about the occurrence of AAL and its dynamics over large areas. Satellite remote sensing (RS) data may help to solve this problem because they are acquired at regular intervals (e.g. every 1, 3, 16 or 26 days), which makes it possible to track the development of AAL in areas of different sizes, up to thousands of hectares. Identification of AAL areas from RS data requires knowing whether these data contain information about their occurrence. If so, interpretation methods capable of extracting this information must be used. It should be emphasized that the important factors influencing the identifiability of AAL areas on satellite images are the physiognomic heterogeneity and the dynamics of overgrowing successions in different stages of development.
These factors also determine how AAL classes manifest on the images through interpretation features. The aim of this paper is to document the possibilities of using Sentinel images and object-based image analysis (OBIA) with the Random Forest (RF) classifier for the identification of AAL and land cover/land use (LC/LU) at the local level. OBIA is widely used for AAL classification. The most common are OBIA methods combined with multitemporal analysis using vegetation indices, e.g. (Karlík et al., 2017; Liu et al., 2017; Yusoff et al., 2017). The study by Liu et al. (2017) also pointed out the limited usefulness of vegetation indices for AAL classification. Yusoff et al. (2017) used a simple trial-and-error approach for assigning classes to objects with SAR-based OBIA and identified object-oriented classification rules as useful for AAL feature extraction. RF is one of the most effective tools in prediction. RF is a combination of tree predictors in which each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest (Breiman, 2001). Significantly more accurate AAL classification results were obtained using RF than with Support Vector Machine (Löw et al., 2015), or using RF in combination with Normalized Difference Vegetation Index (NDVI) data (Estel et al., 2015). An extensive analysis of the available literature on other methods used for AAL identification has also been performed. This review showed that all the analysed papers identified AAL using an indirect approach, by comparing LC/LU data from different time horizons, NDVI time series and different statistical databases. Direct identification of AAL classes using RS data from an exact time horizon (single image) and based on previously defined interpretation features of AAL was missing.

STUDY AREA
The study area, covering 617 km², is situated north-east of Bratislava. It is part of the Danubian Lowland, while a narrow strip of the territory lies in the foothills of the Malé Karpaty Mts. (see Figure 1). The plain lies on Quaternary gravel sediments with fertile Chernozems and Chernitsas, used predominantly as arable land. Waterlogged depressions are covered by alder fen woods (NATURA 2000 locality Šúr). Hilly land on Neogene and loess sediments with fertile Chernozems and Orthic Luvisols is used as arable land and vineyards, with refuges of oak woods. The area is populated by compact rural settlements and three towns. Agriculture is the dominant activity, producing corn, wheat and sugar beet. Vineyards prevail on the granite and granite-diorite slopes of the Malé Karpaty Mts. on Cambisols, in a warm and moderately humid climate (mean annual temperature of 9 °C and mean annual precipitation of 550-600 mm). The quality and authenticity of the wine producers of the region are well known abroad.
Field research was conducted at the peak of the vegetation period (from May to July) in 2018. Ten training and test sites predominantly covered with AAL were selected. The physiognomic characteristics of the successional vegetation (see Figure 2), its species composition, vegetation height, overall cover and clustering into patterns were recorded. These data were used to create training sample sets for the RF classifier (see chapter 4.6) and for the accuracy validation (see chapter 4.7).

DATA
The most important advantages of the Sentinel missions are the short revisit time and the relatively high spatial resolution of the images.

Sentinel-2 data
The 10-metre resolution bands of the Sentinel-2 (S-2) datasets were used in this study. The images were assigned to Universal Transverse Mercator (UTM) zone 33. Four cloud-free images from one vegetation period were chosen. A limitation was that the selected images were the only cloud-free images available during the vegetation period of 2018. The dates of the selected S-2 images are given in Table 1.

Sentinel-1 data
Sentinel-1 (S-1) SAR Standard L1 products were used in this study. Thirty Single Look Complex (SLC) images acquired between 1 April 2018 and the end of September 2018 were used to create the temporal average image. The data were acquired on relative orbit number 124 in the descending pass direction. Figure 3 shows the technical flowchart of the study.

METHODS
(1) Sentinel data were pre-processed to generate inputs for multi-resolution segmentation (MRS), Principal Component Analysis (PCA) and NDVI calculation.
(2) Stacked Sentinel data were segmented by MRS using an automatic scaling tool. The quality of the segmentation process was evaluated using segmentation goodness metrics.
(3) Resulting segments, stacked Sentinel data and NDVI data were used for object features statistics calculation.
(4) Forest and water bodies were masked using national datasets.
(5) Ground truth data were provided by field survey, resulting in the creation of training and validation samples.
(6) The Random Forest algorithm was run with several different inputs: object feature statistics, the masked region and training samples.
(7) The resulting AAL maps were evaluated by accuracy validation.

Sentinel data preprocessing
Sentinel-2 pre-processing was performed by the Sen2Cor algorithm (ESA release 2.5.5). The corrections were applied to remove or reduce the influence of the atmosphere.
To process the Sentinel-1 Level-1 data, the following steps were performed: SLC slice assembly, debursting, precise orbit application, radiometric corrections, DEM-assisted coregistration, stack averaging, and terrain correction with geocoding. The resulting temporal average image was derived from 30 images and forms a terrain-corrected, stack-averaged sigma naught raster with VV and VH polarisations, assigned to UTM zone 33.
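The stack-averaging step can be illustrated with a minimal sketch. It assumes that every co-registered image is a small 2D grid of linear-scale sigma naught values. The function names `temporal_average` and `to_decibel` and the sample values are hypothetical and are not part of the processing chain used in the study.

```python
import math


def temporal_average(stack):
    """Average a stack of co-registered rasters pixel by pixel.

    `stack` is a list of 2D lists (rows x cols) with linear-scale
    backscatter values; all rasters must share the same shape.
    """
    if not stack:
        raise ValueError("stack must contain at least one raster")
    rows, cols = len(stack[0]), len(stack[0][0])
    for raster in stack:
        if len(raster) != rows or any(len(row) != cols for row in raster):
            raise ValueError("all rasters must have the same shape")
    count = len(stack)
    return [
        [sum(raster[i][j] for raster in stack) / count for j in range(cols)]
        for i in range(rows)
    ]


def to_decibel(value):
    """Convert a positive linear backscatter value to decibels."""
    return 10.0 * math.log10(value)


# Hypothetical 2 x 2 VV rasters from three acquisition dates (linear scale).
vv_stack = [
    [[0.10, 0.20], [0.30, 0.40]],
    [[0.20, 0.20], [0.30, 0.10]],
    [[0.30, 0.20], [0.30, 0.10]],
]
vv_mean = temporal_average(vv_stack)
```

The sketch averages in linear scale and converts to decibels only afterwards, which is a common convention for SAR data; whether the study followed the same order is not stated.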

Normalized difference vegetation index
Vegetation indices are parameters sensitive to photosynthetically active radiation. They are computed from the spectral reflectance recorded by two or more spectral channels of the scanning device (Bannari et al., 1995). In this study, NDVI was applied to distinguish the vegetation of AAL from that of the LC/LU classes. NDVI (1) is used to measure the quantity of aboveground live biomass, with the highest values corresponding to dense green vegetation (Grădinaru et al., 2019; Rouse Jr. et al., 1974):
$$\mathrm{NDVI} = \frac{B_8 - B_4}{B_8 + B_4} \qquad (1)$$
where
$B_8$ = Sentinel-2 spectral band 8 = NIR (842 nm)
$B_4$ = Sentinel-2 spectral band 4 = R (665 nm)
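As an illustration of the NDVI calculation from the near-infrared (band 8) and red (band 4) reflectances, a minimal sketch follows. It assumes that both bands are available as small 2D lists of surface reflectance. The function names and the sample values are hypothetical, and returning None for pixels where both reflectances are zero is an assumption made here to avoid division by zero.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel, or None when NIR + red equals zero."""
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


def ndvi_raster(nir_band, red_band):
    """Apply the NDVI formula to two bands of identical shape."""
    return [
        [ndvi(nir, red) for nir, red in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2 x 2 reflectance grids for band 8 (NIR) and band 4 (red).
b8 = [[0.45, 0.30], [0.05, 0.0]]
b4 = [[0.05, 0.10], [0.05, 0.0]]
ndvi_values = ndvi_raster(b8, b4)
```

In this example the first pixel, with high NIR and low red reflectance, receives the highest value, in line with dense green vegetation having the highest NDVI.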

Principal Component Analysis
The aim of PCA is to transform the n-dimensional image into a new raster whose output bands are uncorrelated with each other while retaining as much of the information contained in the original image file as possible. The method reduces the number of input raster bands and thereby the computational requirements (Silleos et al., 2006). Jensen (1986) describes the mathematical and statistical concepts used to calculate PCA: standard deviation, covariance, eigenvalues, eigenvectors and linear transformations.
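The concepts listed above (covariance, eigenvalues and the share of information retained) can be sketched for the simplest case of two bands, where the eigenvalues of the covariance matrix have a closed form. This is only an illustration: the function names, the sample band values and the threshold-based choice of components are assumptions, not the procedure applied to the 20 input layers of the study.

```python
import math


def covariance(x, y):
    """Sample covariance of two equally long lists of pixel values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], largest first."""
    centre = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return centre + radius, centre - radius


def components_for_share(eigenvalues, threshold):
    """Smallest number of components whose cumulative share reaches threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(eigenvalues)


# Hypothetical pixel values of two perfectly correlated bands.
band_a = [1.0, 2.0, 3.0, 4.0]
band_b = [2.0, 4.0, 6.0, 8.0]

lambda_1, lambda_2 = eigenvalues_2x2(
    covariance(band_a, band_a),
    covariance(band_a, band_b),
    covariance(band_b, band_b),
)
n_components = components_for_share([lambda_1, lambda_2], threshold=0.99)
```

Because the two sample bands are perfectly correlated, the second eigenvalue is zero up to rounding and a single component carries all the variance, which is the redundancy that PCA removes.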

Multi-resolution segmentation
The essence of object-based image analysis methods is to classify image records so that the resulting image approximates the results of visual interpretation. The resulting image consists of groups of homogeneous pixels that resemble an evaluation by the human eye. The methods evaluate not only the spectral information but also other characteristics perceived by the human eye: shape, size, spatial relationships and texture (Jensen, 1986). Object-based methods create homogeneous segments using various algorithms. The most commonly used algorithm is multi-resolution segmentation (MRS), implemented in eCognition Developer (v9.5, Trimble Geospatial). MRS is a region-based algorithm that starts from the lowest pixel level and iteratively aggregates pixels into objects until certain user-defined homogeneity conditions are met. Several parameters need to be set within the MRS algorithm, e.g. the scale parameter, which determines the size and homogeneity of the resulting objects (Baatz and Schäpe, 2000). To avoid a complicated expert analysis of this parameter setting (Drăguţ et al., 2010), the automated scaling tool Estimation of Scale Parameter 2 (ESP2) was used (Drăguţ et al., 2014). This tool analyses the local standard deviation for each scale parameter setting and identifies the three most suitable parameters for MRS. The segmentation parameters were obtained using ESP2, and the segmentation quality was then assessed. Five segmentation goodness metrics (Kavzoglu and Tonbul, 2017) were calculated as shown in Table 2: over-segmentation (OS), under-segmentation (US), root mean square (RMS), area fit index (AFI) and quality rate (Qr). Twenty user-defined homogeneous segments were used as testing polygons for the calculation. A perfect segmentation yields OS, US, RMS and AFI equal to zero and Qr equal to 1.
A forest and water bodies mask was used to reduce the number of classified LC/LU classes and to avoid potential misclassification between abandoned and forested areas. The forest mask was created from the forest compartments database provided by the National Forest Centre, and the water bodies were taken from the Basic Database (ZBGIS) provided by the Geodesy, Cartography and Cadastre Authority of the Slovak Republic. The masked segments were the fundamental input data for the classification process and for the calculation of object feature statistics.
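The segmentation goodness metrics mentioned above can be sketched for a single pair of reference polygon and segment. The definitions of the metrics are given in Table 2, which is not reproduced here, so the area-based formulas below are assumptions: they follow commonly used forms and are chosen so that a perfect match gives OS, US, RMS and AFI equal to zero and Qr equal to 1. Polygons are represented as sets of pixel coordinates, and all names are hypothetical.

```python
import math


def segmentation_goodness(reference, segment):
    """Compare a reference polygon with a segment, both given as pixel sets."""
    overlap = len(reference & segment)
    union = len(reference | segment)
    over_seg = 1 - overlap / len(reference)   # assumed form of OS
    under_seg = 1 - overlap / len(segment)    # assumed form of US
    rms = math.sqrt((over_seg ** 2 + under_seg ** 2) / 2)
    afi = (len(reference) - len(segment)) / len(reference)  # assumed form of AFI
    quality_rate = overlap / union            # assumed form of Qr (1 = perfect)
    return {"OS": over_seg, "US": under_seg, "RMS": rms,
            "AFI": afi, "Qr": quality_rate}


# Hypothetical 4 x 4 reference polygon and a segment shifted by one column.
reference_pixels = {(r, c) for r in range(4) for c in range(4)}
segment_pixels = {(r, c) for r in range(4) for c in range(1, 5)}

shifted = segmentation_goodness(reference_pixels, segment_pixels)
perfect = segmentation_goodness(reference_pixels, reference_pixels)
```

In the shifted case the two polygons have equal area, so AFI is zero even though the overlap is incomplete, which illustrates why several metrics are evaluated together.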

Object features statistics calculation
For each segment, several object statistics were calculated: median values, standard deviation and grey level co-occurrence matrix (GLCM) variables.
Median and standard deviation values were calculated to minimise the impact of statistical outliers and to evaluate homogeneity between segments. The characteristics of GLCM textural variables were explained by Zhao et al. (2014). In our study, the textural attributes of OBIA segments were analysed by calculating two GLCM variables, Entropy and Homogeneity. The values were calculated for all directions so that the texture features would not be influenced by the angle. Entropy reflects the non-uniformity and complexity of the segment texture. Homogeneity reflects the uniformity of the object texture, with high values indicating little local variation within the segment.
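The two GLCM variables can be illustrated with a minimal sketch based on the standard textbook definitions: entropy as the negative sum of p·ln(p) and homogeneity as the sum of p / (1 + (i − j)²) over the normalised matrix. It assumes a small image already quantised to integer grey levels and pools the 0°, 45°, 90° and 135° directions into one symmetric matrix. The function names are hypothetical, and the implementation in eCognition may differ in details such as quantisation and normalisation.

```python
import math

# Pixel offsets (row, column) for the 0, 45, 90 and 135 degree directions.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def glcm_all_directions(image, levels):
    """Return a normalised, symmetric GLCM pooled over four directions."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for d_row, d_col in OFFSETS:
        for r in range(rows):
            for c in range(cols):
                r2, c2 = r + d_row, c + d_col
                if 0 <= r2 < rows and 0 <= c2 < cols:
                    i, j = image[r][c], image[r2][c2]
                    counts[i][j] += 1  # pair counted in the forward direction
                    counts[j][i] += 1  # and in the opposite direction
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image is too small to form any pixel pair")
    return [[value / total for value in row] for row in counts]


def glcm_entropy(glcm):
    """Entropy: higher for complex, non-uniform texture."""
    return -sum(p * math.log(p) for row in glcm for p in row if p > 0)


def glcm_homogeneity(glcm):
    """Homogeneity: higher when neighbouring grey levels are similar."""
    return sum(
        p / (1 + (i - j) ** 2)
        for i, row in enumerate(glcm)
        for j, p in enumerate(row)
    )


# Hypothetical 2 x 2 segments with two grey levels.
uniform = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]
uniform_glcm = glcm_all_directions(uniform, levels=2)
checker_glcm = glcm_all_directions(checker, levels=2)
```

The uniform segment gives an entropy of zero and a homogeneity of 1, while the checkerboard segment gives a higher entropy and a lower homogeneity.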

Random Forest classification
Model building and application of the RF classifier were completed in eCognition Developer (v9.5, Trimble Geospatial). Default parameters were used at the image object level: max. categories (16), max. tree number (50).

Accuracy validation
In general, it is essential to determine the proportion of correctly classified areas out of their total number in the relevant classification class. Classification errors include a systematic component (incorrectly set training sets) and a random component (spectral overlap of training sets). The most common method for evaluating classification results, the classification error matrix, was applied. The results of the classification process were compared with actual (reference) data from the field research. A total of 142 objects (Figure 5) covering 243.73 hectares (25% of the total area of all created samples) were manually selected as validation samples.
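The way overall accuracy, producer's accuracy (PA) and user's accuracy (UA) follow from a classification error matrix can be sketched as below. The class labels are taken from the study, but the counts are hypothetical and do not reproduce Table 5. The convention that rows hold reference classes and columns hold classified classes is an assumption of this sketch.

```python
def accuracy_metrics(matrix, labels):
    """Derive accuracy measures from a square error matrix.

    matrix[i][j] is the number of validation samples of reference
    class i that were assigned to class j by the classifier.
    """
    size = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(size))
    producers, users = {}, {}
    for i, label in enumerate(labels):
        reference_total = sum(matrix[i])
        classified_total = sum(matrix[r][i] for r in range(size))
        # PA: share of reference samples of a class that were found.
        producers[label] = matrix[i][i] / reference_total if reference_total else None
        # UA: share of samples assigned to a class that really belong to it.
        users[label] = matrix[i][i] / classified_total if classified_total else None
    return {"overall": correct / total, "PA": producers, "UA": users}


# Hypothetical counts for three classes: AAL1, AAL2 and fodder crops (Fc).
class_labels = ["AAL1", "AAL2", "Fc"]
error_matrix = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]
metrics = accuracy_metrics(error_matrix, class_labels)
```

With these counts, 21 of 30 samples lie on the diagonal, so the overall accuracy is 0.7, while PA and UA show which classes contribute most to the confusion.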

RESULTS
All available optical data and the derived NDVI data were used for PCA. The number of input layers was 20 (four dates, each with four 10 m spectral bands and one derived NDVI image). The results are shown in Table 3. Eight components with a cumulative weight of 99.01% were chosen from the PCA image. These components and the S-1 temporal average VV/VH images were used as input for the calculation of object feature statistics.
MRS with the ESP2 algorithm was performed using the S-1 temporal average data and the four cloud-free S-2 images. The ESP2 algorithm produced three different object levels. As defined in Table 2, five segmentation goodness metrics were calculated for each level. Based on the results (Table 4), the level 1 parameters were chosen (shape 0.2, compactness 0.5, scale parameter 55). These segments were used as image objects in the classification process.
Table 4. Evaluation of segmentation results using goodness metrics
The map of abandoned agricultural land and LC/LU classes produced by the RF classification is shown in Figure 6; its overall accuracy is 73% (Table 5). The statistical distribution of AAL and LC/LU classes (Figure 7) showed that class AAL1 was identified on 1,557 ha (2.5%) and AAL2 on 2,726 ha (4.4%). The greatest problem in AAL identification was the overestimation resulting from confusion between AAL1 and fodder crops (28%), AAL1 and permanent crops (8%), and AAL2 and shrubs (16%). This overestimation can be explained by the spectral and textural similarity of overgrowing vegetation and annual crops. In total, AAL classes were identified on 4,283 ha, i.e. 6.9% of the total area of interest.
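The reported area shares can be checked with simple arithmetic. The sketch below assumes that the percentages refer to the whole study area of 617 km², i.e. 61,700 ha; the variable names are hypothetical.

```python
# Study area of 617 km² expressed in hectares (1 km² = 100 ha).
study_area_ha = 617 * 100

# Areas of the two AAL classes as reported in the results.
aal_area_ha = {"AAL1": 1557, "AAL2": 2726}

# Share of each class in the study area, in percent with one decimal place.
aal_share_pct = {
    name: round(100 * area / study_area_ha, 1)
    for name, area in aal_area_ha.items()
}

total_aal_ha = sum(aal_area_ha.values())
total_aal_share_pct = round(100 * total_aal_ha / study_area_ha, 1)
```

Under this assumption the two classes sum to 4,283 ha, and the shares round to 2.5%, 4.4% and 6.9%, consistent with the reported figures.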

Class homogeneity validation performed on the resulting AAL map confirmed acceptable homogeneity within classes, as the obtained standard deviations of NDVI values were low (Figure 8).

CONCLUSIONS
Understanding the spatial patterns of AAL is important for assessing the potential of the landscape to contribute not only to food production but also to other ecosystem services such as biomass production or carbon sequestration. In line with the aim of this paper, the possibility of using Sentinel images and OBIA with the RF classifier was documented for the identification of AAL and LC/LU at the local level in an agricultural landscape. The study area is part of one of the most important centres of viticulture in Slovakia. The eastern part of the study area is used for intensive agricultural production, predominantly on arable land. However, some of these areas were abandoned and have become overgrown with successional vegetation. Because of the dynamics of abandonment, RS data can be considered one of the most important sources of information about this process. According to our results, the AAL1 and AAL2 classes cover 2.5% and 4.4% of the whole experimental area, respectively. AAL classes are usually fragmented and occupy small areas of overgrowing vegetation. Agricultural abandonment is a gradual process, manifested by specific local features typical of, e.g., Slovak conditions. It is difficult to generalize and transfer the parameters of such conditions to other study areas without a field survey. In this contribution, we did not attempt to generalize the parameters used to identify AAL to another study area. The obtained results confirm that it is possible to identify different classes of AAL using Sentinel data. We found it challenging to train the RF algorithm for a sufficient classification of AAL. This led to misclassification between AAL1 and fodder crops (Fc) and between AAL2 and shrubs (Sh) (see Table 5).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020
Further improvements could be achieved by using satellite data acquired more frequently during the vegetation period and datasets with a higher spatial resolution. We also assume that significant improvements could be reached by combining high-resolution orthophotos with laser scanning data. To conclude, we found Sentinel data useful for large-scale monitoring of AAL of the kind presented by e.g. (Alcantara et al., 2013; Estel et al., 2016; Kuemmerle et al., 2006; Prishchepov et al., 2012). The hybrid classification approach for large-scale monitoring applied by Kuemmerle et al. (2006) resulted in a reliable LC/LU map with an overall accuracy of 84%. That study also showed that shrub areas are difficult to classify because of their overlap with grassland, and it pointed out a high degree of spectral heterogeneity. In the study by Alcantara et al. (2013), which focused on analysing spatial patterns of AAL, the resulting producer's accuracy (PA, 17.27%) and user's accuracy (UA, 50.75%) showed significant issues in the identification of AAL, with AAL and pastures being classified alike. The frequency of fallow land can also be considered a problematic part of AAL classification (Estel et al., 2015). This means that areas identified as AAL could in reality be merely fallow land that is later recultivated. The change in reflectance that occurs when arable land is abandoned is significant (Prishchepov et al., 2012). This makes the identification of the AAL1 class easier (PA: 93%) compared with the more gradual change between fodder crops, shrubs and AAL2 (PA: 75%). Because hay cutting occurred only once or twice a year in our study area, it is crucial to capture those areas directly after they are cut (Prishchepov et al., 2012).