FIRST INVESTIGATION OF MEDITERRANEAN OAK TREE VITALITY WITH HIGH-RESOLUTION WORLDVIEW-3 SATELLITE DATA: COMPARING TEN VEGETATION INDICES AND THREE MACHINE LEARNING CLASSIFIERS

Oak trees are the primary component of Mediterranean agro-silvopastoral systems. Since the second half of the 20th century, however, a severe oak decline has been observed. Climate change reinforces this problem, which is consistent with the tree dieback observable worldwide. As the trees have significant ecological and socio-economic functions, their observation and the assessment of their vitality are increasingly researched. Satellite remote sensing is well suited to large-scale surveys of these extensive and sometimes hardly accessible areas. This study investigates the usability of high-resolution WorldView-3 data for the classification of tree vitality. The ground truth was collected on an Andalusian dehesa at the end of September 2019, coinciding with the satellite data acquisition. After customary post-processing of the WorldView-3 data, 10 vegetation indices (ARVI, CIgreen, CSI, DPI, EVI, GNDVI, NDVI, PSRI, RENDVI, and RGI) were calculated from the multispectral image. Three machine learning classifiers (Maximum Likelihood, Random Forest, and Support Vector Machine) were then used for a supervised image classification with three vitality classes (healthy, sick, and dead). Independent ground truth data were used for the validation. The best results were achieved with the red edge normalized difference vegetation index (RENDVI) and the Support Vector Machine classifier (F1 scores between 0.27 and 0.72). The maximum overall accuracy of around 0.6, however, leaves room for improvement. Further studies should focus on other classification methods, more reliable ground truth, and combined analyses of spectral and structural data.


INTRODUCTION
Agroforestry is regarded as a promising solution for sustainable land use and for achieving the ambitious climate goals of the European Union (Eichhorn et al., 2006; Hernández-Morcillo et al., 2018; Tittensor et al., 2014). Mixing the cultivation of trees, pasture, and crops in agro-silvopastoral systems can help to reduce soil erosion and nitrogen leaching, fix CO2, and enhance biodiversity (den Herder et al., 2017; Kay et al., 2019; Palma et al., 2007). In the western Mediterranean, a sparse cover of holm oak (Quercus ilex) and cork oak (Quercus suber) in a savannah-like landscape characterizes the agro-silvopastoral systems of dehesas in Spain and montados in Portugal. Both terms are hereinafter summarized as dehesas. The trees have various environmental and socio-economic benefits. It is known that tree shade reduces evapotranspiration and that the soil beneath the tree canopy is rich in nutrients and organic matter, so that the pasture under the trees is of higher quality (Moreno and Obrador, 2007; Serrano et al., 2018). The socio-economic value is based on cork production, the famous Iberian ham (jamón ibérico), big game hunting, and tourism (Fagerholm et al., 2019; Moreno and Obrador, 2007; Moreno and Pulido, 2009).
Dehesas are a prime example of a sustainable balance between human land use and ecosystem protection. Since the second half of the 20th century, however, there has been increasing concern that this balance can hardly be maintained (Costa et al., 2014). Due to agricultural intensification and improper management, the ecosystems lose their regenerative capacity and vitality, marked by a significant oak tree decline (Camilo-Alves et al., 2013; Díaz et al., 1997; López-Sánchez et al., 2017). The important tree layer is, in particular, stressed by droughts (Gil-Pelegrín et al., 2008) and diseases (Tiberi et al., 2016), caused for example by the ambrosia beetle (Bellahirech et al., 2019) or 'la seca', a sudden dieback of oak trees caused by an oomycete disease (Costa et al., 2014). The soil-borne pathogen Phytophthora cinnamomi may be the main factor in the tree decline (Camilo-Alves et al., 2013). To protect the ecosystems, dehesas have been included in the European Natura 2000 network, they are protected by the EU Directive 92/43, and the autonomous community of Extremadura introduced the dehesa law, which restricts the pruning of trees (European Union, 1992; Moreno and Pulido, 2009; Plieninger and Schaar, 2008). Nevertheless, dehesas still undergo significant land use change, reinforced by climate change, and hence they are regarded as a highly vulnerable ecosystem in the Mediterranean (Navarro-Cerrillo et al., 2019; Rolo and Moreno, 2019).
Strong research interest in oak trees and their decline can be observed, as they are fundamental for the ecosystem (Camilo-Alves et al., 2013; Costa et al., 2010). Satellite remote sensing is frequently used for capturing the extensive dehesa landscapes. Data from Landsat (Allen et al., 2018; Aubard et al., 2019; Hernández-Lambraño et al., 2019) or Sentinel-2 (Godinho et al., 2018) are widely used because they are easily accessible and freely available. Furthermore, quite long time series are available for both sensors, which allows time series analyses. A shortcoming, however, is the spatial resolution of 30 m (Landsat) and 10 m (Sentinel-2). With these data, studies can be carried out on a regional scale, but studies at tree scale are hardly feasible, as the oak trees typically have a crown diameter of 4-10 m. The modern commercial earth observation satellites WorldView-2 (WV-2) and WorldView-3 (WV-3) offer much higher resolution data. WV-2 provides panchromatic and multispectral images of 0.46 m and 1.84 m resolution, respectively. WV-3 reaches a resolution of 0.31 m and 1.24 m in the panchromatic and multispectral range, respectively (DigitalGlobe, 2020). These high resolutions may allow a satellite-based investigation of tree vitality and oak decline at tree scale. Possibly owing to the quite high purchasing costs, very few studies exist so far. Furthermore, the orthorectification of satellite images of regions with high relief energy can be problematic. Navarro-Cerrillo et al. (2019) used WV-2 data in combination with airborne laser scanning (ALS) data to classify defoliation levels of holm oaks. Gonçalves et al. (2018) estimated oak and pine biomass from WV-2 data. To the best of the authors' knowledge, there is no study in which WV-3 data has been used to investigate tree vitality in a dehesa ecosystem. Research on this is urgently required, as the varying success of oak regeneration is also a global problem (Annighöfer et al., 2015).
The aim of this study is a first investigation of the usability of WV-3 data for mapping tree vitality in a dehesa ecosystem in a mountainous region. About 1000 trees in the study area were manually mapped as ground truth. The orthorectified and atmospherically and topographically corrected image is classified with ten different vegetation indices and three different machine learning classifiers.

Study area
This study was conducted on the Dehesa San Francisco, which is located close to the small village of Santa Olalla del Cala, about 60 km north of Sevilla (Figure 1). The region has high relief energy as it lies in the Sierra Morena, one of the main mountain systems in Spain. A continental Mediterranean climate dominates the area, with dry, hot summers and mild winters. It usually rains only from autumn to spring. The dehesa covers an area of 500 ha with an altitude ranging between 350 m and 500 m above sea level. The territory is ideal for a study on tree vitality, as besides healthy areas, there are areas with a few diseased trees and severely affected areas (Sapp et al., 2019).

Data
The primary data basis for this study is a WV-3 satellite image from 22 September 2019 (Figure 1). The image was captured around noon, which minimizes shadows. The full scene contains some cirrus clouds, which fortunately lie outside the Dehesa San Francisco. The image was purchased with a resolution of 0.5 m and 2.0 m in the panchromatic and multispectral range, respectively. The multispectral image contains 8 bands in the visible and near-infrared (VNIR) spectrum between 425 and 950 nm.
Freely available ALS data were downloaded from the Spanish National Geographic Institute (Instituto Geográfico Nacional, 2020) to generate a high-resolution digital elevation model (DEM) for the orthorectification of the satellite image. The ground returns were filtered from the LiDAR point cloud and interpolated to a DEM raster with 1 m resolution.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition)
As ground truth, the vitality of about 1000 oak trees was manually mapped in a field campaign at the end of September 2019. The shapes of the trees, which were digitized in a student thesis, were used as the spatial base for this mapping. The trees are located in 12 areas of about 5 ha each (Figure 1). These areas are distributed across the dehesa and cover different vitality conditions. They also differ in location factors, such as sun-exposed southern and shaded northern slopes. Prior to the campaign, maps of the trees were created as georeferenced PDF files. The trees could be easily located in the field with these maps opened in the Avenza Maps app (Avenza, 2020) on a mobile device. The app shows the current position on the map, using the mobile phone's GPS sensor. The accuracy of this sensor is sufficient for orientation on the map. The species (holm or cork oak) and degree of vitality were mapped for each tree. As a measure of vitality, the percentages of defoliated and decolorized parts were estimated. In the post-processing, the two percentages were summed up. All trees with a sum < 15% were classified as healthy and those with a sum > 95% as dead. Those in the range in between were classified as sick.
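The class assignment described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds; the function name and argument names are hypothetical, and a sum of exactly 15% or 95% is assumed to fall into the sick class, since the text defines healthy and dead with strict inequalities.

```python
def vitality_class(defoliated_pct, decolorized_pct):
    """Assign a vitality class from the two field-estimated percentages.

    Thresholds follow the text: sum < 15 % is healthy, sum > 95 % is dead,
    everything in between is sick.
    """
    total = defoliated_pct + decolorized_pct
    if total < 15:
        return "healthy"
    if total > 95:
        return "dead"
    return "sick"


# Hypothetical field estimates (defoliated %, decolorized %)
for estimate in [(5, 5), (30, 20), (60, 40)]:
    print(estimate, vitality_class(*estimate))
```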

Satellite data processing
The WV-3 image was orthorectified with the DEM using the Rigorous Orthorectification tool of the ENVI 5.5.2 Photogrammetry Module. Thirty easily identifiable oaks were used as ground control points (GCPs). In ArcGIS Pro 2.4.3, the trees were identified using the Bing aerial image data. A hillshade of the digital surface model (DSM), derived with 1 m resolution from the entire ALS point cloud, was created to determine the exact center of the trees. For the obtained XY coordinates, the Z coordinates were derived from the DEM. WGS 1984 UTM Zone 29N was used as the spatial reference system. In the ENVI Orthorectification wizard, the trees were identified as GCPs in the high-resolution panchromatic image and the UTM coordinates were assigned to the image coordinates. The image coordinates were then converted for the lower resolution multispectral image. Through this conversion, the accuracy of the identification in the panchromatic image could be maintained. The orthorectified image was then atmospherically and topographically corrected using ATCOR-3 with a dry rural atmosphere and the DEM.
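The conversion of GCP image coordinates from the panchromatic to the multispectral image is not specified further. A minimal sketch, assuming that both images share the same upper-left corner, that integer coordinates denote pixel centers, and that the resolutions are the purchased 0.5 m and 2.0 m, could look as follows (the function name is hypothetical and this is not the ENVI procedure itself):

```python
def pan_to_ms(col_pan, row_pan, pan_res=0.5, ms_res=2.0):
    """Convert panchromatic pixel coordinates to multispectral ones.

    Assumes a shared upper-left image corner and a pixel-center
    convention (pixel 0 covers the interval [-0.5, 0.5)).
    """
    scale = ms_res / pan_res  # 4 panchromatic pixels per multispectral pixel
    col_ms = (col_pan + 0.5) / scale - 0.5
    row_ms = (row_pan + 0.5) / scale - 0.5
    return col_ms, row_ms


# The center of the first 4 x 4 block of panchromatic pixels maps to the
# center of the first multispectral pixel.
print(pan_to_ms(1.5, 1.5))
```

Keeping the result as a sub-pixel (floating point) coordinate is what preserves the accuracy of the identification in the panchromatic image.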

Classification of tree vitality
The finally corrected image was transferred to ArcGIS Pro, which was used for the calculation of the vegetation indices (VIs) and the vitality classification. Ten VIs were selected from studies on tree canopy cover and vitality in dehesa systems (Godinho et al., 2016, 2018; Navarro-Cerrillo et al., 2019). Table 1 shows the VIs and their formulas. The VIs were calculated for each pixel of the entire dehesa using the raster calculator. All VIs were calculated for the ATCOR-corrected and for the non-corrected image to investigate the usefulness of such a correction. The 12 manually mapped areas were divided into 8 training and 4 validation areas for the subsequent classification. The validation areas were chosen to represent different levels of vitality and locational factors. An ArcGIS model was established to execute the classification training and validation. Three machine learning classifiers were used to test their performance, namely Maximum Likelihood (ML), Random Forest (RF), and Support Vector Machine (SVM). The main steps of the model can be summarized as follows:
1. Train the classifier with data from the training areas.
2. Classify the raster pixel-wise into the three vitality classes (healthy, sick, and dead).
3. Calculate the zonal statistics for each tree polygon in the validation areas. The majority and median of the pixel values per polygon were used to assign a tree to a class.
4. Create 1000 accuracy assessment points within the tree polygons in the validation areas, with the manually mapped vitality as ground truth.
5. Update the accuracy assessment points with the zonal statistics value as the classified value.
6. Compute the error matrix.
7. Export the error matrix as an Excel file.
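As an illustration of the per-pixel VI calculation, the following Python sketch computes three of the normalized difference indices for a single pixel. The formulas are the commonly used definitions and are an assumption here, since Table 1 is not reproduced in this text; the band assignment (green, red, red edge, near-infrared) and the function names are likewise hypothetical. The study itself used the ArcGIS raster calculator.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN if the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else float("nan")


def vegetation_indices(green, red, red_edge, nir):
    """Compute NDVI, GNDVI, and RENDVI from band reflectances of one pixel.

    Assumed (commonly used) definitions:
      NDVI   = (NIR - Red)      / (NIR + Red)
      GNDVI  = (NIR - Green)    / (NIR + Green)
      RENDVI = (NIR - Red Edge) / (NIR + Red Edge)
    """
    return {
        "NDVI": normalized_difference(nir, red),
        "GNDVI": normalized_difference(nir, green),
        "RENDVI": normalized_difference(nir, red_edge),
    }


# Hypothetical reflectances of a vegetated pixel
print(vegetation_indices(green=0.08, red=0.05, red_edge=0.20, nir=0.45))
```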
Finally, a Python script using the pandas library (McKinney, 2010) was written to, firstly, merge the 120 error matrices into one Excel sheet and, secondly, calculate the F1 score as a measure of the classification's accuracy. Based on this table, the numerical results of the classifications can easily be compared. Furthermore, the raster data sets were visually inspected.
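The F1 score per class follows directly from an error matrix as F1 = 2TP / (2TP + FP + FN). The sketch below shows this calculation in plain Python rather than pandas, so that it is self-contained; it is not the authors' script. The matrix orientation (rows as ground truth, columns as classified) and the example counts are assumptions made purely for illustration and are not results of this study.

```python
def f1_scores(matrix, classes):
    """Per-class F1 from an error matrix.

    matrix[i][j] is the number of points with ground truth classes[i]
    that were classified as classes[j] (assumed orientation).
    """
    scores = {}
    for k, name in enumerate(classes):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # missed points of class k
        fp = sum(row[k] for row in matrix) - tp   # points wrongly assigned to k
        denom = 2 * tp + fp + fn
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores


def overall_accuracy(matrix):
    """Share of correctly classified points (diagonal sum / total)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


# Invented example counts, NOT taken from the study
classes = ["healthy", "sick", "dead"]
example = [
    [50, 10, 0],
    [20, 15, 5],
    [0, 5, 5],
]
print(f1_scores(example, classes))
print(overall_accuracy(example))
```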

RESULTS
This study investigates tree vitality on a Spanish dehesa by calculating VIs from multispectral WV-3 data and classifying tree vitality using supervised image classification techniques. The ground truth was manually mapped and can be summarized as follows: the training data set contained 704 oak trees, of which 479 were mapped as healthy, 207 as sick, and 18 as dead; the validation data set included 354 trees, of which 218 were mapped as healthy, 118 as sick, and 18 as dead. The ten VIs (ARVI, CIgreen, CSI, DPI, EVI, GNDVI, NDVI, PSRI, RENDVI, and RGI) were each calculated for the ATCOR-corrected and the non-corrected image. The ArcGIS model was executed with the ML, RF, and SVM classifiers to classify each pixel. The majority and median of the pixel values per polygon were used to assign a tree to a vitality class (healthy, sick, and dead). This processing resulted in 120 error matrices.
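How the two zonal statistics can lead to different class assignments for the same tree is sketched below. The integer coding of the classes (0 = healthy, 1 = sick, 2 = dead) is an assumption, as is the tie handling: ties in the majority go to the lowest code, and the low median is used so that the result is always a valid class code. How ArcGIS resolves these cases is not stated in the text.

```python
from collections import Counter
from statistics import median_low

# Assumed ordinal coding of the vitality classes
CLASS_NAMES = {0: "healthy", 1: "sick", 2: "dead"}


def zonal_majority(pixel_classes):
    """Most frequent class code within a polygon (ties: lowest code)."""
    counts = Counter(pixel_classes)
    top = max(counts.values())
    return min(code for code, n in counts.items() if n == top)


def zonal_median(pixel_classes):
    """Low median of the class codes within a polygon."""
    return median_low(pixel_classes)


# Hypothetical classified pixels inside one tree polygon
pixels = [0, 0, 1, 2, 2, 2, 1]
print(CLASS_NAMES[zonal_majority(pixels)])  # majority of the pixel values
print(CLASS_NAMES[zonal_median(pixels)])    # median of the pixel values
```

For this hypothetical polygon, the majority yields dead while the median yields sick, which illustrates why both statistics were evaluated separately.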
The results of the ATCOR-corrected images are generally better than those of the non-corrected images. Only these results will thus be considered in the following. The overall accuracy of these classifications is listed in Table 2 for the approaches based on the majority and median of the pixel values. The weakest results were achieved with the ML classifier, while the other two classifiers produced comparable results. The SVM classification, however, performed slightly better overall. Whether the majority or median of the pixel values is more suitable as a statistical measure cannot be concluded from these results.
Table 2. Overall accuracy of the classifications with majority and median as zonal statistic value for all VIs and classifiers using the ATCOR-corrected image. * Maps of these VIs are shown in Figure 2.
The VI raster data sets of the entire dehesa were colorized identically for visual control. The maps for the ARVI, GNDVI, and RENDVI with the RF classifier and the DPI, EVI, and RENDVI with the SVM classifier are shown in Figure 2. Although the overall accuracy for the ARVI with RF and the DPI with SVM was reasonably good, the maps are not satisfying. It is hardly possible to detect a spatial pattern in either map. Comparing the maps for the GNDVI with RF and the EVI with SVM with the pattern of the trees (Figure 1), a better correspondence can be found, although both have slightly worse results in the overall accuracy. The two maps for the RENDVI demonstrate the impact of the chosen classifier on the outcome. While with RF the largest part of the dehesa is classified as healthy, with SVM larger parts are classified as dead. Compared with Figure 1, the raster of the RENDVI with the SVM classifier looks more reliable. For example, the almost vegetation-free south-facing slope in the far south is classified as largely dead, i.e., vegetation-free.
The error matrices and F1 scores with the majority and median as zonal statistic values, classified with RENDVI and SVM, are shown in Table 3. The accuracy assessment is based on 1000 randomly placed points. Of these points, 697 were labeled as healthy in the ground truth, 124 as sick, and 95 as dead; 84 were invalid because they lay inside the tree polygons when they were created but outside the clipped raster used for the zonal statistics. The F1 scores show that the best results are achieved for the classification of healthy trees. The classification of sick and dead trees provides moderate to unsatisfactory results, but the median seems to be slightly better suited than the majority.
Table 3. Error matrices with the majority and median of the pixel values, classified with RENDVI and SVM.

The raster classified with RENDVI and SVM in comparison with the mapped ground truth is shown for areas number 4 and 11 in Figure 3. It is clearly visible that the classification gives very different results depending on the individual tree. The discussion will examine the reasons for this in more detail.

DISCUSSION
The overall aim of this study was a first investigation of the usability of WV-3 data for classifying the vitality of Mediterranean oak trees. The satellite image of the mountainous region in the Sierra Morena was orthorectified and atmospherically and topographically corrected using conventional methods. The orthorectification was quite challenging due to the high relief energy in the study area. Although a very high-resolution DEM was used, a satisfying orthorectification without GCPs was not possible. Furthermore, the question remained which representation of the earth's surface could be assumed to be closest to the real ground. Initially, Bing aerial image data was used as a reference. A comparison with the high-resolution DEM showed, however, that easily identifiable objects like streets did not match. The ALS data was finally considered to be the most reliable reference, as the data is collected with an active sensor system, which is quite close to the earth's surface compared to satellite data and should, therefore, contain less distortion. Based on the DSM and DEM, 30 GCPs were established, which allowed a satisfying orthorectification of the satellite image. For further surveys, placing GCPs in the field and surveying them with a precise GPS would be useful.
Figure 2. Maps of the classified raster data set of the entire dehesa. ARVI, GNDVI, and RENDVI with Random Forest classifier (left) and DPI, EVI, and RENDVI with Support Vector Machine classifier (right). Green, orange, and red indicate pixels classified as healthy, sick, and dead, respectively.
In general, none of the classifications provided entirely satisfying results. With a maximal overall accuracy of 0.63, all results are rather average. Several reasons can be the cause of this. An essential aspect of supervised classifications is the selection of training and validation areas. In this study, the manually digitized shapes were taken from a student thesis to select the pixels for each individual oak tree. The contours of the trees were drawn quite roughly, as shown in Figure 3. It can thus be assumed that for several trees, pixels were included that do not actually belong to the tree, or conversely, that pixels of the tree were not considered for the classification. For a more precise determination of the tree contours, the ALS data could be valuable. As Navarro-Cerrillo et al. (2019) showed, a tree segmentation approach based on a region growing algorithm can also be applied to determine the shape of oak trees from ALS data in savannah-like dehesa ecosystems. The principal idea is that a local maxima filter is used to determine the highest point of a tree and then a region growing segmentation algorithm (Erikson and Olofsson, 2005) is used to determine the area of the tree crown. This approach will be investigated in a future study, as the ALS data is available from the Spanish National Geographic Institute.
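The principle of seeding crowns at local height maxima and growing them outwards can be illustrated with a strongly simplified Python sketch on a small canopy height model (CHM) grid. This is not the algorithm of Erikson and Olofsson (2005) or the workflow of Navarro-Cerrillo et al. (2019); the growing rule (a neighboring cell joins if it is canopy and not higher than the cell it is reached from), the minimum canopy height of 2 m, and all names are assumptions chosen for illustration.

```python
from collections import deque


def local_maxima(chm, min_height=2.0):
    """Cells above min_height that are strictly higher than all 8 neighbors."""
    rows, cols = len(chm), len(chm[0])
    seeds = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue
            neighbors = [
                chm[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if all(h > n for n in neighbors):
                seeds.append((r, c))
    return seeds


def grow_crowns(chm, seeds, min_height=2.0):
    """Label crown regions by breadth-first growing from the seed cells."""
    rows, cols = len(chm), len(chm[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 = no crown
    queue = deque()
    for label, (r, c) in enumerate(seeds, start=1):
        labels[r][c] = label
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < rows and 0 <= cc < cols
            if (inside and labels[rr][cc] == 0
                    and min_height <= chm[rr][cc] <= chm[r][c]):
                labels[rr][cc] = labels[r][c]
                queue.append((rr, cc))
    return labels


# Hypothetical CHM (heights in m) with two separate trees
chm = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 3, 4, 3, 0, 0, 0],
    [0, 4, 6, 4, 0, 3, 0],
    [0, 3, 4, 3, 0, 5, 0],
    [0, 0, 0, 0, 0, 3, 0],
]
seeds = local_maxima(chm)
labels = grow_crowns(chm, seeds)
for row in labels:
    print(row)
```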
Another potential source of error is the ground truth for the supervised classification. The vitality status of the trees was estimated as consistently as possible. As estimating the percentage of defoliated and decolorized parts is sometimes challenging, in particular for large trees, this can bias the classification. First, further studies should examine alternatives, for example, unsupervised classifications. Second, quantitative parameters of tree vitality as ground truth would be beneficial. Martinez-Trinidad et al. (2010) found that electrical resistance readings can detect vitality differences in trees. They, however, also concluded that vitality is a complex variable and that visual assessment is necessary. Buddenbaum et al. (2015) used ground-based visible, near, and short wave infrared (VNIR and SWIR) imaging spectroscopy for drought stress monitoring of tree seedlings. Such data would be beneficial for the presented approach. For practical reasons, however, this is only possible with small trees. Alternatively, spectral ground truth could be captured with low-flying airborne systems such as unmanned aerial vehicles (UAVs). Jenal et al. (2019) have recently introduced a novel UAV-borne VNIR/SWIR sensor, which might produce appropriate data.
In summary, these results support the potential of WV-3 data for investigating the tree vitality of Mediterranean oaks. As oak decline, or tree decline in general, is a worldwide problem reinforced by climate change, research in this area is urgently required (Annighöfer et al., 2015; Rolo and Moreno, 2019). Although the accuracy of the classifications is only moderate, it can be assumed that better results can be achieved with the presented suggestions for improvement. In particular, the combination of spectral and structural data should be investigated, as shown by Navarro-Cerrillo et al. (2019) for classifying defoliation levels of holm oak or by Hartling et al. (2019) for the classification of urban tree species. WV-3 offers further potential, as in addition to the 8-band multispectral images in the VNIR range investigated here, the sensor can capture 8-band multispectral images in the SWIR range. Analysis of the VNIR/SWIR domain may enable the detection of water stress in vegetation, as shown, for example, by Buddenbaum et al. (2015). It might also be worthwhile for further studies to divide the entire data set into several sub-sets and to perform a leave-one-out cross-validation for more robust results.
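The suggested leave-one-out cross-validation could, for instance, use the 12 mapped areas as sub-sets, so that each area serves once as validation data while the remaining areas are used for training. The following Python sketch only generates these splits; treating the areas as the sub-sets is an assumption, and the function name is hypothetical.

```python
def leave_one_area_out(area_ids):
    """Yield (training_areas, validation_area) for every mapped area."""
    areas = list(area_ids)
    for held_out in areas:
        training = [a for a in areas if a != held_out]
        yield training, held_out


# 12 mapped areas, numbered 1 to 12
folds = list(leave_one_area_out(range(1, 13)))
for training, validation in folds[:2]:
    print("train on", training, "validate on", validation)
```

Averaging the accuracy measures over all folds would reduce the dependence of the results on one particular choice of training and validation areas.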

CONCLUSION
This study aimed at a first investigation of the usability of WorldView-3 data for mapping tree vitality in a dehesa ecosystem in a mountainous region. An image from late September was analyzed, for which ground truth data were collected. The orthorectification of the image could ultimately be performed satisfactorily. The distortion caused by high relief energy in the study region is, however, a factor to be taken into account in further studies. Out of the ten investigated vegetation indices (ARVI, CIgreen, CSI, DPI, EVI, GNDVI, NDVI, PSRI, RENDVI, and RGI) and three machine learning classifiers (Maximum Likelihood, Random Forest, and Support Vector Machine), the best results were achieved with the red edge normalized difference vegetation index (RENDVI) and the Support Vector Machine classifier. Future studies will aim at obtaining more reliable ground truth, investigating other classification methods, and performing combined analyses of spectral and structural data to improve the results.