EVALUATION OF MACHINE LEARNING CLASSIFIERS FOR MAPPING FALCATA PLANTATIONS IN SENTINEL-2 IMAGE

Efficient and accurate mapping of forest and other industrial tree plantations (ITPs) is essential to ensure better monitoring and sustainable management of these plantations. In Caraga Region, Mindanao, Philippines, ITPs planted with Falcata (Paraserianthes falcataria (L.) Nielsen) are widespread and have contributed more than 50% of the nationwide log production. At present, there is limited information on the location and extent of existing plantations. This provides an opportunity to evaluate satellite remote sensing approaches for mapping these plantations from images, particularly those provided by the Sentinel-2 mission. The objective of this study is to evaluate machine learning classifiers for mapping Falcata plantations in a Sentinel-2 image using a 9 x 9 km study area in Caraga Region. It also aims to find the best classifier that can provide acceptable levels of accuracy by utilizing only the four bands of the 10-m spatial resolution and the 9 bands of the 20-m spatial resolution Level 2A Sentinel-2 image, respectively. The following classifiers and their variants were evaluated: Linear Support Vector Machine (SVM), Polynomial SVM, Radial Basis Function (RBF) SVM, Artificial Neural Network (Neural Net), Random Forest (RF), and Maximum Likelihood (ML). One or more of these classifiers have been successfully used in natural and plantation forest mapping, including tree species classification from remotely sensed images. However, their performance and accuracy in detecting and discriminating Falcata plantations are yet to be evaluated. Results of the evaluation showed that the ML classifier has the highest overall accuracy (OA) of 90.90% and has more consistent values for Producer’s Accuracy (PA) and User’s Accuracy (UA) for the Falcata and Non-Falcata classes, and hence provides better Falcata classification results than the other classifiers when the 10-m spatial resolution Sentinel-2 image was used.
The accuracy assessment of the 20-m subset classification provides relatively different results from that of the 10-m subset, perhaps due to the inclusion of more bands. The highest OA was obtained by the Linear and RBF SVM classifiers at 92.05% each. The SVM classifiers have consistent performance and produce more accurate classification results than the other classifiers (i.e., more than 90% OA, PA, and UA). From these results, it can be concluded that the Maximum Likelihood classifier is best to use for Falcata mapping using the 10-m spatial resolution Sentinel-2 image. For the 20-m resolution image, either of the two SVMs (linear or RBF) is more appropriate to use. However, it should be noted that these results are based on classifications where default parameters were used. Improvement in the classification accuracy may be achieved if these parameters were optimized.


INTRODUCTION
Falcata (Paraserianthes falcataria (L.) Nielsen) is a multipurpose industrial tree plantation (ITP) species that has become a significant source of wood for the panel and plywood industries in countries like Indonesia and the Philippines (Krisnawati et al., 2011). It has become a popular choice in ITPs due to its fast growth, its ability to grow on a variety of soils, and its acceptable quality of wood. In the Philippines, Falcata plantations have a significant role in the country's total log production. In Caraga, a region in Mindanao Island, Philippines, Falcata plantations, including tree farms run by small-holder farmers and organizations, are widespread. The Falcata plantations and tree farms in this region provided 555,966 m³ of logs in 2019, which is more than 50% of the nationwide total log production (FMB, 2019). To ensure better monitoring and sustainable management of these plantations, efficient and accurate mapping of Falcata plantations is essential. At present, there is limited information on the location and extent of existing plantations. This provides an opportunity to evaluate satellite remote sensing approaches for mapping these plantations from images, particularly those provided by the Sentinel-2 mission.
Traditional and modern machine learning classifiers, including Maximum Likelihood (ML), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Random Forest (RF), have been utilized in natural and plantation forest mapping, including tree species classification from remotely sensed images in different parts of the world. Li et al. (2007) Luo et al. (2020) compared the performance of SVM and RF in both pixel-based and object-oriented image analysis methods for mapping mango plantations in 2-m resolution pansharpened Gaofen-1 imagery in parts of Hainan Island, China. In pixel-based classification, RF performed better than SVM, with the former's accuracy at 83.89%; however, the User's Accuracy of classified mango plantations was higher for SVM (88.19%) than for RF (87.22%). In object-oriented classification, RF also performed better than SVM (Luo et al., 2020). Razak et al. (2018) used SVM to classify different rubber tree phenological cycles in Sungai Buloh, Selangor, Malaysia, with accuracies ranging from 90.10% to 97.08% across the acquisition dates of the 2015 Landsat 8 OLI images. From these studies, it can be concluded that the accuracy of the classification depends on the type of classifier used, the class of interest, and the imagery used.
The objective of this study is to evaluate ML, ANN, SVM, and RF for mapping Falcata plantations in a Sentinel-2 image. It also aims to find the best classifier that can provide acceptable levels of accuracy by utilizing only the four bands of the 10-m spatial resolution and the 9 bands of the 20-m spatial resolution Level 2A Sentinel-2 image, respectively.

Study Area
A 9 x 9 km study area was selected in Butuan City, Agusan del Norte, Caraga Region, where Falcata tree plantations, including small-holder tree farms, are widespread (Figure 1).
Image subsetting and subsequent steps (including image classification) were all performed in Envi 5.3 (Exelis Visual Information Solutions, Inc., USA), except for RF classification, which was performed using ArcGIS 10.8 (Esri, USA).
Regions of Interest (ROIs) corresponding to Falcata plantations and other land-cover classes were collected from each subset and used for classifier training (Table 1). The other land-cover classes include barren areas, built-up areas, croplands, forests (non-Falcata), grasslands, palms, shrubs, and water bodies. The ROI collection was aided by field surveys and high-resolution satellite images available in Google Earth (with the same acquisition date as the Sentinel-2 image).
For the 10-m spatial resolution subset, a total of 1000 pixels were collected for Falcata; the number of pixels collected for the other classes ranges from 631 to 2259. For the 20-m spatial resolution subset, 871 pixels were collected as training ROI for the Falcata class; the number of pixels collected for the other classes ranges from 501 to 1284. For classifier accuracy assessment, only two classes were considered: Falcata and Non-Falcata. For each class, an independent set of 1000 pixels was randomly selected in both the 10-m and 20-m subsets. Image classifications were then performed using all bands of each of the 10-m and 20-m images and their corresponding training ROIs.
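The random selection of an independent validation set can be illustrated with a minimal Python sketch. It is not the procedure used in Envi 5.3; the function name `sample_validation_pixels`, the toy pixel labels, and the training indices are all hypothetical and serve only to show the idea of drawing 1000 pixels per class while keeping training pixels out of the validation set.

```python
import random


def sample_validation_pixels(labels, per_class=1000, exclude=frozenset(), seed=0):
    """Randomly pick `per_class` pixel indices for each class.

    labels:  dict mapping pixel index -> class name (hypothetical input).
    exclude: pixel indices already used for training; they are skipped so
             that the validation set stays independent of the training ROIs.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    candidates_by_class = {}
    for idx, cls in labels.items():
        if idx not in exclude:
            candidates_by_class.setdefault(cls, []).append(idx)

    sample = {}
    for cls, candidates in candidates_by_class.items():
        if len(candidates) < per_class:
            raise ValueError(f"only {len(candidates)} candidate pixels for {cls}")
        # Sampling without replacement from a sorted list keeps results stable.
        sample[cls] = rng.sample(sorted(candidates), per_class)
    return sample


# Toy data (assumption): 5000 labelled pixels, the first 200 used for training.
labels = {i: "Falcata" if i % 2 == 0 else "Non-Falcata" for i in range(5000)}
training = frozenset(range(200))
validation = sample_validation_pixels(labels, per_class=1000, exclude=training)
print({cls: len(idx) for cls, idx in validation.items()})
```

Drawing the same number of pixels per class keeps the two-class assessment balanced, so that neither class dominates the overall accuracy.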

Image Classification
The following classifiers and their variants were evaluated: Linear SVM, Polynomial SVM, Radial Basis Function (RBF) SVM, ANN ("Neural Net" in Envi 5.3), RF ("Random Trees" in ArcGIS 10.8), and ML. Default Envi 5.3 parameters of the SVM, Neural Net, and ML classifiers were utilized to avoid complexity in the classification process. For RF, the following default classifier training parameters in ArcGIS 10.8 were used: maximum number of trees = 50; maximum tree depth = 30; and maximum number of samples per class = 1000. The result of each classification was subjected to a post-classification procedure, specifically the merging of classes, to generate a Falcata/Non-Falcata map. A total of 5 Falcata/Non-Falcata maps were generated for each image subset, with each map corresponding to one classifier. Each map was subjected to accuracy assessment, where the producer's, user's, and overall classification accuracies were computed. Figure 2 shows the results of classification using the 4 bands of the 10-m resolution subset image.
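The class merging and the accuracy measures described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Envi 5.3 workflow: the helper names `merge_to_binary` and `accuracy_report` and the toy reference and predicted labels are assumptions. PA is computed as the share of reference pixels of a class that were correctly mapped, UA as the share of pixels mapped to a class that truly belong to it, and OA as the share of all validation pixels mapped correctly.

```python
from collections import Counter


def merge_to_binary(label, target="Falcata"):
    """Collapse any land-cover label into target / Non-target."""
    return target if label == target else "Non-" + target


def accuracy_report(reference, predicted, classes=("Falcata", "Non-Falcata")):
    """Return OA, PA, and UA (in percent) from paired class labels."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    # counts[(r, p)] = number of pixels with reference class r mapped as p
    counts = Counter(zip(reference, predicted))
    correct = sum(counts[(c, c)] for c in classes)
    report = {"OA": 100.0 * correct / len(reference)}
    for c in classes:
        ref_total = sum(counts[(c, p)] for p in classes)   # row total
        map_total = sum(counts[(r, c)] for r in classes)   # column total
        report["PA " + c] = 100.0 * counts[(c, c)] / ref_total if ref_total else float("nan")
        report["UA " + c] = 100.0 * counts[(c, c)] / map_total if map_total else float("nan")
    return report


# Toy validation pixels (assumption): 10 Falcata and 10 Non-Falcata references.
reference = ["Falcata"] * 10 + ["Non-Falcata"] * 10
multi_class = (["Falcata"] * 9 + ["forests"]          # one Falcata pixel missed
               + ["Falcata"] * 2 + ["croplands"] * 8)  # two false Falcata pixels
predicted = [merge_to_binary(label) for label in multi_class]
report = accuracy_report(reference, predicted)
print(report)
```

In this toy case 17 of 20 pixels are correct, so OA is 85%; Falcata PA is 9/10 = 90%, while Falcata UA is 9/11, or about 81.8%, because two Non-Falcata pixels were mapped as Falcata.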

Results for the 10-m resolution Sentinel-2 Image
The accuracy assessment of the 10-m subset classification maps (Table 2) showed that the ML classifier (Figure 2) has the highest OA at 90.90%. Overall, the ML classifier has more consistent values for OA, PA, and UA, and hence provides better Falcata classification results than the other classifiers when the 10-m spatial resolution Sentinel-2 image is used. Figure 3 shows the results of classification using the 4 bands of the 10-m resolution subset image.

Results for the 20-m resolution Sentinel-2 Image
The accuracy assessment of the 20-m subset classification (Table 3) provides relatively different results from that of the 10-m subset, perhaps due to the inclusion of more bands.

CONCLUSION
In this study, the accuracies of ML, three variants of SVM, ANN (Neural Net), and RF for mapping Falcata plantations in a Sentinel-2 image were evaluated to find the best classifier that can provide acceptable levels of accuracy by utilizing only the four bands of the 10-m spatial resolution and the 9 bands of the 20-m spatial resolution Level 2A Sentinel-2 image, respectively. From the results, it can be concluded that the ML classifier is best to use for Falcata mapping using the 10-m spatial resolution Sentinel-2 image. For the 20-m resolution image, either of the two SVMs (linear or RBF) is more appropriate to use. However, it should be noted that these results are based on classifications where default parameters were used. Improvement in the classification accuracy may be achieved if these parameters were optimized.