INTEGRATION OF MULTITEMPORAL SENTINEL-1 AND SENTINEL-2 IMAGERY FOR LAND-COVER CLASSIFICATION USING MACHINE LEARNING METHODS

Space-borne remote sensing data are widely used for land-cover classification (LCC) because they provide a large amount of data with a regular temporal revisit time. In recent years, optical and synthetic aperture radar (SAR) imagery have become freely available, and their integration in time series has improved LCC. This research evaluates the classification accuracy achieved with multitemporal (MT) Sentinel-1 (S1) and Sentinel-2 (S2) imagery. Pixel-based LCC is performed for S1 and S2 imagery, and for a combination of both datasets, with Random Forest (RF) and Extreme Gradient Boosting (XGBoost; XGB). The study area is located in Lyon, in the south-east of France. Regardless of whether single-date or MT data were used, the highest classification accuracy was achieved with integrated S1 and S2 imagery and the XGB method; from single-date to MT data, overall accuracy (OA) and the Kappa coefficient (Kappa) increased from 85.51% to 91.09% and from 0.81 to 0.88, respectively. Furthermore, the integration of MT imagery significantly improved the classification of urban areas and reduced misclassification between forest and low vegetation. In terms of the pixel-based classification, XGB produced slightly better results than RF and outperformed it in terms of computational time. This research improved LCC through the integration of radar and optical MT imagery, which can be useful for areas hampered by frequent cloud cover. Future work should use the aforementioned data for specific applications in remote sensing, as well as evaluate the classification performance of different approaches, such as neural networks or deep learning.


INTRODUCTION
Land-cover classification (LCC) is important for monitoring urban growth, agricultural planning, and deforestation (Souza, Jr et al., 2013; Veloso et al., 2017; Zakeri et al., 2017). Satellite imagery acquired through remote sensing (RS) is widely used in LCC and monitoring because it provides a large amount of spatial data with a daily revisit time. Classification tasks are usually performed with optical satellite imagery. Optical RS uses the sun as an external source of irradiance; however, the acquisition of optical imagery may be limited when cloud cover is extensive (Sun et al., 2019). Being an active microwave sensor, synthetic aperture radar (SAR) can acquire data independently of solar illumination and cloud cover, as microwave radiation penetrates clouds. SAR data are sensitive to the surface roughness and to the textural and dielectric properties of land objects (Feng et al., 2019).
For LCC, many studies preferred optical data to SAR imagery because of a better understanding of the links between the observations (Immitzer et al., 2012; Gašparović et al., 2018; Noi and Kappas, 2018). In recent years, optical and SAR imagery have become freely available, and their integration in time series has improved LCC. Van Tricht et al. (2018) investigated the possibility of crop mapping using joint radar (Sentinel-1; S1) and optical (Sentinel-2; S2) data. The integration of S1 and S2 imagery led to higher classification accuracies compared to optical-only classification. Sonobe et al. (2017) evaluated the suitability of S1 and S2 data for the classification of various crop types. Classification of six crop types on five S1 images and one S2 image with a Random Forest (RF) algorithm achieved an overall accuracy (OA) of 95.7%. The research showed a remarkable potential for crop classification. Gómez (2017) combined S1 and S2 data for LCC. For a pixel-based classification of six land-cover classes with the RF algorithm, the achieved OA was 84.33% and Kappa was 0.81. For the pixel-based approach, the input data were S1, S2, and vegetation indices. The aforementioned research used data from the Sentinel satellites developed within the Copernicus Programme. The European Space Agency (ESA) provides free and open access to S1 radar and S2 optical data, with almost global data availability and a revisit frequency of 6 and 5 days, respectively (Van Tricht et al., 2018).
Besides the high spatial, temporal and spectral resolutions of the data, machine learning methods are fundamental for developing LCC maps over a large area within a short acquisition window. Classifiers such as artificial neural networks (ANN), Support Vector Machine (SVM), or RF outperform traditional parametric approaches owing to their ability to deal with noise and unbalanced datasets (Abdullah et al., 2019). The algorithms used in this research were RF and extreme gradient boosting (XGBoost; XGB). RF is a robust classifier that avoids overfitting through bootstrapping and provides good classification results and computer processing time (Waske and Braun, 2009; Niculescu et al., 2018). XGB is an implementation of gradient boosted decision trees developed by Chen and Guestrin (2016). In recent research on LCC applications, XGB slightly outperformed RF and SVM, albeit with increased processing time (Man et al., 2018; Hirayama et al., 2019). When the number of samples is large, SVM needs a lot of machine memory, which increases computation time, so this algorithm was not used in this research (Mountrakis et al., 2011).
The purpose of this paper is (1) to evaluate how classification accuracy depends on the multitemporal input source (optical data, radar data, or a combination of both) at the pixel level and (2) to evaluate the performance of the machine learning methods for producing LCC maps.

Study area
The city of Lyon, located in the south-east of France, was chosen as the study area (Figure 1). The city is surrounded by the rivers Rhone and Saone, and it is the third most populated city in France. The study area has a mild climate with an average temperature of 11.6 °C and an average annual precipitation of 763 mm. An area of almost 1200 km² (30 km x 40 km) was examined, which includes land-cover classes such as water, bare soil, forest, built-up and low vegetation.

Data
Because of the availability of S1 and S2 data, both multitemporal and multisensor Sentinel data were used for LCC. S1 is an imaging radar mission whose constellation comprises two satellites: S1A and S1B. Both satellites carry a C-band SAR instrument (wavelength ~5.55 cm), capable of providing dual polarisation observations in several measuring modes (Torres et al., 2012). For this research, three S1 GRDH (ground range detected in high resolution) images with a spatial resolution of 10 m were used. The S1 Level-1 images were selected according to their date proximity to the cloud-free S2 imagery (Table 1). S2 also consists of two identical polar-orbiting satellites, and it provides high-resolution multispectral optical imagery in 13 spectral bands. For this research, three optical S2 (Level-2A) scenes with zero cloud coverage were selected for LCC (Table 2). The spectral bands in the visible and near-infrared spectrum, i.e., Blue (B02), Green (B03), Red (B04), and Near-Infrared (B08), were used; they have the same 10 m spatial resolution as the S1 imagery.
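The multitemporal input implied by this selection can be pictured as one feature vector per pixel, with one value per date and channel. The Python sketch below is only an illustration under stated assumptions: the text does not say which S1 polarisations entered the classification, so the `VV`/`VH` list is hypothetical, as are the helper names `feature_names` and `stack_pixel` and the feature naming scheme. The authors worked in R, not Python.

```python
# Hypothetical sketch: naming and flattening a multitemporal feature stack.
S1_CHANNELS = ["VV", "VH"]               # assumption: dual-polarisation channels
S2_BANDS = ["B02", "B03", "B04", "B08"]  # 10 m bands named in the text
N_DATES = 3                              # three S1 and three S2 scenes


def feature_names(channels, n_dates, sensor):
    """Return one feature name per (date, channel) pair, in date-major order."""
    return [f"{sensor}_t{t + 1}_{c}" for t in range(n_dates) for c in channels]


def stack_pixel(per_date_values):
    """Flatten a list of per-date channel lists into one feature vector."""
    return [v for date_values in per_date_values for v in date_values]


s1_names = feature_names(S1_CHANNELS, N_DATES, "S1")
s2_names = feature_names(S2_BANDS, N_DATES, "S2")
combined_names = s1_names + s2_names

print(len(s1_names), len(s2_names), len(combined_names))
```

Under these assumptions, the S1-only, S2-only and combined experiments differ only in which of these name lists is used to select the columns passed to the classifier.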

Sentinel-1 data pre-processing
After downloading the S1 imagery from the ESA's Sentinel Scientific Hub, SAR data pre-processing was implemented using the S1 Toolbox provided by the ESA. S1 Level-1 products are not radiometrically corrected by default, and therefore digital pixel values need to be converted to radiometrically calibrated SAR backscatter (Filipponi, 2019). The radiometric calibration is performed by calculating the sigma naught (σ0). Speckle noise is inevitable in SAR imagery, owing to the coherent mode of backscattered signal processing (Oliver, 1991). Therefore, speckle filtering is necessary for most SAR image analysis. Among the many spatial speckle filters developed for speckle suppression (Shi and Fung, 1994), a 5 x 5 Frost filter was chosen and applied to each image. The Frost filter reduces speckle noise by using local statistics in order to efficiently preserve edges in radar imagery (Frost et al., 1982). This filter has already been used in similar research on LCC using multitemporal (MT) SAR imagery (Waske and Braun, 2009; Maghsoudi et al., 2012), and hence, the Frost filter was applied to the single-date S1 images, which were then stacked together.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B1-2020, 2020 XXIV ISPRS Congress (2020 edition)

Level-1 GRD imagery is not corrected for the geometric distortions caused by terrain topography. Therefore, the Range Doppler terrain correction operator was applied to the S1 imagery in order to improve the geopositioning accuracy (Gašparović et al., 2019), and the scenes were projected to WGS 1984/UTM Zone 31 N. Before the pixel-based LCC, the σ0 values were transformed to dB values through their logarithmic form, as shown in Equation (1):

σ0 (dB) = 10 · log10(σ0)    (1)
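The decibel conversion is the usual 10·log10 scaling of the linear backscatter. A minimal Python sketch of that step follows. The authors worked in R and with the S1 Toolbox, so the function name `sigma0_to_db` and the sample values are illustrative assumptions, not taken from the study.

```python
import math


def sigma0_to_db(sigma0_linear):
    """Convert linear-scale backscatter (sigma naught) to decibels."""
    if sigma0_linear <= 0:
        # The logarithm is undefined for zero or negative backscatter.
        raise ValueError("sigma0 must be positive to take a logarithm")
    return 10.0 * math.log10(sigma0_linear)


# Illustrative linear values only, not measurements from the study.
for value in (1.0, 0.1, 0.01):
    print(value, "->", round(sigma0_to_db(value), 2), "dB")
```

Each tenfold decrease in linear backscatter lowers the result by 10 dB, which compresses the large dynamic range of SAR backscatter into a scale that is easier to compare across land-cover classes.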

Land-cover classification algorithms
For the supervised pixel-based classification, RF and XGB classifiers were tested using R version 3.6.0 (R Core Team, 2016).
RF is a tree-based ensemble algorithm built from a large number of individual decision trees (Breiman, 2001). The classifier trains each tree on a bootstrap sample of the training data (bagging) and considers a random subset of features at each split, which makes it relatively robust to outliers and noise (Rodriguez-Galiano et al., 2012). RF has two main hyperparameters: the number of trees to grow within the model (ntree) and the number of variables available for selection at a node split (mtry). Kulkarni et al. (2012) found that the mtry hyperparameter has a larger impact on classification accuracy than ntree.
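The roles of ntree and mtry can be illustrated with a small Python sketch of the sampling steps only; no trees are actually grown here. The helper names, the seed, and the values of `ntree`, `mtry`, `n_samples` and `n_features` are hypothetical choices for illustration and are not the settings used in this research, which was carried out in R.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (the bagging step)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def candidate_features(n_features, mtry, rng):
    """Pick mtry distinct feature indices to consider at one node split."""
    return sorted(rng.sample(range(n_features), mtry))


def majority_vote(tree_predictions):
    """Combine the class labels predicted by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(42)         # fixed seed so the sketch is repeatable
ntree, mtry = 5, 4              # illustrative values, not the study's settings
n_samples, n_features = 10, 18  # hypothetical training set size and feature count

for tree in range(ntree):
    rows = bootstrap_indices(n_samples, rng)
    split_features = candidate_features(n_features, mtry, rng)
    print(tree, rows, split_features)

print(majority_vote(["forest", "built-up", "forest"]))
```

A larger ntree adds more votes to the ensemble, while mtry controls how much the individual trees differ from one another, since each split sees only a random subset of the features.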
XGB, as an ensemble tree-boosting model, converts weak learners into strong learners. Weak learners are added until no further improvements can be made, and by using a gradient descent algorithm, the loss of the model is minimized (Chen and Guestrin, 2016). XGB has many hyperparameters that need to be optimized, as described by Man et al. (2018).
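The idea of adding weak learners that follow the gradient of the loss can be sketched in plain Python with one-split regression trees (stumps) and a squared loss. This is a generic gradient boosting illustration, not the XGBoost algorithm itself, which additionally uses second-order gradient information and regularisation. The function names, the toy data, and the values of `n_rounds` and `learning_rate` are assumptions made for the example.

```python
def fit_stump(x, residuals):
    """Find the threshold split on a 1-D feature that best fits the residuals."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # skip the maximum so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum(
            (r - right_mean) ** 2 for r in right
        )
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, xi):
    """Return the stump's constant prediction for the side xi falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost(x, y, n_rounds=20, learning_rate=0.3):
    """Additively fit stumps to the negative gradient of the squared loss."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps, losses = [], []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual y - prediction.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [
            pi + learning_rate * stump_predict(stump, xi)
            for xi, pi in zip(x, predictions)
        ]
        losses.append(sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y))
    return base, stumps, losses


# Toy 1-D data, invented for illustration.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0.1, 0.2, 0.1, 0.3, 0.9, 1.0, 0.8, 1.1]
base, stumps, losses = boost(x, y)
print(round(losses[0], 4), round(losses[-1], 4))
```

Each round fits a weak learner to what the current ensemble still gets wrong, and the learning rate shrinks its contribution, so the training loss does not increase from one round to the next.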

Accuracy assessment
In this research, the land-cover classes were selected using common categories (Table 3) described in similar studies (Clerici et al., 2017; Gašparović et al., 2018). The classification accuracy was assessed based on the error matrix (Foody, 2010).

Table 3. Overview of the land-cover classes used in this research.
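The accuracy metrics reported in the following sections (OA, Kappa, user's accuracy UA, and producer's accuracy PA) can all be derived from an error matrix. The Python sketch below uses the standard definitions and assumes that rows hold the classified labels and columns hold the reference labels; the paper does not state its matrix orientation, and the two-class matrix is hypothetical rather than taken from its tables.

```python
def accuracy_metrics(matrix):
    """Compute OA, Kappa, UA and PA from a square error matrix.

    Assumed layout: rows are classified labels, columns are reference labels.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = sum(diagonal) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (overall - expected) / (1 - expected)
    users = [d / r for d, r in zip(diagonal, row_totals)]      # 1 - commission error
    producers = [d / c for d, c in zip(diagonal, col_totals)]  # 1 - omission error
    return overall, kappa, users, producers


# Hypothetical two-class matrix, not taken from the paper's tables.
example = [[50, 5], [10, 35]]
oa, kappa, ua, pa = accuracy_metrics(example)
print(round(oa, 4), round(kappa, 4))
print([round(v, 4) for v in ua], [round(v, 4) for v in pa])
```

With this layout, a class whose PA exceeds its UA has a lower omission error than commission error, which is how the per-class results are interpreted later in the text.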

RESULTS AND DISCUSSION
The results of the LCC using the two machine learning methods described in Section 3 are shown here. In Section 4.1, results using the single-date S1 and S2 imagery are discussed, and in Section 4.2, results using the MT S1, and S2 imagery are discussed.

Land-cover classification on a single-date S1 and S2 imagery
In order to evaluate how classification accuracy changes with MT imagery, classification was first performed using single-date S1 and S2 imagery. For S1 and S2 imagery, June 4th and June 2nd, 2019, were chosen as reference dates, respectively. OA and Kappa values for S1, S2, and the combination of S1 and S2 are shown in Table 4.
For the S1 classification, XGB performed better than RF, with OA values of 72.02% and 70.41%, respectively, whereas in the S2 classification, RF achieved higher accuracy metrics than the XGB method, with an OA of 84.17% and Kappa of 0.80. In single-date SAR land-cover classification, speckle noise is a challenge and must be reduced for quality image interpretation and further analysis (Xiao et al., 2003). Optical imagery, like S2, has already proven itself in LCC applications, so we wanted to investigate the integration of optical and SAR data, similar to the research of Van Tricht et al. (2018). XGB performed slightly better than RF for the combined S1 and S2 classification, with an OA of 85.51% and Kappa of 0.81. Similar to the LCC results obtained in Hirayama et al. (2019), XGB slightly outperformed RF. Irrespective of the classifier used for the combined S1 and S2 pixel-based classification, the increased OA and Kappa values agree with similar research in LCC (Gómez, 2017; Abdi, 2019). For better discrimination of the land-cover classes, the error matrix, along with the user's accuracy (UA) and producer's accuracy (PA) metrics, for the XGB classification using integrated S1 and S2 imagery, is shown in Table 5. As shown in Table 5, the water class achieved the highest UA and PA values of 98.4% and 98.5%, respectively. Increased wetland classification accuracy has been reported by Kaplan and Avdan (2018). Also, for the S1 imagery, the VV polarization is better suited to mapping calm water surfaces than the VH polarization (Martinis et al., 2018). Slightly lower UA values were obtained for the bare soil and forest classes. Overall, the forest class has the highest omission error, 21.8%. According to Chust et al. (2004), MT SAR imagery is more efficient for vegetation mapping than single-date imagery.
The built-up class attained higher PA values than UA values, which means that the XGB method correctly identified most ground truth pixels of the built-up class, but the commission error was higher than the omission error. Including Grey Level Co-occurrence Matrix (GLCM) texture variables improves classification results (Jin et al., 2018), especially for various built-up classes (Zakeri et al., 2017). The low vegetation class was mostly misclassified as the forest class, and vice versa. In order to increase LCC accuracy using single-date S1 and S2 data, textural parameters (e.g., GLCM) for S1 (Idol et al., 2017) and various vegetation indices (e.g., NDVI, MSAVI) (Clerici et al., 2017) should be included and investigated along with the machine learning methods used in this research.

Land-cover classification on a multitemporal S1 and S2 imagery
According to Vuolo et al. (2018), MT classification provides better results than single-date acquisitions within sub-optimal temporal windows. Therefore, the available S2 imagery with zero cloud coverage before and after the reference date, as described in Table 2, was chosen for LCC. Afterward, S1 imagery matching the S2 acquisition times was used for MT LCC (Table 1). The accuracy assessment of the MT classification for S1 imagery, S2 imagery, and the combination of S1 and S2 is shown in Table 6.
Table 6. OA and Kappa values of RF and XGB, as applied to the multitemporal S1 and S2 imagery.
For the classification obtained using MT S1 imagery, XGB performed better than RF, with OA and Kappa values of 86.28% and 0.82 against 84.47% and 0.80, respectively. MT S1 imagery significantly increased classification accuracy for both the RF and XGB methods. This improvement is attributed to spatial speckle filtering of the single-date S1 images, which were then stacked together; Maghsoudi et al. (2012) reported that MT filtering offers no advantage over spatial speckle filters for classification tasks. As a result, OA on MT S1 imagery increased by 14.06 and 14.26 percentage points for the RF and XGB methods, respectively. In the MT S2 classification, XGB performed better than RF, with OA values of 89.72% and 88.26%, respectively. The results obtained for MT S2 imagery confirm that LCC methods applied to MT imagery perform better than single-date mapping methods (Belgiu and Csillik, 2018; Vuolo et al., 2018), since phenological patterns can be identified in time-series datasets. Overall, the highest classification accuracy in this research was obtained using integrated MT S1 and S2 imagery. For the RF method, OA was 90.78% and Kappa 0.88, whereas the XGB method achieved an OA of 91.09% and Kappa of 0.88. Sun et al. (2019) used S1, S2, and Landsat-8 data for crop-type mapping. Their MT and multi-source combination produced the highest OA of 93% and Kappa of 0.91 with an RF classifier. According to the authors, although the use of S1 imagery affected the LCC, its ability to classify crop type was weaker than that of S2 data. Viskovic et al. (2019) used MT S1 and S2 data for crop classification. RF outperformed other classifiers (e.g., SVM, K-nearest neighbors) with an OA of 84.20% and Kappa of 0.82. Furthermore, in order to compare and evaluate MT classification accuracy for separate land-cover classes, Table 7 shows the error matrix along with the UA and PA metrics for the XGB classifier.

Table 7. Error matrix for the XGB classification using integrated MT S1 and S2 imagery, with UA [%], PA [%] metrics for each class.
For the MT S1 and S2 classification, the water class achieved the highest UA and PA values of 99.5% and 97.5%, respectively. Since no precipitation events occurred during the acquisition of the radar imagery, the bare soil class was very well classified using MT data. Otherwise, soil moisture, which increases the dielectric constant, has a major effect on the backscatter magnitude (Molijn et al., 2018). Regardless of whether LCC was performed on single-date or MT S1 and S2 imagery, the bare soil class achieved high PA results and was well classified on the LCC maps. The built-up class achieved the highest increase in UA from single-date to MT LCC. MT imagery significantly helped to correctly separate built-up areas, with a UA of 98.0% against a UA of 66.9% obtained for LCC using single-date imagery. Using single-date speckle filtering on MT imagery significantly reduced confusion between the forest and built-up classes (i.e., from 1744 false-negative forest pixels to 101 pixels). Besides MT imagery, GLCM texture features should be included in order to improve the classification OA and the discrimination of urban areas (Dell'Acqua et al., 2003). In this research, the highest contribution of adding MT S1 and S2 imagery, in terms of decreased omission and commission errors, was for the forest and low vegetation classes. For the forest class, UA and PA increased by 7.6% and 8.3%, respectively, and the number of forest pixels misclassified as low vegetation decreased from 1347 to 619. Using the Frost spatial filter for speckle reduction on the single-date S1 images, which were then stacked together, efficiently preserved edges and features, which facilitated the differentiation between the forest and low vegetation classes. Likewise, Rüetschi et al. (2018) showed that MT SAR imagery has the potential to supplement optical RS data for the mapping of mixed forests. With such data, monitoring at various spatial and temporal scales can be used to quantify changes in species composition due to climate change.

Figure 2. Example subset for a central part of the study area: (a) S2 'true color' composite; (b) XGB classification using single-date S1 and S2 imagery; (c) XGB classification using MT S1 and S2 imagery.

Figure 2 shows classification maps obtained with S1 and S2 data using an XGB classifier. It can be seen that granular noise decreased with the additional temporal features of S1 and S2 imagery. The major improvement in land-cover classification using MT imagery occurred for the built-up class, because buildings and other artificial built-up structures are stable objects. Additionally, the salt-and-pepper effect was reduced for the forest and low vegetation classes owing to the speckle filtering, and the MT classification map has sharper and clearer boundaries between the land-cover classes. Also, optical satellite imagery (i.e., S2) improved the distinction of different features (e.g., roads, still water), which are commonly misclassified using radar data because of their similar backscatter patterns (Haas and Ban, 2017).

Chen and Guestrin (2016) reported that the XGB algorithm is designed for speed and performance by using gradient boosted decision trees, and hence, processing times for RF and XGB were investigated (Figure 3). The processing time for the single-date data varied between 17 min and 31 min for RF and between 3 min and 6 min for XGB. Using MT imagery, the computational time varied between 20 min and 26 min for RF, and between 5 min and 12 min for XGB.

This research evaluated the integration of multisource and multitemporal data provided by ESA for LCC. Regardless of whether LCC was performed on single-date or MT imagery, the highest classification results were achieved with integrated S1 and S2 imagery (Table 4 and Table 6). Gómez (2017) mentioned that the benefits of joining S1 and S2 data are more applicable to the pixel-based than to the polygon-based approach.
Furthermore, classification accuracy significantly improved on MT SAR imagery, with an OA increase of 14.06 percentage points and a Kappa increase of 0.19. Temporal series of SAR imagery in combination with speckle filtering improve classification results, as reported in Skriver et al. (2011) and Maghsoudi et al. (2012). Future research should address the integration of GLCM texture features with MT SAR imagery, which can be used for areas that are frequently covered with clouds. The second part of the research was to evaluate the RF and XGB classifiers for producing LCC maps. In this paper, for the pixel-based classification, XGB produced slightly better results than RF and outperformed it in terms of computational time. This gradient boosting algorithm has gained popularity in various machine learning and data science competitions, and most recently in producing LCC maps (Man et al., 2018; Hirayama et al., 2019). Future research should compare its performance with different approaches, such as SVM, ANNs, and kernel-based extreme learning machine (KELM) (Clerici et al., 2017; Sonobe et al., 2017; Zhang et al., 2019).

CONCLUSIONS
In this research, classification accuracy was examined for LCC on multitemporal input data (S1, S2, and their integration) using two classifiers (RF and XGB).
A combination of multitemporal S1 and S2 imagery successfully classified five land-cover classes with the XGB classifier, with an OA of 91.09% and Kappa of 0.88. Furthermore, the integration of MT imagery significantly improved the classification of urban areas and reduced misclassification between forest and low vegetation. It should also be noted that overall classification accuracy for S1 imagery increased from 72.02% to 86.28% with the use of MT imagery, which can be useful for areas hampered by frequent cloud cover. This research showed that the RF and XGB algorithms are robust and can be used for LCC. In terms of computational time, XGB outperformed RF, whereas accuracy metrics were similar, so the trade-off between accuracy and processing time must be considered.
This research evaluated the potential of radar and optical imagery for land-cover classification, so future work should focus on specific applications (e.g., crop classification, vegetation monitoring, urban area mapping). Additionally, neural networks and deep learning methods should be examined for land-cover classification on remote sensing data.