ASSESSMENTS OF SENTINEL-2 VEGETATION RED-EDGE SPECTRAL BANDS FOR IMPROVING LAND COVER CLASSIFICATION

The Multi Spectral Instrument (MSI) onboard Sentinel-2 records information in the Vegetation Red-Edge (VRE) spectral domain. In this study, the contribution of the VRE bands to land cover classification was evaluated using a Sentinel-2A MSI image of East Texas, USA. Two classification scenarios were designed, one excluding and one including the VRE bands. A Random Forest (RF) classifier was used to generate land cover maps and to evaluate the contributions of the different spectral bands. Including the VRE bands increased the overall classification accuracy by 1.40 percentage points, which was statistically significant. Both the confusion matrices and the land cover maps indicated that the largest gains were for vegetation-related land cover types, especially agriculture. A comparison of the relative importance of each band showed that the most beneficial VRE bands were Band 5 and Band 6. These results demonstrate the value of the VRE bands for land cover classification.

* Corresponding author. Tel.: 86 28 61831586. E-mail address: binbinhe@uestc.edu.cn


INTRODUCTION
Land cover affects many parts of the human and physical environments, and land cover mapping has a long history in the remote sensing community. Many land cover maps have been produced using moderate- to high-spatial-resolution remotely sensed data, such as Landsat and SPOT (Satellite Pour l'Observation de la Terre) datasets (Bartholome and Belward, 2005; Fuller, et al., 1994; Homer, et al., 2004). Recently, the availability of Sentinel-2, with more spectral bands, has provided significant new opportunities and challenges for mapping land cover. Compared with data from these earlier optical satellites, the Multi Spectral Instrument (MSI) onboard Sentinel-2 additionally records information in the Vegetation Red-Edge (VRE) spectral domain (Figure 1). However, adding more information does not always improve classification accuracy, because additional input variables can be highly correlated and require the classifier to estimate more parameters (Pal, et al., 2006; Zhu, et al., 2016). Thus, the necessity and effectiveness of the VRE bands of Sentinel-2 MSI data need to be assessed for improving land cover classification accuracy.

Study area and Sentinel-2 MSI data
The study area, with an area of 100 km², is located in East Texas, USA (Figure 2). It contains a variety of land cover types, including forest, grassland, water, urban, and agriculture. This diversity provides a good opportunity for testing the performance of the VRE bands in classification.
A cloud-free Sentinel-2A MSI Level-1C image (T001639) acquired on 29 September 2016 (Figure 2) was downloaded from the U.S. Geological Survey (USGS) EarthExplorer (http://earthexplorer.usgs.gov/). The Level-1C image is a Top of Atmosphere (TOA) reflectance product that has already undergone radiometric and geometric correction with sub-pixel accuracy (HUHET, 2015). To reduce atmospheric effects, the TOA data were converted into Bottom of Atmosphere (BOA) reflectance using the Sen2Cor tool (E.S.A, 2017). All spectral bands were then resampled to 10 m spatial resolution using the nearest-neighbour approach.
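As a rough illustration of the resampling step, the sketch below upsamples a 20 m band to a 10 m grid by pixel replication, which is what nearest-neighbour resampling reduces to when the scale factor is an integer and the two grids are aligned. The function name `resample_nearest` and the toy reflectance values are hypothetical; this is not the tool used in the study, only a minimal Python sketch of the idea.

```python
def resample_nearest(band, factor):
    """Upsample a 2-D grid (list of rows) by an integer factor.

    Assumes an integer scale factor and aligned grids, so each output
    pixel simply copies the value of the input pixel that contains it.
    """
    if not isinstance(factor, int) or factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for row in band:
        # Repeat every pixel `factor` times along the row (x direction).
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row `factor` times (y direction).
        out.extend(list(wide_row) for _ in range(factor))
    return out


# Toy 2 x 2 "20 m" band with made-up BOA reflectance values.
band_20m = [[0.10, 0.20],
            [0.30, 0.40]]

# 20 m -> 10 m corresponds to a factor of 2, giving a 4 x 4 grid.
band_10m = resample_nearest(band_20m, 2)
```

In Sentinel-2 MSI data, Bands 5, 6, 7, 8A, 11, and 12 are delivered at 20 m, so a factor of 2 applies to them. Nearest-neighbour resampling leaves the original reflectance values unchanged, since no new values are interpolated.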

Land cover categories and sample data
The sample data used in this study were extracted from the USGS Land Cover Trends (LCT) 2000-2011 dataset (USGS, 2017). Although the LCT was originally designed to analyse land cover change in the United States (Loveland, et al., 2002), it has proven adequate as reference data for classification (Zhu, et al., 2016). The LCT dataset consists of many randomly distributed 10 km × 10 km sample blocks throughout the United States, in which 11 land cover categories (detailed at http://landcovertrends.usgs.gov/main/classification.html) were manually interpreted at a scale of 30 m for multiple target years. Because this study focuses on classification, we excluded the two disturbance classes in the LCT (i.e., mechanically disturbed and non-mechanically disturbed) and used only the remaining 9 stable categories (Table 1). A total of 1,000 samples were randomly selected from each of the 9 land cover types based on the LCT mapping results from 2010 (the target year nearest to the acquisition date of the Sentinel-2A MSI image). To further ensure the correctness of the randomly selected samples, we carefully examined each of them using high-resolution images in Google Earth™ and the Sentinel-2A image itself. After removing incorrect pixels, the remaining 3,851 samples were used as the reference data in this study (Table 1). Note that there is no ice or snow in the study area.

Table 1. The 9 categories used in this study. NA: Not Available.

Random forest classifier
A Random Forest (RF) classifier was used to generate land cover maps and to evaluate the effect of different input variables on map accuracy. The RF classifier is an ensemble classification method that uses trees as base classifiers {h(x, Θk), k = 1, …}, where x is the input vector and {Θk} are independent and identically distributed random vectors (Breiman, 2001). It can run on large databases, handle thousands of input variables, and assess the relative importance of the different variables. The RF classifier is easy to configure: only the number of trees and the number of predictor variables tried at each node need to be set. In this study, a total of 500 trees were grown, and the square root of the total number of input variables was used as the number of predictor variables at each node (Zhu, et al., 2016).
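The configuration described above could be sketched as follows using scikit-learn, which is an assumption here: the software actually used in the study is not stated. The band name lists, the helper `fit_rf`, and the synthetic data are hypothetical stand-ins for the real BOA reflectance samples. Note also that scikit-learn's `feature_importances_` is an impurity-based measure, which may differ from the importance measure used to produce the study's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Band subsets for the two scenarios (names are illustrative labels only).
BANDS_S1 = ["B2", "B3", "B4", "B8", "B11", "B12"]            # without VRE
BANDS_S2 = ["B2", "B3", "B4", "B5", "B6", "B7",
            "B8", "B8A", "B11", "B12"]                       # with VRE

# Synthetic stand-in data (NOT the study samples): 300 pixels, 3 classes.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=300)
X = rng.random((300, len(BANDS_S2))) + 0.2 * labels[:, None]


def fit_rf(features, targets, seed=0):
    """Fit an RF with 500 trees and sqrt(n_features) variables per split."""
    rf = RandomForestClassifier(
        n_estimators=500,      # number of trees grown
        max_features="sqrt",   # predictors tried at each node
        random_state=seed,
    )
    return rf.fit(features, targets)


# Scenario 1 uses only the columns of the non-VRE bands.
cols_s1 = [BANDS_S2.index(name) for name in BANDS_S1]
rf_s1 = fit_rf(X[:, cols_s1], labels)
rf_s2 = fit_rf(X, labels)

# Impurity-based relative importance of each band in Scenario 2.
importance = dict(zip(BANDS_S2, rf_s2.feature_importances_))
```

With 10 input bands, the square-root rule tries 3 predictors at each node; with 6 bands it tries 2.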

Accuracy evaluation
The randomly selected samples were used as the basis for assessing the accuracy of the classification results. All samples were randomly split into two groups: 80% were used to train the RF classifier and the remaining 20% were used to assess classification accuracy (Fielding, et al., 1997). This assessment was repeated 50 times, and the confusion matrices, producer's accuracies, user's accuracies, and overall classification accuracies were used to compare the results from the different scenarios. In addition, a paired t-test at the 95% confidence level was performed on the accuracies of the classification results to test whether the observed accuracy increase was statistically significant.
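The accuracy measures and the significance test described above can be illustrated with a minimal Python sketch. The function names, the confusion matrix layout (rows as reference classes, columns as classified classes), and all numbers below are assumptions made for illustration; none of them are results from this study. The toy example uses 5 paired repetitions, for which the two-sided 5% critical value of Student's t with 4 degrees of freedom is 2.776; the 50 repetitions used in the study would give 49 degrees of freedom.

```python
import math
import statistics


def accuracy_metrics(cm):
    """Return (overall, producer's, user's) accuracies.

    Assumed layout: cm[i][j] is the number of samples of reference
    class i that were classified as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    ref_totals = [sum(cm[i]) for i in range(n)]                        # row sums
    map_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]   # column sums
    producers = [cm[i][i] / ref_totals[i] for i in range(n)]  # 1 - omission error
    users = [cm[j][j] / map_totals[j] for j in range(n)]      # 1 - commission error
    return correct / total, producers, users


def paired_t_statistic(a, b):
    """t statistic of the paired differences b - a (df = len(a) - 1)."""
    diffs = [y - x for x, y in zip(a, b)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


# Toy 3-class confusion matrix (illustrative counts only).
cm = [[40, 5, 5],
      [10, 30, 10],
      [0, 5, 45]]
overall, producers, users = accuracy_metrics(cm)

# Made-up overall accuracies from 5 paired repetitions of two scenarios.
acc_s1 = [0.71, 0.72, 0.70, 0.73, 0.72]
acc_s2 = [0.73, 0.73, 0.72, 0.74, 0.74]
t_stat = paired_t_statistic(acc_s1, acc_s2)

T_CRIT_DF4 = 2.776  # two-sided, alpha = 0.05, 4 degrees of freedom
significant = abs(t_stat) > T_CRIT_DF4
```

Pairing matters here because both scenarios are evaluated on the same train/test split in each repetition, so the test is applied to the per-split differences and not to two independent sets of accuracies.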

RESULTS AND DISCUSSION
The confusion matrices for the two classification scenarios are presented in Table 2 and Table 3, respectively. Using only Bands 2, 3, 4, 8, 11, and 12 (Scenario 1), the overall classification accuracy was 71.75% (Table 2). When the four VRE bands were added (Scenario 2), the overall accuracy rose to 73.15% (Table 3), an increase of 1.40 percentage points, which was statistically significant in the paired t-test (at the 95% level). With the exception of the producer's and user's accuracies for water, the user's accuracy for developed, and the producer's accuracy for barren, all class accuracies increased to varying degrees when the VRE bands were added. However, large areas of developed and barren land were incorrectly identified as mining regardless of whether the VRE bands were included (red circle in Figure 3c). This commission error was mostly due to the similar spectral features of these classes (Figure 4). The visual assessment indicated that the real accuracies for mining may not be as good as the statistical accuracies (Tables 2 and 3). These results demonstrate that integrating the VRE bands helps to improve classification accuracy, especially for the vegetation-related land cover types (forest, grassland, agriculture, and wetland). By carefully comparing the land cover map derived from the inputs including the VRE bands (Figure 3a) with the one derived from the inputs excluding them (Figure 3b), we found that the basic pattern of the two land cover maps was very similar.
However, the most obvious differences between the two maps were located along the boundaries between land cover types and within agriculture (Figure 3d). These differences show that the VRE bands of the Sentinel-2 MSI image make a particular contribution to the identification of agriculture, which agrees with the substantial increase in its accuracies (both the producer's accuracy and the user's accuracy increased by more than 2%). To analyse the relative contributions of the spectral bands to land cover classification, we present their relative importance in Figure 5. Band 3 contributed most to land cover classification for the Sentinel-2 MSI image. Moreover, except for Band 8A, the remaining three VRE bands (Bands 5, 6, and 7) also helped to improve the classification accuracy. The contributions of Band 5 and Band 6 to land cover classification were no less than those of Bands 2, 4, 8, and 12.

Figure 5. Relative importance of the spectral bands (Bands 2, 3, 4, 5, 6, 7, 8, 8A, 11, and 12) of Sentinel-2 MSI data.

CONCLUSION
In a test on an autumn (September) Sentinel-2 image at 10 m resolution over East Texas, USA, including the VRE bands improved the overall classification accuracy by 1.40 percentage points, which was statistically significant. Adding the VRE bands helped to identify vegetation-related land cover types, especially agriculture. Of the four VRE bands of Sentinel-2 MSI data, Band 5 and Band 6 made substantial contributions to the improvement of land cover classification, which were no less than those of the conventional bands (Bands 2, 4, 8, and 12).

Figure 2. Study area and data. The middle background is the true colour composite of the Sentinel-2 MSI image (Bands 4, 3, and 2). A total of 10 Land Cover Trends (LCT) sample blocks (three parts) are located within the image. The two ignored disturbance classes (mechanically disturbed and non-mechanically disturbed) are marked by a dotted box.

Figure 3. A subset area (5.7 km × 4 km) of the LCT region shown in Figure 2, illustrating the classification results from different input variables. a) Map from Scenario 2. b) Map from Scenario 1. c) True colour composite image (Bands 4, 3, and 2). d) Difference between the maps derived from Scenarios 1 and 2.

Figure 4. Average BOA reflectance of each band for developed, mining, and barren, based on all sample points. BOA: Bottom of Atmosphere.