MAPPING VEGETATION AND MEASURING THE PERFORMANCE OF MACHINE LEARNING ALGORITHMS IN LULC CLASSIFICATION OVER A LARGE AREA USING SENTINEL-2 AND LANDSAT-8 DATASETS, WITH DEHRADUN AS A TEST CASE
- Dept. of Computer Science and Engineering, Rajiv Gandhi Institute Of Petroleum Technology, Jais, Amethi, Uttar Pradesh, India
Keywords: Google Earth Engine, Landsat, Sentinel, Machine learning Algorithm, Classification
Abstract. In recent years, the data science and remote sensing communities have begun to converge, driven by user-friendly programming tools, access to high-end consumer computing power, and the availability of free satellite data. In particular, publicly available data from the European Space Agency’s Sentinel missions and the American Landsat Earth observation missions have been used in a wide range of remote sensing applications. Google Earth Engine (GEE) is one tool that provides public access to these datasets, and the large volume of data it hosts is used for computation and analysis. In this article, we compare the classification performance of four supervised machine learning algorithms: Classification and Regression Tree (CART), Random Forest (RF), Gradient Tree Boosting (GTB), and Support Vector Machine (SVM). The study area is located at 30.3165° N, 78.0322° E, near the Himalayan foothills, and is described by four land-use/land-cover (LULC) classes. The satellite imagery used for classification consisted of multi-temporal scenes from Sentinel-2 and Landsat-8 covering spring, summer, autumn, and winter conditions. We collected a total of 2084 sample points, of which 536, 506, 505, and 540 belong to the urban, water, forest, and agriculture classes, respectively; these points were divided into training (70%) and evaluation (30%) subsets.
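The 70%/30% division of sample points described above can be illustrated with a short Python sketch. It is not the authors' Google Earth Engine workflow: the per-class (stratified) splitting, the fixed random seed, the function name `stratified_split`, and the placeholder point identifiers are all assumptions made for illustration; only the per-class counts are taken from the text.

```python
import random
from collections import defaultdict

# Per-class counts as quoted in the abstract (they sum to 2087).
# The point identifiers generated below are placeholders, not real samples.
CLASS_COUNTS = {"urban": 536, "water": 506, "forest": 505, "agriculture": 540}


def stratified_split(samples, train_percent=70, seed=42):
    """Split (point_id, label) pairs into training and evaluation lists.

    Each class is shuffled and cut separately, so both subsets keep
    roughly the same class proportions as the full sample set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for point_id, label in samples:
        by_class[label].append((point_id, label))

    train, evaluation = [], []
    for label in sorted(by_class):
        points = by_class[label]
        rng.shuffle(points)
        # Integer arithmetic avoids floating-point rounding surprises.
        cut = len(points) * train_percent // 100
        train.extend(points[:cut])
        evaluation.extend(points[cut:])
    return train, evaluation


samples = [
    (f"{label}_{i}", label)
    for label, count in CLASS_COUNTS.items()
    for i in range(count)
]
train, evaluation = stratified_split(samples)
print(len(train), len(evaluation))
```

Splitting per class rather than over the pooled points is a design choice of this sketch; it keeps the evaluation subset from under-representing any single class, which matters when per-class accuracy is later read from the error matrix.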
Accuracy was assessed using metrics derived from an error (confusion) matrix, namely accuracy and Cohen’s kappa coefficient. For the Landsat-8 dataset, we obtained CART (accuracy 93.52%, kappa coefficient 91.36%), Random Forest (accuracy 95.86%, kappa coefficient 94.48%), Gradient Tree Boost (accuracy 95.33%, kappa coefficient 93.37%), and Support Vector Machine (accuracy 73.54%, kappa coefficient 76.28%). For the Sentinel-2 dataset, we obtained CART (accuracy 89.24%, kappa coefficient 85.64%), Random Forest (accuracy 91.45%, kappa coefficient 88.59%), Gradient Tree Boost (accuracy 87.71%, kappa coefficient 83.58%), and Support Vector Machine (accuracy 84.96%, kappa coefficient 79.99%). Further analysis of the accuracy and of the machine learning algorithms is presented in the results section.
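To make the two reported metrics concrete, the following Python sketch computes accuracy and Cohen’s kappa from an error matrix using the standard definitions: observed agreement p_o is the diagonal sum divided by the total count, expected agreement p_e is the sum of products of matching row and column totals divided by the squared total, and kappa = (p_o - p_e) / (1 - p_e). The function names and the 4 x 4 matrix are hypothetical and purely illustrative; the matrix is not data from this study.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square error matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohens_kappa(matrix):
    """Cohen's kappa: agreement corrected for chance agreement."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement from the marginal totals of each class.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical error matrix (rows: reference class, columns: predicted class),
# ordered as urban, water, forest, agriculture. Illustrative values only.
example_matrix = [
    [150, 2, 3, 6],
    [1, 148, 2, 1],
    [4, 1, 140, 7],
    [8, 0, 6, 148],
]

print(f"accuracy = {overall_accuracy(example_matrix):.4f}")
print(f"kappa    = {cohens_kappa(example_matrix):.4f}")
```

Under these definitions, kappa computed from a given error matrix cannot exceed the accuracy computed from the same matrix, because the chance-agreement term is non-negative. This gives a quick consistency check when reading paired accuracy and kappa values.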