PERFORMANCE ASSESSMENT OF CLASSIFICATION ALGORITHMS FOR LAND USE / LAND COVER CHANGE USING SENTINEL-2 DATA: A CASE STUDY OF TIRUPPUR

Urbanization is one of the most vibrant geographical phenomena in human history. It is characterized by the expansion of a city from its core to its peripheral areas, a process that involves economic development, social and political forces, and population density. In a highly populated country like India, very rapid urbanization unavoidably converts natural land cover into urban land use. The study region, Tiruppur, is known as the knitwear capital of India; this industry induces urban development in the region, which results in the modification of the natural land cover. Land use and land cover (LULC) is considered an important indicator for understanding the interaction between the natural landscape and human activities. Research on land use and land cover change using remote sensing technology has a long history, and advances in remote sensing and GIS techniques now provide fine-resolution data sets. Sentinel-2B imagery was chosen for this study for two main reasons: compared to Landsat imagery it has a high spatial resolution of 10 m, and its radiometry includes three vegetation red edge bands. These two characteristics make Sentinel-2B data appealing for LULC mapping. Different types of classification algorithms have been used to perform land use and land cover mapping. The study aims to create a land use and land cover classification of Tiruppur from Sentinel-2B satellite imagery and to compare different algorithms. The commonly known classification algorithms K-means, IsoData, support vector machines (SVMs), and maximum likelihood (ML) are adopted for investigation. This is followed by the selection of training pixels from the remaining classes to perform and compare different supervised learning algorithms for the first- and second-level classification in terms of accuracy rates.
Accuracy was assessed through metrics derived from an error matrix, but primarily overall accuracy and the kappa coefficient were used to rank the algorithms. After the comparison, the most accurate algorithm was suggested for the mapping of urban areas. The support vector machine (SVM) produced the highest overall accuracy and kappa coefficient, which is attributed to the algorithm's relatively small number of complex decision boundaries. The results help in understanding the performance of the classification algorithms for future studies.


INTRODUCTION
In the 21st century the earth has been significantly transformed by both natural and anthropogenic factors to satisfy human needs (Foley et al., 2005). Ecosystem function and structure have been compromised by human activities, resulting in greater vulnerability of places, people, economic dynamics, and the climatic system (Kasperson and Kasperson, 2001; Ogle et al., 2017; Tyson et al., 2001). The magnitude and extent of land use/land cover (LULC) changes underway in many parts of the world (Feranec et al., 2017; Fuchs et al., 2015) are influenced by socio-economic and biophysical factors. These determinants are directly related to the functioning of local and national markets, international and external policies, as well as demographic and environmental conditions (Turner et al., 1993). Land use/land cover changes play a significant role in natural processes, as these dynamic changes directly disturb the equilibrium state of the ecosystem, i.e., they impact the environment, agriculture and forests (Rahaman et al., 2017). In the early stages, land use classifications were interpreted manually, which was time-consuming and made temporal analysis difficult (Nemani and Running, 1997). In recent years, advances in remote sensing data products have made it possible to evaluate complex natural processes (Roodposhti et al., 2020; Tripathy and Kumar, 2019). GIS and remote sensing are a cost-effective and accurate alternative technique for understanding landscape dynamics (Raziq et al., 2016); they are very useful in the formulation, implementation and monitoring of LULC change (Masilamani, 2012). In this context, the study region was selected for its diversified land use/land cover features, and various image classification algorithms were used (i.e., IsoData, K-means, maximum likelihood, and support vector machine). Each algorithm was applied to high-resolution Sentinel-2B satellite data.
The aim of the present study is to assess the performance of classification algorithms for land use/land cover change in an urban area and its environs, and to identify the suitable algorithm for the study area through accuracy assessment.

STUDY AREA
The study area, Tiruppur, is known as the knitwear capital of India; this industry induces urban development in the region, which results in the modification of the natural land cover. Tiruppur Corporation is located at N to and N and on the banks of the Noyyal River (Fig. 1). It covers an area of 1552.98 sq. km and is situated 450 kilometers southwest of the state capital Chennai and about 50 kilometers east of Coimbatore. The climate in Tiruppur is tropical, with mean maximum and minimum temperatures varying between 35 and 22 °C (95 and 72 °F). The total population of the corporation as per the 2011 census is 8,77,778 individuals. Tiruppur was selected because almost 80 percent of the study area is covered by urban land cover. The study mainly focuses on LULC classification and, specifically, concentrates on urban land use classification. It also helps to identify the suitable classifier for urban studies.

DATA AND METHODOLOGY
The following methodology was applied to the Sentinel-2B data sets of the study region. The Sentinel-2B imagery was collected from the United States Geological Survey (USGS) EarthExplorer in April 2020, and its spatial resolution is about 10 m. The study attempts to assess the performance of the different classifiers and to identify the suitable classifier for land use/land cover (LULC) classification. After image preprocessing (atmospheric correction, layer stacking, and regional subsetting), the Sentinel-2B data set undergoes supervised and unsupervised classification for the performance assessment and the identification of the suitable classifier. The detailed methodological workflow of the present work is shown in Figure 2.

Unsupervised Classification:
In unsupervised classification, pixels of unknown class are grouped into image clusters (Kalpana and Thanushkodi, 2010). In this study, unsupervised classification algorithms are used to detect the LULC pattern; instead of collecting training data sets, the analyst specifies only the number of classes and the number of iterations. IsoData and K-means are the unsupervised classifiers used. In IsoData, evenly distributed pixels are clustered and the remaining pixels are grouped together based on a defined threshold. In K-means, clusters are formed by gathering similar pixels around the center of each cluster (Priyadarshini et al., 2018).
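The K-means procedure described above can be illustrated with a minimal sketch. It is not the implementation used in the study: the function name `kmeans`, the two-band pixel values, and the choice of initial centers are assumptions made only for illustration, and Euclidean distance in spectral space is assumed as the similarity measure.

```python
import math

def kmeans(pixels, centers, iterations=10):
    """Group pixel spectra around cluster centers (plain K-means)."""
    centers = [list(c) for c in centers]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins the nearest cluster center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in pixels
        ]
        # Update step: each center moves to the mean of its member pixels.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(band) / len(members) for band in zip(*members)]
    return labels, centers

# Hypothetical two-band reflectances (for example red and near-infrared).
pixels = [(0.10, 0.50), (0.12, 0.55), (0.11, 0.52),
          (0.30, 0.20), (0.32, 0.22), (0.31, 0.18)]
labels, centers = kmeans(pixels, centers=[pixels[0], pixels[3]])
print(labels)
```

In this toy input the first three pixels fall into one cluster and the last three into the other. In practice the analyst fixes only the number of classes and iterations, as noted above, and assigns LULC names to the clusters afterwards.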

Supervised Classification:
In supervised classification, training samples are collected and each pixel in the imagery is assigned to its respective LULC class with the aid of the selected ROI pixels (Twisa and Buchroithner, 2019). The following supervised algorithms were applied for LULC classification in this study. The Maximum Likelihood (ML) classifier assumes that each band's histogram is normally distributed and calculates the probability that a pixel belongs to each class, assigning the pixel to the class with the highest probability. The Support Vector Machine (SVM) is a binary classifier that works by finding the optimal separating hyperplane between classes, focusing on the training samples.
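The maximum likelihood rule described above can be sketched as follows. This is a simplified illustration and not the study's implementation: it treats the bands as independent normal distributions (a diagonal covariance), whereas operational maximum likelihood classifiers normally use the full per-class covariance matrix. The class names, training values, and function names are hypothetical.

```python
import math

def fit_class_stats(training):
    """Estimate per-band mean and variance for every training class."""
    stats = {}
    for name, samples in training.items():
        n = len(samples)
        means = [sum(band) / n for band in zip(*samples)]
        variances = [
            sum((v - m) ** 2 for v in band) / (n - 1)
            for band, m in zip(zip(*samples), means)
        ]
        stats[name] = (means, variances)
    return stats

def log_likelihood(pixel, means, variances):
    """Log of the normal density, summed over bands (independence assumed)."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        for x, m, var in zip(pixel, means, variances)
    )

def classify(pixel, stats):
    """Assign the pixel to the class with the highest likelihood."""
    return max(stats, key=lambda name: log_likelihood(pixel, *stats[name]))

# Hypothetical ROI training pixels with two bands each.
training = {
    "vegetation": [(0.10, 0.50), (0.12, 0.55), (0.11, 0.52), (0.09, 0.48)],
    "built_up": [(0.30, 0.20), (0.32, 0.22), (0.31, 0.18), (0.29, 0.21)],
}
stats = fit_class_stats(training)
print(classify((0.11, 0.51), stats), classify((0.30, 0.19), stats))
```

Working with log-likelihoods rather than raw probabilities avoids numerical underflow when many bands are combined, and it does not change which class has the highest value.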

Post-Classification and Accuracy assessment:
Post-classification processing is carried out to eliminate noise and to enhance the quality of the classified output. Accuracy assessment is then performed to confirm the accuracy of the classification process. The collected ground truth ROIs are used to determine the accuracy of each classifier. The error matrix is the most widely used method to determine classification accuracy (Andualem et al., 2018; Manandhar et al., 2009).
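As a hedged illustration of how overall accuracy and the kappa coefficient are derived from an error matrix, the sketch below uses the standard definitions: overall accuracy is the share of samples on the diagonal, and kappa corrects that share for the agreement expected by chance from the row and column totals. The function name and the 3-class matrix are hypothetical and are not taken from the study's tables.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square error matrix.

    Rows are the classified labels and columns the reference labels.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: correctly classified samples lie on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement expected from the marginal totals alone.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical 3-class error matrix with 150 reference samples.
error_matrix = [
    [40, 6, 4],
    [5, 38, 7],
    [3, 5, 42],
]
oa, kappa = overall_accuracy_and_kappa(error_matrix)
print(round(oa, 2), round(kappa, 2))
```

For this hypothetical matrix the overall accuracy is 0.80 and kappa is 0.70. Kappa is lower because one third of the agreement would be expected by chance with three equally sized classes.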

RESULTS AND DISCUSSION
The intention of the study is to analyze the performance of the classification algorithms and to identify the suitable classifier for urban land use/land cover (LULC) classification. To examine the differences between classifiers, commonly known classifiers were used: K-means, IsoData, support vector machines (SVM) and maximum likelihood (ML). The areal extent of each class under the different classifiers was computed, as shown in Table 1. Ground truth points (GTP) were taken from Google Earth imagery for the ROC curve validation. Based on the analysis, the support vector machine shows the best results, followed by maximum likelihood and IsoData. This is because the kappa coefficient (0.878) and overall accuracy (92.17%) of SVM are higher than those of the other classifiers by a comparatively wide margin; SVM also classified the LULC classes more precisely in relation to the satellite imagery. The overall accuracy and kappa coefficient of all the classifiers are shown in Figure 6: SVM attains the highest kappa coefficient (0.878) and overall accuracy (92.17%), followed by ML with a kappa coefficient of 0.82 and an overall accuracy of 88.73%, and then the unsupervised classifier IsoData with a kappa coefficient of 0.78 and an overall accuracy of 85.62%. K-means is the poorest classifier, with a kappa coefficient of 0.71 and an overall accuracy of 78.65%. Overall, the study shows that SVM agreed most closely with the ground truth points and achieved the best kappa coefficient and overall accuracy. To strengthen the validation of the SVM classifier, a ROC (Receiver Operating Characteristic) curve (Figure 7) and a confusion matrix (Table 2) were generated. If the area under the curve (AUC) values are closer to 0.5, the performance of the classifier is poor, and if they are closer to 1, the classifier is excellent.
Figure 7. ROC curve, SVM
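The interpretation of values near 0.5 and near 1 can be made concrete with a small sketch of the area under the ROC curve for one class against the rest. It assumes the rank-based definition of AUC, the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The scores and labels are hypothetical and are not the study's ground truth points.

```python
def auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    Counts how often a positive sample outscores a negative sample;
    ties count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical one-vs-rest labels (1 = built-up, 0 = any other class).
labels = [1, 1, 1, 0, 0, 0]
good_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]  # positives always rank higher
weak_scores = [0.9, 0.3, 0.5, 0.8, 0.4, 0.5]  # ranking is no better than chance
print(auc(good_scores, labels), auc(weak_scores, labels))
```

The first set of scores separates the two groups perfectly and gives an AUC of 1.0, while the second gives 0.5, which is what a random ranking would produce.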
Here, the SVM classification showed that the AUC values of all the classes were close to 1 and that the proportions of misclassified and mixed pixels were comparatively very low, as shown in the confusion matrix. On the basis of this analysis and validation, the SVM classifier is recommended for LULC classification. While the study is based on the performance assessment and identification of a suitable classifier for LULC, it concentrates in particular on urban land use classification. In this setup, the SVM classifier outperforms all the other classifiers for overall LULC. In particular, it classifies the built-up class more precisely than the other classifiers, which makes the SVM classification a useful basis for urban models such as Shannon entropy and the CA-Markov chain model. The present study area, Tiruppur, is one of the urban agglomerated cities in Tamil Nadu. Urban sprawl and urban densification have been the major problems faced by the study area in the past decades. To identify urban sprawl and its trends for sustainable urban development, LULC with precise built-up classification is essential, and SVM is the suitable classifier for this purpose. The classification of the built-up class by the SVM classifier is shown in Figure 8, and the confusion matrix also helps to validate the accuracy of the urban classification of the SVM classifier.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIV-M-3-2021, ASPRS 2021 Annual Conference, 29 March-2 April 2021, virtual

CONCLUSION
The study set out to identify the better and more suitable algorithm for precise LULC classification, which is the baseline and an essential layer in every remote-sensing-based spatial analysis. The results show that the precision of each LULC class differs between algorithms. The overall accuracy and kappa coefficient were assessed for each algorithm; the overall accuracies were K-means (78.65%), IsoData (85.62%), support vector machines (SVMs) (92.17%) and maximum likelihood (ML) (88.73%). This illustrates that SVM shows the best results. In particular, SVM has the highest overall accuracy of 92.17%, which indicates its efficiency in classifying the LULC features. Thus the SVM classifier is identified as a significant algorithm for classifying LULC as well as the built-up class in the study area, and it can be applied in assessing the urban sprawl and urban density modeling of the study area.