CROP CLASSIFICATION ON SINGLE DATE SENTINEL-2 IMAGERY USING RANDOM FOREST AND SUPPORT VECTOR MACHINE

Mapping crops from satellite images is a challenging task because of within-field complexity and because many crops in a region share similar spectral properties. The recently launched Sentinel-2 satellite offers thirteen spectral bands, a fast revisit time, three spatial resolutions (10 m, 20 m, 60 m), and freely available data, which makes it a good choice for vegetation mapping. This study aims to classify crops using single-date Sentinel-2 imagery of Roorkee, district Haridwar, Uttarakhand, India. Classification is performed with two popular and efficient machine learning algorithms: Random Forest (RF) and Support Vector Machine (SVM). Four spectral bands of Sentinel-2, i.e., Near Infrared, Red, Green, and Blue, are stacked for the classification. The overall accuracies achieved by RF and SVM on the Sentinel-2 imagery are 84.22% and 81.85%, respectively. Both classifiers performed well once their tuning parameters were set to optimal values, but RF achieved a 2.37% higher overall accuracy than SVM. Among the class-specific accuracies, High-Density Forest attains the highest accuracy, whereas the Fodder class reports the lowest. Fodder has the lowest accuracy because of intermixing of pixels between the Wheat and Fodder crops. The study finds that RF classifies crops more accurately than SVM and that Sentinel-2 has great potential for vegetation mapping in remote sensing.


INTRODUCTION
Crop mapping plays an important role in sustainable agricultural practice and in dealing with the environmental challenges caused by climate change and other driving forces. Classification of crops provides essential information for the various decision-making processes involved in managing agricultural resources. Satellite image processing with advanced classification techniques can provide timely and accurate information on crop type and a reliable estimate of crop production. The selection of satellite imagery for crop classification depends on factors such as image availability, associated cost, diversity of crop types, and the extent of the study area (Zheng, 2015). Freely available remote sensing datasets such as MODIS and Landsat have been used in many studies for vegetation mapping (Zheng, 2015; Waldner, 2015). Mixed pixels are a well-known problem that occurs frequently with MODIS because of its low spatial resolution (250-500 m). More accurate results can be expected from 30 m Landsat data than from MODIS, specifically in regions characterized by small agricultural fields. The European satellite Sentinel-2A provides multispectral data at medium spatial resolution with a fair revisit time (5 days), which offers an opportunity to address the problems arising from coarse spatial resolution (Drusch, 2012). The Sentinel-2 Multi-Spectral Instrument (MSI) has thirteen spectral bands at three different spatial resolutions (Table 1). Data from the recently launched Sentinel-2 have been used for various remote sensing applications (Whyte, 2018; Sonobe, 2017; Korhonen, 2017; Belgiu, 2018; Wang, 2018; Wang, 2016; Hawryło, 2018).
The potential of features from Sentinel-1 and Sentinel-2 was assessed by Sonobe (2017), who highlighted that band 4 (Red) of Sentinel-2 and the VV polarization data of Sentinel-1 have the highest importance for crop classification. Korhonen (2017) compared Landsat-8 and Sentinel-2 and showed that Sentinel-2 gives marginally better results for estimating boreal forest canopy cover and leaf area index (LAI). Zheng (2015) used the SVM model effectively for crop type identification with time-series Landsat Normalized Difference Vegetation Index (NDVI) data. The results of that study demonstrated that intelligent selection achieved higher classification accuracy than the stratified random approach. Shao (2012) compared neural networks (NN), classification and regression trees (CART), and SVM for crop type mapping using MODIS NDVI data; SVM achieved higher classification accuracy than NN and CART. The red edge bands of Sentinel-2 provide additional information that may be useful for monitoring canopy properties (Korhonen, 2017). Man (2018) formed an ensemble of five supervised methods and compared it with extreme gradient boosting (XGBoost), SVM, logistic regression, and multilayer perceptron for land cover classification using time-series Landsat-8 data; the ensemble gave the highest accuracy, followed by XGBoost. Son (2017) used multi-temporal Sentinel-1A data to classify crops with RF and SVM models, and the results demonstrated the superiority of RF over the SVM classifier. Whyte (2018) used Sentinel-1 and Sentinel-2 data for wetland mapping in an Object Based Image Analysis (OBIA) framework and found that RF outperforms SVM marginally but consistently throughout. Rodriguez-Galiano (2012) evaluated the effectiveness of RF classifiers on Landsat-5 data for complex land cover and land use categories; the results showed that RF achieves high classification accuracy, works well with small training sets, and is robust to noise. Li (2016) compared different classification techniques in the object-based framework and concluded that both RF and SVM are highly suitable for Geographic Object Based Image Analysis (GOBIA) in agricultural environments (see also Quintano, 2018).
Given the above background, this study aims to explore the potential of Sentinel-2 satellite imagery for crop classification by implementing two powerful machine learning algorithms (RF and SVM). The study evaluates the effectiveness of the RF and SVM algorithms in discriminating various crops in Roorkee, Uttarakhand, India. This paper is organised as follows: Section 2 describes the study area and the satellite data used; Section 3 presents the methodology and a description of the classifiers; Section 4 presents the results and discussion; Section 5 presents the conclusions.

STUDY AREA AND DATA
In this study, a Sentinel-2 image acquired on 19 February 2018, during the growing season, has been used. The study area is situated in the foothills of the Himalaya, and agriculture plays a major role in the economy of the area. Four bands (NIR, Red, Green, Blue) of Sentinel-2 at 10 m resolution have been used for the classification. The total area covered is 1049.31 km². The minimum bounding box coordinates are 77°43'37.826"E, 30°14'51.009"N at the upper left corner and 78°1'15.653"E, 29°39'45.859"N at the lower right. The Sentinel-2 satellite has 13 spectral bands; their names, spatial resolutions, and corresponding wavelengths are shown in Table 1. The selected area is divided into eleven Land Use Land Cover (LULC) classes: High-Density Forest, Low-Density Forest, Sandy area, Water, Fallow land, Built-up, Orchard, Wheat, Sugarcane, Fodder, and Other crops. Here, Orchard signifies planted trees in the form of fruit gardens (such as mangoes and guavas), as well as trees planted as field parcels or along the edges of fields. The class named 'Other crops' consists of double crops such as Trees and Wheat or Mustard and Wheat, and vegetables (Radish, Cabbage, Cauliflower). The major crops in the selected region are Wheat and Sugarcane. Fodder crops are also important and are cultivated primarily for animal feed. A false colour composite (FCC) of the study area from the Sentinel-2 imagery is shown in Figure 1, where NIR (band 8), Red (band 4), and Green (band 3) are displayed in red, green, and blue, respectively.

Figure 1. False colour composite (FCC) of the study area using Sentinel-2 image.

METHODOLOGY
The proposed methodology for crop classification is shown in Figure 2. A single-date Sentinel-2 image has been taken, and the individual 10 m bands (NIR, R, G, and B) are stacked together to create a multispectral image cube. In the stacked image, each pixel is a 4-dimensional vector containing the spectral values of the four bands. The reference dataset has been created through a field survey in which ground truth values were recorded with a Global Positioning System (GPS) instrument; in addition, some samples were taken with the help of high-resolution Google Earth images. The reference dataset has then been split into training and testing subsets, with 70% of the data used for training and the remaining 30% for testing. Two models, RF and SVM, have been trained on the training dataset.
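The stacking and splitting steps described above can be sketched as follows. This is a minimal illustration in Python rather than the R implementation used in the study; the function names `stack_bands` and `stratified_split`, the toy reflectance values, and the choice to split per class are assumptions made for the example.

```python
import random
from collections import defaultdict


def stack_bands(nir, red, green, blue):
    """Stack four equally sized 2-D bands into one 4-vector per pixel."""
    rows, cols = len(nir), len(nir[0])
    return [[(nir[r][c], red[r][c], green[r][c], blue[r][c])
             for c in range(cols)] for r in range(rows)]


def stratified_split(samples, train_fraction=0.7, seed=0):
    """Split (features, label) samples class by class into train and test."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append((features, label))
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)
        cut = round(train_fraction * len(items))
        train.extend(items[:cut])   # 70% of this class for training
        test.extend(items[cut:])    # remaining 30% for testing
    return train, test


# Toy 2x2 bands with made-up reflectance values (illustrative only).
nir = [[0.40, 0.42], [0.10, 0.12]]
red = [[0.05, 0.06], [0.08, 0.09]]
green = [[0.07, 0.08], [0.09, 0.10]]
blue = [[0.03, 0.04], [0.06, 0.07]]
cube = stack_bands(nir, red, green, blue)

# Ten hypothetical labelled pixels for each of two classes.
samples = ([((0.40, 0.05, 0.07, 0.03), "Wheat")] * 10
           + [((0.10, 0.08, 0.09, 0.06), "Water")] * 10)
train, test = stratified_split(samples)
```

Splitting within each class keeps the class proportions of the reference data in both subsets, which matters when some classes have few samples.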
Figure 2. Flow chart of the adopted methodology.

RF and SVM are both popular supervised machine learning methods. Each has user-defined parameters that must be optimized to attain maximum accuracy. Once the models are trained, the classes of the input image are predicted using the optimal parameters obtained after fine-tuning, and classified maps of the study area are thereby generated. The accuracy of the models is assessed on the testing dataset. In this work, the F1-score is used to assess class-specific accuracy; it is the harmonic mean of precision (p) and recall (r) for each class i, and is computed by the following equation:

$$F1_i = \frac{2\, p_i\, r_i}{p_i + r_i} \qquad (1)$$
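As an illustration of the class-specific F1-score, the sketch below derives precision, recall, and F1 for each class from a confusion matrix. The matrix values and the function name `f1_scores` are hypothetical, and rows are assumed to hold the reference labels and columns the predicted labels.

```python
def f1_scores(confusion, labels):
    """Per-class F1 from a square confusion matrix.

    Rows are reference classes and columns are predicted classes
    (an assumed convention for this sketch).
    """
    n = len(labels)
    scores = {}
    for i, label in enumerate(labels):
        true_positive = confusion[i][i]
        predicted_total = sum(confusion[row][i] for row in range(n))
        reference_total = sum(confusion[i])
        p = true_positive / predicted_total if predicted_total else 0.0
        r = true_positive / reference_total if reference_total else 0.0
        # Harmonic mean of precision and recall.
        scores[label] = 2 * p * r / (p + r) if (p + r) else 0.0
    return scores


# Made-up counts: 2 Wheat pixels labelled Fodder, 4 Fodder pixels labelled Wheat.
labels = ["Wheat", "Fodder"]
confusion = [[8, 2],
             [4, 6]]
scores = f1_scores(confusion, labels)
```

With these counts, Wheat has precision 8/12 and recall 8/10, so its F1-score is 8/11; Fodder has precision 6/8 and recall 6/10, so its F1-score is 2/3.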

Random Forest Classifier
The Random Forest (RF) (Breiman, 2001) is a successful ensemble classifier built on the concept of ensemble learning. RF is a non-parametric machine learning algorithm that produces high classification accuracy and is capable of determining variable importance. As an ensemble method, RF has shown excellent results in a number of remote sensing applications (Sonobe, 2017; Yin, 2018; Chan, 2012; Chan, 2008; Hawryło, 2018). An ensemble method consists of a number of classifiers whose responses are combined to obtain the final prediction. Random forest draws samples randomly with replacement to generate a new training dataset for each tree. This policy reduces the variance and improves the accuracy of classification. The RF algorithm selects a random subset of variables or predictors at each split (Liaw, 2002). A majority voting scheme is used to determine the output of the classification process. The RF algorithm has two tuning parameters: ntree, the number of trees used to form the ensemble, and mtry, the number of variables/predictors used to split the nodes. The best split for a node plays an important role in increasing the accuracy of the classification (Ishwaran, 2007; Ishwaran, 2008; Sonobe, 2017). The RF algorithm has the following benefits for remote sensing applications (Rodriguez-Galiano, 2012):
• RF is capable of running on large data sets.
• The algorithm can handle a large number (thousands) of input variables.
• RF can estimate the importance of each variable in the classification procedure.
• RF is robust to noise as well as outliers.
• The computational complexity of RF is low compared to other ensemble methods (e.g., boosting).
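The three ingredients named above (sampling with replacement, a random subset of mtry predictors at each split, and majority voting) can be sketched in isolation as follows. This is not a full random forest and not the R implementation used in the study; the helper names and the toy labels are hypothetical, and tree construction itself is omitted.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items at random with replacement."""
    return [rng.choice(data) for _ in data]


def random_feature_subset(n_features, mtry, rng):
    """Pick mtry distinct feature indices to consider at one split."""
    return sorted(rng.sample(range(n_features), mtry))


def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(42)

# Ten hypothetical labelled pixels.
training = [("px%d" % i, "Wheat" if i % 2 else "Fodder") for i in range(10)]
sample = bootstrap_sample(training, rng)

# Four predictors (NIR, Red, Green, Blue), one candidate per split.
features = random_feature_subset(n_features=4, mtry=1, rng=rng)

# Votes from five hypothetical trees for a single pixel.
label = majority_vote(["Wheat", "Fodder", "Wheat", "Wheat", "Sugarcane"])
```

Because each tree sees a different bootstrap sample and a different feature subset at each split, the individual trees are decorrelated, which is what allows the vote to reduce variance.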

Support Vector Machine (SVM)
SVM is a supervised, non-parametric statistical learning technique that has proved useful for a large number of classification problems in remote sensing (Foody, 2004; Pal, 2005; Sonobe, 2017; Hawryło, 2018; Waldner, 2015). In the last decade, SVM has gained popularity for various remote sensing applications (Mountrakis, 2011). The SVM algorithm maps the training data into a higher-dimensional space and seeks the optimal hyperplane that distinguishes the different classes or categories. The algorithm partitions the data using the maximum separation margin (Vapnik, 1995). It relies on the training samples that lie on the boundaries of the class distributions, known as support vectors, and the optimal hyperplane lies in the middle of the margin. SVM may achieve high classification accuracy with a small set of training samples (Zheng, 2015; Foody, 2004; Shao, 2012). The algorithm can also project low-dimensional features into a higher-dimensional feature space; this projection is termed the kernel trick (Maxwell, 2018). Various kernels (Kavzoglu, 2009) may be used to implement the SVM model, and each kernel has its own set of user-defined parameters (Maxwell, 2018). For the multi-class implementation of the SVM model, the Radial Basis Function (RBF) kernel has been used. It has two parameters: the regularization parameter and the kernel bandwidth, denoted by C and γ, respectively. A high value of C leads to high penalties for inseparable points, which may result in overfitting, whereas a low value of C may lead to under-fitting (Sonobe, 2017); γ controls the shape of the hyperplane (Ghosh, 2015). A mesh grid search strategy has been used to find the optimal values of the C and γ parameters.
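For illustration, the sketch below evaluates the RBF kernel in its standard form, K(x, y) = exp(-γ‖x − y‖²), and enumerates a mesh grid of (C, γ) candidates. The kernel formula is the common textbook definition and is not spelled out in this paper. The grid follows the ranges reported in the results section (C from 2^0 to 2^8, γ up to 2 in steps of 0.1); skipping γ = 0, which would make the kernel constant, is an assumption of this sketch.

```python
import math


def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Candidate values for the mesh grid search.
c_values = [2 ** k for k in range(9)]                      # 1, 2, 4, ..., 256
gamma_values = [round(0.1 * k, 1) for k in range(1, 21)]   # 0.1, 0.2, ..., 2.0
grid = [(c, g) for c in c_values for g in gamma_values]

# Identical pixels give a kernel value of 1; distant pixels approach 0.
same = rbf_kernel((0.4, 0.05, 0.07, 0.03), (0.4, 0.05, 0.07, 0.03), gamma=1.0)
far = rbf_kernel((0.0, 0.0), (1.0, 1.0), gamma=1.0)
```

In a grid search, each (C, γ) pair in `grid` would be used to train and validate one SVM, and the pair with the best validation accuracy would be retained.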

RESULTS AND DISCUSSION
In this work, four 10 m bands of a Sentinel-2 image acquired in the growing season are stacked, and the resulting image is used for crop classification. Stratified random sampling with a 10-fold cross-validation scheme has been adopted. The partitioned training and testing pixels are mutually exclusive. Both classifiers (RF and SVM) are implemented in the open-source R language. In remote sensing, the accuracy of an LULC map is one of the most valuable indicators of the quality of the produced map and of its fitness for a specific application, and it supports an understanding of error and its implications (Foody, 2002). In the literature, the overall accuracy derived from the confusion matrix and the kappa coefficient have been widely used for evaluation. In this work, overall accuracy, F1-score, and kappa coefficient have been computed for performance evaluation after setting the tuning parameters of both classifiers to their optimal values. The tuning parameters of the SVM classifier are the penalty parameter (C) and gamma (γ); the highest classification accuracy is obtained by setting these parameters to their optimal values. The regularization parameter is tested in the range 2^0 to 2^8, and gamma ranges from 0 to 2 at an interval of 0.1. The optimized value of the penalty parameter (C) is estimated as 64, and gamma (γ) is set to 1. The SVM model achieved an overall accuracy of 81.85% (Table 2), and the classified image is shown in Figure 4.
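The two map-level measures mentioned above, overall accuracy and the kappa coefficient, can both be derived from the confusion matrix. The following sketch uses the standard definition of Cohen's kappa, which is assumed here to be the coefficient meant; the matrix values and function names are made up for illustration.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Agreement beyond chance: (observed - expected) / (1 - expected).

    Undefined when the expected agreement equals 1 (not handled here).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(confusion[row][i] for row in range(n))
        for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix.
confusion = [[8, 2],
             [4, 6]]
oa = overall_accuracy(confusion)
kappa = cohen_kappa(confusion)
```

For this toy matrix the overall accuracy is 14/20 = 0.70, the chance agreement is 0.50, and kappa is therefore 0.40, which shows how kappa discounts the agreement expected by chance.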

To achieve the maximum classification accuracy with the RF model, the parameters of the algorithm must be set to optimal values. There are two parameters: ntree, the number of trees, and mtry, the number of predictors or variables considered at each split. A high value of ntree leads to a high computational cost. For the RF model, the optimal value of ntree was estimated as 350, and mtry was set to 1.
Feature importance helps to understand the contribution of each individual feature to the classification task. Feature importance is computed for both classifiers, and the result is shown in Figure 3. For the RF classifier, the NIR band has the highest importance, followed by the Blue band, although the importance of the Blue band is very low; the Green band shows no importance at all. The SVM shows the same trend for the NIR band (highest importance), but in contrast to RF, the Red band is the second most important feature. The Blue and Green bands have almost no importance in the SVM model.

CONCLUSIONS
This study aims to classify crop types from single-date Sentinel-2 imagery using RF and SVM classifiers. To achieve more accurate results, the tuning parameters of the classifiers are set to optimal values. Only four bands are considered in this study, and feature importance is computed for both classifiers. The feature importance computation identifies the NIR band as the most important for both the RF and the SVM model. The results demonstrate that the RF classifier outperforms the SVM classifier. As far as the class-specific accuracies are concerned, some classes, such as Orchard and Water, show similar accuracy for both classifiers or only a minor improvement with RF, while the remaining classes show an improvement with RF over SVM. The results of this work show that Sentinel-2 has great potential for crop classification and that the RF classifier produces the more accurate results.

Figure 3. Feature importance for RF and SVM.

This work resulted in an overall accuracy of 84.22% and a kappa coefficient of 83.05% (Table 2) for the RF model, and the classified image is shown in Figure 5. Both the RF and SVM models performed well for crop classification, but RF achieved better results, with an overall accuracy 2.37% higher than that of the SVM model. For class-specific accuracy, High-Density Forest achieved the highest values, 92.93% for RF and 90.66% for SVM (Table 3).

Figure 5. Classified image by RF.

Fodder is the crop type with the lowest accuracy for both classifiers, 61.22% for RF and 59.21% for the SVM model (Table 3). Wheat pixels are misclassified as Fodder, and Fodder pixels as Wheat, because of the spectral similarity of the two crops. The RF classifier achieved an accuracy of 82.11% for the Wheat crop, which is 3.56% higher than SVM. A similar trend is observed for the Sugarcane crop: the RF classifier gives 84.76% accuracy for Sugarcane, which is 2.99% higher than the 81.76% produced by the SVM classifier. It is also noticed that there is intermixing of pixels among Sugarcane, Orchard, and Low-Density Forest. This decreases the accuracy of all these classes, but the most adverse effect is observed for Orchard, whose accuracy is below 80% for both classifiers. Similarly, for the Other crops class, RF (83.24%) produces better results than SVM (80.99%). However, for some classes, such as Water and Orchard, both classifiers show almost the same accuracy.

Table 2

It has been observed that the High-Density Forest class achieved higher accuracy than Low-Density Forest, because Low-Density Forest is misclassified as Orchard and vice versa. This intermixing of pixels lowers the accuracy of both classes, i.e., Orchard and Low-Density Forest, but the more negative effect is observed in the Orchard class. Both classifiers yield almost the same accuracy for Orchard (the difference is less than 1%), whereas RF reports higher accuracy for Low-Density Forest.

Table 3. Class-specific accuracy (F1-score) by RF and SVM.

Figure 4. Classified image by SVM.