TEMPORAL INDICES DATA FOR SPECIFIC CROP DISCRIMINATION USING FUZZY BASED NOISE CLASSIFIER

This study evaluates a fuzzy-based classifier for identifying and mapping a specific crop using multi-spectral time-series data spanning one growing season. The temporal data were pre-processed for geo-registration, and five spectral indices were computed: SR (Simple Ratio), NDVI (Normalized Difference Vegetation Index), TNDVI (Transformed Normalized Difference Vegetation Index), SAVI (Soil-Adjusted Vegetation Index) and TVI (Triangular Vegetation Index). The noise classifier (NC) was evaluated in a sub-pixel classification approach, and accuracy assessment was carried out using the fuzzy error matrix (FERM). The classification results with respect to the additional indices were compared in terms of image-to-image maximum classification accuracy. The overall accuracy observed in dataset 2 was 96.03% for the TNDVI index using NC. The data used for this study were AWIFS for soft classification and LISS-III for soft testing, both from the Resourcesat-1 (IRS-P6) satellite. The research indicates that appropriately chosen indices can incorporate temporal variations when a specific crop of interest is extracted with soft computing techniques from remote sensing images of coarser spatial and temporal resolution.


INTRODUCTION
A time series of acquired multispectral images represents the characteristics of a landscape, and each element represented has a particular spectral response, which allows the researcher to obtain highly relevant information for decision making without going to the field. Since objects, including vegetation, have unique spectral features (reflectance or emission response), they can be identified from remote sensing imagery according to these characteristics. The strong contrast between absorption and scattering in the red and near-infrared bands can be combined into different quantitative indices of vegetation condition. The time series of such vegetation indices observed over a period can help in further classifying the vegetation as crop or other types of vegetation. Classification techniques for grouping clusters and finding substructure in data need to be robust. By robustness we mean that the performance of an algorithm should not be affected significantly by small deviations from the assumed model and should not deteriorate drastically due to noise and outliers. Robust statistics can be related to the concept of membership functions in fuzzy set theory or to possibility distributions in possibility theory. This might explain the claim made by the proponents of fuzzy set theory that a fuzzy approach is more tolerant of variations and noise in the input data than a crisp approach.
The immensely popular k-Means is a partitioning procedure that partitions data by minimizing a least-squares-type functional. Its fuzzy derivative, Fuzzy c-Means (FCM), is also based on a least-squares functional and is therefore susceptible to outliers in the data. The performance of FCM is known to degrade drastically when the data set is noisy. This is similar to least squares (LS) regression, where the presence of a single outlier is enough to throw off the regression estimates. The need has therefore been to develop robust clustering algorithms within the framework of FCM, primarily because of FCM's simple iterative scheme and good convergence properties. The usual FCM minimization constraints are relaxed to make the resulting algorithm robust. The possibilistic c-means (PCM) algorithm was developed to provide information on the relationship between vectors within a cluster. Instead of the usual probabilistic memberships calculated by FCM, PCM provides an index that quantifies the uniqueness of a data vector as belonging to a cluster. This is also shown to impart a robust property to the procedure, in the sense that noise points are less unique in good clusters. Another effective clustering technique based on FCM is the noise classifier (NC) algorithm, which uses a conceptual class called the noise cluster to group together outliers in the data. All data vectors are assumed to be a constant distance, called the noise distance, away from the noise cluster. The presence of the noise cluster allows outliers to have arbitrarily small memberships in good clusters (Banerjee and Davé 2005).
To date, many researchers in the remote sensing field have applied time-series indices to study cropping patterns. Tingting and Chuang, 2010, used time-series NDVI to identify common vegetation types or cropping patterns. They applied principal component analysis, a linear spectral un-mixing method and a support vector machine to classify cropland. Panda et al., 2010, studied four widely used spectral indices to investigate corn crop yield. A Back Propagation Neural Network (BPNN) model was developed to test the efficiency of the four vegetation indices in corn crop yield prediction. Yang et al. 2009, presented a new vegetation index which is robust to low vegetation and sensitive to high vegetation and has potential to be an alternative to NDVI for crop condition monitoring. Te-Ming et al., 2009, proposed a new vegetation index integrated with a Fast Intensity-Hue-Saturation (FIHS) transform for high resolution imagery, which can extract and enhance green vegetation as an alternative to NDVI. Wardlow and Egbert, 2008, used hierarchical crop mapping to classify multi-temporal NDVI data. Linlin and Huadong, 2008, presented a simple phenology model to identify wheat crop using a curve fitting procedure. Lucas et al., 2007, studied the use of time series of Landsat sensor data with decision rules based on fuzzy logic to discriminate vegetation types. They found that the rule-based classification gave a good representation of the distribution of habitats and agricultural land. Sakamoto et al., 2005, developed a new method for remotely determining the phenological stages of paddy rice. For filtering, they adopted wavelet and Fourier transforms. Three types of mother wavelet (Daubechies, Symlet and Coiflet) were used.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B8, 2012 XXII ISPRS Congress, 25 August - 01 September 2012, Melbourne, Australia
As a result of validation, it was observed that the wavelet transform performed better than the Fourier transform. Kumar and Roy, 2010, worked with the add-on bands in the multi-spectral dataset of WorldView-2. That work proposed a class-based, sensor-independent spectral band ratio NDVI approach for extracting crop information. Yang et al., 2008, found that the accuracy of surface feature recognition is improved greatly by introducing fuzzy statistics variables into classical principal component analysis (PCA) methods when applied to multi-spectral Landsat ETM+ data for image enhancement. Kumar and Saggar, 2008, found that a possibilistic fuzzy classifier can be used for extracting a single class of interest. Class-based ratio data were used as input to the possibilistic fuzzy classifier, and the water class was identified at sub-pixel level. It was also observed with this approach that shadow pixels did not mix with water class pixels. Acharyya et al., 2003, studied a feature extraction method based on m-band wavelet packet frames for segmenting remotely sensed images. These wavelet features are then evaluated and selected using an efficient neuro-fuzzy algorithm. The effectiveness of the methodology was demonstrated on two four-band Indian Remote Sensing satellite (IRS-1A) images containing five to six overlapping classes and a three-band SPOT image containing seven overlapping classes. Dave 1991, introduced the concept of characterization and detection of noise in clustering. He presented an approach that is applicable to a variety of fuzzy clustering algorithms as well as to regression analysis. Dave and Krishnapuram 1997, showed that the classical approach to clustering based on variations of the K-means or the fuzzy c-means is not robust.
The alternative formulations based on noise clustering or possibilistic clustering are robust in that they can be shown to be founded on robust statistics. Banerjee and Davé 2005, proposed a scheme, called the mega-clustering algorithm, that is shown to be robust against outliers. Another interesting property is its ability to distinguish between true outliers and non-outliers (vectors that are neither part of any particular cluster nor can be considered true noise). Robustness is achieved by scaling down the fuzzy memberships generated by FCM. A lot of work has been done in the field of single class extraction from time-series multi-spectral data, but the literature shows that the effects of various band ratio indices on the fuzzy noise classifier, together with crop phenology, have not been explored in the past.

INDICES AND CLASSIFICATION APPROACHES
To enhance the vegetation signal in remotely sensed data and provide an approximate measure of the amount of green vegetation, a number of spectral vegetation indices have been proposed. These indices combine data from multiple bands into single values, because such combinations correlate the satellite spectral signals with the biophysical characteristics of the vegetation cover.
A common practice in remote sensing is the use of band ratios to eliminate various albedo effects. Jordan (1969) first presented the ratio vegetation index (RVI) or simple ratio (SR). Rouse et al., 1973, further suggested the most widely used index, the normalized difference vegetation index (NDVI), to improve the identification of vegetated areas and their condition. However, NDVI saturates at high biomass and is sensitive to a number of perturbing factors, such as atmospheric effects, cloud, soil effects and anisotropic effects. Therefore, a number of derivatives of and alternatives to NDVI have been proposed in the scientific literature to address these limitations. Tucker (1979) presented a transformed normalized difference vegetation index (TNDVI), obtained by adding a constant 0.5 to NDVI and taking the square root. It always has positive values, and the variances of the ratio are proportional to the mean values. TNDVI shows a slightly better correlation with the amount of green biomass found in a pixel (Senseman et al. 1996). To reduce the impact of soil variations on NDVI in areas of lower vegetation cover, Huete (1988) proposed a soil-adjusted vegetation index (SAVI) by introducing a correction factor L (Zhengwei et al., 2008). Broge and Leblanc (2000) developed the triangular vegetation index (TVI), which describes the radiative energy absorbed by the pigments as a function of the relative difference between red and near-infrared reflectance in conjunction with the magnitude of reflectance in the green region, where light absorption by chlorophyll a and b is relatively insignificant. Table 1 shows the different indices studied in this work.

Vegetation Index Equation References
Simple Ratio (SR)
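As an illustration of how the five indices described in this section can be computed per pixel, a minimal Python sketch is given below. It uses the commonly published forms of the indices; the function name, the default L = 0.5 for SAVI, the small constant eps, and the use of broad green, red and near-infrared reflectance bands in TVI are assumptions made for illustration and are not taken from this work.

```python
import numpy as np

def vegetation_indices(green, red, nir, L=0.5, eps=1e-12):
    """Compute SR, NDVI, TNDVI, SAVI and TVI from reflectance arrays.

    L (SAVI correction factor) and eps (guard against division by zero)
    are illustrative defaults, not values reported in the paper.
    """
    green = np.asarray(green, dtype=float)
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)

    sr = nir / (red + eps)                           # simple ratio
    ndvi = (nir - red) / (nir + red + eps)           # normalized difference
    # TNDVI: add 0.5 to NDVI and take the square root; the clip guards
    # against NDVI values below -0.5, for which the root is undefined.
    tndvi = np.sqrt(np.clip(ndvi + 0.5, 0.0, None))
    savi = (1.0 + L) * (nir - red) / (nir + red + L)  # soil-adjusted index
    # TVI in the Broge and Leblanc (2000) form, with broad bands standing
    # in for the narrow 550, 670 and 750 nm bands (an assumption here).
    tvi = 0.5 * (120.0 * (nir - green) - 200.0 * (red - green))
    return {"SR": sr, "NDVI": ndvi, "TNDVI": tndvi, "SAVI": savi, "TVI": tvi}
```

Applying the function to each date of a co-registered time series yields one index layer per date, which is the kind of input the temporal classification described later operates on.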

NOISE CLASSIFIER
The concept of a 'noise cluster' is introduced so that noisy data points may be assigned to a separate noise class. The approach is developed for objective-functional-type algorithms (K-means or fuzzy K-means), and its ability to detect 'good' clusters amongst noisy data has been demonstrated. Clustering methods need to be robust if they are to be useful in practice. Uncertainty is introduced simultaneously with multispectral data acquisition in remote sensing, and it grows and propagates through the processing, transmission and classification stages. This uncertainty affects the quality of the extracted information. Usually, classification performance is evaluated by criteria such as accuracy and reliability, but these criteria cannot show the exact quality and certainty of the classification results. Unlike correctness, no special criterion has been put forth for evaluating the certainty and uncertainty of classification results; this is the uncertainty problem in the multispectral data classification process. Several uncertainty criteria are introduced and applied in order to evaluate the classification performance. Membership value generation is shown in equations (1) and (2), where δ > 0 is any float value greater than zero.
The objective function that satisfies this requirement may be formulated with the following quantities: ν > 0, any float value greater than zero; δ > 0, any float value greater than zero; ∞ > m > 1, where m is any constant float value greater than 1; N = row × column (the image size); i stands for the pixel position at the i-th location; and the distance is that between X_i and V_j.
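For orientation, the membership computation of noise clustering in its commonly cited form (Davé 1991) can be sketched as below. This is the standard formulation and is not necessarily identical to equations (1) and (2) of this work; in particular, the parameter ν is not used, the distance is assumed to be the squared Euclidean distance, and delta is assumed to be expressed on the same (squared) scale. The function and variable names are hypothetical.

```python
import numpy as np

def noise_cluster_memberships(X, V, delta, m=2.0, eps=1e-12):
    """Noise-clustering memberships in the standard Dave (1991) form.

    X     : (N, b) array of pixel feature vectors (N = rows * columns).
    V     : (C, b) array of class mean vectors.
    delta : noise distance (> 0), on the scale of the squared distance.
    m     : fuzzifier, 1 < m < infinity.
    Returns an (N, C + 1) array; the last column is the noise class.
    """
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    # Squared Euclidean distance between every X_i and V_j (assumed metric).
    D = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    D = np.maximum(D, eps)          # avoid division by zero at class means
    p = 1.0 / (m - 1.0)
    inv = (1.0 / D) ** p            # inverse-distance weight of each class
    inv_noise = (1.0 / delta) ** p  # constant weight of the noise class
    denom = inv.sum(axis=1, keepdims=True) + inv_noise
    mu_classes = inv / denom
    mu_noise = np.broadcast_to(inv_noise / denom, (X.shape[0], 1))
    return np.hstack([mu_classes, mu_noise])
```

With a single class of interest (C = 1), a pixel far from the class mean receives most of its membership in the noise column, which is the behaviour that lets outliers take arbitrarily small memberships in good clusters.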

TEST DATA AND STUDY AREA
The study area for this research work was the Aurangabad region (19° 53' N, 75° 23' E) in Maharashtra state, India, shown in Figure 1. Remotely sensed images from the Indian Remote Sensing satellite IRS-P6 were selected for cotton crop identification. In Aurangabad district the area under cotton is comparatively higher than that under other crops. Owing to the large area under cotton cultivation and the production of raw cotton, the stakeholders involved in the cotton supply chain are interdependent. For this purpose, temporal images from the Advanced Wide Field Sensor (AWIFS) and the Linear Imaging Self Scanner (LISS-III), with spatial resolutions of 56 m and 23.5 m respectively, were used. IRS-P6 LISS-III data are well suited for agricultural and forestry monitoring, and the LISS-III time-series multi-spectral data were used as testing data for accuracy assessment. Different datasets of the time-series multi-spectral images were formed in order to infer the most suitable time-series images. A total of five scenes of the study area were available, and a total of five temporal scenes of LISS-III data were used for generating the testing datasets. The datasets for AWIFS and LISS-III are shown in Table 2. The coarse resolution images should be co-registered against the fine resolution images. The characteristics of the AWIFS and LISS-III sensors are given in Table 3. The spatial pixel ratio between the classification image (AWIFS) and the testing image (LISS-III) is 1:3, meaning that one AWIFS pixel covers nine LISS-III pixels. In India the cotton season starts in the last week of May and runs up to the end of February.
Depending on temperature and variety, 50 to 85 days are required from planting to first bud formation, 25 to 30 days for flower formation and 50 to 60 days from flower opening to mature boll. No clear distinction can be made between crop growth periods, since vegetative growth continues during flowering and boll formation, and flowering continues during boll formation.

TIME SERIES DATASETS
The AWIFS and LISS-III images were stacked into three datasets. Considering the life cycle of the cotton crop, the sets were established with the aim of finding the most suitable time-series images, each set combining a different selection of dates.
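The stacking step can be pictured with the short sketch below, which assumes that each date has already been reduced to a single co-registered index layer. The function name and the date labels in the usage example are hypothetical; the actual dates belonging to each set are those listed in Table 2.

```python
import numpy as np

def stack_dates(index_by_date, dates):
    """Stack single-band index images for the chosen dates.

    index_by_date : dict mapping a date label to a (rows, cols) array.
    dates         : ordered list of date labels forming one dataset.
    Returns a (rows, cols, T) array, so each pixel carries a temporal
    vector of T index values.
    """
    layers = [np.asarray(index_by_date[d], dtype=float) for d in dates]
    if len({layer.shape for layer in layers}) != 1:
        raise ValueError("all dates must be co-registered to the same grid")
    return np.stack(layers, axis=-1)

# Usage with hypothetical labels: a set built from three of five dates.
rng = np.random.default_rng(0)
images = {label: rng.random((4, 4)) for label in ["d1", "d2", "d3", "d4", "d5"]}
dataset = stack_dates(images, ["d1", "d3", "d5"])
```

Different sets then differ only in the list of dates passed in, which makes it straightforward to compare which combination of growth stages separates the crop best.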

METHODOLOGY ADOPTED
The methodology adopted for this research work was broadly divided into four stages as shown in figure 2.

Figure 2: Methodology adopted
In the present research work the AWIFS datasets are used. The raw time-series multi-spectral data, as acquired, had been processed for atmospheric and geometric correction. Training sites for cotton were identified on the LISS-III and AWIFS images with the help of global positioning system (GPS) data and visually interpreted false colour composite (FCC) images. In order to reduce error, the LISS-III images were geo-referenced using ERDAS AutoSync, while the AWIFS images were co-registered with reference to the LISS-III images. The output cell size was taken as 20 m for the LISS-III images and 60 m for the AWIFS images, to give a spatial pixel size ratio of 1:3 between the AWIFS and LISS-III images. A common area of interest was generated from all temporal images using the subset tool in ERDAS IMAGINE. ERDAS Model Maker was used to create the models of the five band ratio vegetation indices chosen, as discussed in the section on indices and classification approaches.
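Because of the 1:3 cell size ratio, each 60 m AWIFS cell coincides with a 3 × 3 block of 20 m LISS-III cells. One plausible way to bring a fine resolution fraction image onto the coarse grid is block averaging, sketched below; the source does not state how this step was performed, so the approach and the function name are assumptions.

```python
import numpy as np

def aggregate_blocks(fine, factor=3):
    """Average non-overlapping factor x factor blocks of a fine image.

    fine   : (rows, cols) array of class fractions on the fine grid;
             rows and cols must be multiples of factor.
    factor : linear size ratio between coarse and fine cells (3 here).
    Returns a (rows // factor, cols // factor) array of mean fractions.
    """
    fine = np.asarray(fine, dtype=float)
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("image size must be a multiple of the factor")
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))
```

A coarse cell whose nine fine cells include three fully covered by the crop would, under this assumption, receive a reference fraction of one third.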
As mentioned in Table 2, different datasets were taken to find the most suitable time-series (multi-date) images.
Multi-date vegetation indices (NDVI, TNDVI, SR, TVI and SAVI) from the AWIFS and LISS-III scenes were computed for the ground truth sites (Figure 3). The fuzzy set theory based sub-pixel classification technique was used for further classification. The samples of cotton were taken from both the AWIFS and the LISS-III time-series images. For testing the classification accuracy, reference fraction images from the LISS-III sensor of the IRS-P6 satellite were used, with both datasets having the same dates as the AWIFS data. The accuracy assessment of the sub-pixel classification output was conducted using the fuzzy error matrix (FERM) (Binaghi et al., 1999).
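The fuzzy error matrix can be illustrated with the following minimal sketch, which uses the minimum operator of Binaghi et al. (1999) to compare classified and reference membership grades. The function names and the two-class toy data are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def fuzzy_error_matrix(classified, reference):
    """Fuzzy error matrix with the min operator.

    classified, reference : (N, C) arrays of membership grades for N
    pixels and C classes. Element [m, n] sums, over all pixels, the
    minimum of the classified grade in class m and the reference grade
    in class n.
    """
    c = np.asarray(classified, dtype=float)
    r = np.asarray(reference, dtype=float)
    return np.minimum(c[:, :, None], r[:, None, :]).sum(axis=0)

def overall_accuracy(matrix, reference):
    """Sum of the diagonal divided by the total reference membership."""
    return np.trace(matrix) / np.asarray(reference, dtype=float).sum()

# Toy example: two pixels, two classes.
classified = np.array([[0.8, 0.2], [0.3, 0.7]])
reference = np.array([[1.0, 0.0], [0.0, 1.0]])
ferm = fuzzy_error_matrix(classified, reference)
print(overall_accuracy(ferm, reference))  # 0.75
```

In the toy example the diagonal holds 0.8 and 0.7, so the overall accuracy is 1.5 / 2.0 = 0.75.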

RESULT AND DISCUSSION
This paper has presented how various indices, together with a special form of noise classifier, affect the accuracy of multi-temporal crop classification. For this, the ALCM module of the SMIC (Sub-Pixel Multi-Spectral Image Classifier) package (Kumar et al., 2006) was used. The ALCM module can process multiple multi-spectral images for single land cover class extraction at sub-pixel level using a supervised approach. It has been observed that if images of the pre-flowering, flowering maturity and harvesting stages are taken, and images of the same or similar dates are used for the testing and reference datasets, the best result for crop identification and discrimination is obtained.
Figure 4.

CONCLUSION
The aim of this study was to map a single crop of interest using a fuzzy-based classifier with the help of time-series multi-spectral satellite images. The crop under consideration in this work is cotton cultivated in Aurangabad district of Maharashtra state in India. The data used for this study were AWIFS (coarser resolution) for soft classification and LISS-III (medium resolution) for soft testing, both from the Resourcesat-1 (IRS-P6) satellite. The output of the noise classifier (NC) with each of the five indices has been studied. The NC classifier was evaluated in a sub-pixel classification approach, and fuzzy accuracy assessment was carried out using FERM. The maximum accuracy achieved was 96.02% for the TNDVI index of dataset 2. According to the results obtained from this work, the selection of suitable temporal datasets, an appropriate band ratio and the use of fuzzy-based classifiers help in handling mixed pixels in coarser datasets.