Classification of agricultural fields using time series of dual polarimetry TerraSAR-X images

Due to its special imaging characteristics, Synthetic Aperture Radar (SAR) has become an important source of information for a variety of remote sensing applications dealing with environmental changes. SAR images contain information about both phase and intensity in different polarization modes, making them sensitive to geometrical structure and physical properties of the targets such as dielectric and plant water content. In this study we investigate multi temporal changes occurring to different crop types due to phenological changes using high-resolution TerraSAR-X imagers. The dataset includes 17 dual-polarimetry TSX data acquired from June 2012 to August 2013 in Lorestan province, Iran. Several features are extracted from polarized data and classified using support vector machine (SVM) classifier. Training samples and different features employed in classification are also assessed in the study. Results show a satisfactory accuracy for classification which is about 0.91 in kappa coefficient. * Corresponding author.


INTRODUCTION
Agricultural monitoring is one of the important issues for every community as it is one of the economic foundations and is related to humans' food preparation.In this regard monitoring phenological changes and classification of different crop types with the aim of recognizing growth behaviour, cultivation problems and crop yield estimation has received more considerations in recent years.
Satellite radar imagery provides reliable and frequent imaging capability from space.Due to its all-weather functionality, sensitivity to target geometrical structure and orientation, soil moisture and dielectric constant radar imagery has proven powerful for applications which need frequent imagery over an area or involving with the soil and vegetation condition.One of the limitations of radar imagery for classification purposes is the lack of dimension for a single image.There are different ways to overcome this limitation by employing different integrations such as using multifrequency, multipolarization, and multitemporal observations in combination with optical data ( Dhar et al., (2009) and Zhao et al., (2014)) Sabour et al., (2008) used dual polarization HH/VH time series of ENVISAT data for classification of crop types.H/A/alpha decomposition of fully polarimetric L-band data and also hybrid decomposition parameters behaviour during growing season of agricultural areas is investigated by Jain (2010).Zhao et al., (2014) discussed about the feasibility of c-and x-band polarimetric SAR images to crop harvest pattern recognition and tried to classify them with a decision tree algorithm.Voormansik et al., (2013) investigated the capability of dual polarimetry TerraSAR-X (TSX) images for detecting grassland cutting practices.In this research we have utilized the underlying phenological characteristics of different crops which may exist in polarimetric features to show the ability of temporal signatures for agricultural classification purposes.Support vector machine (SVM) as kernel based supervised classifier was used for this purpose.To improve the accuracy, a modification has been done on the training samples and finally, results of classification with different parameters are compared.

Study area
Our test site is located in the north east of Doroud city in Lorestan province, Iran.It is located in a region with centre coordinate of 33°31'55.00"N,49° 6'8.00"E and a population of 162,800 people.The area includes agricultural fields with different crop types, which are mostly cereals and beans, rice and alfalfa.

Satellite Data
The dataset includes 17 TSX StripMap images in coherent dual polarimetriy (HH/VV) mode.11 images cover the time interval from Jun 12 th to November 2 nd 2012 while the other 6 images from cover from May 13 th to August 4 th 2013.All images were acquired in a descending mode at 07:25 local time (02:55 UTC) with a scene centre incidence angle of 22º and in single look slant range complex (SSC) format.Average range and azimuth spacing are 0.9m and 2.4 m respectively.Acquisition intervals are not equal, most of them are 11 days separated but some of them have an interval of 22 to 30 days.

Feature extraction
After data separation of each year we have a subset of the desired area from the whole scene.To reduce the effect of speckle, SSC images were converted to 5*5 multilooked complex images and projected to ground range with geocoding, so that the final resolution is 12m.Every individual point scatterer can be described by a target vector in pauli basis with fully polarimetric monostatic mode as follows (Lee and Pottier, 2009): Where HH S and VV S are the like-polarized components of scattering matrix and HV S is the cross-polarized one.
To obtain information about the distributed targets such as soil and vegetation, a coherency matrix is formed which is shown in equation 2 and every element of this matrix, is a spatially average of the adjacent pixels response (Lee and Pottier, 2009).

† k.k = T
(2) H/A/Alpha polarimetric decomposition which is based on eigenvector decomposition of the (2×2) complex covariance matrix and has been described by Pottier et al., (2009), is utilized here with a difference that instead of covariance matrix we used the top left sub matrix of equation ( 2).In fact the target vector of equation ( 1) is formed with the first two elements including HH and VV polarizations.Therefore the decomposition parameters, i.e.Entropy/Anisotropy/Alpha, are calculated as follows: The pseudo-probabilities of the 2 T is calculated as: With p1 ≥ p2 (5) Where i  is the ith eigenvalue.Now by using pseudo-probability (pi) values of the eigenvalues ( i  ), entropy (H), Anisotropy (A), and alpha angle (  ) are calculated as follows: ) and ratio between HH and VV are also calculated as follows:

Classification
In this study, the support vector machine (SVM) classifier with RBF1 kernel is employed.SVM is a supervised linear Kernelbase classifier which needs careful attention in the procedure of training samples selection (Srivastava and Bhambhu, 2010).One of the main properties of SVM is that it doesn't suffer from the limitation of training samples and data dimensionality.However, the accuracy of training samples highly affects the final result.In order to be sure about the quality of training samples, a refinement has been applied which will be discussed more in the next subsection.Radar images in x-band are much more influenced by speckle phenomenon due to the small wavelength.So for further suppressing of this effect on the classification accuracy, all features were speckle filtered using refined Lee filter (Lee, 1981) with a 3×3 window size and then stacked for the two years separately.Accordingly we have 7 parameters for each year (entropy, anisotropy, dominant scattering alpha angle, coherence magnitude between HH and VV, the ratio of HH intensity to VV intensity, with HH and VV backscatter) which formed 77 stacked bands for 2012 and 42 for 2013.The RBF kernel parameters include gamma and penalty factors which were tested with different values to obtain the best ones.However, the classification results were not affected much by selecting gamma factors.Therefore, this value was chosen only relative to the number of bands participating in the classification, and it was 1 divided by the number of bands.But the penalty factor was selected as 1000 with respect to the overall accuracy of the classification.The reference map for each year which is shown in Figure 2 was provided through local survey from the farmers, so 8 classes were considered for each year.

Refinement of training samples
In order to investigate the accuracy of training samples, a modification procedure is employed after training sample collection.This procedure includes a 3-sigma outlier detection test, which is based on mean and standard deviation of the desired class.Samples which are out of the interval of 3 sigma are assumed to be as outliers (Kasunic, 2011).This assumption preserves 99% of the observations while detecting the outliers.This step was done using the first 3 bands of PCA 2 transform.The reason of selecting PCA transform was to suppress the noise further while preserving the most information with reducing the number of bands.

Classification result and accuracy assessment
After outlier detection of training samples, 1% of them were used to train the support vectors while the rest were utilized for accuracy assessment.In order to see the effect of refinement on the classification accuracy the full data set was classified and assessed two times with and without refinement.The results showed that the overall accuracy increases by about 2%.Accuracy assessment is done with calculating a number of criteria such as completeness, correctness and quality (Heipke and Mayer, 1997), the overall accuracy (OA) and kappa coefficient (Liu et al., 2007).Kappa coefficient and ovaerall accuracy were estimated for the whole image with different 2 Principal Component Analysis combinations of features for each year.The other 3 criteria were assessed only for a subset area of about 1.69 km 2 .Formulation and results come as follows.User and producer accuracies for the full parameters case in each year are also shown in Table 2.
The classified area used for estimation of the three first criteria is shown in the Error!Reference source not found.4and the estimated parameters are given in Table 3.The assessment was repeated for the year 2013 with only the full parameters case and the results are shown in Figure 5 and Table 4.As we can see from the results in Figure 3, the accuracy increases when using all the information.Another important issue is that the quality of extracted fields for the year 2012 is better than 2013.The reason can be related to the number of images used for each year, as we had 11 images for the year 2012 but only 6 for the year 2013 and that the later images does not cover whole the agricultural season.
To show the classification results of agricultural areas, a vegetation mask was built and applied to omit the extra classes such as urban areas.The mask was built using NDVI maximum composite of 14 Landsat 8 images over the year 2013.A threshold of 0.2 is used corresponding to typical and semiintensive vegetation.The masked classified map for both years is shown in Figure 66 and 7 corresponding to the years 2012 and 2013 respectively.
After classification and assessment procedures, classes were spectrally averaged in all the features.Then the time series behaviour is plotted for each feature.The plots are depicted in Figure 8.This is another proof of the separated classes and their correspondence for 2 years.As we can see, each class is shown in a specific colour with different line type for each year.The same classes follow the same behaviour every year but there is some fluctuations.The reason may be because of climate change and the amount of moisture due to precipitation aactivities.This assumption should be investigated in the future.

DISCUSSION AND CONCLUSION
The aim of this paper was to show the capability of multitemporal x-band dual polarimetric radar images to classify agricultural fields.A classification procedure has been proposed using kernel based supervised SVM classifier.Although SVM is not suffered from data dimensionality limitations, but its accuracy is highly dependent on the accuracy of the training data.Therefore to become certain about the training data quality, we applied a procedure named training samples refinement including an outlier detection.After the refinement, classification was done and assessed with different feature combinations.As it is shown in Figure 3, the accuracy and quality of classification with using only time series signature of HH to VV ratio, is very low and less than all others.However, as it is depicted in the Error!Reference source not found.8,the pick value of this feature for class 1 is higher than other crop types, so it is an appropriate indicator for separating class 1 which corresponds to rice fields.The characteristics and behaviour of this feature has been used for rice area crop monitoring in several works such as Bouvet and Le Toan, (2009).
As we can see from the results, the final classification map does not follow a specific pattern.The changes are due to alternating cultivation between crops.Farmers do not always cultivate a specific plant in this region in order to save the quality of soil.So crop types like wheat, barley and beans are implanted alternatively in a field every 2 to 4 years.This is an important issue for the image preparation.Mapping crop types in an area through phenological changes requires acquiring dataset within one agricultural season in order to not have any changes, in particular for the areas with diversity of crop cultivation.

Figure 1 :
Figure 1: An example of a TSX amplitude image of July 26 th 2012 over google earth image in Doroud and the subset area shown with RGB colour composite in Pauli basis over the agricultural fields.

Where 2 T
is the top left sub matrix of the coherency matrix (T) shown in equation 2, and the equation 3 shows the eigenvector display.You can see the eigenvalue decomposition of 2 T as it is shown in equation 4.

Figure 2 :
Figure 2: Reference maps for 2012 (on the top) and 2013 (on the bottom)

Figure 3 :
Figure 3: Accuracy assessment of classification results with different parameters combination for the year 2012.FP: Full parameters stack, Co: Coherency, Ra: Ratio of HH/VV, En: Entropy, An: Anisotropy, Al: Alpha, SHH and SVV: polarimetric backscatter values, PCA: 3 first bands of PCA.

Figure 5 :
Figure 5: Area used for quality assessment of classification results for 2013 with full parameters set

Figure 6 :
Figure 6 : Classification map of the year 2012 (on the top) and 2013 (on the bottom)

Table 2 :
User and producer accuracies for the full parameters case

Table 3 :
completeness, correctness and quality assessment of different band combinations for the subset area of the year 2012