INVESTIGATION OF THE SENSITIVITY OF THE SVM CLASSIFIER WITH RESPECT TO THE NUMBER OF FEATURES AND THE NUMBER OF TRAINING SAMPLES

Supervised classification of hyperspectral images is a difficult task due to the imbalance between the high dimensionality of the data and the limited availability of labeled training samples. Recently, the support vector machine (SVM) has received considerable attention for classifying high-dimensional data and has been applied successfully to the classification of hyperspectral images, because it discriminates classes by a geometrical criterion rather than by statistical criteria. In this paper, we investigate the sensitivity of the SVM classifier with respect to two factors. The first factor is the dimensionality of the data (the number of features) and the second factor is the number of training samples. We evaluate the effect of these factors on classification performance in terms of both accuracy and reliability. Experiments are carried out on three commonly used hyperspectral datasets: Indian Pines, Pavia University and Salinas.


INTRODUCTION
With the development of hyperspectral imaging technology, it is possible to simultaneously capture images with hundreds of contiguous narrow spectral bands. Recently, the SVM has been applied successfully to the classification of hyperspectral images as a non-parametric classifier (Gualtieri, 1999, Pal, 2010, Tarabalka, 2010, Braun, 2012). Since it discriminates classes by a geometrical criterion rather than by statistical criteria, it can work even with a limited training sample size and can hence overcome the Hughes phenomenon. SVM works by finding an optimal hyperplane that maximizes the margin between two classes. If the training data are not linearly separable, a kernel method is used to map the data to a higher-dimensional space where they become linearly separable (Moustakidis, 2012, Pal, 2012). In this paper, we investigate the sensitivity of the SVM classifier with respect to the number of features and the number of training samples. We carry out experiments on three commonly used hyperspectral images with different numbers of features (obtained using PCA) and different numbers of training samples. This paper is organized as follows: a brief description of the SVM is given in Section 2. The experimental results and the evaluation of the sensitivity of the SVM classifier with respect to the number of features and the training sample size are presented in Section 3. Finally, the paper is concluded in Section 4.

A DESCRIPTION OF SVM
Let us assume that the training samples are $\mathbf{x}_i \in \mathbb{R}^{d}$ $(i = 1, \dots, N)$ and that their labels are $y_i \in \{-1, +1\}$. If the two classes are linearly separable, then we can find a hyperplane that separates them. The discriminant function is defined as follows:

$$f(\mathbf{x}) = \mathbf{w}^{T}\mathbf{x} + b$$

where $\mathbf{w} \in \mathbb{R}^{d}$ is a vector normal to the hyperplane and $b$ is the bias.
In order to find the separating hyperplane, $\mathbf{w}$ and $b$ should be estimated so that

$$y_i(\mathbf{w}^{T}\mathbf{x}_i + b) \geq 1, \quad i = 1, \dots, N.$$

The SVM approach finds the hyperplane that maximizes the margin between the two classes; the distance from the hyperplane to the closest training samples is equal to $\frac{1}{\|\mathbf{w}\|}$, so the margin between the two classes is $\frac{2}{\|\mathbf{w}\|}$. The optimal hyperplane can be found by solving the following optimization problem:

$$\min_{\mathbf{w}, b} \; \frac{1}{2}\|\mathbf{w}\|^{2} \quad \text{subject to} \quad y_i(\mathbf{w}^{T}\mathbf{x}_i + b) \geq 1, \quad i = 1, \dots, N.$$

Using a Lagrangian formulation, the discriminant function associated with the optimal hyperplane is obtained as

$$f(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i y_i \, \mathbf{x}_i^{T}\mathbf{x} + b,$$

where the Lagrange multipliers $\alpha_i$ are non-zero only for the support vectors.

This contribution has been peer-reviewed. The peer-review was conducted on the basis of the abstract.

The ground truth map (GTM) of Indian Pines is shown in Figure 1, and the GTMs of the two other datasets are shown in Figure 2. The measures used for evaluating classifier performance are the average class accuracy and the average class reliability. Accuracy and reliability are defined for each class as follows: accuracy is the number of test samples of the class that are correctly classified divided by the total number of test samples of that class, and reliability is the number of test samples that are correctly classified divided by the total number of samples labeled as that class.
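The per-class accuracy and reliability defined above correspond to per-class recall and precision, and both can be read off a confusion matrix. The sketch below, a minimal illustration in plain NumPy with hypothetical function and variable names, computes them from predicted and true label arrays:

```python
import numpy as np

def per_class_accuracy_reliability(y_true, y_pred, n_classes):
    """Per-class accuracy (recall) and reliability (precision) from label arrays."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1  # rows: true class, columns: predicted class
    # accuracy of class c: correctly classified test samples of c / all test samples of c
    accuracy = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)
    # reliability of class c: correctly classified samples of c / all samples labeled c
    reliability = np.diag(cm) / np.maximum(cm.sum(axis=0), 1)
    return accuracy, reliability

# Toy example: class 0 has one sample misclassified as class 1.
y_true = np.array([0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1])
acc, rel = per_class_accuracy_reliability(y_true, y_pred, 2)
```

Averaging `acc` and `rel` over classes yields the average class accuracy and average class reliability reported in the experiments.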

EXPERIMENTS AND EVALUATION
The first experiments investigate the sensitivity of the SVM classifier with respect to the number of features. Feature reduction is carried out using PCA. The number of training samples per class ($N_t$) used in the investigation of the number of features is fixed at 10. The obtained results are shown in Figure 3. It can be seen that, as the number of features increases, the performance of the classifier, in both accuracy and reliability, generally improves up to a certain number of features and after that remains at the same value without any further increase. The obtained classification maps for the three hyperspectral images are shown in Figures 4-6.
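The structure of this experiment can be sketched as follows. This is not the authors' code; it is a minimal illustration using scikit-learn, with a synthetic Gaussian dataset standing in for a hyperspectral image, a fixed 10 training samples per class, and a sweep over the number of retained principal components:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.metrics import recall_score, precision_score

rng = np.random.default_rng(0)

# Synthetic stand-in for a hyperspectral dataset: 3 classes, 50 spectral bands.
n_bands, n_per_class = 50, 60
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(n_per_class, n_bands))
               for c in range(3)])
y = np.repeat(np.arange(3), n_per_class)

# Fixed training set: N_t = 10 samples per class, as in the experiment.
train_idx = np.hstack([np.where(y == c)[0][:10] for c in range(3)])
test_idx = np.setdiff1d(np.arange(len(y)), train_idx)

results = {}
for n_feat in (2, 5, 10, 20):  # sweep over the number of principal components
    pca = PCA(n_components=n_feat).fit(X[train_idx])
    clf = SVC(kernel='rbf', gamma='scale')
    clf.fit(pca.transform(X[train_idx]), y[train_idx])
    y_pred = clf.predict(pca.transform(X[test_idx]))
    # average class accuracy (macro recall) and average class reliability (macro precision)
    results[n_feat] = (recall_score(y[test_idx], y_pred, average='macro'),
                       precision_score(y[test_idx], y_pred, average='macro',
                                       zero_division=0))
```

Plotting `results` against the number of components reproduces the qualitative behaviour described above: performance rises and then saturates once the retained components capture the discriminative variance.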

CONCLUSION
In this paper, we evaluated the effect of the number of features and the number of training samples on the performance of the SVM classifier. We used PCA for feature reduction. As the number of features increases, the performance generally improves up to a certain number, after which further increasing the number of features has no effect on classification performance. Similarly, the improvement gained by increasing the number of training samples is no longer considerable beyond a certain number. These results were obtained for three commonly used hyperspectral images.

Figure 1. Ground truth map of the Indian Pines dataset

The second experiments investigate the effect of the number of training samples on classification performance. In these experiments, the number of features ($N_f$) is fixed at 16, 20 and 15 for the Indian Pines, Pavia and Salinas datasets, respectively. The results are shown in Figure 7. As can be seen, as the number of training samples increases, the performance of the classifier improves considerably up to a certain number, after which the improvement is not notable. The obtained classification maps for these experiments are shown in Figures 8-10.
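The second experiment can likewise be sketched in a few lines. Again this is an illustrative scikit-learn sketch on synthetic two-class data, not the authors' code: the number of features is held fixed (15, as for Salinas) while the number of training samples per class is swept:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic two-class data standing in for one dataset; N_f = 15 features.
n_feat = 15

def sample(c, n):
    """Draw n samples of class c from a Gaussian centred at 2*c per feature."""
    return rng.normal(loc=2.0 * c, scale=1.0, size=(n, n_feat))

# Fixed test set of 100 samples per class.
X_test = np.vstack([sample(c, 100) for c in (0, 1)])
y_test = np.repeat([0, 1], 100)

acc_curve = {}
for n_train in (5, 10, 20, 50):  # sweep over training samples per class
    X_tr = np.vstack([sample(c, n_train) for c in (0, 1)])
    y_tr = np.repeat([0, 1], n_train)
    clf = SVC(kernel='rbf', gamma='scale').fit(X_tr, y_tr)
    acc_curve[n_train] = clf.score(X_test, y_test)
```

Plotting `acc_curve` against the training set size shows the same saturating behaviour: accuracy climbs quickly for small training sets and then levels off.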

Figure 3. The average class accuracy and the average class reliability versus the number of principal components

Table 1. The number of samples in each dataset