COUPLE GRAPH BASED LABEL PROPAGATION METHOD FOR HYPERSPECTRAL REMOTE SENSING DATA CLASSIFICATION

Graph based semi-supervised classification methods are widely used for hyperspectral image classification. We present a couple graph based label propagation method that contains both the adjacency graph and the similar graph. We propose to construct the similar graph from the similar probability, which exploits the probable label similarity among examples. The adjacency graph is constructed with a common manifold learning method, which has effectively improved the classification accuracy of hyperspectral data. The experiments indicate that the couple graph Laplacian, which unites the adjacency graph and the similar graph, produces better classification results than other manifold learning based and sparse representation based graph Laplacians in the label propagation framework.

The most common methods, such as Mincut (Blum et al., 2001), Gaussian Random Fields and Harmonic Functions (Zhu et al., 2003), abbreviated GRHF, Local and Global Consistency (Zhou et al., 2003), abbreviated LGC, Linear Neighborhood Propagation (Wang et al., 2007), abbreviated LNP, and local tangent space alignment (Zhang et al., 2004), abbreviated LTSA, are widely applied to data classification by numerous researchers. For hyperspectral image classification, supervised classifiers require a large number of labeled data because of the high dimensional spectra. However, labeled instances are often difficult, costly, or time consuming to obtain. Semi-supervised learning algorithms, which use both labeled and unlabeled data, are therefore widely employed to solve the small sample size problem.
Usually, the graph Laplacian matrix in a graph based label propagation method is obtained by constructing the data adjacency graph and choosing the graph edge weights. The k-nearest-neighbor method is used to construct the data adjacency graph, and the graph weights are chosen as binary weights, heat kernel weights, or Euclidean distance weights. However, these weights are determined only by the pairwise distances between data points, ignoring the neighborhood relations and thereby potentially underutilizing the available information.
In this paper, we propose a novel graph in the label propagation (LP) framework that unites the adjacency graph and the similar graph. Since the manifold learning (ML) approach is capable of exploring the manifold geometry of data (Belkin et al., 2006), it is suitable for calculating the graph Laplacian in LP. Laplacian eigenmaps (LE), proposed by Belkin (2003), is used to construct the adjacency graph in this paper. The class probability of each unlabeled point is calculated by solving an l1 optimization problem, and it has a significant influence on the construction of the similar graph. In this study, the adjacency graph and the similar graph are linearly combined in the label propagation framework. Experiments on real hyperspectral data sets demonstrate the effectiveness of our approach.
The rest of the paper is organized as follows. Section 2 reviews the framework of label propagation and the way to construct the adjacency graph and the similar graph. Section 3 shows the experimental results, where four methods are compared on two hyperspectral data sets and two factors that influence the graph Laplacian are analyzed. Finally, conclusions are summarized in Section 4.

The label propagation framework
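Under the regularization framework, graph based label propagation is used to exploit the geometry of the marginal distribution. Let $X_l = [x_1, \ldots, x_l]$ denote $l$ labeled data with labels $Y_l = [y_1, \ldots, y_l]$, and let $X_u = [x_{l+1}, \ldots, x_{l+u}]$ denote $u$ unlabeled data. The regularized function to be minimized is defined as:

$$f^* = \arg\min_f \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_I \|f\|_I^2$$

where $V$ is some loss function and $\gamma_I$ is the corresponding regularization parameter. $\|f\|_I^2$ is the manifold regularization term that reflects the smoothness of $f$ on the data manifold, which is defined as:

$$\|f\|_I^2 = \frac{1}{(l+u)^2} \mathbf{f}^T M \mathbf{f}, \qquad \mathbf{f} = [f(x_1), \ldots, f(x_{l+u})]^T$$

where $M = D - W$ is the graph Laplacian matrix, $W$ holds the edge weights of the graph, and $D$ is the diagonal degree matrix of $W$ given by $D_{ii} = \sum_j W_{ij}$. The factor $1/(l+u)^2$ is the normalizing coefficient, the natural scale factor for the empirical estimate of the Laplace operator.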
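As a small illustration of the quantities in this section, the following sketch builds the graph Laplacian M = D - W for a toy graph and evaluates the smoothness term. The propagation step uses the harmonic solution of GRHF (Zhu et al., 2003) purely as one familiar instance of label propagation; it is an assumption for illustration, not necessarily the solver used in this paper. The function names and the toy graph are hypothetical.

```python
import numpy as np

def graph_laplacian(W):
    """Return M = D - W, where D is the diagonal degree matrix of W."""
    D = np.diag(W.sum(axis=1))
    return D - W

def smoothness(f, M):
    """Manifold regularization term f^T M f / (l + u)^2."""
    n = len(f)  # n = l + u, the total number of nodes
    return float(f @ M @ f) / n**2

def harmonic_labels(M, f_l, n_labeled):
    """Harmonic solution for the unlabeled nodes (assumed solver).

    Assumes the first n_labeled nodes are the labeled ones.
    """
    M_uu = M[n_labeled:, n_labeled:]
    M_ul = M[n_labeled:, :n_labeled]
    return np.linalg.solve(M_uu, -M_ul @ f_l)

# Toy graph: nodes 0 and 1 are labeled, nodes 2 and 3 are unlabeled.
# Edges (unit weight): 0-2, 2-3, 3-1.
W = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
M = graph_laplacian(W)
f_l = np.array([1.0, -1.0])            # labels of nodes 0 and 1
f_u = harmonic_labels(M, f_l, n_labeled=2)
f = np.concatenate([f_l, f_u])
print(f_u, smoothness(f, M))
```

In this toy case the unlabeled node attached to the positive label receives a positive score and the one attached to the negative label receives a negative score, which is the smoothness behavior the regularization term encourages.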

The couple graph based label propagation method
In this paper, we build a couple graph that combines the adjacency graph and the similar graph. The geometry of the data is modeled with the couple graph, whose nodes consist of both labeled and unlabeled data points connected by weighted edges. The couple graph Laplacian matrix M is defined as a linear combination of the adjacency graph Laplacian L and the similar graph Laplacian L̃, where a is the scale factor for the empirical estimate of the adjacency graph and the similar graph, and a tradeoff coefficient balances the similar graph against the adjacency graph.
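The linear combination of the two graph Laplacians can be sketched as follows. The exact form of the combination is not recoverable from the text here, so the convex combination below, the parameter name `beta` for the tradeoff, and the default `a = 1.0` are all assumptions made for illustration.

```python
import numpy as np

def laplacian(W):
    """Graph Laplacian D - W of a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

def couple_laplacian(L_adj, L_sim, beta, a=1.0):
    """Assumed form: M = a * ((1 - beta) * L_adj + beta * L_sim).

    beta (hypothetical name) trades off the similar graph against the
    adjacency graph; a is the overall scale factor.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is assumed to lie in [0, 1]")
    return a * ((1.0 - beta) * L_adj + beta * L_sim)

# Toy weights for three nodes: an adjacency graph and a similar graph.
W_adj = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
W_sim = np.array([[0.0, 0.0, 0.8],
                  [0.0, 0.0, 0.0],
                  [0.8, 0.0, 0.0]])
L_adj, L_sim = laplacian(W_adj), laplacian(W_sim)
M = couple_laplacian(L_adj, L_sim, beta=0.3)
print(M)
```

With `beta = 0` the sketch reduces to the adjacency graph Laplacian alone, and any such combination keeps the zero row sums of a graph Laplacian.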

L = D - W is the adjacency graph Laplacian. The adjacency graph is constructed from the labeled and unlabeled data using k-nearest neighbors, where spectral information divergence (SID) is used to choose the neighbors of each node. The edge weight matrix W is then calculated using LE.
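A minimal sketch of this construction is given below. SID is computed as the symmetric Kullback-Leibler divergence between spectra normalized to sum to one, which assumes strictly positive spectra. The heat kernel weighting on the SID values and the parameter `t` are assumptions standing in for the LE weighting, whose exact form is not stated here.

```python
import numpy as np

def sid_matrix(X, eps=1e-12):
    """Pairwise spectral information divergence between the rows of X."""
    P = X / X.sum(axis=1, keepdims=True)    # spectra as probability vectors
    P = np.clip(P, eps, None)               # guard against log(0)
    logP = np.log(P)
    self_term = (P * logP).sum(axis=1)      # sum_b p_i log p_i
    cross = P @ logP.T                      # cross[i, j] = sum_b p_i log p_j
    kl = self_term[:, None] - cross         # kl[i, j] = KL(p_i || p_j)
    return kl + kl.T                        # SID = KL(p||q) + KL(q||p)

def knn_adjacency(S, k, t=1.0):
    """Symmetric k-NN weight matrix from a dissimilarity matrix S.

    Weights use a heat kernel exp(-S / t) (assumed weighting).
    """
    n = S.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(S[i])
        neighbors = [j for j in order if j != i][:k]
        W[i, neighbors] = np.exp(-S[i, neighbors] / t)
    return np.maximum(W, W.T)               # keep an edge if either end chose it

rng = np.random.default_rng(0)
X = rng.uniform(0.1, 1.0, size=(6, 5))      # 6 toy spectra with 5 bands
S = sid_matrix(X)
W = knn_adjacency(S, k=2)
L = np.diag(W.sum(axis=1)) - W              # adjacency graph Laplacian
```

Symmetrizing with the elementwise maximum means a node can end up with more than k neighbors; a mutual k-NN rule (elementwise minimum) would be the stricter alternative.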
For the similar graph, W̃ denotes the edge weight matrix. The main question is how to find the similar data pairs and calculate their edge weights. We choose the similar data pairs by solving an l1 optimization problem on sparse representation (SR). First, the class probability of each unlabeled data point (X_0)_i is calculated from its sparse coefficient vector over the training data and the true labels of the training data. Second, the class probability is used to find the similar data pairs and to calculate the similar probability that weights the edges between them.
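The two steps can be sketched as follows, under several loudly stated assumptions, since the original formulas are not available here: the sparse code is obtained with a non-negative l1-regularized least squares problem solved by projected ISTA, the class probability is the share of coefficient mass on each class, and the similar probability of a pair is the inner product of their class probability vectors, kept only when it reaches the threshold T. All function names and the toy data are hypothetical.

```python
import numpy as np

def nonneg_sparse_code(A, x, lam=0.01, n_iter=500):
    """Minimize 0.5*||x - A a||^2 + lam*||a||_1 subject to a >= 0.

    Columns of A are training samples. Solved with projected ISTA.
    """
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)   # 1 / Lipschitz constant
    a = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ a - x)
        a = np.maximum(0.0, a - step * (grad + lam))
    return a

def class_probability(a, labels, classes):
    """Share of sparse coefficient mass on each class (assumed rule)."""
    mass = np.array([a[labels == c].sum() for c in classes])
    total = mass.sum()
    if total == 0:
        return np.full(len(classes), 1.0 / len(classes))
    return mass / total

def similar_graph(P, T):
    """Edge weights from pairwise similar probability P P^T, thresholded at T."""
    S = P @ P.T                 # probability that two points share a class
    W = np.where(S >= T, S, 0.0)
    np.fill_diagonal(W, 0.0)    # no self loops
    return W

# Toy training set: two samples per class, two features.
A = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9]])
labels = np.array([0, 0, 1, 1])
x = np.array([1.0, 0.05])       # an unlabeled point close to class 0
a = nonneg_sparse_code(A, x)
p = class_probability(a, labels, classes=[0, 1])

P = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
W_sim = similar_graph(P, T=0.5)
```

The non-negativity constraint is what lets the coefficients be normalized into a probability vector directly; with signed coefficients a different rule, such as class-wise reconstruction residuals, would be needed.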

Data Description
Two hyperspectral data sets were used in the experiments. The first was collected by the Hyperion scanner on the EO-1 satellite over the Okavango Delta, Botswana (BOT) in May 2001; it has a 30-m spatial resolution and covers the 357-2576 nm portion of the spectrum in 10-nm bands. The second is the 224-band AVIRIS data set collected over the Kennedy Space Center (KSC) in March 1996; it has an 18-m spatial resolution and a 10-nm spectral resolution over the range of 400-2500 nm.
After removing the uncalibrated and noisy bands, 149 bands and 176 bands remain for the BOT and KSC data, respectively.
In the BOT data, the labeled data consist of nine identified land cover types, of which class 3 (Riparian) and class 6 (Woodlands) are very much alike. In the KSC data, 13 land cover classes were labeled. Cabbage Palm/Oak Hammock (class 4) and Slash Pine (class 5) are both trees that grow in upland areas; they have mixed spectral signatures with subtle differences and are very difficult to classify. To focus on the classification performance of the novel graph Laplacian, we chose classes 3 and 6 of the BOT data and classes 4 and 5 of the KSC data as our experimental data sets. For multiclass problems, a "One Versus Other" classification strategy can be used. All the points are divided into two subsets, one for training and the other for testing.

Analysis of CGLP
In the proposed couple graph based label propagation (CGLP) method, we chose radial basis function (RBF) kernels with kernel width δ, which was varied over {0.001, 0.01, 0.1}. The other three parameters are k, T, and the tradeoff coefficient. The number of nearest neighbors in the adjacency graph, k, was varied in the range {5, ..., 20} with a step of 5. T is an empirical threshold on the similar probability and was varied in the range (0.1, 0.9) with a step of 0.1. The tradeoff coefficient between the similar graph and the adjacency graph was varied in the range (0, 1) with a step of 0.1.

Figure 1 shows the overall accuracies for Classes 3 and 6 of the BOT data set, and Figure 2 shows the overall accuracies for Classes 4 and 5 of the KSC data set, obtained with the four methods (GRHF, LNP, NSRC, and CGLP). Several observations can be made: 1) CGLP always produced higher classification accuracies than the other three methods. For the KSC data, CGLP performs slightly better than the other three methods; for the BOT data, it significantly outperforms them. 2) As the number of labels decreases, CGLP performs better than NSRC. 3) For both the KSC and BOT data, NSRC improves quickly as the number of labeled data increases.
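The parameter ranges listed in this section can be enumerated as a simple grid. In this sketch the name `betas` for the tradeoff values is hypothetical, and the endpoints are assumptions: T is taken as 0.1 to 0.9 inclusive and the tradeoff as 0.0 to 1.0 inclusive, since the text does not say whether the endpoints are included.

```python
from itertools import product

deltas = [0.001, 0.01, 0.1]              # RBF kernel width
ks = [5, 10, 15, 20]                     # nearest neighbors in the adjacency graph
Ts = [i / 10 for i in range(1, 10)]      # threshold on the similar probability
betas = [i / 10 for i in range(0, 11)]   # tradeoff values (endpoints assumed)

# Every combination of (delta, k, T, beta) to be evaluated.
grid = list(product(deltas, ks, Ts, betas))
print(len(grid))
```

Generating the decimal steps from integers avoids the accumulated rounding error that repeated addition of 0.1 would introduce.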

Analysis of threshold T and the tradeoff coefficient
To analyze the impact of the empirical threshold T on the graph Laplacian, we selected 30 labeled samples for each class and varied T over {0.3, …, 0.7} in steps of 0.1, with δ and k as given above. The value of T can be neither too large nor too small. If T is too small, the threshold is meaningless; if it is too large, the similar graph contributes little to the graph Laplacian. Adjusting T therefore seeks out the important and meaningful points. The results on random test data for Classes 4 and 5 of the KSC data are presented in Figure 3.
To analyze the impact of the tradeoff between the similar graph and the adjacency graph, we fixed the empirical threshold T at 0.5 and selected 30 labeled samples for each class, with δ and k as given above. Empirically, the tradeoff coefficient works better when it is less than 0.5, so we varied it over {0, …, 0.4} in steps of 0.1. As the coefficient increases, the classification accuracy increases. The results on random test data for Classes 4 and 5 of the KSC data are presented in Figure 4. It should be noted that graphs built from other characteristics can also be united in this framework. Moreover, the combination of the similar graph and the adjacency graph is not limited to linear methods; combining multiple graphs with nonlinear methods is left for future work.

Figure 1 Classification results by different methods over Classes 3 and 6 of the BOT data

Four classifiers were applied to the two data sets: GRHF, LNP, NSRC (Non-negative Sparse Representation Classifier), and CGLP. NSRC is a sparse representation based supervised method. It should be noted that the k-NN method was employed as the adjacency measurement to search for neighbors.

Figure 2 Classification results by different methods over Classes 4 and 5 of the KSC data

Figure 3 The impact of T for Classes 4 and 5 of the KSC data