BAND SELECTION OF HYPERSPECTRAL IMAGES BASED ON MARKOV CLUSTERING AND SPECTRAL DIFFERENCE MEASUREMENT FOR OBJECT EXTRACTION

Existing hyperspectral image (HSI) band selection (BS) algorithms do not consider the strong correlation between adjacent bands and cannot support high-precision extraction of a single target. To address this, an HSI BS algorithm based on Markov clustering and target land-type spectral difference measurement is proposed in this paper. Specifically, when Markov clustering is used for band clustering, the inter-band correlation information is embedded and breakpoints are set at noisy or bad-channel bands, so that the bands are adaptively divided into optimal clustering subsets. Then, in each cluster, an evaluation criterion function based on the band difference under the supervision of the target category is designed to select the optimal band combination for single-target object extraction. The proposed BS algorithm is called MCLSD for short. Taking the ZY-1 02D HSI of Tongnan, Chongqing as experimental data, cultivated land as the extraction object and Random Forest as the classifier, the classification accuracy of the selected bands is evaluated. In addition, MCLSD is compared with improved sparse subspace clustering (ISSC) (Sun et al., 2015), orthogonal projection band selection (OPBS) (Zhang et al., 2018) and sparse nonnegative matrix factorization (SNMF) (Qin et al., 2015). Experimental results show that the MCLSD algorithm can select the bands most suitable for cultivated land extraction and achieve higher classification accuracy. In particular, when the number of bands is less than 5, MCLSD has significant advantages over ISSC, OPBS and SNMF. The MCLSD BS method can therefore meet the demand for high-precision extraction of target features from HSI data.


INTRODUCTION
Hyperspectral images (HSI) capture information about targets of interest through a large number of narrow electromagnetic bands (Tong et al., 2006). Compared with multispectral images, these bands provide richer spectral and image information, describe the spectral characteristics of targets better, and improve detection and recognition ability. HSI is therefore widely used in research fields such as precision agriculture, forestry resource investigation, water quality monitoring, and production quality inspection (Yu et al., 2013). Because hyperspectral bands are narrow and overlap each other, information redundancy is high and the bands are strongly correlated, which leads to large storage requirements and long processing times. In addition, some bands contain serious noise or invalid zero values (bad channels). If the data are used directly for image classification, the results are affected and the Hughes phenomenon (the curse of dimensionality) appears, degrading classification performance (Bioucas-Dias et al., 2013). Dimensionality reduction has therefore become a hot issue in HSI analysis.
HSI dimensionality reduction can be divided into feature extraction and feature selection (band selection). Feature extraction maps high-dimensional data to a low-dimensional space according to certain criteria and extracts new feature subsets to represent the original HSI data. Typical methods include principal component analysis (PCA), linear discriminant analysis (LDA) and independent component analysis (ICA) (Wang et al., 2021). However, the spatial transformation changes the physical meaning of the original HSI data, and some key information is lost. Feature selection, also known as band selection, selects a subset of the original spectral bands and thus preserves the physical meaning of each band. Subsequent data analysis then needs only the selected subset of spectral bands. Compared with feature extraction, band selection better preserves physical information and is better able to interpret and express the original data.
According to whether labelled samples are used, band selection can be divided into unsupervised, supervised and semi-supervised methods. Unsupervised methods select a subset of the hyperspectral bands by means of an evaluation criterion function, without using labelled samples. Commonly used criteria include the variance, the signal-to-noise ratio, the entropy, k-order statistics and the Euclidean distance (Sun et al., 2019). Based on their strategy, existing unsupervised band selection methods can be divided into ranking, searching, clustering, sparsity-based, embedded-learning and hybrid methods (Sun et al., 2017). Although the band set selected by an unsupervised method has low redundancy and a high signal-to-noise ratio, it may not support high-precision extraction of a single target. Supervised and semi-supervised methods select the relevant bands according to supervision information and prior knowledge. Because label information makes it possible to evaluate the separability of classes, bands that discriminate strongly for particular classes can be selected. The supervised approach is thus oriented to classification and recognition and is conducive to better classification performance. Although these band selection methods can achieve satisfactory classification results, they have three shortcomings. First, there is currently no band selection algorithm aimed at high-precision extraction of a single target category. Second, existing methods do not consider that a band is strongly correlated with adjacent bands within a certain range and weakly correlated with more distant bands, so that discontinuous bands from different wavelength ranges should not be clustered together. Third, band clustering is affected when some bands have quality problems.
To address the problems that current band selection algorithms do not consider the strong correlation between adjacent bands and do not support high-precision extraction of a single target category, this paper proposes an HSI band selection algorithm based on Markov clustering and target-object spectral difference measurement. Specifically, when Markov clustering is used for band clustering, the correlation information between bands is taken into account and breakpoints are set at noisy and bad-channel bands, so that the bands are adaptively divided into optimal clustering subsets. Then, a spectral difference measurement function under the supervision of the target category is designed, and the optimal band combination for extracting a single object is selected in each cluster. In addition, taking the ZY-1 02D HSI as the experimental data and Random Forest as the classifier, this paper evaluates whether the selected bands achieve high classification accuracy and determines the optimal number of bands. The method is also compared with three BS methods: ISSC, OPBS and SNMF.

METHOD
The detailed procedure of the method introduced in this paper is shown in Fig. 1. First, the adjacency matrix describing the relationship between bands is constructed. For an HSI, the correlation matrix between bands is calculated, and the reciprocal of the index distance between bands is used as the weight to construct the adjacency matrix. If a band has serious noise, or all of its pixel values are zero, the band has no connection with other bands and is an isolated node in the Markov graph. Then, the Markov clustering algorithm is used to adaptively divide the bands into several clusters; this includes eliminating parity dependence, standardizing the probability matrix, alternating expansion and inflation operations, and iterating the clustering process until it converges (Ariful et al., 2018). Finally, in each cluster, an evaluation criterion function based on the band difference measurement under the supervision of the target category is designed to select the optimal band combination for single ground-object extraction.

Markov Clustering with embedded Spectral Proximity Correlation
A band of an HSI is strongly correlated with adjacent bands within a certain range and weakly correlated with more distant bands. In addition, some bands have serious data quality problems, such as zero-value bands, bad lines and stripe noise. K-means clustering, Gaussian mixture clustering and similar algorithms have difficulty taking the strong correlation between adjacent bands into account. Markov clustering is a graph clustering algorithm that assigns closely connected nodes to the same cluster according to the connections between nodes. As shown in Fig. 2, the relationship between the bands of an HSI can be represented by a Markov graph model, where N is the total number of bands, each node represents a band, two nodes are joined by a connecting edge, and the transition probability on each edge represents the correlation between the two bands. Markov clustering on this graph model brings the band clustering to a stable state through repeated expansion and inflation operations and can therefore cluster adjacent, strongly correlated bands.

Construct the adjacency matrix of inter band correlation:
For HSI data, the correlation coefficient $r_{ij}$ between the $i$th and $j$th bands is calculated, and the reciprocal of the index distance between the bands is used as the weight to construct the inter-band adjacency matrix $B$. If the noise of a band is serious, or all of its values are zero, the band has no connection with other bands and is an isolated node in the Markov graph.
$$b_{ij} = \frac{r_{ij}}{|i - j|}, \quad i \neq j \qquad (2)$$
where $b_{ij}$ is the connection probability between the $i$th band and the $j$th band.
$$b_{ij} = 0 \quad \text{if the } i\text{th band or the } j\text{th band is an isolated node} \qquad (3)$$
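The construction described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pearson` and `build_adjacency` and the toy data are hypothetical, the Pearson coefficient is assumed as the correlation measure, and its absolute value is taken (an added assumption) so that all edge weights stay non-negative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length pixel vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant band (e.g. all zeros): treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def build_adjacency(bands, isolated=()):
    """Build the symmetric inter-band adjacency matrix.

    bands    -- list of per-band pixel vectors (one flat list per band)
    isolated -- indices of noisy or all-zero bands that get no edges
    """
    n = len(bands)
    bad = set(isolated)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if i in bad or j in bad:
                continue  # isolated node: leave the weight at 0
            # Assumption: |correlation| keeps the weights non-negative.
            weight = abs(pearson(bands[i], bands[j])) / (j - i)
            adjacency[i][j] = adjacency[j][i] = weight
    return adjacency


# Toy data: four "bands" of four pixels each; band 3 is an all-zero bad channel.
toy_bands = [[1, 2, 3, 4], [2, 4, 6, 8.5], [4, 3, 2, 1], [0, 0, 0, 0]]
B = build_adjacency(toy_bands, isolated=[3])
```

Dividing by the index distance `j - i` means that two equally correlated bands are linked more weakly the further apart they are in the spectrum, which is what lets the later clustering favour contiguous band groups.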

Eliminate Parity Dependency:
One of the core operations of the graph-based Markov clustering algorithm is expansion, which simulates the random walk of flow on the graph. When flow performs a random walk on a graph with certain structures, a parity dependency effect arises (for example, on a bipartite graph a walk can return to its starting node only after an even number of steps). To remove this effect, a self-loop is added to each vertex before the state transition matrix of the graph is processed; that is, every diagonal element of the matrix is set to 1. The improved adjacency matrix $B'$ is obtained as follows:
$$b'_{ij} = \begin{cases} 1, & i = j \\ b_{ij}, & i \neq j \end{cases}$$
where $b_{ij}$ is the element of the adjacency matrix before the self-loops are added.

Normalized Probability Matrix:
Based on the adjacency matrix after eliminating parity dependence, the probability matrix is standardized so that the transition probabilities of each band sum to 1. Here $b'_{ij} \in B'$ denotes the element in the $i$th row and $j$th column of the adjacency matrix $B'$ after eliminating parity dependence, $n$ is the total number of bands, and $p_{ij} \in P$ denotes the element in the $i$th row and $j$th column of the standardized probability matrix $P$.
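The two preparation steps, adding self-loops and standardizing, might look as follows. This is a sketch only: the helper names `add_self_loops` and `normalize_columns` are hypothetical, and the column-wise normalization $p_{ij} = b'_{ij} / \sum_{k=1}^{n} b'_{kj}$ is an assumption taken from the usual Markov clustering convention, since the normalization formula itself is not legible in the text. Because $B'$ is symmetric, normalizing by rows instead would simply give the transposed matrix.

```python
def add_self_loops(adjacency):
    """Set every diagonal element to 1 (one self-loop per band)."""
    n = len(adjacency)
    return [[1.0 if i == j else adjacency[i][j] for j in range(n)]
            for i in range(n)]


def normalize_columns(matrix):
    """Divide each element by its column sum so every column sums to 1."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # With self-loops and non-negative weights each column sum is >= 1,
    # so the division below cannot hit zero.
    return [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


# Toy adjacency: bands 0 and 1 are linked, band 2 is an isolated bad channel.
B = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
P = normalize_columns(add_self_loops(B))
```

Note that the isolated band keeps all of its probability mass on its own self-loop, so it can never be absorbed into a neighbouring cluster.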

Alternate Operation of Expansion and Inflation:
The alternating operation of expansion and inflation is the core of the clustering process. The expansion operation raises the probability matrix to the power $e$, which spreads the flow to different regions of the Markov graph:
$$P \leftarrow P^{e}$$
The value of $e$ determines the size of the walking region. In this paper, $e$ is set to 2. Then, the inflation operation is applied to the probability matrix $P$. Inflation strengthens the association between nodes within a cluster and weakens the association between nodes of different clusters; that is, it increases probabilities that are currently high and reduces those that are currently low. Its parameter $r$ determines the intensity of this effect and thus the granularity of the clustering. Specifically, each element $p_{ij}$, located in the $i$th row and $j$th column of $P$, is raised to the power $r$, and the matrix is then standardized again. In this paper, $r$ is set to 2.
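The expansion and inflation loop can be illustrated with a small, self-contained sketch. It assumes a column-stochastic matrix, the standard Markov clustering inflation $p_{ij}^{r} / \sum_{k} p_{kj}^{r}$, and a simple convergence test; the function names, the tolerance, the iteration cap and the 1e-6 read-out threshold are all illustrative choices rather than values from the paper.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def normalize_columns(matrix):
    """Rescale every column so that it sums to 1."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def expand(p, e=2):
    """Expansion: raise the matrix to the power e."""
    out = p
    for _ in range(e - 1):
        out = matmul(out, p)
    return out


def inflate(p, r=2):
    """Inflation: element-wise power r, then re-normalize the columns."""
    return normalize_columns([[v ** r for v in row] for row in p])


def markov_cluster(p, e=2, r=2, max_iter=100, tol=1e-8):
    """Alternate expansion and inflation until the matrix stops changing."""
    for _ in range(max_iter):
        nxt = inflate(expand(p, e), r)
        change = max(abs(x - y)
                     for row_n, row_p in zip(nxt, p)
                     for x, y in zip(row_n, row_p))
        p = nxt
        if change < tol:
            break
    # Each row that still carries mass is an attractor; the columns it
    # covers form one cluster. Duplicate member sets are merged.
    clusters = []
    for row in p:
        members = tuple(j for j, v in enumerate(row) if v > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return clusters


# Toy graph with self-loops: bands 0-1-2 form a chain, bands 3-4 a pair.
links = [[1, 1, 0, 0, 0],
         [1, 1, 1, 0, 0],
         [0, 1, 1, 0, 0],
         [0, 0, 0, 1, 1],
         [0, 0, 0, 1, 1]]
clusters = markov_cluster(normalize_columns(links))
```

Larger values of `r` sharpen the contrast between strong and weak transitions faster and tend to produce more, smaller clusters, which is the granularity effect described above.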

Band Selection based on Spectral Difference Measurement
After Markov clustering converges, adjacent bands with strong correlation are grouped into the same cluster. In each cluster, a spectral difference index (SDI) is designed under the supervision of the target object and the background to measure the spectral difference between one band and the other bands. So that the target can be extracted more accurately from the selected bands, the spectral difference between the target object and the background within a band is also taken as a component of the SDI. The Jensen-Shannon (JS) divergence measures the difference between two probability distributions. It is a variant of the Kullback-Leibler (KL) divergence that removes the asymmetry of KL divergence: JS divergence is symmetric, and with base-2 logarithms its value lies between 0 and 1.
$$JS(P_X, P_Y) = \frac{1}{2} KL\left(P_X, \frac{P_X + P_Y}{2}\right) + \frac{1}{2} KL\left(P_Y, \frac{P_X + P_Y}{2}\right)$$
where $P_X$ and $P_Y$ are the discrete probability distributions of band X and band Y respectively. The greater the value of $JS(P_X, P_Y)$, the greater the difference between $P_X$ and $P_Y$. The KL divergence is calculated as follows:
$$KL(P_X, P_Y) = \sum_x P_X(x) \log P_X(x) - \sum_x P_X(x) \log P_Y(x) \qquad (10)$$
The process of band selection according to SDI in all clusters is shown in Fig. 3. For each band in each cluster, the SDI is calculated, and the bands of each cluster are arranged in descending order of SDI. Given the set number of bands BN, the bands with the highest SDI values are selected in order from each cluster. When BN is not less than the number of clusters $m$, let BN divided by $m$ give the quotient $s$ and the remainder $r$; the first $s$ bands in each cluster are selected, and the remaining unselected bands are then arranged in descending order of SDI and the first $r$ of them are selected.
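The divergence and the quota rule described above can be sketched as follows. The SDI formula itself is not reproduced here, so `sdi` is treated as a given mapping from band index to score; the function names and the toy scores are hypothetical. Base-2 logarithms are assumed so that the JS divergence falls in [0, 1], and the behaviour when a cluster holds fewer than $s$ bands (topping up from the remaining bands) is an added assumption.

```python
from math import log2


def kl_divergence(p, q):
    """KL(p, q) in bits; terms with p(x) = 0 contribute nothing."""
    return sum(a * log2(a / b) for a, b in zip(p, q) if a > 0)


def js_divergence(p, q):
    """Symmetric JS divergence of two discrete distributions."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)


def select_bands(clusters, sdi, bn):
    """Pick bn bands: s per cluster, then the best r of the rest.

    clusters -- list of lists of band indices
    sdi      -- dict mapping band index to its SDI score (assumed given)
    bn       -- requested number of bands (BN in the text)
    """
    m = len(clusters)
    if bn < m:
        # The rule described in the text covers only bn >= m.
        raise ValueError("bn must be at least the number of clusters")
    s, remainder = divmod(bn, m)
    ranked = [sorted(c, key=lambda b: sdi[b], reverse=True) for c in clusters]
    selected = [b for c in ranked for b in c[:s]]
    leftovers = sorted((b for c in ranked for b in c[s:]),
                       key=lambda b: sdi[b], reverse=True)
    # Equals `remainder` extra bands when every cluster has at least s
    # bands; otherwise tops up to bn (assumption, not from the text).
    return selected + leftovers[:bn - len(selected)]


# Toy example: two clusters and made-up SDI scores.
toy_clusters = [[0, 1, 2], [3, 4]]
toy_sdi = {0: 0.1, 1: 0.9, 2: 0.5, 3: 0.3, 4: 0.7}
chosen = select_bands(toy_clusters, toy_sdi, bn=3)
```

Taking the same quota from every cluster first keeps the selected bands spread across the spectrum; only the remainder is filled purely by score.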

Supervised Classification to Determine the Optimal Number of Bands
The MCLSD band selection algorithm selects the most suitable bands for target extraction for a given number of bands. However, the accuracy of target extraction differs with the number of bands, so the optimal number of bands must also be found. This paper uses Random Forest supervised classification, with training and test samples, to determine it. Specifically, each candidate number of bands corresponds to a selected band set, and supervised classification on that set yields a classification accuracy value. The curve of classification accuracy against the number of bands is then drawn. The optimal number of bands is the one at which the accuracy is highest and stable.
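One way to make "highest and stable" operational is sketched below. The text does not define the criterion numerically, so this is an assumption: the function picks the smallest band count from which the accuracy stays within a tolerance `tol` of the best value. The function name, the tolerance and the accuracy values are hypothetical and purely illustrative; they are not results from the paper.

```python
def optimal_band_count(accuracy_by_count, tol=0.005):
    """Smallest band count whose accuracy, and that of every larger
    count, lies within tol of the best accuracy on the curve."""
    counts = sorted(accuracy_by_count)
    best = max(accuracy_by_count.values())
    for i, k in enumerate(counts):
        if all(best - accuracy_by_count[c] <= tol for c in counts[i:]):
            return k
    # No stable plateau (the curve drops at the end): fall back to the peak.
    return max(counts, key=accuracy_by_count.get)


# Made-up accuracy curve, for illustration only.
toy_curve = {1: 0.80, 2: 0.86, 3: 0.90, 4: 0.915, 5: 0.918, 6: 0.917, 7: 0.916}
best_count = optimal_band_count(toy_curve)
```

Preferring the start of the plateau over the exact maximum avoids choosing a larger band set for an accuracy gain that is within noise.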

HSI Experimental Data
The ZY-1 02D HSI data used in this paper cover Tongnan District, Chongqing, China, and were acquired on November 10, 2020. The ZY-1 02D satellite was successfully launched on September 12, 2019, and operates in a sun-synchronous orbit with a revisit period of 55 days and a design life of 5 years. The satellite carries a hyperspectral camera, the advanced hyperspectral imager (AHSI), which acquires data in 166 spectral bands with a swath width of 60 km and a spatial resolution better than 30 m. Its spectral resolutions in the visible and near-infrared and in the short-wave infrared reach 10 nm and 20 nm respectively. The 166 bands span 0.40~2.50 µm; the first 90 are visible and near-infrared bands, the last 76 are short-wave infrared bands, and the band indices run from 0 to 165.
The bands with indices 96 to 105, 122 to 136 and 153 to 165 (38 bands in total) contain serious noise or invalid zero-value pixels. In the MCLSD algorithm, these bands have no connection with other bands. The small HSI block of 630 × 380 pixels shown in Fig. 4 is used as the experimental data; it covers 215.46 square kilometres.
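As a quick consistency check on the stated coverage, assuming a ground sampling distance of exactly 30 m per pixel (the text only states that the resolution is better than 30 m), the block size and the area agree:

```latex
630 \times 30\,\mathrm{m} = 18.9\,\mathrm{km}, \qquad
380 \times 30\,\mathrm{m} = 11.4\,\mathrm{km}, \qquad
18.9\,\mathrm{km} \times 11.4\,\mathrm{km} = 215.46\,\mathrm{km}^2
```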

Training and Testing Samples
Sample areas of cultivated land and non-cultivated land were selected on a 2-meter resolution GF2 image from the same period, as shown in Fig. 5. A total of 7352 sample points were created in the sample areas and randomly divided into a training set and a testing set at a ratio of 6:4 by stratified sampling. In addition, Random Forest is used as the classifier, with decision trees as its base classifiers and the number of trees set to 10.
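A stratified 6:4 split of this kind can be sketched with the standard library alone. The function name, the seed and the toy labels are hypothetical; the per-class sample counts of the actual data set are not given in the text, so none are assumed here, and the Random Forest classifier itself is not shown.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.6, seed=0):
    """Split sample indices so each class keeps the same train/test ratio."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Toy labels only; the real class proportions are not stated in the text.
toy_labels = ["cultivated"] * 50 + ["non-cultivated"] * 100
train_idx, test_idx = stratified_split(toy_labels)
```

Splitting within each class, rather than over the pooled samples, keeps the ratio of cultivated to non-cultivated points the same in both sets, which matters when one class is much smaller than the other.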

Experimental Results and Analysis
After the HSI experimental data are processed by Markov clustering with embedded spectral proximity correlation, 21 band clusters are obtained, as shown in Table 1. Within each cluster the bands are contiguous. The clusters differ in size, from a minimum of 2 bands to a maximum of 9 bands. The clustering results show that adjacent bands with strong correlation are successfully grouped together.

CONCLUSIONS
An HSI band selection algorithm based on Markov clustering and target land-type spectral difference measurement is proposed in this paper. The experimental results show that the MCLSD algorithm can select the bands most suitable for cultivated land extraction and achieve higher classification accuracy. In particular, when the number of bands is less than 5, the MCLSD algorithm has significant advantages over ISSC, OPBS and SNMF. The BS method introduced in this paper can therefore meet the demand for high-precision extraction of target features from HSI data.