HYPERSPECTRAL IMAGE DENOISING USING A NONLOCAL SPECTRAL SPATIAL PRINCIPAL COMPONENT ANALYSIS

Hyperspectral image (HSI) denoising is a critical research area in image processing because noise degrades HSI quality, which in turn has a negative impact on tasks such as object detection and classification. In this paper, we develop a noise reduction method for hyperspectral imagery based on principal component analysis (PCA), which relies on the assumption that noise can be removed by retaining only the leading principal components. The main contribution of the paper is to introduce the spectral-spatial structure and nonlocal similarity of HSIs into the PCA denoising model. PCA with spectral-spatial structure exploits both the spectral and the spatial correlation of an HSI by using 3D blocks instead of 2D patches. Nonlocal similarity refers to the similarity between the referenced pixel and other pixels in a nonlocal area; the Mahalanobis distance is used to estimate this spatial-spectral similarity by calculating distances within 3D blocks. The proposed method is tested on both simulated and real hyperspectral images, and the results demonstrate that it is superior to several other popular HSI denoising methods.


INTRODUCTION
A hyperspectral image (HSI) is acquired with high spectral resolution, providing contiguous or noncontiguous bands throughout the 400-2500 nm region. HSIs support various important applications in the field of remote sensing (Wang and Niu, 2009), such as environmental monitoring, discrimination of land cover types, and mineral identification. However, noise alters the spectral curves of an HSI, which has a negative impact on HSI processing tasks such as classification, unmixing, subpixel mapping, and target detection. Therefore, reducing the influence of noise is an essential step in improving HSI quality.
In recent years, image denoising based on the principal component analysis (PCA) model has attracted increasing attention, and PCA has been shown to be an effective and efficient denoising approach because it separates signal and noise well by converting the data into the PCA domain. Examples include the adaptive PCA denoising scheme (Muresan and Parks, 2003) and SAR image denoising via clustering-based PCA (Xu et al., 2014). In hyperspectral denoising methods, however, PCA has served only as an auxiliary algorithm. Chen and Qian (Chen and Qian, 2008; Chen and Qian, 2009) proposed to perform dimension reduction and HSI denoising based on wavelet shrinkage and PCA. Chen and Qian (Chen and Qian, 2011) proposed an HSI denoising algorithm in which PCA is first used to decorrelate the data, and wavelets are then used to denoise the low-energy PCA output channels.
Traditionally, HSI denoising techniques are based on band-by-band or pixel-by-pixel processing, which leads to the loss of correlation between bands and pixels. In recent years, therefore, more and more HSI denoising algorithms exploit the high spatial and spectral correlation of hyperspectral data. Yuan et al. (Yuan et al., 2012) proposed a spectral-spatial adaptive total-variation (TV) model for HSI denoising, which accounts for noise intensity differences between bands and spatial property differences between pixels. Karami et al. (Karami et al., 2011) proposed the genetic kernel Tucker decomposition (GKTD) algorithm for HSI denoising, which exploits both the spectral and spatial information in the image. Xu et al. (Xu et al., 2017) proposed a method using a spatial-spectral Monte Carlo sampling approach, which is based on posterior probability in a nonparametric manner. Beyond HSI noise reduction, the spectral-spatial joint structure is also used in HSI data compression (Christophe et al., 2008) and classification (Qian and Ye, 2013). Among PCA-based HSI algorithms, although spectral PCA has been proposed for the segmentation of hyperspectral images (Yoshino, 2007), spatial-domain information has not been well integrated into the PCA model. Therefore, it is necessary to establish a PCA denoising model that combines spatial and spectral information.
Most noise reduction methods rely on the local information of signals. Their main drawback is that the information provided by the neighborhood is too limited to preserve the true structure, details, and texture of an image. To deal with this problem, the nonlocal algorithm was proposed for image denoising (Buades et al., 2005). The nonlocal approach is based on the assumption that every pixel in an image has many similar pixels in the same image. A nonlocal sparse representation based noise reduction algorithm was introduced by Dong et al. (Dong et al., 2011), in which the sparse representations of similar patches are recovered by a regularized linear regression model with a shared sparsity constraint. For HSI, nonlocality also exists in the spectral-spatial space, which suggests that similarity must be considered in every 3D block of the HSI. Therefore, the nonlocal similarity is not the distance between two points, but the similarity of two correlation matrices. The Mahalanobis distance (Jr et al., 2010) takes into account the correlation of the dataset and is scale invariant, i.e., not dependent on the scale of measurements. Therefore, the Mahalanobis distance can be used to calculate the nonlocal similarity in the spectral-spatial space. As a result, constructing a nonlocal spatial-spectral correlation model based on PCA is crucial for hyperspectral denoising.
This paper, therefore, introduces a novel approach based on nonlocal spectral-spatial PCA (nonlocal SSPCA) for HSI denoising, which integrates the spatial and spectral information of the nonlocal area into the PCA model. The novelty of nonlocal SSPCA lies in the following aspects: 1) separating the signal and noise in the HSI by PCA alone; 2) exploiting the spectral-spatial joint structure of the HSI in PCA; 3) incorporating the nonlocality of the HSI into the PCA denoising model by calculating the Mahalanobis distance among pixels.
This paper is organized as follows. Section II introduces the proposed denoising framework, as well as the PCA and the nonlocal similarity approaches. Section III presents the experimental results on both simulated and real hyperspectral images. Section IV concludes the study.

Problem Formulation
The HSI, as a collection of all spatial positions and all bands, is represented by $I_{jkb}$ ($j = 1, 2, \ldots, m$; $k = 1, 2, \ldots, n$; $b = 1, 2, \ldots, p$), where $j$ and $k$ determine the location of $I_{jkb}$ in image space and $b$ is the band number in the spectral domain. $I_{jk:}$ is a $p \times 1$ vector representing the spectral curve at position $(j, k)$, and $I_{::b}$ is a $z \times 1$ ($z = m \times n$) vector representing all the pixels in the $b$th band, where $I$ is an HSI contaminated by noise. The noise degradation model of the HSI can be written as (Yuan et al., 2012; Atkinson et al., 2003)
$$I_{jkb} = X_{jkb} + n_b \qquad (1)$$
where $X_{jkb}$ denotes the unobservable noise-free variable and $n_b$ is Gaussian white noise. In the proposed PCA denoising model, $I_{jkb}$ alone cannot reflect the spatial and spectral correlation of the HSI. We therefore construct a matrix $Y$ that carries the spatial-spectral information, and use $Y$ as the input to the PCA model for spatial-spectral denoising of the HSI. Each sample $y_i$ of $Y$ is a $t \times 1$ vector. From (1), we obtain
$$Y = X + N$$
where $X$ and $N$ are the noise-free data and the noise arranged in the same way as $Y$. The problem is thus transformed into the estimation of $X$ from the noisy measurement $Y$.
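The indexing conventions and the additive noise model can be illustrated with a small NumPy sketch. The array sizes, the noise level `sigma`, and the random stand-in for the clean image are assumptions chosen only for illustration; note also that NumPy indices start at 0, whereas the text counts from 1.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 4, 5, 6                                # toy sizes (hypothetical)
X = rng.random((m, n, p))                        # stand-in for the noise-free HSI
sigma = 0.1                                      # assumed noise standard deviation
I = X + sigma * rng.standard_normal((m, n, p))   # additive model: I = X + noise

spectrum = I[1, 2, :]               # I_{jk:}: the p-element spectrum at one position
band = I[:, :, 3].reshape(-1)       # I_{::b}: all z = m*n pixels of one band
```

Here `spectrum` has `p` elements and `band` has `z = m * n` elements, matching the vector sizes given in the text.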

HSI denoising in PCA domain
In Section 2.1, the problem was reduced to denoising the matrix $Y$ by PCA. The goal of PCA is to find an orthonormal transformation matrix $P$ that de-correlates the data. In the data centralization step, the mean $u$ is in practice calculated from the samples of $Y$, not of $X$. However, because the noise is zero-mean, the mean vector of $y_i$ is the same as that of $x_i$, i.e., $E[y] = E[x] = u$. Therefore, we subtract $u$ from $Y$ to get the centralized dataset
$$\bar{Y} = Y - u$$
The optimal PCA transformation matrix $P$ could be obtained from the covariance matrix $\Sigma_X$ of $X$. However, the available dataset $\bar{Y}$ is contaminated by noise, so $\Sigma_X$ cannot be computed directly. We therefore estimate it by using the linear noise model $\bar{y} = \bar{x} + n$, under which
$$\Sigma_{\bar{Y}} = E[\bar{y}\bar{y}^T] = E[(\bar{x} + n)(\bar{x} + n)^T] = E[\bar{x}\bar{x}^T] + E[\bar{x}n^T] + E[n\bar{x}^T] + E[nn^T]$$
The signal and the noise are uncorrelated, so the cross terms $E[\bar{x}n^T]$ and $E[n\bar{x}^T]$ are nearly zero matrices, and we get
$$\Sigma_{\bar{Y}} = \Sigma_X + \Sigma_N$$
where $\Sigma_X$ and $\Sigma_N$ are the covariance matrices of the signal and the noise, respectively. Since Gaussian white noise is uncorrelated, $\Sigma_N$ is a $t \times t$ diagonal matrix with all diagonal components equal to $\sigma_N^2$, i.e., $\Sigma_N = \sigma_N^2 I$. Thus we have
$$\Sigma_{\bar{Y}} = \Sigma_X + \sigma_N^2 I$$
where $I$ is the identity matrix. Adding $\sigma_N^2 I$ shifts every eigenvalue of $\Sigma_X$ by $\sigma_N^2$ but leaves its eigenvectors unchanged, so the PCA transformation matrix $P$ associated with $\Sigma_X$ is the same as the one associated with $\Sigma_{\bar{Y}}$. Therefore, in the PCA denoising model, we can decompose $\Sigma_{\bar{Y}}$ directly and use $W_{\bar{Y}}$ in place of $W_X$:
$$\Sigma_{\bar{Y}} = W \Lambda W^T, \quad W = [W_1, W_2, \ldots, W_t], \quad \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_t)$$
where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_t$ are the eigenvalues of $\Sigma_{\bar{Y}}$, and $W_r$ ($r = 1, 2, 3, \ldots, t$), $t \times 1$ vectors, denote the sequence of mutually orthogonal PCA bases onto which the projection of the image stack $\bar{Y}$ produces the PCs with sequentially largest variances $\lambda_r$. The orthonormal PCA transformation matrix for $X$ is then
$$P = W$$
and we transform the data into the PCA domain by
$$\hat{Y} = \bar{Y} P$$
where $\hat{Y}$ is the de-correlated dataset. Most of the signal energy of $\hat{Y}$ is concentrated in the first few components, while the last few components are mostly due to noise. Therefore, by selecting the first $K$ PCs, which contain most of the signal, the noise can be removed:
$$\hat{X} = \hat{Y}(:, 1:K)\, P^T(1:K, :) \qquad (12)$$
Reformatting $\hat{X}$ yields the denoised dataset. Since $Y$ is constructed from the original hyperspectral data $I$ according to a fixed rule, we finally use $\hat{X}$ to reconstruct the denoised HSI by inverting that rule.
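The PCA-domain denoising described in this section can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: it assumes the samples are stored as the rows of `Y` (so that the projection is a right-multiplication by `P`), the function name `pca_denoise` is hypothetical, and adding the mean back after reconstruction is a choice made here that the text does not spell out.

```python
import numpy as np

def pca_denoise(Y, K):
    """Keep the K leading principal components of Y (samples in rows, shape (z, t))."""
    u = Y.mean(axis=0)                    # sample mean, estimates E[y] = E[x]
    Yc = Y - u                            # centralized dataset
    cov = Yc.T @ Yc / Yc.shape[0]         # t x t sample covariance of the noisy data
    eigvals, W = np.linalg.eigh(cov)      # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]     # reorder so that lambda_1 >= ... >= lambda_t
    P = W[:, order]                       # columns are the PCA bases
    Y_hat = Yc @ P                        # projection into the PCA domain
    X_c = Y_hat[:, :K] @ P[:, :K].T       # back-projection from the first K PCs
    return X_c + u                        # assumption: restore the mean afterwards

# Toy check on a low-rank signal corrupted by white Gaussian noise.
rng = np.random.default_rng(0)
z, t, rank = 500, 20, 3
X = rng.standard_normal((z, rank)) @ rng.standard_normal((rank, t))
Y = X + 0.3 * rng.standard_normal((z, t))
X_est = pca_denoise(Y, K=rank)
err_before = np.mean((Y - X) ** 2)
err_after = np.mean((X_est - X) ** 2)
```

In this toy setting the signal lies in a 3-dimensional subspace, so discarding the trailing components removes most of the noise energy and `err_after` is smaller than `err_before`. With `K = t`, the function returns `Y` unchanged.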

Nonlocal spatial spectral PCA denoising
The key to the nonlocal algorithm is to select the $T$ pixels with the highest similarity to the central pixel in an $L \times L \times p$ 3D block.
Because every pixel with spectral information can be represented as one of $L \times L$ correlated random vectors, the classical Euclidean distance is not suitable for measuring the distance between them. Therefore, the Mahalanobis distance is used to calculate the distance between these random vectors:
$$S(I_{jk:}, v_{l:}) = (I_{jk:} - v_{l:})^T \Sigma^{-1} (I_{jk:} - v_{l:}) \qquad (13)$$
where $v_{l:}$ ($l = 1, 2, \ldots, L \times L - 1$) represents the pixels other than the center pixel $I_{jk:}$. Since we select only the $T$ pixels that are closest to the center pixel, $l$ runs from 1 to $T$ in the subsequent calculation. The similarity weights $w(I_{jk:}, v_{l:})$ are then defined from these distances by (14) and (15). Let $nl_{g,:}$ ($g = 0, 1, 2, \ldots, T$) denote the $p \times 1$ vectors distributed around the referenced pixel. Therefore, in nonlocal SSPCA, the vector $y_i$ in (2) has $t = p \times (T + 1)$ elements, and its covariance matrix is given by (17).
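The distance computation and the selection of the $T$ nearest pixels can be sketched as below. The text does not state how $\Sigma$ is estimated, so this sketch assumes it is the band covariance of the candidate spectra in the window, with a small ridge term added for invertibility; both are assumptions, and the function name is hypothetical.

```python
import numpy as np

def nearest_by_mahalanobis(center, candidates, T):
    """center: (p,) spectrum; candidates: (l, p) spectra from the nonlocal window."""
    cov = np.cov(candidates, rowvar=False)            # p x p covariance (assumed estimator)
    cov = cov + 1e-6 * np.eye(cov.shape[0])           # ridge term, added here for stability
    diff = candidates - center                        # one difference vector per candidate
    # S_l = diff_l^T Sigma^{-1} diff_l, evaluated for all candidates at once
    S = np.einsum("lp,lp->l", diff @ np.linalg.inv(cov), diff)
    idx = np.argsort(S)[:T]                           # indices of the T smallest distances
    return idx, S[idx]

rng = np.random.default_rng(1)
window = rng.random((19 * 19 - 1, 8))    # L = 19 window without the centre, p = 8 bands
center = window[5] + 0.001               # nearly identical to candidate 5
idx, S = nearest_by_mahalanobis(center, window, T=12)
```

In this toy example the candidate that almost coincides with the centre spectrum is ranked first, and the returned distances are sorted in ascending order.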

Complete procedure of the proposed approach
The detailed procedure of nonlocal SSPCA is given in the following.
1. Select a large 3D block around each referenced pixel of the HSI. To obtain region samples at boundary areas, pad the image in the spatial dimensions before extracting samples.
2. Use the Mahalanobis distance to calculate the distance between the referenced pixel and every other pixel in the 3D block, as in (13). In this step, spectral information is also used to calculate the distance between two pixels.
3. Select the $T$ nearest pixels in the window, according to the distance $S$, to contribute to the denoising, and calculate the similarity weights from the distance $S$ and the number of pixels $T$ using (14) and (15).
6. Reconstruct the HSI using the weighted average in (19).
Through the above procedure, we finally obtain the denoised HSI.
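Step 1 of the procedure, spatial padding followed by extraction of the 3D block around a referenced pixel, might look like the following sketch. The reflection padding mode is an assumption, since the text says only that the image is padded in the spatial dimensions, and `nonlocal_window` is a hypothetical helper name.

```python
import numpy as np

def nonlocal_window(cube, j, k, L=19):
    """Return the L x L x p block centred on pixel (j, k) of an (m, n, p) cube."""
    r = L // 2
    # Pad only the two spatial axes; the spectral axis is left untouched.
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)), mode="reflect")
    # Pixel (j, k) of the cube sits at (j + r, k + r) in the padded array.
    return padded[j:j + L, k:k + L, :]

cube = np.arange(24 * 24 * 4, dtype=float).reshape(24, 24, 4)
block = nonlocal_window(cube, 0, 0)      # a corner pixel still gets a full window
```

Because of the padding, the block has the same size for boundary and interior pixels, and the referenced pixel always sits at the centre of the block.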

Experimental design
In all experiments, several popular hyperspectral denoising techniques are compared: wavelet-based BayesShrink, the Wiener filter, and the TV method. The parameters of the reference methods are set following the suggestions of their respective authors. In the proposed methods, a 3 × 3 region is used for spatial PCA and SSPCA. In nonlocal SSPCA, the nonlocal area is set to 19 × 19, and approximately 12 pixels are selected to contribute to the denoising of each referenced pixel. Because the methods construct the matrix Y differently, the value of K also differs between them. Choosing an appropriate K, so that most of the signal is captured, is crucial in the experiments.
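The text does not say how K is chosen. One common heuristic, shown here purely as an assumption and not as the rule used in these experiments, is to take the smallest K whose leading eigenvalues account for a given fraction of the total variance. The function name and the `energy` threshold are hypothetical.

```python
import numpy as np

def choose_k(eigvals, energy=0.99):
    """Smallest K whose leading eigenvalues reach the given share of total variance."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending eigenvalues
    ratio = np.cumsum(lam) / lam.sum()                      # cumulative variance share
    k = int(np.searchsorted(ratio, energy)) + 1             # first index reaching the share
    return min(k, lam.size)                                 # guard against rounding at 1.0

k_demo = choose_k([50, 30, 15, 3, 1, 1], energy=0.9)
```

For the toy eigenvalues above, the cumulative shares are 0.5, 0.8, 0.95, and so on, so three components are needed to reach a 0.9 share.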

Test on simulated image
The data are simulated based on the Indian Pines image, which was captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over a vegetation area in northwestern Indiana, USA, with a spatial resolution of 20 m, and which consists of 145 × 145 pixels with 16 ground-truth classes and 220 spectral bands. Only the labeling information of the Indian Pines image is used in the simulation: 10 ground-truth classes with 17 spectra of 224 spectral bands randomly chosen from the USGS spectral library. The resulting image is considered the clean image. Then, we set SNR = 20.
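Adding white Gaussian noise at a prescribed SNR can be sketched as follows. The text gives the SNR as 20 without a unit or a definition, so this sketch assumes decibels and defines SNR as the ratio of mean signal power to noise power over the whole cube; both are assumptions, and `add_noise` is a hypothetical helper.

```python
import numpy as np

def add_noise(clean, snr_db, rng):
    """Add zero-mean white Gaussian noise so that the cube-wide SNR equals snr_db."""
    signal_power = np.mean(clean ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))   # SNR = 10 log10(Ps / Pn)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=clean.shape)
    return clean + noise

rng = np.random.default_rng(0)
clean = rng.random((30, 30, 20))                  # stand-in for the simulated clean image
noisy = add_noise(clean, snr_db=20.0, rng=rng)
measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

The measured SNR of the noisy cube is close to the requested value, with a small deviation due to the finite number of noise samples.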
After testing all the methods on the simulated image, we plot the PSNR and SSIM of each band as line charts in Fig. 2. For both measures, the line of nonlocal SSPCA lies above those of the other methods, indicating that nonlocal SSPCA outperforms the other methods in terms of both noise removal and detail preservation. The second-best method appears to be SSPCA, followed closely by spectral PCA. Owing to the importance of spectral information in the PCA denoising model, spatial PCA achieves the lowest PSNR and SSIM among all methods. The Wiener, wavelet, and TV methods show comparable performance, although Wiener has high SSIM values. The lines of all denoising methods lie above those of the noisy image.
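The per-band PSNR used for such a comparison can be computed as in the sketch below. The choice of peak value is an assumption: here it defaults to the maximum of each clean band, whereas the experiments may have used a fixed dynamic range. SSIM is omitted because it requires a windowed implementation.

```python
import numpy as np

def psnr_per_band(clean, denoised, peak=None):
    """clean, denoised: arrays of shape (m, n, p). Returns p PSNR values in dB."""
    mse = np.mean((clean - denoised) ** 2, axis=(0, 1))   # one MSE per band
    if peak is None:
        peak = clean.max(axis=(0, 1))                     # per-band peak (assumption)
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
clean = rng.random((16, 16, 5))
psnr_small = psnr_per_band(clean, clean + 0.1, peak=1.0)  # constant error of 0.1
psnr_large = psnr_per_band(clean, clean + 0.2, peak=1.0)  # constant error of 0.2
```

A constant error of 0.1 with a peak of 1 gives an MSE of 0.01 and therefore 20 dB in every band; doubling the error lowers the PSNR.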
Figs. 3-4 display the clean, noisy, and denoised images of three bands obtained by the different methods. The result of nonlocal SSPCA is the most similar to the true image; moreover, nonlocal SSPCA not only suppresses the noise, but also preserves edge and detail information. SSPCA also denoises well, but the boundaries in its result are blurred. TV tends to oversmooth the image during denoising. Spatial PCA fails to reduce the noise, most of which remains in its result. Fig. 5 shows the clean, noisy, and denoised spectra produced by the different methods. The similarity to the true spectrum reflects the quality of the denoising. Nonlocal SSPCA produces the spectrum most similar to that of the true image. SSPCA also denoises well. Although spectral PCA clearly suppresses the noise, its effect is not very stable across feature types. Compared with the Wiener and wavelet methods, the TV method removes noise more strongly. However, taking the visual quality of the images into account, we find that the wavelet method achieves a better balance between noise removal and detail preservation. Spatial PCA fails to reduce the noise significantly.

Test on The Real Hyperspectral Data
In this experiment, all methods are tested on the Indian Pines image. The HSI contains 220 spectral bands, each with a different noise intensity. To test our proposed method, all bands are used in this experiment.
The denoising results for band 200 are shown in Fig. 6. The proposed nonlocal SSPCA method clearly achieves better denoising results than the other methods. Because the PCA denoising model incorporates nonlocal spatial and spectral information, nonlocal SSPCA efficiently reduces most of the random noise while preserving image details, e.g., bright boundary information and point targets, very well. Although SSPCA also suppresses most of the noise in the HSI, the denoised image is slightly blurred, and local details such as edges and texture are lost in the denoising process. The results of the TV and wavelet models appear oversmoothed, and most of the detailed information is lost. The other methods tend to either keep undesirable artifacts or oversmooth the image.

CONCLUSION
In this paper, we have proposed the nonlocal SSPCA hyperspectral image denoising algorithm. PCA was used to separate the noise and the signal in contaminated hyperspectral data. The Mahalanobis distance was used to estimate the spatial-spectral similarity by calculating the distance between the spectrum of the referenced pixel and the spectra of other pixels in the nonlocal area. This nonlocal spatial-spectral similarity was used to estimate the acceptance probability and to capture the non-stationary information of the image in the denoising model. The proposed method was then tested on hyperspectral images and compared with several other classic denoising methods.
Through the analysis of both the numerical and visual results, we have demonstrated that the proposed method achieves excellent denoising results. It not only removes much of the noise, but also preserves detailed information.

Figure 1. Selecting similar pixels in a 3D block around the reference pixel.
The core of the nonlocal algorithm is to address the non-stationarity of image signals by restructuring $Y$. Boundary and texture information can be obtained by selecting pixels similar to the referenced pixel $I_{jkb}$ in a nonlocal window. Besides the nonlocal similarity in the image, many studies have shown that the HSI has strong correlations in both the spatial and spectral domains. In our proposed method, therefore, we use 3D blocks instead of 2D patches in the sparse representation (see Fig. 3). The similarity weights $w(I_{jk:}, v_{l:})$, derived from the Mahalanobis distance $S(I_{jk:}, v_{l:}) = (I_{jk:} - v_{l:})^T \Sigma^{-1} (I_{jk:} - v_{l:})$ in (13), satisfy the conditions $0 \le w(I_{jk:}, v_{l:}) \le 1$ and $\sum_{l=1}^{T} w(I_{jk:}, v_{l:}) = 1$. The parameter $h$ acts as a degree of filtering: it controls the decay of the exponential function, and therefore the decay of the weights as a function of the distances $S(I_{jk:}, v_{l:})$. $P(I_{jk:})$ is the normalizing constant that makes the weights sum to one. In (17), $\sigma_{nl_{A,C},\, nl_{B,D}}$ gives the covariance between the spatial positions $A$ and $B$ and the bands $C$ and $D$ in the 3D blocks. Therefore, nonlocal SSPCA captures nonlocal spatial correlation and spectral correlation, and uses this information in the PCA denoising model. Applying the PCA denoising process introduced in Section 2.2 to $Y$, we get
$$\hat{X}_i = (\hat{I}_{jk:}^T, v_{1:}^T, v_{2:}^T, \ldots, v_{T:}^T)^T \qquad (18)$$
where the size of $\hat{X}$ is the same as that of $Y$. The noise-free HSI is then obtained by
$$\bar{I}_{jk:} = \frac{\hat{I}_{jk:} + \sum_{l=1}^{T} w(I_{jk:}, v_{l:})\, v_{l:}}{2} \qquad (19)$$
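The weighting and the final averaging in (19) can be sketched as follows. The exact form of the weights is not recoverable from the text; this sketch assumes the exponential kernel $\exp(-S / h^2)$ familiar from nonlocal means, normalized so that the weights sum to one. The kernel form, the function name, and the toy values are all assumptions.

```python
import numpy as np

def nonlocal_average(I_hat, V, S, h):
    """I_hat: (p,) denoised centre spectrum; V: (T, p) similar spectra;
    S: (T,) Mahalanobis distances; h: decay parameter."""
    S = np.asarray(S, dtype=float)
    w = np.exp(-S / h ** 2)            # assumed exponential kernel
    w = w / w.sum()                    # normalization: weights lie in [0, 1] and sum to 1
    return (I_hat + w @ V) / 2.0, w    # equation (19): mean of centre and weighted sum

I_hat = np.array([1.0, 2.0, 3.0])
V = np.array([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
result, w = nonlocal_average(I_hat, V, S=[0.5, 0.5], h=1.0)
```

With equal distances the two weights are equal, so the weighted sum is the plain mean of the two similar spectra, and the result is the midpoint between that mean and the centre spectrum. A larger `h` flattens the weights, while a smaller `h` concentrates them on the nearest pixels.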

4. Use the selected pixels to construct the matrix $Y$ for the PCA transformation, where the columns of $Y$ correspond to the pixels and the rows represent the nonlocal spatial and spectral information of the HSI.
5. PCA transformation of $Y$:
5.1. Centralize the data $Y$.
5.2. Calculate the covariance matrix using (17).
5.3. Factorize $\Sigma_{\bar{Y}} = W \Lambda_{\bar{Y}} W^T$ and set the PCA transformation matrix $P$.
5.4. Transform the dataset into the de-correlated PCA domain, $\hat{Y} = \bar{Y} P$.
5.5. Select the leading $K$ PCs that contain most of the signal by (12).

Figure 3. Denoising results achieved by different methods on band 1 of the simulated data. The image denoised by the proposed nonlocal SSPCA method is the most similar to the true image. The other methods tend to either preserve undesirable artifacts or oversmooth and weaken the signal.

Figure 4. Denoising results achieved by different methods, on band 50 of simulated data.

Figure 2. PSNR and SSIM values of the different denoising approaches in each band of the simulated experiment. (a) PSNR. (b) SSIM. The lines of nonlocal SSPCA lie above those of the other methods, indicating that nonlocal SSPCA outperforms the other methods in terms of both noise removal and detail preservation. The second-best method appears to be SSPCA, followed closely by spectral PCA. Spatial PCA achieves the lowest PSNR and SSIM among all methods. The Wiener, wavelet, and TV methods show comparable performance, although Wiener has a high SSIM value.

Figure 5. Spectra at pixel (60,60) of the simulated data, before and after denoising by different methods.

Figure 6. Denoising results achieved by different methods on band 200 of the real data. The proposed nonlocal SSPCA method clearly achieves better denoising results than all the other methods. Because the PCA denoising model incorporates nonlocal spatial and spectral information, it efficiently reduces most of the random noise while preserving image details, e.g., bright boundary information and point targets, very well.