BINARY HYPERSPECTRAL CHANGE DETECTION BASED ON 3D CONVOLUTION DEEP LEARNING

Timely and accurate change detection of Earth’s surface features is extremely important for understanding the relationships and interactions between human and natural phenomena and for supporting better decision making. Bi-temporal hyperspectral imagery has high potential for detecting surface changes. However, extracting changes from bi-temporal hyperspectral imagery is a challenging task because of the special content of the data and environmental conditions (e.g., atmospheric conditions). To this end, this research proposes a deep-learning-based change detection framework for bi-temporal hyperspectral imagery. The proposed framework is applied in two main steps: (1) a prediction phase, in which change areas are highlighted against no-change areas using the image differencing (ID) algorithm, and (2) a decision phase, in which change pixels are detected using a 3D convolutional neural network (CNN). The performance of the proposed method is evaluated on two bi-temporal Hyperion hyperspectral datasets covering a variety of land cover classes. The results show that the proposed method has high accuracy and a low false alarm rate: the overall accuracy is more than 95%, the kappa coefficient is greater than 0.9, the miss-detection rate is lower than 10%, and the false alarm rate is lower than 4%.


INTRODUCTION
Remote sensing (RS), as one of the most important information resources, plays a key role in monitoring the Earth (Ishtiaque et al., 2020). Over the past decades, with the advancement of remote sensing sensors in both spatial and spectral resolution, there has been great interest in studying hyperspectral imagery in many fields (Liu, 2015). Hyperspectral imagery provides more detail for change detection (CD) than multispectral imagery, and hyperspectral change detection (HCD) is one of the most important applications of hyperspectral RS imagery. CD is the process of identifying differences in the state of an object or phenomenon by observing it at different times (Hussain et al., 2013). It is used in many applications, including urban monitoring, land use/cover mapping, and damage assessment (Ghosh and Chakravortty, 2020; Liu et al., 2019; López-Fandiño et al., 2019; Pati et al., 2020; Zhan et al., 2020). Recently, owing to the increasing availability of hyperspectral images, HCD has become a hot research topic and many methods have been developed. Wu et al. (2013) proposed a subspace-based HCD method that measures spectral changes; it constructs the background subspace using the pixels of the first and second dates together with additional information. Yuan et al. (2015) proposed a semi-supervised HCD method using distance metric learning, which predicts change and no-change pixels to build a matrix and then labels them with a classifier. Liu et al. (2016) proposed an unsupervised HCD method based on spectral unmixing: the stacked multitemporal dataset is first divided into many parts, spectral unmixing is applied to each part, and finally endmember grouping is applied and the multiple-change map is obtained by labeling the abundance maps. Seydi et al. (2017) proposed a match-based HCD method that combines spectral distance and similarity measures.
This method is applied in two main phases: (1) predicting the change area with the match-based method, and (2) deciding between change and no-change pixels with a threshold selection method. Li et al. (2019) proposed unsupervised deep noise modeling for HCD. Their framework is applied in three main steps: (1) a fully convolutional network (FCN) learns discriminative features from the high-dimensional data, (2) a two-stream feature fusion module fuses the feature maps, and (3) an unsupervised noise modeling module tackles the influence of noise and supports robust training of the network; the final change map is obtained after these steps. As discussed, many types of HCD methods have been proposed. They share several main challenges: (a) hyperspectral images are complex, high-dimensional data and therefore need special processing techniques; (b) noise and atmospheric conditions affect the results; (c) some of the proposed methods are complex; and (d) although hyperspectral data carry rich spectral information, many methods ignore the spatial features that could improve HCD results (Ertürk, 2019; Huang et al., 2019; Liu et al., 2019; Marinelli et al., 2019). On the other hand, as hyperspectral remote sensing sensors develop, both spectral and spatial resolution will improve, so HCD methods need to be updated constantly. To minimize these problems, this research proposes an HCD method based on a 3D convolutional neural network and image differencing (ID). This paper is organized as follows. Section 2 presents the details of the proposed HCD method. Section 3 introduces the study area and the hyperspectral image datasets. Section 4 presents the evaluation results, and Section 5 concludes the paper.

PROPOSED HYPERSPECTRAL CHANGE DETECTION FRAMEWORK
This section presents the details of the proposed HCD method, which is applied in two main steps (Fig. 1). The first step separates the change area from the no-change area with the ID algorithm. The second step extracts deep features with a 3D-CNN and classifies them for binary change detection.

ID algorithm
In the algebra-based change detection category, the ID algorithm is one of the most common change detection methods (Seydi and Hasanlou, 2017). It is widely used because of its simple interpretation and mathematics. ID is a band-to-band subtraction of the pixels of the first-date and second-date images. For two pixel vectors, the ID algorithm is given by:

$$D_{i,j}^{c} = X_{i,j}^{c}(t_2) - X_{i,j}^{c}(t_1) \qquad (1)$$

where $X_{i,j}^{c}(t_1)$ and $X_{i,j}^{c}(t_2)$ are the date 1 and date 2 pixel reflectances in row $i$, column $j$, and band $c$, and $D_{i,j}^{c}$ is the resulting difference.
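As an illustration of the band-by-band differencing described above, the following is a minimal NumPy sketch. It assumes two co-registered reflectance cubes stored as arrays of shape (rows, columns, bands) and subtracts date 1 from date 2; the function name `image_difference`, the array layout, and the sign convention are assumptions made for illustration and are not specified in the text.

```python
import numpy as np


def image_difference(date1: np.ndarray, date2: np.ndarray) -> np.ndarray:
    """Band-by-band difference of two co-registered cubes (rows, cols, bands)."""
    if date1.shape != date2.shape:
        raise ValueError("both dates must have the same shape")
    # Cast to float so integer-coded reflectance values cannot wrap around.
    return date2.astype(np.float64) - date1.astype(np.float64)


# Tiny synthetic example: 2 x 2 pixels with 3 bands (hypothetical values).
t1 = np.zeros((2, 2, 3))
t2 = np.zeros((2, 2, 3))
t2[0, 1, :] = [0.2, 0.1, -0.3]  # only pixel (0, 1) changes between dates

diff = image_difference(t1, t2)
print(diff[0, 1])  # difference vector of the changed pixel
```

Unchanged pixels give a zero vector, while changed pixels give a non-zero difference in at least some bands, which is what the later decision step works on.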

3D Convolutional Neural Network
Recently, CNNs have become one of the most popular methods in the field of image processing (Alom et al., 2019; Ball et al., 2017). They have shown promising performance in many applications such as classification (He et al., 2017), target detection (Hao et al., 2020; Vincent and Besson, 2020), damage detection (Seydi and Rastiveis, 2019), and change detection (Ghosh and Chakravortty, 2020; Huang et al., 2019; Ma et al., 2019; Wang et al., 2019; Zhan et al., 2020). A 2D-CNN focuses on extracting the spatial information of a hyperspectral image (HSI) and misses channel-related information (He et al., 2017; Li et al., 2017; Song et al., 2018). Hyperspectral image data, however, are volumetric and have rich spectral content, so a 2D-CNN fails to fully exploit the deep spectral-spatial features. To address this problem, this research uses a 3D-CNN built on 3D convolution layers, which can extract spectral and spatial features simultaneously. The proposed architecture includes: (1) convolution layer, (2) max-pooling layer, and (3) fully connected layer. The main purpose of the convolution layer is to extract different features/patterns from the input data (Alom et al., 2019; Huang et al., 2019). This is done by convolving 3D kernels with the 3D patches, so that features are extracted by a 3D kernel over multiple contiguous bands in the input layer. The value at position $(x, y, z)$ of the $j$th feature cube in the $i$th layer is given by (Qi et al., 2019; Roy et al., 2019):

$$v_{i,j}^{x,y,z} = g\left(b_{i,j} + \sum_{m} \sum_{r=0}^{R-1} \sum_{s=0}^{S-1} \sum_{t=0}^{T-1} W_{i,j,m}^{r,s,t}\, v_{i-1,m}^{x+r,\,y+s,\,z+t}\right) \qquad (2)$$

where $b_{i,j}$ denotes the bias, $g$ the activation function, $m$ indexes the feature cubes in the $(i-1)$th layer connected to the current feature cube, $W_{i,j,m}^{r,s,t}$ is the $(r, s, t)$th value of the kernel connected to the $m$th feature cube in the preceding layer, and $R$, $S$, and $T$ are the length, width, and depth of the convolution kernel. The activation function of the convolution layer is the rectified linear unit (ReLU).
The ReLU activation function can be formulated as:

$$g(x) = \max(0, x)$$

All weights are initialized with the Glorot normal initializer (Glorot and Bengio, 2010) and trained by back-propagation with the Adam optimizer, using the softmax loss.
The network is trained with mini-batches of size 150 for 150 epochs. Fig. 2 presents the proposed architecture for hyperspectral binary change detection. This architecture has six convolution layers and two fully connected layers, with no max-pooling layer.
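To make the 3D convolution concrete, the sketch below evaluates a single layer directly as nested sums over the input feature cubes and kernel offsets, followed by ReLU. It is a naive NumPy illustration and not the network used in this paper: the function names, the (cube, row, column, band) array layout, the 3×3×7 kernel size, the four output cubes, and the 20-band input are all assumptions; only the 11×11 spatial patch size is taken from the paper.

```python
import numpy as np


def relu(x: np.ndarray) -> np.ndarray:
    """Rectified linear unit: max(0, x), element-wise."""
    return np.maximum(x, 0.0)


def conv3d_forward(cubes: np.ndarray, kernels: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Unpadded, stride-1 3D convolution layer followed by ReLU.

    cubes:   (M, X, Y, Z)     M input feature cubes from the previous layer
    kernels: (J, M, R, S, T)  one R x S x T kernel per (output j, input m) pair
    bias:    (J,)             one bias per output feature cube
    returns: (J, X-R+1, Y-S+1, Z-T+1)
    """
    M, X, Y, Z = cubes.shape
    J, M_k, R, S, T = kernels.shape
    if M != M_k:
        raise ValueError("kernels and cubes disagree on the number of input cubes")
    out = np.empty((J, X - R + 1, Y - S + 1, Z - T + 1))
    for j in range(J):
        for x in range(out.shape[1]):
            for y in range(out.shape[2]):
                for z in range(out.shape[3]):
                    # Window covering offsets r, s, t in all M input cubes.
                    window = cubes[:, x:x + R, y:y + S, z:z + T]
                    # Sum over m, r, s, t plus the bias of output cube j.
                    out[j, x, y, z] = bias[j] + np.sum(kernels[j] * window)
    return relu(out)


rng = np.random.default_rng(0)
patch = rng.normal(size=(1, 11, 11, 20))  # one 11 x 11 patch, 20 bands (assumed)
w = rng.normal(size=(4, 1, 3, 3, 7))      # 4 output cubes, 3 x 3 x 7 kernels (assumed)
b = np.zeros(4)

features = conv3d_forward(patch, w, b)
print(features.shape)
```

With no padding, each axis shrinks by the kernel size minus one, so an 11×11×20 input and a 3×3×7 kernel give 9×9×14 output cubes. A deep learning framework would compute the same quantity far more efficiently; the explicit loops are kept only to mirror the summation.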

CASE STUDY AND DATASET
In this study, two hyperspectral image datasets are used for analyzing changes (Fig. 3). The first dataset covers farmland near the city of Yuncheng, Jiangsu province, China, and was acquired on May 3, 2006, and April 23, 2007. This scene is mainly a combination of soil, river, trees, buildings, roads, and agricultural fields. The second dataset covers an irrigated agricultural field in Hermiston, Umatilla County, Oregon, USA, and was acquired on May 1, 2004, and May 8, 2007. Both datasets were captured by the Hyperion sensor, which has 242 spectral bands with wavelengths between 0.4 and 2.5 micrometers, a spatial resolution of 30 m, and a swath width of 7.5 km. Hyperion data are acquired as two separate spectral ranges using push-broom technology: a VNIR range with 70 bands between 356 and 1058 nm, and a SWIR range with 172 bands between 852 and 2577 nm.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition)

Pre-processing
Hyperspectral imagery requires some pre-processing, which is divided into two main parts: (1) spectral correction and (2) geometric correction. Pre-processing begins with spectral correction, followed by geometric correction. Spectral correction for the Hyperion L1R data includes discarding no-data bands, de-striping, de-noising, smile correction, radiometric correction, and atmospheric correction. Finally, 154 spectral bands are used for hyperspectral change detection.

Accuracy Assessment
Accuracy assessment is the final step in any RS analysis. This research evaluates the performance of HCD through numerical and visual analysis. For the numerical analysis, the most common accuracy assessment indices are used to evaluate the proposed HCD method (Table 2).

Table 2. Different types of accuracy assessment methods in HCD (columns: Criteria, Formula).

Fig. 4. The results of HCD methods for the USA dataset: (a) MAD-SVM, (b) IR-MAD-SVM, (c) proposed method, and (d) ground truth.
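As a sketch of how such indices can be computed for a binary change map, the code below uses the standard confusion-matrix definitions of overall accuracy, Cohen's kappa, miss-detection rate, and false alarm rate. These definitions, the function name `change_detection_scores`, and the toy labels are assumptions; the formulas used in this paper may differ in detail.

```python
import numpy as np


def change_detection_scores(truth, pred) -> dict:
    """Accuracy indices for a binary change map (1 = change, 0 = no change).

    Assumes both classes occur in `truth` and that chance agreement is below 1,
    so none of the denominators is zero.
    """
    truth = np.asarray(truth, dtype=bool).ravel()
    pred = np.asarray(pred, dtype=bool).ravel()
    tp = int(np.sum(truth & pred))    # change detected as change
    tn = int(np.sum(~truth & ~pred))  # no-change detected as no-change
    fp = int(np.sum(~truth & pred))   # false alarms
    fn = int(np.sum(truth & ~pred))   # missed changes
    n = tp + tn + fp + fn
    oa = (tp + tn) / n
    # Expected agreement by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "overall_accuracy": oa,
        "kappa": (oa - pe) / (1 - pe),
        "miss_detection": fn / (tp + fn),
        "false_alarm": fp / (fp + tn),
    }


# Toy example with 10 pixels (hypothetical labels).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = change_detection_scores(truth, pred)
print(scores)
```

In this toy case there is one missed change out of four change pixels and one false alarm out of six no-change pixels, which gives an overall accuracy of 0.8.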

Results
The results of the proposed HCD method are compared with other common HCD methods: multivariate alteration detection (MAD) and iteratively reweighted multivariate alteration detection (IR-MAD). MAD and IR-MAD require threshold selection, which is performed here by a support vector machine (SVM). The radial basis function (RBF) kernel is used as the kernel of the SVM classifier. The training data for each dataset consist of 5000 pixels. The input patch size for both datasets is 11×11. To tune the SVM parameters (i.e., gamma (γ) and the penalty coefficient (C)), we performed cross-validation (CV) with a grid search (GS).
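A patch-based input of this kind can be prepared as in the following sketch, which cuts an 11×11 window with all bands around each selected pixel. The function name `extract_patches`, the (rows, columns, bands) layout, and the reflect padding at the image border are assumptions; the text does not state how border pixels are handled.

```python
import numpy as np


def extract_patches(cube: np.ndarray, pixels, size: int = 11) -> np.ndarray:
    """Cut size x size spatial patches (all bands) centred on (row, col) pixels."""
    if size % 2 == 0:
        raise ValueError("size must be odd so that each patch has a centre pixel")
    half = size // 2
    # Reflect-pad the two spatial axes so border pixels also get full patches.
    padded = np.pad(cube, ((half, half), (half, half), (0, 0)), mode="reflect")
    # After padding, original pixel (r, c) sits at (r + half, c + half),
    # so the window centred on it starts at (r, c) in the padded cube.
    return np.stack([padded[r:r + size, c:c + size, :] for r, c in pixels])


# Synthetic 20 x 20 cube with 5 bands (hypothetical data).
cube = np.arange(20 * 20 * 5, dtype=float).reshape(20, 20, 5)
patches = extract_patches(cube, [(0, 0), (10, 12), (19, 19)])
print(patches.shape)  # (number of pixels, 11, 11, bands)
```

The centre of each patch, at index (5, 5), holds the spectrum of the pixel it was cut around, so a label attached to that pixel can be used as the label of the whole patch.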

The China Dataset
The HCD results are presented in Fig. 4. All of the methods detect the main changes; the main difference between the proposed method and the other HCD methods relates to subtle changes. The proposed method gives good results compared with the other methods. It also has a low false alarm rate, while the other methods, MAD in particular, produce more false alarms.
The numerical results of HCD on the China dataset are presented in Table 3. According to this table, the proposed method has the highest accuracy in terms of OA and Kappa. MAD has the lowest miss-detection rate, but it has a high false alarm rate.

The USA Dataset
The results of HCD on the USA dataset are presented in Fig. 5. On this dataset, all methods provide acceptable results, and all of the main changes are detected by the three HCD methods. Among the three, MAD gives weaker results than the others because many pixels are wrongly detected as change pixels (e.g., the river). The best performance is clearly provided by the proposed HCD method. The numerical analysis of the HCD methods is presented in Table 4. According to this table, all methods achieve an OA of more than 86%, whereas the Kappa index varies widely.
The proposed method has the highest accuracy in terms of the OA and Kappa indices, and it also has the lowest error in CD. The false alarm rates of IR-MAD-SVM and the proposed method are very close, both under 5%, while the miss-detection rate of IR-MAD-SVM is more than 18%. Based on the visual and numerical analysis, the proposed method is more accurate than the other methods. This performance stems from exploiting the content of hyperspectral imagery for change detection: the proposed method uses 3D convolution, which captures the relationships among spectral bands.

CONCLUSION
This research considered the potential of hyperspectral imagery for change detection. We evaluated the performance of the proposed method on two real hyperspectral datasets. The results show that hyperspectral imagery has high potential for HCD but needs special techniques for extracting change information. The results of the proposed method were also compared with other common HCD methods, and the proposed method performs well in comparison. In addition, the performance of the MAD and IR-MAD algorithms decreases as the complexity of the data increases, while the proposed method is robust to a high diversity of land cover changes.