SHIP DETECTION BASED ON MULTIPLE FEATURES IN RANDOM FOREST MODEL FOR HYPERSPECTRAL IMAGES

A novel ship detection method that makes full use of both the spatial and spectral information in hyperspectral images is proposed. Firstly, a band with a high signal-to-noise ratio in the near infrared or short-wave infrared range is used to segment land and sea with the Otsu threshold segmentation method. Secondly, multiple features, including spectral and texture features, are extracted from the hyperspectral images: principal component analysis (PCA) is used to extract spectral features, and the Grey Level Co-occurrence Matrix (GLCM) is used to extract texture features. Finally, a Random Forest (RF) model is introduced to detect ships based on the extracted features. To illustrate the effectiveness of the method, we carry out experiments on EO-1 data, comparing single features with different combinations of multiple features. Compared with the traditional single-feature method and the Support Vector Machine (SVM) model, the proposed method can stably detect ships against a complex background and can effectively improve detection accuracy.


INTRODUCTION
Reliable ship detection plays an important role in the national economy and security, and ship surveillance is one of the main tasks in maritime management. Traditional ship detection requires manual observation of all kinds of ships, which consumes considerable manpower and material resources (Christina et al., 2010). With the development of remote sensing, satellite and artificial intelligence technologies, fast and accurate ship detection from remote sensing images has attracted widespread attention. Hyperspectral data, a new type of remote sensing data, has become a recent research focus, and ship detection has received increasing interest in hyperspectral image processing (Geng et al., 2014).
Hyperspectral images can provide hundreds or even thousands of bands. Although the abundant bands are valuable in applications, the high correlation among bands makes the data highly redundant, which leads to the curse of dimensionality (Hughes phenomenon) (Hughes, 1968). Feature extraction is considered an effective way to solve this problem.
Much research on ship detection takes the straightforward approach of considering only the values acquired in the spectral bands (Keskin et al., 2016), (Wang et al., 2016) and (Dai et al., 2015). However, hyperspectral images acquire spectral and spatial information simultaneously (Chein, 2003), and many techniques that aim to make full use of both have been investigated. For ship detection in hyperspectral images, we likewise adopt an approach that exploits spatial-spectral features, i.e. relations within the image neighbourhood are considered.
Ship detection can be defined as a classification problem in which all pixels are divided into ship and background (sea and other objects). Traditional methods perform well on low-dimensional data sets, but their performance decreases on high-dimensional data sets. The decision to use the Random Forest (RF) classifier is based on the good results it has achieved in the classification of airborne hyperspectral data (Keskin et al., 2016), (Schilling et al., 2015) and (Chan et al., 2008). RF is an ensemble classifier with high prediction accuracy that is efficient on large data sets (Breiman, 2001).
In this context, an improved method based on multiple features in an RF model for hyperspectral images is proposed to detect ship targets stably and accurately. This paper is structured as follows: Section 2 describes the relevant theory and the proposed method. Section 3 presents the experimental results and analysis. The conclusion and suggestions for future work are provided in Section 4.

METHODOLOGY
The proposed method consists of three main stages: sea-land segmentation, multiple feature extraction and ship detection based on the RF model. Figure 1 shows the ship detection algorithm. Firstly, land areas are removed by threshold segmentation to obtain the sea area. Secondly, multiple features, including spectral and texture features, are extracted. Thirdly, the RF model is introduced, and the extracted features are concatenated to form a joint feature vector that is used for ship detection. The whole process is described in detail in the following parts.
Figure 1. Flowchart of the proposed algorithm

Sea-Land Segmentation
The images used to detect ships usually contain complicated coastal land, which increases the testing time and reduces the detection accuracy. The sea area should therefore be isolated from the land. In the near infrared and short-wave infrared bands, water absorbs almost all incident energy and has a very low reflectivity approaching zero, whereas land has a higher reflectivity. Therefore, a band with a high signal-to-noise ratio is selected in the near infrared or short-wave infrared range and is used to segment land and sea with a greyscale threshold.
The Otsu threshold method and morphological processing are used to achieve sea-land segmentation. The Otsu threshold method (Otsu, 1979) is a commonly used segmentation algorithm: a classical adaptive threshold method with low computational cost and stable performance. The image pixels are divided into target and background by grey level, and the grey level that maximizes the variance between the two classes is taken as the threshold. The threshold selection function is:

$$\sigma^2(T_j) = \omega_0 (\mu_0 - \mu)^2 + \omega_1 (\mu_1 - \mu)^2$$

where $\sigma^2(T_j)$ is the variance between the two classes separated by the threshold $T_j$, $\omega_0$ and $\omega_1$ are the probabilities that a pixel belongs to the target or the background, and $\mu_0$, $\mu_1$ and $\mu$ are the mean grey levels of the two classes and of the whole image. To obtain the best segmentation, the optimum threshold is the one that maximizes this function.
After threshold segmentation, an opening operation is applied to remove the "glitches"; it can eliminate isolated features, fill holes and smooth the edges of large areas.
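The threshold search described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `otsu_threshold`, the 256-level assumption and the toy band values are all hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level T that maximises the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(g * h for g, h in enumerate(hist))
    mu = total_sum / total                      # mean grey level of the image

    best_t, best_var = 0, -1.0
    count0, sum0 = 0, 0                         # class 0: grey levels <= t
    for t in range(levels - 1):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue                            # one class is empty, skip
        w0, w1 = count0 / total, count1 / total
        mu0 = sum0 / count0
        mu1 = (total_sum - sum0) / count1
        var = w0 * (mu0 - mu) ** 2 + w1 * (mu1 - mu) ** 2
        if var > best_var:                      # keep the first maximiser
            best_t, best_var = t, var
    return best_t


# Hypothetical near-infrared band: dark water pixels, brighter land pixels.
band = [3, 5, 4, 6, 2, 5, 120, 130, 125, 140, 135, 128]
t = otsu_threshold(band)
sea_mask = [p <= t for p in band]               # True where the pixel is water
```

Pixels at or below the returned threshold are treated as sea here; which side of the threshold counts as "target" is a convention chosen for this sketch.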

Spatial and Spectral Features Extraction
To improve ship detection performance, feature extraction methods are applied to the hyperspectral images to obtain spectral and texture features.

Texture Features:
The extracted texture features are based on the Grey Level Co-occurrence Matrix (GLCM). Texture is a property that can be defined as the regular repetition of an element or pattern on a surface. Statistical approaches are the most common methods of texture extraction in the analysis of satellite images (Mokhtarzade, 2008). GLCM is an effective method for texture feature extraction because it considers not only the distribution of grey levels but also the positions of pixels relative to each other (Changren et al., 2010).
The GLCM can be generated by calculating the joint conditional probability density $P(i, j \mid d, \theta)$, where $i$ and $j$ are the grey levels of two pixels separated by distance $d$ in direction $\theta$.
In this way, the pixel information is captured in the matrix. Not all the statistics that can be derived from the matrix are suitable for the analysis of hyperspectral images. In this paper, eight features derived from the GLCM are selected: mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment and correlation. The matrix is computed from the first principal component after dimensionality reduction.
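As a rough illustration of how a co-occurrence matrix and some of the listed statistics can be computed, the sketch below builds a normalised, symmetric GLCM for one offset and derives five of the eight features. The paper does not print its feature formulas, so the commonly used Haralick-style definitions are assumed here. The function names and the toy image are hypothetical, and the matrix is computed over the whole image rather than in a sliding window.

```python
import math


def glcm(image, dx, dy, levels):
    """Normalised, symmetric co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1               # count each pair in both orders
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """A few GLCM statistics, using assumed Haralick-style definitions."""
    n = range(len(p))
    cells = [(i, j, p[i][j]) for i in n for j in n]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "second_moment": sum(v * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for i, j, v in cells if v > 0),
    }


# Toy 4-level image standing in for a quantised first principal component.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = glcm(image, dx=1, dy=0, levels=4)           # distance 1, direction 0 degrees
features = texture_features(p)
```

Other directions follow from the offset: for example `dx=1, dy=-1` corresponds to 45 degrees when rows are numbered from the top.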

Ship Detection Based on RF Model
The RF model is an ensemble classifier composed of many weakly correlated classifiers, namely a large number of classification and regression trees. Besides requiring no data pre-processing, the RF model is relatively simple to configure: only the number of decision trees and the dimension of the feature subset used by each tree need to be defined.
In each new training set, about one-third of the samples are left out; these are called out-of-bag (OOB) data. OOB estimates can therefore be used as an ingredient in estimates of the generalization error. The study of error estimates for bagged classifiers (Breiman, 1996b) gives empirical evidence that the OOB estimate is as accurate as using a testing set of the same size as the training set. OOB data can thus be used to assess the performance of the RF model.
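The "about one-third" figure follows from sampling with replacement: the probability that a given sample is never drawn in N draws is (1 - 1/N)^N, which approaches 1/e, roughly 0.37, for large N. The short simulation below checks this numerically; the function name `oob_fraction`, the sample count and the seed are arbitrary choices for illustration.

```python
import random


def oob_fraction(n_samples, rng):
    """Draw one bootstrap sample and return the share of samples left out."""
    drawn = {rng.randrange(n_samples) for _ in range(n_samples)}
    return 1 - len(drawn) / n_samples


rng = random.Random(0)                  # fixed seed so the run is repeatable
fraction = oob_fraction(10_000, rng)    # close to 1/e for large sample counts
```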
An important characteristic of the RF model is that it evaluates the importance of features during model training. This paper proposes setting a threshold to remove features of low importance. The final selected feature set then yields a better result.
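The thresholding step can be sketched as a simple filter over importance scores. The feature names, the scores and the threshold value below are all hypothetical; the paper does not state the threshold it uses.

```python
def select_features(importances, threshold):
    """Keep the names of features whose importance reaches the threshold."""
    return [name for name, score in importances.items() if score >= threshold]


# Hypothetical importance coefficients for a handful of features.
importances = {
    "PC1": 0.30,
    "PC2": 0.22,
    "PC26": 0.01,
    "contrast_45": 0.18,
    "entropy_0": 0.27,
    "mean_90": 0.02,
}
selected = select_features(importances, threshold=0.05)
```

The model would then be retrained on the `selected` features only.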
Finally, in the post-processing step, morphological filters are used to make the results clearer and to remove noise.

EXPERIMENTAL RESULT AND ANALYSIS
In the experiment, EO-1 data covering the port area of Bombay, India, is used. The data consists of 217×229 pixels and 242 spectral reflectance bands in the wavelength range from 355.59 to 2577.08 nm (Figure 3).

Features Extraction
In the PCA transformation, the information content of each principal component corresponds to its eigenvalue.
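Given the eigenvalues, the number of components to keep can be chosen from their cumulative share of the total. The sketch below assumes that "information" means the eigenvalue divided by the sum of all eigenvalues; the function name and the eigenvalues are hypothetical and are not the values from the experiment.

```python
def components_for_share(eigenvalues, share=0.99):
    """Smallest number of leading components whose eigenvalues reach `share`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= share:
            return k
    return len(ordered)


# Hypothetical eigenvalues, for illustration only.
eigenvalues = [60.0, 25.0, 10.0, 4.5, 0.3, 0.2]
k = components_for_share(eigenvalues, share=0.99)
```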

Parameters of RF Model
To improve the detection accuracy, this paper examines different combinations of the number of decision trees ntrees and the feature subset dimension m, and selects the best ntrees and m by comparing the OOB error rate. Figure 6 shows that, when all features are used, the OOB accuracy is best and the OOB error rate stabilizes when m is 7 and ntrees is 590. In the same way, different model parameters are selected for each feature set.
Figure 6. The relationship between OOB and parameters
RF can obtain the importance coefficient of each feature. According to the importance coefficients, unimportant features are removed and the model is retrained to obtain the final feature set used for ship detection. Figure 7 shows the importance coefficients of the 26 spectral features and 32 texture features.
Figure 7. The importance coefficient of features

Ship Detection Results and Precision Evaluation
No ground truth is available for accuracy assessment, so this stage is carried out by visual comparison with a manually produced image (Figure 8(a)). The results show that the proposed method yields a high number of detected targets and low numbers of undetected targets and false alarms. Multiple features achieve better detection results than a single feature, and RF produces fewer false alarms than the Support Vector Machine (SVM).
A ship is a typical three-dimensional target, so texture features are especially important in detection. Combining spectral and spatial features exploits the complementary advantages of multiple features: spectral feature detection corrects the false alarms of texture feature detection, and texture feature detection recovers the targets missed by spectral feature detection. As a result, the target shapes are more complete and the detection performance is better.

CONCLUSION
A ship detection method based on multiple features in an RF model for hyperspectral images is proposed. The best results are obtained by making full use of both the spatial and spectral information in hyperspectral images. The results show that using multiple features significantly improves accuracy over using a single feature, and that the spectral-spatial method achieves better accuracy on an unbalanced training set. The proposed method detects all ships in the testing image with relatively few false alarms. Compared with a single spectral feature and a single texture feature, the detection rate increased by 27% and the false alarm rate was reduced by 66%. The proposed method can stably detect ships at sea or near the shore and can effectively improve detection accuracy.
Ship identification using features from multi-source data will be investigated in future work.
2.2.1 Spectral Features: Principal component analysis (PCA) is used to extract spectral features. The original hyperspectral image consists of bands with consecutive spectral responses, which are redundant and correlated. PCA replaces the original bands with a new set of uncorrelated bands, which benefits dimensionality reduction and the extraction of spectral features (Eklundh et al., 1993).
Figure 2 shows the model generation steps:
(1) Each new training set is obtained by sampling randomly with replacement from the original training set; bagging is used in tandem with random feature subset selection. A tree is then grown on the new training set using random feature selection. The trees are not pruned.
(2) The first step is repeated as many times as there are trees, so that a forest is grown.
(3) The testing set is predicted by every tree, and the final result is a weighted sum of the predictions of all trees.

Figure 2. Schematic diagram of RF model
Figure 3(a) shows the colour composite of the image. The 51st band, whose wavelength of 864 nm lies in the near infrared, is extracted to complete sea-land segmentation (Figure 3(b)).
Figure 3. (a) Three-band colour composite of the image. (b) Grey-scale image of the 51st band.

Figure 4(a) shows the result of Otsu segmentation. The sizes of the erosion and dilation structuring elements are 3×3 and 5×5. The extracted sea area is shown in Figure 4(b).
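The erosion and dilation applied to the segmentation mask can be sketched as follows. This is a plain-Python illustration with square structuring elements, under the assumption that pixels outside the image count as background. The function names and the toy mask are hypothetical, and the example call uses equal sizes (a standard opening) rather than the sizes reported above.

```python
def morph(mask, size, combine):
    """Apply `combine` (all = erosion, any = dilation) over a size x size window.

    Pixels outside the image are treated as background (0).
    """
    r = size // 2
    rows, cols = len(mask), len(mask[0])
    return [[int(combine(
                 0 <= y + dy < rows and 0 <= x + dx < cols
                 and mask[y + dy][x + dx] == 1
                 for dy in range(-r, r + 1) for dx in range(-r, r + 1)))
             for x in range(cols)] for y in range(rows)]


def clean_mask(mask, erode_size, dilate_size):
    """Erode, then dilate, to drop small isolated regions from a binary mask."""
    eroded = morph(mask, erode_size, all)
    return morph(eroded, dilate_size, any)


# Toy 8x8 mask: a 5x5 block plus one isolated pixel in the corner.
mask = [[0] * 8 for _ in range(8)]
for y in range(1, 6):
    for x in range(1, 6):
        mask[y][x] = 1
mask[7][7] = 1                          # isolated "glitch"

cleaned = clean_mask(mask, erode_size=3, dilate_size=3)
```

With a larger dilation element than erosion element, as reported for the experiment, the surviving regions would grow slightly beyond their original outline.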
Figure 5(a ~ h) shows the 8 texture feature vectors in the 45° direction. (a) Mean reflects the regularity of the image texture. (b) Variance measures the deviation between the grey value of each pixel and the mean value. (c) Homogeneity measures whether the local change of the image texture is uniform. (d) Contrast reflects the local change information of the image. (e) Dissimilarity is similar to contrast, but increases linearly. (f) Entropy measures the amount of information contained in the image. (g) Second moment reflects the distribution of the image grey levels. (h) Correlation reflects the similarity between adjacent pixels in the horizontal and vertical directions.
The scene contains 11 ships. The evaluation indexes are the numbers of detected ships, undetected ships and false alarms. The results are shown in Table 2, and Figure 8(b ~ e) shows the outputs of the different feature sets, the different models and our proposed method. In Figure 8, green circles mark the false alarms that the texture features model wrongly detected as ships.
Figure 8. (a) Manually produced reference image. (b) Ship detection result of the spectral features model. (c) Ship detection result of the texture features model. (d) Ship detection result of multiple features in RF model. (e) Ship detection result of multiple features in SVM model.

Table 1 lists the eigenvalue and accumulated information of each PC. The first 26 PCs contain 99% of the information, and the PCs after the 26th have small eigenvalues and are seriously disturbed by noise. Therefore, the first 26 PCs are taken as spectral features.
Table 1. The components information
A total of 32 texture feature vectors are computed from the first PC of the image using a 3×3 window in the directions 0°, 45°, 90° and 135°.