A SPECTRALLY IMPROVED POINT CLOUD CLASSIFICATION METHOD FOR MULTISPECTRAL LIDAR

Precise point cloud classification can enhance lidar performance in various applications, such as land cover mapping, forestry management and autonomous driving. Multispectral lidar improves classification performance by providing rich spectral information. However, the use of spectral information for classification is still underdeveloped. We therefore propose a spectrally improved classification method for multispectral lidar. The spectral improvement has two aspects: (1) we improve the eigenentropy-based neighbourhood selection with spectral angle match (SAM) to form a more reliable neighbourhood; (2) we use both geometric and spectral features and compare the contributions of these features. A three-wavelength multispectral lidar and a complex indoor experimental scene were used for demonstration. The results indicate the effectiveness of the proposed spectrally improved method and the promising potential of spectral information for lidar classification.


INTRODUCTION
Since the invention of lidar, lidar point cloud classification has attracted considerable attention in the field of remote sensing (Vosselman and Maas 2010). Precise point cloud classification can enhance lidar performance in various applications, such as land cover mapping, forestry management and autonomous driving.
Many studies have addressed the classification of traditional single-wavelength lidar data, but they are limited by the lack of spectral information. Spectral information from passive technologies could remedy this limitation, but the data fusion needs to deal with varying illumination conditions (Malik et al. 2007) and with the registration problem (Zhang et al. 2015). Multispectral lidar obtains spectral and spatial information simultaneously (Hakala et al. 2012;Wei et al. 2012;Woodhouse et al. 2011). After years of development, multispectral lidar has gradually become effective and practical (Chen et al. 2019c;Ren et al. 2018).
Many researchers have explored the advantages of spectral information for classification. At the most basic level, the raw spectral intensities, with or without normalization, were directly employed for classification (Kaasalainen and Malkamäki 2020). For target recognition, the raw spectral intensities could also be compared with library spectra (Myntti 2015). Then, similar to the development of passive multispectral technology, researchers designed active spectral (vegetation) indices for spectral feature extraction (Morsy et al. 2016;S. Kaasalainen 2017). Considering the differing influences of different incidence angles, the spectral indices (SI) may not perfectly remove the incidence angle effect (Kaasalainen et al. 2018). Nevertheless, SI still yields better classification performance than raw spectral intensities because it alleviates the radiometric uncertainty caused by the instrument and the incidence angle. Besides the spectral index, more spectral features have been proposed for classification, e.g., spectral statistical parameters (Vauhkonen et al. 2013) and color features (Chen et al. 2019a).
In addition, spatial information has been combined with spectral information to enhance the classification. Focusing on sparse signals, a hierarchical Bayesian model has been used for distance estimation and classification on a single-photon system. Hierarchical methods that process spectral and spatial information in several phases are also popular (Chen et al. 2017;Chen et al. 2019b;Suomalainen et al. 2011). In former works, the spectral information was not fully utilized for classification of the multispectral point cloud. Methods that explore the potential of spectral information still need to be developed. Therefore, we propose a spectrally improved classification method for multispectral lidar. Our method is built on the fundamental framework for pointwise point cloud classification proposed by Weinmann et al. (2015). This framework achieved good classification results and has been used by many researchers. However, it was designed for single-wavelength lidar, and spectral information was not considered in its neighbourhood selection and feature extraction. To exploit the spectral information from multispectral lidar, we improved the framework in two aspects: (1) we improved the eigenentropy-based neighbourhood selection with spectral angle match (SAM) to form a more reliable neighbourhood; (2) we used both geometric and spectral features and compared the contributions of these features. A three-wavelength multispectral lidar and a complex indoor experimental scene were employed for demonstration.

METHOD
In this research, the multispectral lidar point cloud classification consists of three phases: (1) neighbourhood selection, (2) feature extraction and classification, and (3) refinement of the classification result. In the first phase, the spectral information is used to construct a more precise neighbourhood. In the second phase, seven spectral and eleven geometric features are constructed, and the most important features are selected for classification by random forest. In the third phase, the spectral and spatial information are employed by a conditional random field (CRF) for label smoothing.

Spectrally Improved Neighborhood
For neighborhood selection, we used spectral similarity to construct a novel spectrally improved eigenentropy-based neighborhood selection. We first conducted the popular eigenentropy-based neighborhood selection, which gives every point its own optimized number k of nearest neighbors. Once k was determined, we expanded the neighborhood to the 1.5k nearest points as an alternative neighborhood. Next, the spectral similarity between each point of the alternative neighborhood and the center point was measured with SAM. SAM was chosen because it is a common spectral similarity measure that is much less sensitive to brightness. Finally, the k most spectrally similar points in the alternative neighborhood formed the spectrally improved eigenentropy-based neighborhood.
This spectrally improved modification of the neighborhood selection is based on the hypothesis that spectrally similar points tend to come from the same material and to have similar spectral and geometric attributes. Compared with the traditional eigenentropy-based neighborhood, the spectrally improved neighborhood may yield more precise geometric and spectral features and thus a higher classification accuracy.
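The selection steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the brute-force distance search, the exclusion of the center point from its own neighborhood, and the rounding of 1.5k up to an integer are all assumptions, and k is taken as already chosen by the eigenentropy criterion. Spectra are assumed to be non-zero so that the angle is defined.

```python
import numpy as np

def spectral_angle(a, b):
    """Spectral angle (radians) between two spectra.

    Scaling either spectrum by a positive constant leaves the angle
    unchanged, which is why the measure is insensitive to brightness.
    """
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def spectrally_improved_neighborhood(xyz, spectra, center, k, factor=1.5):
    """Indices of the k candidates most spectrally similar to the center.

    xyz:     (N, 3) point coordinates
    spectra: (N, C) calibrated intensities, one column per wavelength
    center:  index of the center point
    k:       neighborhood size from the eigenentropy criterion (assumed given)
    factor:  expansion factor of the alternative neighborhood (1.5 in the text)
    """
    n_candidates = min(int(np.ceil(factor * k)), len(xyz) - 1)
    # Brute-force spatial search (assumption; a k-d tree would scale better).
    dist = np.linalg.norm(xyz - xyz[center], axis=1)
    dist[center] = np.inf  # assumption: the center is not its own neighbor
    candidates = np.argsort(dist)[:n_candidates]
    # Rank the spatial candidates by spectral angle to the center point.
    angles = np.array(
        [spectral_angle(spectra[center], spectra[j]) for j in candidates]
    )
    order = np.argsort(angles, kind="stable")
    return candidates[order[:k]]
```

With k = 4, the six nearest points are ranked by spectral angle and the two least similar ones are dropped, which is the mechanism that removes spatially close points belonging to a different material.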

Feature Extraction and Selection
In former lidar classification research, geometric features were the main source of information, because the radiometric calibration of intensity was not well developed. With improved intensity data quality and radiometric calibration, intensity features can also be used for classification, and many studies have demonstrated that they improve the classification result (Yan and Shaker 2014).
In this research, 7 kinds of spectral features and 11 geometric features are designed, and feature selection is then conducted to identify the most effective features for multispectral lidar classification.

Spectral Feature:
We constructed 7 kinds of spectral features in total. First, we employed four statistical moments: the first raw moment (mean), the positive square root of the second central moment (standard deviation), the third central moment (skewness), and the fourth central moment (kurtosis). These four features describe the statistical spectral attributes of the neighborhood points. Based on the assumption that points from the same class share similar spectra, the mean is expected to be an effective feature. The formulas of these features are:

$$\mu_c = \frac{1}{N}\sum_{i \in \mathcal{N}} I_{i,c}$$

$$\sigma_c = \sqrt{\frac{1}{N}\sum_{i \in \mathcal{N}} \left(I_{i,c} - \mu_c\right)^2}$$

$$\mathrm{Skew}_c = \frac{1}{N}\sum_{i \in \mathcal{N}} \left(I_{i,c} - \mu_c\right)^3$$

$$\mathrm{Kurt}_c = \frac{1}{N}\sum_{i \in \mathcal{N}} \left(I_{i,c} - \mu_c\right)^4$$

where the subscript $c \in C$ indicates the spectral channel, $\mu_c$ is the mean of the intensities in spectral channel $c$, $N$ indicates the number of neighborhood points, and $I_{i,c}$ is the intensity of point $i \in \mathcal{N}$ in spectral channel $c$. In addition, the coefficient of variation (CV) and the ratio of one channel to all channels (ratio) are also considered.
Besides, we borrowed ideas from passive multispectral technology to construct the SI. The SI used in this research is modelled on the normalized difference vegetation index (NDVI), because NDVI is one of the most popular and effective indices.

$$SI_{m,n} = \frac{\mu_m - \mu_n}{\mu_m + \mu_n}$$

where $SI_{m,n}$ is the spectral index constructed from the mean intensities $\mu_m$ and $\mu_n$ of channels m and n. Based on this formula, we built 2 spectral indices.
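To make the spectral features concrete, the following Python sketch computes them for one neighborhood. It is an illustration under stated assumptions rather than the authors' code: moments use the population (1/N) normalization, skewness and kurtosis are left unstandardized because the text describes them as plain central moments, CV is taken as standard deviation over mean, and the ratio as a channel mean over the sum of all channel means. The function names and the channel pair passed to `spectral_index` are hypothetical; the paper does not state which two channel pairs were used.

```python
import numpy as np

def spectral_features(intensities):
    """Per-channel statistics of one neighborhood.

    intensities: (N, C) array, N neighborhood points, C spectral channels.
    Assumes strictly positive channel means so CV and ratio are defined.
    """
    I = np.asarray(intensities, dtype=float)
    mean = I.mean(axis=0)                      # first raw moment
    centered = I - mean
    std = np.sqrt((centered ** 2).mean(axis=0))  # sqrt of 2nd central moment
    skew = (centered ** 3).mean(axis=0)        # 3rd central moment
    kurt = (centered ** 4).mean(axis=0)        # 4th central moment
    return {
        "mean": mean,
        "std": std,
        "skew": skew,
        "kurt": kurt,
        "cv": std / mean,                      # coefficient of variation
        "ratio": mean / mean.sum(),            # one channel to all channels
    }

def spectral_index(mean, m, n):
    """NDVI-style normalized difference of the channel means m and n."""
    return (mean[m] - mean[n]) / (mean[m] + mean[n])
```

For a three-wavelength instrument each statistic is a length-3 vector, so the dictionary above expands into several scalar features per point before the two spectral indices are appended.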

Geometric Feature:
We used 11 common geometric features, which have been widely used in lidar classification. These features are calculated from a 3D structure tensor, represented by the local covariance matrix derived from the 3D coordinates of the neighborhood points. The first three geometric features are the three eigenvalues of this 3D structure tensor: $\lambda_1$, $\lambda_2$ and $\lambda_3$. Then, three saliencies, named linearity, planarity and sphericity, are constructed from these eigenvalues. In addition, we included the sum of eigenvalues, omnivariance, anisotropy, eigenentropy, and change of curvature. The spectral and geometric features defined above correspond to different quantities and units. Therefore, a normalization was conducted to map each feature dimension onto the interval [0,1].
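The eigenvalue-based features can be sketched as follows. The formulas are the definitions commonly used with the structure tensor in the pointwise classification literature and are assumed here, since the text does not spell them out; the function names, the small `eps` guard against zero eigenvalues, and the handling of constant columns in the normalization are likewise assumptions.

```python
import numpy as np

def geometric_features(neighbors_xyz, eps=1e-12):
    """Eigenvalue features of the 3D structure tensor of one neighborhood."""
    pts = np.asarray(neighbors_xyz, dtype=float)
    cov = np.cov(pts.T, bias=True)            # 3x3 local covariance matrix
    lam = np.linalg.eigvalsh(cov)[::-1]       # eigenvalues, descending
    lam = np.clip(lam, eps, None)             # guard against zero or negative
    e1, e2, e3 = e = lam / lam.sum()          # normalized eigenvalues
    return {
        "lambda1": lam[0],
        "lambda2": lam[1],
        "lambda3": lam[2],
        "linearity": (e1 - e2) / e1,
        "planarity": (e2 - e3) / e1,
        "sphericity": e3 / e1,
        "sum_eigenvalues": lam.sum(),
        "omnivariance": (e1 * e2 * e3) ** (1.0 / 3.0),
        "anisotropy": (e1 - e3) / e1,
        "eigenentropy": float(-np.sum(e * np.log(e))),
        "change_of_curvature": e3,            # e3 / (e1 + e2 + e3), sum is 1
    }

def min_max_normalize(feature_matrix):
    """Map every feature column onto [0, 1]; constant columns become 0."""
    F = np.asarray(feature_matrix, dtype=float)
    lo, hi = F.min(axis=0), F.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (F - lo) / span
```

Points along a straight line give a linearity close to 1, while points spread evenly over a flat patch give a planarity close to 1, which is the behaviour the three saliencies are meant to capture.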

Feature Selection:
After the construction of the spectral and geometric features, some uninformative features may remain that increase feature redundancy and computational cost. Therefore, we used random forest to evaluate the importance of the features, and the most useful ones were selected for classification to reduce overfitting and computation time (Dash and Liu 1997).
In this research, we used random forest for classification because it offers a good trade-off between efficiency and accuracy (Breiman 2001). We also evaluated the feature importance by random forest. In a random forest or a decision tree, every internal node splits the samples based on a single feature. The quality of a split is measured by the decrease in impurity, which can be expressed through information gain or Gini impurity. Nodes with the greatest decrease in impurity occur near the root of a tree, whereas nodes with the least decrease in impurity appear near the leaves. In a random forest, the impurity decreases are averaged over the trees as the mean decrease in impurity, according to which the feature importance is ranked.
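As a small worked illustration of the impurity measure behind this ranking, the sketch below computes the Gini impurity of a node and the decrease obtained by one split. It is not a random forest implementation: the function names are hypothetical, and a full mean-decrease-in-impurity score would additionally accumulate these decreases per feature across all nodes and average them over the trees.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a set of class labels (0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Gini decrease when `parent` is split into `left` and `right`.

    Each child impurity is weighted by the share of parent samples it holds.
    """
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children

# A split that separates two classes perfectly removes all impurity,
# while a split that leaves both children mixed removes none.
parent = ["leaf", "leaf", "paper", "paper"]
perfect = impurity_decrease(parent, ["leaf", "leaf"], ["paper", "paper"])
useless = impurity_decrease(parent, ["leaf", "paper"], ["leaf", "paper"])
```

A feature that repeatedly produces splits like the first one accumulates a large total decrease and is ranked as important; a feature that only produces splits like the second one contributes almost nothing.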

Classification and CRF
In this research, random forest and CRF were used for classification and label smoothing, respectively. As an ensemble learning method, random forest offers a good trade-off between accuracy and efficiency. A random forest contains a set of decision trees and injects additional randomness while training them. Each decision tree produces its own response, and the final result is the output that receives the most votes from all decision trees in the forest. Compared with a single decision tree, random forest improves robustness and classification accuracy.
CRF (Lafferty et al. 2001) is used in this research because it has been widely demonstrated to be effective for both terrestrial and airborne lidar (Vosselman et al. 2017). The association potential is derived from the posterior probability of the random forest. For the interaction potentials, the contrast-sensitive Potts model (Boykov and Jolly 2001) is employed. After the construction of the graph and the potentials, loopy belief propagation (LBP) (Frey and MacKay 1998) is performed to maximize the posterior probability.
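The two kinds of potentials can be illustrated with the following Python sketch. The exact functional forms and parameters used in this work are not given in the text, so everything here is an assumption: the association term is written as the negative log of the random forest posterior, and the interaction term uses one common form of the contrast-sensitive Potts model in which neighbors with different labels are penalized less when their feature vectors differ strongly. The names `weight` and `sigma` are hypothetical parameters.

```python
import math

def association_cost(posterior, eps=1e-12):
    """Unary cost of a label, from the classifier posterior for that label."""
    return -math.log(max(posterior, eps))

def potts_interaction_cost(label_i, label_j, feat_i, feat_j,
                           weight=1.0, sigma=1.0):
    """Contrast-sensitive Potts cost for one edge of the graph.

    Equal labels cost nothing. Different labels are penalized, and the
    penalty shrinks as the squared feature distance between the two
    points grows, so label changes are cheap across strong contrasts.
    """
    if label_i == label_j:
        return 0.0
    d2 = sum((a - b) ** 2 for a, b in zip(feat_i, feat_j))
    return weight * math.exp(-d2 / (2.0 * sigma ** 2))
```

Under this form, an isolated mislabelled point surrounded by spectrally similar neighbors pays a high interaction cost, which is why such a model tends to suppress salt-and-pepper errors while preserving boundaries between spectrally distinct objects.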

Experimental Instrument
The experimental instrument employed in this research is a three-wavelength multispectral lidar. The three wavelengths are 466, 527, and 628 nm. The laser source is a supercontinuum laser (SuperK, NKT Photonics). The transmitted spectrum covers 450 nm to 2000 nm and the pulse repetition rate is 20 kHz. After transmission, the laser beam is reflected by a scanning mirror for 3D scanning and is then backscattered by the target. The backscattered signal is reflected by the mirror again into the optical receiving unit. Next, two dichroic mirrors divide the echo signal into the different wavelengths, which are converted into electrical signals by three avalanche photodiodes.

Experimental Materials
A complex indoor experimental scene is used for validation, as illustrated in Figure 2. The scene includes 14 materials with different spatial and spectral properties. The detection distance is about 7 m. The experimental targets include a piece of black paper, a Sansevieria trifasciata plant, two ceramic flowerpots, two brown cardboard boxes, a blue lamp, a black ceramic vacuum cup, a white ceramic drinking cup, a yellow cellulose tape, a wooden box, Pachira macrocarpa leaves, a P. macrocarpa trunk, a blue plastic bin, and a carrot-like ceramic object in the bin.

Figure 2. Experimental scene containing 14 materials
After scanning the experimental scene, the multispectral lidar point cloud is obtained, as shown in Figure 3. The point cloud measures about 1.01 × 0.51 × 0.35 m and contains 6684 points. We conducted a reference target-based radiometric calibration to obtain precise spectral information for every point. This calibration corrects the effects of detection distance and beam incidence angle on the spectral intensities.

Neighborhood Selection and Features
We compared the classification results of the traditional eigenentropy-based neighborhood selection and the spectrally improved neighborhood selection. Figure 4 shows the random forest classification accuracies with different feature sets, before CRF is applied. The comparison demonstrates the advantages of spectral information in two ways: neighborhood selection and classification features. The classification accuracies indicate that the proposed spectrally improved neighborhood selection outperformed the traditional eigenentropy-based method, especially for the combined geometric and spectral features (geo & spe, improved from 87.26% to 92.37%) and for the spectral-only features (improved from 84.62% to 87.01%). To explain why it increases the classification accuracy, Figure 5 shows an example that illustrates the superiority of the spectrally improved neighborhood.
(a) (b) Figure 5. Neighborhood of (a) eigenentropy-based method, and (b) spectrally improved method. The red points are the neighborhood points.
Figure 5 shows an example neighborhood of a centre point on the blue plastic bin. The spectrally improved neighborhood selection successfully removed the false neighborhood points that belong to the carrot-like ceramic object in the bin, because the spectral similarity can distinguish the spectral difference between these two targets. This demonstrates the effectiveness of the proposed spectrally improved neighborhood selection.
The advantage of spectral information is also revealed in the classification features. As shown in Figure 4, classification based on spectral features produces much higher overall accuracies than classification based on geometric features. This might be owing not only to the better prediction capability of the spectral features, but also to the complexity of the indoor scene.
We evaluated the feature importance by random forest and found that the spectral features contributed more than the geometric features. The most important features, in decreasing order of importance, are: mean_spe, SI_spe, StdDev_spe, Skew_spe, CV_spe, Kurtosis_spe, Eigenvalue_geo, ChangeofCurv_geo, Sphericity_geo, Anisotropy_geo, and Omnivariance_geo. Therefore, the spectral features contributed much more than the geometric features to multispectral lidar classification.

CRF
The results in Figure 6 show the effective performance of CRF on label smoothing. The overall accuracies of random forest and CRF are 88% and 93%, respectively. The salt-and-pepper errors were significantly alleviated by CRF, though the overall accuracy was not much improved. In addition, the error points on the black paper, the Sansevieria trifasciata plant, the brown cardboard box, and the black ceramic vacuum cup were almost entirely corrected.
(a) (b) Figure 6. Result based on (a) random forest and (b) CRF. The green and red points indicate the correct and false points, respectively.

CONCLUSION
To sum up, this paper studied a spectrally improved classification method for multispectral lidar. The main improvements concern the neighbourhood selection and the feature extraction. Finally, CRF was employed for label smoothing. We demonstrated the proposed method with a complex indoor scene and a three-wavelength multispectral lidar. The experimental results showed that spectral information supports more precise neighbourhood selection and more effective feature extraction, and thus contributes to a higher overall classification accuracy.