An improved coherent point drift method for TLS point cloud registration of complex scenes

Processing unorganized 3D point clouds is highly desirable, especially for applications in complex scenes such as mountainous or vegetated areas. Registration is the precondition for obtaining complete surface information of complex scenes. However, for complex environments, the automatic registration of terrestrial laser scanning (TLS) point clouds is still a challenging problem. In this research, we propose an automatic registration method for TLS point clouds of complex scenes based on the coherent point drift (CPD) algorithm combined with a robust covariance descriptor. Our method consists of three steps: construction of the covariance descriptor, uniform sampling of the point clouds, and CPD optimization based on the Expectation-Maximization (EM) algorithm. In the first step, we calculate a feature vector to construct a covariance matrix for each point based on the estimated normal vectors. In the subsequent step, to ensure efficiency, we use uniform sampling to obtain a small point set from the original TLS data. Finally, we form an objective function that incorporates the geometric information described by the proposed descriptor, and optimize the transformation iteratively by maximizing the likelihood function. The experimental results on TLS datasets of various scenes demonstrate the reliability and efficiency of the proposed method. Especially for complex environments with disordered vegetation or point density variations, this method can be much more efficient than the original CPD algorithm.


INTRODUCTION
During the last decades, advances in laser scanning technology have led to significant development of research and activities related to computer vision, topographic mapping, and terrain analysis [Xu et al., 2017]. Among these technologies, terrestrial laser scanning (TLS) is frequently used for applications such as object extraction, tracking, deformation detection, and building reconstruction, since it can collect dense point clouds quickly and accurately. In such applications, processing unorganized 3D point clouds is inevitable and highly desirable, especially for tasks in complex areas (for example, mountainous or vegetated scenes). However, to obtain complete information on an area or scene, multiple TLS stations are required, which leads to the registration problem of transforming the point clouds from different stations into a common coordinate system. Various 3D registration methods have been proposed and perform well, but they usually need to be carefully designed to work in specific environments. In general, an efficient registration of TLS point clouds should solve two major problems: extracting the registration primitives (geometric features) and determining the corresponding primitives [Habib et al., 2010]. However, in a complex environment, outliers caused by disordered vegetation and occlusions caused by complex objects pose challenges for automatic registration. Specifically, outliers and noise affect the extraction accuracy of registration primitives. For TLS datasets, point densities also vary considerably depending on the scanning distance and incidence angle, and this varying point density decreases the reliability of the extraction. On the other hand, complex and similar structures increase the number of mismatched correspondences, since many similar local surfaces of one predefined level appear in a complex environment.
To tackle the aforementioned problems, we propose an effective TLS registration method for complex scenes by improving the CPD method. The CPD algorithm determines the optimal transformation between stations by maximizing a Gaussian Mixture Model (GMM) likelihood function. It takes the whole point cloud into consideration without extracting geometric features, and matches iteratively to maximize the value of an objective function. The method is strongly robust to outliers and noise [Lu et al., 2018]. In addition, we design a robust 3D descriptor based on a suitable covariance matrix to describe the geometric information of each point, which helps the optimization reach a global optimum. The core concept of our proposal is therefore to combine the advantages of the covariance descriptor and the CPD algorithm. Compared with the original CPD method, the proposed method exhibits excellent performance and good applicability for complex scenes.

Related Work
Some existing methods use artificial markers to perform alignment between different stations [Kim et al., 2016]. However, the deployment and precise positioning of the artificial targets are generally labor-intensive and time-consuming, especially for mountainous or riverbank scenes.
To date, a variety of automatic registration methods have been proposed, along with many schemes for classifying them [Salvi et al., 2007]. According to the registration error, these methods are generally categorized into coarse and fine methods. The former provide initial transformation parameters for the latter. Without a rough registration, fine methods easily fall into local minima.
Most coarse methods are based on geometric primitives (including feature points, straight lines, spatial curves, and regular planes). Primitives (geometric elements used for registration) contain discriminative geometric information that facilitates the matching of correspondences. Specifically, feature points are usually extracted from point clouds to increase matching efficiency [Ge, 2017]. Various feature point extraction methods are available, including SIFT [Pang et al., 2012], SURF [Aoki et al., 2017], and DoG [Theiler et al., 2014]. However, feature point based methods are sensitive to outliers and point density variations. Apart from these, straight lines [Date et al., 2018] and regular planes [Forstner et al., 2017] are also popular primitives, but they are limited to artificial environments where regular features can be easily extracted. Spatial curves [Yang et al., 2014] and curved planes [Raposo et al., 2018] are frequently used as registration primitives as well, exhibiting good performance for free-form objects. However, for TLS point clouds of complex scenes, few effective spatial curves or curved planes can be found. These primitive-based registration methods mainly apply matching strategies (e.g., indexing, conditional constraints, or RANSAC searching) to search for potential primitives, and use feature descriptors to measure and determine correspondences.
Fine methods aim at refining the initial transformation. Typical fine methods are the Iterative Closest Point (ICP) algorithm [Besl et al., 1992] and its variations [Dong et al., 2016; Li et al., 2015]. ICP iteratively minimizes an objective function formed by the squared distances between closest points to obtain an accurate transformation. Traditional ICP is limited by its narrow region of convergence, so good initial values are needed to avoid falling into a local minimum. Other registration methods are also commonly used, such as 4-Points Congruent Sets [Mellado et al., 2014] and Simultaneous Localization and Mapping methods [Saeedi et al., 2014].
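To make the ICP loop described above concrete, the following is a minimal sketch of point-to-point ICP. It is an illustration only, not the implementation used in any of the cited works: the function names `best_rigid_transform` and `icp` are hypothetical, nearest neighbours are found by brute force, and the rigid update uses the SVD-based least-squares solution, which is one common choice.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation r and translation t such that r @ src_i + t ~ dst_i."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    h = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that r is a rotation, not a reflection.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return r, dst_c - r @ src_c

def icp(src, dst, iterations=20):
    """Point-to-point ICP with brute-force closest-point search (assumed variant)."""
    r_total, t_total = np.eye(3), np.zeros(3)
    moved = src.copy()
    for _ in range(iterations):
        # Closest point in dst for every point of the moved source cloud.
        sq_dist = np.sum((moved[:, None, :] - dst[None, :, :]) ** 2, axis=2)
        matches = dst[np.argmin(sq_dist, axis=1)]
        r, t = best_rigid_transform(moved, matches)
        moved = moved @ r.T + t
        # Compose the incremental update with the accumulated transformation.
        r_total, t_total = r @ r_total, r @ t_total + t
    return r_total, t_total
```

Because the closest-point matches are only correct when the clouds are already roughly aligned, the sketch also shows why ICP depends on a good initial transformation.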
Recently, probabilistic methods such as Coherent Point Drift have shown competitive performance in different scenarios. CPD was first introduced by [Myronenko et al., 2010]. It treats the registration of two point clouds as a probability estimation problem. Based on motion coherence theory, Gaussian Mixture Model (GMM) centroids are fit to the point clouds using the Expectation-Maximization (EM) algorithm. The CPD algorithm needs neither initial values nor a series of strategies to ensure enough correspondences, and it offers superior accuracy and stability in the presence of outliers. However, CPD only uses the distance between two point clouds as a constraint to measure similarity, and therefore performs poorly on data with varying point density.

Our Contributions
In this research, we extend the CPD algorithm with a novel descriptor for robust registration of TLS point clouds of complex scenes. The main contributions and innovations are as follows:
(1) A robust descriptor is proposed, using three feature values between the current point and its neighbours to construct a covariance matrix. The generalized eigenvalues are then calculated to measure the difference between any two points, making the descriptor robust to outliers and varying point density.
(2) Based on the descriptor, we extend the CPD algorithm by improving its objective function and the posterior probability function, to make use of distance information as well as robust geometric information provided by the descriptor.

METHODOLOGY
Our proposed registration method consists of three steps: the construction of the covariance descriptor, uniform sampling of TLS points, and CPD registration procedures. In the first step, the normal vectors of each point are estimated. Then, we calculate feature values to form a covariance matrix for each point. In the subsequent step, to ensure efficiency, we sample the TLS point clouds uniformly. Finally, we construct an objective function considering the geometric information described by the descriptor, and optimize the transformation iteratively by maximizing the likelihood function. The workflow is shown in Figure 1. The details will be introduced in the following sections.
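As an illustration of the uniform sampling step, the sketch below keeps one point per occupied cell of a regular 3D grid. This is an assumed interpretation of "uniform sampling"; the text does not specify the sampling scheme, and the function name `uniform_sample` and the parameter `cell_size` are hypothetical.

```python
import numpy as np

def uniform_sample(points, cell_size):
    """Keep one point per occupied grid cell of edge length cell_size (assumed scheme)."""
    points = np.asarray(points, dtype=float)
    # Integer cell index of every point along x, y and z.
    cells = np.floor((points - points.min(axis=0)) / cell_size).astype(np.int64)
    # np.unique with return_index gives the first point found in each occupied cell.
    _, first_idx = np.unique(cells, axis=0, return_index=True)
    return points[np.sort(first_idx)]
```

With this scheme, densely scanned regions near the scanner and sparsely scanned distant regions contribute at most one point per cell, which evens out the point density before the CPD optimization.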

Construction of covariance-based descriptor
Covariance offers a way of reducing dimensionality by quantifying how several variables change together. Inspired by [Cirujeda et al., 2015], we constructed a covariance-based descriptor that gathers shape information of a local surface. It offers several intrinsic advantages: it is invariant to spatial transformation and robust to outliers and point density variation.
For one point and its neighbours, the first step is to calculate the feature vector of each neighbour based on the normal vectors. The feature vector $F_j$ of a neighbour $P_j$ is formed from feature values including $\alpha_j$, the angle between the normal vector of the current point $P_i$ and that of the neighbour $P_j$, and $\beta_j$, the angle between the normal vector of the neighbour $P_j$ and the vector from $P_j$ to the current point $P_i$. Together, $\alpha_j$ and $\beta_j$ reflect the shape of the local surface (as shown in Figure 2). Based on the feature vectors of the neighbours, we construct a covariance matrix for the current point $P_i$, written as:

$$C_r = \frac{1}{n-1}\sum_{j=1}^{n}(F_j-\mu)(F_j-\mu)^{T} \quad (2)$$

where $n$ is the number of neighbours within a specified radius and $\mu$ is the average feature vector of the neighbours.
The covariance matrix contains the feature information of the local surface. We form a covariance matrix for each point to describe its local characteristics.
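A minimal sketch of this construction is given below, under several assumptions. Only the two angles defined in the text are computed, whereas the descriptor itself uses three feature values; the names `neighbour_angles` and `covariance_descriptor` are hypothetical; and the covariance is normalized by $n-1$, which is the usual sample covariance and may differ from the normalization used by the authors.

```python
import numpy as np

def angle_between(u, v):
    """Angle in radians between corresponding rows of u and v."""
    cos = np.sum(u * v, axis=1) / (np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def neighbour_angles(p_i, n_i, neighbours, neighbour_normals):
    """Return an (n, 2) array with the two angles described in the text per neighbour."""
    n_i_rep = np.broadcast_to(n_i, neighbour_normals.shape)
    # Angle between the normal of the current point and the normal of each neighbour.
    alpha = angle_between(n_i_rep, neighbour_normals)
    # Angle between each neighbour's normal and the vector from the neighbour to P_i.
    beta = angle_between(neighbour_normals, p_i - neighbours)
    return np.column_stack([alpha, beta])

def covariance_descriptor(features):
    """Sample covariance of the per-neighbour feature vectors (one row per neighbour)."""
    mu = features.mean(axis=0)            # average feature vector of the neighbours
    centred = features - mu
    return centred.T @ centred / (len(features) - 1)
```

The neighbour set passed to `neighbour_angles` is assumed to exclude the current point itself, since the vector from a neighbour to the current point would otherwise have zero length.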
Notably, the covariance matrix formed from the feature vectors contains variables of different dimensions. To measure the dissimilarity between any two points reasonably, we use the generalized eigenvalues of the two covariance matrices:

$$d(C_r^1, C_r^2) = \sqrt{\sum_{k=1}^{3}\ln^{2}\lambda_k(C_r^1, C_r^2)} \quad (3)$$

where $\lambda_1, \lambda_2, \lambda_3$ are the generalized eigenvalues of the covariance matrices $C_r^1$ and $C_r^2$ [Tuzel et al., 2006]. The dissimilarity describes the geometric differences of the local surface well.
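The generalized-eigenvalue dissimilarity can be sketched as follows. The sketch assumes the log-eigenvalue metric of [Tuzel et al., 2006] and assumes that both covariance matrices are symmetric positive definite; the function names are hypothetical. The generalized problem is reduced to an ordinary symmetric one through a Cholesky factor so that only NumPy is needed.

```python
import numpy as np

def generalized_eigenvalues(c1, c2):
    """Eigenvalues lam of c1 v = lam c2 v for symmetric positive-definite c1 and c2."""
    chol = np.linalg.cholesky(c2)                 # c2 = chol @ chol.T
    inv = np.linalg.inv(chol)
    # inv @ c1 @ inv.T is symmetric and has the same eigenvalues as the pencil (c1, c2).
    return np.linalg.eigvalsh(inv @ c1 @ inv.T)

def covariance_dissimilarity(c1, c2):
    """Square root of the summed squared logarithms of the generalized eigenvalues."""
    lam = generalized_eigenvalues(c1, c2)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

Because the metric depends only on ratios encoded in the generalized eigenvalues, it is unaffected by the differing units of the feature values, which is the motivation stated in the text. A covariance matrix estimated from very few neighbours may be singular, in which case this sketch would fail.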
The dissimilarity is normalized to the interval (0, 1) as in Equation (4), where $w_f$ is a weight that increases the descriptiveness. A smaller dissimilarity value indicates a smaller geometric difference between two points.

Improved CPD algorithm
The CPD algorithm considers the registration problem between two point clouds, $X_{N\times3}$ and $Y_{M\times3}$, as a probability density estimation problem in which the points of $Y$ are the GMM centroids and the points of $X$ are the data. To account for noise and outliers, the following mixture is formed:

$$p(x) = w\frac{1}{N} + (1-w)\sum_{m=1}^{M}\frac{1}{M}\,p(x|m)$$

where $0 \le w \le 1$ represents the amount of outliers. The EM algorithm is then used to estimate the optimal transformation iteratively. During the E-step, the matching probability between any two points from $X_{N\times3}$ and $Y_{M\times3}$, as well as the transformation, are "guessed" first. Bayes' theorem is then used to compute the posterior probability and construct a likelihood function. In the M-step, these parameters are updated iteratively by minimizing the upper bound of the objective function. However, only distance information is considered in the objective function, which easily leads to incorrect positions (as Figure 3(b) shows). Considering this, we construct an objective function that combines the distance term with the dissimilarity given by the proposed descriptor.
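For reference, the E-step of the original CPD algorithm can be sketched as below. This is the standard distance-only posterior from [Myronenko et al., 2010], not the improved posterior of this work, whose descriptor term is not reproduced here. The function name `cpd_posterior` and its argument names are hypothetical; `y_transformed` is assumed to hold the GMM centroids after applying the current transformation, and `sigma2` is the current isotropic variance.

```python
import numpy as np

def cpd_posterior(x, y_transformed, sigma2, w):
    """Return the (M, N) matrix P[m, n] = P(m | x_n) of the standard CPD E-step."""
    n, d = x.shape
    m = y_transformed.shape[0]
    # Squared distance between every transformed centroid and every data point.
    diff = y_transformed[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    gauss = np.exp(-sq_dist / (2.0 * sigma2))
    # Constant contributed by the uniform outlier component of the mixture.
    outlier = (2.0 * np.pi * sigma2) ** (d / 2.0) * w / (1.0 - w) * m / n
    return gauss / (gauss.sum(axis=0, keepdims=True) + outlier)
```

Every entry depends only on point-to-point distance, which illustrates the limitation noted above: two points with very different local geometry receive a high matching probability as long as they are close.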

Experimental datasets
TLS point clouds of complex scenes are used to demonstrate the performance of the proposed method. Specifically, a mountainous area and a riverbank area are selected (see Figure 4). The first dataset covers a mountainous area located on an island in China. The second dataset covers a riverbank area located in the Luogang district of Guangdong Province, China. Both datasets contain many occlusions and much noise. To test the method, we select four stations from each dataset and first use Geomagic Studio 2012 to simplify the original point clouds. Detailed information on the datasets is listed in Table 1.
Figure 4. Two TLS datasets of complex scenes: (a)-(d) T1 to T4 stations of the mountainous area, (e)-(h) T1 to T4 stations of the riverbank area.
Table 1. Detailed information of datasets

Registration results
From Figure 5, we can see that different degrees of overlap, point density variation, and even missing points exist in the datasets. Nevertheless, Figure 5 shows that adjacent TLS point clouds were aligned well by the proposed method. This shows the robustness and reliability of the method and demonstrates that it is suitable for TLS data of complex scenes. Table 2 shows that the registration errors are small (about 0.10 m for the mountain data and about 0.15 m for the riverbank data). The RMSE indicates good global alignment statistically. Notably, these registration results can be improved further by a fine registration method.
Figure 6. Registration details of Figure 5: (a), (b).
In particular, the last row of Table 2 shows that the registration error for T3 and T4 of the riverbank area is relatively large (more than 0.20 m). Some details of Figure 5(f) are extracted and shown in Figure 6, which shows a translation between the building walls and between the bridge floors. This is because the majority of points concentrate in the areas near the scanner (such as the road along the river), while the point density in distant areas is relatively small. Dense areas are therefore easily matched together under the probability constraints. In our future work, we will assign different weights to points with different point densities to compensate for this. To evaluate the performance further, we applied the original CPD algorithm to register the TLS point clouds directly. The results are shown in Table 3, which shows that the original algorithm performs poorly in complex environments. This also demonstrates the satisfactory performance of the proposed method. Table 4 shows that noise exerts little influence on the proposed method, since the mean error stays within 0.15 m. The proposed method is also robust to varying point density: the correct position can be reached even with sparse point density (for example, 0.4 m).
Table 4. Registration accuracy of different situations
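An RMSE of the kind reported for the registration results could be computed as in the sketch below. How the check points were selected is not stated in the text, so the sketch assumes that `source` and `target` hold corresponding points row by row (for example, manually picked check points); the function name `registration_rmse` is hypothetical.

```python
import numpy as np

def registration_rmse(source, target, rotation, translation):
    """RMSE of the distances between transformed source points and their target matches."""
    aligned = source @ rotation.T + translation       # apply the estimated rigid transform
    residuals = np.linalg.norm(aligned - target, axis=1)
    return float(np.sqrt(np.mean(residuals ** 2)))
```

The result is in the same unit as the point coordinates, so clouds stored in metres give an RMSE in metres.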

CONCLUSION AND FUTURE WORK
In this research, we propose an automatic registration method for TLS point clouds that improves the CPD algorithm by incorporating the geometric information described by a covariance descriptor, in order to robustly register point clouds of complex scenes. The experimental results on TLS point clouds from different scenes demonstrate the efficiency and reliability of our proposal. Especially for complex environments with disordered vegetation or point density variations, this method is much more efficient than the original CPD algorithm. The proposed method combines the advantages of the novel covariance descriptor and the CPD algorithm, achieving robust performance and good alignment.
However, some problems still need further investigation. For example, the matching probability of two points should account for the influence of point density, whose variation is a common phenomenon in TLS point clouds, and the estimation of the variance should be refined to improve convergence efficiency. In the future, we will try to apply extended coherent point drift to incorporate geometric constraints in a more principled way. Comparisons with other descriptors and registration methods will also be carried out to explore the potential performance.