Global Registration of Kinect Point Clouds using Augmented Extended Information Filter and Multiple Features

Because the infrared (IR) Kinect sensor provides accurate depths only up to 5 m and over a limited field of view (60°), registration error inevitably accumulates in indoor mapping. In this paper, we therefore propose a global registration method based on the augmented extended Information Filter (AEIF). Point cloud registration is regarded as a stochastic system, so that the AEIF can produce accurate estimates of the rigid transformation parameters by eliminating the error accumulation suffered by pair-wise registration. Moreover, because indoor scenes normally contain planar primitives, these can be employed to control the registration of multiple scans. The planar primitives are first fitted using the optimized BaySAC algorithm together with a simplification algorithm that preserves the feature points. Besides the constraint of corresponding points, we then derive a plane normal vector constraint as an additional observation model of the AEIF to optimize the registration parameters between each pair of adjacent scans. The proposed approach is tested on point clouds acquired by a Kinect camera in an indoor environment. The experimental results show that the proposed algorithm improves the accuracy of aligning multiple scans by about 90%.


INTRODUCTION
3D scene modeling for indoor environments has attracted significant interest in the last few years. Recently, Microsoft Kinect sensors, originally developed as a gaming interface, have received a great deal of attention for their ability to produce high-quality depth maps. These RGB-D cameras capture visual data along with per-pixel depth information in real time. However, one of the biggest problems encountered when processing Kinect point clouds is registration, in which the rigid transformation parameters (RTPs) are determined in order to bring one dataset into alignment with another. Because the infrared (IR) Kinect sensor provides accurate depths only up to a limited distance (typically less than 5 m) and over a limited field of view (60°), the accumulation of registration error becomes inevitable when multiple scans must be registered.
Global registration has been a well-discussed issue in terrestrial laser scanning data processing. Bergevin et al. (1996) presented an algorithm that considers the network of views as a whole and minimises the registration errors of all views simultaneously. Stoddart and Hilton (1996) identified pair-wise correspondences between points in all views and then iteratively minimised the correspondence errors over all views using a descent algorithm. This basic technique was extended by Neugebauer (1997) and Eggert et al. (1998) using a multiresolution framework, surface normal processing, and boundary point processing. Williams and Bennamoun (2000) suggested a further refinement by including individual covariance weights for each point. There is currently no consensus as to the best approach for solving the global registration problem. Kang et al. (2009) proposed a global registration method that minimizes the self-closure errors across all scans through simultaneous least-squares adjustment. Majdi et al. (2013) realized a spatial alignment of consecutive 3D data under the supervision of a parallel loop-closure detection thread.
In recent years, optimization algorithms that use a series of measurements observed over time have been introduced into point cloud registration. Ma and Ellis (2004) proposed the Unscented Particle Filter (UPF) algorithm to register two point data sets in the presence of isotropic Gaussian noise. In this paper, we regard point cloud registration as a stochastic system and global registration as the process that recursively estimates the rigid transformation parameters of each scan, so that the augmented extended Information Filter (AEIF) can be utilized to produce accurate estimates of the rigid transformation parameters by eliminating the error accumulation suffered by pair-wise registration.

GLOBAL REGISTRATION USING AUGMENTED EXTENDED INFORMATION FILTER
Kang et al. proposed a global registration algorithm using the augmented extended Kalman filter (AEKF), which also derived a constraint from the central axis of a subway tunnel to control the registration of multiple scans. The canonical form of the extended Information Filter (EIF) (1999) completely describes the Gaussian by the information (inverse covariance) matrix and the information vector, rather than by the dense covariance matrix and mean vector of the EKF. To improve efficiency, we therefore utilize the EIF to estimate the six rigid transformation parameters (three for the translation and three for the rotation). The EIF is normally employed in robotic motion planning and control and in indoor navigation. In a global registration process, however, the RTPs acquired by pair-wise registrations should be globally optimized. Therefore, in this paper the system state is augmented to contain the RTPs of all pair-wise registrations that have been completed, so that the optimized RTPs in the global reference frame are estimated in terms of the RTPs of the new registration and its preceding registration. This paper presents a design for an augmented extended Information Filter (AEIF) for the global registration of Kinect point clouds.
For a scale factor of 1, the rigid transformation between adjacent scans is parameterized as follows:
$$X = R\,X' + T \qquad (1)$$
where $X'$ and $X$ are the coordinates of the corresponding points in the analyzed and fixed scans, respectively, $R$ is the rotation matrix computed from three rotations $\varphi$, $\omega$, $\kappa$ around the coordinate axes, and $T = [t_x, t_y, t_z]^T$ is the translation vector. Therefore, each transformation has six degrees of freedom (6DOF): $t_x$, $t_y$, $t_z$, $\varphi$, $\omega$, $\kappa$.
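A minimal sketch of this 6DOF parameterization is given below. It assumes that ω, φ and κ are rotations about the X, Y and Z axes composed as R = Rz(κ)·Ry(φ)·Rx(ω), and that the transformation maps a point of the analyzed scan into the fixed scan; the rotation order is not stated in the text, and the function names are illustrative only.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(omega, phi, kappa):
    """Return R = Rz(kappa) * Ry(phi) * Rx(omega), angles in radians.

    The axis assignment and composition order are assumptions of this sketch.
    """
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    rx = [[1.0, 0.0, 0.0], [0.0, co, -so], [0.0, so, co]]
    ry = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    rz = [[ck, -sk, 0.0], [sk, ck, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rz, matmul(ry, rx))


def transform_point(rtp, point):
    """Apply R * point + T for rtp = (tx, ty, tz, phi, omega, kappa)."""
    tx, ty, tz, phi, omega, kappa = rtp
    r = rotation_matrix(omega, phi, kappa)
    return [sum(r[i][k] * point[k] for k in range(3)) + t
            for i, t in enumerate((tx, ty, tz))]


# A 90 degree rotation about Z followed by a 0.4 m shift along X moves the
# point (1, 0, 0) to approximately (0.4, 1.0, 0.0).
p = transform_point((0.4, 0.0, 0.0, 0.0, 0.0, math.pi / 2), [1.0, 0.0, 0.0])
```

Keeping the six parameters in a flat tuple mirrors the way a filter state vector would store them, while the matrix form is only built when a point actually has to be transformed.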
A detailed description of the EIF can be found in (1999). The system proposed in this paper is an augmented one, and its observation model is derived from multiple features. Therefore, this section focuses on the augmented part of the system and on the multiple-feature observation model.
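For orientation, the canonical (information) parameterization referred to above can be summarized as follows. These are the standard information-filter relations written in generic notation that is assumed here ($\Sigma$ and $\mu$ for covariance and mean, $H$ for the observation Jacobian, $R_v$ for the observation noise covariance); they are not equations taken from this paper.

```latex
% Canonical form of a Gaussian N(\mu, \Sigma)
\Lambda = \Sigma^{-1}, \qquad \eta = \Sigma^{-1}\mu

% Measurement update for z = h(x) + v, v ~ N(0, R_v),
% linearized at the prior mean \mu^{-} with H = \nabla h(\mu^{-})
\Lambda^{+} = \Lambda^{-} + H^{T} R_v^{-1} H
\eta^{+}    = \eta^{-} + H^{T} R_v^{-1}\bigl(z - h(\mu^{-}) + H\mu^{-}\bigr)
```

Because the update is additive, each observation only modifies the blocks of $\Lambda$ and $\eta$ that belong to the states it involves, which is the efficiency argument made for the information form.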

State space and system models
The augmented system state comprises the RTPs of all pair-wise registrations completed so far. The system state at time $t_k$ is defined as $X(k)$:
$$X(k) = [X_1(k),\ X_2(k),\ \dots,\ X_n(k)]^T \qquad (2)$$
where $n$ represents the number of pair-wise registrations completed and $X_i(k)$ denotes the RTPs of the $i$-th pair-wise registration.

System state model
As the RTPs of each pair-wise registration are static, the system state transition equation becomes:
$$X(k) = f(X(k-1)) = I\,X(k-1) \qquad (3)$$
where $f(\cdot)$ is the state-transition model. As can be seen, the state-transition model is an identity matrix $I$, so the prediction step leaves the state unchanged and can be ignored.

System augmented model
During the global registration process, when a new pair-wise registration is considered at time $k$, its RTPs are added to the system state vector. The RTPs in the global reference frame are estimated in terms of the RTPs of the new registration and its preceding registration using Equation (4):
$$X_{n+1}(k) = g\bigl(X_n(k-1),\ x_{n+1}\bigr) + \omega(k) \qquad (4)$$
where $X_{n+1}(k)$ represents the augmented RTPs of the currently considered registration, $X_n(k-1)$ denotes the RTPs of the preceding registration, $g(\cdot)$ is the system-augmented function, $x_{n+1}$ are the RTPs of the new pair-wise registration, and $\omega(k)$ describes a variety of uncertainties in the pair-wise registration and modeling process, which is assumed to be Gaussian and is thus expressed as a white noise vector $N(0, Q)$.
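One way to picture the system-augmented function g(.) is as the composition of the preceding registration's global transformation with the new pair-wise transformation. The sketch below works with (R, T) pairs rather than with the six parameters themselves, and the name `compose` is hypothetical; it illustrates the idea under these assumptions and is not the paper's implementation.

```python
def compose(global_prev, pairwise):
    """Chain a pair-wise transform onto the preceding global transform.

    global_prev = (R_g, T_g) maps scan n into the global frame.
    pairwise    = (R_p, T_p) maps scan n+1 into scan n.
    The result maps scan n+1 into the global frame:
        X_global = R_g * (R_p * x + T_p) + T_g
    """
    r_g, t_g = global_prev
    r_p, t_p = pairwise
    r_new = [[sum(r_g[i][k] * r_p[k][j] for k in range(3)) for j in range(3)]
             for i in range(3)]
    t_new = [sum(r_g[i][k] * t_p[k] for k in range(3)) + t_g[i]
             for i in range(3)]
    return r_new, t_new


# Two identical pair-wise steps: rotate 90 degrees about Z, shift 1 m along X.
rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
step = (rot_z_90, [1.0, 0.0, 0.0])
identity = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])

pose = identity
for _ in range(2):
    pose = compose(pose, step)
```

Because every global pose is built from the previous one, any error in a pair-wise result propagates to all later scans, which is the accumulation the filter is meant to correct.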

Observation model
An observation model is established to optimize the RTPs of all pair-wise registrations by minimizing the differences between the corresponding 3D feature pairs transformed into the common reference frame. Besides point features, planar features are employed to control the registration of multiple scans, since indoor scenes normally contain plenty of planar primitives.
The observation model is therefore derived from multiple features: a point-feature model and a planar-feature model.

Point-feature models
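The observation model of point features is established to minimize the differences between the corresponding 3D point pairs transformed into the common reference frame:
$$0 = h(X(k)) + v(k), \qquad h(X(k)) = \begin{bmatrix} X_i - X_i' \\ Y_i - Y_i' \\ Z_i - Z_i' \end{bmatrix} \qquad (5)$$
where $(X_i, Y_i, Z_i)$ and $(X_i', Y_i', Z_i')$ are the coordinates of the corresponding point pair $i$ in the fixed and analyzed scans, respectively, transformed into the common reference frame, $h(\cdot)$ is the observation model, and $v(k)$ denotes a variety of uncertainties in the scanning measurement and the transformation of coordinates, which is assumed to be Gaussian.
The coordinates $(X_i', Y_i', Z_i')$ are computed as follows:
$$\begin{bmatrix} X_i' \\ Y_i' \\ Z_i' \end{bmatrix} = R_i \begin{bmatrix} x_i' \\ y_i' \\ z_i' \end{bmatrix} + \begin{bmatrix} t_{x,i} \\ t_{y,i} \\ t_{z,i} \end{bmatrix} \qquad (6)$$
where $(x_i', y_i', z_i')$ are the coordinates of the point in the analyzed scan, $(t_{x,i}, t_{y,i}, t_{z,i})$ represent the translation from scan $i+1$ to $i$, and $R_i$ is the rotation from scan $i+1$ to $i$. Equation (5) is linearized as:
$$h(X(k)) \approx h(\hat{X}(k)) + \nabla h_X\,\bigl(X(k) - \hat{X}(k)\bigr)$$
where $\hat{X}(k)$ is the current state estimate and $\nabla h_X$ is the Jacobian matrix derived from Equations (5) and (6).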
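The Jacobian of the point-feature model can be checked numerically. The sketch below evaluates the point-pair residual for a single pair-wise transformation and approximates its Jacobian with central differences. The rotation convention R = Rz(κ)·Ry(φ)·Rx(ω), the parameter ordering and all function names are assumptions made for illustration; an actual filter would more likely use analytic derivatives.

```python
import math


def rotate(omega, phi, kappa, p):
    """Apply Rz(kappa) * Ry(phi) * Rx(omega) to point p (assumed convention)."""
    x, y, z = p
    # Rotation about X by omega
    y, z = (math.cos(omega) * y - math.sin(omega) * z,
            math.sin(omega) * y + math.cos(omega) * z)
    # Rotation about Y by phi
    x, z = (math.cos(phi) * x + math.sin(phi) * z,
            -math.sin(phi) * x + math.cos(phi) * z)
    # Rotation about Z by kappa
    x, y = (math.cos(kappa) * x - math.sin(kappa) * y,
            math.sin(kappa) * x + math.cos(kappa) * y)
    return [x, y, z]


def residual(rtp, fixed_pt, analyzed_pt):
    """Difference between the fixed point and the transformed analyzed point."""
    tx, ty, tz, phi, omega, kappa = rtp
    moved = rotate(omega, phi, kappa, analyzed_pt)
    return [fixed_pt[i] - (moved[i] + t) for i, t in enumerate((tx, ty, tz))]


def numeric_jacobian(rtp, fixed_pt, analyzed_pt, eps=1e-6):
    """Central-difference 3x6 Jacobian of the residual w.r.t. the six RTPs."""
    jac = [[0.0] * 6 for _ in range(3)]
    for j in range(6):
        plus, minus = list(rtp), list(rtp)
        plus[j] += eps
        minus[j] -= eps
        r_plus = residual(plus, fixed_pt, analyzed_pt)
        r_minus = residual(minus, fixed_pt, analyzed_pt)
        for i in range(3):
            jac[i][j] = (r_plus[i] - r_minus[i]) / (2.0 * eps)
    return jac


# Jacobian at the identity transform for the point (1, 0, 0)
jac = numeric_jacobian([0.0] * 6, [1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```

At the identity transform the translation columns form the negative identity matrix, and a small rotation κ about Z moves the point (1, 0, 0) along Y, so only the Y residual responds to κ.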

Planar feature model
A singularity-free representation of a plane (2014), which describes a plane by its normal vector $\vec{n} = [n_x, n_y, n_z]^T$ and the perpendicular distance $d$ from the origin, was employed. This representation is also known as the Hesse form of the plane. Equation (9) is the full expression for the parameterization:
$$n_x x + n_y y + n_z z - d = 0 \qquad (9)$$
Because a plane has only three degrees of freedom, we impose a constraint on the length of the normal vector $\vec{n}$:
$$\|\vec{n}\| = \sqrt{n_x^2 + n_y^2 + n_z^2} = 1$$
Because enough triples of mutually perpendicular planes cannot be expected in an indoor environment, the planar feature constraint is derived only to control the rotation deviation in the registration process.
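For each plane, the following three equations (one per component) are used to compute the difference in the normal vector:
$$\Delta n = n_i - R\,n_j$$
where $n_i$ and $n_j$ are the normal vectors of the corresponding planes $i$ and $j$, and $R$ is the rotation. The observation model is derived as
$$0 = h(X(k)) + v(k), \qquad h(X(k)) = R_i^{g} \begin{bmatrix} n_{x,i} \\ n_{y,i} \\ n_{z,i} \end{bmatrix} - R_j^{g} \begin{bmatrix} n_{x,j} \\ n_{y,j} \\ n_{z,j} \end{bmatrix} \qquad (11)$$
where $(n_{x,i}, n_{y,i}, n_{z,i})$ and $(n_{x,j}, n_{y,j}, n_{z,j})$ are the corresponding normal vectors of a plane in the coordinate frames of scans $i$ and $j$, $R_i^{g}$ and $R_j^{g}$ denote the rotations from the coordinate frames of scans $i$ and $j$, respectively, to the global coordinate frame, $h(\cdot)$ is the observation model, and $v(k)$ denotes a variety of uncertainties in the scanning measurement and the transformation of coordinates, which is assumed to be Gaussian. Equation (11) is linearized as:
$$h(X(k)) \approx h(\hat{X}(k)) + \nabla h_X \begin{bmatrix} \Delta_i \\ \Delta_j \end{bmatrix}$$
where $\hat{X}(k)$ is the current state estimate, $\nabla h_X$ is the Jacobian matrix derived from Equation (11), and $\Delta_i$ and $\Delta_j$ respectively denote the rotation-angle increments $(\Delta\varphi_i, \Delta\omega_i, \Delta\kappa_i)$ and $(\Delta\varphi_j, \Delta\omega_j, \Delta\kappa_j)$.
The more observation equations of planar features that are added to the AEIF system, the greater the contribution of the constraint to the registration.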
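As an illustration of the normal-vector constraint, the following sketch computes the difference between two corresponding plane normals after both have been rotated into the global frame. The function names are hypothetical, the rotations are passed as plain 3×3 lists, and the example restricts itself to rotations about the Z axis for brevity.

```python
import math


def rot_z(kappa):
    """Rotation matrix about the Z axis (angle in radians)."""
    c, s = math.cos(kappa), math.sin(kappa)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate_vec(r, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]


def normal_residual(r_i, n_i, r_j, n_j):
    """Difference of two plane normals expressed in the global frame."""
    a = rotate_vec(r_i, n_i)
    b = rotate_vec(r_j, n_j)
    return [a[k] - b[k] for k in range(3)]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


# A wall with global normal (1, 0, 0). Scan i is aligned with the global
# frame; scan j is rotated by 30 degrees about Z, so it observes the normal
# rotated by -30 degrees.
n_i = [1.0, 0.0, 0.0]
n_j = rotate_vec(rot_z(math.radians(-30.0)), [1.0, 0.0, 0.0])
identity = rot_z(0.0)

# With the correct rotation of scan j the residual vanishes.
good = normal_residual(identity, n_i, rot_z(math.radians(30.0)), n_j)
# A 2 degree error in the rotation of scan j leaves a nonzero residual.
bad = normal_residual(identity, n_i, rot_z(math.radians(32.0)), n_j)
```

The residual depends only on the rotations, not on the translations, which matches the statement that the planar constraint is used to control the rotation deviation.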

Fitting of planar primitives
To ensure the efficiency and robustness of fitting, the optimized BaySAC algorithm (2014) is employed to estimate the parameters of the planar primitives. Moreover, a simplification algorithm that preserves the feature points is employed to simplify the point cloud while maintaining the fitting accuracy. We begin with the discrete 3D laser scanned data and derive the feature points of the point clouds through the analysis of point cloud smoothness and boundary features. We then thin the non-feature points to simplify the point cloud while retaining the feature points. Since feature points are of great importance for further processing, as many of them as possible are kept. Even though a few false feature points may remain, their influence on the fitting process can be ignored.
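The optimized BaySAC algorithm itself is not reproduced here. As a rough stand-in, the sketch below uses plain RANSAC, a simpler hypothesis-sampling scheme without BaySAC's probability-guided sampling, to fit a plane in Hesse form n·p = d. The threshold, iteration count, seed and function names are illustrative assumptions.

```python
import math
import random


def plane_from_points(p0, p1, p2):
    """Return (unit normal, distance) of the plane through three points.

    Returns None when the points are (nearly) collinear.
    """
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:
        return None
    n = [c / length for c in n]
    d = sum(n[i] * p0[i] for i in range(3))
    if d < 0:  # keep the distance from the origin non-negative
        n, d = [-c for c in n], -d
    return n, d


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Plain RANSAC: keep the 3-point hypothesis with the most inliers."""
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        n, d = model
        inliers = sum(
            1 for p in points
            if abs(sum(n[i] * p[i] for i in range(3)) - d) <= threshold
        )
        if inliers > best_inliers:
            best, best_inliers = model, inliers
    return best, best_inliers


# 20 points on the plane z = 1 plus 5 off-plane outliers
points = [[float(x), float(y), 1.0] for x in range(5) for y in range(4)]
points += [[0.5, 0.5, 3.0], [2.2, 1.1, 0.2], [3.7, 2.4, 2.5],
           [1.3, 2.9, -1.0], [4.1, 0.3, 1.8]]
(normal, distance), count = ransac_plane(points)
```

The returned normal has unit length by construction, so it satisfies the length constraint imposed on the Hesse form.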
Therefore, the observation model derived from multiple features combines the point-feature model of Equation (5) with the planar-feature model of Equation (11). The RTPs of all of the pair-wise registrations are then optimized by minimizing both the differences between the corresponding 3D point pairs and the differences between the corresponding normal vectors of planar features, which is expected to improve the robustness and accuracy of the proposed global registration approach.

State augmentation
When a new pair-wise registration is completed, its RTP vector $X_{n+1}(k)$ is added to the system state. The information vector and the information matrix are then augmented accordingly, where $\Lambda^{-}(k+1)$ denotes the a priori information matrix, $\eta^{-}(k+1)$ represents the a priori information vector, $\nabla g$ is the Jacobian matrix of the system-augmented model, and $Q$ is the covariance matrix of the noise $\omega(k)$.
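For reference, a common way to write such an augmentation in information form is sketched below. It follows from linearizing the system-augmented model about the current mean $\mu$ and forming the joint Gaussian of the old state and the new RTPs. The shorthand $G$ and $c$ and the exact arrangement are assumptions of this sketch, which may differ from the expressions used by the authors.

```latex
% Linearization: X_{n+1} \approx g(\mu) + G\,(X - \mu) + \omega, \quad \omega \sim N(0, Q)
% with G = \nabla g evaluated at \mu and c = g(\mu) - G\mu
\Lambda^{-}(k+1) =
\begin{bmatrix}
\Lambda(k) + G^{T} Q^{-1} G & -G^{T} Q^{-1} \\
-Q^{-1} G & Q^{-1}
\end{bmatrix},
\qquad
\eta^{-}(k+1) =
\begin{bmatrix}
\eta(k) - G^{T} Q^{-1} c \\
Q^{-1} c
\end{bmatrix}
```

Since $G$ is nonzero only in the columns of the preceding registration, the new off-diagonal blocks link the added RTPs to that registration alone, and the rest of the information matrix is left untouched.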

EXPERIMENTAL RESULTS
The proposed approach was tested on real datasets (Figure 1) acquired with a Kinect 2.0 in a room. Sixty-one scans were captured with an average shift of 0.4 m between the scanning centers. Figure 3 shows obvious registration error accumulation. To estimate the error accumulation, Figure 3b~d illustrate the cross sections extracted at the same positions from multiple scans, which were transformed into the same coordinate system using the pair-wise registration results. Distinct deviations among the cross sections in Figure 3b~d are present owing to the accumulation of pair-wise registration errors. Table 2 lists the numeric differences between the corresponding points, which were transformed into the same coordinate system using the pair-wise registration results. The average deviation is 0.1770 m, which is much larger than the average value of 0.012 m in Table 1. The difference confirms the existence of error accumulation in pair-wise registration. As proposed in Section 2, we implemented the augmented extended Information Filter to eliminate the error accumulation suffered by pair-wise registration and thereby produce accurate estimates of the rigid transformation parameters.

Global registration using augmented extended Information filter
The pair-wise registration results were used to construct the augmented extended Information Filter system, from which the global registration was implemented. Besides the corresponding points shown in Figure 2, planar features were extracted and added to the observation model (Figure 4).
Figure 4. Corresponding planar features extracted from consecutive scans.
For the comparison of registration accuracies, cross sections were also extracted at the same positions, as shown in Figure 5. Figure 5 illustrates that the deviations between the cross sections decrease compared with the deviations shown in Figure 3. The average deviation computed from the results in Table 3 accordingly reduces to 0.023 m.
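The reported averages can be related to each other with a short calculation. The sketch below uses only the three values quoted in the text (0.012 m, 0.1770 m and 0.023 m); the variable names are illustrative.

```python
# Average deviations reported in the text, in metres
pairwise_avg = 0.012      # accuracy of individual pair-wise registrations
accumulated_avg = 0.1770  # deviation after chaining pair-wise registrations
global_avg = 0.023        # deviation after AEIF-based global registration

# How much the error grows when pair-wise results are chained
accumulation_factor = accumulated_avg / pairwise_avg

# Relative reduction achieved by the global registration
relative_reduction = (accumulated_avg - global_avg) / accumulated_avg
```

The accumulated error is roughly 15 times the pair-wise accuracy, and the global registration removes about 87% of it, which is consistent with the rounded figure of about 90% quoted in the paper.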

CONCLUSIONS
In this paper, we proposed a global registration approach based on the augmented extended Information Filter. Point cloud registration was regarded as a stochastic system, so that the AEIF could be utilized to produce accurate estimates of the rigid transformation parameters by eliminating the error accumulation suffered by pair-wise registration.
The proposed algorithm was applied to Kinect point clouds that were acquired in an indoor environment. The experimental results show that, when multiple scans are aligned into a common coordinate frame, the consecutive application of pair-wise registration leads to error accumulation (from 0.012 m to 0.1770 m). The results also illustrate that global registration based on the AEIF can reduce the error accumulation (from 0.1770 m to 0.023 m, a reduction of about 87%), which improves the accuracy of aligning multiple scans by close to 90%.
As indoor environments normally contain plenty of curved and linear primitives besides planar ones, future work will focus on incorporating such primitives together with points to improve the robustness and applicability of our registration algorithm.

Figure 2. The pair-wise registration result.
Pair-wise registrations were implemented by the SIFT-based image matching method proposed by Kang et al. (2009). Figure 2(a) shows the corresponding points between consecutive scans, based on which the pair-wise registration result was acquired (Figure 2(b)). Table 1 lists the accuracies of the pair-wise registrations, whose average value is 0.012 m.

Figure 3. The error accumulation of pair-wise registration. (a) Overview; (b) Position A; (c) Position B; (d) Position C.

Table 1. The accuracies of pair-wise registrations.

Table 3. Deviations between corresponding point pairs.