SEGMENT-BASED LIDAR ODOMETRY FOR LESS STRUCTURED OUTDOOR SCENE

Accurate registration of sparse sequential point clouds data frames acquired by a 3D light detection and ranging (LiDAR) sensor like VLP-16 is a prerequisite for the back-end optimization of general LiDAR SLAM algorithms to achieve a globally consistent map. This process is also called LiDAR odometry. Aiming to achieve lower drift and robust LiDAR odometry in less structured outdoor scene using a low-cost wheeled robot-borne laser scanning system, a segment-based sampling strategy for LiDAR odometry is proposed in this paper. Proposed method was tested in two typical less structured outdoor scenes and compared with other two state of the art methods. The results reveal that the proposed method achieves lower drift and significantly outperform the state of the art.


INTRODUCTION
In the last ten years, 3D LiDAR SLAM (Simultaneous Localization And Mapping) is an active research topic in Robotics and 3D Vision (Zhang and Singh, 2017). On the one hand, backpack and handheld mobile mapping system (MMS) which generally are equipped with laser scanners, cameras, inertial measurement units (IMU) utilizes this technology to derive the position and orientation (POSE) of the MMS for 3D mapping (Nüchter et al., 2015;Kaarta Stencil 2, 2018;Leica BLK2GO, 2019;Karam et al., 2019). On the other hand, taken the LiDAR SLAM as the core technology for localizing and mapping, flexible unmanned ground vehicles (UGVs) are developed for various applications such as autonomous driving (Claussmann et al., 2019), building information modeling (Blaser et al., 2018), etc.
In general, LiDAR SLAM consists of two modules (Nüchter et al., 2007): 1) Front-end (LiDAR odometry) which uses sequential LiDAR scans to estimate the sensor's pose in realtime. 2) Back-end which detects the loops and applies the global optimization to generate the refined poses and map. This paper focuses on the former module.
There are various works on SLAM in the literature, but we concentrate here on the approaches used 3D LiDAR sensor to register point clouds into a common coordinate system. Most existing methods are the variants of the traditional Iterative Closest Point (ICP) algorithm (Bes and McKay, 1992;Chen and Medioni, 1991). Pomerleau et al. (2015) review the different solutions of ICP and present a generic scheme for registration algorithms. Point-to plane distance is widely used as an error metric of ICP because of fast convergence rate. Different from the traditional scan matching, LiDAR odometry adopts the strategy that aligns current scan to the map instead of the last scan to reduce the cumulative error in the consecutive scan matching (Wulf et al., 2008). Nüchter et al. (2007) adopts a stop-scan-go strategy to scan the environment and register neighboring scans to a common  Corresponding author coordinate system by the ICP algorithm. Bosse and Zlot (2009) use a spinning 2D LiDAR sensor to acquire point clouds which is registered by matching 3D voxels based on geometric features like planar and cylinder features, and a continuous 6DOF sensor trajectory is recovered. Zhang and Singh (2014) propose a realtime lidar odometry and mapping method called LOAM. This method consists of two parts, one algorithm performs odometry at high frequency to get undistorted sweep and another algorithm aligns the sweep to the map at low frequency. LOAM is the best LiDAR odometry method on the KITTI odometry benchmark (Geiger et al., 2012). Shan and Englot (2018) propose a lightweight lidar odometry and mapping method based on LOAM. A two step Levenberg-Marquardt optimization is proposed to accelerate the computation time. Three Degree of Freedom(DOF) transformation which is [ , , ℎ ] is estimated by matching the planar features first, then another three DOF transformation is estimated by matching edge features. Deschaud (2018) proposes a scan-to-model matching method with the Implicit Moving Least Squares (IMLS) surface representation. They use the previous relative transformation to unwarp the current frame point cloud data assuming that the egomotion is relatively similar between two consecutive scans. Behley and Stachniss (2018) propose a method called SuMa which constructs a surfel-based map and calculate the pose by exploiting the projective data association between current scan and a rendered model view from that surfel map. To further improve the pose estimation, Chen et al. (2019) add semantic constrains derived by deep neural network to the frame-to-model ICP problem based on SuMa.
From the above, most methods rely on the feature points extracted from the raw point cloud to do the scan matching. Those methods work well in structured scene, but can't be well adopted to less structured scenes with feature points hard to describe such as tree leaves (Ji et al., 2018;Shan and Englot, 2018).
To decrease the cumulative matching error in less structured scene which is very common in outdoor, we propose a segment-The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII- B1-2020, 2020XXIV ISPRS Congress (2020 based sampling strategy to filter the point cloud, ensuring a more stable feature point extraction to achieve lower drift LiDAR odometry in such scenes.

SEGMENT-BASED LIDAR ODOMETRY
A segment-based LiDAR odometry method is proposed to register the sequential point cloud data acquired by a sparse 3D LiDAR sensor like VLP-16. As illustrated in Figure 3, two key modules are considered in the proposed method. The first part is the point cloud segmentation in which the raw laser point clouds are filtered to generate segments that consists of the ground points and specific objects points. Second part is the scan matching, aligning the low-level feature points extracted from the filtered point cloud segments to the map, to get the pose of current frame. Once one frame of laser points has been aligned to the map, feature map will be updated and wait for the matching of the next frame.

Point Clouds Segmentation
The point clouds acquired by a data frame of the 3D LiDAR sensor is defined as a scan . Both dynamic and static object points such as poles, cars, facades, tree trunks and leaves are collected in a single scan. Feature points among these static objects excluding tree leaves are more stable for matching. Therefore, it is necessary to filter the point clouds via object segmentation before further processing (i.e. scan matching). The fast range image-based segmentation proposed by Bogoslavskyi (Bogoslavskyi and Stachniss, 2016) is utilized for the ground and non-ground point clouds segmentation. The ground points and object points with number of points more than 20, defined as , are added to the segments for the scan matching.
(a) Raw laser point clouds rendered in intensity (b) Object segments' points (random color) Figure 1. Illustration of the point cloud segmentation

LiDAR Odometry
Two coordinate systems, namely, local coordinate system and world coordinate system , are involved in the proposed method. Local coordinate system is a 3D coordinate system with its origin at the geometric center of the 3D LiDAR sensor. The initial local coordinate system is defined as the world coordinate system . The task of this module is estimate the ego-motion of the LiDAR sensor using a sequence of point clouds and corresponding segment clouds , = 0,1, … , . represents the number of LiDAR scans. The ego-motion of the LiDAR sensor is represented by the 6 DOF transformation between the local coordinate system and the world coordinate system , defined as . consists of two parts, including the rotation matrix and the translation vector . Therefore, a point in local coordinate system can be transformed into the world coordinate system by the formula 2.
The edge feature points and planar feature points are extracted from the segment clouds according to the criterion proposed by LOAM (Zhang and Singh, 2014). Then, is estimated through feature points in current frame registering with the feature map point clouds by minimizing the point to line and point to plane distances iteratively until converges solved by the Levenberg-Marquardt method (Marquardt, 1963) which is used to solve the nonlinear least squares problem. Distances larger than 1m are regarded as outliers and removed. Shown in formula3, the initial transformation of is estimated by the transformation increment with respect to the last two frames. After deriving the final , the feature points which are extracted from will be added into the feature map point clouds for the matching of the next frame. Furthermore, a sliding window based "scan to map" matching strategy is used to balance the precision and efficiency, as showing in Figure 4.

Experiment data
The proposed method was tested using the data collected by our robot-borne laser scanning system (Wu et al., 2019). The system uses the Clearpath Jackal as the motion platform, and consists of a Velodyne VLP-16 laser scanner, a Zed camera, a Mti-300 IMU and a NUC on-board processor (as shown in Figure 5). In this experiment, we only use the point cloud data collected by the VLP-16 which has 16 scanlines evenly distributed in the vertical field of view of ±15 degrees. The speed of the UGV is about 1.5m/s. Algorithm is implemented in C++ and executed in the Ubuntu Linux system. Point clouds collected in two typical sites including a 1.5 km road in Wuhan university (Site 1) and another relatively short greenway of Donghu Lake in Wuhan, China (Site 2) are used for experiments. Table 1 provides the dataset specifications for the two sites. Overall point clouds mapping result for the two typical sites are as shown in Figure 7. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B1-2020, 2020 XXIV ISPRS Congress (2020 edition)

Comparison with other methods and analysis
We benchmark our method against state of the art LiDAR odometry method, namely LOAM 1 (Zhang and Singh, 2014) and LeGO-LOAM(Shan and Englot, 2018).
Due to the lack of ground truth of the pose, we use two criteria to assess the accuracy of our method that is the loop closure difference showing the drift over the whole trajectory and the root 1 https://github.com/laboshinl/loam_velodyne mean square error (RMSE) of plane fitting showing the local mapping quality.
In Site 1, 1 closed loop is presented. The start position and the end position is the same place. From Table 2, our method demonstrates 0.5% position drift over 1.5km, outperforming two other methods.
10 planes (e.g., planar ground, planar facades) in the map were randomly selected in Site 1, and plane fitting RMSE was calculated for our method, as shown in Figure 6. The average RMSE of the plane fitting for our method is 2.7cm. In Site2, there is no loop, so 5 planes (planar ground) in the maps were randomly selected and plane fitting RMSE was evaluated for three methods, as shown in Table 3. The average RMSE of the plane fitting for our method is 3cm. Quantitative result from Table 3 shows that our algorithm achieves the best performance.  It can be found from Figure 8 that LOAM has a significant drift in elevation direction. It is not surprising because feature points from unstable objects such as leaves have a negative effect on registration. LeGO-LOAM also utilize point cloud segmentation to filter unstable points, though there are two main differences with our method. First, we propose the object segments for the stable feature points extraction; Second, feature map point clouds are extracted from raw laser data considering the effect of sparse point cloud data on segmentation. In the proposed LiDAR odometry procedure, we ignore the motion distortion in each frame of point clouds due to the slow motion of the platform.

CONCLUSIONS
In this paper, we propose a segment-based LiDAR odometry method for sparse sequential point clouds data frames. Experiments demonstrate that our method is more robust and have lower drift in less structured scene. Future work will be focused on global optimization for the purpose of global consistently 3D mapping, and using IMU to adjust motion distortion.