A THRESHOLD-FREE FILTERING ALGORITHM FOR AIRBORNE LIDAR POINT CLOUDS BASED ON EXPECTATION-MAXIMIZATION

Filtering is a key step for most applications of airborne LiDAR point clouds. Although lots of filtering algorithms have been put forward in recent years, most of them suffer from parameters setting or thresholds adjusting, which will be time-consuming and reduce the degree of automation of the algorithm. To overcome this problem, this paper proposed a threshold-free filtering algorithm based on expectation-maximization. The proposed algorithm is developed based on an assumption that point clouds are seen as a mixture of Gaussian models. The separation of ground points and non-ground points from point clouds can be replaced as a separation of a mixed Gaussian model. Expectation-maximization (EM) is applied for realizing the separation. EM is used to calculate maximum likelihood estimates of the mixture parameters. Using the estimated parameters, the likelihoods of each point belonging to ground or object can be computed. After several iterations, point clouds can be labelled as the component with a larger likelihood. Furthermore, intensity information was also utilized to optimize the filtering results acquired using the EM method. The proposed algorithm was tested using two different datasets used in practice. Experimental results showed that the proposed method can filter non-ground points effectively. To quantitatively evaluate the proposed method, this paper adopted the dataset provided by the ISPRS for the test. The proposed algorithm can obtain a 4.48% total error which is much lower than most of the eight classical filtering algorithms reported by the ISPRS. ∗ Corresponding author


INTRODUCTION
Airborne Light Detection and Ranging (LiDAR) has been developing very fast in the past decades.Comparing with traditional remote sensing means, airborne LiDAR owns the strength of high efficiency in collecting geographic information of large areas (Liu, 2008;Meng et al., 2010).Moreover, weather conditions have no influence on this technique when gathering point clouds (Vosselman and Maas, 2010).Thus, airborne LiDAR has been widely used in many areas, such as digital terrain model (DTM) extraction (Li et al., 2017;Ozcan and Unsalan, 2016), three-dimensional model generation (Yang et al., 2017;Zhou and Zhou, 2014), forest parameter estimation (Hyypp et al., 2012;Vega et al., 2016), etc.
Although airborne LiDAR is cost-effective when acquiring three-dimensional information, it takes up a lot of time in processing point cloud data, especially in manual classification.It has been established that the manual classification and quality control consume an estimated 60-80% of processing time (Flood, 2001).Consequently, it is urgent to develop algorithms to speed up point clouds post-processing efficiency.
To develop algorithms for many other applications, one fundamental step is to extract DTM information from point clouds that contain both ground points and non-ground points.This process is generally known as filtering.Aiming at realizing filtering effectively, lots of algorithms have been put forward in the past twenty years.These algorithms can be categorized into four classes, namely slope-based (Vosselman, 2000) (Sampath, 2005;Sithole, 2001;Susaki, 2012), morphology-based, interpolation-based (Chen et al., 2007;Hui et al., 2016;Hui et al., 2017;Li et al., 2013;Li et al., 2014;Mongus et al., 2014;Pingel et al., 2013), and segmentation-based (Tóvári and Pfeifer, 2005) (Lin and Zhang, 2014) (Chen et al., 2016).Although these algorithms in the literature can obtain a good performance, most of them suffer from parameters setting or thresholds adjusting.Obviously, it is inconvenient for inexperienced staff.Moreover, many parameters setting or thresholds adjusting is always time-consuming and reduces the degree of automation of the algorithm.To overcome this problem, this paper proposed a threshold-free filtering algorithm based on expectation-maximization.

METHODOLOGY
According to central limit theorem, naturally measured sample data will lead to a normal distribution.Conversely, due to the complex terrain environments point clouds can be assumed as a mixture of Gaussian models.Therefore, the separation of ground points and non-ground points from point clouds can be seen as a separation of a mixed Gaussian model.
Expectation-maximization (EM) is applied for realizing the separation.EM is an approach for fitting probability distributions and can calculate maximum likelihood estimates of parameters to probabilistic models being fit to data.When we do not know which component (ground or object) the point belonging to, EM can be used to calculate maximum likelihood estimates of the mixture parameters.Using the estimated parameters the likelihoods of each point belonging to ground or object can be computed.It is obvious that the point is labelled as the class corresponding to the maximum likelihood.The detailed procedure of the proposed algorithm comprises three steps as presented in Sections 2.1 to 2.3, respectively:

Point Clouds Denoising
Due to the influence of external environments or the laser rangefinder malfunction, the acquired point clouds always contain noisy points, including high outliers and low outliers.Both of these two outliers may disturb the assumed normal distribution; especially the low outliers may have a great influence on the final filtering results.Hence, the outliers should firstly be removed.
To realize denoising, the point clouds are first organized using k-dimension tree.A point is eliminated if its elevation value changes greatly before and after morphological opening operation among its k neighbors.
The morphological opening operation is achieved by performing an erosion of the dataset followed by a dilation given as Equation ( 1): ( )

Ground Points Extraction Using EM
To realize the filtering, the posterior probability of a point p belonging to ground points ( G ) should be calculated.It can be determined according to Equation (3): ( ) ( ) ( ) ( )

P p G P G P G p P p =
(3) where ( ) P G is the prior probability of ground points It is obvious that p will be labelled as a ground point if

( )
P G p is greater than 0.5.Since we assume that two classes namely ground points and object points form the mixed Gaussian models, it is easy to set ( ) P G to be 0.5.Thus, to obtain the posterior probability of a point belonging to ground points, we need to calculate the class-conditional density

( )
P p G given as Equation ( 4), which can be estimated using EM algorithm.EM algorithm is a general method for fitting probability distributions that mainly includes the following four steps (Dempster et al., 1977): ⅰInitializing mixture parameters to random values.
EM algorithm keep iterating until it reaches convergence.Meanwhile, the posterior probability of a point p belonging to ground points or object points is obtained.Obviously, p is classified as the class with larger posterior probability.
According to the binary classification results filtering outcome can be achieved.

Optimization Using Intensity Information
Nowadays, most airborne LiDAR systems provide intensity information which can help to optimize the filtering results obtained at Section 2.2.Owing to the characteristic that same materials have similar reflection intensities, the errorclassifying points obtained by the EM method can be revised according to intensity information.Since point clouds are organized using k-dimension tree at Section 2.1, it is easy to find the k nearest neighbors of one point.If one point's intensity value is greatly different from that of its neighbors, we can infer that this point is wrongly classified.Meanwhile, the corresponding label should be modified.

EXPERIMENTAL RESULTS AND ANALYSIS
The proposed algorithm was tested using two datasets used in practice.The first dataset (Sample1) was acquired using an Optech ALTM scanner from the city of Jingmen in China with an area of 1.20 km 2 and 984,998 points cloud data in total.The second dataset (Sample2) was located within the city of Luoding, China, characterized by modern architecture with low and high-storey buildings.Both of these two datasets contain X, Y, Z coordinates and intensity information.Figures 1 and 2 are intensity images of the two areas.It can be found that these two areas contain different terrain features.To test the effect of intensity information on the filtering, this paper compared the performance before and after using the intensity data.The comparison results are shown in Figure 3.It can be found that both the performances of Sample1 and Sample2 are improved by applying the intensity information to the filtering results in terms of total error.It is because the intensity data commonly provides more semantic information which will help revise the filtering results.Conversely, the TypeⅠerror of Sample1 turns a little larger after using the intensity information.This can be explained that there are many different natural objects and water areas in Sample1.These complex objects or areas are easy to generate intensity noise, which will affect the filtering results.

Figure 3. Performance comparison results
In order to more objectively evaluate the filtering accuracy of the proposed method, this paper selected one sample data (sample_21) provided by the ISPRS and compared its total error with the one of other eight classical filtering algorithms tested by the ISPRS (Sithole and Vosselman, 2004) as shown in Figure 4. Sample_21 was selected since it contains some special features, including roads with bridges, data gaps, vegetation, etc., which constitute filtering challenges.From Figure 4, it can be found that the total error of the proposed method is only slightly higher than that of the methods proposed by Axelsson and Pfeifer.Considering that both of these two methods involve complex parameters setting, the proposed method owns great advantages in automation.

CONCLUSION
Point cloud filtering is a necessary step in the point cloud processing, analysis and applications.To break through the limitation of complex parameter settings for the existing filtering algorithms, this paper proposed a threshold-free filtering algorithm based on expectation-maximization.In this paper, the filtering is seen as a separation of mixed Gaussian models.By applying EM algorithm to elevations of point clouds, ground points can be extracted automatically.Intensity information is also adopted for refining the filtering outcomes.Experimental results show that the proposed method can achieve a good performance without complex parameters setting or thresholds adjusting.Furthermore, the filtering performance is little affected by the format and resolution of the airborne LiDAR data.Therefore, the proposed method can provide a good foundation for the post-processing of airborne LiDAR point clouds.
of the k -th point

G
 is the Gaussian equation with parameters j µ and j δ , which are mean and standard deviation of elevations.

Figure 1 .
Figure 1.Intensity image of the first dataset