STATISTICAL OUTLIER DETECTION METHOD FOR AIRBORNE LIDAR DATA

Sampling the Earth’s surface using airborne LASER scanning (ALS) systems suffers from several factors inherent to the LASER system itself as well as external factors, such as the presence of particles in the atmosphere, and/or multi-path returns due to reflections. The resulting point cloud may therefore contain some outliers and removing them is an important (and difficult) step for all subsequent processes that use this kind of data as input. In the literature, there are several approaches for outlier removal, some of which require external information, such as spatial frequency characteristics or presume parametric mathematical models for surface fitting. A limitation on the height histogram filtering approach was identified from the literature review: outliers that lie within the ground elevation difference might not be detected. To overcome such a limitation, this paper proposes an adaptive alternative based on point cloud cell subdivision. Instead of computing a single histogram for the whole dataset, the method applies the filtering to smaller patches, in which the ground elevation difference can be ignored. A study area was filtered, and the results were compared quantitatively with two other methods implemented in both free and commercial software packages. The reference data was generated manually in order to provide useful quality measures. The experiment shows that none of the tested filters was able to reach a level of complete automation, therefore manual corrections by the operator are still necessary. * Corresponding author


INTRODUCTION
Despite the robustness of state of the art of airborne LASER scanning (ALS) systems, and even with diligent data acquisition, the resulting point clouds may contain undesired measures due to external factors present in the scene.According to Ben-Gal (2005), the presence of outlying observations in the dataset may lead to model misspecification, biased parameter estimation and, consequently, incorrect results.As suggested by Lin and Zhang (2014), automatic outlier removal is quite a challenging task due to the various types of error possible.
Within the scope of ALS, the outliers (Figure 1) can be divided into positive and negative (Matkan et al., 2014): the positive ones consist of LASER returns from objects near the scanning system (Figure 1b), such as birds or small unmanned aircraft, for instance.In addition to those causes, Leslar et al. (2010) mention that suspended particles in the atmosphere (snow and dust, for example) are also possible sources of positive outliers.Theoretically the atmospheric effects could interfere in some situations, in practice however, they are less likely to generate outliers since the data acquisition is usually planned to occur under reasonable weather conditions.
Aside from the robust state-of-the-art equipment used on conventional airplanes, there has been a constant development of lightweight LiDAR systems that were designed specifically for unmanned aerial vehicles (UAV) as well as some which were adapted from automated driving systems.Assuming that these LiDAR systems coupled to UAVs are equal or less robust than the full-sized variants, and also that the platform usually flies at lower heights, it can be expected that the resulting point clouds will contain outlaying points.
Negative outliers (Figure 1c) are usually attributed to multipath trajectory of the LASER pulse, analogous to the effect that occurs in the Global Navigation Satellite System (GNSS), that is, the pulse travels a longer trajectory than the real, resulting in points whose positions are below the Earth's surface.According to Sithole and Vosselman (2004), negative outliers may lead to problems during terrain filtering operations, since the commonly applied methods assume lowest points as terrain.Some authors such as Sithole and Vosselman (2004), Mongus and Žalik (2012), and Lin and Zhang (2014), for instance, adopt high and low outliers instead of positive and negative.Other possible nomenclature is proposed by both the American Society for Photogrammetry and Remote Sensing (ASPRS) and the United States Geological Survey (USGS), which considers high and low noise as synonyms for positive and negative outliers (Heidemann, 2018), respectively.Lin and Zhang ( 2014) provide yet two other classes of outlier based on spatial distribution: isolated outliers and clustered outliers.
According to Ben-Gal (2005), the methods for outlier detection can be distinguished as parametric or non-parametric, that is, model independent.In the context of airborne LiDAR data or ALS data, the mathematical model for the Earth's surface and its features can be quite complex, thus the non-parametric techniques are preferred.An alternative is the use of local mathematical models, such as proposed by Leslar et al. (2010).
Among the non-parametric methods for outlier detection, there are the distance-based ones, which are recommended for large datasets and are implemented in most of the point cloud processing software.Usually, they search for observations that are significantly different from the average value, which is computed locally in a given neighbourhood, introducing another problem, that is, the definition of this neighbourhood.
As the outlier filtering problem is a binary classification problem, the quantitative analysis can be performed with the computation of error matrices, which enable the evaluation of Type I (failure to remove outliers) and Type II (removal of valid points) errors.Besides relative comparisons with other methods, the error matrix computation requires reference data (otherwise known as ground truth), however, such data is rather difficult to produce, since it will be manually made in most cases.Several approaches can be used for outlier removal in ALS data, the simplest being based on external information such as the average terrain altitude of the region of work, or the maximum difference between the altitudes of the first and last return (Matkan et al., 2014).Another strategy was proposed by Leslar et al. (2010), which consists in bivariate quadratic polynomial fitting to the neighborhood of all points in the dataset.In this approach, points whose altitudes are very different from those expected according to the mathematical model are removed.
Fitting such a mathematical model to the surface is a valid approach, however, dense urban regions or vegetated areas might be incorrectly modelled depending on the object's complexity.
Some approaches, such as Li et al. (2011) and Shen et al. (2011) attempt to detect outlaying points from the heights histogram.Both papers also employed the KD-tree (Kdimensional tree) towards computation time savings, as the query for neighbouring points is less time-demanding due to the topological structure of the tree.This abstract data type was also used in other methods, such as the one proposed by Lin and Zhang (2014).
Similarly, Silván-Cárdenas and Wang (2006) presented an approach which was later modified and extended by Lin and Zhang (2014).The authors proposed a semi-automatic method where the operator has to visually inspect the elevation histogram to locate the lowest and highest tails of the distribution.A second step is then employed to remove remaining outliers using a criterion based on height difference thresholding.Finally, the operator has to manually correct possible misclassifications of the filter.
The height histogram analysis is a useful mechanism to identify outliers in airborne derived point clouds, however, they ignore the terrain relief and also the height variation of the objects above it.If the studied area is large and with rugged relief, then the method is likely to fail detecting outliers close to the surface.
Alternatively, based on the aspects discussed and in order to develop a more balanced approach, this paper proposes an adaptive outlier filtering algorithm for ALS data.In Section 2 a brief description of outlier filtering approaches is given.The proposed method is introduced in Section 3. In Section 4 some experiments and results are discussed, followed by some concluding remarks in the last section.

Parametric surface fitting
Several methods of outlier detection were designed as a local descriptor over the point's surroundings.The main idea is to compute the distance of the point to a parametric surface estimated from its neighbors.The usual mathematical models are bivariate quadratic polynomials (Leslar et al., 2010) and planes (Rashidi and Brilakis, 2016).The assumptions of this approach over the sampled data may be violated depending on the neighborhood size and also on the complexity of the parametric surface selected.Vegetated areas are less likely to be correctly modelled with planar surfaces, also most building roofs with smooth continuous surfaces as bivariate polynomials, for instance.

Morphological outlier filter
According to Kilian (1996) mathematical morphology theory has suitable operators to compose terrain filtering algorithms.
The general idea has been extended into adaptive approaches such as the progressive morphological filter introduced by Zhang et al. (2003).Mongus and Žalik (2012), used morphological operators as a pre-processing step in order to remove outliers and avoid problems during the later stages of the proposed method, which is a terrain filtering algorithm based on thin plate splines.
Morphological filtering methods are usually based on extensions of digital image operators, which were adapted to process the point's height instead of the grayscale intensity.The main operation is the opening, which is a composite of other two fundamental morphological operations: erosion and dilation.The erosion operation levels down the higher points to the lowest one within the structuring element which is convolved in the grid, while the dilation levels up the lower points to the highest one.The opening operation is defined as an erosion directly followed by a dilation using the same structuring element.The erosion will remove unwanted features, such as outliers or buildings, vegetation and other objects in the case of terrain filtering.The erosion operation will also introduce some deformation to the terrain relief, which is going to be partially reconstructed during the dilation.When using morphological operators to filter the terrain, the structuring element size selection might be based on the nonground objects that must be removed.The morphological outlier filter uses the same principle, however, as the outliers are usually isolated points, the structuring element can be rather small.
The main drawback to using morphological operators consists in the requirement of a raster data format.As a consequence, transforming the point cloud to a grid demands interpolation, which may introduce error into the data.Alternatively, Mongus and Žalik (2012) adapted the operators to work on a grid created from cell subdivision.Each cell (voxel) contains an array of points with (X,Y,Z) coordinates.The operator will then select the most appropriate entry to work with, that is, the higher or lower point of the array, for instance.

Spatial frequency (SF) outlier filter
The LAStools software version 171215 implements a straightforward outlier removal technique (lasnoise) based on the spatial frequency (SF) of the points.As described in the documentation, for each point in the dataset, a 3D structurevoxel -formed by 3x3x3 elements is generated having the current point as centre.If the number of points within this voxel (27 cells) is inferior to a certain threshold, the current point is considered an outlier (isolated point).This approach is efficient but some outlier groupings are difficult to identify, which may result in filtering problems (Type II errors).

Statistical outlier removal (SOR) filter
The software CloudCompare version 2.9.1 uses the Statistical Outlier Removal (SOR) from the Point Cloud Library (PCL), which assumes that the distance between a given point and its neighbours is normally distributed.The statistical model implemented is described in Rusu et al. (2007), who adapted the filter from Zhang (1994) for another context.
For each point pi (i=1…n), in the dataset, the average distance ri (K) considering K-nearest neighbors (KNN) is computed.This value is assessed using the sigma rule on the entire dataset, that is, if the result is not within N standard-deviations from the mean, then the point is treated as an outlier.The software mentioned adopts K=6 and N=1 as default parameters for the SOR filter.
Assuming that the average distance to the KNN is normally distributed, the standard-deviation multiplier (N) can be chosen based upon the cumulative distribution function (CDF) from the normal distribution.If the data seems to have few outliers, a higher N value can be selected, which will result in fewer points being removed.When adopting N=3, the expected inlier percentage should be approximately 99.73%.In this way, a low value for N might result in more Type II errors in the filtering.For this filter, Shen et al. (2011) suggest that K must be larger than the number of points within a cluster in order to remove it.
This approach was intended for terrestrial (mostly indoor) point clouds (as the original paper suggests), and it is sensitive to density variations.For ALS data the method may lead to problems in some regions, such as in areas of vegetation, where the LASER pulse penetrates the canopy (resulting in multiple returns, and therefore an increase in the point density), power transmission lines, and overlapping strips.
Similar ideas have been shown in Rashidi and Brilakis (2016), where the authors proposed metrics based on the KNN.The first criterion is analogous to the SOR filter, whereas the second assume that the point cloud was sampled over planar segments and attempt to adjust a plane to the KNN using the least squares (LS) or random sample consensus (RANSAC) algorithm.The point is more likely to be an outlier as the distance to the adjusted plane increases.Apart from being a suitable alternative for indoor point clouds, the assumptions for this criterion will be violated when dealing with ALS data.

CELL HISTOGRAM FILTER
The proposed outlier removal method, designated here as cell histogram (CH) filter, consists of a straightforward variation of the Lin and Zhang (2014) algorithm.Initially, the point cloud is divided into a two-dimensional grid of rectangular cells, using a similar data structure described in Mongus and Žalik (2012).
Apart from the computation time savings (as with the KD-tree mentioned earlier), structuring the point cloud in a grid of rectangular cells is particularly helpful in order to accommodate the terrain variations for the height histograms.The height histogram is built for each cell, considering a bin width in order of magnitude of vertical accuracy.The algorithm sweeps the histogram searching for the first and last bin whose frequency is above a specified threshold in terms of number of points, that is, the highest and lowest tails of the height distribution.The acceptance interval is computed as corresponding to minimum and maximum height for the first and last bin, respectively.

EXPERIMENTS AND RESULTS
The selected study area shown in Figure 2   This point cloud has a considerable ground elevation difference of about 30 m.In addition, the region contains several objects that are easily mistaken as positive outliers, such as light poles, power transmission lines, and low density sampling on building walls.A total of 848 outliers were detected manually in the point cloud, most of them being negative outliers (650) and situated below the bigger edifications.Positive outliers are mostly isolated points (198), while negative outliers where found in clusters.

Establishing benchmarks
In order to provide results for comparison, the point cloud mentioned was processed with two outlier filters available in both LAStools and CloudCompare softwares.A comparison with morphological filters, not considered in this study, would also be interesting.
Firstly, the SOR filter was applied several times using CloudCompare, each application considering a standarddeviation multiplier (N) of 4, which assumes 99.9936% of the population as inliers.In theory, if the ri (K) distances were normally distributed, approximately 105 outliers would be expected on the point cloud (1.65 million points).However, the assumption of normality was not assessed in this study, and the parameter (K) also needs to be considered in this algorithm.
The point cloud was processed ten times with the SOR filter, the first time with K=4, and then increasing by 2 in each consecutive run.The results for true positive (TP) and false positive (FP) are presented in Figure 3.The false negative (FN) was omitted from the graph, since the values were too high (about 13000 for K=4 and increased towards 14500 for K=22) and would interfere in the scale for the vertical axis.The graph shows two trends and also two asymptotes, one for each quantity.The TP curve starts with about 340 correct filtered outliers and stabilizes close to 680 (approximately 80% of the outliers).As expected, the FP decreases at almost the inverse rate, and shows no further improvement when it reaches 165 undetected outliers (about 20% of the outliers).As mentioned before, the main problem with this filter is that, it removed almost 0,9% of valid points (14500 FN) in order to achieve those results.
The point cloud was also processed several times with the SF filter implemented on LAStools.All runs considered the same voxel size (a 8 m 3 cuboid), only changing the frequency threshold.Some patterns shown in Figure 4 are similar to the SOR filter (Figure 3), the main difference being that the SF filter has fewer FN occurrences than the method mentioned before.
As can be seen in Figure 4, the FN increases at a higher rate than the TP.This is not optimal since the method should avoid FN occurrences, however, those values are still considerably lower when compared to the SOR filter.It can be assumed that further increasing the frequency threshold may remove many more valid points while making less difference on the outlier detection rate.The results from the experiments conducted highlight the difficulty of the outlier detection problem.The trends observed were expected, that is, a higher outlier detection rate causes a higher valid point removal as well.Although the experiments were performed with only one point cloud, this behavior can also be expected for other filters and other datasets.

Proposed method evaluation
The CH filter was applied several times on the study area point cloud in the same way as the other two filters (SOR and SF).
All processes considered the same square cell size of 50 m, and histogram bin width of 15 cm.The bin width value was selected based on the vertical accuracy of the point cloud (12 cm, as mentioned before).The results shown in Figure 5 are comparable to the other filters, principally at the rate at which the FN increases for both SF and CH filters.Apart from having similar behavior, the TP and FP rates for those filters appear to be different, that is, the CH filter shows a more modest increase on accuracy over the parameter changes.
The F-score (also known as F1-score or F-measure), is a metric based on the effectiveness measure proposed by Van Rijsbergen (1979).This metric was computed for both SF and CH filter results (Figure 6) in order to provide a proper numeric comparison.This measure is suitable for this comparison since it provides a unique value and it takes the TP, FP and FN into acount.The SOR filter was ignored in this comparison due to its high FN values, which were discrepant from the other two filters.
Figure 6.F-score comparison between the SF and CH filters As can be seen from the diversified results in Figure 6, it is not clear which filter stands out.Numerically, both filters achieved incomplete detection, with F-scores below 50%.No tested filters was able to provide a proper outlier detection on the study area, so manual corrections would be necessary in order to refine the filtering.
Table 1 shows the peak of correctly detected outliers using the same configuration as before, except for the frequency threshold which was increased.Those values (around 550 and 520, for SF and CH filters, respectively) represent at least 130 fewer detected outliers (15%) when compared to the SOR filter.Based only on the TP and FP rates, the SF filter seems to provide a better result than the CH filter, however, it removed almost a thousand more valid points.In practical terms, the algorithm selection can be based on the misclassification costs, that is, choosing between "failing to remove outliers" or "removing valid points", whichever has fewer errors prejudicial to your work.

Method
In order to provide a more detailed analysis, the TP shown in Table 1 were classified as positive and negative outliers, and the results are presented in Table 2.The SOR values are based on the processing with K=18.It can be noted from the values that the SOR filter has detected almost the double the positive outliers than the other two filters.Another noticeable fact is related to both SF and CH filters, which detected almost the same number of negative outliers.Assuming that negative outliers are more problematic than positive ones, since they might cause problems with terrain filtering algorithms as discussed before, the SF and CH filters were able to detect at least 73% of them.These methods seem to have problems when detecting positive outliers, as for the CH filter, for instance, which was capable of identifying only 23% of the total (198).
In summary, neglecting other factors that might influence the filtering (such as the voxel size, which were not explored in this study), the SOR filter has higher rates for both TP and FN.The CH filter achieved less for TP, however FN numbers were much lower.Finally, the SF filter is a more balanced approach between the three evaluated filters.

Visual analysis
A visual interpretation of the filtering results as well as the numerical assessment might provide useful insights on their characteristics.Figure 7 shows some differences on the FN occurrences among the filters using the same results presented on Table 2.The algorithms with higher FN rates have removed a few (SF) or several (SOR) points sampled on the walls of the evaluated building.The CH filter did not suffer from this problem, since only the highest and lowest tails of the histogram are removed, while the bins in between remain unchanged.

CONCLUSIONS
This paper provides a brief description of current approaches for outlier detection on point clouds.A straightforward variation of the height histogram filter was proposed, the cell histogram filter which aims at an adaptive solution to terrain relief.Two outlier filters were compared to the proposed method, and the experiments highlighted some characteristics of the methods.While it was not possible to point out which was the best solution, this simple analysis provided information to help choose the correct algorithm based on the TP, FP and FN rates.Independently of the chosen filtering method, manual corrections are still necessary in order to achieve a reasonable result.
It is clear that the outlier detection problem is difficult and far from a proper solution.Further study towards the improvement of current algorithms is required.As for future developments, the quantitative evaluation of other approaches such as the parametric surface fitting and morphological filters is suggested in order to provide a broader analysis.Also, the impact of voxel size should be also studied as well as the frequency threshold.
Finally, a study must be performed to assess the impact of valid point removal (FN) on subsequent post processing algorithms on the point cloud, and how to reduce those FN rates.

Figure 1 .
Figure 1.(a) ALS outlier taxonomy.Examples of datasets with (b) positive and (c) negative outliers consists of a small portion of the urban region of Presidente Prudente/Brazil.The LiDAR dataset was acquired with a RIEGL LMS Q680i in 2014 by Sensormap Geotecnologia, a subsidiary company of the Engemap Group(Tommaselli et al., 2018).The point cloud has 1.65 million points and covers an area of 428 m by 351 m, resulting in a average point density of 10.9 pts/m 2 .According toTommaselli et al. (2018), the average flying height was 550 m, and the point cloud has a 12 cm vertical root mean square error (RMSE), which was assessed using ground control points (GCPs) tracked with double frequency GNSS receivers.

Figure 2 .
Figure 2. Study area limits in UTM zone 22 S grid coordinates

Figure 3 .
Figure 3.Effect of neighbouring points (K) on SOR filter, considering a sigma multipliyer (N) equals 4

Figure 4 .
Figure 4. Effect of frequency threshold (isolated) on SF filter, using a voxel size (step) of 2 m

Figure 5 .
Figure 5.Effect of frequency threshold on CH filter, using square cells of 50 m and bin width of 15 cm

Table 1 .
Higher TP ocurrences for SF and CH filters

Table 2 .
Positive and negative outlier distribution