SPARSE POINT CLOUD FILTERING BASED ON COVARIANCE FEATURES

: This work presents an extended photogrammetric pipeline aimed to improve 3D reconstruction results. Standard photogrammetric pipelines can produce noisy 3D data, especially when images are acquired with various sensors featuring different properties. In this paper, we propose an automatic filtering procedure based on some geometric features computed on the sparse point cloud created within the bundle adjustment phase. Bad 3D tie points and outliers are detected and removed, relying on micro and macro-clusters analyses. Clusters are built according to the prevalent dimensionality class (1D, 2D, 3D) assigned to low-entropy points, and corresponding to the main linear, planar o scatter local behaviour of the point cloud. While the macro-clusters analysis removes small-sized clusters and high-entropy points, in the micro-clusters investigation covariance features are used to verify the inner coherence of each point to the assigned class. Results on heritage scenarios are presented and discussed.


INTRODUCTION
Data filtering plays a fundamental role in 3D reconstructions as it helps e.g. to reduce the noise produced by acquisition sensors and processing procedures.Noisy results are especially frequent when different sensors and platforms are employed, due to scale and illumination changes or quality of single sources (Figure 1).In the literature, many filtering algorithms and methods have been developed and included in the processing pipelines (Xian-Feng et al., 2017).Filtering is even more critical in heritage 3D reconstruction applications, where objects are generally characterized by frequent surface variations and finely detailed elements.Missing essential details due to noisy reconstructions can strongly penalize 3D documentation tasks, architectural studies, or the planning of restoration activities.In the photogrammetric pipeline, a filtering step can be applied in different phases of the 3D reconstruction procedure: it can be done on the images, on the sparse point cloud, on the dense point cloud or meshes.Most of the developed methods have been devised for filtering meshes.Nevertheless, removing outliers and bad computed 3D points in the raw data level (Bastonero et al., 2014) is more convenient with respect to computational efforts and filtering effects and results.

Aim of the paper
In our previous work (Farella et al., 2019), we implemented a filtering procedure on the sparse point cloud to remove outliers and bad computed points and improve the bundle adjustment results before performing the final dense reconstruction.This procedure focused on removing bad 3D tie points based on some quality features, mainly indicative of a wrong acquisition procedure and some photogrammetric reconstruction issues.Improvements in the accuracy of the final 3D results were achieved by filtering the sparse point cloud and re-estimating camera parameters before computing the dense reconstruction.The goal of this work is to investigate the effectiveness of covariance features (Chehata et al., 2009;Mallet et al., 2011) in identifying and removing outliers in photogrammetric sparse point clouds obtained within a bundle adjustment procedure.Such outliers are linked to erroneous tie points in the images and they can badly affect the bundle adjustment results.The covariance features are based on the covariance matrix computed on the distribution of the 3D points and are representative of the local geometrical behavior of the point cloud itself.The main advantages of using such features are: • their feasibility for heterogeneous and unstructured data; • no a-priori knowledge of the scene structure is required (provided for example, by machine learning approaches).The covariance or eigen-features are widely used in point cloud classification studies, while they are rarely investigated for point cloud filtering purposes.Using these features, the proposed pipeline (Figure 2) exploits some geometrical properties of the sparse point cloud, considering the predominant linear, planar or volumetric behavior of each point within its neighborhood, for applying more specific filtering techniques for each case.

RELATED WORKS
The availability of low-cost sensors and the spreading of automated photogrammetric solutions, also among non-expert  -Feng et al., 2017).The first group includes all the works based on the adaptation of statistical principles (Avron et al., 2010), probability methods, and weighted combinations of variables for analyzing the point distribution.In the neighborhood and projection-based approaches, the point cloud de-noising is achieved respectively examining the point position and its similarity measures in a defined neighborhood (Hu et al., 2006), or adjusting the position of each point with different projection strategies (Wang et al., 2013).The signal processingbased methods adopt different operators (such as the Laplacian operator) for the filtering procedure (Rosman et al., 2013), while the PDEs-based techniques (Partial Different Equations) rely on the analysis of some geometrical properties and features for identifying outliers (Xiao et al., 2006).All these procedures deal with filtering issues (such as noise, shrinkage, drifting removal and feature preservation, computation time, etc.) focusing on individual properties of the input data (e.g., points position, geometrical characteristics or other features suitable for statistical models).This could reduce the effectiveness of the filtering procedure, the preservation of the topology, and the capacity to identify outliers.For these reasons, further methods and hybrid techniques (Zaman et al., 2016) are nowadays considered more interesting approaches for noise removal while trying to preserve objects shape and properties.
When the filtering procedure is mainly based on the analysis of geometric properties, the covariance matrix can be used as a shape descriptor of the point cloud (Xiao et al., 2006;Pauly et al., 2002).The covariance features are widely used in segmentation and classification procedures because of their capability to provide an in-depth knowledge on the geometrical structure of the reconstructed scene (Weinmann et al., 2013;Weinmann et al., 2017a-b;West et al., 2004;Gross & Thoennessen, 2006;Hackel et al., 2016).
In the first step of the presented work, these geometrical properties of the sparse point cloud are explored and employed for choosing the points to remove.
In our method, we propose a hybrid and cluster-based approach that exploits the covariance features to describe the local point cloud geometry (linear (1D), planar (2D) or volumetric (3D)) and applies specific filtering procedures for each dimensionality case.

The covariance features
The covariance matrix can be considered as a 3D tensor containing geometrical information about the point distribution within a neighborhood.Using the PCA (Principal Component Analysis) statistical analysis, it is possible to extract from the covariance matrix three eigenvalues (!1, !2, !3) representing the local 3D structure and measuring the variation of the local point set along the direction of the corresponding eigenvector.Thus, the PCA defines the principal directions (three orthogonal vectors) and magnitudes (eigenvalues) of the variation of the points distribution around the center of the defined neighborhood (centroid).
The combination of these magnitudes in the three directions returns some shape descriptors, used to define the prevalent linear, planar o scatter behavior of the neighborhood.These local 3D shape features are called covariance or eigen-features and their geometric and mathematic formulation is showed in the Table 1. (1) Table 1: The mathematic formulation of some covariance features used in the proposed method to filter sparse point clouds.

Optimal search radius and class identification
The description of the 3D local structure of a set of points is usually performed considering their distribution in given neighborhoods.
The geometric features behave differently with respect to the search radius, which determines the size of the neighborhood on which these features will be computed.Moreover, autocorrelation and other characteristics could be observed only in some scales (Blomley et al. 2014).
The local neighborhood can generally be computed as a sphere or a cylinder with a fixed radius, or it can be described by the number of the k ∈ N nearest neighbors.
Whether based on a selected radius or a k parameter, an empiric knowledge of the scene is always required for defining the search values.Nevertheless, the k-based approach is more suitable with point clouds with different point density.More sophisticated procedures have been developed to optimize the k estimation, refining this value for each point, considering the curvature or the local point density, for example.Other methods, such as the dimensionality-based scale selection (Demantkè et al., 2011;Gressin et al., 2013) or the eigenentropy-based selection (Weinmann et al., 2014) showed to be more efficient compared to fixed-scale 3D neighborhoods (Weinmann et al., 2015).In this work, we tested and adopted the dimensionality-based scale approach.This approach finds, for each 3D point, the optimal search radius by computing the neighborhood at different and increasing radii (between a minimum and maximum values) and selecting the one minimizing a measure of unpredictability of the point set.In this approach, the spherical volume of the neighborhoods guarantees an isotropic and rotation invariant behavior, and the shape descriptors are not conditioned by the shape of the neighborhood.Concerning the search radii selection, radii ranges were identified relying on some scene characteristics (r min -r max).Sixteen sampled scales in this range were then used (not linearly increasing and closer to the r min).
In our work, sixty radii values (constantly increasing) are indeed tested and the best twenty are chosen for describing the prevalent geometry of the input data.The criterion for selecting them is based on how many times a specific radius yielded the lowest neighborhood entropy.

Cluster analysis
After the optimal search radius and labeling procedure, each point and its optimal neighborhood is here considered as a cluster of points having the same geometrical behavior (linear, planar, scatter) (Figure 3).Clustering methods are generally unsupervised learning approaches, where elements with similar behavior are grouped (Madhulatha, 2012).Clustering algorithms can be hierarchical if successive clusters are created from previous estimated ones, or partitional when they are all organized in non-overlapping subsets.In the first case, agglomerative (bottom-up) or divisive (top-down) clusters are mainly based on distance measure functions.In the partitional clustering case, some heuristic models are employed (such as k-mean or k-medoids algorithms).
Further methods are based on data density, on multi-resolution grids, or can be Model-Based (Statistical or Neural Network approaches).
In this work, we propose a label-guided approach.Clusters are built starting from lowest-entropy points (Eq.1), and their size derived by their computed best radius (Section 3.2).The points with the same label of the principal point (low-entropy point) in the optimal search radius are aggregated in the same cluster.Therefore, the dimensionality label assigned to each principal The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W15, 2019 27th CIPA International Symposium "Documenting the past for a better future", 1-5 September 2019, Ávila, Spain point is transferred to the corresponding cluster (linear, planar or scatter).The so-built clusters are then used to identify outliers in the sparse point cloud generated at the end of the image orientation procedure.
The outlier detection method is based on two principles: • (Macro) Cluster size analysis: small clusters are indicative of a high unpredictability of the point set, meaning that the inner points behave differently according to the abovedefined geometrical features.Indeed, the optimal search radius based on the minimization of the entropy ensures that close points behaving very differently will be clustered in small sets.• (Micro) Inner cluster analysis: since the label of each cluster is known, it is possible to perform a fine-grade analysis to check the coherence of each point with the assigned cluster label.For this purpose, a broader set of eigenfeatures (Section 3.1) can be exploited.For example, knowing that a set of points has been clustered and labeled as a planar surface, planar-related features, such as the planarity and the anisotropy, can be used to detect possible outliers in the cluster and remove them.

Test case
Our procedure was tested on different photogrammetric datasets, acquired from terrestrial and UAV platform cameras.Hereafter results from Modena case study are presented.The dataset consists of 138 images (82 terrestrial and 56 from an UAV platform) of a side of the Modena Cathedral in Italy.A Nikon D750 with a 28 mm lens (pixel size of 5.98 µm) was employed for the terrestrial acquisitions, whereas a Canon EOS 600D (focal length of 28mm, pixel size of 4.4 µm) for the UAV images.

Search radii selection and eigenfeatures extraction
The developed pipeline is software independent and it can work with either open and commercial software.The in-house tool reads as input the image orientation results (estimated intrinsic and extrinsic camera parameters, as well as the 2D and 3D tie points coordinates).On the coordinates of the 3D tie points, it then computes, as described in section 3.2, the Eigen values at different radii values keeping the radius yielding the lowest neighborhood entropy.This part exploits the Point Cloud Library PCL (Version 1.9.1) for the background computations.
The choice of suitable search radii for feature extraction is essential to achieve clear, correct, and coherent results.For their identification, we used a modified version of the entropy and dimensionality-based approach presented in (Demantkè et al., 2011, Gressin et al., 2013).
In our procedure, dimensional and geometrical characteristics of the reconstructed scene are used to define the range of radii for the optimal radius search (minimum and maximum values).
Using linearly increasing values in this range, instead of a square factor, at first, sixty radii were tested, and entropy values were calculated on a subset of the original point cloud.The selected subset includes the most relevant parts of the dataset, in terms of geometrical complexity and structure.Working on a subset representative of the entire scene, allowed us to test a more significant number of radii with a reduced computational effort.Therefore, the twenty most recurrent radii minimizing the entropy of the considered points were used for feature extraction in the entire dataset.In the presented case study, the sixty radii values were selected in the range [0.1-6 mt], considering a constant increment of 0.1 mt in each test (Figure 4).

Dimensionality classes and labeling
The entropy function presented in section 3.2 provides for each reconstructed 3D tie point a measure of the probability to belong to a part of the scene with specific geometric behavior.We would expect, for example, that points lying on the façade or the floor present a higher possibility to belong to the planar dimensionality class (2D), while more complex architectural structures should be highlighted in the 3D class.Therefore, a dimensionality class and a label (1D, 2D, 3D) are assigned to each 3D tie point (Eq.2).
In Modena case study, about 4% of the points were labelled as 1D, 84% as 2D, and 12% as 3D (Table 2).Cathedral structures were mainly identified as planar surfaces (façade, roof, floor), principal edges were recognized as linear units, while eaves and more complex sculptures as 3D objects.In the main entrance area, a more variable point cloud density and a greater geometrical complexity of the decorative elements increase the ambiguity of the labeling results (Figure 5).

Clustering procedure
As a result of the previous steps of our pipeline, to each 3D tie point is associated an optimal search radius and an entropy value, as well as a label with a dimensionality class, used for the clustering procedure.
The clusters are built starting from points having the lowest entropy and going up to more uncertain points.The computed best search radius of each considered points was used to define the size of the cluster.The cluster is expanded only with 3D tie points lying in the considered radius and labelled with the same dimensionality class.With this clustering procedure, sets of points having a prevalent geometric behavior (provided by the low-entropy value of the main point in that radius) were grouped.
The implemented method for aggregating points allowed us to apply specific and focused filtering approaches in each dimensionality case (Figure 6).

Filtering procedure
At this step of our pipeline, the sparse point cloud is clustered in groups of variable sizes having each one three possible dimensionality classes.As described in section 3.3, outlier detection and removal are performed through a macro and a micro analysis of the clusters.
The "macro" clusters analysis is based on the removal of small clusters, characterized by high-entropy points and, consequently, a reduced cluster size.High-entropy values indicate an ambiguous geometrical behavior of the neighbor's points and, most likely, the outlier nature of these points.Clusters containing less than ten points are automatically removed from the scene.
The "micro" clusters analysis is based on the evaluation of points distribution and its coherence with respect to the assigned class.At this part of the analysis, the eigenfeatures of all the points of the cluster are re-computed at new radii values, more indicative of the inner behavior of single clusters and classes.In this case, radii values are selected, keeping into account the average spatial resolution of the point cloud within the cluster.For feature extraction, we considered a scale factor for the search radius, equal to ten times the average cluster spatial resolution.This scale factor was chosen to guarantee an adequate minimum number of points for the analysis, but also an in-depth evaluation of the cluster behavior.

Results and evaluation
Remaining points, following the eigen-features filtering procedure, were then used for re-running the bundle adjustments and re-computing the orientation parameters.A new dense reconstruction was then achieved using the filtered tie points set.
As validation of our work, external checks were employed.Laser scanner data, acquired with a Leica HDS7000, were used as ground truth for comparing the two dense reconstructions.
Qualitative and quantitative checks and evaluation were performed, considering: • The RMSE on plane fittings (Table 3) (Figure 7); • The average mean and standard deviation with a cloud-tocloud distance on some selected areas (   Figure 7: Some areas selected for plane fitting analyses.

CONCLUSION AND FUTURE WORKS
This work presented an extended photogrammetric workflow, which includes a filtering step on the sparse point cloud before running the dense reconstruction, based on some geometric properties of the points.The covariance features and a dimensionality based-scaled approach (Section 3.2) are used to define the geometric behavior of the points.Sets of points with the same behavior are labeled and clustered, and specific filters are applied to each cluster considering their prevalent geometric properties.Results and improvements of this procedure have been verified through a quantitative and qualitative (Figure 8) analysis using external checks.
In future tests, new radii selection methods will be explored, considering that the low-density of the sparse point cloud could negatively condition the estimation of the prevalent geometric behavior, as sometimes observed in our work.The presented procedure will also be extended, including new methods for automatically defining eigen-filtering thresholds and for estimating the improvements of other inner quality parameters after the filtering step (such as the variation of the re-projection error or the multiplicity values).Finally, the developed methodology will be combined with the procedure presented in Farella et al. (2019), considering in the filtering step photogrammetric acquisition issues and reconstructed geometric properties of the points.The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W15, 2019 27th CIPA International Symposium "Documenting the past for a better future", 1-5 September 2019, Ávila, Spain

Figure 2 :
Figure 2: The flowchart of the implemented pipeline.An automatic filtering procedure developed in Python (lower blocks) enriches the traditional photogrammetric pipeline (upper blocks) for improving 3D dense reconstruction results.

Figure 3 :
Figure 3: A schematic representation of the adaptive clusters identification, based on the minimization of the entropy defined over the dimensionality classes (Linear -1D, Planar -2D, Scatter 3D).

ComputedFigure 4 :Figure 5 :
Figure 4: Some results of developed procedure for radii selection in the Modena case study.The twenty radii chosen for the feature extraction were identified in the range 0.1 -2.7 meters.They correspond to the most frequent values which minimize the entropy of the points neighborhood.
For each dimensionality class, one or two eigen-features were chosen to describe the points distribution: • The linearity L for 1D clusters; • The anisotropy A and planarity P for 2D clusters; • The omnivariance O and eigenentropy E for 3D clusters.After selecting some representative clusters of each class, filtering thresholds were empirically tested.The evaluation was performed considering qualitative improvements and noise reduction, filtering out low-value points.The same thresholds were then automatically applied to the entire sparse point cloud and results evaluated through qualitative and quantitative analyses.The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W15, 2019 27th CIPA International Symposium "Documenting the past for a better future", 1-5 September 2019, Ávila, Spain a) b) Figure 6: An explicative example of clusters built for the planar (2D) dimensionality class in the range 0.1 -2.7 meters (a), and the joint visualization of clusters computed for each class (b).
Fig. 8), computed on the original and the post-filtering dense point clouds.The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W15, 2019 27th CIPA International Symposium "Documenting the past for a better future", 1-5 September 2019, Ávila, Spain Cloud to cloud distance on sub-areas (cm) Average cloud-to-cloud distance variation (cm) evaluated on 5 sub-areas.

Figure 8 :
Figure 8: Single-cluster visual evaluation of the improvements in the dense point clouds obtained after the filtering procedure.Dense point clouds computed for each cluster show a general improvement of results, with much denser point clouds, less noise and a higher definition of the architectural details.

Table 3 :
Plane fitting RMSE (cm) and variations on eight selected planar areas (