AN ADAPTED CONNECTED COMPONENT LABELING FOR CLUSTERING NON-PLANAR OBJECTS FROM AIRBORNE LIDAR POINT CLOUD

Light Detection And Ranging (LiDAR) is an active remote sensing technology used for several applications. Segmentation of Airborne Laser Scanning (ALS) point clouds is a very important task that still interests many scientists. In this paper, Connected Component Analysis (CCA), or Connected Component Labeling, is proposed for clustering non-planar objects from ALS LiDAR point clouds. Starting from the raw point cloud, a sub-surface segmentation method is applied as a preliminary filter to remove planar surfaces. CCA is then applied to the remaining unassigned 3D points, with the neighboring distance as the only initial parameter. To evaluate the clustering, the resulting components are labeled interactively and then classified using Support Vector Machine, Random Forest and Decision Tree classifiers. The ALS data used have a low density (4-6 points/m²) and cover an urban area in residential parts of the city of Vaihingen in southern Germany. Visualization of the results shows the potential of the proposed method to identify the dormer, chimney and ground classes.


INTRODUCTION
Light Detection And Ranging (LiDAR) is an active remote sensing technology used for several applications. Segmentation of Airborne Laser Scanning (ALS) point clouds is a very important task that still interests many scientists. Most point cloud segmentation methods are designed to extract planar surfaces, such as roof faces, from airborne laser scanning data. For this purpose, different segmentation methods have been developed based, for example, on the Hough transform (Maltezos, Ioannidis, 2016), RANSAC (Chen et al., 2012), or surface growing (Alharthy, Bethel, 2004). Kada and Wichmann (Kada, Wichmann, 2012) proposed a point cloud segmentation algorithm called sub-surface growing. As indicated by its name, the algorithm is an extension of the well-known surface growing approach, in which the growing process continues below the surfaces. The purpose was to gather better model data, as roof features and shapes become more apparent from these sub-surface segments. Although the sub-surface growing algorithm gives good results, even for complex building roof shapes, a problem remains with the roof faces generated by dormers and chimneys. Several dormers and chimneys are not identified because their shapes are composed of several planar faces. Planar segmentation methods often serve their purpose, but they are not suited to segmenting non-planar objects such as power lines, cars, trees and any object with a free-form shape. Fewer works address the segmentation of non-planar objects from point clouds. The aim here is to combine planar and non-planar segmentation methods in order to identify more entities. Clustering is the process of grouping points with similar feature vectors into a single cluster, separate from points with dissimilar feature vectors.
In the literature, point cloud clustering methods are usually used for clustering multi-planar segments and 3D building facades (Zolanvari et al., 2018). For this purpose, large roof segments are extracted with a sub-surface growing method and the remaining points (e.g. dormers and chimneys) with a clustering method. Connected Component Analysis, or Labeling (CCA), is a well-known clustering method that has been widely used in image processing. In this paper, CCA is exploited for clustering non-planar objects from Airborne Laser Scanning (ALS) LiDAR point clouds. The proposed method does not consider only the k nearest neighbors of a point, but all points that lie within a distance fixed beforehand. Starting from a random point belonging to the initial set of points, this point is labeled with the current label symbol and is then removed from the set. The method differs from what has been proposed in the literature: it does not stop at the immediate neighbors of the points of interest, but continues until no further neighbors are found. In order to evaluate the clustering algorithm, the components are classified. Support Vector Machine, Random Forest and Decision Tree classifiers are tested and compared. Visualization of the results shows the great potential of connected component analysis for clustering non-planar surfaces such as dormers and chimneys. This paper consists of three sections. Following a description of the methodology, connected component labeling is introduced. Then, the classification step is detailed. Experimental results are visualized and followed by a discussion. The conclusion summarizes the findings of the present work.

METHODOLOGY
This paper focuses on clustering non-planar surfaces from raw airborne LiDAR data. First, the sub-surface growing segmentation method is applied to the LiDAR point cloud as a preliminary filter to extract planar roof surfaces. Then, an adapted connected components labeling algorithm is applied to the points left unassigned. Finally, in order to evaluate the clustering algorithm, the resulting components are labeled interactively, followed by classification of the components.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2020, 2020 XXIV ISPRS Congress (2020 edition)

Connected Components Labelling
Connected components analysis (CCA), or connected components labeling, is a clustering algorithm primarily used in image processing. In the context of LiDAR data, Miliaresis and Kokkas employed CCA to recognize background and foreground objects in a digital elevation model (Miliaresis, Kokkas, 2007). Zhang et al. exploited CCA on a 3D point cloud for classification in urban areas (Zhang et al., 2013). In (Xiao et al., 2016), CCA is used for clustering tree points. However, closely located and intersecting trees are clustered together as multi-tree components, which requires further treatment with mean shift segmentation. In (Vosselman, 2013), Vosselman exploited k-nearest neighbors for clustering connected components of point clouds using a fixed neighborhood size. The proposed CCA does not consider only the k nearest neighbors of a point, but also the neighbors of those neighbors. It differs from earlier work in which only the immediate neighbors were considered when forming components: the proposed algorithm extends each component until no further points are found. The process of CCA is described in Algorithm 1.
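Algorithm 1 is not reproduced in this text. As a rough illustration only, the procedure described above can be sketched in Python as region growing over a fixed-radius neighborhood. The function name `connected_components`, the parameter `max_dist` and the brute-force neighbor search are assumptions made for this sketch, not the authors' implementation; a real point cloud would call for a spatial index such as a k-d tree.

```python
from collections import deque
from math import dist


def connected_components(points, max_dist):
    """Label 3D points so that points linked by a chain of neighbors
    closer than max_dist share the same label (hypothetical helper)."""
    unassigned = set(range(len(points)))
    labels = [-1] * len(points)
    current = 0
    while unassigned:
        seed = unassigned.pop()           # arbitrary starting point
        labels[seed] = current
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            # Brute-force radius search (assumption: acceptable for a toy example).
            near = [j for j in unassigned
                    if dist(points[i], points[j]) < max_dist]
            for j in near:
                unassigned.remove(j)      # labeled points leave the set
                labels[j] = current
                queue.append(j)           # their neighbors are visited too
        current += 1                      # no more neighbors: next component
    return labels


pts = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0),   # chain of three points
       (5.0, 5.0, 5.0), (5.3, 5.0, 5.0)]                     # separate pair
labels = connected_components(pts, max_dist=0.6)
```

In this toy example the first and third points are 1.0 apart, farther than `max_dist`, yet they end up in the same component because the middle point links them. This is the "neighbors of neighbors" behavior described above.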

Classification
There are two approaches to point cloud classification: point-based and segment-based classification (Elberink, Vosselman, 2012). In the point-based approach, a class label is assigned point by point. Segment-based classification requires a prior segmentation step. The class label is then assigned per segment, and all points of a segment share the same class label (Vosselman et al., 2017). In this paper, the points left unassigned by sub-surface growing (SSG) are clustered using CCA. The resulting components facilitate the interactive labeling. Support Vector Machine, Random Forest and Decision Tree classifiers (Li et al., 2016, Crasto et al., 2015), widely used for LiDAR data classification, are tested and compared.

Study area
The proposed approach is applied to an ALS point cloud provided by the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF) (Cramer, 2010). It was acquired by Leica Geosystems using a Leica ALS50 system and is characterized by a low density (4-6 points/m²). The area covers an urban part of the city of Vaihingen in southern Germany. It is marked by the presence of several roof superstructures, as shown in Figure 2.

Interactive components labeling
For evaluation purposes, a training set is selected, as shown in Figure 3. The components are labeled interactively before the classification. The targeted classes are dormers, chimneys and ground. The components are draped over an orthophoto and labeled manually. Then, the label of each point is deduced from the component it belongs to.
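Deducing point labels from component labels amounts to a simple lookup. The sketch below is illustrative only: the helper name `propagate_labels`, the data layout (one component id per point, one class name per labeled component) and the sample values are assumptions, not taken from the paper.

```python
def propagate_labels(point_component, component_class):
    """Give every point the class of its component (hypothetical helper).
    Points whose component was not labeled receive None."""
    return [component_class.get(c) for c in point_component]


# Component id of each point, e.g. as produced by a clustering step.
point_component = [0, 0, 1, 2, 2, 2, 3]
# Classes assigned manually to components; component 3 was left unlabeled.
component_class = {0: "chimney", 1: "ground", 2: "dormer"}

point_class = propagate_labels(point_component, component_class)
```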

Data preparation
The learning data set is randomly split into training (70%) and testing (30%) subsets. Classifier parameters are determined using grid search and cross-validation. Table 1 shows the details of each class. From the unassigned points (Figure 4 (a)), 15,817 components are generated (Figure 4 (b)). In Figure 4 (c), only components with more than five points are considered.
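As a minimal sketch of the size filter and the 70/30 random split, the following Python snippet works on component ids. The paper does not say whether the size filter is applied before training, nor whether the split is made on components or on points, so both choices are assumptions here. So are the function name `filter_and_split`, its parameters and the fixed random seed. Grid search and cross-validation are left out.

```python
import random
from collections import Counter


def filter_and_split(labels, min_size=5, train_fraction=0.7, seed=0):
    """Keep components with more than min_size points, then split their
    ids randomly into training and testing subsets (hypothetical helper)."""
    sizes = Counter(labels)                        # points per component
    kept = sorted(c for c, n in sizes.items() if n > min_size)
    rng = random.Random(seed)                      # seeded for repeatability
    rng.shuffle(kept)
    n_train = round(train_fraction * len(kept))
    return kept[:n_train], kept[n_train:]


# Ten components of six points each, plus two small components (ids 10 and 11).
labels = [c for c in range(10) for _ in range(6)] + [10, 11, 11]
train_ids, test_ids = filter_and_split(labels)
```

With ten components surviving the filter, seven ids go to the training subset and three to the testing subset.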

Evaluation metrics
To evaluate the CCA algorithm, the components are submitted to a classification process. The performance measures used for the classification are overall accuracy (OA), execution time and area under the curve (AUC) score.
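For reference, the two accuracy measures can be written in a few lines of Python. This is a sketch under stated assumptions: the AUC is computed one class against the rest, as the probability that a positive sample scores higher than a negative one. The paper does not state how its multiclass AUC was computed. The function names, labels and scores below are invented for illustration.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def one_vs_rest_auc(y_true, scores, positive):
    """AUC of one class against all others: probability that a positive
    sample gets a higher score than a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = ["dormer", "dormer", "ground", "chimney", "ground"]
y_pred = ["dormer", "ground", "ground", "chimney", "ground"]
dormer_scores = [0.9, 0.4, 0.5, 0.1, 0.2]   # made-up scores for the class "dormer"

oa = overall_accuracy(y_true, y_pred)                     # 4 of 5 correct
auc = one_vs_rest_auc(y_true, dormer_scores, "dormer")    # 5 of 6 pairs ranked correctly
```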

Components classification
Table 2 shows the component classification results. Good scores are achieved by the tested classifiers. The best result is obtained with the SVM classifier, with an OA of 86.22% and an AUC of 0.96. Figure 5 shows the multiclass receiver operating characteristic curve for the SVM classifier. The AUC scores exceed 0.90 for the dormer and ground classes and are close to 0.99 for the chimney class. Figure 6 shows examples of resulting components projected over the orthophoto. The components represent the classes dormers, ground and chimneys. The neighborhood distance directly affects the resulting components. Testing another distance is time-consuming because the components must be labeled interactively again, which can be seen as a drawback of the connected component labeling method. A second disadvantage of the CCA method is the dependence of the resulting components on the chosen neighborhood distance: Figure 7 shows two dormers merged into a single component because of their proximity. Indeed, there is no universal neighborhood distance usable for all cases.

CONCLUSIONS
In this paper, connected component labeling is proposed for clustering both planar and non-planar objects. Only a neighborhood distance is defined initially. To evaluate the clustering, interactive labeling is carried out, followed by a component classification step. Support Vector Machine, Random Forest and Decision Tree classifiers were exploited. The results demonstrate the relevance of the proposed method for identifying dormers, chimneys and ground. Because interactive labeling is long and tedious, only the Vaihingen data set was tested. In future work, further data sets will be studied and compared with the results obtained here.