SEGMENTATION OF LARGE UNSTRUCTURED POINT CLOUDS USING OCTREE-BASED REGION GROWING AND CONDITIONAL RANDOM FIELDS

ABSTRACT: Point cloud segmentation is a crucial step in scene understanding and interpretation. The goal is to decompose the initial data into sets of workable clusters with similar properties. It is also a key step in the automated procedure from point cloud data to BIM. Current approaches typically segment only a single type of primitive, such as planes or cylinders. Moreover, current algorithms tend to oversegment the data and are often sensor or scene dependent. In this work, a method is presented to automatically segment large unstructured point clouds of buildings. More specifically, the segmentation is formulated as a graph optimisation problem. First, the data is oversegmented with a greedy octree-based region growing method. The growing is conditioned on the segmentation of planes as well as smooth surfaces. Next, the candidate clusters are represented by a Conditional Random Field, after which the most likely configuration of candidate clusters is computed given a set of local and contextual features. The experiments show that the method is a fast and reliable framework for unstructured point cloud segmentation. Processing speeds of up to 40,000 points per second are recorded for the region growing. Additionally, the recall and precision of the graph clustering are approximately 80%. Overall, oversegmentation is reduced by nearly 22% by clustering the data. These clusters will be classified and used as a basis for the reconstruction of BIM models.


INTRODUCTION
Point cloud segmentation is a widely discussed topic in the research community. The objective is to associatively cluster points with similar characteristics. Not only does this drastically reduce the amount of data, it also allows for better data interpretation. For instance, once the point cloud is segmented, it can be processed by reasoning frameworks to identify different building elements for the purpose of reconstructing Building Information Models (BIMs) (Bassier et al., 2017). Point cloud segmentation is also used in fields such as computer vision, remote sensing, robotics and other applications that require the partitioning of 3D point cloud data (Volk et al., 2014).
Due to the sheer size of the point cloud information, automated segmentation is becoming increasingly computationally challenging. For instance, point clouds captured by Terrestrial Laser Scanners (TLS) consist of tens of millions of points per scan. Also, the scanned objects are often partly occluded, the data is unevenly distributed due to the range to the sensor and there is noise present in the data set (Tang et al., 2010). These problems prove challenging when developing segmentation algorithms (Nguyen and Le, 2013).
The emphasis of this work is on the segmentation of large unstructured point clouds of buildings. More specifically, we look to identify building object parts for classification purposes. The proposed method is able to properly partition the data even in highly cluttered and noisy environments. Also, our approach is capable of detecting smooth surfaces, including planes and complex geometry.
The remainder of this work is structured as follows. The background and related work are presented in Section 2. In Section 3, the methodology is presented. The test design and experimental results are presented in Section 4. The approach is discussed in Section 5. Finally, the conclusions are presented in Section 6.

BACKGROUND & RELATED WORK
Several methods have been proposed for efficient point cloud segmentation, including region growing, edge detection, model-based methods, machine learning techniques and graphical models (Nguyen and Le, 2013; Lin et al., 2015; Fan et al., 2017; Vosselman et al., 2017). In this research, an efficient region growing algorithm is implemented for the segmentation, similar to Yau et al. (2013), Vo et al. (2015), Habib and Lin (2016) and Lin et al. (2015). They propose algorithms for the segmentation of unstructured point clouds of small-scale objects and urban environments. We expand their approach by processing subsets of the initial point cloud in parallel. By doing so, the number of points that can be handled is greatly increased. Our method closely aligns with the work of Su et al. (2016). They also perform an initial oversegmentation, after which they use graph theory to cluster the segments. While they focus on the processing of cylindrical pipes, we segment both planes and complex geometry.
Both structured and unstructured data sets are processed in the literature. The former directly exploits the data structure of the sensor used, while the latter integrates costly nearest neighbour searches to process point cloud data. While being very efficient, processing structured data is often sensor and case specific (Zhou et al., 2014). For instance, Grant et al. achieve near real-time segmentation of successive laser scans using a Hough-transform implementation (Grant et al., 2013). Holz and Behnke rapidly segment RGB-D sensor data using region growing (Holz and Behnke, 2012).
In contrast, processing unstructured data is more general but requires preprocessing of the data. Typically, the data is restructured into voxel octrees for efficient nearest neighbour searches (Vo et al., 2015; Su et al., 2016). We integrate a similar approach, as the emphasis of this research is on the processing of unstructured data in order to be sensor independent.
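The benefit of a voxel structure for neighbour queries can be illustrated with a minimal sketch. The code below is not the octree used in this work: it assumes a flat, single-level voxel hash (a simplification of a hierarchical octree), and the names `build_voxel_index` and `radius_search` are hypothetical. It only shows why a radius query needs to visit a handful of voxels instead of the whole cloud.

```python
import math
from collections import defaultdict


def build_voxel_index(points, voxel_size):
    """Map each integer voxel coordinate to the indices of the points inside it."""
    index = defaultdict(list)
    for i, p in enumerate(points):
        key = tuple(math.floor(c / voxel_size) for c in p)
        index[key].append(i)
    return index


def radius_search(points, index, voxel_size, query, radius):
    """Return sorted indices of all points within `radius` of `query`.

    Only the voxels that can overlap the search sphere are visited.
    """
    reach = math.ceil(radius / voxel_size)  # number of voxels to look at per axis
    cx, cy, cz = (math.floor(c / voxel_size) for c in query)
    hits = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            for dz in range(-reach, reach + 1):
                for i in index.get((cx + dx, cy + dy, cz + dz), ()):
                    if math.dist(points[i], query) <= radius:
                        hits.append(i)
    return sorted(hits)


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.3, 0.0, 0.0)]
    idx = build_voxel_index(pts, voxel_size=0.1)
    print(radius_search(pts, idx, 0.1, (0.0, 0.0, 0.0), 0.1))
```

A true octree adds a hierarchy on top of this idea, which helps when the point density varies strongly with the range to the sensor.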
A promising segmentation approach is the use of graph theory. By representing the inputs as a set of nodes connected by edges, the segmentation can be treated as a graph optimization problem. Popular implementations of graphical models are Conditional Random Fields (Wolf et al., 2015; Xiong et al., 2013; Niemeyer et al., 2012) and Markov Random Fields (Munoz et al., 2009). These probabilistic models solve the segmentation by modelling the posterior probability of the outputs given a set of feature values. In our approach, we implement a Conditional Random Field to cluster our initial segments and thereby enhance the segmentation process.
The goal of our research after segmentation is to process the clustered segments with reasoning frameworks to compute class labels for each cluster. Some researchers consider segmentation and classification as a single-step process (Hackel et al., 2016; Landrieu et al., 2017). However, the features extracted from a single point and its neighbourhood typically encode only local characteristics. In our work, segmentation and semantic labelling are treated as individual steps. By reducing the point cloud to a set of clusters, the number of samples is greatly reduced and more global and distinct features can be computed from the groups of points.

METHODOLOGY
In this paper, a two-step segmentation algorithm is proposed that associatively clusters points and computes the properties for each group. An overview of the general workflow is depicted in Fig. 1.
First, an efficient octree-based region growing method is implemented that rapidly oversegments the data. Next, a clustering algorithm based on graph theory is used to reduce the number of segments created. Both steps are discussed in detail in the following paragraphs.
Region growing
The oversegmentation of the data is performed using an iterative growing algorithm based on the normal and colour information of the point cloud. As input, our algorithm takes any point cloud in a widely accepted format. The initial data is restructured as a voxel octree P for efficient neighbourhood searches. Also, the data is tiled and processed in parallel for computational efficiency. Each tile is uniformly seeded with seeds s ∈ S, after which segments M are iteratively grown. During every iteration, candidate points C are considered within a dynamic search radius d from s (Fig. 2 top). The inliers of M are the points c ∈ C that meet the normal and colour similarity conditions (Eq. 2), where tnc and tRGB are respectively the thresholds for the normal similarity and the colour similarity of c compared to s. The growing is performed by considering new points in M as the new seeds for the segment. To make the algorithm more efficient, only a subset of C is used for the growing (Fig. 2 mid). A set of candidate growing points Q is defined along the border of C, conditioned on the distance d from c to s and on a normal threshold tnq that is stricter than tnc to ensure good growing candidates. During the growing stage, the members of Q are iteratively used to seek new growing points Q and candidates C until Q is empty. For each q ∈ Q, the update steps of Fig. 2 (bottom) are performed, where nq and RGB(q) are respectively the mean normal and mean colour of the nearby points of c in M. Potential segments are only accepted when they contain sufficient points, based on a threshold. Both the seeds and candidate groups are treated efficiently so that no unnecessary neighbourhood searches are performed. The result is a set of segments M that contain points with similar properties.
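The growing logic can be sketched in a few lines. This is a simplified illustration under several assumptions, not the implementation used in this work: neighbours are found by brute force rather than through the octree, similarity is measured against the seed rather than against the local mean normal and colour, the colour distance is assumed to be Euclidean in RGB, the border condition on Q is omitted, and normals are assumed to be unit length. The function names are hypothetical; the thresholds tnc, tnq and tRGB follow the text.

```python
import math


def angle_deg(n1, n2):
    """Angle in degrees between two unit normals."""
    dot = sum(a * b for a, b in zip(n1, n2))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def grow_segment(seed, xyz, normals, rgb,
                 radius=0.1, t_nc=45.0, t_nq=15.0, t_rgb=20.0):
    """Greedily grow one segment from point index `seed`.

    A candidate joins the segment when its normal is within t_nc degrees of
    the seed normal and its colour is within t_rgb of the seed colour
    (assumed Euclidean RGB distance). Only candidates that also pass the
    stricter t_nq threshold are used as new growing points.
    """
    segment = {seed}
    growing = [seed]  # plays the role of Q in the text
    while growing:
        q = growing.pop()
        for c in range(len(xyz)):  # brute-force stand-in for an octree query
            if c in segment or math.dist(xyz[c], xyz[q]) > radius:
                continue
            ang = angle_deg(normals[c], normals[seed])
            if ang <= t_nc and math.dist(rgb[c], rgb[seed]) <= t_rgb:
                segment.add(c)
                if ang <= t_nq:
                    growing.append(c)
    return segment


if __name__ == "__main__":
    xyz = [(0.0, 0.0, 0.0), (0.06, 0.0, 0.0), (0.12, 0.0, 0.0), (0.5, 0.0, 0.0)]
    normals = [(0.0, 0.0, 1.0)] * 4
    rgb = [(100, 100, 100)] * 4
    print(sorted(grow_segment(0, xyz, normals, rgb)))
```

Splitting the acceptance test (tnc) from the growing test (tnq) lets a segment absorb slightly deviating points at its edges without continuing to grow from them.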
Clustering
Once the data is oversegmented, the individual segments are processed by an associative classification framework to enhance the clustering. More specifically, the clustering of the segments is considered a binary graph optimization problem on a Conditional Random Field (CRF) (Sutton and McCallum, 2011). Binary class labels y ∈ ζ = {0, 1} are computed for each observed segment to determine whether or not a segment is willing to cluster with its neighbours. A graph G = (M, E) is constructed whose nodes M are the segments of the initial region growing and whose edges E are derived from the adjacency matrix that connects each node to its four nearest neighbours. As a criterion for adjacent neighbours, the Euclidean distance between the boundaries of u ∈ Mi and v ∈ Mj is observed. A set of local features xn = {xn1, xn2, . . ., xnk} is computed for each node in G, including the surface type and size (Fig. 3 left). Additionally, contextual features xe = {xe1, xe2, . . ., xel} are extracted from the edges, such as surface type similarity, coplanarity and the distance between the boundaries of the nodes. Given the graph and the local and contextual features, the unary and pairwise potentials are initialised (Fig. 3 mid). The conditional probability p(y|x) of the class labels y ∈ ζ = {0, 1} is given by Eq. 5.
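A common way to write such a conditional probability is the standard pairwise log-linear form of a CRF (Sutton and McCallum, 2011). The expression below is a sketch assembled from the symbols defined in the surrounding text; the indices i over the nodes in M and (i, j) over the edges in E are assumptions, and the exact notation of the original formulation may differ.

```latex
p(\mathbf{y} \mid \mathbf{x}, \omega) =
  \frac{1}{Z(\mathbf{x}, \omega)}
  \prod_{i \in M} \varphi_n\!\left(y_i, \mathbf{x}_{n,i}, \omega_n\right)
  \prod_{(i,j) \in E} \varphi_e\!\left(y_i, y_j, \mathbf{x}_{e,ij}, \omega_e\right)
```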
In Eq. 5, Z is the partition function that normalises the probability, ω = {ωn, ωe} are the pre-trained node and edge weights, and φn, φe are the log-linear potentials given the feature vectors. The final class labels are computed by minimising the negative log-likelihood in the Conditional Random Field (Eq. 6) given the feature vectors x and ω. However, exact inference is intractable in a densely connected CRF. Therefore, we approximate the log-likelihood using a loopy belief propagation method (Frey and MacKay, 1998) as implemented by Schmidt (Schmidt, 2010). After marginal inference is computed, each cluster is assigned the label with the highest probability. The result is a set of labelled 3D point clusters.
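To make the inference step concrete, the sketch below runs sum-product loopy belief propagation on a small binary pairwise model and compares it with brute-force enumeration. It is a pure Python stand-in written for illustration, not the UGM toolbox used in this work. The potential tables, the function names and the fixed number of iterations are all assumptions.

```python
import itertools
import math


def loopy_bp(node_pot, edge_pot, iters=50):
    """Approximate node marginals of a binary pairwise model.

    node_pot[i][y] is the unary potential of node i taking label y.
    edge_pot[(i, j)][yi][yj] is the pairwise potential of edge (i, j).
    """
    n = len(node_pot)
    nbrs = {i: [] for i in range(n)}
    for i, j in edge_pot:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def pot(i, j, yi, yj):
        if (i, j) in edge_pot:
            return edge_pot[(i, j)][yi][yj]
        return edge_pot[(j, i)][yj][yi]

    # msg[(i, j)][yj] is the message from node i to node j.
    msg = {(i, j): [1.0, 1.0] for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for i, j in msg:
            out = []
            for yj in (0, 1):
                total = 0.0
                for yi in (0, 1):
                    incoming = math.prod(msg[(k, i)][yi] for k in nbrs[i] if k != j)
                    total += node_pot[i][yi] * pot(i, j, yi, yj) * incoming
                out.append(total)
            norm = sum(out)
            new[(i, j)] = [v / norm for v in out]
        msg = new

    beliefs = []
    for i in range(n):
        b = [node_pot[i][y] * math.prod(msg[(k, i)][y] for k in nbrs[i]) for y in (0, 1)]
        norm = sum(b)
        beliefs.append([v / norm for v in b])
    return beliefs


def exact_marginals(node_pot, edge_pot):
    """Exact marginals by enumerating all 2^n labellings (tiny graphs only)."""
    n = len(node_pot)
    marg = [[0.0, 0.0] for _ in range(n)]
    for y in itertools.product((0, 1), repeat=n):
        w = math.prod(node_pot[i][y[i]] for i in range(n))
        w *= math.prod(p[y[i]][y[j]] for (i, j), p in edge_pot.items())
        for i in range(n):
            marg[i][y[i]] += w
    return [[a / (a + b), b / (a + b)] for a, b in marg]


if __name__ == "__main__":
    nodes = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]
    edges = {(0, 1): [[2.0, 1.0], [1.0, 2.0]], (1, 2): [[2.0, 1.0], [1.0, 2.0]]}
    beliefs = loopy_bp(nodes, edges)
    labels = [max((0, 1), key=lambda y: b[y]) for b in beliefs]
    print(beliefs, labels)
```

On a graph without cycles the two functions agree; on a graph with cycles, such as a four-nearest-neighbour graph, the beliefs are only an approximation, which is the trade-off accepted in the text.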
Once the class labels are computed, the segments that are labelled as potential clustering candidates are grouped together (Fig. 3 right). Iteratively, the most likely connected clustering candidates are merged given their edge potentials φe(xe, ωe). The result is a set of clustered segments with similar properties.
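The merging step can be sketched with a union-find structure. The code below is an illustration under stated assumptions: each edge carries a single scalar score standing in for its edge potential, edges are visited from the highest to the lowest score, and a hypothetical cut-off `min_score` decides whether an edge is strong enough to merge. None of these choices is confirmed by the text.

```python
def merge_candidates(labels, edge_scores, min_score=0.5):
    """Group segments labelled 1 along their strongest edges.

    labels[i] is the binary CRF label of segment i.
    edge_scores[(i, j)] is an assumed scalar score of edge (i, j).
    Returns one cluster id per segment.
    """
    parent = list(range(len(labels)))

    def find(a):
        # Follow parent links to the root, halving the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Visit the most likely connections first.
    for (i, j), score in sorted(edge_scores.items(), key=lambda kv: -kv[1]):
        if labels[i] == 1 and labels[j] == 1 and score >= min_score:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri
    return [find(i) for i in range(len(labels))]


if __name__ == "__main__":
    labels = [1, 1, 0, 1]
    scores = {(0, 1): 0.9, (1, 2): 0.8, (1, 3): 0.2, (0, 3): 0.7}
    print(merge_candidates(labels, scores))
```

Segments labelled 0 keep their own cluster id, so they pass through the step unchanged.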

EXPERIMENTS
The algorithm was trained and tested on different data sets. The candidate data was not de-noised and a wide variety of objects was present in the point clouds (Figs. 1 and 4). All the input data was stored in binary file formats for efficient I/O. In addition to our own data, the 2D-3D-Semantics (2D-3D-S) benchmark data of Stanford was evaluated (Armeni et al., 2017) (Fig. 4). Areas 1 to 4 were used for the testing, with approximately 18 to 50 million 3D points each. All data was processed on a laptop with an Intel Core i7-4900MQ CPU @ 2.8 GHz with 4 cores, 7 active threads and 32 GB RAM. The algorithm was implemented in MATLAB 2017 and uses the parallel computing toolbox, the computer vision toolbox and the UGM toolbox (Schmidt, 2010).
Segmentation
The segmentation results are shown in Table 1 and Fig. 4. By default, the tile size was set to 1 million 3D points. This fairly small tile size was chosen to avoid idle CPU cores, as some tiles take more time to process due to the complexity of the geometry. The dynamic search radius for the nearest neighbours was initialised at 0.1 m, the angular thresholds at tnc = 45° and tnq = 15°, and the colour threshold at tRGB = 20. Additionally, a minimal point count of 2000 points was defined for potential segments. The tests show promising results, with an approximate processing speed of 40,000 points/s including I/O, normal computation and segmentation. By processing multiple tiles in parallel, the segmentation speed is multiplied by the number of available threads, which speeds up the process significantly. On average, 1 cluster is created for every 15,000 points. Most clusters are properly found, as shown in Fig. 4. The surface types other than the planar segments showed increased oversegmentation and misclustering. This is expected given that these objects suffer greatly from occlusions and noise. Additionally, the data was oversegmented due to feature variance and because of the data tiling. Also, nearly 30% of the data was unfit
Clustering
The clustering results are shown in Table 2 and Fig. 4. The node and edge features were standardised for generalisation purposes. Two node features were used, encoding the size and the surface type of each segment. Three edge features were introduced that represent the coplanarity, the distance between boundaries and the surface type similarity. The weights of the model were trained by minimizing the negative log-likelihood given the feature vectors x and the labels y of known observations. In total, 3000 surfaces of an office building were used to train the weights (Fig. 1). In order to improve the performance of the model, an equal number of observations of each class was used. Additionally, a regularization parameter λ was introduced that penalises deviating weights.
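The two data preparation steps mentioned above, feature standardisation and class balancing, can be sketched as follows. This is an assumed reading: standardisation is taken to mean a per-feature z-score, and balancing is taken to mean random undersampling of the larger class. The function names and the fixed random seed are hypothetical.

```python
import random
import statistics


def standardise(rows):
    """Z-score each feature column: subtract the mean, divide by the std."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # A constant column has zero spread; divide by 1.0 to leave it at zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def balance_classes(samples, labels, seed=0):
    """Undersample so that every class keeps the same number of observations."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    n = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in sorted(by_class.items()):
        balanced.extend((sample, label) for sample in rng.sample(items, n))
    return balanced


if __name__ == "__main__":
    features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
    print(standardise(features))
    print(balance_classes(list(range(10)), [1] * 7 + [0] * 3))
```

Standardising puts features with very different units, such as segment size and boundary distance, on a comparable scale before the weights are learned.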
The model was tested on the Stanford 2D-3D-Semantics Dataset (2D-3D-S). The performance is depicted in Fig. 5. The average recall and precision are 78% and 79% respectively. This is very accurate given the large feature variance and the configuration of the observed segments. However, a portion of the inliers are easily found due to the data tiling. Planar surfaces belonging to different tiles will cluster more easily, as they initially belonged to the same object. Overall, an average reduction of 22.4% was recorded for the clustering of the segments. The mean time for the feature extraction of each cluster and computing inference in the graph is 80 milliseconds per segment. The experiments revealed that the majority of clustered components are of the planar surface type. This is expected due to the features used and the characteristics of these objects. However, several pipe segments were properly clustered due to surface similarity and connectivity.
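For reference, the reported metrics can be computed as in the sketch below. Precision and recall follow their standard definitions for the positive class (label 1). The reduction is assumed here to be the relative drop in the number of segments after clustering, which is one plausible reading of the 22.4% figure. The input values are made up for illustration and are not results from the experiments.

```python
def precision_recall(predicted, actual):
    """Precision and recall of the positive class (label 1)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def reduction_percent(n_before, n_after):
    """Relative drop in segment count after clustering, in percent (assumed definition)."""
    return 100.0 * (n_before - n_after) / n_before


if __name__ == "__main__":
    predicted = [1, 1, 0, 1, 0]
    actual = [1, 0, 0, 1, 1]
    print(precision_recall(predicted, actual))
    print(reduction_percent(1000, 776))
```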

DISCUSSION
Point cloud segmentation can be performed with numerous 2D and 3D algorithms. Most methods are capable of finding planar or other primitives in a scene. However, there are few methods that deal with large unstructured data. A promising approach is data tiling, which separates the initial point cloud into more workable chunks and allows for efficient parallel processing. As in our work, the segmentation process can be sped up by a factor nearly equal to the number of available threads. This approach can be further extended towards GPU integration.
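The tiling idea can be sketched as follows. The code assumes that tiles are formed by cutting the point list into fixed-size chunks, which only yields spatially compact tiles if the points are already spatially ordered, and `process_tile` is a hypothetical placeholder for the per-tile segmentation. A thread pool is used to keep the example short; in CPython, CPU-bound work would need a process pool to actually run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def make_tiles(points, tile_size):
    """Cut the point list into consecutive chunks of at most tile_size points."""
    return [points[i:i + tile_size] for i in range(0, len(points), tile_size)]


def process_tile(tile):
    """Placeholder for the per-tile work; here it only counts the points."""
    return len(tile)


def process_in_parallel(points, tile_size, workers):
    """Process every tile with a pool of workers and keep the tile order."""
    tiles = make_tiles(points, tile_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_tile, tiles))


if __name__ == "__main__":
    cloud = list(range(10))  # stand-in for 3D points
    print(process_in_parallel(cloud, tile_size=4, workers=2))
```

Using many tiles relative to the number of workers, as with the 1 million point tiles in the experiments, helps keep all workers busy when some tiles take longer than others.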
A key issue in point cloud segmentation is oversegmentation. While other approaches typically propose heuristics to cluster the individual segments, we tackle this problem using graph theory. By considering the segments as a Conditional Random Field, a probabilistic classification is made to determine which nodes should be clustered. A major advantage is the associative behaviour of neighbouring clusters, which leads to more accurate classification. Also, the model parameters are learned from extensive training data, which allows better generalisation.

CONCLUSION
This paper presents an unsupervised method to segment large unstructured point clouds of buildings. The segmentation is considered as an optimisation problem that can be solved with graph theory. First, the data is oversegmented into planes and complex geometry. A greedy octree-based region growing method is implemented to associatively group points. For efficiency purposes, the data is tiled and processed in parallel on multiple CPU threads. Next, the candidate segments are represented by a Conditional Random Field (CRF) with both local and contextual features to determine the most likely clustering candidates. The result of the method is a set of segmented clusters of the initial point cloud data.
The experiments show that the method is a fast and reliable segmentation framework for unstructured point cloud data. Segmentation speeds of up to 40,000 points per second are recorded for the region growing. Additionally, the recall and precision of the graph clustering are approximately 80%. Overall, the algorithm shows promising results for point cloud segmentation.
In future work, the presented approach will be investigated further to reduce oversegmentation and to obtain better results on surface types other than planar segments. Also, GPU integration will be considered to further speed up the process. The use of Conditional Random Fields for the clustering of the initial segments will be extended to yield better clustering results and to be applicable to other research topics, such as the classification and reconstruction of BIM objects from point cloud data.

Figure 1. Overview of the workflow and intermediate results: initial point cloud (a), oversegmented clusters with non-valid clusters in black (b), graph representation of clusters (c) and the final grouped clusters (d).

Figure 2. Overview of the region growing workflow: candidate segment points C in P (top), extraction of growing points Q from C (mid) and update of Q, M and C during each iteration (bottom).

Figure 4. Overview of the experiments on the Stanford 2D-3D-Semantics Dataset (2D-3D-S). The top and bottom rows respectively depict segmentation and classification results for Area 1 and for a subset of Area 1 for clarity. From left to right: initial point cloud (left), oversegmented clusters with non-valid clusters in black (mid) and the final grouped clusters (right).