QUALITY ANALYSIS ON RANSAC-BASED ROOF FACETS EXTRACTION FROM AIRBORNE LIDAR DATA

The RANSAC algorithm is a robust method for model estimation. It is widely used in the extraction of geometric primitives and in 3D model reconstruction. However, there has been relatively little comprehensive evaluation of the RANSAC-based approach to plane extraction. To provide a reference for improving the quality of RANSAC-based roof facet extraction or segmentation, this paper focuses on the quality analysis of the classical RANSAC algorithm. Airborne LIDAR data from test Area 1 and Area 2 in Vaihingen (Germany) is used. 33 buildings (4 buildings with flat roofs and 29 buildings with sloped roofs) extracted from the LIDAR data are taken as input for plane extraction. Based on the characteristics of the detected planar surfaces, the problematic planes fall into several categories: non-segmented planes, over-segmented planes, under-segmented planes and spurious planes. Several causes of these quality problems are then discussed. Experimental results and analyses show that most of the quality problems of the classical RANSAC algorithm can be reduced by considering spatial-domain connectivity. However, there are still many issues requiring in-depth research. Finally, some methods are suggested to solve these problems.

* Corresponding author. Jixing Yan, yjxsky@whu.edu.cn.


INTRODUCTION
As a direct method for collecting dense and accurate 3D point clouds, Light Detection and Ranging (LIDAR) has become an important technology in topographic mapping and 3D city modelling. 3D building reconstruction is a strong focus of 3D city modelling, and much progress has been reported (Vosselman, 1999, Verma, 2006, Sampath, 2010). In the reported studies, building reconstruction is usually based on the assumption that a building is a polyhedral model consisting of plane primitives. Consequently, the procedure of building roof reconstruction can be decomposed into two main steps: 3D roof facet extraction and topology construction. To some extent, the quality of 3D building models depends mainly on the accuracy of 3D roof facet detection (Elberink, 2011).
Generally, 3D roof facet extraction from LIDAR data involves several basic methods and techniques such as segmentation, classification and clustering (Sampath, 2010). The local surface normal, calculated from neighbouring 3D points, is taken as the most important feature for detecting planes on a building roof. However, the surface normal is sensitive to noise. In addition to measurement uncertainty, LIDAR data will inevitably contain returns from parts of trees, antennas or electric wires over building roofs. Moreover, the approach used for neighbourhood selection in unstructured LIDAR point clouds also affects the accuracy of the calculated surface normal.
Other popular methods for extracting roof facets from point clouds are the Random Sample Consensus (RANSAC) algorithm (Forlani, 2006, Bretar, 2005, Tarsha-Kurdi, 2007) and the 3D Hough Transformation (Vosselman, 2001, Huang, 2011). Both are robust methods for the estimation of model parameters. The Hough Transformation and its extensions can only detect a limited set of 3D objects such as lines, planes and cylinders, whereas the RANSAC approach is more general-purpose in the detection of geometric primitives. In addition, the Hough Transformation is sensitive to segmentation parameters. However, both methods can produce false or surplus planes when used to extract roof facets from LIDAR data (Vosselman, 2001, Tarsha-Kurdi, 2007).
Roof facets extracted by RANSAC have been described qualitatively many times in the literature, but few studies provide a comprehensive evaluation of their quality. To provide a reference for improvement, we focus on the quality analysis. Building roofs extracted from airborne laser scanner data in the Vaihingen test areas (Cramer, 2010) are used, and some experiments and quality problems are discussed. This paper is organized as follows. In Section 2, we introduce the classical RANSAC algorithm for plane extraction and give an overview of related work in roof facet extraction. In Section 3, we introduce the test data and some experiments with classical RANSAC for plane extraction, and then analyse the experimental results. In Section 4, we draw conclusions from this work and discuss some future work.

RANSAC AND RELATED WORK
The RANSAC (Random Sample Consensus) algorithm proposed by Fischler and Bolles (Fischler and Bolles, 1981) is a robust method for extracting models from a data set. It is often used to extract geometric primitives from 3D point clouds in computer vision. In this section, we introduce the classical RANSAC algorithm for plane extraction and give a short overview of related work in roof facet extraction.

RANSAC
The RANSAC algorithm is an iterative method for estimating the parameters of a model from a set of observed data. Applied to the plane model, classical RANSAC can be described as follows:
1) Randomly select 3 points from the data; these define a plane p.
2) Compute the distances of the remaining points from the plane p. The points whose distance is smaller than a critical distance t are called "inliers" and belong to plane p. Record the three points and the number of inliers; this record is called "best_model".
3) Repeat steps 1) and 2) k times, or until no plane with more than d points can be found. In each iteration, if the number of inliers is greater than that of the best_model, replace the earlier best_model with the new one.
In the end, the parameters of the plane model are determined from the final best_model.
As described above, RANSAC estimates only one plane for a given data set. To detect all planes, the RANSAC algorithm is repeated until no more planes can be found. After each run, the points that belong to the detected plane are excluded from the data.
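The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names (plane_from_points, ransac_plane, extract_planes) are hypothetical, the sampled points are counted among the inliers, and only the standard library is used. The parameters t, k and d have the meaning given in steps 1) to 3).

```python
import math
import random


def plane_from_points(p1, p2, p3):
    """Return (a, b, c, d) with a unit normal for the plane through three
    points, or None if the points are (nearly) collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # The normal vector is the cross product of the two edge vectors.
    a, b, c = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = math.sqrt(a * a + b * b + c * c)
    if norm < 1e-12:
        return None
    a, b, c = a / norm, b / norm, c / norm
    return a, b, c, -(a * p1[0] + b * p1[1] + c * p1[2])


def point_plane_distance(plane, p):
    """Unsigned distance from point p to the plane (unit normal assumed)."""
    a, b, c, d = plane
    return abs(a * p[0] + b * p[1] + c * p[2] + d)


def ransac_plane(points, t, k, rng):
    """Steps 1) to 3): return (plane, inlier_indices) of the best of k draws."""
    best_plane, best_inliers = None, []
    for _ in range(k):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate sample, draw again
        inliers = [i for i, p in enumerate(points)
                   if point_plane_distance(plane, p) < t]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


def extract_planes(points, t, k, d, seed=0):
    """Repeat RANSAC, removing the inliers of each accepted plane, until the
    best plane found has fewer than d points."""
    rng = random.Random(seed)
    remaining = list(points)
    planes = []
    while len(remaining) >= 3:
        plane, inliers = ransac_plane(remaining, t, k, rng)
        if plane is None or len(inliers) < d:
            break
        planes.append((plane, [remaining[i] for i in inliers]))
        inlier_set = set(inliers)
        remaining = [p for i, p in enumerate(remaining) if i not in inlier_set]
    return planes, remaining
```

Note that nothing in this sketch checks whether the inliers of a plane are spatially connected: any point within distance t of the current best plane is assigned to it and removed before the next plane is searched for.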

Related work
Generally, previous work on RANSAC for roof facet extraction from LIDAR data can be divided into two categories: approaches based on the position of points (x, y and z) and approaches based on the surface normal.
(Brenner, 2000) introduces the RANSAC algorithm to detect planes for roof segmentation from a laser scanner DSM with a ground resolution of one meter. The results show that the RANSAC-based approach generates more planar regions than the other two algorithms, normal vector compatibility and contour-based segmentation. The regions are then filtered with a set of rules that define several relationships between the normal vectors of planes and ground plan edges. However, the RANSAC algorithm is used only as a tool, and there is little discussion of the planar regions it extracts.
An ND-RANSAC (Normal Driven RANSAC) approach was proposed by Bretar and Roux (Bretar, 2005) to extract planar primitives from raw LIDAR data. Instead of randomly selecting points from all data points on the roof, the 3 initial points that define a plane are randomly selected from point sets that share the same orientation of normal vectors. This reduces the number of draws and improves the efficiency of the RANSAC algorithm.
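Why restricting the sampling set reduces the number of draws can be illustrated with the standard estimate from the RANSAC literature: if a fraction w of the candidate points lies on the sought plane, a sample of 3 points is free of outliers with probability w^3, so about log(1 - p) / log(1 - w^3) draws are needed to obtain at least one clean sample with probability p. The sketch below is only this generic estimate, not the rule used by Bretar and Roux; the function name and the inlier ratios in the usage note are hypothetical.

```python
import math


def required_draws(inlier_ratio, success_prob=0.99, sample_size=3):
    """Number of random draws k such that, with probability success_prob,
    at least one sample consists only of inliers."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < success_prob < 1.0:
        raise ValueError("success_prob must be in (0, 1)")
    p_clean = inlier_ratio ** sample_size  # one draw yields a clean sample
    if p_clean >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - success_prob) / math.log(1.0 - p_clean))
```

For example, if a roof facet holds 20% of all roof points, required_draws(0.2) gives several hundred draws, whereas sampling only among points with a similar normal direction, where the facet might account for 80% of the candidates, brings required_draws(0.8) down to a handful.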
In ND-RANSAC, the parameters k and t of the RANSAC algorithm can also be determined automatically by analysing the distribution of normal vectors. Much other work has been done to improve the efficiency of RANSAC. (Forlani, 2006) introduces a method that combines RANSAC and region growing to extract roof facets from raw LIDAR data. A region growing algorithm based on gradient orientation is first used to determine planar roof segments, and RANSAC then decides whether the points within each region belong to a single plane. In that paper, the RANSAC algorithm is used as a robust method to further subdivide the sub-regions, while the quality of the sub-regions is little discussed.
The RANSAC algorithm tends to detect the best mathematical plane in a 3D building point cloud, even if this plane does not represent a roof plane. To overcome this limitation, an extended RANSAC algorithm was proposed (Tarsha-Kurdi, 2007, Tarsha-Kurdi, 2008). The RANSAC process is improved by adding a limit on the minimum number of points and on the standard deviation of the final fitted plane. In addition, to extend the capabilities of the RANSAC algorithm and obtain exact roof planes, the raw LIDAR data is converted into a DSM, from which a point set is generated after a simple low-pass filter. This approach can reduce the errors and noise of the point cloud. Finally, a region growing algorithm is used to decide whether the remaining points represent noise or roof details.
As the work reviewed above shows, the RANSAC algorithm is mostly treated as a processing step for extracting planes from a data set. Moreover, as explained in Section 1, the estimation of the local surface normal is sensitive to noise, and there is no clear conclusion on whether such unstable parameters affect the reliability of RANSAC. Furthermore, the problems of the planes extracted by the classical RANSAC algorithm, which have important implications for improving the quality of roof facet extraction, are little discussed.
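The two additional criteria of the extended RANSAC, a minimum number of points and a limit on the standard deviation of the fitted plane, can be pictured as an acceptance test applied to each candidate plane. The following is only one possible reading of those criteria, not the procedure of Tarsha-Kurdi; the function name, the parameter names min_points and max_std, and the use of signed point-to-plane distances are assumptions made for illustration.

```python
import math


def accept_plane(signed_distances, min_points, max_std):
    """Accept a candidate plane only if it has enough inliers and the
    standard deviation of their signed distances to the plane is small."""
    n = len(signed_distances)
    if n == 0 or n < min_points:
        return False
    mean = sum(signed_distances) / n
    variance = sum((x - mean) ** 2 for x in signed_distances) / n
    return math.sqrt(variance) <= max_std
```

A candidate with many inliers that are nevertheless widely scattered around the plane, as can happen when the plane cuts through several roof faces, would be rejected by the second condition.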

Test data
Area 3 in Vaihingen is a purely residential area with small detached houses, but most of the architectural features in this region can also be found in area 1 and area 2. Therefore, area 1 and area 2 are selected as the test areas. As shown in Figure 1, area 1 (Figure 1(a)) is located in the centre of the city of Vaihingen and is characterized by dense historic buildings with complex shapes. Area 2 (Figure 1(b)) is located by the river and features a few high-rise residential buildings.
Digital aerial images, a DSM and airborne LIDAR data are available for the test areas. In this experiment, the LIDAR data is taken as input, and the other data are used as reference. For further information about the data of the "ISPRS Test Project", please refer to http://www.itc.nl/ISPRS_WGIII4/tests_datasets.html.

Experiment and analysis
With the help of the 2D building plans of the test areas, buildings can be extracted from the LIDAR data. There are about 25 buildings in area 1 and 8 buildings in area 2. Then, classical RANSAC algorithm coded by Peter Kovesi (Kovesi, 2006)

Non-segmented planes:
In Figure 2(a), the points in the white square area represent a sloped roof of a high-rise residential building. However, because spatial-domain connectivity is not considered, RANSAC assigns the points on the sloped roof to other planar surfaces, which leads to a non-segmented plane. The profile of this building (Figure 3) confirms this. The flat roof with the most points is detected first and removed from the point cloud of the building. In the end, no points are left for the sloped roof.
There is another cause of non-segmented planes. As shown in Figure 4(b), there is a certain chance that planar surfaces in the hip roof are not detected. This happens when the 3 points randomly selected in the initial step of RANSAC do not lie on the same planar surface of the roof, which may lead to a spurious plane (green points in Figure 4(b)).
From the above analysis, although a small number of remaining points on a surface leads to a non-detected plane, the ultimate cause of a non-segmented plane is random sampling without spatial-domain connectivity. However, this explanation applies only to the small planar surfaces of a roof. Large planar surfaces can always be detected by RANSAC.

Over-segmented planes:
Generally, over-segmentation is mainly caused by a threshold value that is too small. To test this, different values of the parameter t (distance to the fitted plane) in RANSAC are used to extract planes from the roof. As shown in Figure 5(b), the appropriate value of t for this data is 0.05, and the four planar surfaces of the gable roofs are correctly detected. It should also be noted that more trivial facets are extracted from the roof with a smaller value of t. However, this does not mean that a greater value of t will not cause over-segmented planes.
Taking Figure 5(c) as an example, although the value of t is only a little greater than the appropriate value in Figure 5(b), it causes two over-segmented planes. According to the profile of this test data (Figure 6), parts of the point cloud on the parallel surfaces lie within the threshold of both surfaces and can be classified into either of them. Because spatial-domain connectivity is not considered, these points are assigned to whichever planar surface is detected first.
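One simple way to take spatial-domain connectivity into account is to split the inliers of each detected plane into groups of mutually reachable points and to treat each group separately. The sketch below is an illustration under stated assumptions, not the method evaluated in this paper: the function name connected_components and the parameter max_gap (the largest distance at which two points still count as neighbours) are hypothetical, and a brute-force neighbour search is used for brevity.

```python
from collections import deque


def connected_components(points, max_gap):
    """Group point indices so that every point is within max_gap (3D
    Euclidean distance) of at least one other point of its group.
    Groups are returned largest first."""
    max_gap_sq = max_gap * max_gap
    unvisited = set(range(len(points)))
    components = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        component = [seed]
        while queue:  # breadth-first growth from the seed point
            i = queue.popleft()
            xi, yi, zi = points[i]
            near = [j for j in unvisited
                    if (points[j][0] - xi) ** 2 + (points[j][1] - yi) ** 2
                    + (points[j][2] - zi) ** 2 <= max_gap_sq]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                component.append(j)
        components.append(sorted(component))
    components.sort(key=len, reverse=True)
    return components
```

Applied to the inliers of one mathematical plane, the largest group can be read as the body of the roof facet, while small groups far from it are candidates for points that were attached to the plane only because they happened to lie within the distance threshold t.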
Figure 6. Profile of roof in Figure 5

However, no single parameter value suits every situation. Most of the planar surfaces of the roof can be detected when the threshold t is set to 0.1. The number of over-segmented planes will increase if the threshold t is set too small, although a smaller threshold may be appropriate for this data.
Because RANSAC tends to detect the best mathematical planes, points belonging to other roofs or to noise may be classified into the planes with a large number of points, which reduces the accuracy of the boundary of the planar surface. However, when spatial connectivity is considered, coplanar planes (Figure 2(c)) and tails far away from the body plane can be separated from the body planes. Tails adjacent to the body plane may be separated based on point density. However, parts of the exact planar surface may then be removed as well, such as the points in the white rectangular area in Figure 7(b). In that case, the topological relationships between the detected planes may need to be considered.
As shown in Figure 8, in the test of 33 buildings, we find that the number of spurious planes is related to the number of detected planes.

Spurious planes:
The buildings with the most detected planes in Figure 8 are buildings with complex shapes (Figure 9(a)) or high-rise residential buildings (Figure 9(b)). It should be pointed out that the buildings in the red polygonal area of Figure 9(a) are adjacent to each other and are hard to separate in the point cloud. Therefore, they are regarded as one building in the test. For further analysis, different values of the parameter t are used to detect planes. From Table 1, we can see that some of the spurious planes can be removed by increasing the value of t, but this leads to more non-segmented planes. For the building with complex shapes, the number of over-segmented planes decreases considerably as the value of t increases. However, small over-segmented planes are eliminated as well, which affects the accuracy of the boundaries of the planar surfaces.
In this experiment, under-segmented planes with tails are more common for the building in Figure 9(a) than for the building in Figure 9(b). This is because the former consists of several buildings with many sloped roofs in different directions. Because the planes defined by the point sets are likely to intersect, some points may belong to several mathematical planes, which can lead to tails. By contrast, most planar surfaces of the latter building are parallel and discontinuous in height, so points tend to belong to one mathematical plane. Because there are some points on the facade surfaces, a few points may be classified into the planar surfaces of the flat roof. However, these points are very rare and far away from the body planes, so they are easy to separate.

CONCLUSION
Roof facet extraction is the basis of 3D building reconstruction based on the polyhedral model and has long been a focus of research. As a robust method for model estimation, RANSAC is widely used in the extraction of geometric primitives. This paper gives a comprehensive evaluation of the RANSAC-based approach to roof facet extraction. We define four detailed categories of inaccurate planes detected by RANSAC and, based on some experiments, discuss the reasons for these quality problems. The experiments show that non-segmented planes are sensitive to the number of points on a planar surface: small planes tend to be discarded. Over-segmented planes are susceptible to the parameters of RANSAC: an inappropriate value, whether too small or slightly too large, leads to over-segmentation. Under-segmented planes are sensitive to the shape of the building: with complex shapes, points belonging to several mathematical planes are more likely to be segmented into the larger plane, which causes under-segmentation. Spurious planes are common in all test data, and their number is related to the number of detected planes. Buildings with complex shapes tend to have more spurious planes.
Increasing the related threshold of RANSAC can reduce the number of spurious planes, but this affects the accuracy of plane detection. Most of the quality problems above can be reduced if spatial-domain connectivity is considered. However, some problems, such as tails adjacent to the body plane, cannot be solved in this way, and many issues remain to be studied.
We suggest that point density and the topological relationships between planes be considered.
As a result, some of the points on the same planar surface are removed, and the plane may not be detected because too few points remain (white points in the larger rectangular area in Figure 4(b)).

Figure 7. Planes detected by RANSAC. (a) Under-segmented planes. (b) Low point density area on roof planes.

Spurious planes (Figure 2(d)) are false planar surfaces detected by RANSAC. They are common to most of the test data. Because points on large planar surfaces tend to be detected first and removed from the raw data, only a few points, such as noise, points on edges or points on small planar surfaces, are left. These points, which have a low point density and lack spatial connectivity, are segmented into spurious planes.
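The low point density that characterizes such planes can be measured in many ways. As a rough illustration only, the sketch below uses the mean number of neighbours within a fixed radius and flags a detected plane as a candidate spurious plane when this value is small. The function names, the radius and the threshold min_mean_neighbours are hypothetical and would have to be tuned to the point spacing of the data; this is not a criterion tested in this paper.

```python
def mean_neighbour_count(points, radius):
    """Average number of other points within `radius` (3D distance) of
    each point; brute force, intended for the points of a single plane."""
    n = len(points)
    if n == 0:
        return 0.0
    r_sq = radius * radius
    total = 0
    for i in range(n):
        xi, yi, zi = points[i]
        for j in range(i + 1, n):
            xj, yj, zj = points[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2 <= r_sq:
                total += 2  # the pair counts once for each of its two points
    return total / n


def is_candidate_spurious(points, radius, min_mean_neighbours):
    """Flag a detected plane whose points are, on average, too isolated."""
    return mean_neighbour_count(points, radius) < min_mean_neighbours
```

A plane made of scattered leftover points yields a mean neighbour count near zero, whereas a genuine roof facet sampled at regular spacing yields a count close to that of a regular grid.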

Figure 8. Plane number plot (x-axis represents building ID, y-axis represents number of detected planes)
Figure 9.