BUILDING BOUNDARY EXTRACTION FROM LIDAR DATA USING A LOCAL ESTIMATED PARAMETER FOR ALPHA SHAPE ALGORITHM

The α-shape algorithm is a very common option to extract building boundaries from LiDAR data. This algorithm is normally executed in 2D space considering a parameter α as a binary classifier which controls the distinctiveness of points whether or not they belong to the object boundary. For point cloud data, this parameter is directly related to the local point density and the level of detail of building boundaries. Studies that have explored this concept usually consider a unique parameter α to extract all buildings in the dataset. However, the point density can have a considerable variation along the point cloud and, in this case, the use a global parameter may not be the best choice. Alternatively, this paper proposes a data-driven method that estimates a local parameter for each building. The method evaluation considered six test areas with different levels of complexity, selected from a LiDAR dataset acquired over the city of Presidente Prudente/Brazil. From the qualitative and quantitative analysis, it could be seen that the proposed method generated better results than when a global parameter is used. The proposed method was also able to withstand density variation among the LiDAR data, having a positional accuracy around 0.22 m, against 0.40 m of global parameter. * Corresponding author


INTRODUCTION
LiDAR data has gained attention in several sciences, such as those related to the geodetic sciences, where the input data is considered for the extraction of different kinds of objects: buildings, trees, roads, power lines, planar surfaces and vehicles among other things.For cartographic purposes, the feature extraction is fundamental for digital terrain model representations and the acquisition and/or update of geographic information systems, which can be used as database in many practical applications.
In urban planning, building boundary extraction has an essential role due to the high demand of updated cartographic products in order to subsidize decision making.The automatic and semiautomatic extraction process has been researched by many authors (Kim and Habib, 2009, Galvanin and Dal Poz, 2012, Awrangjeb, 2016).The difficulty of developing a fully automatic method is related to the complexity of the buildings, for instance, buildings composed of concave and convex segments, inner and outer boundaries.The convex hull is a concept that has been explored for automatic building extraction from LiDAR data.Sampath and Shan (2007) proposed the modified convex hull formation algorithm by limiting the search space using a neighbourhood, which enables the extraction of some concavities in the shape.The challenge consists in the neighbourhood size determination, whereas the limitation corresponds to non-extraction of inner boundaries.In contrast, the alpha shape (α-shape) has the ability to detect both inner and outer boundaries.Many works have been using this algorithm, as seen in Shen (2008), Jochem et al. (2009), Shen et al. (2011), Satari et al. (2012), He et al. (2014), Li et al. (2015), and Albers et al. (2016).The α-shape algorithm was introduced by Edelsbrunner et al. (1983), aiming at the extraction of the shape of a set of points in 2D space.It considers a parameter α, responsible for selecting the straight segments that composes the object boundary.The challenge is associated with the determination of this parameter, since it depends directly on point cloud density and the desired level of detail.
According to Jochem et al. (2009), in the context of building boundary extraction from LiDAR data, the average point spacing is a good estimate of the parameter α.Shen (2008), Shen et al. (2011), andHe et al. (2014) consider a similar approach, as the value α is obtained by multiplying the average point distance by a constant, which was determined empirically.Li et al. (2015) consider two α values, one larger and the other smaller than the average point spacing.This method generates two boundaries for each building, and the application of a refinement process was necessary to obtain the final boundary.However, the use of a constant parameter to extract all buildings in the dataset might not be the best choice, since the point density can vary.There can be several causes for such point density variation: aircraft velocity change, recording of multiple returns (normally in trees and building edges), aircraft rotation change, and the presence of occlusions.Figure 1 presents examples of point density variation over the same dataset.Therefore, the use a parameter α that can withstand density variation becomes compelling.
To overcome this limitation, this paper proposes an adaptive approach to estimate a local parameter α for each building instead for the entire dataset.The method assumes that point density does not vary significantly within each building.The input data consists of the set of points derived from the building roof.The method is divided into four steps.Firstly, a triangular mesh is generated over the input using the Constrained Delaunay Triangulation (CDT), then, a statistical procedure is executed to identify and eliminate long edges that are inconsistent with the spacing between the sampled points.In a third step, the parameter α is estimated considering the average length of remaining edges.Finally, the α-shape algorithm is performed considering the estimated parameter for each building.
Figure 1.Point density variation on real data, from the same data set

METHOD
Figure 2 shows the simplified proposed method flowchart.The input data corresponds to the roof points of a given building, whereas the output is the building boundary.Considering that the main objective is the extraction of contours using a local parameter α, the set of roof points of each building was selected manually using CloudCompare software, version 2.9.1.The method is composed of three steps for determination of a local parameter α, followed by the execution of the α-shape algorithm.
Figure 2. Flowchart of the steps involved in the local parameter α estimation, used to perform the α-shape algorithm

Local Alpha Parameter Estimation
The first step consists of the CDT over the building roof points.
As the point cloud derived from the LASER scanning is represented in 3D space, the altimetric component was omitted in the process.The triangulation process can be impaired when there are points located in the same position.In order to eliminate the duplicated points, a pre-processing over roof points was performed using the lasduplicate tool from the LAStools package (Isenburg, 2018).
In this work, the local parameter α estimation is based on the average point spacing which is not constant over the 3D point cloud.Before the estimation step, a statistical procedure is executed to detect and eliminate some lines obtained by the CDT located outside the outlines of the buildings.An example of a CDT considering a set of points sampled over a building is presented in Figure 3.Some long edges which can affect the determination of the average point spacing can be seen in Figure 3a.
Assuming the length of edges is normally distributed, a simple criterion to discard these long edges is considered, which is based on the average length of the edges (eavg) and their standard-deviation (se).Edges longer than the average plus three standard-deviations are outlier candidates, and can be detected as follows: If ei < |3se + eavg|, then ei is not an outlier; Otherwise, ei is considered an outlier.
where ei is the length of the edge i.
The parameter α of a given building j is computed by means of Equation 1.In this equation, only the edges marked as nonoutliers are considered in the calculation.Figure 3b presents an example which used the above criterion to eliminate the edges considered as outliers.
where αj = estimated parameter for the building j n = number of edges considered ei = length of the edge i that satisfies the criterion

Alpha Shape Algorithm
The α-shape algorithm (Edelsbrunner et al., 1983)  straight-line segments that compose the boundary.For that, the 2D set of points and the parameter α are considered as input.In this work, the set of points corresponds to the LiDAR points sampled over the building roof, and the parameter α is estimated considering the proposed method (Subsection 2.1).
Considering the points pi and pj, and the parameter α as the radius of a circle passing through these points, it is possible to calculate two circles, as shown in Figure 4.If no point exists inside at least one circle, then points pi and pj are considered as edge points and can be connected to obtain a boundary segment (Figure 4a).If this is not the case, points pi and pj do not form a boundary segment (Figure 4b).

Quality Assessment
In this work, the building boundaries assessment was performed both qualitatively and quantitatively.The qualitative analysis was carried out by means of a visual inspection, whereas the quantitative analysis was performed using some quality measures: completeness (Comp), correctness (Corr), The Hausdorff distance H(A, B) between two sets A and B is defined as follows (Huttenlocher et al., 1993): The PoLiS metric p(A, B) between two polygons A and B is defined by Avbelj et al. (2015) as: (7) where q and r correspond to the number of vertices of polygons A and B, respectively.∂A and ∂B denote the boundary of polygons A and B, respectively.

Experiments and Results
This section presents the experiments and results obtained using the proposed method.The experiments were performed considering a LiDAR dataset acquired over the city of Presidente Prudente/Brazil by the company Sensormap Geotecnologia, which belongs to the Engemap Group, as can be seen in Tommaselli et al. (2018).The data set was acquired from a flight height of 900 m with an average density of 5 points/m 2 .The airborne LASER scanning system used was a RIEGL LMS-Q680i.This system, which has a precision of up to 2 cm in the range measurement, uses the principles of LiDAR to measure distances.It has a LASER pulse repetition rate of up to 400 kHz and an effective measurement rate of up to 266 kHz at a 60° scan angle.The system also has the ability to store multiple returns.Two clippings of areas from the dataset used are presented in Figure 5.The reference building boundaries were obtained from stereo restitution using digital aerial images of 80 megapixel acquired by one Phase One medium format camera (model iXA 180), with an average ground sample distance (GSD) of 10 cm.
To verify the performance of the proposed method, the building boundaries were extracted following two strategies.In the first strategy, the α-shape algorithm was performed using a global parameter α, which corresponds to the average point spacing of the dataset.In this case, the average point spacing was calculated considering all the points of the cloud point, that is, points sampled over vegetation, ground, and other anthropic objects.For the second strategy, the proposed method was used to estimate a local parameter α for each building by means of Equation 1.Both strategies were implemented in C programming language using Code::Blocks version 16.01.The kd-tree storage structure and search functions, which are implemented in the Fast Library for Approximate Nearest Neighbors -FLANN library (Muja and Lowe, 2013) version 1.8.0, were used to perform neighbourhood search operations.
The experiments were performed considering six different buildings (Figure 6) from the same dataset.Building 1 corresponds to a simple convex building of rectangular shape (Figure 6a).Building 2 has a rectangular shape and is composed of a small concave region (Figure 6b).Building 3 has a greater complexity, being formed by many concave regions (Figure 6c).Building 4 has a high complexity, its contour being composed of convex and concave segments (Figure 6d).In addition, there is a little gap on the left side, as can be observed in the profile generated from LiDAR data (Figure 6e).Building 5 also has some complexity, as its design includes both outer and inner boundaries (Figure 6f).Building 6 (Figure 6g) is located in a region of overlapping LiDAR strips, and a change of point density can be noted in the building (Figures 7k and 7l).
The building boundaries extracted using the α-shape algorithm are presented in Figure 7.The results related to the global parameter α are presented in Figures 7a, 7c, 7e, 7g, 7i, and 7k (left side), whereas the results considering the method proposed are shown in Figures 7b, 7d, 7f, 7h, 7j and 7l (right side).The quality parameters estimated for the different strategies are presented in Table 1.Two values of α are shown for each building.The first corresponds to the global parameter (α=0.45 m, same value for all buildings), whereas the second corresponds to the local parameter, which was estimated considering the proposed method.The quality parameters were computed considering the Equations 2-7, which were described in subsection 2.3.In Table 1, the first column corresponds to the building identification, for example, building 1 is represented by B1 and so on.The completeness, correctness and Fscore parameters are presented in the third, fourth and fifth columns, whereas the PoLiS and Hausdorff distance metric are presented in the sixth and seventh columns.Some fields are missing from the table (cells with a hyphen), since the algorithm failed to yield a unique closed polygon using those α parameters and, as consequence, the computation of completeness, correctness and Fscore parameters was not possible.The reference building boundaries were manually-derived on aerial images using a photogrammetric restitution procedure using the ERDAS IMAGINE LPS system.
Table 1.Quality parameters for different buildings and strategies

Discussion of Results
By comparing the local parameter estimated for each building with the global parameter estimated for the cloud point, i. e. α=0.45 m (Table 1), it is possible to verify that the latter is smaller for all cases, except for building 4. The reason for the α value being smaller is directly related to the use of all points in the estimation process, including multiple returns, which usually occurs in border and vegetation regions, as can be seen in Figure 5.In contrast, the local parameter is estimated considering only the roof points of the building of interest.
As mentioned, the buildings presented in Figure 7 were selected from the same dataset.Analyzing the values of α computed by means of the proposed method, it can be seen that each building had a different local parameter.The difference between the highest and lowest values is approximately 0.40 m.This difference indicates the point density variation along the dataset and reinforces the idea of the use of a parameter α for each building.
It can be seen that the results obtained for building 1 (Figures 7a and 7b) using the global parameter were not satisfactory, since many incorrect segments were extracted.This problem occurred due to the point spacing on building 1 which is much larger than the global parameter estimated.On the other hand, the use of a local parameter overcame this problem.
Some incorrect segments were also extracted when the global parameter was used on building 2 (Figures 7c and 7d).The extracted boundary also presented a more irregular appearance (shape with aliasing), as highlighted in Figures 7c and 7d, since the global parameter is smaller.In contrast, the use of the local parameter produced more regular boundaries and did not extract incorrectly in the middle of the building, as observed in Figure 7d.
In building 3 (Figures 7e and 7f) only a few incorrect segments were extracted when the global parameter was considered.Despite this limitation, the global parameter generated better results in the extraction of concave segments.However, in the convex regions, the use of the local parameter produced better results.
It can be seen in Figures 7g and 7h that the use of the global parameter was not able to extract the small concave region located on the left side of building 4. In contrast, the use of the local parameter extracted this detail, since the estimated local parameter α=0.40 m is smaller than the global parameter (α=0.45 m).
As can be observed visually in building 5 (Figures 7i and 7j), the results using a global and local parameter are very similar, having only a few discrepancies.The global parameter generated better results in some concave regions, whereas the local parameter produced better results in some convex regions, as highlighted in Figures 7i and 7j.These results are directly related to the fact that the use of a smaller α extracts more details, whereas a higher α generated a smoother boundary.
In building 6 (Figures 7k and 7l), it can be clearly observed that the point distance in this building changes a lot.It is possible to observe from the results that neither the global nor the local parameter generated good results, since many incorrect segments were extracted.These results indicate that neither the global nor the local parameters work well when there is a significant density variation along the building.
In terms of completeness, correctness and Fscore, it is not possible to perform a comparative analysis between the two strategies, since for most cases, when the global parameter was considered, it was not possible to calculate the quality parameters, as can seen in Table 1.Considering the local parameter, the average values of completeness, correctness and Fscore were 95%, 98% and 97%, respectively.These results indicate a high superposition between the extracted and reference boundaries.
Considering the PoLiS metric (Table 1), the proposed method presented average value of 0.22 m, against 0.40 m for the global parameter.For the global parameter, the worst case presented a PoLiS value of 1.18 m (B1), whereas the best case presented a PoLiS value of 0.15 m (B4).In contrast, the proposed method presented values of 0.31 m and 0.14 m for the worst (B6) and best (B5) case, respectively.Analysing the Hausdorff distances (Table 1), it can be seen that the average value for the global parameter is around 1.95 m, against 1.01 m for the proposed method.The lower value of the Hausdorff distance is always related to the use of the local parameter.From this, it can be concluded that the proposed method is suitable for extracting building roof boundaries.
In summary, the qualitative analysis indicates that the proposed method can be used to estimate the parameter α, necessary in the extraction of building boundaries with the α-shape algorithm.The results showed an improvement when compared with the use of a global parameter.

CONCLUSION
This paper proposes a data-driven approach to estimate a perbuilding parameter α, that is, a local α-shape estimation.This adaptive method makes use of the average point spacing within each building domain.The main advantage of the proposed method lies in the determination of a parameter that can withstand density variation in the LiDAR data.
The results using real data show that the proposed method generated more detailed boundaries even for complex buildings composed of convex and concave segments, and with inner boundaries.Additionally, it was observed that if a considerable density variation occurs within a building, a single alpha value might be insufficient.
As for future research, the application of this method is suggested considering the same area with different point density.The development of a method that can overcome the problem highlighted in Section 3.2, for situations where there is a significant density variation within a single building, is also recommended.

Figure 3 .
Figure 3. Representation of triangular mesh before (a) and after (b) the execution of procedure for the detection and elimination of outliers has been used in many tasks to obtain the approximate building boundaries.The general idea consists of the selection of Roof

Figure 4 .
Figure 4. Alpha shape algorithm criterion.Points pi and pj form a boundary segment (a), whereas pi and pj do not form a boundary segment (b).

Figure 5 .
Figure 5. Two clippings of areas from the dataset used.Points sampled over buildings and ground (a), and vegetation (b)

Figure 6 .
Figure 6.Images of the buildings selected for the experiments (a-d, and f-g) and the profile from LiDAR data considering the segment A-B drawn on Building 4 (e)