ANALYSIS OF 3D BUILDING MODELS ACCURACY BASED ON THE AIRBORNE LASER SCANNING POINT CLOUDS

Creating 3D building models at large scale is becoming more popular and finds many applications. Nowadays, the broad term “3D building models” can be applied to several types of products: the well-known CityGML solid models (available at a few Levels of Detail), which are mainly generated from Airborne Laser Scanning (ALS) data, as well as 3D mesh models that can be created from both nadir and oblique aerial images. City authorities and national mapping agencies are interested in obtaining 3D building models. Apart from the completeness of the models, the accuracy aspect is also important. The final accuracy of a building model depends on various factors (accuracy of the source data, complexity of the roof shapes, etc.). In this paper, a methodology for inspecting a dataset containing 3D models is presented. The proposed approach checks every building in the dataset against ALS point clouds, testing both accuracy and level of detail. An analysis of the statistical parameters of the normal heights between the reference point cloud and the tested planes, combined with a segmentation of the point cloud, provides a tool that can indicate which buildings and which roof planes do not fulfil the requirements of model accuracy and detail correctness. The proposed method was tested on two datasets: a solid model and a mesh model.


INTRODUCTION
Building modelling for large areas (countries/cities) is an increasing trend in 3D visualisation, which is observed in many European countries. Therefore, city authorities and national mapping agencies are interested in obtaining 3D building models. However, country- or region-wide 3D modelling requires the determination of key aspects, i.e. the source data (LIDAR/aerial images), which are closely related to the methodology of building modelling, the final model accuracy, the model type standard, and the level of detail. Currently, the most popular standard for 3D models is an application schema referred to as CityGML (Kolbe et al., 2005). With this standard, all city landscape elements can be modelled. In CityGML, the Levels of Detail (LOD) are dedicated to building models (Biljecki et al., 2016a). These levels can be used in several variants and, because of the considerable interest in the standard for building models, the set of variants is still gradually being extended (Biljecki et al., 2016b).
Nowadays, with the growing accuracy and resolution of the remote sensing data used for building modelling, users pay more attention to the accuracy of the models. There are plenty of geometric features and aspects that can be analysed within an accuracy assessment, and it may therefore be difficult to include all of them in the evaluation workflow. Approaches to the assessment of building accuracy, also on a large scale, have been presented in the literature (Wong and Ellul, 2016).
In this article, the possibility of conducting an automatic accuracy assessment of 3D building models is presented. The analysed building models were generated from two different types of source data: classified Airborne Laser Scanning (ALS) point clouds (solid models) and oblique imagery point clouds obtained with dense image matching (mesh models). The methodology focuses on the evaluation of the roof structure. The accuracy assessment is based on the calculation of the distance between the roof surface and the reference point cloud for both types of models. In the next step, segments of the normalised point cloud are detected and, for each segment, statistical parameters are calculated.

Building modelling from ALS and oblique imagery
The ALS technique is still under development and makes it possible to acquire accurate geometric data about the terrain of a large area in a short period of time. In Poland, the entire country is covered with ALS data as a result of the ISOK project (ISOK: IT System of the Country's Protection against extreme hazards). These data are nowadays the main source for 3D modelling in Poland. The data are characterised by decimetre accuracy and are therefore the most accurate data which can provide the user with vertical information about the terrain and its cover for larger areas such as a city or a country (Kurczyński and Bakuła, 2013). Examples of the application of the ALS data collected within the ISOK project can be found in Polish literature (Cisło-Lesicka et al., 2014).
Different methods of building detection and modelling from ALS data have been presented in the literature (Kada and McKinley, 2009; Sampath and Shan, 2010; Sun and Salvaggio, 2013; Verma et al., 2006). Some of them rely on the integration of the ALS point clouds with aerial images in order to improve the geometric accuracy of the building models (Rottensteiner and Briese, 2003). Contrary to appearances, building modelling from ALS data is not an easy task due to the characteristics of the data. First of all, proper building modelling is possible only when the density of the point cloud is sufficient, i.e. at least 3-4 points per square meter (Forlani et al., 2006). However, considering the quality of ALS data collected with current technology, this requirement is not difficult to fulfil. Nevertheless, high discrepancies in point density within the point cloud can turn out to be an obstacle to the proper detection of roof surfaces (Tarsha-Kurdi et al., 2007). If a classified point cloud is used in building modelling, the correctness of the classification can affect the accuracy of the final models. Points which are not echoes returned from a building but were assigned to this class influence the accuracy of the roof surface detection. Additionally, for low buildings situated near high vegetation, problems with the indication of the roof edge may occur. When problems with roof edge detection appear, building outlines may be useful for reconstructing the shape. However, even when building contours are used, problems with proper roof shape detection may still occur if the point density is not sufficient.
Another obstacle to proper and automatic building modelling is the complex shape of roofs, particularly for relatively new single-family houses, which often have dormers of different shapes, as well as for multi-slope buildings, for which problems with the indication of the roof ridge and the slope of the roof surface may occur (Forlani et al., 2006). What is more, the ALS point cloud is characterised by a given vertical and horizontal accuracy, which directly impacts the accuracy of the 3D models. Finally, unfiltered noise points, as well as details on the roof surfaces, can decrease the accuracy of the roof shape reconstruction (Tarsha-Kurdi, 2008).
In this article, the accuracy of 3D mesh models is also analysed. These models were created from oblique aerial images. Dense image matching (DIM) methods make it possible to extract 3D information about surface geometry from oblique and/or nadir images (Liu and Guo, 2014). As a result, textured mesh building models are obtained. Oblique imagery makes it possible to register features which may be missing from nadir images. Additionally, it is possible to capture the building facades from multiple view angles. However, there are limitations to oblique image acquisition. In order to generate sufficient 3D point clouds from this imagery, a greater overlap is required, which leads to a larger number of flight lines, higher costs and significantly more images (Remondino et al., 2016).

Building accuracy assessment
Two approaches to the accuracy assessment of 3D building models can be found in the literature. The first approach consists in comparing the created building model to a reference model presented in the same form; e.g. in the pixel-based evaluation, the raster representations of the detection results and of the reference are compared (Rutzinger et al., 2009). This approach has commonly been used in benchmarks (Rottensteiner et al., 2013; Truong-Hong and Laefer, 2015).
The second approach consists in carrying out the accuracy assessment with LIDAR point clouds as reference data. Dorninger and Pfeifer (2008) suggested a method in which one of the crucial elements is the calculation of the distance between the 3D building model and the point cloud. Oude Elberink and Vosselman (2011) presented a more complex methodology of accuracy assessment, which is also based on the ALS point cloud. In that article, three measures are proposed. The first one is similar to Dorninger and Pfeifer's (2008) orthogonal vertical difference between the 3D model and the point cloud. Oude Elberink and Vosselman (2011) indicated that analyses based solely on the perpendicular distances between laser points and the model faces could be misleading, because most of the points will be close to the modelled planes, especially when buildings are modelled with a data-driven approach. Therefore, two more measures were recommended in the article. The first additional measure is the distance between each model vertex and the nearest point of the LiDAR point cloud, which is calculated in order to evaluate the distance between roof corners and the reference data. The second one evaluates the occurrence of segments which were detected in the point cloud with the method presented in Oude Elberink and Vosselman (2009) but were not used in the final building model.
A slightly different approach to the quality assessment of 3D building models was presented by Akca et al. (2010). Instead of directly calculating the distances between the 3D models and the point clouds, the authors used least squares 3D surface matching, which makes it possible to address three quality criteria: the accuracy of the reference system, the positional accuracy, and the completeness of a 3D model.

CAPAP project
In Poland, as part of the development of spatial information applications, after the acquisition of laser scanning data for the whole country with a density of 12 points per square meter for big cities and 4 points per square meter for smaller cities and rural areas (Kurczyński and Bakuła, 2013), it was decided that the next step would be to create a 3D landscape model, starting from building models at the LOD2 level according to the CityGML 2.0 standard. These plans are implemented as part of the project Public Administration Center for Spatial Analysis (CAPAP), organised by the Polish Head Office of Geodesy and Cartography (Stoter et al., 2016; Pilarska et al., 2017). This project provides a range of products and services related to spatial data, including 15 million buildings in the LOD2 standard based on the ALS elevation data and the building footprints from the topographic database BDOT10K. The selection of two specific data registers (ALS point clouds and topographic building contours) guaranteed the harmonisation of both sets and a certain method of generalisation appropriate to the LOD2 standard.
In the project, the entire country was divided into 5 areas, and for each area a contractor produces the 3D building models. A separate inspection performed by different contractors is also planned. Recommendations regarding the inspection of building models include quantitative verification (including checking the number of CityGML files and 3D products provided by the contractor) and qualitative verification (including checking the compliance of CityGML files by means of the XSD schema (GUGIK, 2017)). A detailed inspection of the 3D models is performed for at least 0.25% of the models, randomly and regularly located. The following parameters will be investigated for each selected building with reference to ALS-based, manually created planes:
- The minimum distance from any point of the roof plane in the tested building model to the corresponding plane manually fitted on the basis of ALS data, which should not exceed 1 m; this value may be exceeded by no more than 20% (1.2 m) for 5% of the number of 3D building models inspected as part of a sample;
- The difference in inclination between the roof plane in the tested building model and the corresponding plane manually fitted on the basis of ALS data, which should not exceed 5°; this value may be exceeded by no more than 20% (6°) for 5% of the number of 3D building models inspected as part of a sample;
- The maximum difference between the height of the 3D model of the tested building and the maximum height of the building measured on the basis of ALS data, which should not exceed 1 m; this value may be exceeded by no more than 20% (1.2 m) for 5% of the number of 3D building models inspected as part of a sample.
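The tolerance rule shared by the three parameters above can be illustrated with a short sketch. It assumes one possible reading of the requirement: no inspected building may exceed the limit by more than 20%, and at most 5% of the sample may lie above the nominal limit. The function name, its parameters and the sample values are hypothetical and are not part of the CAPAP specification.

```python
def sample_within_tolerance(deviations, limit, relax=0.20, max_share=0.05):
    """Check a sample of per-building deviations against a CAPAP-style tolerance.

    Assumed reading: no deviation may exceed limit * (1 + relax), and at most
    max_share of the sample may lie above the nominal limit.
    """
    if not deviations:
        raise ValueError("the sample must contain at least one building")
    hard_limit = limit * (1.0 + relax)
    # Any value beyond the relaxed limit fails the whole sample.
    if any(d > hard_limit for d in deviations):
        return False
    # Count the buildings that lie above the nominal limit.
    exceeding = sum(1 for d in deviations if d > limit)
    return exceeding <= max_share * len(deviations)


# Hypothetical sample: 40 roof-plane distances in metres, one slightly above 1 m.
distances = [0.4] * 39 + [1.1]
print(sample_within_tolerance(distances, limit=1.0))
```

The same function could be applied to inclination differences by passing `limit=5.0` with values in degrees.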
This inspection, ordered from an external company, is well suited to the implementation of national projects. However, checking the selected sample is difficult to perform in a fully automated manner, especially because it is predominantly based on manually measured reference planes. This limits the applicability of the approach for ensuring the quality of the entire data set.

Motivation
In a countrywide project such as CAPAP, there is no possibility to verify the accuracy of all buildings. According to the Specification of Essential Terms of the Order, only 0.25% of the building models will be examined within the accuracy assessment process. It means that out of a total of 15 million buildings, only about 38 thousand (0.25% of 15 million is 37,500) will be checked. Additionally, the evaluation process will be almost fully manual. The proposed methodology enables an automatic accuracy assessment of the building models, which makes it possible to evaluate the generalisation level and small outlying objects. As a result, it is possible to gain deeper knowledge about the completeness and quality of every single building model.
As for mesh models, this type of 3D modelling has gained high recognition among remote sensing and GIS specialists over the past few years. Therefore, it seems important to be able to evaluate the accuracy of these models as well.

DATA AND METHODS
In the experiment, two test fields from Polish cities were examined: part of Warsaw (504 solid LOD2 CityGML models) and part of Katowice (2059 mesh models). For the LOD2 models, the reference data are the same data that were used for generating the building models. For the mesh models, the reference LIDAR data were acquired simultaneously with the oblique images used for 3D mesh generation. As a result, there is no time difference between the 3D models and the reference data, and there was no need to include change detection in the accuracy assessment methodology.
What is more, in the data used, there were no problems with the relative orientation of the models and the reference data. Problems with the relative reference of the data were described in Akca et al. (2010).

Data
The solid building models were created for part of Warsaw from the classified Airborne Laser Scanning (ALS) point cloud and building contours. The accuracy assessment was conducted based on the reference ALS data, which were also used for 3D modelling. The point density of the data was 12 points per square meter. The buildings in the selected study area are diverse; there are single-family houses with a more complex roof structure as well as multi-family houses.
The 3D meshes for the centre of Katowice were created from airborne oblique images acquired in 2014. The dataset was described in detail in Ostrowski (2016); the images were oriented in Pix4D and the 3D mesh was generated in ContextCapture. The reference ALS point cloud, with a density of 8 points per square meter, was obtained simultaneously with the oblique images.

Methodology
In this section, the methodology dedicated to the automatic accuracy assessment of building models is described. The methodology includes the assumptions of the CAPAP project, and the LOD2 requirements and generalisation level are also examined. The proposed methodology appears to be easily applicable to other types of 3D models; therefore, it can also be used to evaluate the accuracy of 3D mesh models from aerial oblique imagery.
According to the studies presented in the literature (Akca et al., 2010; Oude Elberink and Vosselman, 2011), the following criteria of the accuracy assessment of 3D building models can be outlined:

Global positional accuracy
According to Akca et al. (2010), due to differences in production techniques, the reference frames of the input and verification datasets may differ from each other and lead to positional shifts and angular tilts. This issue may occur if the data used for building modelling and/or accuracy assessment were not collected simultaneously. Thus, there may be a need to translate and/or rotate the data in order to minimise the influence of the global positional accuracy on the accuracy assessment results.

Local positional accuracy
This feature shows the correctness of modelling the individual planes that the building consists of. In the second approach presented in Akca et al. (2010), two measures are used to define the local positional accuracy. One of them is the 3D Euclidean distance vector between the plane and the point cloud, while the second one is the Euclidean distance for every point of the cloud. Oude Elberink and Vosselman (2011) also use the distance between individual points and the model; furthermore, they use the distances between the vertices of the model and their nearest points in the reference point cloud in order to determine the accuracy of the positions of the vertices themselves.
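The two kinds of distance mentioned above can be sketched in a few lines of Python. This is only an illustration of a point-to-plane distance and a vertex-to-nearest-point distance, not the exact formulation of Akca et al. (2010) or Oude Elberink and Vosselman (2011); the function names and the tuple-based point representation are assumptions made for this example.

```python
import math


def signed_plane_distance(point, plane_point, normal):
    """Signed distance of a 3D point from a plane, measured along the plane normal."""
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("the normal vector must be non-zero")
    # Project the vector from the plane to the point onto the unit normal.
    dot = sum((p - q) * c for p, q, c in zip(point, plane_point, normal))
    return dot / length


def nearest_point_distance(vertex, cloud):
    """Euclidean distance from a model vertex to its nearest reference point.

    Brute-force search; a spatial index would be needed for real point clouds.
    """
    return min(math.dist(vertex, p) for p in cloud)


# Hypothetical horizontal roof plane at z = 1 m and one laser point at z = 2 m.
print(signed_plane_distance((0.0, 0.0, 2.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)))
```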

Completeness
This measure is used to determine whether all building elements have been modelled correctly. According to this measure, two types of errors can be distinguished: omission errors (i.e., a missing piece of a building model) and commission errors (an object that is not a building is treated as a fragment of the building model). As in the case of the ISPRS benchmark (Rottensteiner, 2013), completeness can be calculated on different levels (features, plane, surface). However, in LOD2 models only those building elements which fulfil the element size requirements need to be modelled. If such elements are not included in a model, they should be treated as 'false negative' in the completeness calculation.
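As an illustration, the sketch below uses the commonly used definitions of completeness, TP / (TP + FN), and correctness, TP / (TP + FP), and applies the size rule described above: an unmodelled element counts as a false negative only if it is large enough to require modelling. The function names, the `(area, is_modelled)` representation and the `min_area` parameter are assumptions made for this example.

```python
def completeness(true_positive, false_negative):
    """Share of required elements that are present in the model (omission side)."""
    total = true_positive + false_negative
    if total == 0:
        raise ValueError("no reference elements to evaluate")
    return true_positive / total


def correctness(true_positive, false_positive):
    """Share of modelled elements that exist in the reference (commission side)."""
    total = true_positive + false_positive
    if total == 0:
        raise ValueError("no modelled elements to evaluate")
    return true_positive / total


def count_required_elements(elements, min_area):
    """Count true positives and false negatives for (area_m2, is_modelled) pairs.

    Unmodelled elements smaller than min_area are ignored, because the
    generalisation rules do not require them to be modelled.
    """
    tp = fn = 0
    for area, is_modelled in elements:
        if is_modelled:
            tp += 1
        elif area >= min_area:
            fn += 1
    return tp, fn


# Hypothetical roof elements: two modelled, one large omission, one small omission.
roof = [(20.0, True), (30.0, False), (2.0, False), (18.0, True)]
tp, fn = count_required_elements(roof, min_area=16.0)
print(completeness(tp, fn))
```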

Assumptions of the proposed methodology
The methodology presented in this article is focused on assessing the completeness of the building models. Because the building models in the CAPAP project must strictly correspond to the building contours, examining the distances between the roof surface vertices and the ALS point cloud, as in Oude Elberink and Vosselman (2011), would be inconclusive.
What is more, due to synchronous data acquisition or as a result of creating the building models from the same data that serve as a reference, the step of determining the global location correctness (Akca et al., 2010) was also omitted. However, this step can easily be introduced into the methodology by adding it before the calculation of the distance between the point cloud and the models.
In Figure 1, the workflow of the proposed methodology is presented. In the first stage, reference points from the ALS data are assigned to the appropriate objects. In the case of solid LOD2 models, the reference objects are roof planes, and for the mesh models, building contours from the national topographic geodatabase are used. As a next step, for every ALS point, the distance from the building model (LOD2/mesh) is calculated along the normal vector to the plane in the model. Next, the points are divided into two groups with a threshold, which is related to the accuracy of the ALS data. A similar approach was presented in Dorninger and Pfeifer (2008) and Oude Elberink and Vosselman (2011). The threshold value might vary because of the quality of the reference ALS data or the expected model quality. Next, the point cloud is segmented in the OPALS software with a conditional segmentation, which assumes that the points belonging to one segment have to be located within a short horizontal distance of each other (0.5 m) and have to be assigned to the same group according to the threshold value (i.e., the normalised distance for all the points in the segment has to be either lower or higher than the threshold). The minimum number of points in a single segment is 10. Finally, for each segment, the following statistics are calculated: standard deviation, root mean square error (RMS), quantiles of normal distribution, segment area using the alpha-shape algorithm, and the minimum, maximum, mean and median values of the normalised distance.
Figure 1. Workflow of the proposed method of building quality assessment (yellow: parts unique for CityGML models, green: parts unique for 3D meshes, blue: parts common for both types of building models).
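The thresholding step and the per-segment statistics of the workflow can be sketched as follows. This is a minimal illustration, not the OPALS processing chain: the segmentation itself and the alpha-shape area are not reproduced, the threshold is applied to the absolute value of the normalised distance (an assumption), and empirical 5% and 95% quantiles together with the population standard deviation stand in for the statistics named in the text. All function names are hypothetical.

```python
import math
import statistics


def split_by_threshold(distances, threshold):
    """Split normalised distances into points close to the model and points far from it."""
    within = [d for d in distances if abs(d) < threshold]
    beyond = [d for d in distances if abs(d) >= threshold]
    return within, beyond


def segment_statistics(distances):
    """Summary statistics of the normalised distances in one segment (needs >= 2 points)."""
    n = len(distances)
    # 19 cut points for n=20; the first and last are the 5% and 95% quantiles.
    cuts = statistics.quantiles(distances, n=20, method="inclusive")
    return {
        "count": n,
        "mean": statistics.fmean(distances),
        "median": statistics.median(distances),
        "min": min(distances),
        "max": max(distances),
        "std": statistics.pstdev(distances),
        "rms": math.sqrt(sum(d * d for d in distances) / n),
        "q05": cuts[0],
        "q95": cuts[-1],
    }


# Hypothetical normalised distances in metres for the points of one roof plane.
close, far = split_by_threshold([0.02, -0.05, 0.10, 1.45, 1.52, 1.60], 0.20)
print(len(close), len(far))
print(segment_statistics(far)["rms"])
```

In the described workflow, each group would additionally be split into spatially connected segments before the statistics are computed.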
In Figure 2, the results of the aforementioned steps of the proposed methodology are presented.
The calculated statistics are used to determine the final quality parameters, which can then be applied to individual planes and also to entire buildings. In this experiment, the roof planes were divided into three classes of accuracy:
1) Class 1: all segments which belong to the particular plane fulfil the assumption that the RMS value is lower than the threshold. It means that no omission errors were detected within the given roof plane;
2) Class 2: not all segments of the plane fulfil the requirement described in Class 1, but none of the segments meets the conditions of the LOD2 specification, i.e. the omission errors detected within the plane fulfil the generalisation assumptions of the accepted standard (minimum detail size, area, height, etc.);
3) Class 3: at least one segment does not meet the condition of Class 2, i.e. omissions were found which, according to the accepted standard (minimum detail size, area, height, etc.), should be included in the model.
To understand the results of the inspection of 3D models, it is worth noting that Classes 1 and 2 indicate buildings that are, respectively, correctly modelled without generalisation and correctly generalised during modelling. Class 3 indicates buildings that were incorrectly generalised.
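The three classes can be expressed as a small decision rule. The sketch below is an interpretation of the class definitions rather than the authors' implementation: each segment is assumed to be a dictionary with the keys `rms`, `q05`, `q95` and `area`, and the size limits are passed as parameters (the example call uses the 0.20 m, 16 m² and 1 m values reported for the Warsaw test area).

```python
def classify_plane(segments, rms_threshold, min_area, min_height):
    """Assign a roof plane to Class 1, 2 or 3 from the statistics of its segments."""
    # Class 1: every segment stays within the RMS threshold, so no omission is detected.
    if all(s["rms"] < rms_threshold for s in segments):
        return 1
    for s in segments:
        # Height of the omitted detail, estimated from the 5% / 95% quantiles.
        height = max(abs(s["q05"]), abs(s["q95"]))
        # Class 3: the omitted detail is large enough that it should have been modelled.
        if s["area"] > min_area and height > min_height:
            return 3
    # Class 2: omissions exist, but all of them fall under the generalisation limits.
    return 2


# Hypothetical plane with one well-fitting segment and one small chimney-like segment.
plane = [
    {"rms": 0.08, "q05": -0.10, "q95": 0.12, "area": 85.0},
    {"rms": 1.40, "q05": 1.10, "q95": 1.70, "area": 2.5},
]
print(classify_plane(plane, rms_threshold=0.20, min_area=16.0, min_height=1.0))
```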

The results for the solid model
For the Warsaw test area, 504 buildings were modelled according to the LOD2 standard with 2342 planes. The segmentation process was conducted with a threshold of 0.20 m, which is equal to the expected accuracy of the ALS data. As a result, 6918 segments were created (Tab. 1). In Figure 3, the number of segments per reference roof plane is presented. The more segments result from the segmentation of a plane, the more complicated the modelled building is likely to be and the more complex the accuracy analysis becomes. The accuracy assessment of the planes for the area of Warsaw was conducted by adopting the following thresholds. Class 1 included all those planes for which the RMS of every segment was lower than 0.20 m (37% of all planes). In accordance with the CAPAP expectations, the model should include all those roof elements for which the horizontal dimension exceeds 4 by 4 meters (16 m²) and the height difference from the surrounding roof elements is greater than 1 meter. Therefore, Class 2 included all the planes for which at least one segment did not meet the conditions of Class 1, but no segment combined an area exceeding 16 m² with a height, determined on the basis of quantile 05 or 95, exceeding 1 m. As a result, 55% of the planes were assigned to Class 2. These roof planes were properly generalised and modelled in accordance with the LOD2 standard. Quantiles of the normal distribution were used instead of the minimum and maximum values in order to remove the outliers. In Class 3, there were 190 planes (8%), for which at least one segment was found to be equal to or to exceed the assumed threshold values (Table 2).
Table 2. Results and assumptions of planes classification into 3 classes for the Warsaw test area (solid models). Class 3: at least one segment has an area above 16 m² and the quantile 05 or 95 has a distance from the plane higher than 1 m.
The classification was also conducted for individual buildings (Fig. 4, Table 3), thus aggregating the results of the surface quality assessment. There were 39 buildings (8%) assigned to Class 1, i.e. buildings for which all planes were in Class 1, so no omission error was noticed. Class 2 includes 288 buildings (57%) for which at least one plane was in Class 2 but none of the planes belonged to Class 3, i.e. at least one omission was found but it was too small to classify it as an error of detail correctness. If at least one plane belonged to Class 3, the building did not meet the requirements of accuracy and detail correctness. An example of the quality assessment of an individual building model is shown in Figure 5. The accompanying statistical parameters for segments are summarised in Table 5.
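The aggregation from planes to buildings described above amounts to taking the worst class among the planes of a building. The sketch below illustrates this and recomputes the rounded percentage shares from the reported building counts (39, 288 and 177 out of 504); the function names are hypothetical.

```python
from collections import Counter


def classify_building(plane_classes):
    """A building inherits the worst (highest) class found among its roof planes."""
    if not plane_classes:
        raise ValueError("a building needs at least one classified plane")
    return max(plane_classes)


def class_shares(building_classes):
    """Return {class: (count, rounded percentage)} for Classes 1 to 3."""
    counts = Counter(building_classes)
    total = len(building_classes)
    return {c: (counts[c], round(100 * counts[c] / total)) for c in (1, 2, 3)}


# Building counts reported for the Warsaw test area.
warsaw = [1] * 39 + [2] * 288 + [3] * 177
print(class_shares(warsaw))
```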

The results for the mesh model
For the Katowice mesh model (Fig. 6), there was no possibility to distinguish planes and classify them; therefore, the accuracy assessment was conducted for individual buildings. The results of the mesh building model classification are presented in Table 4. Similarly to the CityGML model in the Warsaw area, three classes were specified. Class 1 included all the buildings for which the RMS value of every segment was less than 0.3 m (40% of the buildings). For Class 2, requirements similar to those for solid models were adopted, i.e. Class 2 included all the buildings for which at least one segment did not meet the Class 1 requirements, but no segment combined an area larger than 16 m² with a height, determined on the basis of the 05 or 95 quantile, exceeding 1 m. A total of 50% of all buildings were classified into this class. In Class 3, there were 205 buildings (10%) for which at least one segment was found not to meet the requirements of Class 2.

Class   Buildings number   Assumption
1       819 (40%)          for all segments within a building, the RMS is lower than 0.
Table 5. The quality assessment of individual building model, statistical parameters for segments (Fig. 5c).
Figure 6. The presentation of the qualitative assessment results of the mesh models for individual buildings in Katowice test area: Class 1: blue, Class 2: green, Class 3: red (see Table 3).

DISCUSSION AND CONCLUSIONS
In the presented methodology, the normal distance between the point cloud and the roof surface is used for the accuracy assessment of the building models. Such an approach has already been used (Dorninger and Pfeifer, 2008; Akca et al., 2010; Oude Elberink and Vosselman, 2011). However, one of the problems with the distance calculation is that most of the points are located close to the reference plane (Oude Elberink and Vosselman, 2011), and that the detection and evaluation of omission errors need to be conducted semi-manually (Akca et al., 2010). Similar problems were observed in preliminary studies (Pilarska et al., 2017), where segmentation was not performed and all points were assigned to a plane and took part in the accuracy analysis.
In the proposed methodology, the normalised point cloud is segmented with a conditional segmentation and a single threshold connected with the ALS accuracy. The normalised point cloud is created based on the distances between the 3D models and the reference points from the ALS. Thanks to the statistical analysis of the points in particular segments, it is possible to find roof elements which were omitted during the modelling process, and to verify whether these omissions were consistent with the expectations of the customer. In contrast to commonly used methods (Rutzinger et al., 2009; Rottensteiner, 2013; Rottensteiner et al., 2013), this solution does not require any reference models. An ALS point cloud is used as reference; therefore, the only limiting factors are the point density and the correctness of the point cloud classification. Such data should be available because in most cases they have already been used to create the 3D building models. Therefore, it is possible to evaluate the accuracy of all 3D building models and not only a selected sample, as is the case with an assessment using reference 3D models. Such a large-scale accuracy analysis can be useful nowadays, when many institutions and countries are interested in obtaining 3D building models.
A simple segmentation method is used in the accuracy assessment rather than a more complex one, e.g. surface detection, because a more complex method might effectively require all the building models to be remodelled. Although the surface detection approach is used by Oude Elberink and Vosselman (2011), in the authors' opinion the use of this type of method in a country-scale project, where data are provided by many contractors using different methods, would raise a question about the reliability of the surface detection algorithm used during the assessment process.
The proposed methodology was also applied to mesh models and was easily adapted for assessing the accuracy of such models. Additionally, the authors see the potential of this approach for the increasingly popular mesh models because of the growing interest in oblique images. The accuracy and building classification results indicated that mesh models are characterised by sufficient accuracy for many applications. Additionally, the results showed that the proposed methodology might be easily applied to other models and not only to solids.

Figure 2. Example results of the following steps from the proposed method for both types of model (a) solid model and (b) the 3D mesh.From top to bottom: model, reference point cloud, roof planes (only for solid model), distances between points and model, point cloud with applied threshold, segments.

Figure 3. Graph showing the relation between the number of planes and segments in the point cloud per plane.
Table 3. The results of the qualitative assessment and the assumptions for LOD2 models for the Warsaw test area.
Class   Buildings number   Assumption
1       39 (8%)            all planes were in Class 1
2       288 (57%)          at least one plane was in Class 2 but none of the planes belonged to Class 3 (see Tab. 2)
3       177 (35%)          at least one plane belonged to Class 3

Figure 4. The presentation of the qualitative assessment results of LOD2 models for individual buildings in the Warsaw test area: Class 1: blue, Class 2: green, Class 3: red (see Table 3).

Figure 5. The quality assessment of individual building model. (a) Distances between points and model, (b) roof planes, (c) segments.

Table 1. General segmentation statistics for both test areas.

Table 4. The results of the qualitative assessment and the assumptions for the 3D mesh building models for the Katowice test area.