A GENERAL DATA-DRIVEN ALGORITHM FOR FAÇADE STRUCTURE MODELING USING GROUND BASED LASER DATA

Façade reconstruction from laser point clouds has been an active subject in the photogrammetric community for the last two decades. However, due to the variety of architectural types and the nature of laser data, proposing a fully automatic modelling algorithm is still a challenge. Irregular architecture, density variation, occlusion and noise are the main factors that hinder a general model for façade reconstruction. This paper describes the steps of an automatic data-driven method which starts from raw laser data and ends with object extraction. Statistical analysis is used throughout segmentation, splitting-line detection and object characterization. A rule-based modification method is employed to model the complexity of the façade layout. The developed interface enables a non-expert user to interact with the modelling process by setting a few parameters. The method was tested on four real datasets.


INTRODUCTION
Building reconstruction is a wide field of study concerned with transforming building measurements into 3D models. Among the reconstruction methods developed throughout the last decade, image-based and laser-based methods have received the most attention. In spite of great improvements in these methods, a large number of problems remain, which keeps the subject in the focus of the scientific community. Most methods concentrate on specific parts of the building modelling problem. Thus, there is a need for a general method which benefits from different approaches. Moreover, technological advances such as the shift from terrestrial to mobile laser scanning on the one hand, and increasing demand for 3D as-built models on the other, call for more efficient and intuitive methods with a high degree of automation.

Previous studies
Given the large volume of work on façade modeling, we refer the reader to recently published literature for a comprehensive view of the subject (Haala and Kada 2010; Musialski et al. 2012; Tang et al. 2010; Vanegas et al. 2010; Vosselman and Maas 2010). Coming from fields such as photogrammetry, computer graphics and civil engineering, these reviews discuss the technology gaps and future demands.
A considerable part of the research in recent years has been dedicated to topology extraction, which is also our focus. In these studies, data-driven and model-driven approaches were used to obtain a structured arrangement of façade elements from images and point clouds. The grammar-based building description in recently developed software packages such as "CityEngine" (Watson et al. 2008) was used for procedural modelling to reconstruct some cultural heritage buildings (Haegler et al. 2009). The output of our method can also be used for procedural modeling in such software. Müller et al. (2007) used the production rules of a grammar to subdivide a façade texture into elements such as floors, tiles, windows and doors from a single image, regardless of its resolution and orientation. They exploited mutual information to find similar floors and tiles by searching ordered sequences in the vertical and horizontal directions, respectively. For classification, they used the architectural parametric templates discussed in (Dick et al. 2004), aided by manual effort. Other image-based methods used similar ideas to develop a procedural method (Lipp et al. 2008; Ripperda and Brenner 2009). However, they rely on user interaction to solve the problem of irregularities. In contrast, we try to solve this problem automatically by adapting irregularities to a cellular representation.
Pauly et al. (2008) and Mitra et al. (2006) developed a computational framework to discover regular or repeated geometric structures in 3D space. They proposed a non-linear algorithm to define regularity using a combination of translation, rotation and scale functions. Although these methods were inspiring, they remain challenging when the number of symmetry groups increases. In a less complex setting, Becker and Haala (2009) introduced a cell decomposition technique to find reliable clues of splitting lines in a point cloud. They then used a derivation tree to describe the dataset as sequences of walls and windows. This work inspired our algorithm, which uses a cellular representation of the point cloud to extract splitting lines. Wang et al. (2011) also combined data-driven window localization with façade pattern inference to enhance the robustness of window detection. However, these methods are sensitive to data quality. Teboul et al. (2011) proposed a grammar-based segmentation method for images. They merged a bottom-up classification with a shape grammar, using a random exploration method for optimization. Wan and Sharf (2012) exploited the same idea to decompose scanned façades into their basic shapes using more flexible grammar rules. They used a consolidation method (Zheng et al. 2010) with some manual effort to transfer 3D points to 2.5D space and obtain a grammatical description. Our method is close to this work in that it decomposes the dataset into depth layers.
A considerable number of studies focused on individual object extraction from laser data (Pu and Vosselman 2006; Pu and Vosselman 2007; Pu 2008; Pu and Vosselman 2009; W. Shen 2008). Specifically, they used the holes in the dataset as a clue to detect windows. Alternatively, we use the data remaining after pre-processing as more reliable evidence to extract the windows.
Fewer researchers have used a data-driven approach to detect regularity in the façade area. They tried to find windows in raw laser data, aiming to detect regularity and to fill the gaps in the façade surface (Friedman and Stamos 2011; Friedman and Stamos 2012; Mesolongitis and Stamos 2012). They took advantage of repeated elements to classify and predict suitable elements for missing parts. Shen et al. (2011) proposed a data-driven method to fit an adaptive structure to the façade. They used penalty functions to find the positions of the splitting planes based on linear feature extraction from the dataset. After an automatic recursive process of element grouping and splitting refinement, the output was a partitioned and classified dataset. Our method is close in spirit to this work, as we use an adaptive splitting approach, but with more robust clues derived from the data.

Overview of the algorithm
Finding a general grammatical description applicable to all façade types is an open question in automatic modeling. The main challenges are the diversity of façade structures, the non-uniform density of points, missing parts due to occlusion, and a rather large amount of noise and outliers in the datasets. Our proposed solution to structural diversity is to partition the original point cloud into non-terminal cells. The remaining problems stem from the nature of data acquisition, which affects the robustness of modeling algorithms. Tackling them requires reliable data processing methods that yield strong clues of actual building elements. The method proceeds in four levels. At the segmentation level, three depth layers are generated using a conditional RANSAC, and unwanted data are eliminated through filters. At the partitioning level, the best fitted decomposition array is generated through data analysis. At the structuring level, cell boundaries are modified using a set of rules, and topological relations are defined on the adapted cellular partitions. Characterising the objects located in the generated cells is the final stage of the modeling process.

PRE-PROCESSING
Due to variations in façade architecture, proposing a single modeling technique for all façade samples seems to be impossible. Most façades are composed of a main wall facing the street which holds building elements such as windows, doors, balconies and extrusions. In some cases this main wall is divided into two parallel parts, each containing part of the building elements. The proposed method relies on the number of 3D points collected from the main walls. Laser data collected from street level normally do not represent a uniform scan. Lower parts of the façade, which are usually close to the laser scanner, are recorded with higher density than higher parts. For this reason, point density is not a reliable indicator for a robust modeling method. Moreover, occlusions are a common problem in laser data modeling. The modeling method should be insensitive to the lack of data caused by obstacles such as cars, trees and traffic facilities. For these reasons, the following assumptions were made for common building samples:
• The façade is a single planar wall or is composed of parallel planar walls holding building elements such as windows, doors, balconies and extrusions.
• A considerable number of laser points has to be collected from the wall area.
• Building floors are represented by openings which are aligned horizontally, except for staircase windows.
Therefore, a cut of the façade that satisfies the above assumptions is the starting dataset. If the original dataset is a large geo-referenced point cloud, 2D cadastre maps could be used to automatically separate a single façade from the rest of the dataset. The dataset is then ready to go through the modeling process.

SEGMENTATION
The idea is to identify the points belonging to the following three segments: the main wall segment (MWS), the behind wall segment (BWS) and the front wall segment (FWS). Wall parameters such as position, orientation and dimensions are obtained from the main wall segment. The openings (windows and doors) are the main clue for façade partitioning and can be detected in the BWS.
The first and critical step of the segmentation process is the detection of the main wall plane. For this purpose, RANSAC was used as a robust primitive estimator to detect wall parameters in the noisy area of the façade (Schnabel et al. 2007). However, due to dataset variations and the problem of inlier detection, a conditional RANSAC was employed to increase the robustness of the method. The results of wall segmentation are presented in Fig. 3 (1-4a) in section 5.

Wall plane detection
The ideal plane is identified by exploring the histogram of point distances to a selected plane. In the regular RANSAC method, the plane with the minimum distance to the points represents the best fitted plane (BFP). According to the literature, the number of randomly selected triples is obtained from the following equation (Tarsha-Kurdi et al. 2008):

$$N = \frac{\log(1-\alpha)}{\log\left(1-(1-\varepsilon)^{s}\right)} \qquad (1)$$

where α is the minimum probability of finding at least one plane in the dataset (default value: 0.99), ε is the maximum percentage of points belonging to the same plane (default: 50%) and s is the number of parameters (s = 3). Based on prior knowledge, a normal façade wall is vertical. Therefore, the selected random points were restricted to triples with approximately horizontal normal vectors. This condition ensured that the majority of the selected points belong to the wall. As a result, the median of the minimum distances was taken as the BFP. The output consisted of a normal vector and the perpendicular distance to the origin of the coordinate system in object space.
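The trial count and the verticality condition can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation. It assumes the standard RANSAC trial-count formula with the defaults quoted above, a z-up coordinate system, and a hypothetical tolerance `max_tilt_deg` on how far a triple's normal may deviate from horizontal; the function names are illustrative only.

```python
import math

def ransac_trials(alpha=0.99, eps=0.5, s=3):
    """Trial count N = log(1 - alpha) / log(1 - (1 - eps)**s), rounded up."""
    return math.ceil(math.log(1.0 - alpha) / math.log(1.0 - (1.0 - eps) ** s))

def triple_normal(p1, p2, p3):
    """Unit normal of the plane through three 3D points, or None if collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # Cross product of the two edge vectors.
    n = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        return None
    return tuple(c / length for c in n)

def is_wall_candidate(p1, p2, p3, max_tilt_deg=5.0):
    """Accept a triple only if its normal is nearly horizontal (z is assumed up)."""
    n = triple_normal(p1, p2, p3)
    if n is None:
        return False
    # Angle between the normal and the horizontal plane (assumed tolerance).
    tilt = math.degrees(math.asin(min(1.0, abs(n[2]))))
    return tilt <= max_tilt_deg

print(ransac_trials())  # trials needed with the default alpha, eps and s
```

With ε = 50% the result does not depend on whether ε is read as the inlier or the outlier share, since both conventions give the same value at that point. Rejecting triples before plane fitting keeps the sampling budget focused on vertical planes.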

Extraction of wall points
The next step is the detection of all inlier points which belong to the main wall. Having the BWS and FWS free of wall points is essential for the next stage. Actual wall points do not lie exactly on an ideal plane. Therefore, the density histogram of point distances to the BFP was used to detect the wall boundaries with high certainty: moving from the peak of the histogram to the left and to the right (towards the front and the back of the wall), we took the first flat positions (zero slope) as the boundaries of the main wall (Fig. 1). Where the façade had two parallel main walls, they appeared as two main peaks in the density histogram. The developed function can detect further main peaks based on a defined threshold. The next main wall was then extracted through the same boundary detection procedure.
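The boundary search on the density histogram can be sketched as follows. This is a simplified illustration and not the authors' code: the bin width is a hypothetical user parameter, only the highest peak is handled, and "flat position" is interpreted as the first bin where the count stops decreasing.

```python
def depth_histogram(distances, bin_width):
    """Histogram of signed point-to-plane distances; returns (counts, lowest_edge)."""
    lo = min(distances)
    n_bins = int((max(distances) - lo) / bin_width) + 1
    counts = [0] * n_bins
    for d in distances:
        counts[int((d - lo) / bin_width)] += 1
    return counts, lo

def wall_bounds(counts):
    """Walk outward from the peak bin while the counts keep falling.

    Returns the (left, right) bin indices where the slope first becomes
    zero or positive, taken here as the main wall boundaries.
    """
    peak = max(range(len(counts)), key=counts.__getitem__)
    left = peak
    while left > 0 and counts[left - 1] < counts[left]:
        left -= 1
    right = peak
    while right < len(counts) - 1 and counts[right + 1] < counts[right]:
        right += 1
    return left, right
```

For example, `wall_bounds([0, 0, 1, 5, 20, 6, 1, 1, 3, 0])` stops at bins 1 and 6, the first flat positions on either side of the peak at bin 4. A second parallel wall could be handled by masking the detected interval and repeating the search on the remaining bins.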

Data reduction
After the wall points were removed, the points behind and in front of the wall appeared as clusters in 3D space. The remaining data usually contain irrelevant objects such as trees, traffic facilities and passers-by, as well as objects inside the building such as inner walls, all of which need to be removed from the dataset. Based on the data characteristics, the user defined a perpendicular distance to cut off irrelevant objects. Approximately half a meter to one meter is enough to keep the objects attached to the wall (Fig. 1). The next step was to remove the data located outside the wall boundaries. For this purpose, the MWS points were projected onto the BFP and the outline of the wall was extracted from the plane. The BWS and FWS points were also projected onto this plane. Points outside the extracted boundary were removed as irrelevant.
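A minimal sketch of the depth-based split and the perpendicular cut-off is given below. It assumes a unit plane normal pointing towards the street, so that negative signed distances lie behind the wall, and it takes the wall thickness interval (`wall_lo`, `wall_hi`) from the histogram analysis. The names and the default `max_offset` of one meter are illustrative assumptions. Clipping against the projected wall outline is not shown.

```python
def signed_distance(point, normal, offset):
    """Signed distance of a 3D point to the plane normal . x = offset (unit normal)."""
    return sum(n * c for n, c in zip(normal, point)) - offset

def split_layers(points, normal, offset, wall_lo, wall_hi, max_offset=1.0):
    """Assign points to MWS, BWS or FWS and drop points too far from the wall.

    Assumes the normal points to the street side, so negative distances
    are behind the wall and positive distances are in front of it.
    """
    layers = {"MWS": [], "BWS": [], "FWS": []}
    for p in points:
        d = signed_distance(p, normal, offset)
        if wall_lo <= d <= wall_hi:
            layers["MWS"].append(p)
        elif wall_lo - max_offset <= d < wall_lo:
            layers["BWS"].append(p)
        elif wall_hi < d <= wall_hi + max_offset:
            layers["FWS"].append(p)
        # Anything farther than max_offset from the wall is discarded.
    return layers
```

Points such as a tree five meters in front of the wall, or an inner wall three meters behind it, fall outside both bands and are dropped.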

PARTITIONING
In the third level of detail described in CityGML (Gröger and Plümer 2012), a regular façade is represented as an array of openings and extrusions. On this surface, each horizontal row of windows and doors refers to a floor. The openings of different floors are usually also aligned vertically over the whole façade or part of it. Therefore, horizontal and/or vertical alignment of building elements with similar attributes is common in building architecture. However, the regular array form is not always the case. Variation in the size and position of openings is also common in existing architecture. For instance, staircases are the most common disturbing structures, as their windows are not aligned with the floors. Variations in the size and placement of windows may also cause vertical misalignments. Therefore, any general algorithm has to deal with the irregularities caused by these variations. In this section we describe a method to generate an initial structure by finding the best fitted cellular array. This model facilitates proximity analysis and the characterization of building elements. In the next step, the array is modified to cover the irregularities mentioned above.

Rasterizing
The behind wall segment (BWS), which contains laser points from the openings, is essential for the splitting process because these points form separate clusters. First, the 3D points were transformed to a 2D form. This transformation simplified the analysis by giving structure to the irregular point cloud. For this purpose, the BFP was rasterized at a user-selected resolution (e.g. 10 cm) and all BWS points were projected onto this plane. Each pixel was assigned a value of one or zero, based on the presence or absence of points. Another advantage of rasterizing the 3D points was the possibility of applying image processing functions to the dataset. In our application, the user could run a cleaning procedure to remove small objects from the dataset, similar to the literature (Khoshelham et al. 2010).
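The rasterization step can be sketched in a few lines. The sketch assumes the BWS points have already been projected onto the BFP and expressed in in-plane coordinates (u, v), with u horizontal and v vertical. The function name and the row ordering (row 0 is the lowest) are illustrative choices, and the small-object cleaning step is omitted.

```python
def rasterize(points_2d, resolution=0.10):
    """Binary occupancy grid from in-plane (u, v) coordinates.

    Returns a list of rows (row 0 = lowest v); a pixel is 1 if at least
    one projected point falls inside it and 0 otherwise.
    """
    u_min = min(u for u, _ in points_2d)
    v_min = min(v for _, v in points_2d)
    n_cols = int((max(u for u, _ in points_2d) - u_min) / resolution) + 1
    n_rows = int((max(v for _, v in points_2d) - v_min) / resolution) + 1
    grid = [[0] * n_cols for _ in range(n_rows)]
    for u, v in points_2d:
        grid[int((v - v_min) / resolution)][int((u - u_min) / resolution)] = 1
    return grid
```

Because a pixel only records presence or absence, the grid is insensitive to the density differences between the lower and upper parts of the façade.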

Initial structure
The splitting process is based on detecting the rows and columns with the fewest valid pixels (value 1). For this purpose, two density histograms were built in the horizontal and vertical directions. The horizontal histogram shows the sum of pixel values in the columns, and the vertical histogram shows the sum of pixel values in the rows. In façades with a regular array of openings, empty intervals were suitable places for splitting lines. However, for irregular structures, the histogram rarely had empty spaces (Datasets 3 and 4 in Fig. 3). Therefore, the positions with minimum values were the most likely splitting spots. Consequently, an extremum detection function was used. This function picks the minimum value that occurs after each maximum exceeding a user-defined threshold. To avoid over-splitting, the histogram can also be smoothed beforehand with a user-defined kernel, depending on the dataset. Any remaining over-splitting can be mitigated in the next stage. The generated lines form the best fitted regular matrix, which keeps the model topologically correct in the next step. The binary images and the related histogram analysis are presented for four real datasets (1-4b) in Fig. 3.
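The histogram analysis can be sketched as follows, under one possible reading of the extremum detection rule: a splitting position is the lowest bin between two consecutive runs of bins that exceed the threshold. The function names are hypothetical, smoothing is left out, and ties are resolved by taking the first minimum, which is a simplification.

```python
def column_sums(grid):
    """Horizontal histogram: number of occupied pixels in each column."""
    return [sum(col) for col in zip(*grid)]

def row_sums(grid):
    """Vertical histogram: number of occupied pixels in each row."""
    return [sum(row) for row in grid]

def split_positions(hist, threshold):
    """Indices of the lowest bins between consecutive above-threshold peaks."""
    splits = []
    seen_peak = False
    best = None  # index of the lowest bin since the last peak
    for i, value in enumerate(hist):
        if value > threshold:
            # A new peak closes the valley that followed the previous one.
            if seen_peak and best is not None:
                splits.append(best)
            seen_peak = True
            best = None
        elif seen_peak and (best is None or value < hist[best]):
            best = i
    return splits
```

For `hist = [0, 4, 5, 1, 0, 1, 6, 5, 2, 3, 7, 0]` and a threshold of 3, the splits fall at bins 4 and 8. Bins before the first peak and after the last one produce no line, since they border the façade edge rather than separate two elements.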

Final structure
The extracted matrix had to be modified according to the corresponding dataset. To do that, the initial splitting lines were overlaid on the binary image. To check and correct any interference between the raster image and the initial structure, horizontal and vertical histograms were generated for the pixels located between adjacent splitting lines. All detected interferences were treated according to the correction rules listed in Table 1. In the modified structure, the cell indexes remain the same as in the initial structure, while the corner points may be modified.
After the line corrections, the structure was stored by saving the positions of two corners (lower-left and upper-right) of each cell. These corrections, along with the corresponding rule numbers, are illustrated in Fig. 2. The expected result was a cellular representation which had no interference with the dataset. However, the adjacency of cells was disturbed in rare cases (Fig. 3. 2c). In most of these cases the reason was the presence of remaining wall points in the dataset. These disturbances indicate that the quality of the final structure depends strongly on the quality of wall extraction.

Feature extraction
The previous step forced the structure to fit the distribution of the data. In the ideal case, each cell includes a single building object which is isolated from the others by the cell boundaries. The binary image was used to detect the objects in the cells. The algorithm selected the cells one by one through their indexes. The values of the selected cell were kept and the rest of the binary image was switched off. The object could then be defined by the coordinates of its bounding box, which were extracted from the horizontal and vertical histograms. The façade information was thus characterized by saving the bounding box coordinates of the point clusters, linked to the corresponding cell indexes.
In some cases, multiple clusters of points were detected in one cell. This can be due either to missing data from a single object or to the existence of two objects in one cell (Fig. 3. 1c). If the algorithm detects two or more clusters in one cell, the largest cluster is taken as the main object and the rest are saved to be treated later.
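The bounding box extraction for a single cell can be sketched as below. The first and last non-empty row and column inside the cell play the role of the two histograms restricted to that cell. The sketch works in pixel indices with inclusive cell limits, uses hypothetical names, and does not separate multiple clusters within a cell.

```python
def cell_bounding_box(grid, row_lo, row_hi, col_lo, col_hi):
    """Bounding box of occupied pixels inside one cell (inclusive limits).

    Returns (row_min, col_min, row_max, col_max) in pixel indices,
    or None when the cell contains no occupied pixel.
    """
    # Rows and columns of the cell that hold at least one occupied pixel.
    rows = [r for r in range(row_lo, row_hi + 1)
            if any(grid[r][col_lo:col_hi + 1])]
    cols = [c for c in range(col_lo, col_hi + 1)
            if any(grid[r][c] for r in range(row_lo, row_hi + 1))]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1], cols[-1]
```

Multiplying the pixel indices by the raster resolution and adding the grid origin converts the box back to in-plane metric coordinates.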

EXPERIMENT AND DISCUSSION
The algorithm was implemented as a set of functions in Matlab. The developed functions were run on a system with the characteristics listed in Table 2.
System type: 64-bit operating system
Processor: Intel® Core™ i5 CPU M 540 @ 2.53 GHz
Memory: 4 GB
Graphics: NVIDIA Quadro FX 880M
Table 2. Running system specifications
An interface was also developed to enable the user to control the modeling procedure (Fig. 4). User interaction consisted of setting a few parameters to improve the quality of the result. These parameters are listed in Table 3. Automatic modeling took less than 5 minutes for a dataset of around a million points. The result was an indexed cellular representation of the façade with the corresponding coordinates of the included objects (Fig. 3. 1-4c).
Occlusion: The lack of data due to occlusion results in incorrect object attributes. However, further topological analysis using the cell indexes might mitigate the occlusion problem.
Noise level: The level of noise may be noticeable due to factors such as the distance from the façade and disturbance caused by irrelevant objects in the street, which may cause wrong object detection or incorrect attributes.

CONCLUSION AND FUTURE WORKS
This paper describes the steps of a general method for automatic façade modeling. An intuitive approach was taken to extract the main wall using statistical analysis. Clusters of points located behind the main wall were analyzed to detect splitting lines. These lines formed an initial cellular structure. Cell borders were modified through rule-based processing to fit the dataset. Further analysis extracted the object in each cell and characterized it by its corner coordinates. Compared to similar methods, the proposed method handles more complex façade layouts with a reasonably high level of automation. The presented experiments showed reasonable agreement with ground truth (Fig. 3).
In the future, the developed structure may be used as a guideline for object classification. Combining the algorithm with more model-driven methods could increase its efficiency and level of automation. The method might also be extended to cover more complex façade architectures.
line & the next cell is empty: Remove the line
4. No crossed object, over-sized object or empty space: No action

Figure 1: Schematic view of façade point segmentation using the point histogram.
Table 3. User interaction parameters
As an advantage, variations in point density do not noticeably affect the automatic modeling procedure. The quality of object extraction directly depends on the following factors:
Façade architecture: The developed algorithm automatically handles any planar façade and its parallel parts. Curvatures or angular breaks are normally treated as protrusions or intrusions.

Figure 2: Rule-based correction of the structure. Red dashes show the lines removed according to the corresponding rule number (Table 1).

Figure 4: Snapshot of the façade modeling interface

Table 1. Structure correction rules