AUTOMATIC 3D BUILDING RECONSTRUCTION FROM A DENSE IMAGE MATCHING DATASET

Over the last 20 years the demand for three dimensional (3D) building models has resulted in a vast amount of research being conducted in attempts to automate the extraction and reconstruction of models from airborne sensors. Recent results have shown that current methods tend to favour planar fitting procedures from lidar data, which are able to successfully reconstruct simple roof structures automatically but fail to reconstruct more complex structures or roofs with small artefacts. Current methods have also not fully explored the potential of recent developments in digital photogrammetry. Large format digital aerial cameras can now capture imagery with increased overlap and a higher spatial resolution, increasing the number of pixel correspondences between images. Every pixel in each stereo pair can also now be matched using per-pixel algorithms, which has given rise to the approach known as dense image matching. This paper presents an approach to 3D building reconstruction that aims to overcome some of the limitations of planar fitting procedures. Roof vertices, extracted from true orthophotos using edge detection, are refined and converted to roof corner points. By determining the connection between extracted corner points, a roof plane can be defined as a closed cycle of points. Presented results demonstrate the potential of this method for the reconstruction of complex 3D building models at CityGML LoD2 specification.


INTRODUCTION
The demand for three dimensional (3D) building models has increased over the last two decades, for applications such as asset management, energy modelling and navigation. Due to the need for up-to-date and readily available 3D models, a vast research effort has focussed on developing an automated workflow for 3D building reconstruction. The success of such approaches is often assessed through the level of detail and accuracy achieved, as defined by the Open Geospatial Consortium (OGC) CityGML standard (Gröger and Plümer, 2012). CityGML defines five Levels of Detail (LoD) starting from DTMs (LoD0), and advancing to buildings with interior rooms and façade details (LoD4) (Gröger and Plümer, 2012). 3D building models can be simplified by modelling the roof as a flat roof, defined by LoD1. These simple 3D shapes can be easily reconstructed automatically by applying a single, constant height to building footprints. Examples of this include Ordnance Survey (OS) MasterMap Topography Layer - Building Height Attributes for the UK, and the Dutch Kadaster, which offers countrywide LoD1 building models of the Netherlands (Ordnance Survey, 2014a; Stoter et al., 2014).
Investigations into LoD2 reconstruction, where roof geometry is also modelled, have been successful only in the case of large buildings with simple roof structures (Rottensteiner et al., 2014). Many of the currently proposed methods for LoD2 reconstruction tend to favour lidar as the primary data source, either in the form of point clouds or raster DSMs, for the segmentation of roof planes. These approaches tend to suffer from under-segmentation, with small roof features either not being modelled or causing reconstruction errors within dominant roof planes, which are potentially due to limitations in the point density of the lidar point clouds (Rottensteiner et al., 2014). Few methods have utilised image-based point clouds, and the high spatial resolution offered by dense image matching, with densities now equal to or greater than that typically provided by lidar data capture (Rottensteiner et al., 2014).
The production of dense image-based point clouds has been made possible through recent developments in aerial image capture and data processing. The capture of imagery from large format digital aerial cameras has seen an increase in image footprint and radiometric resolution, whilst simultaneously improving the spatial resolution of the ground pixels. The increase in image footprint size means that much higher overlaps, typically 80% fore/aft and 60% lateral, can be achieved compared to conventional film-based aerial image capture (Haala, 2011). The increased image overlap means a ground pixel can now typically be observed in as many as 15 overlapping images. Whilst this increases the likelihood of a successful pixel correlation, at the same time, algorithms have been developed which now allow pixel-to-pixel matching, thus leading to the term dense image matching. A popular example of this is Semi-Global Matching, which calculates and minimises cost functions to match corresponding pixels (Hirschmüller, 2008). The results of pixel-to-pixel matching allow the production of image-based point clouds at the same spatial resolution as the captured imagery. This offers the potential to overcome roof plane completeness errors which can often occur in lidar-based reconstruction due to lower point density (Leberl et al., 2010; Rottensteiner et al., 2014). Other by-products of pixel-to-pixel matching include DSMs and true orthophotos with sharp image boundaries along roof edges and high levels of roof detail.
This paper addresses 3D reconstruction by extracting 3D roof vertices and developing a network with topological connectivity. Information extracted from the true orthophoto, DSM and image-based point cloud is integrated to determine the connection between roof corners in order to form closed cycles of roof planes. The paper is structured as follows: Section 2 discusses previous work on the reconstruction of 3D building models; Section 3 describes the test site and datasets; Section 4 outlines the methodology used for the extraction and reconstruction of the roof geometry; Section 5 presents the results achieved to date and Section 6 draws some preliminary conclusions from this research, describes ongoing endeavours and makes suggestions for future work.

RELATED WORK
Successful automated reconstruction of 3D buildings with correct roof geometry is dependent on the quality of the extracted features. The geometry of a roof can be described by the number and the shape of roof faces, thus most methods aim to classify roof planes from the input dataset. This can be done either using feature-based or area-based methods. Feature-based methods aim to extract edges and points to reconstruct the 3D geometry so tend to be applied to aerial photography. A roof plane can be described as a closed polygon consisting of n linear segments made up of v_{n,1} vertices (Brenner, 2001). This network-based approach is the basis of manual extraction from stereo-imagery (Gruen and Wang, 1998), as well as being implemented into proposed workflows. Many developed methods have implemented low-level feature extraction procedures to determine edges and points, which generally require refinement before being used for reconstruction. Wang (2012) manually refined Canny edge and Moravec point detection to remove false positives and then reconstructed the roof geometry by computing the relations between roof corners and roof edges. Rau (2012) refined manually extracted structure lines to remove dangles, connect neighbouring walls, and remove structural lines that pierced other lines before creating TINs to determine the planar parameters between the structure lines.
Researchers have strived to remove the need for manual intervention by applying hypotheses to the reconstructed roof shapes. However, these hypotheses can severely restrict the reconstructed geometry. Melnikova and Prandi (2011) constrained reconstruction to square roofs with 90° corner angles and ridge roofs where three corners could form a triangle. Woo et al. (2010) refined detected Canny edges by clustering lines that were parallel or perpendicular with a ±10 degree threshold to reconstruct rectangular planes. Whilst the reconstruction was successful, given an average error of 0.38 m when comparing extracted lines to ground truth lines, the developed method was only applied on synthetic images and again struggled to reconstruct non-rectangular roofs.
Because of the aforementioned issues, many researchers are tending to favour area-based reconstruction, which aims to segment regions based on a similarity measure. As concluded by Rottensteiner et al. (2014), in summarising the outcomes of the recent ISPRS benchmark assessment of 3D building reconstruction, this area-based reconstruction tends to favour the use of lidar data, in the form of point clouds or raster DSMs. Points can be clustered into planes based on similar attributes such as normal vectors (Nex and Remondino, 2012), distance to a localised fitted plane (Abdullah et al., 2014; Oude Elberink and Vosselman, 2009), or height similarities (Sohn et al., 2012). This clustering is performed using methods such as region-growing from seed points, 3D Hough-transform or the RANSAC algorithm (Novacheva, 2008; Perera and Maas, 2014). Planes can also be segmented by classifying and combining cross sections using similarity measures. However, these tend to be more computationally expensive compared with planar detection due to the number of points being tested for clustering (Hebel and Stilla, 2008; McClune et al., 2014). Planar segmentation results are dependent on correct determination of threshold parameters, such as the neighbourhood used to calculate the attribute, and incorrect results can arise in areas with low point density and complex structures (Rottensteiner et al., 2014; Yan et al., 2012).
Planar segmentation procedures, similar to those mentioned above, can also be applied to data from aerial imagery, which can now potentially offer much higher point densities. Bulatov et al. (2012) used an image-based DSM to compute normal vectors for each pixel, while Omidalizarandi and Saadatseresgt (2013) performed region growing on image-based point clouds to form planar segments. However, it was found that errors from planar segmentation can arise at the location of the planar boundaries (Omidalizarandi and Saadatseresgt, 2013). These boundary errors can be overcome by combining feature-based and area-based methods, with the extraction of edges from imagery tending to form a post-processing step to refine the boundary of planes from lidar data (Awrangjeb et al., 2012; Demir and Baltsavias, 2012; Perera et al., 2014).
In summary, current methods tend to favour lidar as the primary data source for planar extraction, but often result in under-segmentation of roof planes leading to geometric errors in 3D reconstruction. Methods using imagery tend to apply strict constraints to the reconstruction, which often limits the number of buildings successfully reconstructed. Some methods have utilised both imagery and lidar, but have predominantly used the imagery only as a subsidiary dataset. However, advances in digital aerial imagery, exploited through dense image matching, can potentially overcome some of the limitations of current methods.

TEST SITE AND DATASETS
The data utilised in this research was captured by OS, the national mapping agency of Great Britain, for an area of the city of Newcastle upon Tyne, UK, in November 2010 using a Vexcel UltraCam XP camera. Imagery was captured with 80% fore/aft overlap and 60% lateral overlap from a flying height of 1700 m. This produced a ground sample distance of 0.1 m. OS processed the imagery using Microsoft UltraMap software to derive an image-based point cloud, DSM and true orthophoto at the same spatial resolution as the original imagery, and these products were supplied for use in the research presented herein.
The imagery covers a 25 km² area of Newcastle upon Tyne city centre as well as surrounding industrial zones and residential suburbs. Thus, there is a large range of building shapes and sizes exhibiting various roof types. An example of the test site extracted from the true orthophoto can be seen in Fig. 1. OS MasterMap building topography was utilised in order to extract buildings. This data is produced through manual digitisation of ground and aerial surveys at 1:1,250 scale, and offers a nominal planimetric accuracy of 1 m within urban areas (Ordnance Survey, 2014b). A polygon defines the outline of the building at ground level, so does not take into consideration any roof overhang.
For validation purposes, reference data was extracted manually from the stereo-imagery. The Cartesian coordinates of roof corner positions were measured to facilitate analysis of the planimetric and height accuracy of the final building models.

METHODOLOGY
The methodology can be split into three main sections, which are outlined in the workflow shown in Fig. 2. Firstly, roof lines are extracted using an edge detector. The methodology then builds on the theory of scan line segmentation (Jiang and Bunke, 1994) and run graph vectorisation (Montero et al., 2009) to refine the detected edges before converting the edges into points to form a network of ridgeline connectivity. These three steps, together with the 3D reconstruction, were implemented automatically in MATLAB 2015a.

Pre-processing
Due to large amounts of noise in the image-based point cloud derived using Microsoft UltraMap, a point cloud was instead created from the raster DSM product, by converting the centroid of each DSM cell into a Cartesian point. This DSM 'point cloud' was then classified to extract ground points using the TerraScan ground classification procedure (TerraSolid Limited, 2015). The normalised DSM (nDSM) was created by subtracting the ground classification from the DSM. Next, OS MasterMap building footprints were used to extract buildings from the true orthophoto and the nDSM, providing an initial building boundary region, and normalised building elevation.
The extracted building datasets form the input for the building reconstruction, and ensure that the search area for the edge detection is limited only to relevant building regions. Each building footprint was buffered by 2 m to compensate for any roof overhang.
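The two raster operations described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB implementation used in this research; the function names, the upper-left raster origin convention and the assumption that the classified ground points have already been interpolated to a ground grid matching the DSM are all hypothetical.

```python
def dsm_to_points(dsm, x_origin, y_origin, cell_size):
    """Convert each DSM cell to an (x, y, z) point at the cell centroid.

    Assumes row 0 is the northern edge of the raster and that
    (x_origin, y_origin) is its upper-left corner (an assumed convention).
    """
    points = []
    for row, line in enumerate(dsm):
        for col, z in enumerate(line):
            x = x_origin + (col + 0.5) * cell_size
            y = y_origin - (row + 0.5) * cell_size
            points.append((x, y, z))
    return points


def normalise_dsm(dsm, ground):
    """Subtract a ground height grid from the DSM, cell by cell, to give the nDSM."""
    return [[z - g for z, g in zip(dsm_row, ground_row)]
            for dsm_row, ground_row in zip(dsm, ground)]
```

With a 0.1 m cell size, the first cell of a raster whose upper-left corner is at (100, 200) maps to the point (100.05, 199.95).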

Roof Geometry Extraction
The Canny edge detector (Canny, 1986) was used to extract the 2D linear edges of each roof from the true orthophoto. By applying the corresponding height from the nDSM to each detected Canny edge pixel, it was possible to eliminate pixels on the ground and at the image boundary. To overcome any roof boundary edge not detected directly using the Canny edge detector, the nDSM boundary was included with the edge detection for modelling.
In order to remove falsely detected edges from shadow, roof texture and other unwanted artefacts, a workflow based on the theory of scan line segmentation was developed (Jiang and Bunke, 1994; McClune et al., 2014). The corresponding height value from the nDSM was applied to each detected Canny edge pixel and a least squares linear regression was performed along each X and Y cross-section of the roof. By measuring the distance from height-attributed Canny pixels to a least squares fitted line, pixels within a threshold distance of the line were classified as false positives and removed, whilst those beyond the threshold were kept as breakpoints. Canny edge pixels along the cross section were iteratively added to the linear regression computation until the residuals exceeded the threshold. When an edge pixel exceeded the threshold, the previously detected Canny edge pixel of the cross section was defined as a breakpoint and thus the edge of a roof plane. This breakpoint was then used as the starting position of a new least squares linear fit. Each Canny edge pixel along the cross section was iteratively included in the linear regression calculation until the end of each cross section was reached. This procedure was performed iteratively for each X and Y cross-section of the roof until no further Canny edge pixels could be removed. This process is illustrated in Fig. 5.
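The breakpoint search along a single cross section can be sketched as below. This is an illustrative Python reading of the procedure, not the MATLAB code used in this research: the function names are hypothetical, the threshold value is left to the caller, and retaining the first and last pixel of each cross section is an assumption made so that the roof boundary survives.

```python
def fit_line(points):
    """Least squares fit h = a * s + b to (position, height) pairs; returns (a, b)."""
    n = len(points)
    mean_s = sum(s for s, _ in points) / n
    mean_h = sum(h for _, h in points) / n
    var_s = sum((s - mean_s) ** 2 for s, _ in points)
    if var_s == 0.0:
        return 0.0, mean_h
    cov = sum((s - mean_s) * (h - mean_h) for s, h in points)
    a = cov / var_s
    return a, mean_h - a * mean_s


def max_residual(points):
    """Largest absolute height residual of the points from their fitted line."""
    a, b = fit_line(points)
    return max(abs(h - (a * s + b)) for s, h in points)


def scan_line_breakpoints(profile, threshold):
    """Return the indices of edge pixels kept along one cross section.

    profile: (position, height) pairs of height-attributed edge pixels,
    sorted by position. Pixels that fit the growing line are dropped as
    false positives; the pixel before the fit fails is kept as a breakpoint.
    """
    if len(profile) < 3:
        return list(range(len(profile)))
    keep = [0]
    start = 0
    i = start + 2
    while i < len(profile):
        if max_residual(profile[start:i + 1]) > threshold:
            start = i - 1          # previous pixel becomes the breakpoint
            keep.append(start)
            i = start + 2          # restart the fit from the breakpoint
        else:
            i += 1
    if keep[-1] != len(profile) - 1:
        keep.append(len(profile) - 1)
    return keep
```

For a symmetric gable profile the only interior pixel retained is the ridge, while a flat profile retains only its two ends.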

3D Roof Reconstruction
The corner positions of roof planes were extracted from the refined edge pixels using run graph vectorisation, which converts raster edge images to a vector format by utilising line tracing (Montero et al., 2009). Edges were automatically traced and classified based on pixel connectivity to form individual line segments. To classify pixel connectivity along an edge, the Freeman chain code was used to classify each pixel based on the direction of the neighbouring pixel, determined using a 3x3 kernel (Freeman, 1961). Edges were clustered based on the dominant classified direction to form individual line segments. The endpoints of the classified edges were extracted as the corner positions of the roof planes, with geometric constraints then applied to refine the extracted corners. Building models were reconstructed at LoD1 and LoD2, according to CityGML, from the nDSM boundary and the detected edges, respectively.
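The chain code classification can be illustrated with the short Python sketch below. It assumes the edge has already been traced into an ordered list of 8-connected (row, column) pixels, and it splits the edge wherever the code changes, which is a simplification of the dominant direction clustering described above; all names are hypothetical.

```python
# Freeman 8-direction codes: 0 = east, increasing anticlockwise in 45 degree steps.
# Keys are (d_col, d_row) with rows increasing downwards, as in image coordinates.
FREEMAN_CODES = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def chain_code(pixels):
    """Freeman code of each step along a traced, 8-connected edge of (row, col) pixels."""
    codes = []
    for (r0, c0), (r1, c1) in zip(pixels, pixels[1:]):
        codes.append(FREEMAN_CODES[(c1 - c0, r1 - r0)])
    return codes


def split_by_direction(pixels):
    """Split a traced edge into runs of constant chain code.

    Returns (start_pixel, end_pixel, code) per run; the run endpoints are
    the candidate corner positions.
    """
    codes = chain_code(pixels)
    segments = []
    start = 0
    for i in range(1, len(codes) + 1):
        if i == len(codes) or codes[i] != codes[start]:
            segments.append((pixels[start], pixels[i], codes[start]))
            start = i
    return segments
```

A real edge rarely follows a single code exactly, so a practical implementation would need a tolerance on direction changes before declaring a new segment.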
For LoD1 reconstruction, constraints were applied to the interior angles of roof corners and edge lengths to ensure orthogonality. Investigations were undertaken to determine the angular thresholds to implement. Any corner point with an angle deviating from 90° by more than 55° was removed. If the measured angle deviated from 90° by less than 55° but by more than 35°, then the shortest edge forming this corner was removed. The height of each corner was assigned as the maximum height value found within a metre-wide search window. Then the median height value was assigned to all roof points to give the flat surface required for LoD1 reconstruction.
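One possible reading of the angular rule is sketched below in Python. It interprets the 90° ± 55° and 90° ± 35° limits as deviations from a right angle, which is an assumption, and it measures the angle between the two edges meeting at a corner (0° to 180°) rather than a signed interior angle. The function names and return labels are hypothetical.

```python
import math


def corner_angle(prev_pt, corner, next_pt):
    """Angle in degrees between the two edges meeting at corner (planimetric x, y)."""
    ax, ay = prev_pt[0] - corner[0], prev_pt[1] - corner[1]
    bx, by = next_pt[0] - corner[0], next_pt[1] - corner[1]
    cos_angle = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def classify_corner(angle, outer=55.0, inner=35.0):
    """Apply the two angular thresholds to a measured corner angle."""
    deviation = abs(angle - 90.0)
    if deviation > outer:
        return "remove_corner"
    if deviation > inner:
        return "remove_shortest_edge"
    return "keep"
```

Under this reading a 130° corner loses its shortest edge, whereas a near-straight 170° corner is removed altogether.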
For LoD2 reconstruction, any edge shorter than 0.5 m was first removed as noise. Various rules using angles between line segments, line orientations and proximity of corner positions were then implemented to connect unconnected endpoints, defined as any point that does not connect at least two lines. Varying search windows of 2, 4 or 6 m, dependent on the length of the line with respect to the longest extracted roof edge, were used around each unconnected endpoint to find potential connecting endpoints.
Once all endpoints that met the connectivity criteria were connected, all unconnected edges were removed. The corners extracted from the nDSM boundary and the refined Canny edges were connected together to form the LoD2 building models. The height at the corresponding nDSM pixel was assigned to the extracted ridgeline corners.
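The removal of unconnected edges can be sketched as a simple pruning step on the edge network. The Python below is illustrative only: edges are represented as pairs of hashable corner identifiers, the function name is hypothetical, and repeating the pruning until the network is stable is an assumption rather than a documented part of the workflow.

```python
from collections import Counter


def prune_dangling_edges(edges):
    """Drop edges with an endpoint that joins fewer than two lines.

    edges: list of (point_a, point_b) pairs. Pruning is repeated because
    removing one dangling edge can expose another.
    """
    remaining = list(edges)
    while True:
        degree = Counter(pt for edge in remaining for pt in edge)
        kept = [e for e in remaining
                if degree[e[0]] >= 2 and degree[e[1]] >= 2]
        if len(kept) == len(remaining):
            return kept
        remaining = kept
```

Every edge that survives has both endpoints shared with at least one other edge, which is a necessary condition for it to lie on a closed cycle of roof points.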
Threshold sensitivity testing, necessary to determine the optimal parameters, was undertaken on the Newcastle dataset for the Canny edge detector as well as the aforementioned thresholds for the connectivity of edges. The full details of these tests are beyond the scope of this paper.

RESULTS
The proposed methodology was tested on a total of 50 different buildings, 10 for each of five different roof types: flat, gable, hipped, cross-gable and complex. Buildings were selected from across the image extent to cover a wide range of building types from industrial, residential and city centre scenes, as well as covering different shapes and sizes. For the ground truth data, Cartesian coordinates of roof corners were extracted from stereopairs of images, as well as the individual roof planes for planar analysis. The completeness, correctness and quality indicators were used to evaluate the extracted roof planes, defined as a closed cycle of roof endpoints.
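The three indicators are not defined explicitly here; the Python sketch below uses the definitions commonly adopted in building extraction evaluation, based on counts of true positive, false positive and false negative planes. The function name and the handling of empty denominators are assumptions.

```python
def plane_metrics(true_pos, false_pos, false_neg):
    """Return (completeness, correctness, quality) as fractions in [0, 1].

    completeness = TP / (TP + FN): share of reference planes that were found
    correctness  = TP / (TP + FP): share of extracted planes that are real
    quality      = TP / (TP + FP + FN): combined measure
    """
    def ratio(num, den):
        return num / den if den else 0.0

    completeness = ratio(true_pos, true_pos + false_neg)
    correctness = ratio(true_pos, true_pos + false_pos)
    quality = ratio(true_pos, true_pos + false_pos + false_neg)
    return completeness, correctness, quality
```

Because quality penalises both missed and spurious planes, it can never exceed either completeness or correctness.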

Roof Geometry Extraction
Example results of Canny edge detection, combined with the nDSM boundary, can be seen in Fig. 3a and Fig. 4a for two different roof structures. Whilst the main roof structure lines have been extracted, highlighting good localisation of roof edges, a number of false positives are also extracted.
The complex roof structure in Fig. 3a shows all ridge and valley lines have been extracted from the true orthophoto, but roof texture characteristics have also been extracted, mainly in the form of short and curved lines. In addition, long straight edges have been extracted from shadow cast across the roof face, which has also prevented some edges being detected, particularly at the boundary of the roof. By including the nDSM boundary these edges are created, but errors in the ground segmentation cause poor localisation of edges at roof boundary corners, as illustrated towards the bottom of Fig. 3a.
Similar results are seen for the Canny edge detection of the hipped roof with dormer windows in Fig. 4a. The main ridgelines have been extracted, but three of these edges are also duplicated. The two large roof planes contain several small dormer windows, where edges have been correctly extracted, but are affected by false positives at the end of the boundary. False positives have been extracted from shadows cast by the dormer windows as well as from the texture of the roof, in the form of repetitive small ovals where the colour gradient in the corrugated roof texture changes.
By applying the residual threshold rule, false positives are removed from the roof faces. Nearly all of the false edges from the roof planes in Fig. 3a were removed whilst preserving the main ridgelines, as shown in Fig. 3b. However, several short edges were not removed, notably along the shadow edges and on the roof planes. There were also a couple of edges at the junction of multiple ridgelines which have been erroneously removed, as the small change in height is below the threshold used to remove points along the fitted line.
Similar results are seen for the edges detected in Fig. 4a for the hipped roof and the refined edges in Fig. 4b. Small repetitive ovals extracted from the roof texture have been removed and all major ridgelines of the hipped roof have been extracted. The ridgelines of the dormer windows have also been extracted and could potentially be used to reconstruct these small features, although some noise is still present and requires further refinement.

LoD1 3D Roof Reconstruction
Example results of LoD1 reconstruction using the nDSM boundary of a building can be seen in Fig. 6 for two different roof structures. The results show how a flat surface can be created from the corner points, extracted from the raster edge. The parameters used are able to reconstruct perpendicular corners as well as edges which have angles larger than 90°. Thus reconstruction is not limited to any particular geometry type, as highlighted as a weakness of many feature-based reconstruction approaches (Section 2).
Qualitative analysis of the results in Fig. 6 shows models which conform to the expected building geometry. However, to quantify the performance, the differences between reference coordinates and extracted coordinates were measured. This revealed that corners were extracted within 0.5 m of their true position and within 0.5 m of their correct height. When compared to the accuracy requirements of CityGML, building models were reconstructed to the positional and height accuracies required for LoD3 models, which is a greater accuracy compared to LoD1 and LoD2 (Gröger and Plümer, 2012).

LoD2 3D Roof Reconstruction
The example 3D reconstruction of both a flat roof and a complex roof structure can be seen in Fig. 7. For the flat roof in Fig. 7a all ridgelines, with one exception, are successfully extracted and the two roof planes are correctly extracted at varying elevations, with the inclusion of the step-edges. Two smaller roof planes from skylights have also been extracted because they form a closed cycle of points and edges, thus showing the potential of this method to also extract smaller features.
Figure 7. Results of LoD2 reconstruction for (a) a flat roof and (b) a complex roof structure.
For the more complex roof structure in Fig. 7b, the developed method has managed to successfully extract edges to reconstruct the roof planes. The results for all 50 buildings are summarised in Tables 1 and 2.
The results in Table 1 show the RMSE of the extracted points when compared to reference data. A corner point was determined as being successfully detected if an extracted point was within 2 m of the ground truth, in compliance with the CityGML LoD2 planimetric specification (Gröger and Plümer, 2012). Points were successfully extracted with a planimetric mean RMSE of just under 0.50 m. The height RMSE was slightly higher with a mean of 0.65 m. The final two columns of Table 1 indicate the percentage of roof points correctly extracted as part of the reconstruction and the total number of points extracted as a percentage of the number of reference points. The percentage of correct corners detected was relatively high, with an average detection rate of 75%. Of the missing points, in some cases the Canny edge detection failed, as shown in Fig. 8, while in other cases under-segmentation occurred when classifying edges using the Freeman chain code.
The results of planar extraction, where planes were formed by a closed cycle of roof edges, are shown in Table 2, again compared to manually delineated reference data. The results for planar extraction of flat roofs show generally successful reconstruction, with 80% of the roof planes being detected. However, other roof structures were not reconstructed successfully, with the four remaining roof types all achieving a quality of less than 50%. The results in Fig. 8 show that all roof corner points of a hipped roof have been successfully extracted with the exception of two linking ridgelines. This results in only one out of the four roof planes being detected. In this particular example this is due to failure of the Canny edge detector. Other examples exist where the connectivity ruleset has not managed to connect a ridgeline to any other point or edge, and thus this edge is removed from the reconstruction.
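The corner evaluation can be illustrated with the Python sketch below, which pairs each reference corner with the nearest unused extracted corner within the 2 m planimetric tolerance and then computes planimetric and height RMSE over the matched pairs. The greedy nearest-neighbour pairing and all function names are assumptions; the matching strategy actually used is not described in this paper.

```python
import math


def match_corners(extracted, reference, max_dist=2.0):
    """Pair each reference (x, y, z) corner with the nearest unused extracted corner.

    A pair is accepted only if the planimetric distance is within max_dist.
    Returns a list of (extracted_point, reference_point) pairs.
    """
    unused = list(extracted)
    pairs = []
    for ref in reference:
        best, best_dist = None, max_dist
        for cand in unused:
            dist = math.hypot(cand[0] - ref[0], cand[1] - ref[1])
            if dist <= best_dist:
                best, best_dist = cand, dist
        if best is not None:
            pairs.append((best, ref))
            unused.remove(best)
    return pairs


def rmse(pairs):
    """Return (planimetric RMSE, height RMSE) over matched corner pairs."""
    if not pairs:
        raise ValueError("no matched corners to evaluate")
    n = len(pairs)
    plan_sq = sum((e[0] - r[0]) ** 2 + (e[1] - r[1]) ** 2 for e, r in pairs)
    height_sq = sum((e[2] - r[2]) ** 2 for e, r in pairs)
    return math.sqrt(plan_sq / n), math.sqrt(height_sq / n)
```

The detection rate is then the number of matched pairs divided by the number of reference corners, and the ratio of extracted to reference corners indicates over- or under-segmentation.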
Most current methods are able to successfully reconstruct simple roof structures but struggle with complex roof structures (Rottensteiner et al., 2014). However, the preliminary results presented in Table 2 suggest that complex roof structures are reconstructed more successfully than simple roof structures in the gable and hipped categories. This may be caused by the size of the search window. Complex roofs tend to be larger than simple roof structures, thus the search windows currently used may be more suited to larger buildings.
Figure 8. Results of LoD2 reconstruction for a hipped roof.

CONCLUSIONS
The preliminary results presented in this paper are encouraging and demonstrate how data extracted from the by-products of dense image matching can be integrated for automatic 3D building reconstruction at varying levels of detail. Corner points of roof planes have been extracted from edge detection to form closed roof plane polygons. Whilst the Canny edge detector offers good localisation of extracted edges, it is also prone to extracting false positives from roof texture. These false positives can be reduced by fitting lines to the detected edges along a cross section to remove points within a threshold distance. The removal of detected Canny edges along cross sections of the roof using scan line segmentation can cause edges to become disconnected. However, these edges can be reconnected using run graph vectorisation. Errors in the initial segmentation of the individual edges can be overcome using angle, length and search window thresholds to reconstruct building models at LoD1 and LoD2, as defined by the OGC CityGML standard.
The proposed methodology is currently being tested on the ISPRS WG III/4 Vaihingen dataset (Rottensteiner et al., 2014) to investigate the transferability of the methodology and to compare the approach with current state-of-the-art methods. Future work will further develop the connectivity workflow to overcome the mentioned limitations. This will include investigating varying search window sizes according to building footprint size, and the further development of the connectivity ruleset to increase the number of correctly detected points whilst minimising connectivity failures.
Results have demonstrated how small features can be extracted, but also how these can hinder reconstruction, especially where this extraction is incomplete. Reconstruction of these objects will therefore be further investigated.

Figure 1. Orthophoto showing the city centre of Newcastle upon Tyne © UltraMap XP Image Copyright 2010, Ordnance Survey
The inclusion of the nDSM boundary also has the effect of duplicating detected roof boundary edges, as seen on the right of Fig. 3a.

Figure 3. (a) Results of Canny edge detection and (b) the refined edges using scan line segmentation for a complex roof structure.

Figure 4. (a) Results of Canny edge detection, and (b) the refined edges using scan line segmentation for a hipped roof with dormer windows.

Figure 5. Cross sections of a gable roof from (a) the nDSM, (b) the least squares fitted line using Canny detected edges with extracted heights, and (c) the final result of scan line segmentation.

Figure 6. Results of LoD1 reconstruction for (a) a gabled roof and (b) a more complex roof structure.

The total percentage of corner points detected shows that the method over-segments the reconstructed roof. Similar to the planar extraction methods mentioned in Section 2, small features can affect the results, extracting unnecessary features, as highlighted in Fig. 7a. Points and lines were also extracted and connected from neighbouring features, such as overlapping trees and vehicles close to the building, which were not removed by the ground classification.

Table 1: Quantitative analysis for the location and number of the extracted corner points

Table 2: Quantitative analysis to determine the completeness (Com), correctness (Cor) and quality (Q) of the extracted roof planes