3D POINT CLOUD DATA PROCESSING AND INFRASTRUCTURE INFORMATION MODELS: METHODS AND FINDINGS FROM THE SAFEWAY PROJECT

Monitoring and digitalization are key to improving the resilience of the infrastructure network in the context of its disaster management cycle. SAFEWAY is a project funded by the H2020 framework that aims to assess infrastructure resilience by integrating multiscale information covering all modes of the disaster management cycle. This work presents the methodologies developed in the project for road and rail infrastructure monitoring and modelling, using remotely sensed data from Mobile Mapping Systems (MMS). First, 3D point clouds of both road and rail infrastructure are heuristically processed to obtain geometric and semantic information about the most relevant assets, as well as the alignment, which is a key entity for generating information models. These models are computed following the specifications of the Industry Foundation Classes (IFC) 4.1 schema, considering its current limitations and future potential for linear infrastructure modelling. Finally, the information is centralized in a core software platform, where a user interface has been developed to aid visualization and interpretation of the resulting data.


INTRODUCTION
Currently, there is increasing concern regarding the ability of the transportation network to function during adverse events and to recover quickly to an operational level of service. This concern is motivated by two facts. First, the maintenance budgets of the transportation network are not evolving in line with its needs (European Commission, 2019), leading to a faster deterioration of an ageing network. Second, the frequency of extreme climate events is likely to increase due to climate change. A study by Forzieri et al. (2018) projects different climate risks to critical infrastructures, estimating an impact of 10 times the present damage by the end of the century, with the highest economic losses in the industry, transport and energy sectors. Thus, the concept of resilience is gaining attention as a quantification of the capacity of a transportation network to perform despite the increasing risk of extreme events, considering the different phases of the disaster management cycle: prevention, preparedness, response and recovery (Alexander, 2002).
In this context, this work presents research methodologies developed during the execution of the SAFEWAY project (SAFEWAY, 2021), which has been funded by the H2020 framework of the European Commission under the Smart, Green and Integrated Transport programme. Its main focus is to develop an Infrastructure Management System (IMS), conceived to improve the resilience of the European infrastructure by integrating predictive knowledge about the potential occurrence of extreme events (natural or human-made) with knowledge about the current performance of the infrastructure. This includes information about the network provided by the infrastructure managers (e.g. number of vehicles per time unit, accident hot-spots…), as well as multiscale monitoring of the facilities using both satellite (Interferometric SAR) and terrestrial (Mobile Laser Scanning) techniques.
In terms of the disaster management cycle, infrastructure monitoring is an essential pre-disaster activity (Erdelj et al., 2017), as it can be applied to the assessment of structural health, environmental changes or asset inventory. The objective of SAFEWAY with respect to infrastructure monitoring is to develop an Infrastructure Information Model supported by open standards and following the interoperability principles of Building Information Modelling (BIM). While BIM is widely adopted in the building industry, there is still a major need for a standard, neutral exchange format for transportation infrastructure (Chong et al., 2016; Costin et al., 2018). This challenge is slowly being addressed, and the most noteworthy effort is the work done by buildingSMART, an international, not-for-profit organization that maintains and develops the Industry Foundation Classes (IFC) data model. Their latest release, IFC4.3 RC2 (BuildingSMART, 2020), is a standard candidate that includes different infrastructure domains within the data model: road, rail, bridge, tunnels, and ports and waterways.
This opens a new and promising research line: The generation of Infrastructure Information Models as IFC-compliant data models, by feeding meaningful geomatic data to them. Thus, this work will focus on the two main challenges that arise from this line of research: (1) The processing of remotely sensed data, specifically 3D point clouds, to automatically extract geometric and semantic information to feed the information models, and (2) The definition of the models, using available, open resources to generate IFC-compliant files.
Regarding the first of these challenges, the state of the art has evolved considerably during the last decade. The growing popularity of MMS, together with hardware improvements that make it possible to process and handle 3D point cloud data more efficiently, has allowed fast progress in the methods and algorithms used to segment and classify 3D point cloud data of infrastructure environments. Specifically, this work focuses on two different infrastructures: road and railway.
Road infrastructure has been a common object of study in the literature, as the geometric and radiometric information provided by MMS is appropriate for road inventory mapping, including not only the road pavement itself but also many of its assets (Guan et al., 2016). Early approaches estimated the road surface using existing algorithms such as RANSAC (Smadja et al., 2010), or by extracting road edges using geometric criteria such as slope (Yoon and Crane, 2009) or the presence of curbs (Yang and Dong, 2013). Other works integrate the segmentation of the road in a more complex framework that covers the whole road environment, relying on heuristics and hierarchical processing (Yang et al., 2015) or on supervised machine learning, such as the approach of Balado et al. (2019), where a Deep Learning model is trained to classify the main elements of the road environment and achieves good classification results for the road surface. The segmentation of the road surface allows the subsequent detection of road markings. The intensity attribute of the 3D point clouds has been used as a key feature to detect reflective elements such as road markings (Cheng et al., 2017; Guan et al., 2014). Wen et al. (2019) develop a deep learning framework that obtains promising results for road marking segmentation, classification and completion, dealing with partially occluded markings. Research involving other built road assets has been active as well. Traffic signs are typically segmented following approaches analogous to those used for road markings, relying on the intensity attribute of the point clouds, with an extra sensor fusion step that combines 3D point cloud and 2D camera information to extract the semantics of the traffic signs (Arcos-García et al., 2017; Wen et al., 2015).
Other assets that have been assessed using 3D point cloud data are pole-like objects such as light poles or guardrails (Matsumoto et al., 2019; Vidal et al., 2020).
Railway infrastructure has received proportionally less attention than road infrastructure in the literature, but the existing research shows a similar potential for the segmentation and classification of 3D point clouds of railway environments. Arastounia (2017) presents an algorithm that is able to detect rail tracks, contact cables and catenary cables even in complex configurations with slopes, curves and merging rail tracks, in both terrestrial and aerial point clouds. Similar works develop supervised machine learning approaches instead of relying exclusively on heuristics, such as the work of Sánchez-Rodríguez et al. (2018), which uses a Support Vector Machine (SVM) algorithm to classify the rails, or Soilán et al. (2020), who apply a Deep Learning semantic segmentation model (PointNet) to segment 3D point clouds of railway tunnels. Recently, Karunathilake et al. (2020) proposed an approach to detect different structures on the rail geometry, such as railway crossings and turnouts, showing that the extraction of secondary information, such as the change in distance between tracks or the change of level between rails, is not straightforward and that its automation requires a complex geometric analysis.
In this context, it is clear that point cloud processing will play an important role in as-built modelling processes (Pătrăucean et al., 2015). Recent works that generate IFC models of bridges from automatically processed point clouds (Sánchez-Rodríguez et al., 2020; Zhao and Vela, 2019), together with the upcoming publication of IFC data models for road and railway infrastructures, motivate this work. Its main objective is to present the relevance of remote sensing in the digitalization of linear infrastructure in the context of a European research project, showcasing the developed methodologies and the qualitative results that have been achieved. The contribution of this work is twofold:
• To offer an insight into the developed methods, which exploit the promising synergies between the growing capabilities of remotely sensed data to obtain precise semantics of the infrastructure environment and the need for input data to generate as-built digital models.
• To show how the resulting data from these methods are applied in practice to the core platform of SAFEWAY, where they are combined with different modules aiming to improve the resilience of the transportation network.
This work is structured as follows: Section 2 details the case study data as extracted from the SAFEWAY pilots and the developed methods for point cloud processing and infrastructure modelling. Section 3 outlines the results and discusses their current potential and limitations. Finally, conclusions of this work are shown in Section 4.

METHODS
This section describes the most relevant aspects of the methodologies developed in the project. Figure 1 shows the complete workflow, with the interaction of three well-defined blocks: (1) Point cloud processing, where semantic and geometric data from road and railway environments are extracted; (2) Infrastructure modelling, which defines the output format of the data from the previous block and generates IFC-compliant models; and (3) the SAFEWAY core platform, where the infrastructure models are fed into a multiscale software that integrates relevant information in all resilience dimensions (preparation, response and recovery, mitigation).

Case study data
Of the four pilots within the SAFEWAY project, each located in a different European country, point cloud data from MMS have been collected in two. In total, the pilots contain approximately 300 km of road and 110 km of railway as 3D point cloud data.
In order to validate the proposed methods, smaller sections of the collected 3D point clouds have been selected. The road case study is divided into highway and conventional sections, from which approximately 20 km of road data have been selected for validation of the methods. Similarly, about 90 km of railway data have been selected for validation. It is also relevant to note that urban areas and railway stations have not been included in the case study data.

Point cloud processing
This subsection offers a description of the point cloud processing methods that were developed to fulfil the objectives of the project, for both road and railway infrastructure.

Road infrastructure
The objective of the presented methods is twofold:
− To obtain the alignment of the road as a set of 3D, structured and consecutive points, as it is an essential entity for spatial referencing and positioning in the infrastructure model.
− To extract semantics and geometries of relevant assets of the road infrastructure.
The alignment is defined as the central axis of the road. Therefore, it can be easily located if the road edges are known. For that reason, the proposed approach starts with a ground segmentation method based on the voxel-based region growing approach of Douillard et al. (2011), using the recorded trajectory of the MMS for the selection of seed points.
Then, road markings are detected on the ground segment by locally analysing transversal sections of the road. The intensity attribute of the point cloud is essential for this analysis, as road markings have considerably higher intensity values than the asphalt. A local analysis is mandatory as this value depends on the distance and the incidence angle between the sensor and the surface, so applying global thresholds would lead to wrong results.
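The need for a local rather than a global threshold can be illustrated with a minimal sketch. It is not the project's implementation: the robust statistic (median plus a multiple of the median absolute deviation) and the factor `k` are assumptions chosen for illustration, and each transversal section is reduced to a plain list of intensity values.

```python
from statistics import median

def detect_marking_points(section, k=3.0):
    """Return indices of points whose intensity stands out within one section.

    section: intensity values of the ground points in one transversal section.
    k: hypothetical sensitivity factor (an assumption, not a value from the text).
    """
    if len(section) < 3:
        return []
    med = median(section)
    # Median absolute deviation: a robust estimate of the local intensity spread.
    mad = median(abs(v - med) for v in section)
    # The small floor is a guard for a perfectly uniform background.
    threshold = med + k * max(mad, 1e-6)
    return [i for i, v in enumerate(section) if v > threshold]

# Two sections with different background levels (e.g. short and long range).
near = [40, 42, 41, 39, 90, 92, 40, 41, 43, 40]
far = [10, 11, 9, 10, 30, 31, 10, 12, 11, 10]
near_hits = detect_marking_points(near)
far_hits = detect_marking_points(far)
```

In this synthetic example, any single global threshold high enough to reject the asphalt of the near section (values around 40) would also reject the markings of the far section (values around 30), whereas the per-section threshold isolates the marking points in both.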
After this detection process, points belonging to road markings are still unstructured. Thus, a classification process is proposed where road markings are grouped via Euclidean clustering, and subsequently analysed, extracting features on each cluster such as length along the trajectory, width, or continuity, to define two classes of linear markings: solid and dashed. Other road markings such as arrows are not considered for this process.
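A minimal sketch of this grouping and classification step is given below, under simplifying assumptions: marking points are expressed in hypothetical (along-trajectory, lateral) coordinates, clustering is a brute-force Euclidean region growing, and only the length feature is used, with an illustrative threshold `solid_min_length`. Width and continuity, which the method also considers, are left out.

```python
from math import dist

def euclidean_clusters(points, radius):
    """Group points so that each one lies within `radius` of another cluster member."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited if dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return clusters

def classify_marking(points, cluster, solid_min_length=10.0):
    """Label a cluster by its extent along the trajectory (first coordinate)."""
    s = [points[i][0] for i in cluster]
    length = max(s) - min(s)
    # solid_min_length is a hypothetical threshold in metres.
    return "solid" if length >= solid_min_length else "dashed"

# Synthetic markings in (along-trajectory, lateral) coordinates, in metres.
solid_line = [(0.5 * i, 3.5) for i in range(41)]  # 20 m continuous line
dashes = [(0.5 * i, 0.0) for i in range(7)] + [(9.0 + 0.5 * i, 0.0) for i in range(7)]
points = solid_line + dashes
clusters = euclidean_clusters(points, radius=1.0)
labels = sorted(classify_marking(points, c) for c in clusters)
```

The brute-force neighbour search is quadratic in the number of points; a spatial index would be needed for real point clouds.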
Finally, road edges are extracted by analysing the spatial context of the road markings with respect to the trajectory of the MMS. That is, the linear markings that are furthest from the trajectory on each side are selected as edge candidates. Then, consecutive polynomial curves are fitted to those markings confirmed as road edges, and they are sampled with a fixed spatial resolution. For each pair of corresponding sampled points from both edges, a point of the road alignment is computed as the point of the road point cloud closest to the coordinates obtained by averaging the two sampled points (Figure 2).
Figure 2. The road alignment of this highway section (red colour) is defined as the central axis of the road, which is delimited by its edges (green colour).
Furthermore, semantics and geometries of other relevant assets are also extracted:
• Road markings: Their position and geometric features are necessarily known after the extraction of the alignment, as explained.
• Traffic signs: These assets play an important role in road safety. They are detected using the intensity attribute of the point clouds in a similar fashion to road markings, together with prior knowledge about their dimensions and geometry. If the traffic sign is on a pole, the geometry of the pole can also be analysed for its representation in the infrastructure model.
• Overpass location and clearance: Overpasses can be detected by searching for the presence of solid structures directly over the road segment. Their clearance is an important measurement in terms of road maintenance, and it can be computed as the vertical distance between a reference on the road (normally, road edges and alignment) and the height of the lower face of the overpass entrance.
• Guardrails: They are an important asset in terms of road safety, and can be detected using prior knowledge about their position relative to the road edges and their geometric features. If the profile of the guardrails is known beforehand, as it is usually standardized, the infrastructure model will only require the positioning of the guardrails from the point cloud processing block.
Figure 3 shows a road section where different entities and assets are detected on the point cloud. Section 3.2.1 will detail how this geomatic information can be used to generate the infrastructure model.
Figure 3. Semantic segmentation of a conventional road section. Road edges (green), alignment (red), guardrails (black) and traffic signs (blue) are shown.
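The last step of the alignment computation (averaging corresponding edge samples and snapping the result to the road point cloud) can be sketched as follows. This is an illustrative sketch only: the edge samples are assumed to be already paired and ordered, the polynomial fitting is omitted, and the nearest-point search is brute force.

```python
from math import dist

def alignment_from_edges(left_edge, right_edge, road_points):
    """For each pair of edge samples, return the road point closest to their midpoint."""
    alignment = []
    for left, right in zip(left_edge, right_edge):
        midpoint = tuple((a + b) / 2 for a, b in zip(left, right))
        # Snap the averaged coordinates to an actual point of the road cloud.
        alignment.append(min(road_points, key=lambda p: dist(p, midpoint)))
    return alignment

# Synthetic data: two sampled positions along a 7 m wide road (x, y, z in metres).
left_edge = [(0.0, 3.5, 0.0), (10.0, 3.5, 0.1)]
right_edge = [(0.0, -3.5, 0.0), (10.0, -3.5, 0.1)]
road_points = [
    (0.0, 0.1, 0.05), (0.0, 2.0, 0.02), (5.0, 0.0, 0.08),
    (10.0, -0.2, 0.12), (10.0, 3.0, 0.10),
]
alignment = alignment_from_edges(left_edge, right_edge, road_points)
```

Snapping to the point cloud keeps the alignment on the measured road surface, which matters when the averaged height of the two edges differs from the height of the crown of the road.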
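As an illustration of the clearance measurement described for overpasses, the sketch below computes the smallest vertical gap between a set of road reference points and the overpass points found above them. The function name, the horizontal `search_radius` and the synthetic coordinates are assumptions made for the example, and the overpass points are assumed to be already segmented.

```python
def overpass_clearance(reference_points, overpass_points, search_radius=1.0):
    """Smallest vertical gap between road references and the overpass points above them."""
    clearances = []
    for rx, ry, rz in reference_points:
        above = [
            z for x, y, z in overpass_points
            if (x - rx) ** 2 + (y - ry) ** 2 <= search_radius ** 2 and z > rz
        ]
        if above:
            # The lowest overpass point above this reference defines its local clearance.
            clearances.append(min(above) - rz)
    return min(clearances) if clearances else None

# Road references: right edge, alignment, left edge (x, y, z in metres).
reference = [(0.0, -3.5, 100.0), (0.0, 0.0, 100.1), (0.0, 3.5, 100.0)]
# Points on the lower face of the overpass, plus one on a higher element.
overpass = [(0.2, -3.4, 105.6), (0.1, 0.1, 105.5), (0.0, 3.6, 105.4), (0.0, 0.0, 106.5)]
clearance = overpass_clearance(reference, overpass)
```

Reporting the minimum over all references is a conservative choice; per-reference values could be kept instead if the model needs the clearance at each edge separately.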

Railway infrastructure
The objectives and methods developed for railway infrastructure are analogous to those of road infrastructure. However, the differences between the geometries and elements to be defined are significant, hence the methods have to be different as well.
Here, rail detection is a key process to define the alignment of a railway lane, as the rails play a role analogous to that of the road edges in Section 2.2.1: they serve as the reference from which the alignment is computed.
The proposed approach starts by isolating the railway track. In order to do so, each processed railway section is voxelized and then rotated according to the pitch angle of the trajectory recorded by the vehicle, so that the influence of the slope is removed. Then, a height histogram is computed and the points whose height belongs to the largest bin are selected, following the approach of Arastounia (2015).
A two-step process is defined for detecting the position of the rails. In a first step, a rough rail estimation is conducted by analysing local differences of point height and point intensity, as it is observed that rails are higher and less reflective than their local neighbourhood. Then, the roughly selected rail points are divided into several sections, transversal to the direction of the trajectory, with each section short enough to neglect the impact of rail curvature on the subsequent steps. For each of these sections, a single point on top of each rail is computed by analysing the profile of the rail points (let these points be Pt). This step takes rail turnouts into account, such that only one pair of rails (the one followed by the trajectory of the vehicle) is considered.
The second step refines the previous estimation by analysing the set of points Pt, taking into account the presence of false negatives in the detection process. Then, a region of interest is built from the refined coordinates of the set Pt based on prior knowledge of the rail dimensions, allowing a fine detection of the rails.
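The histogram-based selection can be sketched in a few lines. The sketch assumes that the points have already been voxelized and rotated to compensate for the pitch angle, and the bin size is a hypothetical value, not one taken from the project.

```python
from collections import Counter

def largest_height_bin(points, bin_size=0.25):
    """Keep the points whose height falls in the most populated histogram bin."""
    bins = Counter(int(z // bin_size) for _, _, z in points)
    top_bin, _ = bins.most_common(1)[0]
    return [p for p in points if int(p[2] // bin_size) == top_bin]

# Synthetic section: many track-bed points near z = 0.1 m, a few higher elements.
points = (
    [(float(i), 0.0, 0.05 + 0.01 * (i % 10)) for i in range(50)]  # track bed
    + [(float(i), 2.0, 5.5) for i in range(5)]                    # overhead wiring
    + [(float(i), 4.0, 1.3) for i in range(8)]                    # low vegetation
)
track = largest_height_bin(points)
```

A fixed bin grid can split the track bed across two bins when its height lies close to a bin boundary, so a practical implementation would likely also inspect the neighbouring bins.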
Finally, the approach shown in Section 2.2.1 for computing the alignment of the road is employed here: consecutive polynomial curves are fitted and subsequently sampled to obtain a set of points delineating the rails. The alignment is defined as the set of points resulting from averaging each pair of corresponding points from both rails (Figure 4). The railway environment has different assets whose detection and positioning in an infrastructure model is of great interest. Considering the points that do not belong to the railway track, the following assets are automatically extracted:
• Masts and cantilevers: They are pole-like objects that sustain the electrification structure of the railway infrastructure. They are detected by applying a Principal Component Analysis (PCA)-based dimensionality analysis to the voxelized point cloud. First, voxels whose structure is linear (first eigenvalue significantly larger than the second and third eigenvalues) and vertical are selected, and then clustered and filtered by height. A region growing analysis then segments the masts together with their cantilevers, which support the electrical wiring.
• Wiring: A similar PCA-based approach is applied to perform an initial segmentation of the cables, considering that the wiring has a direction similar to that of the trajectory when projected onto the XY plane. Then, wire clusters are generated by filtering by length, linearity and density. Finally, they are classified as catenary or contact cable by analysing the spatial relationship of the wire clusters on a rasterized version of the point cloud on the XY plane.
• Droppers: Droppers are the elements that join catenary and contact cables. They are detected by selecting the bounding box of the catenary and contact cables, removing the cables afterwards, and clustering the remaining points with the DBSCAN algorithm. Clusters that are in contact with a catenary-contact cable pair are selected and classified as droppers.
• Signs: Sign panels are detected using the intensity attribute of the point cloud in a similar manner to the road infrastructure. The presence of signs attached to point clusters previously classified as masts has to be considered.
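The PCA-based dimensionality test used for masts and wiring can be sketched as follows. This is a simplified illustration rather than the project's code: the dominant eigenvector is approximated with power iteration instead of a full eigendecomposition, linearity is measured as the share of the total variance explained by the first eigenvalue, and the thresholds `min_linearity` and `min_verticality` are hypothetical.

```python
from math import sqrt

def covariance(points):
    """3x3 covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
             for j in range(3)] for i in range(3)]

def dominant_direction(cov, iterations=50):
    """Power iteration: approximate largest eigenvalue and its unit eigenvector."""
    columns = [[cov[i][j] for i in range(3)] for j in range(3)]
    # Start from the covariance column with the largest norm.
    v = max(columns, key=lambda c: sum(x * x for x in c))
    norm = sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return 0.0, [0.0, 0.0, 0.0]
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(3)) for i in range(3))
    return eigenvalue, v

def is_vertical_linear(points, min_linearity=0.9, min_verticality=0.9):
    """True if the points form a mostly linear structure aligned with the Z axis."""
    cov = covariance(points)
    trace = cov[0][0] + cov[1][1] + cov[2][2]  # sum of the three eigenvalues
    if trace == 0.0:
        return False
    eigenvalue, direction = dominant_direction(cov)
    linearity = eigenvalue / trace
    verticality = abs(direction[2])
    return linearity >= min_linearity and verticality >= min_verticality

mast = [(0.01 * (i % 3), 0.01 * (i % 2), 0.1 * i) for i in range(30)]
wire = [(0.1 * i, 0.0, 5.0) for i in range(30)]
ground = [(0.1 * (i % 6), 0.1 * (i // 6), 0.0) for i in range(36)]
```

With these synthetic voxels, only `mast` passes the test; `wire` is linear but horizontal, and `ground` is planar. For wiring, the verticality check would be replaced by a comparison between the dominant direction and the trajectory direction projected on the XY plane.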
Note that these are brief descriptions of the actual implementation, hence previous considerations or requirements of the data are omitted for simplicity. Figure 5 shows a railway section with the results of the segmentation process, highlighting the aforementioned assets.

Infrastructure modelling
This subsection presents the IFC model building methodology used to represent and enrich the information obtained by the point cloud processing methods. Unlike those methods, the IFC modelling procedure does not operate differently for road and railway. The differentiation is established at a higher level of abstraction by setting a spatial structure (IfcSpatialStructureElement). This defines a hierarchy where a project is divided into sites, which are divided into facilities (e.g. road or railway), which are in turn divided into facility parts (e.g. road segment). Such a classification aids the organization of the projects and encourages the grouping of elements that have similar characteristics or are close in space. The positioning of most, if not all, elements in the infrastructure model is guided by the alignment (IfcAlignment / IfcAlignmentCurve). It serves as a linear reference system that, besides easing the geometric and positioning definitions, ensures that aligning one asset of the transportation network with another depends only on the alignment.
The placement of elements, however, requires two additional components besides the alignment: a relative point definition (IfcDistanceExpression) and an orientation (IfcOrientationExpression). Obtaining that point is what drives the point cloud processing when a certain kind of object has to be placed in the model. The point definition uses the alignment as a basis curve and describes a point in space by a distance along the curve and a series of offsets with pre-defined directions, which allows the description of any point in space (Figure 6). As for the orientation, it is extracted from the slope and tangent direction of the alignment at the defined point.
Regarding the geometric definition of the different elements included in the model, each case has its own details, but most of them can be defined using an extrusion operation. This operation is defined by the extrusion of a profile (IfcProfileDef) along a curve (IfcSectionedSolidHorizontal) or along a direction (IfcExtrudedAreaSolid) for a certain length (Figure 6). The use of different parametric profiles allows the creation of a large variety of elements. Additionally, this can be further extended by defining an arbitrary outer curve of a profile using, for example, a polyline (IfcPolyline). The extrusion along a direction is used for straight elements such as traffic signs, plates or posts. On the other hand, the use of a curve for extrusion results in curved elements that follow the shape of that curve. This is the case for the railing of the guardrail and the road pavement.
Figure 6. Schema of the modelling methodology. The position of an asset is defined relative to the alignment, and its geometry is extruded from a profile definition.
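The idea of locating a point by a distance along the alignment plus offsets can be illustrated with a small sketch. It is independent of any IFC library and makes several assumptions: the alignment is a 3D polyline, the distance is measured along its horizontal projection, and a positive lateral offset points to the left of the direction of travel. The function name and its parameters are hypothetical.

```python
from math import hypot

def point_on_alignment(alignment, distance_along, lateral=0.0, vertical=0.0):
    """Return the 3D point and the horizontal unit tangent at a linear reference."""
    remaining = distance_along
    for (x0, y0, z0), (x1, y1, z1) in zip(alignment, alignment[1:]):
        length = hypot(x1 - x0, y1 - y0)  # horizontal length of the segment
        if length > 0.0 and remaining <= length:
            t = remaining / length
            tx, ty = (x1 - x0) / length, (y1 - y0) / length
            # The left-hand normal of the tangent (tx, ty) is (-ty, tx).
            x = x0 + t * (x1 - x0) - lateral * ty
            y = y0 + t * (y1 - y0) + lateral * tx
            z = z0 + t * (z1 - z0) + vertical
            return (x, y, z), (tx, ty)
        remaining -= length
    raise ValueError("distance_along exceeds the alignment length")

# L-shaped alignment: 10 m heading +X with a 10 % slope, then 10 m heading +Y.
alignment = [(0.0, 0.0, 0.0), (10.0, 0.0, 1.0), (10.0, 10.0, 1.0)]
p1, tangent1 = point_on_alignment(alignment, 5.0, lateral=2.0, vertical=0.5)
p2, tangent2 = point_on_alignment(alignment, 15.0, lateral=-1.0)
```

The returned tangent is the quantity from which an orientation can be derived, so an asset placed at `p1` can be rotated to face along or across the alignment.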
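To make the extrusion along a direction concrete, the sketch below sweeps a 2D profile, placed in a local plane, along a direction for a given length and returns the two rings of vertices that bound the resulting solid. It is a geometric illustration only, not an IFC writer; the function name, the local axes and the dimensions of the post are assumptions.

```python
from math import sqrt

def extrude_profile(profile, origin, x_axis, y_axis, direction, length):
    """Return the bottom and top vertex rings of a profile swept along a direction."""
    norm = sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    # Place the 2D profile in 3D using the local axes of the placement.
    bottom = [
        tuple(origin[k] + u * x_axis[k] + v * y_axis[k] for k in range(3))
        for u, v in profile
    ]
    # Translate every profile vertex by `length` along the unit direction.
    top = [tuple(p[k] + length * unit[k] for k in range(3)) for p in bottom]
    return bottom, top

# Hypothetical sign post: 0.1 m square profile, extruded 2.5 m upwards.
square = [(-0.05, -0.05), (0.05, -0.05), (0.05, 0.05), (-0.05, 0.05)]
bottom, top = extrude_profile(
    square,
    origin=(5.0, 2.0, 1.0),
    x_axis=(1.0, 0.0, 0.0),
    y_axis=(0.0, 1.0, 0.0),
    direction=(0.0, 0.0, 2.0),
    length=2.5,
)
```

Extrusion along a curve follows the same principle, except that the local axes are recomputed from the tangent of the curve at each sampled position.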
It is worth noting that the BIM model of the infrastructure should not only contain 3D information, but semantics as well. This encompasses the identification of individual elements (name, description, tag, etc…), their material, relationships to one another, and property sets. For instance, lane width, number of lanes, and road width of the road were added into the model by attaching different property sets to an annotation element (IfcAnnotation). The semantic meaning is also fundamental in one of the elements previously mentioned, the traffic signs. Their intended meaning can be included in the model either in the element description or by the use of the previously mentioned property sets.

RESULTS AND DISCUSSION
This section presents the most relevant outcomes and applications of the presented methods, discussing their utility and limitations.
In the context of SAFEWAY, as shown in Figure 1, the outputs of the point cloud processing methods presented in Section 2.2 are sent to the infrastructure modelling block as well as to the SAFEWAY data core platform. Similarly, the infrastructure models feed the core platform as well. While this platform contains several layers (climate associated risks, decision support tools) aiming to improve the resilience of the infrastructure, the tool within the scope of this work consists of a user interface that allows an interactive and navigable visualization of the processed point cloud data. Figure 7a shows a screenshot of the user interface, where the user can explore the infrastructure from the point of view of the Mobile Mapping System, both on the 3D point cloud and on a panoramic 2D view. Furthermore, the same area is shown in the satellite view from Google Maps. Over it, some of the extracted semantics can be seen, such as the alignment, the overpass clearances or the traffic signs.
Similarly, Figure 7b shows a highway exit section, whose 3D point cloud view includes the segmentation results from the ground, road markings and traffic signs. In the aerial view, the alignment is drawn as a polyline.
While the automation potential of remotely sensed data for infrastructure modelling and digitalization is made clear in this work, there are still some limitations that must be discussed. First, it is important to note that the methodologies presented in Section 2.2 rely heavily on heuristic processes, with many parameters that control the performance of the algorithms. The adjustment of those parameters may depend on certain specifications of the infrastructure typology that may differ between regions or countries (e.g., the track gauge of the railway network). One possible solution is to perform a parameter analysis that selects those parameters which depend on the type of infrastructure, and to develop a user interface that allows the infrastructure manager to adapt the algorithms to their needs. Another option, more attractive in terms of research and given the promising results of the heuristic-based segmentation methods, is to consider the output of those automatic methods as labelled data that can be used to train supervised learning models. With the recent advances in Deep Learning on the 3D point cloud domain, and the possibility of obtaining labelled data from large datasets, the development of classification models with better generalization properties than the heuristic methods is a clear future line for this research.
In terms of infrastructure modelling, a relevant limitation is given by the fact that IFC Road and IFC Rail have not yet been published as final standards at the time of writing. Thus, some of the assets that can be modelled do not have a semantic representation in the standard, although they can be visualized as a 3D model with the appropriate geometry, as specified in Section 2.3: the IFC schema presents multiple ways to model the geometry of an element. Currently, several approaches are being explored, such as the use of simplified mesh geometry to represent highly complex elements that are not easily parametrized in a tessellated manner. However, this limitation is expected to be temporary, hence future research on the extent of the generation of IFC Road and IFC Rail models from 3D point cloud data is expected, leading to the automation of the digitalization of the infrastructure.

CONCLUSIONS
This paper presents the most relevant results of the methodologies developed in SAFEWAY, an H2020 project that aims to improve the resilience of the European infrastructure. Specifically, this work focuses on the important role played by remotely sensed data collected by MMS. Those data make it possible to obtain detailed geometric representations of the infrastructure that can be processed to extract meaningful information. This work presents a description of different heuristic-based methods that perform a semantic segmentation of road and rail infrastructures, extracting some of the most relevant assets. Furthermore, methods for the automatic computation of the alignment of both infrastructures are presented. That alignment is then employed as the base geometry to generate information models as IFC-compliant files. Finally, the resulting data from both point cloud processing and information modelling are fed to the software core platform of the project, where they can be interactively visualized and also serve as input for the decision support tool that is integrated in the platform.
Some interesting conclusions can be extracted from this work. First, there is a clear synergy between 3D point cloud data and the generation of infrastructure BIM models. In a context where digitalization and interoperability are becoming more important, the automation of monitoring and digitalization tasks is key for an efficient implementation of the BIM methodology in the infrastructure field.