DIGITAL SURVEY, AI AND SEMANTICS FOR RAILWAY MASONRY BRIDGES HEALTH ASSESSMENT

Masonry arch railway bridges represent a historical built heritage to be preserved. The multidisciplinary approach this requires calls for a common language, namely a formal conceptualisation of the bridge domain that can serve as a basis both for adding new layers of knowledge in H-BIM modeling and for aiding the automatic segmentation of masonry bridge point clouds, thus supporting the semi-automatic creation of information models. The presented research shows the results of an in-depth analysis of computational ontologies for masonry arch bridges; following this, the authors propose a semantic conceptualization of the masonry bridge domain, structured in three groups of key concepts needed in the knowledge process: bridge elements, materials, and defects. The masonry bridges of the Sicilian Circumetnea railway are chosen as a case study.


INTRODUCTION
Masonry arch bridges are a significant part of Europe's road and railway heritage, both in numerical consistency and valuable integration in the environment. Although their durability is considerably higher than concrete and metal bridges, with relatively modest maintenance costs, masonry bridges and their construction techniques are nowadays a legacy of the past that often supports today's communication networks. In the second half of the 19th century, stone and masonry arch bridges had an enormous development in Europe and Italy, mainly due to the development of the railway networks. Although they are quite prototypical structures, as they were built following the nineteenth-century guidelines for designing masonry arch bridges, these works of art use local construction techniques and materials, reflecting the modus costruendi of the time as historical permanence of the past. Due to the disuse of several railway networks, which have been replaced by road transport, a significant number of these masonry bridges have been abandoned. Natural obsolescence threatens this heritage due to neglect and lack of maintenance. In this regard, a substantial number of masonry bridges have collapsed over the past years. As a critical component of the road and railways infrastructure, masonry arch bridges require special treatment. It is crucial to develop effective and integrated procedures to characterise the structural conditions, identify and prevent potential vulnerabilities of such historic assets by studying their geometric configurations, construction techniques, and documentary heritage. In this direction, the Italian guidelines for monitoring and management of existing bridges (MIT, 2020) suggest that knowledge is a crucial step in the comprehensive approach to understanding the constructions' behaviour. 
It is well known that the analysis and preservation of both building and civil heritage are supported by many processes involving several actors from different application areas. This leads to the uncontrolled production of specific and multidisciplinary data. For this reason, the creation of a formal conceptualisation of the bridge domain, in other words an ontology, is useful, since it creates a common language and allows data reuse and knowledge sharing (Colucci et al., 2020). Indeed, an ontology, as a knowledge base with varying levels of granularity, could be used for purposes related both to the enhancement of masonry bridges and to the evaluation of their health status, with a focus on vulnerability assessment. Moreover, the H-BIM approach could be the starting point for managing, conserving, and maintaining masonry arch bridges. It gathers all present and past information on the artefact concerning the different areas of analysis. In addition, H-BIM could be used throughout the life cycle as a constantly updated supporting tool. In this context, a key role is played by the vendor-neutral data exchange standard Industry Foundation Classes (IFC), which allows communication between stakeholders and researchers, who often deal with field-specific knowledge and language.

Digital survey, AI and semantics: an ever-changing scenario
In recent years, digital surveys have moved from being pioneering applications to routine procedures. Moreover, the increased availability of specifically designed LIDAR instruments (e.g., mobile mapping systems), accompanied by ever more user-friendly and efficient output management tools and by the development of low-cost photogrammetric approaches, has democratised the use of reality capture technologies. Numerical models, known as point clouds, describe geometric and non-geometric information that is essential in several fields of application, including architecture and infrastructure. Moreover, recurrent acquisitions of point clouds over time allow multitemporal monitoring through the comparison of data acquired at different times. On the other hand, the data obtained provide a great deal of qualitative and quantitative information on the artefact, but this information is ontologically indistinct. Hence, data interpretation is deferred to a later stage and requires the involvement of a trained operator (Migliari, 2001). The interpretation of semantic data is strongly debated by the scientific community. Artificial Intelligence with Deep Learning (DL) techniques can provide valuable support in this direction, speeding up long and repetitive procedures prone to human error. The learning process of neural networks can be guided by ontologies which, as conceptualisations of a domain of interest, are able to feed these systems with the needed information. Hence, ontologies are widely used in semantic web development and have a pivotal role in DL applications. In addition, ontologies are nowadays introduced in BIM workflows to structure data semantically and to help in the management of repositories and web libraries.
The semantic-related aspects of information modelling, especially in terms of level of granularity, and the opportunities offered by a well-structured database linked to BIM models are among the liveliest research topics in information modelling, in both the heritage and infrastructure fields.
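The multitemporal comparison of point clouds mentioned above can be illustrated with a minimal cloud-to-cloud distance check. This is only a sketch under stated assumptions: the function names, the threshold and the toy coordinates are hypothetical, the two epochs are assumed to be already registered in the same reference system, and the brute-force search shown here would normally be replaced by a spatial index for real survey data.

```python
import math

def nearest_distance(point, cloud):
    """Distance from one point to its nearest neighbour in a cloud (brute force)."""
    return min(math.dist(point, other) for other in cloud)

def changed_points(epoch_a, epoch_b, threshold):
    """Points of epoch_b lying farther than `threshold` from every point of epoch_a."""
    return [p for p in epoch_b if nearest_distance(p, epoch_a) > threshold]

# Hypothetical toy coordinates in metres, NOT survey data.
epoch_first = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
epoch_second = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.05)]

# Assumed tolerance of 1 cm; a real value depends on instrument accuracy.
moved = changed_points(epoch_first, epoch_second, threshold=0.01)
print(moved)
```

Such a check only flags where geometry differs between epochs; deciding what the difference means (for example, which bridge element is affected) is the semantic step that the rest of the paper addresses.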

Research gap and research proposal
Masonry arch bridges, especially railway ones, are civil engineering artworks characterized by specific features. Although they are parts of in-use infrastructures, they should not be considered infrastructure assets in the narrower sense, since their construction straddles the 19th and 20th centuries and they are no longer being built. Therefore, rather than considering the design approach, it is necessary to deal with the management, preservation and valorisation of this heritage, since it is exposed to natural and anthropic hazards like any other cultural asset. Dealing with masonry arch bridges means finding a meeting point between the best practices for infrastructure management and those for the study and enhancement of cultural heritage. This dual nature means that the area has not yet been extensively explored. Moreover, ontologies in the engineering field have been extended by adopting several standards and developing ontology building tools. This has led to a broader understanding of the importance of standardised vocabularies and formalised semantics. There is indeed a need for workflows and procedures that allow reliable data exchange at the appropriate level of granularity. Although the buildingSMART International project IFC-Bridge (Borrmann et al., 2019) proposed a bridge-oriented extension of IFC, it focuses particularly on the design of reinforced concrete bridges and lacks a structured solution for managing the complex analysis of a bridge's state of health. On the other hand, considering masonry bridges from a Cultural Heritage (CH) point of view, neither the standard specifically designed for CH, the CIDOC Conceptual Reference Model (CRM), nor its extension contains a specific focus on masonry arch bridges.
Given the above-mentioned scenario, two key research questions arise:
• Which ontology structure is best suited to describe the different aspects that must be investigated to gain a more complete understanding of masonry arch bridges?
• How could this knowledge domain be exploited for BIM semi-aided modeling with a focus on vulnerability assessment?
The development of new ontologies, or the extension of one of the aforementioned standards, for a holistic analysis of masonry bridges is an interesting step in evaluating the state of health of these assets, and it requires a deep analysis of available standards and ontologies. Creating a new data schema or an extension could also serve to add semantic layers to H-BIM models or to help neural networks in semi-automatic point cloud segmentation, helping operators to simplify time-consuming and repetitive procedures.
The presented research shows the result of an in-depth analysis of computational ontologies for masonry arch bridges; following this, the authors propose a semantic conceptualization of the masonry arch bridge domain, structured in three groups of key concepts needed in the knowledge process: bridge elements, materials, and defects. This conceptualization represents the very first step in the creation of an ontology that will be used as a knowledge base for deep neural network training, toward the semantic recognition of bridge parts and defects, with the dual goal of aiding the automatic segmentation of masonry bridge point clouds and supporting the semi-automatic creation of information models by structuring part of the metadata. From this point of view, choosing the formal structure best suited to these tasks is crucial.

SEMANTIC AND BIM IN THE INFRASTRUCTURE FIELD
BIM and semantics are closely linked, as the very concept of parametric design needs an organised structure as a basis, which gives the modelled object a semantic value. As a knowledge collector, BIM also requires agility in connecting structured databases, whose use is essential when dealing with assets of considerable complexity, such as infrastructure and cultural assets. Interoperability in BIM for architecture, engineering, construction, and facility management is an active research trend (Ozturk, 2020). The IFC data exchange standard was primarily designed to describe buildings; the increasing need for full adoption of the BIM approach in the infrastructure domain (Costin et al., 2018) led to the development of extensions of the IFC schema to integrate information and concepts about infrastructure, such as bridges, roads, railways, and tunnels. Several researchers have analysed this open international standard, for example (Borin & Zanchetta, 2020), highlighting its limitations and potential and revealing a mixed scenario. The IFC schema is more suitable for managing geometry and structured information than for integrating unstructured data or semantic knowledge. Hence the need for specific extensions, according to the type of analysis to be conducted. As for bridges, there is a wider interest in concrete or steel structures, given the high incidence of these typologies around the world (Trzeciak & Borrmann, 2018). There are numerous IFC extensions, ranging from those enabling BIM-based descriptions of structural health monitoring systems in compliance with IFC modeling capabilities (Theiler & Smarsly, 2018) to those collecting domain knowledge of bridge rehabilitation to improve information integration and constraint management (Wu et al., 2021).
In the context of structural analysis requirements, (Park et al., 2020) propose an extended IFC-based bridge information modeling method that applies the meshfree structural analysis method to the IFC-based model. (Isailović et al., 2020) presented an approach for point cloud-based detection of spalling damage, combined with a method for enriching the IFC model with damage semantics. (Ismail et al., 2017) improved the semantic quality of BIM models and linked specific information from various domains, with a focus on bridge models based on the IFC standards. In (Esser and Aicher, 2019) the Visual Programming Language (VPL) tool Dynamo was used to prepare bridge models from InfraWorks, because neither InfraWorks nor Revit has a native IfcBridge interface yet. Thus, the authors developed a Dynamo library containing nodes to interact with bridge models and export them to the new IFC 4x2 standard.
In (Simeone et al., 2019) a prototypal application for semantic-enriched BIM was applied to built heritage case studies.

MATERIALS AND METHODS
Considering the outlined research scenario, the approach illustrated and developed here is part of the broader methodology proposed in (Garozzo, 2021). In this framework the authors focused on the semantic aspects of the research, where the design of a computational ontology for masonry arch bridges will constitute the basis for automating the segmentation process, being the knowledge base for deep neural network training, and for connecting the new concepts to the H-BIM workflow. The ontology under development needs to address both the cultural and the structural aspects of masonry bridges. In addition, to ease the creation of the ontology, a mixed approach exploiting the potential of both top-down and bottom-up strategies has been considered. The top-down approach served to identify the classes and concepts to be described. The bottom-up approach helped to characterize the materials and construction techniques that are specific to each geographic area. In this case, the masonry arch bridges of the Circumetnea have been selected as a case study to support the identification of materials and construction techniques and to test the applicability of the ontology to the purposes of the research.
To accomplish the first approach, historical treatises and manuals need to be consulted to extract the terminology and to identify the parts and their relationships. Construction techniques and materials vary according to the area and the construction period. All these aspects concur to define the different typologies. In addition, structural aspects and state-of-conservation issues need to be considered. Once all these aspects have been explored and identified, the conceptualisation step of the ontology can be carried out to define a scheme.
The data scheme needs to be verified against other existing schemas (e.g., IFC-bridge, CIDOC) to understand whether it is better to develop an extension of those existing ontologies or a brand-new one. Furthermore, the comparison shows which concepts are already covered and which ones must be added. In any case, the data schema needs to be implemented in terms of materials, building techniques and state of preservation according to a selected case study. In summary, the methodology is structured as follows:
• Conceptualisation: an in-depth survey of existing vocabularies and taxonomies, followed by the identification of the relationships and hierarchies between parts in order to choose the proper classes, subclasses, and properties of the ontology under development. It is crucial to conduct in-depth research by consulting technical manuals and treatises and by analysing several case studies and their typologies.
• Comparison of existing ontologies: this helps to understand whether it is better to use an existing ontology or create a new one.
• Ontology development: particular attention is given to the level of granularity. Some levels of information to be added, for instance, relate to the semantic structure, construction techniques, and typical defects.
The research work presented here focuses on the first two steps of this pipeline.

Case study
Circumetnea is a still-in-service railway connecting Catania to Riposto, almost encircling Mount Etna and passing through several towns on the slopes of the volcano. It is the last narrow-gauge railway in Sicily still in service, as other similar railways are no longer used. Built in a very short period (1889-1895) on the impulse of the 19th-century commercial growth of the area, Circumetnea is supported by several masonry bridges reflecting the typologies described in the technical manuals of the period. It is worth noting that these bridges are heterogeneous in terms of materials, geometry, and number of arches, although there is a certain degree of recurrence in typology (Figure 1). Moreover, Circumetnea bridges are heritage assets at risk. As an example, traffic rearrangements and requirements have led to the demolition of some of them.
To approach the case study comprehensively, the authors first focused on a cognitive phase carried out at the State Archive of Catania, with in-depth documentary research on about 200 folders dating from the time of the Circumetnea construction. Projects, metric computations, plans and longitudinal profiles of the rail route allow the investigation of construction and technological features, such as foundation typologies, which are usually not visible during on-site visits, as well as a better understanding of the reasons behind the project and of its development. After this early knowledge phase, a census was performed to identify the bridges to be used for the purposes of this research. After being located using Google Earth, the masonry arch bridges found were loaded into QGIS, with the aim of using archival and survey information, coordinates (latitude and longitude), number of spans and building materials to build a classification of attributes. The authors analysed a total of 37 bridges, 11 of which were surveyed. Several survey techniques were integrated according to the conditions and the type of artefact to be surveyed. In this phase, it was possible to digitally acquire easily accessible urban bridges of modest span. A Leica Geosystems BLK360 laser scanner was used, with a maximum range of 60 m and a scanning speed of 360,000 points/sec (Figure 2A). Nevertheless, it was necessary to integrate this survey with photogrammetry. A dataset of high-resolution images (4496x3000 pixels) was collected with a Nikon D5300 at a focal length of 18 mm. These images were processed using the Agisoft Metashape digital photogrammetry software.
In addition, other survey approaches were carried out, using videogrammetry and drone photogrammetry (Figure 2B).
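The attribute classification built during the census (coordinates, number of spans, building materials, survey status) can be pictured as a small table of records. The sketch below is an assumption for illustration only: the class `BridgeRecord`, the helper `group_by_spans`, and all names, coordinates and materials are hypothetical placeholders, not entries from the authors' QGIS project.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class BridgeRecord:
    """One census entry; field names are hypothetical, mirroring the attributes listed in the text."""
    name: str
    latitude: float
    longitude: float
    spans: int
    material: str
    surveyed: bool

def group_by_spans(records):
    """Group bridge names by their number of spans."""
    groups = defaultdict(list)
    for record in records:
        groups[record.spans].append(record.name)
    return dict(groups)

# Placeholder values only, NOT real census data.
records = [
    BridgeRecord("bridge_A", 37.50, 15.00, 1, "stone", True),
    BridgeRecord("bridge_B", 37.60, 14.90, 3, "brick", False),
    BridgeRecord("bridge_C", 37.70, 15.10, 1, "stone", False),
]

by_spans = group_by_spans(records)
surveyed_names = [r.name for r in records if r.surveyed]
print(by_spans, surveyed_names)
```

The same grouping could be applied to material or survey status to look for the recurrence in typology noted for the Circumetnea bridges.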

Definition of requirements
Creating an ontology is a very complex process from a technical point of view (e.g., use of ontology editors and frameworks, knowledge of programming languages) and requires a deep understanding of the objects to be classified. Given the above-mentioned considerations, working on masonry bridges requires information about three essential aspects: i) the construction techniques and materials, ii) the elements concerning the geometry and, finally, iii) the aspects regarding the state of health of masonry bridges, which can be defined through an analysis of defects. The ontology field in the AEC industry has been strengthened by the adoption of several standards and by a greater recognition of the importance of standardized vocabularies. Among masonry bridges, especially road or pedestrian ones, it is possible to find ancient specimens. Due to their configuration, these assets are characterised by a stratigraphy that needs to be analysed through an interdisciplinary and archaeology-oriented approach (Savini et al., 2021). In this direction, the reference to CIDOC and its extension fits perfectly. Nevertheless, masonry railway bridges make such a classification difficult to apply and require, at the very least, the revision of CIDOC or the creation of an ad hoc CIDOC extension. In terms of geometric aspects, part of the requirements is satisfied by the IFC standard, the most common vendor-independent format for exchanging BIM models. At the moment, IFC does not fully support infrastructure constructions like roads, bridges and tunnels, despite the recent developments of IFC4, which include the under-development IFC-Bridge extension.
As highlighted in a study carried out by buildingSMART Italy (Guida All'IFC per i Ponti - BuildingSMART Italia), one of the critical issues to be addressed concerns the classification of the elements constituting a masonry bridge through the IFC format. This mainly affects the complex geometric elements typical of masonry bridges. One of these is certainly the vault. Such an element, which can easily be encountered in buildings, especially historical ones, assumes a unique architectural and structural meaning in masonry bridges. The extension presents a first rough subdivision into substructure, superstructure and deck, which is then refined into abutment, deck, deck_segment, foundation, pier, pier_segment, pylon, substructure, superstructure, surfacestructure. This level of granularity is suitable for structural analysis or other kinds of applications, as is evident from the framework of Discrete Macro-Element Method (DMEM) software, e.g. HiSTrA Bridges (Caddemi et al., 2019). The vault element cannot currently be classified according to the IFC-bridge format except as a user-defined element; indeed, although IFC-bridge allows an arch structure to be assigned, it does not allow its configuration to be fully defined. In the study and analysis of masonry arch bridges, both from a structural and a conservation point of view, aspects related to the vault are essential, especially geometrical information (i.e., type of arch that generates the vault: round, lowered or pointed arch; type of vault: straight or oblique vault). For an accurate knowledge of the artefact, also aiming at the integration of information on its health status, it is necessary to reach a higher level of detail. This becomes crucial, for instance, in the analysis of geometric aspects, as in the study of skew vaults, or when, for reasons related to the analysis of decay, it becomes critical to analyse the single ashlar that is part of the vault.
In this direction, the documentary information heritage regarding the bridge is certainly important. Therefore, linking all the information to the corresponding parts is a key point in the creation of a fine-grained BIM model that can become an information hub capable of describing the architectural object in its entirety, in order to investigate the semantics of bridge components and the construction phases of these assets. As for the state of health of masonry bridges, the classification and analysis of defects is a very specific task of considerable complexity. This is due to the close relation of defects with both the materials and the geometry of bridges. It therefore requires further investigation and the creation of a purpose-built classification, to be developed subsequently and in accordance with the aforementioned semantisations. A further point concerns the importance of using semantics to guide the training of neural networks, both in the field of image recognition and in the generation of synthetic datasets. In this case, a core requirement is an ontology that contains information on the position in space of the elements composing masonry bridges. To this end, the authors advance the hypothesis of creating an ad hoc ontology, consistent with the IFC-bridge standard, which, with its orientation towards interoperability, is believed to be the most suitable tool to pursue the purposes of this research.
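The classification gap described above for the vault can be sketched in a few lines. The part types are those listed in the text for the IFC-bridge extension, and the arch and vault types are the ones named in the text; everything else is an assumption made for illustration: `BridgePartType`, `classify` and `VaultDescription` are hypothetical names, not part of any IFC toolkit, and the code does not read or write IFC files.

```python
from dataclasses import dataclass
from enum import Enum

class BridgePartType(Enum):
    """Part types listed in the text, plus a user-defined fallback."""
    ABUTMENT = "abutment"
    DECK = "deck"
    DECK_SEGMENT = "deck_segment"
    FOUNDATION = "foundation"
    PIER = "pier"
    PIER_SEGMENT = "pier_segment"
    PYLON = "pylon"
    SUBSTRUCTURE = "substructure"
    SUPERSTRUCTURE = "superstructure"
    SURFACESTRUCTURE = "surfacestructure"
    USERDEFINED = "userdefined"  # fallback for elements without a predefined type

def classify(element_name: str) -> BridgePartType:
    """Map an element name to a predefined part type, else USERDEFINED."""
    key = element_name.strip().lower().replace(" ", "_")
    for part in BridgePartType:
        if part.value == key:
            return part
    return BridgePartType.USERDEFINED

# Geometric information named in the text as essential for the vault.
ARCH_PROFILES = ("round", "lowered", "pointed")
VAULT_TYPES = ("straight", "oblique")

@dataclass(frozen=True)
class VaultDescription:
    """Hypothetical extra attributes a user-defined vault element would need."""
    arch_profile: str
    vault_type: str

    def __post_init__(self):
        if self.arch_profile not in ARCH_PROFILES:
            raise ValueError(f"unknown arch profile: {self.arch_profile}")
        if self.vault_type not in VAULT_TYPES:
            raise ValueError(f"unknown vault type: {self.vault_type}")

part = classify("Vault")  # no predefined type matches, so the fallback is returned
skew_vault = VaultDescription(arch_profile="lowered", vault_type="oblique")
print(part.name, skew_vault)
```

The point of the sketch is that a pier maps to a predefined type, whereas the vault falls back to the user-defined type and needs additional attributes to carry its geometric configuration.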

Conceptualisation
The conceptualization process started by organizing concepts based on a semantic triple model, a set of three entities that codifies a statement about semantic data in the form of subject-predicate-object expressions. To better manage the creation of the semantic conceptualization, a mixed bottom-up and top-down approach has been used, according to the needs explained below.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVI-M-1-2021 28th CIPA Symposium "Great Learning & Digital Emotion", 28 August-1 September 2021, Beijing, China
A top-down approach was adopted to identify classes and relations between bridge elements, with a focus on their characteristics and geometric shapes. Applying an analytical-technological method, the components of the bridge were identified with the related details and nomenclature, and the functional hierarchy of each part in relation to the whole was highlighted. To reach this goal, historical treatises and manuals need to be consulted. During this process, an important reference was the illustrated dictionary (Torre, 2012), which suggests a remarkable and comprehensive classification of most of the concepts related to a masonry bridge. These items were categorized by adopting the general classification criteria set out in this reference. Several pieces of information related to dimensional and geometrical aspects were also emphasized, i.e., the plan layout and the arch profile of vaults. Starting from the object Masonry arch bridge, the main classes identified are Spandrel Wall, Pier, Vault, Roadway, Foundation.
Each of these classes, conceived as subclasses of a more generic Physical Object class, is characterized by types (different typologies of the same class) and lower-level concept categories (other classes that compose the more generic ones). These classes are characterized by Material, Shape and Defect, designed as Physical Property (Figure 3). The Shape property is slightly different from the others. It is designed to have, among its features, the possibility of adding formulas as text. This is a key point for elements such as the vault which, as previously mentioned, is one of the most significant elements in masonry arch bridges (Figure 4). On the other hand, a bottom-up approach starting from the chosen case study helped to characterize materials. This kind of approach was preferred because materials and construction techniques are specific to each area and because, at least in this first phase, it was decided to focus on the previous classes and on the methodological approach in general. In addition, structural aspects and state-of-conservation issues need to be considered. For these, a mixed top-down and bottom-up approach was followed. As a reference, the authors use both the Italian guidelines for monitoring and managing existing bridges and some inspection sheets previously produced for the Circumetnea under an agreement with the Department of Civil Engineering and Architecture of the University of Catania. An aspect that does not emerge from this first conceptualization is the position of the objects in space, a crucial point for the application of the ontology in the training of neural networks and in the creation of synthetic datasets. For this aspect, the authors are considering a solution to be applied directly to the architecture of the ontology, through the creation of special connected classes that relate the objects by defining their relative position.
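The conceptualization described above can be written down as a small set of subject-predicate-object triples. The class and property names (Physical Object, Spandrel Wall, Pier, Vault, Roadway, Foundation, Material, Shape, Defect, Physical Property) come from the text; the predicate names (`subClassOf`, `hasPart`, `hasMaterial`, `hasFormulaText`, `restsOn`), the `Triple` type and the `query` helper are hypothetical choices made for this sketch, not the authors' ontology.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str

MAIN_CLASSES = ["Spandrel Wall", "Pier", "Vault", "Roadway", "Foundation"]
PHYSICAL_PROPERTIES = ["Material", "Shape", "Defect"]

triples = []
for name in MAIN_CLASSES:
    # Each main class is a subclass of Physical Object and a part of the bridge.
    triples.append(Triple(name, "subClassOf", "Physical Object"))
    triples.append(Triple("Masonry arch bridge", "hasPart", name))
    # Each main class is characterized by Material, Shape and Defect.
    for prop in PHYSICAL_PROPERTIES:
        triples.append(Triple(name, "has" + prop, prop))

for prop in PHYSICAL_PROPERTIES:
    triples.append(Triple(prop, "subClassOf", "Physical Property"))

# Shape can carry a formula stored as text (hypothetical predicate name).
triples.append(Triple("Shape", "hasFormulaText", "string"))
# Relative position between objects (hypothetical predicate and pairing).
triples.append(Triple("Vault", "restsOn", "Pier"))

def query(store, subject=None, predicate=None, obj=None):
    """Return the triples matching every field that is not None."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]

bridge_parts = [t.obj for t in query(triples, predicate="hasPart")]
vault_statements = query(triples, subject="Vault")
print(bridge_parts)
print(vault_statements)
```

A positional predicate such as the one assumed here is one possible way to express the relative position of objects, which the text identifies as missing from the first conceptualization.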

AI approach, first results
It is well known that AI methods require a significant amount of annotated data. In the context of masonry arch bridges, which are no longer being built, it is difficult to collect the necessary amount of data. To overcome this issue, it is possible to exploit data augmentation techniques to create synthetic data that enrich the dataset, through an approach based on Generative Adversarial Networks (GANs). At this stage of the research, the material needed for the creation of the synthetic dataset was acquired, pending the development of the ontology and the next steps linked to it. A masonry arch bridge dataset consisting of 10,446 images of almost 3,000 masonry arch bridges was collected from the web. These data were acquired using a web scraping technique on Structurae.net, an international database and gallery of structures. Data collection is a key point in the training process, as a poorly built dataset strongly affects GAN models. After the scraping, several days of manual data cleaning were needed to obtain the final dataset to be used for training. Images of aqueducts, drawings and plans, images with altered colouring, and images that do not represent significant elements of bridges were removed from the dataset, leaving a total of 7,434 images after the cleaning operations.
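The effect of the cleaning step can be quantified directly from the two totals given above, and the removal criteria can be expressed as a simple filter. The sketch below is illustrative only: the cleaning was manual, so the label names, the `clean` helper and the toy records are assumptions, while the two totals are the figures reported in the text.

```python
# Removal criteria named in the text, written as hypothetical label strings.
REJECT_LABELS = {
    "aqueduct",
    "drawing_or_plan",
    "altered_colouring",
    "no_significant_element",
}

def clean(records):
    """Keep only the records whose manual label is not a rejection reason."""
    return [r for r in records if r["label"] not in REJECT_LABELS]

# Toy records with placeholder file names, NOT entries from the real dataset.
toy_records = [
    {"file": "a.jpg", "label": "bridge"},
    {"file": "b.jpg", "label": "aqueduct"},
    {"file": "c.jpg", "label": "drawing_or_plan"},
    {"file": "d.jpg", "label": "bridge"},
]
kept_files = [r["file"] for r in clean(toy_records)]

# Totals reported in the text.
scraped_total = 10_446
kept_total = 7_434
removed_total = scraped_total - kept_total
removed_share = removed_total / scraped_total

print(kept_files)
print(f"removed {removed_total} images ({removed_share:.1%} of the scraped set)")
```

With the reported totals, 3,012 images were discarded, which is roughly 29% of what was scraped.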

CONCLUSION
The historic infrastructures still in operation represent a heritage to be safeguarded and protected. The focus of this work has been the creation of a common domain of knowledge useful for a better interaction between the different competences involved.
In the literature there is a lack of schemas, whether standard or created from scratch, related to masonry bridges (e.g., extensions of CIDOC-CRM), as well as of interoperable formats ready to be used for the purposes of this research (e.g., IFC-bridge). Therefore, it was necessary to carry out an in-depth study to develop a conceptual scheme that constitutes the basis for the creation of an ontology. The study involved an initial investigation of the topic of masonry bridges to identify the requirements, consultation of existing thesauri (for terminology), and verification of existing formal knowledge schemes to identify the most suitable one. To better develop the conceptual framework, a mixed top-down and bottom-up approach was tested. This led to a better description of the classes and items as well as to an increasing granularity. The chosen case study provided excellent support in reaching the objectives of this work, whose aim is to create a knowledge system usable both for enriching BIM models of masonry bridges and for automatically segmenting point cloud data via AI techniques, specifically Deep Learning. All the historical, archival and bibliographic sources, as well as the 3D acquisition and processing, helped in understanding the topic and its specificities. The next steps of the research will address the development of the computational ontology and its testing in the above-mentioned scenarios. Another key point will be the AI application aimed at an increasingly automated information modelling.

AUTHORS CONTRIBUTION
Conceptualization; methodology; formal analysis; investigation; data curation; visualization; writing - original draft: R.G. Conceptualization; supervision; visualization; project administration; writing - original draft; writing - review & editing: C.S. From a strictly editorial point of view, the paragraphs were divided as follows: paragraphs 1 and 3: R.G., C.S.; paragraph 2: R.G.; paragraph 4: C.S. All authors have read and agreed to the published version of the manuscript.