CONNECTING GEOMETRY AND SEMANTICS VIA ARTIFICIAL INTELLIGENCE: FROM 3D CLASSIFICATION OF HERITAGE DATA TO H-BIM REPRESENTATIONS

ABSTRACT: Cultural heritage information systems, such as H-BIM, are increasingly widespread, thanks to their potential to bring together, around a 3D representation, the wealth of knowledge related to a given object of study. However, the reconstruction of such tools starting from 3D architectural surveying is still largely regarded as a lengthy and time-consuming process, with inherent complexities related to managing and interpreting unstructured and unorganized data derived, e.g., from laser scanning or photogrammetry. Tackling this issue and starting from reality-based surveying, the purpose of this paper is to semi-automatically reconstruct parametric representations for H-BIM-related uses, by means of the most recent 3D data classification techniques that exploit Artificial Intelligence (AI). The presented methodology consists of a first semantic segmentation phase, aiming at the automatic recognition through AI of architectural elements of historic buildings within point clouds; a Random Forest classifier is used for the classification task, and the performance of the predictive model is evaluated each time. In a second stage, visual programming techniques are applied to the reconstruction of a conceptual mock-up of each detected element and to the subsequent propagation of the 3D information to other objects with similar characteristics. The resulting parametric model can be used for heritage preservation and dissemination purposes, in line with common practice in modern H-BIM documentation systems. The methodology is tailored to representative case studies of the medieval cloister typology, scattered over the Tuscan territory.


INTRODUCTION
In recent years, the field of architectural heritage has benefited from an increasing use of digital information models, such as Heritage-Building Information Modeling (H-BIM) systems, which have enabled the exploitation of digital replicas as integrated tools to archive, retrieve and disseminate the knowledge related to the documentation and preservation of a heritage asset (Croce et al., 2019). In this context, it has become increasingly essential to implement and characterize the different digital 3D surveying outputs, as obtained from laser scanning or photogrammetry (Bevilacqua et al., 2018), in order to make them more easily interpretable and shareable, in terms of both inclusion and visualization of the information related, e.g., to the results of material analysis, to the level of degradation of surfaces, to the state of conservation or to the architectural components represented (Croce et al., 2020). Hence, the correct construction of an information model for the handling of digital heritage data should follow three consecutive phases: data acquisition; semantic segmentation, i.e. the distinction of the represented elements and their classification according to a certain grouping criterion; and 3D restitution of the geometric shapes for H-BIM applications. To date, however, this workflow proves to be time-consuming and barely automated, which often leads experts to forego Scan-to-BIM solutions in practical applications (Andriasyan et al., 2020; López et al., 2018). On the other hand, Machine Learning (ML) and Deep Learning (DL) techniques, derived from Artificial Intelligence (AI), are offering promising results for the classification, hence interpretation, of digital heritage data (Fiorucci et al., 2020). In this context, starting from reality-based surveying, the purpose of this paper is to propose a semi-automatic approach to the construction of parametric models for H-BIM-related uses.
This approach is based on: i) 3D data classification techniques that exploit AI to distinguish architectural components of historic buildings within point clouds; ii) generative algorithms, built in visual programming environments, for the reconstruction and subsequent propagation of each component identified (Figure 1). The methodology is validated on relevant case studies of cloisters of Italian medieval buildings.

PREVIOUS WORKS
In the architectural heritage domain, a very active research topic that has emerged in recent years involves the application of ML and DL techniques, both fields of AI, to assist the interpretation, logical organization, and semantic enrichment of the digital data of a given asset being studied (Fiorucci et al., 2020), e.g. in terms of recognition of architectural elements, re-assembly of dismantled fragments (Paumard et al., 2020) and detection of occluded or damaged wall regions (Ibrahim et al., 2020). In the case of survey data, attention is devoted to the possibility of recognizing and annotating, in a much more straightforward manner, the specific characteristics of a building or site: from the early experiments geared towards the semantic segmentation of heritage 2D images (Manfredi et al., 2013; Korc and Forstner, 2009; Ibrahim et al., 2020), investigations moved on to studying more automatic annotations directly on 3D media, i.e. point clouds (Grilli et al., 2018) and/or textured polygonal meshes. Among these approaches, supervised learning ones, which rely on annotated training data as input, are more suitable to describe the complexity and morphology of historical buildings; a robust approach of this kind exploits the Random Forest (RF) algorithm (Breiman, 2001). A small portion of the 3D point cloud is manually annotated, and appropriate features are extracted, describing the elements of the dataset, e.g. columns, walls, floors, vaults and so on. Then, the RF classifier is trained to identify the same classes in new, unseen and non-manually annotated parts of the point cloud.
Further developments of this approach led to testing the algorithm also for multi-level and multi-scale semantic segmentation (Teruggi et al., 2020). With the same goal, Murtiyoso and Grussenmeyer (2020) presented an algorithmic approach in the form of a toolbox that supports the manual segmentation of large point clouds, including several semi-automated pipelines. As of today, ML approaches prove to be more efficient than the DL subset, mainly due to the limited availability of semantically annotated data that would be needed to train deep artificial neural networks. In 2020, however, the ArCH benchmark dataset provided the first real attempt to solve this bottleneck by bringing together a collection of manually labeled heritage point clouds. Concurrent with developments in semantic segmentation of survey data, the widespread use of Building Information Modeling techniques applied to cultural heritage is also significant, as confirmed by the extensive literature reviews provided by (López et al., 2018; Tang et al., 2010). H-BIM methods make it possible to bring together, in a single environment, the geometric representation of a heritage artifact and the knowledge related to its study and analysis; this feature has doubtlessly contributed to the wide diffusion of Scan-to-BIM practices, which aim to reconstruct an information model starting from 3D laser scanner or photogrammetric survey data. The use of AI-derived techniques for the automation of such processes, to date still lengthy, mostly manual and time-consuming, was first tested in the work by (Croce et al., 2021a), which demonstrated their effectiveness and feasibility for the reconstruction of H-BIM models.
The present work shows a further step: stemming from the analysis of the recurring architectural elements typical of the medieval cloister typology, visual programming algorithms are used for the reconstruction and propagation of conceptual geometries derived from surveying and identified over the annotated point cloud.

METHODOLOGY
The present research starts from the insight that semantic segmentation techniques exploiting ML can optimize the Scan-to-BIM process, making it faster and more effective and thus improving the interpretation and reconstruction phases. The proposed approach combines:
1. Semantic segmentation via ML, to increase automation in the recognition and classification of element classes in both 2D and 3D heritage survey data. The RF algorithm is used for the classification task;
2. Parametric reconstruction of the classes of elements, by making use of visual programming languages (Rhino & Grasshopper for Rhinoceros), in view of implementation in H-BIM platforms.

Semantic segmentation via Machine Learning
The approach to the semantic segmentation of heritage data leverages RF as a supervised learning algorithm and is implemented according to the successful strategies described by (Croce et al., 2021a). According to the learning process illustrated by (Weinmann, 2016) and starting from an initial point cloud obtained by laser scanner or photogrammetry, the process is articulated in the following phases: neighborhood selection, feature extraction and selection, manual annotation and classification.
A suitable set of features is extracted and selected from the original point cloud data: these can be radiometric (color values) or geometric features (computed considering a spherical neighborhood of each 3D point). These data, alongside the manual identification (annotation) of a reduced portion of the cloud, are used to train an RF to classify new data. In addition to the geometric features derived from the covariance matrix and the R, G, B color data, we consider a curvature measure, the Normal Change Rate, which describes for each point how quickly the orientation of the surface normal changes.
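As an illustration of covariance-based geometric features, the following minimal sketch computes eigenvalue descriptors over a spherical neighborhood. It is written in plain Python rather than in the MATLAB environment used in this work, and it makes several assumptions: the function names are hypothetical, the neighborhood search is brute force, and the specific descriptors (linearity, planarity, sphericity, surface variation) are commonly used ones, not necessarily the exact set selected here. In particular, treating surface variation as a stand-in for the Normal Change Rate is an assumption.

```python
import math

def neighbors_within(points, center, radius):
    """Brute-force spherical neighborhood: points within `radius` of `center`."""
    r2 = radius * radius
    return [p for p in points
            if sum((a - b) ** 2 for a, b in zip(p, center)) <= r2]

def covariance_3x3(neighborhood):
    """Covariance matrix of a list of (x, y, z) points."""
    n = len(neighborhood)
    mean = [sum(p[i] for p in neighborhood) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in neighborhood:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    return cov

def symmetric_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix, descending (closed-form trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:
        eig = [a[0][0], a[1][1], a[2][2]]  # matrix is already diagonal
    else:
        q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
        p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
        p = math.sqrt(p2 / 6.0)
        b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
             for i in range(3)]
        det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
                 - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
                 + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
        r = max(-1.0, min(1.0, det_b / 2.0))  # guard against rounding error
        phi = math.acos(r) / 3.0
        e1 = q + 2.0 * p * math.cos(phi)
        e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
        eig = [e1, 3.0 * q - e1 - e3, e3]
    # Covariance eigenvalues are non-negative; clamp tiny negative rounding noise.
    return sorted((max(e, 0.0) for e in eig), reverse=True)

def geometric_features(neighborhood):
    """Eigenvalue-based shape descriptors of one neighborhood."""
    l1, l2, l3 = symmetric_eigenvalues(covariance_3x3(neighborhood))
    if l1 == 0.0:  # degenerate neighborhood (all points coincide)
        return {"linearity": 0.0, "planarity": 0.0,
                "sphericity": 0.0, "surface_variation": 0.0}
    return {"linearity": (l1 - l2) / l1,
            "planarity": (l2 - l3) / l1,
            "sphericity": l3 / l1,
            "surface_variation": l3 / (l1 + l2 + l3)}
```

For points sampled along a straight edge the linearity approaches 1, while for points on a flat wall or floor the planarity approaches 1; such values, computed per point, would form part of the predictor vector passed to the classifier.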
With a view to placing this study in a wider context and sharing the results more effectively, the identification of the classes of elements relies on the subdivision proposed by the state-of-the-art 3D ArCH benchmark, which distinguishes ten classes: arch, column, molding, floor, door/window, wall, stair, vault, roof, other. The predictive model is validated on a subset of the annotated data (25%), yielding a confusion matrix that compares true and predicted classes. The procedure is performed using MATLAB's Machine Learning Toolbox.
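The validation step described above can be sketched as follows. This is a minimal illustration in plain Python, not the MATLAB procedure used in this work: the function names, the random seed and the row/column convention of the matrix (rows are true classes, columns are predicted classes) are assumptions made for the example.

```python
import random

def holdout_split(samples, labels, holdout_fraction=0.25, seed=0):
    """Randomly reserve a fraction of the annotated data for validation."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_hold = int(round(len(idx) * holdout_fraction))
    hold, train = idx[:n_hold], idx[n_hold:]
    return ([samples[i] for i in train], [labels[i] for i in train],
            [samples[i] for i in hold], [labels[i] for i in hold])

def confusion_matrix(true_labels, predicted_labels, classes):
    """Count matrix: rows index the true class, columns the predicted class."""
    pos = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[pos[t]][pos[p]] += 1
    return matrix

def per_class_scores(matrix, classes):
    """Overall accuracy plus (precision, recall) for each class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(classes)))
    scores = {}
    for k, c in enumerate(classes):
        predicted = sum(row[k] for row in matrix)  # column sum
        actual = sum(matrix[k])                    # row sum
        precision = matrix[k][k] / predicted if predicted else 0.0
        recall = matrix[k][k] / actual if actual else 0.0
        scores[c] = (precision, recall)
    accuracy = correct / total if total else 0.0
    return accuracy, scores
```

In use, the classifier would be trained on the first two lists returned by `holdout_split`, and its predictions on the held-out samples would be compared with the held-out labels through `confusion_matrix`.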

H-BIM reconstruction via visual programming
An annotated point cloud is obtained at the end of the semantic segmentation process: therein, the different architectural components are distinguished in accordance with the ten classes of the ArCH benchmark. With this outcome, each class of architectural elements can be processed separately and independently of the others, by building a conceptual reference model for each geometry. This is in line with the rationale of the H-BIM process, whereby the model is generated through smart objects, appropriately distinguished in terms of typology and morphology, e.g., roof, wall, floor, column etc. A simplified 3D model is thus generated from the classified point cloud: for each class, the three-dimensional objects are reconstructed relying on reference geometries and proportions derived from treatises of historical architecture. The reconstruction of the architectural components follows the modeling rules proposed by (De Luca et al., 2007), where an ideal shape is reconstructed through recognition and parametric reconstruction of the atoms, profiles and surfaces that compose it. The procedure is broken down class by class and is accomplished by exploiting the visual programming language, following these steps:
1. Structuring of the class element to be reconstructed through definition of basic construction planes, constraints and atoms, base profiles and ensuing functions of extrusion, loft, sweep etc.;
2. Subsequent modifications of a given class element based on proportional variations or changes noted in the reality-based model;
3. Definition of element replica operations. The duplicates of a class object allow for the propagation of the defined conceptual geometry to multiple model elements that present the same characteristics.
The mathematical and conceptual representation of each class is managed through generative modeling procedures, based on the creation of Non-Uniform Rational B-Splines (NURBS) and Boundary-Representations (B-Reps).
In detail, we leverage the graphical algorithm editor Grasshopper, integrated with Rhino, for the generation, real-time modification, and graphical control of the architectural forms.
The model obtained at the end of the process can be used to build H-BIM type representations, i.e., to construct 3D repositories of the architectural heritage, that can be further enriched with information related to conservation and documentation.

CASE STUDIES
The case studies analyzed in this contribution refer to the architectural typology of the cloister, a characteristic structure that can be found in several historical buildings, both civil and religious, and in which recurring typological elements (e.g., [...] The original structure dates to the 14th century, but several transformations performed during the 17th century have given it its current layout, with a series of vaulted galleries facing the internal courtyard and a central cistern. The survey was performed by laser scanner and was later integrated with drone-based photogrammetry to capture the roofing elements; it returned a point cloud of 6 M points. iii) The third cloister is located inside the medieval convent of San Matteo in Pisa, founded in the 11th century and today turned into a National Museum. The structure underwent major changes during the 16th century with the construction of the portico: a granite-columned loggia with Gothic windows and a cross-vaulted ambulacrum currently closes the central space, which covers around 20 x 35 m. The survey in this case was carried out by ground-based photogrammetry and the resulting point cloud consists of about 12 M points. In the considered point clouds, the minimum spacing between points was set to 0.01 m.

Semantic segmentation via Machine Learning
For the three case studies, the semantic segmentation approach relying on the RF algorithm is applied on a case-by-case basis: for each case study, a sufficiently exhaustive portion of the model, describing all the classes that are present, is chosen as the training set. This set is manually annotated with the classes of the ArCH benchmark. However, the number of classes is reduced to 9 in two cases: for the Grand cloister there are no objects belonging to the class '6 - Stair', while for the San Matteo point cloud the class '9 - Roof' is missing, since the survey, performed by ground-based photogrammetry, did not allow the covering elements to be detected. The training set was composed of almost 1.8 M points for the Grand cloister dataset, 2 M points for the Grand-Ducal cloister dataset and 3 M points for the National Museum of S. Matteo. Once the manual annotation of this portion of the point clouds is completed, the next step consists in the extraction and subsequent selection of geometric features: some of the features selected through predictor importance estimates are shown in Figure 3. The color values (R, G and B) and the Z coordinate are then added as additional predictors. The selected features, associated with the manually annotated training set, are used to train the RF classifier, in order to extend the classification to the whole point cloud. This procedure is followed for each of the cases under investigation.
The result of the semantic segmentation procedure is then evaluated on a holdout subset (25%), and the resulting confusion matrix provides a final estimate of the ML model's performance after training (Figures 4-5, Table 1). The final classification is shown in Figure 6 and forms the basis for the subsequent construction of the H-BIM model.

H-BIM reconstruction via visual programming
The result of the semantic segmentation procedure, i.e. the annotated point cloud, provides a 3D datum that is already segmented: the classes are thus isolated and imported, one by one, for reconstruction in the parametric modeling environment. Each class of components constitutes a separate file in .e57 format, which is imported into the Rhino environment for further generative modeling. At this stage, using graphical algorithms implemented in the visual programming interface Grasshopper, the conceptual geometry is reconstructed based on procedures of interpretation and formalization of the shape grammar of the represented elements (Figures 7-8).
An example is provided in Figure 7 for the 'column' class of the Grand cloister dataset: at first, descriptors and geometrical attributes are defined for the construction of this architectural component. The column shaft is then constructed relying on the study of the dimensional relationships between the column base diameter and its height, which allows reference circles to be established at different heights. These profile circles are then used to define, via a loft function, the conceptual surface shape of the column's shaft.
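The construction of the shaft from reference circles can be illustrated outside the visual programming environment with a minimal Python sketch. It does not use the Rhino or Grasshopper API: the loft is approximated here by linear interpolation of the radius between consecutive circles (a straight loft sampled as rings of vertices), whereas a NURBS loft would produce a smooth surface. The function names and the proportions in the usage note are hypothetical, illustrative values.

```python
import math

def reference_circles(base_diameter, height_ratio, radius_factors):
    """Reference circles as (z, radius) pairs.

    height_ratio   : shaft height expressed as a multiple of the base diameter
    radius_factors : list of (relative height in [0, 1], factor of base radius)
    """
    height = base_diameter * height_ratio
    base_radius = base_diameter / 2.0
    return [(t * height, f * base_radius) for t, f in radius_factors]

def ring(z, radius, segments):
    """Vertices of one horizontal circle, sampled with `segments` points."""
    return [(radius * math.cos(2.0 * math.pi * i / segments),
             radius * math.sin(2.0 * math.pi * i / segments),
             z) for i in range(segments)]

def loft_shaft(circles, segments=16, rings_between=4):
    """Straight loft: interpolate linearly between consecutive reference circles."""
    rings = []
    for (z0, r0), (z1, r1) in zip(circles, circles[1:]):
        for k in range(rings_between):
            s = k / rings_between
            rings.append(ring(z0 + s * (z1 - z0), r0 + s * (r1 - r0), segments))
    z_top, r_top = circles[-1]
    rings.append(ring(z_top, r_top, segments))  # close with the top circle
    return rings
```

For example, `loft_shaft(reference_circles(0.4, 8.0, [(0.0, 1.0), (1 / 3, 1.0), (1.0, 0.85)]))` samples a shaft that keeps the base radius over its lower third and then tapers to 85% of it at the top. Changing the base diameter or the ratios regenerates the whole shape, which is the property the parametric definition relies on.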
Once this reference geometry is constructed, the information can be propagated to the entire 'column' class, considering the parts of the original point cloud that fall under the same category of elements; this is done through duplication/replica operations and subsequent modification and/or adjustment of the class parameters, again performed through Grasshopper's visual programming language. Such a process can be repeated for each class. Following similar principles, the reconstruction takes into account the relationship between one class and another. Figure 8 shows the results obtained in the construction of some significant classes extracted from the considered datasets: the 'column' class for the Grand cloister, the 'vault' class for the Grand-Ducal cloister and the 'arch' class for the National Museum of San Matteo. The creation and propagation of conceptual geometries via generative design rules makes it possible to control, each time, the repetition of these reconstructed graphic elements and, where needed, the modification of their parameters. By extending this procedure to all the classes of the three datasets, the result is a conceptual model, as displayed in Figures 9-10, that can be leveraged in the future for further semantic enrichment, e.g., in terms of addition of documentary resources or information on the state of preservation, recovery projects, and so forth. The conceptual representation obtained provides an effective support tool in the documentation of each architectural asset: being based on conceptual geometries, it is always possible to add, retrieve or update information directly within the 3D representation, as a basis for H-BIM type information systems (Figure 11).
Figure 7. Construction of the column's shaft by visual programming: base geometry, loft function and propagation.
Figure 11. The Grand-Ducal cloister visualized in Autodesk Revit.
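The propagation of a reference geometry to the other elements of a class can be sketched as follows, in plain Python and independently of Grasshopper. The sketch assumes that the points of the class have already been separated into one group per element (how this separation is obtained is not covered here), and it estimates the placement and size of each replica from simple statistics of its points. All function names and parameter keys are hypothetical.

```python
import math

def instance_parameters(points):
    """Estimate placement and size of one element from its (x, y, z) points."""
    xs, ys, zs = zip(*points)
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    z_min, z_max = min(zs), max(zs)
    # Largest horizontal distance from the vertical axis through the centroid.
    radius = max(math.hypot(x - cx, y - cy) for x, y in zip(xs, ys))
    return {"origin": (cx, cy, z_min),
            "height": z_max - z_min,
            "base_diameter": 2.0 * radius}

def propagate(template, instances):
    """Duplicate the template parameters once per element, overriding the
    values that can be measured on that element's points."""
    return [dict(template, **instance_parameters(pts)) for pts in instances]
```

Each returned dictionary describes one replica: parameters that are not measured on the points (for instance a profile definition stored in the template) are inherited unchanged, while position, height and diameter are adjusted element by element.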

CONCLUSIONS
New ways of interpreting 3D data through AI make it possible to shift the focus from the raw survey output to the reconstruction and subsequent enrichment of an H-BIM model, for cultural heritage documentation and conservation policies. In the proposed approach, a predictive ML model first semantically organizes the information contained within raw survey data. Then, by recognizing ideal geometries from these segmented data and by reconstructing each class in a parametric environment, a conceptual representation is derived. This representation can be used as an information system in which to store knowledge-related data: the latter may be graphically associated with the whole class of elements, with single elements, or with parts of them, as expressed in (Croce et al., 2020). As a further development of this work, our aim is to define, distinguish and model these different types and levels of annotations, also considering the possibility of transferring information from a segmented class of the point cloud to its parametric H-BIM representation.
The results are promising in terms of increasing automation in Scan-to-BIM processes, for a more effective documentation of architectural assets. The extension of the proposed methodology to other datasets, as well as the implementation of a single training model to classify multiple datasets of the same type (e.g., referring to the architectural typology of the cloister), are developments of the research currently underway.