FROM ARCHITECTURAL SURVEY TO CONTINUOUS MONITORING: GRAPH-BASED DATA MANAGEMENT FOR CULTURAL HERITAGE CONSERVATION WITH DIGITAL TWINS.

Continuous monitoring procedures are becoming increasingly crucial for assessing the potential deterioration of architectural structures, owing to the many advantages they bring. A cultural heritage site is, in fact, constantly subject to degradation, due in particular to atmospheric agents. Preserving it through preventive analyses is an important goal for the scientific community. In this context, data collection methodologies, together with Artificial Intelligence, play a central role in the analysis of surface degradation. This study presents a proposal for the automated analysis of the state of degradation of an artefact, exploiting a novel solution that combines a graph database with a dynamic Digital Model. In this regard, a preliminary case study is presented, based on the Quadriportico of the Cathedral of San Matteo in Salerno, Italy.


INTRODUCTION
Preserving Cultural Heritage (CH) from the passing of time, from degradation and from the action of pathogenic elements is, nowadays, an aim that the AEC (Architecture, Engineering, Construction) scientific community has set in order to protect buildings and historical sites. This important goal has radically changed the approach to the conservation of CH, with the spread of preventive and predictive analyses based on innovative continuous monitoring methods. Indeed, to interpret the conditions of environmental and structural degradation, the field of conservation can take great advantage of technological improvements in Wireless Sensor Systems (WSS), including different survey methods able to generate huge amounts of environmental data. In this scenario, an open issue is how to correlate, in an automated procedure, the data coming from heterogeneous sensors that describe the degradation phenomena related to the state of health of an architectural artifact. As a matter of fact, to date, traditional digital representations of such artifacts are mostly concerned with geometrical and colorimetric information, without effectively representing the state of conservation. On the other hand, novel solutions based on Digital Model representations, while allowing more effective conservation strategies to be developed, still face significant data management challenges, including the need to represent relationships among data streams coming from WSS and from processing/analysis methods. Given the above issues, four main questions arise:
1. What are the parameters that influence the state of conservation of an architectural artifact?
2. Which reality-based methodologies are useful for data acquisition?
3. What are suitable strategies to store data obtained by WSS and by advanced acquisition strategies?
4. How can complex representations and cross-referencing of collected data be supported?
This study aims to present a proposal for the automated analysis of the state of degradation of an architectural artifact, based on data collected by WSS, leveraging its Digital Model in combination with a data persistence and processing solution aimed at supporting Artificial Intelligence (AI)-powered analyses. Among other advantages, the proposed approach makes degradation analyses as objective as possible, thus reducing the amount of subjectivity currently present in less-automated processes.
To exemplify the application of the proposal, a preliminary case study based on the Quadriportico of the Cathedral of San Matteo in Salerno (figure 1) will be presented (M.C.).

STATE OF THE ART AND WORK AIMS
The diffusion of WSS, combined with Big Data techniques, is leading to significant evolutions in the concept of "continuous monitoring" in other fields of investigation, such as medicine, engineering and the automotive sector. Indeed, by combining data collected from Sensor Networks with advanced data processing/analytics capabilities, it is possible to set up innovative predictive analysis procedures.
This revolution in monitoring strategies is also opening up many exciting scenarios in the CH field, where new types of analysis based on predictive models are becoming possible, as is well represented in the literature. Indeed, several studies are dedicated to the development of innovative solutions for continuous monitoring. The first continuous monitoring experiments date from 2010 onwards. In particular, these studies focused on the continuous monitoring of microclimate through the installation of indoor and outdoor sensors to manage and monitor CH (Rodríguez-Sánchez et al., 2011; Mesas-Carrascosa et al., 2016; Frasca et al., 2020). In 2017 Carlo Ratti and other researchers from the Senseable City Lab of the Massachusetts Institute of Technology studied an innovative and economical method, based on the analysis of data collected by ordinary smartphones, to monitor the vibrations of road bridges. The monitoring of the Harvard Bridge (Boston, MA) was conducted through crowd-sensing: the information was collected by the smartphones of travelers in transit, i.e. through mobile sensors. The idea was subsequently tested by comparing the data collected in hundreds of crossings made on two bridges (San Francisco and Boston) with those collected by fixed monitoring stations (Matarazzo et al., 2018). Continuous monitoring procedures are nowadays indispensable for assessing the potential deterioration of architectural structures, and in this context the survey also plays an important role.
With reality-based methodologies, it is possible to obtain point clouds and digital three-dimensional models that allow the whole and the details of architectures to be navigated in virtual space (Campi et al., 2018; di Luggo et al., 2019; Tucci et al., 2019; Lovas et al., 2020), while with WSS methodologies it is possible to acquire a series of parameters, such as temperature, humidity and vibrations, which determine the state of conservation of the artifact. The integration of reality-based techniques and WSS is related to the concept of Digital Twin (DT) (Madni et al., 2019; Evans, 2019). The idea of the DT was defined in 2002 in the manufacturing sector, when Michael Grieves coined the term, and was subsequently developed in other fields of investigation as well (Grieves and Vickers, 2017; Grieves, 2019). In this context, DTs are algorithms that, based on a series of information arriving from sensors, emulate the behavior of an architectural object or artifact while providing real-time knowledge of data through a database. Currently there is no unambiguous definition of the term, and the different concepts and methodologies strongly depend on the respective application context (Van der Valk et al., 2020). The term is nevertheless very popular and has attracted growing attention in various disciplinary fields. For example, in the architectural field the concept of DT is innovative and closely linked to the HBIM model (Jouan and Hallo, 2020; La Russa et al., 2020) or to augmented reality (Dezen-Kempter et al., 2020). The purpose of this methodology is to obtain a virtual copy of the object in order to develop conservation and maintenance strategies with a constant update of the data, but several difficulties remain today. The construction of a digital twin is therefore complex and does not fully reflect the work we want to conduct.
For this reason, we believe that our model is much closer to a dynamic Digital Model. The presented proposal involves the remote continuous acquisition of 3D data through micro-cameras positioned in situ, where the WSS will also be installed. This continuous monitoring will be used to inform the dynamic Digital Model (DM) of the case study, enabling multipurpose usage of the collected data and supporting inference processes, either rule-based or machine-learned. For the acquisition of environmental and dimensional data, in this phase of the research we had foreseen the installation in situ of micro-cameras, with real-time and remote photogrammetric acquisition, and of WSS sensors. Unfortunately, numerous regulatory measures have been adopted in Italy to deal with the consequences of the spread of COVID-19. Because the state of emergency still affects every kind of movement and mobility on the regional and national territory, it was not possible to proceed with the installation of these sensors within the case study. It was therefore necessary to reformulate the work program: in this specific phase, the dimensional data were acquired with the digital photogrammetric survey method, while the environmental data were downloaded from two open services, Mistral (Meteo Italian Supercomputing Portal) and Copernicus (Europe's eyes on Earth). Subsequently, we will proceed to phase 2 with the installation of the sensors within the case study for both dimensional and environmental data (figure 2) (M.F.). Nevertheless, an effective exploitation of this knowledge requires a flexible and easily accessible data representation. To address this issue, we propose the use of a graph database, a relatively new model that stores and retrieves heterogeneous data in a more flexible way.
Indeed, in recent years, graph-based technologies have demonstrated their advantages in different domains and have been included in the yearly list of Top Trends in Data and Analytics by Gartner for 2020. In particular, Neo4j (Robinson and Webber, 2013) is an open-source graph database manager that has been developed over the last 16 years and applied to a large number of scenarios related to data representation (Dietze et al., 2016), exploration (Drakopoulos et al., 2015) and visualisation (Jimenez et al., 2016), among others. Neo4j is characterised by high scalability, ease of use and its own query language, Cypher. Cypher is designed to be a declarative language that highlights the structure of patterns using an SQL-inspired ASCII-art syntax.
In the context of data cross-referencing tasks, Neo4j has been used in (Sansonetti et al., 2019) as part of a social recommender system based on Facebook friends networks and on data coming from DBPedia and Europeana. Indeed, complex networks, in general, can be represented in a clear, easily accessible, structure, as in the case of social networks (figure 3). Graph-based representations are not constrained by static schemas so both node labels and relationship types can be added without disrupting the database. In this work, we describe the use of Neo4j as a key tool to enable a more effective representation of DTs, managing data coming from different sensors. These are inter

CASE STUDY: THE QUADRIPORTICO OF THE CATHEDRAL OF SAN MATTEO
With its conquest in 1076 by the Norman prince Roberto il Guiscardo, the city of Salerno experienced a period of great rebirth and strong building expansion, especially in the so-called Hortus Magnus area, with the construction of the monumental cathedral dedicated to Saint Matthew. The architectural project, conceived by Archbishop Alfano I, was completed in just four years, between 1080 and 1084. Since the majesty of the work, inspired by the great Abbey of Montecassino, required a lot of space, the church of Santa Maria degli Angeli and a small church dedicated to San Giovanni were demolished. Pope Gregory VII consecrated the religious building in 1085, but in reality the works on the cathedral had not yet been completed. Only in the following century, in fact, were they completed with the Quadriportico and the Bell Tower (De Angelis, 1937; Rosi, 1948; Di Stefano, 1986; Pace, 1997; Braca, 2003; Braca, 2018). The current appearance is the result of later transformations, especially during the Baroque period. The plan of the cathedral is divided into a longitudinal body marked by three naves, a perpendicular body (the transept) and a large atrium, the only original example of a Romanesque Quadriportico in Southern Italy, against whose southern side the monumental bell tower leans (figure 4). The four-sided portico has a substantially square plan and is surrounded by a colonnade of 28 columns surmounted by round arches decorated with volcanic stone inlays on the pilasters and pendentives. Above the lateral arms runs the loggia, composed of mullioned windows and pentafore, while the east and west sides are today largely altered. It is very probable, as also reported by historical sources, that the loggia originally ran along these last two sides as well and that the sacristans' rooms were inside it.
The existing space has undergone several renovations, and only after the restoration works that followed the Second World War did the original structure emerge, which still today has a scenic appearance owing to the presence of different stone materials (figure 5).
In the portico, furthermore, along the four sides there are some tombs obtained from reused Roman sarcophagi (Vaccaro, 2020) while

Meteorological data
The Mistral project, funded under the Connecting Europe Facility (CEF), began in October 2018. Its Meteo-Hub application allows registered users to create and download customised collections of meteorological data, selected by forecast model, weather station, parameter and time period. In this specific case, the temperature, relative humidity and solar radiation data for the 2020-2021 time frame were downloaded in JSON format. From the Copernicus website, it was possible to download the same parameters for the period from 2011 to 2020.
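As a minimal sketch of how such downloaded observations might be prepared for later cross-referencing, the Python fragment below parses a JSON document and averages each parameter per month. The record layout (the keys `timestamp`, `parameter` and `value`) and the sample values are hypothetical assumptions made for illustration; the actual Meteo-Hub and Copernicus exports may be structured differently.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical sample records; the real export schema is an assumption here.
SAMPLE = """[
  {"timestamp": "2021-02-15T12:00:00", "parameter": "temperature", "value": 12.0},
  {"timestamp": "2021-02-16T12:00:00", "parameter": "temperature", "value": 14.0},
  {"timestamp": "2021-02-15T12:00:00", "parameter": "relative_humidity", "value": 70.0},
  {"timestamp": "2021-04-06T12:00:00", "parameter": "temperature", "value": 18.0}
]"""


def monthly_means(json_text):
    """Return {(parameter, 'YYYY-MM'): mean value} for a JSON list of records."""
    groups = defaultdict(list)
    for record in json.loads(json_text):
        month = datetime.fromisoformat(record["timestamp"]).strftime("%Y-%m")
        groups[(record["parameter"], month)].append(float(record["value"]))
    return {key: sum(values) / len(values) for key, values in groups.items()}


means = monthly_means(SAMPLE)
for (parameter, month), mean in sorted(means.items()):
    print(parameter, month, round(mean, 2))
```

Grouping by parameter and month is only one possible aggregation; a finer time resolution may be preferable when the series is compared against the dates of the photogrammetric acquisitions.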

3D data: the photogrammetric survey
In the last decade, in the context of continuous monitoring and of activities related to the diagnosis of architectural heritage, reality-based survey methodologies have played a key role in preventive conservation. In fact, in the Cultural Heritage sector, the integration of image-based (photogrammetry) and range-based (laser scanner) techniques is increasingly widespread. However, the choice of one methodology over the other is always conditioned by the objectives and purposes of the survey. For continuous monitoring and degradation analysis, continuous data acquisition is very important, together with the construction of accurate three-dimensional models with high-resolution photorealistic textures (Nespeca et al., 2016; Roussel et al., 2019; De Fino et al., 2019; Randazzo et al., 2020). In this regard, image-based modeling offers many advantages, providing both accurate metric data and a high-quality colorimetric representation. For these reasons, digital photogrammetry was used for the case study: it allows rapid mapping and low-cost recording compared, for example, to the laser scanner technique, which is much more expensive in terms of time and, above all, instrumentation (Dhonju et al., 2017; Khalloufi et al., 2020). The two photographic datasets of the case study were recorded about two months apart using, in both cases, a Nikon D5000 SLR with a 23.6 × 15.8 mm CMOS sensor, a resolution of 12.3 megapixels and an AF-S DX NIKKOR 18-55mm lens. To increase stability, a photographic tripod was also used when taking pictures. The lighting conditions were highly variable in both acquisitions and, therefore, a pre-processing and radiometric correction phase was required before the processing phases. In the first photographic acquisition, which took place on February 15, 2021, 602 frames in RAW format were acquired.
An average overlap of about 70% was maintained between the acquired images, to ensure automatic correspondence between homologous points, and, given the conformation of the spaces, the focal length was set to 24 mm, integrating both converging-axis and parallel-axis shots. The digital three-dimensional reconstruction process followed the canonical four phases: sparse cloud, dense cloud, mesh and textured polygonal model. Starting from the orientation procedure, only the images with the highest quality parameters were used; these were subsequently processed using the Bundle Adjustment algorithm in the Structure from Motion software Agisoft Metashape. Dense Image Matching algorithms then generated the dense point cloud, processed with the "Medium" quality parameter, from which it was possible to build the polygonal model, consisting of a triangular mesh of 7,959,089 faces. Finally, the captured images were projected to generate a highly detailed texture. With the same acquisition technique and the same instrumentation, a second dataset of 598 frames was recorded on 6 April 2021. In this case, the photogrammetric process generated a sparse cloud of 3330,593 points, a dense cloud of 43,798,345 points and a triangular mesh of 6,665,159 faces (figure 6).
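The relation between these acquisition parameters and the achievable level of detail can be illustrated with a short calculation. The sketch below estimates the ground sample distance (GSD) and the camera spacing that yields a given overlap, using the sensor width (23.6 mm) and focal length (24 mm) reported above. The image width of 4288 pixels is the nominal value for this camera model and is not stated in the text, and the camera-to-surface distance of 5 m is purely hypothetical.

```python
def gsd_mm(sensor_width_mm, image_width_px, focal_mm, distance_mm):
    """Ground sample distance: size of one pixel projected on the surface (mm)."""
    pixel_pitch_mm = sensor_width_mm / image_width_px
    return pixel_pitch_mm * distance_mm / focal_mm


def baseline_mm(sensor_width_mm, focal_mm, distance_mm, overlap):
    """Camera displacement between shots giving the requested overlap (0..1)."""
    footprint_mm = sensor_width_mm * distance_mm / focal_mm  # width covered by one frame
    return footprint_mm * (1.0 - overlap)


SENSOR_WIDTH_MM = 23.6   # from the text
FOCAL_MM = 24.0          # from the text
IMAGE_WIDTH_PX = 4288    # assumption: nominal image width, not given in the text
DISTANCE_MM = 5000.0     # assumption: hypothetical 5 m camera-to-surface distance

gsd = gsd_mm(SENSOR_WIDTH_MM, IMAGE_WIDTH_PX, FOCAL_MM, DISTANCE_MM)
step = baseline_mm(SENSOR_WIDTH_MM, FOCAL_MM, DISTANCE_MM, overlap=0.70)
print(f"GSD: {gsd:.2f} mm/pixel, camera step for 70% overlap: {step:.0f} mm")
```

Both functions use the pinhole camera model and ignore lens distortion, so the figures are indicative only.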

Figure 6. Workflow of data acquisition.

Integration of meteorological data and 3D data
After the photogrammetric processing, it was necessary to compare the acquired data. The point clouds were imported into the open-source software CloudCompare and the two photogrammetric datasets were aligned using the ICP (Iterative Closest Point) algorithm. The integration of the data and the cloud-to-cloud (C2C) distance values made it possible to define a percentage of overlap between the two datasets and, at the same time, to identify the typological characteristics necessary to define the state of conservation and the forms of surface degradation (figure 7). In this phase, however, no discrepancies were found between the two photogrammetric outputs, because they were acquired within a very short period of time. The continuous addition of new data will nevertheless allow us to track the progress of the degradation (M.F.).
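To make the comparison step concrete, the following Python sketch computes nearest-neighbour cloud-to-cloud distances between two already aligned point clouds and derives the share of points that agree within a tolerance. It is a brute-force illustration of the idea, not the CloudCompare implementation; the toy coordinates and the 5 mm tolerance are hypothetical, and the ICP alignment itself is assumed to have been carried out beforehand.

```python
import math


def c2c_distances(reference, compared):
    """For each point in `compared`, distance to its nearest neighbour in `reference`."""
    return [min(math.dist(p, q) for q in reference) for p in compared]


def overlap_fraction(distances, tolerance):
    """Share of compared points lying within `tolerance` of the reference cloud."""
    return sum(d <= tolerance for d in distances) / len(distances)


# Toy clouds in metres, assumed already aligned; values are illustrative only.
epoch_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
epoch_2 = [(0.0, 0.0, 0.001), (1.0, 0.0, 0.002), (0.0, 1.0, 0.0), (1.0, 1.0, 0.05)]

distances = c2c_distances(epoch_1, epoch_2)
print(overlap_fraction(distances, tolerance=0.005))
```

The brute-force search is quadratic in the number of points; for dense clouds of tens of millions of points a spatial index such as an octree or k-d tree would be needed.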

DATA PROCESSING
Continuous monitoring of architectural heritage poses a number of problems in terms of the infrastructure needed to collect, organise and make available data collected at different sampling frequencies and representing different kinds of information.
Large quantities of data, on the other hand, are needed to perform very different tasks that are data-driven by nature. The main problems arising from the vision of a large infrastructure to collect, organise, store and provide data about monitored buildings are found in all the different phases of the process. Concerning the collection phase, although sampling frequency is not, generally, high for monitoring sensors, a single building can contain multiple sensors dedicated to the collection of different information. Also, large venues need multiple sensors that are continuously active. As the number of monitored buildings increases, the traffic flow can rapidly become challenging to manage. Raw collected data may need different types of processing in order to extract meaningful information from the basic flow provided by the sensors. Data processor instances may analyse the data flow provided by multiple sensors and extract significant events to be stored in a shared repository for future reference.
Both raw data and processed information must be stored for future retrieval and analysis. Specific solutions for time series retrieval and cross-referencing of events, monitored building types and sensor availability should be adopted to adequately support data collection actors and analytics actors.
Lastly, data analysis must be supported with a task-agnostic approach, as different kinds of queries and tasks may become important as the amount of available data increases and new technological solutions are developed. This scenario, summarised in Figure 8, is characterised by a significant number of different actors that may, at any time, connect to the infrastructure or disappear, either due to malfunction or to other causes. The robustness of the proposed infrastructure must, therefore, be evaluated in terms of its capability to sustain the heavy traffic generated by message passing, to cope with the unpredictability of the actors' availability and to offer flexible storage.
To cope with these problems, we foresee the use of the Apache Kafka message streaming infrastructure to manage communication among the involved actors, which may be added to or removed from the infrastructure at any time without affecting the rest of the actors. This is summarised in Figure 9.
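The decoupling property described here can be illustrated with a small in-memory stand-in for a topic-based broker. The sketch below is not the Kafka API: `ToyBroker`, the topic name and the actor name are hypothetical, and a real Kafka deployment persists messages in a log that consumers read at their own pace, whereas this toy simply pushes each message to whoever is subscribed at that moment.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a topic-based message broker (NOT the Kafka API)."""

    def __init__(self):
        self._subscribers = defaultdict(dict)  # topic -> {actor name: callback}

    def subscribe(self, topic, name, callback):
        self._subscribers[topic][name] = callback

    def unsubscribe(self, topic, name):
        # Removing an unknown actor is a no-op, so actors may vanish safely.
        self._subscribers[topic].pop(name, None)

    def publish(self, topic, message):
        """Deliver to the current subscribers and return how many received it."""
        callbacks = list(self._subscribers[topic].values())
        for callback in callbacks:
            callback(message)
        return len(callbacks)


received = []
broker = ToyBroker()
broker.subscribe("quadriportico.humidity", "threshold_processor", received.append)
broker.publish("quadriportico.humidity", {"value": 71.5})

# The processor leaves; the sensor keeps publishing without any change on its side.
broker.unsubscribe("quadriportico.humidity", "threshold_processor")
delivered = broker.publish("quadriportico.humidity", {"value": 72.0})
print(received, delivered)
```

The publisher never refers to a specific consumer, which is the property that lets actors join or leave without affecting the others.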
To keep a record of the available service providers and to store data produced by data processor actors, a graph structure is used. This way, new actors can query the database to obtain the information necessary to connect to data providers in Kafka and, at the same time, can use it to store and retrieve the information they need to operate. The database itself can publish relevant information concerning updates to the graph structure, so that actors depending on the latest available information can react to such events. In summary, the heavy message passing workload is intended to be managed by Apache Kafka, processed information concerning both detected events and infrastructure status is stored in Neo4j, and raw time series are stored in a dedicated database. The Neo4j database can also be used to support new actors in finding data sources that are of interest for their specific task and to keep track of the relationships between the actors involved in the present framework. This can go beyond virtual actors, like data processors/analysers, and must include physical actors, like buildings and environments. In this setting, each actor, either physical or virtual, is represented by a node, linked to relevant actors by directed relationships. This way, it is possible to explicitly represent the system structure at any given instant, allowing complex reasoning both on the collected data and on the typology of data processing actors. Figure 10 shows the currently foreseen actors and their relationships as represented in Neo4j: nodes represent physical actors, such as buildings, the environments they are composed of and the sensors located in such environments. Sensors publish data streams on Kafka topics, so that new processors can find the topic they are interested in by querying Neo4j and then subscribe to it. The database itself keeps track of the processors receiving data from sensors.
Processors create events in the dataset and publish their output on Kafka nodes, which are updated when complex events are reported by processor nodes. Such a structure can inform the entire system about its status and its dynamic evolution, and it can be used to support reasoning by combining structural information and data streams (S.D.M., A.O.).
Figure 10. Representation, in Neo4j, of the involved actors together with their relationships.
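The kind of lookup a new processor would perform on such a graph can be sketched with a small in-memory structure. The example below does not use Neo4j: `ToyGraph`, the node labels and the relationship types (`HAS_ENVIRONMENT`, `HOSTS`, `PUBLISHES_ON`) are hypothetical names chosen for illustration and are not taken from the schema in Figure 10. A comment shows how a comparable pattern might be written in Cypher under the same assumed names.

```python
from collections import defaultdict


class ToyGraph:
    """Minimal directed, labelled graph; an illustration, not a Neo4j client."""

    def __init__(self):
        self.labels = {}                # node id -> label
        self.edges = defaultdict(list)  # source id -> [(relationship type, target id)]

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def relate(self, source, rel_type, target):
        self.edges[source].append((rel_type, target))

    def follow(self, sources, rel_type):
        """Targets reachable from `sources` through one `rel_type` relationship."""
        return [t for s in sources for (r, t) in self.edges[s] if r == rel_type]

    def topics_for_building(self, building):
        # Comparable Cypher pattern under the same hypothetical schema:
        # MATCH (:Building {id: $id})-[:HAS_ENVIRONMENT]->(:Environment)
        #       -[:HOSTS]->(:Sensor)-[:PUBLISHES_ON]->(t:Topic) RETURN t
        environments = self.follow([building], "HAS_ENVIRONMENT")
        sensors = self.follow(environments, "HOSTS")
        return self.follow(sensors, "PUBLISHES_ON")


graph = ToyGraph()
for node_id, label in [("cathedral", "Building"), ("quadriportico", "Environment"),
                       ("sensor_t", "Sensor"), ("sensor_h", "Sensor"),
                       ("topic.temperature", "Topic"), ("topic.humidity", "Topic")]:
    graph.add_node(node_id, label)
graph.relate("cathedral", "HAS_ENVIRONMENT", "quadriportico")
graph.relate("quadriportico", "HOSTS", "sensor_t")
graph.relate("quadriportico", "HOSTS", "sensor_h")
graph.relate("sensor_t", "PUBLISHES_ON", "topic.temperature")
graph.relate("sensor_h", "PUBLISHES_ON", "topic.humidity")

topics = graph.topics_for_building("cathedral")
print(topics)
```

Because nodes and relationship types are plain data, a new kind of actor can be added without changing existing entries, which mirrors the schema flexibility attributed to graph databases in the text.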

CONCLUSIONS
In conclusion, the first results of this research have been reported, albeit at a methodological level. In the field of CH, this type of investigation, conducted through continuous monitoring, can become an analytics tool for the construction of dynamic Digital Models that interpret qualitative and quantitative data to describe and monitor the health status of the architectural artifact. This study is therefore aimed at investigating the benefits and drawbacks of graph-based data representations, in order to implement digital documentation strategies for architectural artifacts and to support conservation experts in the decision-making process (M.C.).