CULTO: AN ONTOLOGY-BASED ANNOTATION TOOL FOR DATA CURATION IN CULTURAL HERITAGE

This paper proposes CulTO, a software tool relying on a computational ontology for Cultural Heritage domain modelling, with a specific focus on religious historical buildings, for supporting cultural heritage experts in their investigations. It is specifically designed to support annotation, automatic indexing, classification and curation of photographic data and text documents of historical buildings. CulTO also serves as a useful tool for Historical Building Information Modeling (H-BIM) by enabling semantic 3D data modeling and further enrichment with non-geometrical information of historical buildings through the inclusion of new concepts about historical documents, images, decay or deformation evidence as well as decorative elements into BIM platforms. CulTO is the result of a joint research effort between the Laboratory of Surveying and Architectural Photogrammetry “Luigi Andreozzi” and the PeRCeiVe Lab (Pattern Recognition and Computer Vision Lab) of the University of Catania,


INTRODUCTION
In recent decades, we have witnessed an explosion of digital cultural assets all over the world. Digital cultural resources have great potential, often not fully exploited, for giving citizens, researchers and the cultural and creative industries access to cultural heritage. Nevertheless, there is still a lack of software tools and applications able to transform such resources into semantically enriched ecosystems that ease information accessibility. Such tools and applications would open new perspectives in humanities research and increase the awareness of citizens and industries in terms of cultural identity and creativity. The need for specific actions has also been highlighted in three H2020 calls on European Cultural Heritage (Reflective 6 - 2015, Reflective 7 - 2015, SC6-CULT-COOP-2016-2017), which stress the importance of interconnecting digital cultural assets through thesauri, classification schemes, taxonomies and ontologies.
This paper proposes CulTO (Cultural heritage Tool based on Ontology), a software tool relying on a fine-grained computational ontology for Cultural Heritage domain modelling, with a specific focus on religious historical buildings, for supporting cultural heritage experts in their investigations. CulTO is specifically designed to support the curation of photographic data and text documents of historical buildings, as well as their indexing, retrieval and classification. The developed computational ontology also aims at enriching Historical Building Information Modeling (H-BIM) with non-geometrical information on historical buildings through the inclusion of new concepts about historical documents, images, decay or deformation evidence as well as decorative elements (Quattrini et al, 2016).
The CulTO computational ontology has been designed through a multi-faceted bottom-up analysis of the constructive, functional and decorative elements of a religious building: the church of Santa Maria delle Grazie in Misterbianco (Catania, Italy). We have modelled building elements at a high abstraction level using standard ontologies and schemas, thus enabling generalization to other historical religious buildings as well as integration with existing Cultural Heritage ontologies (e.g., CIDOC-CRM) (Ronzino et al, 2016). On top of the ontology, we have developed a software tool that guides experts through the annotation process, which is known to be time-consuming and error-prone, in preparation for further automated content analysis methods.
The main contributions of CulTO are therefore: 1) it allows users to provide concept-level annotations constrained by a specific formal ontology; 2) it enables the creation of clusters of collected information (both visual and non-visual) and the automatic identification of the part of a historical building a specific image belongs to, thus easing the categorization effort; and 3) it supports searching and retrieving information by performing text queries on image content, semantically driven by our ontology.
The remainder of the paper is organized as follows: Section 2 discusses the state of the art on ontologies and information retrieval methods for Cultural Heritage. Section 3 is the core of the paper and describes the ontology, the case study, the tool and a preliminary information retrieval model exploiting semantically enriched image annotations. Section 4 deals with H-BIM data enrichment. The results are discussed in Section 5, while concluding remarks and future activities are given in Section 6.

Ontologies for Cultural Heritage
In recent years, the availability of large-scale unstructured and distributed knowledge, together with the massive production of multimedia data, has made the cultural heritage domain particularly suited to semantic web modelling. Indeed, semantic web technologies (ontologies, schemas, etc.) have found fertile ground in cultural heritage because of the need to integrate, enrich, annotate and share the produced data. A well-known attempt to provide a mechanism for integration, interchange, structuring, reasoning and discoverability across many cultural heritage sources is the CIDOC/CRM ontology presented in (Crofts et al, 2003), developed mainly to store cultural heritage information. The CIDOC/CRM has been used as a conceptual representation of the cultural heritage domain in (Stasinopoulou et al, 2007), where an ontology-based metadata integration methodology is proposed. In (Papatheodorou et al, 2007) the expressiveness of the CIDOC/CRM ontology has been enhanced to perform inferences for intelligent querying through a Knowledge Discovery Interface. In (Alexiev et al, 2013) the "Fundamental Relations" approach is presented as an effective "search index" over the complex CRM graph.
Cultural heritage ontologies are often employed to support the development of high-level software tools for digital content exploitation. In (Ghiselli et al, 2005), an ontology-based virtual museum on the web is proposed, where visitors can perform queries and create shared information by adding textual annotations. This new generation of approaches has enabled the conversion of traditional cultural heritage websites into well-designed and more content-rich ones (Bing et al, 2014), integrating distributed and heterogeneous resources and thus overcoming the limitations of systems such as the MultimediaN E-Culture project (Schreiber et al, 2008), which instead performs data enrichment manually through semantic web techniques for harvesting and aligning existing vocabularies and metadata schemas. The MultimediaN E-Culture project also developed new software, named "ClioPatria", which allows users to submit queries based on familiar and simple keywords.
An attempt to integrate Building Information Modelling with an ontology-based knowledge management system is proposed in (Simeone et al, 2014), with the objective of improving BIM abilities for inference and reasoning through an ontology able to interrelate all the domains needed for a comprehensive interpretation of historical artefacts. The underlying ontology was then extended in (Cursi et al, 2015) to model artefacts, their historical contexts, the heritage processes and all the actors interacting with buildings during the conservation process. Recently, a new workflow to integrate HBIM 3D data with semantic web technologies, including taxonomies, has been presented in (Quattrini et al, 2017). More specifically, data enrichment is performed by creating a set of shared parameters in Revit (one of the most used BIM platforms) alongside the 3D modelling, reflecting the properties defined during the ontology design.
One of the biggest challenges that information retrieval in the Cultural Heritage domain has to face is the natural heterogeneity of data. One of the main attempts to provide unified access to digital collections is the CatchUp full-text retrieval system (Kamps et al, 2009). Ontologies have often been exploited in image retrieval systems to improve accuracy, as they help bridge the "semantic gap", i.e., the gap between low-level content-based features and the interpretation of the data given by users.
In the eCHASE project (Hare et al, 2006), the metadata schemas of several cultural heritage institutions were mapped into the CIDOC CRM in order to expose them through the Search and Retrieve Web Service (SRW). Recently, the INCEPTION project (Llamas et al, 2016) has focused on innovation in the 3D modelling of cultural heritage assets, enriched by semantic information, and on their integration in a new H-BIM. A peculiarity of the system is that users can query the database using keywords and visualize a list of H-BIM models, descriptions, historical information and the corresponding images, classified through the application of deep learning techniques.

CULTO
In this section, we present our system, CulTO, for supporting the modelling of cultural heritage buildings as well as the visual data annotation step, which is necessary to develop high-level applications for data curation, retrieval and classification.

Ontology description
The core of CulTO is its ontology, which has been designed to characterize religious historical buildings. Before describing our ontology, we outline some key aspects of churches.
The most distinctive elements of these buildings are defined as Functional Elements, which are rooms of the building that fulfil a specific function. The crypt, choir, presbytery, chapel, transept, nave, apse and sacristy are some of the main examples. These structures possess the same Constructive Elements, such as stairs, horizontal structures, walls and openings, generally found in any other type of building.
Other characteristic structures of churches are Ancillary Elements, a class that includes the altar, the baptismal font and the pulpit. These structures could be sorted, for example, by constitutive material or by date of construction. Every Ancillary Element may exhibit a Decorative System, i.e. either a simple Decorative Element (usually a finishing, a sculptural decoration, non-load-bearing ribs or classical order elements) or a Decorative Structure (Restuccia, 1997), frequently found in portals and altars. A Decorative Structure is commonly a Simple System composed of an abutment and an arch or an architrave. In particular, an entablature added on a Simple System leads to a Trabeated System, while a classical order (pedestal, column and entablature) leads to an Overlapped System. In addition, a class called Find has been crucial for describing unknown objects, their function or their finding location.
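The composition rules for Decorative Structures given above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name classify_decorative_structure, the lower-case part labels and the returned strings are assumptions introduced here and are not identifiers taken from the CulTO ontology.

```python
def classify_decorative_structure(parts):
    """Return the system type implied by a collection of part names.

    Hypothetical labels; the rules follow the textual description only.
    """
    parts = set(parts)
    # A Simple System needs an abutment plus an arch or an architrave.
    is_simple = "abutment" in parts and bool(parts & {"arch", "architrave"})
    if not is_simple:
        return None
    # A full classical order on top of a Simple System gives an Overlapped System.
    classical_order = {"pedestal", "column", "entablature"}
    if classical_order <= parts:
        return "OverlappedSystem"
    # An entablature alone on top of a Simple System gives a Trabeated System.
    if "entablature" in parts:
        return "TrabeatedSystem"
    return "SimpleSystem"


portal = classify_decorative_structure(["abutment", "arch", "entablature"])
```

In an OWL ontology the same distinctions would normally be expressed as class definitions and left to a reasoner; the procedural form is used here only to make the rules explicit.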
All these elements, modelled in the ontology as subclasses of PhysicalObject (a base class which includes, e.g., Altar, BlockAltar, Column, Capital, etc.), are characterized by specific properties, encoded as subclasses of the generic class PhysicalProperty (e.g. Material). The developed ontology could be adapted to other building types (and therefore many other case studies could be classified) by creating different subclasses of PhysicalObject and PhysicalProperty in order to represent the objects belonging to the new application domain and their attributes.
We exploited our visual ontology to support the image annotation phase. To accomplish this, we extended the previous ontology with concepts describing the annotation process in a generic application domain. In particular, the link between user annotations and ontology entities is modeled through the Annotation class, a subclass of the Sample class, which is employed to associate sample images to a Physical Property. Since the Annotation class is a subclass of Sample, it inherits the property isInImage, used to locate an annotated object in an image, given the image identifier. Thus, for each new annotation, an Annotation instance is automatically created and associated with the corresponding PhysicalObject subclass instance; this allows the tool to infer all relevant properties encoded in the ontology.
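The class relationships described above can be mirrored, in a simplified form, with plain Python dataclasses. The class names PhysicalObject, PhysicalProperty, Material, Column, Sample and Annotation come from the text; the field names (label, material_type, is_in_image, target, polygon), the file name and the method inferred_properties are assumptions made for this sketch, which performs no real OWL reasoning.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PhysicalProperty:
    name: str


@dataclass
class Material(PhysicalProperty):
    material_type: str = "unknown"


@dataclass
class PhysicalObject:
    label: str
    properties: List[PhysicalProperty] = field(default_factory=list)


@dataclass
class Column(PhysicalObject):
    pass


@dataclass
class Sample:
    is_in_image: str  # mirrors the isInImage property (image identifier)


@dataclass
class Annotation(Sample):
    # An annotation inherits is_in_image from Sample and points to the
    # PhysicalObject instance chosen by the user as label.
    target: Optional[PhysicalObject] = None
    polygon: List[Tuple[int, int]] = field(default_factory=list)

    def inferred_properties(self) -> List[PhysicalProperty]:
        """Properties are read from the linked object, not typed in by the user."""
        return list(self.target.properties) if self.target else []


marble = Material(name="material", material_type="marble")
column = Column(label="nave_column_1", properties=[marble])
anno = Annotation(is_in_image="IMG_0001.jpg", target=column,
                  polygon=[(0, 0), (10, 0), (10, 40)])
```

Because the annotation only stores a reference to the ontology instance, every property attached to that instance becomes available for the annotated region without further user input.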
Our ontology has been developed using Protégé, a free, open-source ontology editor which supports the OWL 2 (Web Ontology Language) and RDF specifications.

Case study
The case study used in this paper is the church of Santa Maria delle Grazie in ancient Misterbianco (5 km from Catania, Italy). This church is one of the few testimonies to have survived the catastrophic events that occurred at the end of the 17th century in eastern and south-eastern Sicily, i.e., the disruptive Mount Etna eruption (1669), which covered and erased 16 Etnean towns, and the earthquake (1693), which destroyed almost all the towns of the Val di Noto. The church was covered by the eruption of 1669 and was recently brought to light thanks to the excavations carried out by the Superintendence for Cultural Heritage of Catania.
The choice of this case study was motivated by the availability of a large set of documents and images whose classification and analysis are of key importance for understanding the architectural artefact and formulating specific hypotheses about its construction and transformation phases (Calabrò, 2016). Furthermore, the exceptional state of conservation of the church, due to its having been buried under a lava flow and then excavated, makes it possible to reason about the classification and localization of archaeological finds (Figure 2).
The study of the church is in progress, and we have also acquired 3D data by means of laser scanning and photogrammetric techniques (Figures 3, 4) in order to start an in-depth investigation of this valuable architectural heritage. The archival documents found so far span the period between the end of the 16th century and the 17th century, up to 1667 (two years before the church was buried under a 12 m blanket of lava), and hundreds of images have been collected. However, manual categorization and curation of the bulk of the gathered information is largely impractical, as it is extremely expensive and error-prone. Furthermore, the excavation work was not carried out as an archaeological one, and the exact location of many findings, such as fragments of architectural decoration, frescos, etc., is still unknown, making the categorization process even harder.
Santa Maria delle Grazie is a regular-plan church with a single nave and a large presbytery, with two chapels, a bell tower and a large room recognized as the sacristy (Figure 4). The entrances to the bell tower and the Crocifisso chapel are in the nave, and the sacristy is located between them. A small vaulted hallway connects the so-called Gothic Chapel, dedicated to Santa Maria delle Grazie, with the presbytery. Overall there are nine altars: six are in the nave, two are in the chapels and the last, the main altar, is in the presbytery.
Considering one of the altars of the nave (Figures 5-6), the hierarchies and relations between elements may be split into decorative system and block altar. The decorative structure frames the niche and is classified as an overlapped system, which inherits from the simple system and carries a classical order on it. The Classical Order is divided into three parts: pedestal (composed of base, dado and cimasa), column (column base, shaft and capital) and entablature (architrave, frieze and cornice). The block altar, instead, is composed of mensa, altar frontal and predella.
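The part hierarchy of the nave altar described above can be written down as a small part-of mapping and traversed recursively. This is a simplified sketch under stated assumptions: the dictionary ALTAR_PARTS and the function all_subparts are hypothetical, the intermediate overlapped/simple system levels are collapsed into a single DecorativeSystem entry, and only names such as Altar, BlockAltar, Column, ColumnBase, Shaft and Capital are known to correspond to ontology classes.

```python
# Part-of relations for one nave altar (simplified, illustrative labels).
ALTAR_PARTS = {
    "Altar": ["DecorativeSystem", "BlockAltar"],
    "DecorativeSystem": ["ClassicalOrder"],
    "ClassicalOrder": ["Pedestal", "Column", "Entablature"],
    "Pedestal": ["Base", "Dado", "Cimasa"],
    "Column": ["ColumnBase", "Shaft", "Capital"],
    "Entablature": ["Architrave", "Frieze", "Cornice"],
    "BlockAltar": ["Mensa", "AltarFrontal", "Predella"],
}


def all_subparts(part, hierarchy=ALTAR_PARTS):
    """Return every direct and indirect subpart of `part`, depth first."""
    result = []
    for child in hierarchy.get(part, []):
        result.append(child)
        result.extend(all_subparts(child, hierarchy))
    return result


column_parts = all_subparts("Column")
altar_parts = all_subparts("Altar")
```

A traversal of this kind is what allows an annotation of a whole altar to be related to annotations of its capital or predella.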

The annotation and visualization tool
To support data curation and retrieval, we built an annotation tool on top of the previously described ontology; it guides users through the labeling process and constrains them to the concepts enforced by the ontology. It provides means to draw polygons and assign classes (the type of the annotated part, e.g. altar, column, etc.) and labels (the actual altar or column the user is currently annotating); a label corresponds to a particular ontology instance/individual, whose properties (e.g. the kind of material or its shape) are already defined in the ontology itself.
Similarly to other annotation tools (I. Kavasidis, 2014; B. C. Russell, 2008), the interface presents the user with an image to work on, together with several tools for browsing through images, zooming in and out, and adding, editing and removing annotations.
However, unlike those other tools, part of the assignment responsibility is moved from the user to the tool itself in two different ways: 1) once a class is chosen, the tool allows the user to select one of the instances belonging to that class as the label for the current annotation; users do not need to provide any other information, since all the properties of the annotated part are automatically inferred from those of the selected label; 2) once a part is annotated (e.g. a column), the tool automatically prompts users to annotate all its subparts (e.g. its capital), guiding the annotation process and inferring the proper subpart labels. Moreover, the user may add a generic textual description of the current annotation (e.g. the presumed altar dedication) and select some other properties predefined in the ontology for any visual annotation (e.g. the object visibility), as shown in Figure 7.
Furthermore, the tool allows the user to tag the position where an object was found, enabling a subsequent post-processing stage. Finally, in order to allow the annotation of unknown objects, it is possible to insert additional instances by selecting the class to which the object belongs, clicking the "New" button, typing the new label name and selecting the object it is part of (if available) and all its visible properties (see Figure 7B). Once the "Add Instance" button is pressed, the ontology is updated with the provided information so that the new instance can be reused. This is the mechanism provided by the tool to dynamically augment the knowledge about the current application domain.
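The guided workflow described above (choosing a class, picking one of its instances as label, inheriting its properties, being prompted for subparts, and adding new instances) can be illustrated with a toy in-memory store. Everything in this sketch is hypothetical: OntologyStore, its methods and the instance names are stand-ins introduced for illustration and do not reproduce the actual CulTO implementation, which works on an OWL ontology with a reasoner.

```python
class OntologyStore:
    """Toy in-memory stand-in for the ontology (illustration only)."""

    def __init__(self):
        self.instances = {}  # class name -> {label: properties dict}
        self.part_of = {}    # class name -> list of subpart class names

    def add_instance(self, cls, label, **properties):
        """Counterpart of the "Add Instance" button: store a reusable label."""
        self.instances.setdefault(cls, {})[label] = dict(properties)

    def labels_for(self, cls):
        """Labels the user may choose from once a class is selected."""
        return sorted(self.instances.get(cls, {}))

    def annotate(self, cls, label, polygon):
        """Create an annotation whose properties come from the chosen label."""
        props = self.instances[cls][label]
        return {
            "class": cls,
            "label": label,
            "polygon": polygon,
            "properties": dict(props),
            # Subpart classes the user would be prompted to annotate next.
            "prompt_subparts": list(self.part_of.get(cls, [])),
        }


store = OntologyStore()
store.part_of["Column"] = ["ColumnBase", "Shaft", "Capital"]
store.add_instance("Column", "nave_column_1", material="marble")
anno = store.annotate("Column", "nave_column_1", [(0, 0), (10, 0), (10, 40)])
```

The point of the design is that the user only draws the polygon and picks a label; material and subpart prompts follow from what the store already knows.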
As mentioned before, our ontology has been developed in Protégé; thus, all the annotations are exposed through an RDF endpoint for further querying and retrieval. To enable these two tasks, we integrated an RDF search engine into our annotation and visualization tool. Thanks to the ontology-based structure and an ontology reasoner integrated in our tool, all our annotations are at the content level, encoding information not only on the type of objects but also on their materials. Thus, our search engine allows users to perform queries (shown in SPARQL) such as:

1) Find all marble objects

SELECT ?anno WHERE {
  ?obj rdf:type culto:PhysicalObject .
  ?obj culto:hasMaterial ?material .
  ?material culto:materialHasType ?type .
  ?obj culto:hasAnnotation ?anno .
  FILTER (str(?type) = 'marble')
}

2) Find all marble altars

SELECT ?obj WHERE {
  ?obj rdf:type culto:Altar .
  ?obj culto:altarHasMaterial ?material .
  ?material culto:materialHasType ?type .
  FILTER (str(?type) = 'marble')
}
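To make the matching logic of the first query concrete, the following Python sketch evaluates the same graph pattern over a handful of in-memory triples. The property names are those used in the query above; the instance identifiers (altar_1, column_1, mat_1, anno_1, ...) and the material values are invented for illustration, and the sketch performs no reasoning, so every object must be explicitly typed as culto:PhysicalObject.

```python
# Invented example data; only the property names come from the queries.
TRIPLES = [
    ("altar_1", "rdf:type", "culto:PhysicalObject"),
    ("altar_1", "culto:hasMaterial", "mat_1"),
    ("mat_1", "culto:materialHasType", "marble"),
    ("altar_1", "culto:hasAnnotation", "anno_1"),
    ("column_1", "rdf:type", "culto:PhysicalObject"),
    ("column_1", "culto:hasMaterial", "mat_2"),
    ("mat_2", "culto:materialHasType", "lava stone"),
    ("column_1", "culto:hasAnnotation", "anno_2"),
]


def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def annotations_with_material(triples, material_type):
    """Annotations of physical objects whose material has the given type."""
    found = []
    for s, p, o in triples:
        if p == "rdf:type" and o == "culto:PhysicalObject":
            for material in objects(triples, s, "culto:hasMaterial"):
                types = objects(triples, material, "culto:materialHasType")
                if material_type in types:
                    found.extend(objects(triples, s, "culto:hasAnnotation"))
    return found


marble_annotations = annotations_with_material(TRIPLES, "marble")
```

With a reasoner in place, an individual typed only as culto:Altar would also match the first query through subclass inference; a plain triple scan like this one would miss it.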

H-BIM data enrichment
The ontology presented here potentially allows for overcoming one of the major shortcomings of available commercial BIM platforms, namely the impossibility of adding new concepts about Cultural Heritage (historical documents, images, decay or state of conservation).
As a matter of fact, BIM platforms are well suited to new building construction from both the geometrical and the informative point of view. When dealing with Cultural Heritage, and in particular with Architectural Heritage, the main difficulty is creating new libraries of building components to be used in the virtual environment (Santagati et al, 2016; Murphy et al, 2013; Fai et al, 2014; Apollonio et al, 2013).
The ontology could serve as a semantic layer to be added to the BIM. Several tests were carried out to link the formalized ontology to Revit, one of the most used BIM platforms worldwide.
Figure 7. (A) The user is prompted with a dialog box in which the finding position can be tagged with a red dot. (B) Once the user has selected the class Altar and clicked the "New" button, the dialog box allows the user to add a new instance to the current ontology and specify all its attributes (e.g. the altar material).

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W5, 2017. 26th International CIPA Symposium 2017, 28 August-01 September 2017, Ottawa, Canada

The main problem is that Revit does not have its own programming language (as AutoLISP is for AutoCAD), so the only exchange allowed is between databases: the BIM model has to be exported into a database by means of DB Link and the Protégé ontology exported as an RDF database; the two can then be merged or compared by developing a specific database tool, as already tested in (Fioravanti et al., 2015). Furthermore, the shared parameters of the BIM model should be labeled according to the ontology definition.
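The database-level exchange outlined above can be sketched as a join between a table exported from the BIM model and a table of facts derived from the ontology. This is a minimal sketch under strong assumptions: the table names bim_elements and ontology_facts, their columns, the property names and all the rows are hypothetical, and neither the real schema produced by DB Link nor the actual RDF export is reproduced here.

```python
import sqlite3

# In-memory database standing in for the two exports (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bim_elements (element_id INTEGER PRIMARY KEY, ontology_label TEXT);
CREATE TABLE ontology_facts (label TEXT, property TEXT, value TEXT);
INSERT INTO bim_elements VALUES (101, 'altar_1'), (102, 'column_1');
INSERT INTO ontology_facts VALUES
    ('altar_1', 'hasMaterial', 'marble'),
    ('altar_1', 'hasDecay', 'erosion'),
    ('column_1', 'hasMaterial', 'lava stone');
""")


def enrich(connection):
    """Attach ontology facts to BIM elements that share the same label."""
    rows = connection.execute(
        "SELECT b.element_id, o.property, o.value "
        "FROM bim_elements AS b JOIN ontology_facts AS o "
        "ON o.label = b.ontology_label "
        "ORDER BY b.element_id, o.property"
    )
    return rows.fetchall()


enriched = enrich(conn)
```

The join only works if the shared parameter stored with each BIM element uses exactly the label defined in the ontology, which is why consistent labeling of the shared parameters matters.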

CONCLUSION AND FUTURE WORKS
This work points towards novel and more intelligent ways to manage, enrich and implement data on Cultural Heritage, for a broader knowledge process aimed at the preservation, valorization and conservation of cultural assets. Our ontology-driven visualization tool is a significant step towards this goal, as it supports users in the storage, curation and access of cultural heritage digital data. The next step will be mapping excavation findings onto the church in order to reconstruct all the digging steps. Although well documented photographically, the excavation works were not carried out as archaeological ones, and the exact location of many findings (fragments of architectural decoration, frescos, etc.) is still unknown. For example, the dates engraved on several findings could help identify the naming and dating of the altars.
To support this task, we are currently developing deep learning approaches that, leveraging our semantic visual annotations, will hopefully identify matches automatically. The possibility of mapping images of findings onto a plan will be very useful in all archaeological excavations that have imagery documentation but no planimetric localization.
In the future, we will work on the full integration of the developed ontology into BIM platforms and on the possibility of using this ontology to semantically segment a point cloud.
These relationships are shown as arrows in Figure 1, which contains a partial visual representation of the developed ontology, and specify which class should be considered part of another (e.g. Capital is part of Column), while blue circles represent classes such as Capital, Shaft, ColumnBase (subclasses of PhysicalObject) or Material (subclass of PhysicalProperty).

Figure 1. (A) The Visual OWL representation of a subsection of the developed ontology. In particular, Column, Capital, Shaft and ColumnBase are defined as subclasses of PhysicalObject and are linked to each other by relationships in the form of XHasY; the column material is in turn defined as a subclass of PhysicalProperty. (B) Extension of our visual ontology to support the annotation phase.

Figure 2. The excavation works at the church of Santa Maria delle Grazie in ancient Misterbianco.

Figure 3. Longitudinal cross section of the church of Santa Maria delle Grazie (ancient Misterbianco) in 3D and orthographic view of the point cloud in grey scale.

Figure 4. Plan of the church of Santa Maria delle Grazie (ancient Misterbianco).