THE ISPRS-EUROSDR GEOBIM BENCHMARK 2019

Abstract. Standardised data formats and data models are essential for data integration and interoperability, which in turn add value to data by allowing its reuse in multiple contexts. For this reason, extensive efforts have been devoted to standards development in recent years. When representing the built environment, 3D city models and Building Information Models are particularly relevant, and their integration is now required to underpin use cases that cover the full life-cycle of a built asset, including design and planning as well as operations and management, and to support legal applications such as cadastral systems. For those kinds of data, CityGML by the Open Geospatial Consortium and Industry Foundation Classes (IFC) by buildingSMART are the most popular reference standards. However, many users report, often through informal channels, difficulties in working with these formats. This paper summarizes the outcomes of the GeoBIM Benchmark 2019, a scientific initiative funded by ISPRS and EuroSDR to collect insights into the most relevant issues encountered in the management of CityGML and IFC within existing software. Alongside data management (import, visualisation, analysis, export) problems, issues of particular consequence for integration relate to georeferencing IFC files and to conversions between the two kinds of formats and models. The benchmark was therefore designed to explore these tasks in available software. A key outcome of the analysis of the benchmark results is that no clear patterns could be found in the behaviour of the tools, which means that the standards are not implemented consistently. Although this result may seem disappointing, the benchmark documents how critical the management of these standards currently is, and this awareness can be the starting point for further research or further standards development. Finally, the project helped to gather a wide community around this topic and gave new impetus to the discussion of GeoBIM-related issues.



INTRODUCTION
Building Information Modelling (BIM) can be defined as "a digital-based building design process that uses a single comprehensive system of computer models rather than separate sets of drawings" (Sharman, J., 2017). The increasing availability of data from such systems, driven by a number of national-level initiatives towards increased efficiency in construction, offers the opportunity to use this data to improve the sustainability of the built environment across the full life-cycle of an asset. This is particularly the case when such data is integrated with 2D and 3D geoinformation, which is already widely used in decision making. This integration is known as GeoBIM and can underpin applications ranging from historical reconstruction (Gago-Silva, 2016) through urban planning (Olsson et al., 2019) to asset management (Wang et al., 2019b). Further examples can be found in (Wang et al., 2019a).
However, integrating geospatial and BIM data cannot be achieved in a replicable, automated (and hence widely deployable) manner without ensuring that the data is interoperable (defined as the ability of two or more systems or components to exchange information and to use the information that has been exchanged (Geraci et al., 1991)). Having interoperable data, achieved by means of standard formats and data models, is the essential premise of any data integration. To achieve this integration, the use of the respective standards for the two kinds of information systems is critical. These are:
• The CityGML standard, from the Open Geospatial Consortium 1. CityGML is an open data model, based on XML, for storing, managing and exchanging virtual 3D city models (Open Geospatial Consortium, 2012). The CityGML standard supports 5 different Levels of Detail (LoD), which aim to facilitate effective visualization and efficient spatial analysis of the 3D models (Open Geospatial Consortium, 2012).
• The buildingSMART Industry Foundation Classes (IFC) standard 2. IFC is a standardized open data model designed to encourage information sharing and remove information silos throughout the lifecycle of a built asset (BuildingSMART, 2016). It has been certified as an international standard, ISO 16739-1:2018 (BuildingSMART, 2016).
While standards do, in theory, promote interoperability, in practice researchers and practitioners dealing with IFC and CityGML often experience difficulties in their management, likely due to issues in their structure, their implementation in software or their use for modelling data. Unfortunately, these issues are mostly reported through unofficial channels, so that the specific problems remain unclear and are often underestimated by standardization organizations.

The GeoBIM benchmark project
The aim of the GeoBIM benchmark 3 , a scientific initiative, funded by the International Society for Photogrammetry and Remote Sensing (ISPRS) within the 'ISPRS Scientific Initiatives 2019' framework and co-funded by the European association for Spatial Data Research (EuroSDR), was to build evidence of and insights into the problems encountered in standardised data management.
The benchmark was carried out through a series of systematic tests using a wide range of software packages designed to work with these standards. The goal was to provide as complete a picture as possible of the present ability of existing software tools to use (i.e. read, visualise, import, manage, analyse, export) CityGML and IFC models, and to understand their performance while doing so, in terms of information management functionality, possible information loss and the ability to handle large data sets.
A second component of the study was dedicated to exploring the available tools and procedures to georeference building information models. This is not a straightforward practice in the BIM field: BIM models are usually defined in their own local Cartesian coordinate system, so metadata are required to locate a 3D BIM model on the earth. Options for georeferencing currently range from setting a broad project location to defining a real-world location for the project base point (i.e. the 0,0 point of the local coordinate system) and then translating and rotating (if necessary) the BIM model. Further examples can be found in (Diakite, Abdoulaye, 2018).
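The translate-and-rotate step described above can be sketched in a few lines. The following is a minimal illustration, assuming that the georeferencing is given by a map position for the local origin, a direction for the local x-axis and a single scale factor applied to all axes. The field names loosely mirror the attributes of the IFC4 IfcMapConversion entity, but the `MapConversion` container, the `local_to_map` function and the sample values are hypothetical and are not taken from the benchmark.

```python
import math
from dataclasses import dataclass


@dataclass
class MapConversion:
    """Hypothetical container for georeferencing parameters."""
    eastings: float               # map x of the local origin (0, 0, 0)
    northings: float              # map y of the local origin
    orthogonal_height: float      # map height of the local origin
    x_axis_abscissa: float = 1.0  # easting component of the local x-axis
    x_axis_ordinate: float = 0.0  # northing component of the local x-axis
    scale: float = 1.0            # assumption: one factor for all axes


def local_to_map(point, conv):
    """Rotate, scale and translate a local (x, y, z) point into map coordinates."""
    x, y, z = point
    # Normalise the axis direction so that it acts as (cos, sin) of the rotation.
    norm = math.hypot(conv.x_axis_abscissa, conv.x_axis_ordinate)
    cos_t = conv.x_axis_abscissa / norm
    sin_t = conv.x_axis_ordinate / norm
    easting = conv.eastings + conv.scale * (cos_t * x - sin_t * y)
    northing = conv.northings + conv.scale * (sin_t * x + cos_t * y)
    height = conv.orthogonal_height + conv.scale * z
    return (easting, northing, height)


# Illustrative values only: a local x-axis pointing towards map north.
example = MapConversion(1000.0, 2000.0, 10.0, x_axis_abscissa=0.0, x_axis_ordinate=1.0)
print(local_to_map((1.0, 0.0, 0.0), example))
```

This planar sketch ignores the cartographic projection issues that arise for projects covering a large extent, where a simple rotation and translation is no longer sufficient.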
Thirdly, the benchmark was also designed to involve many people with different skills, expertise and interests, in order to ensure that as wide a range of tools as possible was tested and to identify problems encountered by both novice and expert users.
The four topics investigated in the benchmark are:
Task 1: What is the support for IFC within BIM (and other) software?
Task 2: What options for geo-referencing BIM data are available?
Task 3: What is the support for CityGML within GIS (and other) tools?
Task 4: What options for conversion (software and procedural), both IFC to CityGML and CityGML to IFC, are available?
Initial summary outcomes are reported here, with detailed outcomes to follow in further publications 4 .
A parallel goal of the benchmark was to offer a common ground where people from various fields and with different interests could meet to tackle a common challenge, namely the use of open standards for exchanging cross-discipline information and models. For this purpose, a GeoBIM benchmark workshop, organised as part of the initiative, was held on 2 and 3 December 2019 as an educational event covering both 3D city models and BIM. The goal was to promote and foster GeoBIM topics and applications. Furthermore, it was an opportunity to bring together people with different perspectives on the issue who are interested in working together to actually achieve integration between the geo- and BIM-domains. The outcomes of this discussion are also described in this paper.

Data
A key component of the project was the provision of a number of IFC and CityGML datasets for use in the benchmarking activity, to allow all tested software to be compared on an equal basis. These are described in full in (Noardo et al., 2019b) and include, among others:
• A CityGML model of Rotterdam at Levels of Detail 1 and 2 (i.e. with more detailed roof structures), to test the handling of LoD2 data; it also includes some errors, to see how these are handled by the software
• Buildings in LoD3, i.e. with more detail on the facades, generated procedurally

Participant Recruitment
Voluntary participation was an important part of the study, since it enabled a number of different points of view to be considered and encouraged the use of a wider range of software packages, with tests carried out by users with different levels of expertise in the software, from beginner users (allowing an assessment of whether the software is also user-friendly enough to be used in current practice) to expert users (what is the actual potential of such tools?) (Noardo et al., 2019a). Participants were recruited via a snowballing effort starting with the networks of the researchers and also involving the sponsors of the project (ISPRS and EuroSDR).

Standardising the Responses
In order to obtain homogeneous answers and facilitate the comparison of results, online forms were provided, both guiding the tests through very detailed instructions and collecting the answers and the data about the tests in a systematic way. Questions in the forms related to how well the software imported and visualised the data, what georeferencing functionality it offered, what analysis functionality was provided (if any), and whether the software could re-export the data. Timing information was also requested for each test, and respondents were asked to provide extensive screenshots to illustrate successful results and problems.
In addition, the participants were asked to deliver the exported CityGML or IFC data sets, so that they could be analysed and compared to the originals. The capability of data sets to remain unchanged after passing through any number of conversions 'standard format - native software format - standard format again' is an essential condition for interoperability. For this reason it is important to test and verify it.
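The round-trip condition above can be checked, at a coarse level, by comparing how many instances of each entity type appear in the original and in the re-exported file. The sketch below is a minimal illustration for IFC files in their STEP text encoding, using only the Python standard library. The function names and the two sample snippets are hypothetical, and this is not the procedure used in the benchmark. Equal counts are a necessary but not a sufficient condition for an unchanged model.

```python
import re
from collections import Counter

# Matches the start of a STEP instance such as "#12= IFCWALL(" and captures the type.
_ENTITY = re.compile(r"^\s*#\d+\s*=\s*([A-Za-z0-9_]+)\s*\(", re.MULTILINE)


def count_ifc_entities(step_text):
    """Count entity instances per type in the text of an IFC STEP file."""
    return Counter(m.group(1).upper() for m in _ENTITY.finditer(step_text))


def roundtrip_differences(original_text, exported_text):
    """Return {type: (count_before, count_after)} for every type whose count changed."""
    before = count_ifc_entities(original_text)
    after = count_ifc_entities(exported_text)
    return {
        name: (before[name], after[name])
        for name in sorted(set(before) | set(after))
        if before[name] != after[name]
    }


# Hypothetical fragments: a stair is re-exported as a generic proxy element.
original = "#1=IFCWALL('a');\n#2=IFCWALL('b');\n#3=IFCSTAIR('c');\n"
exported = "#1= IFCWALL('a');\n#2= IFCWALL('b');\n#7= IFCBUILDINGELEMENTPROXY('c');\n"
print(roundtrip_differences(original, exported))
```

An empty result means that the type counts agree; attribute values, relationships and geometry would still need to be compared separately.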

Analysis of the Results
The submitted forms were analysed to assess both quantitative (did they pass the test?) and qualitative (what problems were encountered?) aspects of the software. The results have also been collected into documents in order to serve as best practice guidelines for anyone wanting to use standardised data sets.
Some patterns in the behavior and performance of software were also investigated in depth, to better understand the possible causes of issues and identify to what level each software package supported the features of the given standard.
Finally, the delivered models were validated, inspected and analysed to understand what had changed with respect to the original ones.

RESULTS
A total of 44 responses were received for Task 1, 8 for Task 2, 31 for Task 3 and 45 for Task 4. Each response consisted of a completed questionnaire and extensive screenshots of the results obtained and problems encountered in each of the software packages tested (e.g. Figure 1). A summary of the results is reported in the next subsections (3.1 to 3.4). In-depth examination of the submitted data is still underway and will be reported in future publications.

Task 1 - Support for IFC
Thirty tools were tested for this task, including Infraworks, Archicad, Blender, Revit (multiple versions) and FZKViewer, amongst others. Figure 2 shows the average scores attributed to the tested software grouped according to type: GIS tools, BIM software, Extract Transform and Load (ETL) software, 3D modellers (mainly CAD systems), analysis software and 3D viewers that were extended with improved functionalities to manage 3D information systems, often tailored to these specific standards (IFC and CityGML). This custom tailoring explains why the latter usually perform best, at least in reading and interpreting standardised data.
Software in the 3D viewers category, which has been extended and customized to deal specifically with such standardised information, offers the best support in reading standardised information from the geometry, semantics and georeferencing points of view. However, none of the tested software packages or custom tools achieved 100% support for IFC in terms of general interpretation or functionality offered. Georeferencing proved particularly challenging for BIM software and for most of the other tools tested in this task (see also 3.3). Semantics were also not well managed: in most cases, categories similar (but not always identical) to those in the original information were associated with each element on import of the data. Often the hierarchies and other relationships are lost, or only partially kept. Additionally, while all the tested software was able to visualize the IFC models in 3D, support for editing, analysis and query is still very limited.
The results also highlighted the difficulties in maintaining the IFC semantics through the import-export processes. Moreover, the interpretation of geometries varies across software.
Analysis of the exported models (re-exported to IFC from the native format of the software package) showed that none of the tested software packages created models exactly consistent with the one that was imported. The most common errors are:
• Missing elements (e.g. stair elements, openings, building elements and some kinds of geometries; in Savigliano, for example, a part of the roof is missing in some cases);
• Displaced elements, or elements changed into unexpected geometric features (e.g. Figure 3);
• Changes in the grouping of elements (e.g. in storeys).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B5-2020, 2020 XXIV ISPRS Congress (2020 edition)
When comparing the models by means of the NIST IFC Analyzer software 5, it was noted that only BIM Server, out of all the 30 tested software packages, is able to import and export exactly the same model, while in all the other cases some changes occur.

Task 3 - Support for CityGML
For Task 3 of the benchmark around 15 tools were tested, falling within the same categories as those for Task 1. A common issue in software for managing CityGML is the need for specific plug-ins or external tools to convert the GML format to others that are more easily manageable. Moreover, files containing multiple Levels of Detail are not always consistently read and interpreted, resulting in overlapping geometries both during visualisation and in any later use of the data in analysis. The average support for multi-LoD data is in fact 30%. The complexity and richness of the object-oriented semantics is also sometimes lost when the data are imported. The management of LoD3 data (in CityGML this includes data sets with the definition of details on facades, like windows and doors) is generally well supported, at least for visualisation. In general, however, the level of support for reading and interpreting the data and for executing analysis using this data is limited, as shown in Figure 4.
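To illustrate why multi-LoD files need explicit handling, the sketch below lists which Levels of Detail are present for each building, so that a tool could select a single one instead of drawing all of them on top of each other. It is a minimal example that assumes CityGML 2.0 style property names such as lod1Solid or lod2MultiSurface and matches on local element names only; the function names and the sample fragment are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Geometry properties in CityGML 2.0 are named lod<N><Type>, e.g. lod2MultiSurface.
_LOD_PROPERTY = re.compile(r"^lod([0-4])[A-Z]")

SAMPLE = """<CityModel xmlns="http://www.opengis.net/citygml/2.0"
    xmlns:bldg="http://www.opengis.net/citygml/building/2.0"
    xmlns:gml="http://www.opengis.net/gml">
  <cityObjectMember>
    <bldg:Building gml:id="b1"><bldg:lod1Solid/><bldg:lod2Solid/></bldg:Building>
  </cityObjectMember>
  <cityObjectMember>
    <bldg:Building gml:id="b2"><bldg:lod2MultiSurface/></bldg:Building>
  </cityObjectMember>
</CityModel>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to names."""
    return tag.rsplit("}", 1)[-1]


def lods_per_building(citygml_text):
    """Map each Building gml:id to the sorted list of LoDs found inside it."""
    root = ET.fromstring(citygml_text)
    result = {}
    for elem in root.iter():
        if local_name(elem.tag) != "Building":
            continue
        gml_id = next(
            (value for key, value in elem.attrib.items() if local_name(key) == "id"),
            None,
        )
        lods = set()
        for child in elem.iter():
            match = _LOD_PROPERTY.match(local_name(child.tag))
            if match:
                lods.add(int(match.group(1)))
        result[gml_id] = sorted(lods)
    return result


print(lods_per_building(SAMPLE))
```

A building reported with more than one LoD carries several alternative geometries, and rendering or analysing all of them at once produces the overlaps described above.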
Finally, larger data sets (e.g. a whole medium-sized city, Amsterdam, in the lowest 3D level of detail, LoD1) often led the software to crash or under-perform, which is a problem for using such data for city analysis. As with the support for IFC in Task 1, analysis and inspection of the models re-exported by the tested tools reveals great inconsistency: the only tool maintaining one hundred percent of the same entities and features is the Safe Software FME Workbench, and even this was not true in all cases. In other packages, many of the entities are lost, or the number of entities increases without an apparent reason. It is possible that objects are split or duplicated, which could explain the high number of new features.
5 https://www.nist.gov/services-resources/software/ifc-file-analyzer
The Amsterdam model, which is geometrically the least complex, being an LoD1 model, is the one where the fewest changes happen. No new objects are added, but performance was still poor in some software, where the number of lost entities exceeds 50% in total (see Figure 5). However, in this specific case, where the model was exported by the novaFACTORY 6 software, the cause is probably the large size of the file: the participants reported a crash during the export, which probably occurred before all the elements had been exported. Problems with the exported multi-LoD Rotterdam models were also encountered, without any clear pattern or consistency, which makes it very difficult to analyse and understand the problems that the various software packages had with this data. The number of entities decreased by 84% in one of the ESRI ArcGIS tests, while it increased by 338% in another one using the same software. Similarly, for the geometry information, the differences in geometric objects vary from an average of -76% in one of the ESRI ArcGIS tests to an average of +32% for Safe Software FME and 3DCityDB. Other FME-based conversions were tested for the same dataset, giving completely different results, as also happens for the other data sets.
Similar problems were encountered when testing the BuildingsLoD3 dataset, in which the number of entities varies considerably. For example, a large and unexpected increase in bldg:BuildingInstallation entities is observed in one of the ArcGIS tests, whilst the models exported in other ArcGIS tests show the complete loss of many entities and the addition of others. Again, no pattern was found that could explain the behaviour of the software or the kind of data or format problem.
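The percentages quoted above are relative changes in the number of entities between the original and the re-exported model. The following sketch shows the arithmetic; the function names are hypothetical, and the counts are invented solely to reproduce the magnitudes reported for the Rotterdam tests (-84% and +338%), since the underlying absolute numbers are not given here.

```python
def percent_change(before, after):
    """Relative change in percent; None when the type was absent in the original."""
    if before == 0:
        return None
    return 100 * (after - before) / before


def entity_changes(counts_before, counts_after):
    """Percent change per entity type, covering types present in either model."""
    names = sorted(set(counts_before) | set(counts_after))
    return {
        name: percent_change(counts_before.get(name, 0), counts_after.get(name, 0))
        for name in names
    }


# Hypothetical counts, chosen only to reproduce the magnitudes quoted in the text.
original = {"bldg:Building": 1000, "bldg:BuildingInstallation": 50}
export_a = {"bldg:Building": 160, "bldg:BuildingInstallation": 50}
export_b = {"bldg:Building": 4380, "gen:GenericCityObject": 12}

print(entity_changes(original, export_a))
print(entity_changes(original, export_b))
```

Returning None for types that did not exist in the original keeps newly added entities visible in the report without dividing by zero.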

Task 2 - Georeferencing BIM data
Generally, the tested tools allowed the manual georeferencing of the imported IFC models. However, the tools were not able to read and write all the georeferencing elements in IFC, and in particular problems were encountered with IFC4 (which includes enhanced georeferencing options when compared to IFC 2x3). Some software packages still make use of an ad-hoc solution for reading and writing georeferencing using other IFC elements.
Moreover, different tools interpret the various kinds of georeferencing options offered by IFC (Clemen, Hendrik, 2019) in different ways. That makes it even more difficult to manage georeferencing consistently.
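As a rough illustration of how the available georeferencing information differs between files, the sketch below reports the declared schema of an IFC STEP file and whether it contains the entities most commonly associated with georeferencing: IfcSite, which can carry a reference latitude, longitude and elevation, and the IfcMapConversion and IfcProjectedCRS entities introduced with IFC4. The function name and the sample text are hypothetical, and the presence of an entity does not imply that its attributes are actually filled in.

```python
import re

# Captures the first schema identifier in the STEP header, e.g. 'IFC4' or 'IFC2X3'.
_SCHEMA = re.compile(r"FILE_SCHEMA\s*\(\s*\(\s*'([^']+)'", re.IGNORECASE)

_GEOREF_ENTITIES = ("IFCSITE", "IFCMAPCONVERSION", "IFCPROJECTEDCRS")


def georeferencing_summary(step_text):
    """Report the declared schema and which georeferencing-related entities occur."""
    schema = _SCHEMA.search(step_text)
    upper = step_text.upper()
    summary = {"schema": schema.group(1) if schema else None}
    for name in _GEOREF_ENTITIES:
        # An instance looks like "#12= IFCSITE(" in the DATA section.
        summary[name] = bool(re.search(r"=\s*" + name + r"\s*\(", upper))
    return summary


# Hypothetical IFC4 fragment with a map conversion but no site entity.
sample = (
    "ISO-10303-21;\nHEADER;\nFILE_SCHEMA(('IFC4'));\nENDSEC;\nDATA;\n"
    "#1=IFCPROJECTEDCRS('EPSG:28992',$,$,$,$,$,$);\n"
    "#2=IFCMAPCONVERSION(#9,#1,85000.,446000.,0.,1.,0.,$);\n"
    "ENDSEC;\nEND-ISO-10303-21;\n"
)
print(georeferencing_summary(sample))
```

Such a summary only shows which option a file uses; interpreting the values consistently across tools is the harder problem discussed above.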
The issue of cartographic projection is also important in georeferencing BIM, especially when dealing with projects covering a large extent, in which the Earth's curvature should be taken into account. In those cases it is not just a technical procedure: a full methodology is required to find the proper reference points and apply the correct transformation in 3D. Furthermore, discussion with the software vendors will be required to ensure that this approach can be implemented correctly; to date there is no tool allowing this in an easy and straightforward way.

Task 4 - Conversions IFC to CityGML and CityGML to IFC
When converting from IFC to CityGML, the majority of the conversions delivered for the benchmark, with the available tools, produce very generic models, with most (or all) entities being mapped to the most generic class of the target data model. In fact, in many cases all the entities are converted from IFC to CityGML GenericCityObject.
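The effect described above can be reproduced with a simple class lookup that falls back to the generic class whenever no specific correspondence is defined. The mapping table below is hypothetical and deliberately incomplete; it does not describe the behaviour of any of the tested tools, and a real conversion would also have to transform the geometry, not only the class names.

```python
GENERIC = "gen:GenericCityObject"

# Hypothetical and deliberately incomplete class mapping (illustration only).
IFC_TO_CITYGML = {
    "IfcBuilding": "bldg:Building",
    "IfcDoor": "bldg:Door",
    "IfcWindow": "bldg:Window",
}


def map_class(ifc_class, mapping=None):
    """Return the CityGML class for an IFC class, or the generic fallback."""
    mapping = IFC_TO_CITYGML if mapping is None else mapping
    return mapping.get(ifc_class, GENERIC)


def generic_share(ifc_classes, mapping=None):
    """Fraction of the input classes that end up as the generic fallback."""
    classes = list(ifc_classes)
    if not classes:
        return 0.0
    generic = sum(1 for c in classes if map_class(c, mapping) == GENERIC)
    return generic / len(classes)


model = ["IfcBuilding", "IfcWall", "IfcSlab", "IfcDoor"]
print(generic_share(model))              # with the partial mapping
print(generic_share(model, mapping={}))  # with no mapping at all
```

With an empty mapping every element becomes generic, which corresponds to the outcome observed in many of the delivered conversions.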
Only one more successful attempt was delivered, which made use of multiple tools to ensure a more consistent conversion from the typical BIM structure (solids and detailed elements) to the GIS one (external, less detailed surfaces). In such cases, where the process is more complex, the output often has a very irregular geometry, composed of many triangles and non-closed surfaces (e.g. Figure 6). This demonstrates the difficulty of achieving a workable transformation, especially by means of existing tools and plug-ins (a number of different settings of the ArcGIS Pro Data Interoperability extension and of FME were trialled to achieve the conversion successfully).
While a second successful conversion was achieved with the IFC2CityGML tool (Stouffs et al., 2018), this used CityGML 3.0, which includes very detailed elements that are similar to IFC elements. This makes the conversion easier to achieve, although the challenges of converting to a different kind of representation, more typical in GIS, still exist (they were not addressed by this conversion).
The conversions from CityGML to IFC, similarly, did not make changes in the geometry, but the semantics associated with each element changed according to the destination data model.
The geometric processing was very limited in most cases: in CityGML to IFC conversions, surfaces remain surfaces in the converted IFC version, and in IFC to CityGML conversions, solids are often converted either to solids or to generic gml:Geometry, without following the modelling practice of the respective target models.
Figure 6. Model converted from IFC to CityGML in one of the attempts that aimed most at a representation consistent with 3D city model practice, by means of a specific workflow in FME.

THE GEOBIM BENCHMARK WORKSHOP: GROUP DISCUSSION OUTCOMES
Wider issues relating to GeoBIM integration, in terms of data but also of more procedural and use-case-related challenges (e.g. at management level), were discussed during a GeoBIM benchmark workshop held in Amsterdam in December 2019. Participants (more than 60 in total) included a mix of industry representatives, students, researchers, municipal employees and others.
The workshop had an educational aim, linking topics from 3D city modeling and Building Information Modeling, and presenting the basics of the respective standards as well as an overview of the use cases that GeoBIM integration can support. In addition the research team wanted to better understand how such technology could be helpful for practice. During the workshop the research team was able to collect feedback from people and stakeholders in and outside academia to better understand how to approach subsequent stages of the project.
A working session was organised for people to describe how they would address GeoBIM issues not relating to technical challenges. Groups of 5-6 people discussed how to address one or more challenges regarding GeoBIM adoption in their own organisation and/or based on their own experience. The outcomes of these discussions were finally sketched on paper and presented to the others for further discussion.
The proposed topics for discussion were:
• Lack of understanding of GeoBIM and how/why it can be useful
• Lack of GIS/BIM skills and especially combined skills across both
• High cost of implementation (changes to workflows, new staff skills)
• What are the best standards to use in GeoBIM (and how to use them)?
• How to make Geo and BIM worlds talk to each other?
• Who are the important stakeholders to involve?
• What are the three major factors that could significantly push GeoBIM integration forwards?
Four key factors emerged from the discussion:
1. The need for more interdisciplinary or transversal education in relation to spatial data in general and in particular GeoBIM and its potential.
Interestingly, one of the participant groups highlighted the potential of embedding this in school level education, as part of curriculum activities relating to energy management and climate change, which are currently hot topics. Free courses were also mentioned as a route towards further educating people about GeoBIM. The role of universities in this task was also mentioned, as well as the need to establish a common terminology. One group also highlighted digital illiteracy as a general issue.
2. The need for software and data to support GeoBIM activities -in particular open source software.
Native support or plug-ins were both suggested as approaches to take. During their discussion, one group highlighted the difficulty of sharing data which contrasts with the need for open data to drive GeoBIM forwards. Standardised workflows were also considered important. The concept of "elephant tools" was mentioned to highlight the fact that not everything needs to be translated and tools can be over-engineered (small tools are better).
3. The need for more fully costed working use cases for GeoBIM.
A number of groups highlighted potential use cases (energy, climate change, planning, flood modelling, terrorist attacks, sound/noise, shadow/solar). One group also suggested, during their presentation, that it would be necessary to highlight the use cases supported by GeoBIM that cannot be handled in other ways. Another group took this further and highlighted the importance of the financial aspect of the problem: specifically, who is going to pay for this integration and for the maintenance of the integrated data.
4. The need for a legal push, such as requiring BIM for new buildings.

CONCLUSIONS
As can be gleaned from the summary results, the benchmark exercise proved very valuable in terms of gathering best practices and data about the functioning of tools that manage standardized data in practice. In particular, the participants provided a very high level of detail in their answers, which will take time to process.
However, based on the preliminary results, the aim of identifying issues that cause a consistent problem across multiple software packages was only partially met, since in many cases it was not possible to recognize patterns in the errors or inaccuracies, or in the loss/gain of objects or other data through the import-export and conversion processes. One consideration here might be the different levels of expertise among the participants: perhaps novice participants encountered problems which experts instinctively overcame, or for which experts had workarounds that were not documented in their responses.
During the discussion sessions at the workshop organised in the context of the project, it was interesting to note that nobody discussed which standards, other than CityGML and IFC, could be useful (and how) for GeoBIM. Perhaps the participants considered this a solved issue, since world-wide organizations are supporting IFC and CityGML. However, as this benchmark demonstrates, many issues still need to be addressed.
As well as addressing these through further software development, it should also be noted that the standards bodies themselves are focusing on this integration problem. Indeed, as noted in Section 3.4, CityGML 3.0 has a data model that is closer to IFC than previous versions, which should be useful for future integration and indicates a closer alignment of the standards, although it will still be a challenge to convert the detailed data that originates from BIM into concepts useful for GIS.
The conclusion that can be drawn from this study is that, at the moment, no easy and systematic interoperability is offered by the existing tools, whether between BIM tools (IFC), between geospatial tools (CityGML) or between BIM and geospatial tools. The high level of complexity of the standards makes them very difficult to implement, understand and deploy. Furthermore, the many optional approaches to implementation within each standard (for example the many kinds of geometries that are compliant or the alternative ways in which an entity can be defined) make it very difficult to find a unique way to implement and use them, which is an issue for both users and software developers.
In future work it is recommended that standards-setters simplify the models and make them less ambiguous, as well as more manageable within the tools and more understandable for users. Some efforts towards this have already begun: both towards making a more workable version of CityGML, by using another kind of encoding, in JSON (see CityJSON 7 in (Ledoux et al., 2019)), and towards creating an improved version of IFC 8 .