CRITICAL REVIEW OF THE INTEGRATION OF BIM TO SEMANTIC WEB TECHNOLOGY

The AEC-FM industry (Architecture, Engineering, Construction and Facilities Management) is increasingly using building information modeling (BIM) methodologies to solve complex challenges. With the help of Semantic Web (SW) technology, product data models and other relevant information are increasingly linked to BIM models. The article discusses whether existing BIM standards and formats are able to meet future requirements, and it argues that today's BIM standards make it difficult for the construction industry to fully utilize semantic web technology. The article provides suggestions for further research, and it specifically calls for more strategic research that looks beyond the challenges associated with individual, limited case projects. Furthermore, it is suggested that previously developed SW resources should be gathered, so that earlier initiatives are easier to find, use and build upon. The literature study shows many initiatives spread across many domains in the AEC-FM area. Most of the studied articles have a strong technological focus, where semantic web opportunities are tested in a chosen case. The findings of this study can be used as a starting point for further strategic research and development.


INTRODUCTION
Building Information Modelling (BIM) is a very broad term in the AEC industry that describes the process of creating and managing digital information about built assets such as buildings, roads, bridges and tunnels. BIM has been developed for more than three decades, and its potential and the research focus on it seem to increase year by year. This emerging trend has deeply changed the industry in the ways it focuses on, defines, tailors, and manages the semantics of product models closely linked to geometry (Pauwels et al, 2017, Ebravipour, 2015). BIM has evolved from level 0 (figure 1), with Computer Aided Design (CAD), through managed CAD in 2D or 3D (level 1), to a managed 3D environment (level 2) with data attached from separate discipline models (i.e. owner/facility management, architect, structural engineer, electrician and various service engineers).
Level 3 integrates electronic building information modeling with fully automated connectivity and cloud/web storage of models. The last level, Enterprise BIM (level 4), is quite new; it has an overall business perspective and is knowledge based and process driven in a model server environment. At levels 3 and 4, it will be crucial that data are searchable by both humans and machines.
BIM deliveries in building contracts require use of the open exchange format IFC (Industry Foundation Classes), even though there are considerable challenges in integrating it with SW, even after the latest version, IFC4 (Liebich et al., 2013). Researchers started to suggest SW technology solutions for the AEC industry in the early 2000s (Pan et al., 2004, Elghamrawy and Boukamp, 2008).
Some researchers have later focused on the added value of using SW with BIM by enabling data integration and complex searches across multiple data sources (Shen and Chua, 2011). They refer to SW as one of three web technologies (semantic search, cloud computing and mobile computing) that are not commonly used in the construction sector, but argue that SW could provide considerable value in addition to the already existing BIM technologies. Other researchers have emphasized the importance of these technologies for improving information exchange in the construction industry (Aziz et al., 2004, Aziz et al., 2006, Pauwels et al, 2015).
The use of sensor technology at construction sites and in facility management has increased, which adds to the value of SW technology in this area. Radio Frequency Identification (RFID) uses electromagnetic fields to automatically identify and track tags attached to objects, and researchers have incorporated this sensor technology into SW technology and shown how building document information can be managed using RFID-based semantic contexts (Elghamrawy and Boukamp, 2010).
Furthermore, other researchers (Rezgui et al., 2011, El-Diraby, 2013) discuss reasons to focus on SW-based, service-oriented approaches instead of trying to develop common data standards, which need to be complete before the industry can realize the major benefits in developing automated processes. Rezgui (Rezgui, 2011) argues that support processes need to be taken into account through the development of service-oriented approaches, that the human communication aspect needs to be improved, and that existing information systems must be migrated based on the development of ontologies. This tendency to use SW technologies has recently also been embraced in the technical roadmap of BuildingSMART (BuildingSMART, 2014), as shown in figure 2. The figure illustrates the three longstanding levels of the technical roadmap, supplemented by a fourth level to the right with 'semantic search in the cloud' and a 'cloud library'. It is a major task for the industry to create and implement process- and product-standardized digital workstations. The industry needs to collaborate to become effective. According to buildingSMART International, development and implementation must be done gradually, and the key stages have to be mapped. The roadmap directions (figure 2) set out the requirements for the underlying standards and their development to enable web-based exchanges.
To realize this part of the technical roadmap, the Linked Data Working Group (BuildingSMART, 2015) has been launched, aiming to support the usage of SW technologies in the construction industry, such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL).
The main focus is on these two research questions: 1. What is the status of using semantic web within BIM? 2. Which research challenges need to be solved, and how can they alternatively be solved with SW?
The selection of literature is mainly based on publications from Web of Science, American Society of Civil Engineers, Science Direct and Scopus, retrieved with combinations of the following keywords: SW, BIM, linked data, facility management, OWL and IFC.
Section 2 gives a short overview of some relevant SW techniques. Section 3 presents BIM and ontologies. Section 4 focuses on linking across domains, while section 5 discusses the findings. Section 6 gives a qualitative overview of the conclusions from the literature study and outlines some recommended directions for further research.

SEMANTIC WEB
Semantic Web (SW), Web 3.0, Linked Data Web, Web of Data: whatever we call it, SW represents a major potential for linking information. Building information can be linked effectively from one source to another, and because the information can be understood by computers, they can perform more and more advanced tasks on our behalf. The languages developed for SW are based on metadata models in RDF and RDF Schema (RDFS) format (Brickley and Guha, 2014) and logic-based knowledge representation in the Web Ontology Language (OWL) (W3C, 2014). RDFS and OWL are ontology languages built upon RDF. SW is concerned not with the structure of the data but with its meaning. Semantic technologies also include Natural Language Processing (NLP) and semantic search. SW and NLP have the common goal of representing information so that it is understandable to a machine and not just to a human being. This fundamental difference gives a completely different perspective on how storing, querying, and viewing information can be approached. Applications that draw on large amounts of data from many different sources benefit greatly from this feature. However, SW offers little advantage for storing large amounts of highly structured transaction data. It is therefore important to know when it is advisable to use SW technologies and when it is not. SW is based on two basic ideas:
• Associating meta information with Internet-based resources. Metadata is information about other data, and it can be explicit or implied.
• The ability to reason about the meta information, although this is at a relatively early stage of development.
Further, there are a number of techniques for providing such information, such as the query language SPARQL (W3C, 2013), machine learning and other statistical techniques.
Description Logic provides a mathematical basis for knowledge representation systems and can be used to reason about the information. NLP is an important and ongoing research area in theoretical computer science and artificial intelligence; it can look beyond the web and process text from any source, such as PDFs.
Both SW and NLP have been thoroughly studied by researchers in terms of language syntax, and both aim to understand languages, especially text. In recent years, a lot of resources have been spent on SW. Both SW and NLP focus strongly on how to represent relationships in text and miscellaneous structures.
SW and NLP are very different, but they are in one way complementary. SW deals with representation, standardization and reasoning about "facts". Important problems are defining vocabularies and designing so-called ontologies. SW does not deal very much with the question of where these "facts" come from. NLP tries to automatically understand the meaning of natural language texts, and as a lower-level activity it can serve as input for SW. BIM modelling often requires reliable results, and NLP does not deliver 100% accurate results (Gao et al, 2015). Many NLP techniques are based on statistics, so questions of precision and recall play an important role.
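As a minimal illustration of the two measures mentioned above, the following sketch computes precision (the share of retrieved items that are relevant) and recall (the share of relevant items that were retrieved). The function name and the sample identifiers are hypothetical and are not taken from any of the reviewed articles.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved items judged against relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical case: an NLP step extracts four text fragments as mentions of a
# building product, while five fragments in the document truly are mentions.
extracted = {"d1", "d2", "d3", "d9"}
gold = {"d1", "d2", "d3", "d4", "d5"}

p, r = precision_recall(extracted, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.60
```

A high precision with a low recall means that what was extracted is mostly correct but much was missed, which matters when the extracted "facts" are fed into a BIM model that is expected to be complete.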
Resources in SW are identified with a Uniform Resource Identifier (URI). With SPARQL, the query language of the SW, developers select which sources of information to search for answers to various questions. For this reason, SW applications need access to data through federated or distributed queries. In fact, it is easier to describe RDF, RDFS and OWL in terms of SPARQL (Allemang and Hendler, 2011). Radulovic et al. (Radulovic et al., 2015) argue that guidelines for Linked Data generation are needed for each domain as well.
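The idea of URI-identified resources queried by pattern can be sketched in a few lines. The example below is not a SPARQL engine; it is a simplified stand-in that stores subject-predicate-object triples in a Python set and matches a single triple pattern, which is the basic building block of a SPARQL query. The namespace, the property names and the building elements are all hypothetical. In practice a library such as rdflib or a dedicated triple store would be used.

```python
# Hypothetical namespace and vocabulary, for illustration only.
EX = "http://example.org/building#"

triples = {
    (EX + "Wall_01", EX + "type", EX + "Wall"),
    (EX + "Wall_01", EX + "locatedIn", EX + "Room_101"),
    (EX + "Door_07", EX + "type", EX + "Door"),
    (EX + "Door_07", EX + "locatedIn", EX + "Room_101"),
    (EX + "Room_101", EX + "type", EX + "Space"),
}


def match(pattern, graph):
    """Yield variable bindings for one triple pattern; terms starting with '?' are variables."""
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in the pattern must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


# Roughly analogous to: SELECT ?element WHERE { ?element ex:locatedIn ex:Room_101 }
pattern = ("?element", EX + "locatedIn", EX + "Room_101")
results = sorted(b["?element"] for b in match(pattern, triples))
print(results)
```

Because every element is named by a full URI, triples from a second source that use the same URIs could simply be added to the set and would be found by the same pattern.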

Limitations with the IFC format
A number of factors limit the usefulness of the languages incorporated into STEP technologies for defining engineering ontologies (Beetz et al., 2009):
• Unlike OWL and other Description Logic based ontology languages, the EXPRESS-based IFC format is not based on a mathematically rigorous theory. A logic-based set of axioms and theories is therefore needed to benefit from existing "intelligent" algorithms.
• In a wide and domain-independent SW context, the way the metamodel is encoded in the STEP standard (STandard for the Exchange of Product model data) is a major constraint, and interoperability is restricted. As a consequence, the opportunities to exploit external ontology resources and to reuse engineering ontologies are inhibited. The severe structural constraints caused by the file-based indexing of EXPRESS and by attribute scoping that is local to entity definitions are obstacles to easy distribution.
• There is a lack of good support mechanisms (formats) for distributing schemas and instances across networks.
Despite all of the above limitations of STEP-based IFC files, much of the ongoing research in the BIM area is based on the STEP approach to information modeling and exchange (Radulovic et al., 2015). Yet knowledge representation (Sowa, 2000) has increasingly been identified and highlighted as a key area for future research projects and is used in a number of previous and ongoing projects.

From IFC to ifcOWL Ontology
For several years, many researchers have focused on how the EXPRESS-based IFC schema can be converted to an OWL ontology (Pauwels et al, 2017, Terkaj and Sojic, 2015, Pauwels and Terkaj, 2016). Most of the formalizations performed so far must technically be considered transcriptions rather than translations: the resulting OWL systems are usually semantically richer than the IFC starting point, but they still lack the depth of a translation (Borgo et al, 2014). Many researchers argue that EXPRESS lacks formal semantics (Beetz et al., 2005, Beetz et al., 2009, Krima et al., 2009, Barbau et al., 2012). At the same time, they claim that OWL offers the possibility of axiom-based theories, which better support knowledge representation and semantic data sharing.
Unlike OWL and other Description Logic (DL) based ontology definition languages, EXPRESS, the schema language of IFC, is not based on a mathematically rigorous theory (Beetz et al., 2009, Pauwels and Terkaj, 2016). According to Beetz, a logic-based, demonstrable set of axioms and theories is required in order to benefit from existing "intelligent" algorithms and technologies. Apart from the STEP initiative, the popularity of EXPRESS in the AEC-FM industry is very limited, which often prevents the reuse of existing ontologies or interoperability tools, especially those related to SW (Beetz et al., 2009). Beetz et al. (Beetz et al., 2009) have explored a semi-automatic method for lifting EXPRESS schemas to OWL files. The conversion efforts have resulted in a recommended ifcOWL ontology, which stays close to the EXPRESS schema. Figure 3 shows that IFC is defined in the EXPRESS schema language, but IFC is also available in XSD form (ISO 10303-28), allowing building models to be shared as ifcXML files.
Figure 3. The IFC data model is available in EXPRESS (native), XSD, and OWL format on a schema level, allowing building data to be captured and used with three different technologies: IFC STEP, XML, and RDF (Pauwels et al, 2017).
The purpose of providing the XML option is to enable other industry domains that use the XML format to utilize the IFC schema.
Similarly, the BuildingSMART Linked Data Working Group has proposed and maintains the ifcOWL ontology as a second alternative form (buildingSMART, 2015). In addition, building models can also be expressed as RDF graphs, but it is common to consider the EXPRESS form as the data model (gray in figure 3), with the XSD and OWL variants as derivatives (Pauwels, 2017). Barbau et al. developed the OntoSTEP model (Barbau et al., 2012) by defining the rules used for the automated conversion from EXPRESS to OWL. Several research groups point out that the conversion of IFC to OWL enables the use of SW technologies with building information models and even simplifies linking between various IFC models and databases (Barbau et al., 2012, Schevers and Drogenmuller, 2006, Zhang and Issa, 2011). In addition, other researchers (Pauwels et al., 2011) use the ability to add Semantic Web Rule Language (SWRL) rules to enrich an OWL version of IFC while facilitating the use of reasoning engines.
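To give an impression of what a rule-based conversion from EXPRESS to OWL involves, the following toy sketch maps one simplified EXPRESS entity to OWL statements in Turtle syntax. It is not the ifcOWL or OntoSTEP rule set, which are considerably richer; the three mapping rules (entity to class, SUBTYPE OF to rdfs:subClassOf, attribute to object property), the `ex:` namespace and the function name are assumptions made for illustration, and the EXPRESS fragment is abbreviated.

```python
import re

# Simplified, illustrative EXPRESS fragment (WHERE rules and SUPERTYPE clauses omitted).
EXPRESS_SNIPPET = """
ENTITY IfcWall
  SUBTYPE OF (IfcBuildingElement);
  PredefinedType : OPTIONAL IfcWallTypeEnum;
END_ENTITY;
"""

PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix ex: <http://example.org/ifc#> .\n"  # hypothetical namespace
)


def express_entity_to_turtle(express):
    """Apply three toy mapping rules to a single EXPRESS entity definition."""
    name = re.search(r"ENTITY\s+(\w+)", express).group(1)
    parent = re.search(r"SUBTYPE\s+OF\s*\(\s*(\w+)\s*\)", express)
    attributes = re.findall(
        r"^\s*(\w+)\s*:\s*(OPTIONAL\s+)?(\w+)\s*;", express, flags=re.MULTILINE
    )

    lines = [f"ex:{name} a owl:Class ."]  # rule 1: entity -> class
    if parent:  # rule 2: SUBTYPE OF -> subclass axiom
        lines.append(f"ex:{name} rdfs:subClassOf ex:{parent.group(1)} .")
    for attr, _optional, attr_type in attributes:  # rule 3: attribute -> property
        prop = attr[0].lower() + attr[1:]
        lines.append(
            f"ex:{prop} a owl:ObjectProperty ;\n"
            f"    rdfs:domain ex:{name} ;\n"
            f"    rdfs:range ex:{attr_type} ."
        )
    return PREFIXES + "\n".join(lines) + "\n"


turtle = express_entity_to_turtle(EXPRESS_SNIPPET)
print(turtle)
```

Even this toy version shows why such conversions stay close to the EXPRESS schema: each rule transcribes one syntactic construct, and anything the rules do not cover (here, for example, the OPTIONAL keyword) is silently lost unless further rules are added.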
Terkaj et al. (Terkaj and Urgo, 2012) have developed a modular OWL ontology for factory modeling and data sharing between heterogeneous and geographically distributed software tools.
The ontology, called the Virtual Factory Data Model (VFDM), is mainly based on IFC. The conversion to OWL was inspired by the method of Beetz et al. (Radulovic et al., 2015, Borgo et al, 2014), and for that reason a relevant part of VFDM can also be used for other domains that are not directly related to factories and production. These ontology modules focus on production-related factors and their interrelations. Other previous research has also focused on this (Colledani et al., 2008, Colledani et al. 2009), presenting a conceptual model based on UML class diagrams (Pauwels et al., 2011) and a relational database (Colledani et al. 2009) to support the design of factories consisting of flexible production systems (Gola and Świć, 2011, Gola and Świć, 2014).
VFDM can be utilized in commercial applications to enable interoperability between different tools by developing VFDM-based I/O data exchange plug-ins (Terkaj and Urgo, 2012, Terkaj and Urgo, 2014). A research group (Abanda et al., 2017) has identified and used appropriate ontology design methods and tools (e.g. Protégé-OWL 3.5) for 5D modeling, based on an ontology developed from the New Rules of Measurement (NRM) for cost estimation during the tender stages. Protégé (Knublauch et al., 2004), one of the leading open-source ontology editors, was used, and the inferences and consistency of the ontologies were checked using reasoners. Instances can be generated automatically or manually in Protégé. Abanda et al. (Abanda et al., 2017) argue that the names must be consistent with other ontological concepts; automatic creation was therefore used, and the names were adapted to IFC nomenclature or native BIM software names. Although BIM is a powerful digital model, its use is limited by the challenges of extracting custom data. This further limits the ability to use the data in different business processes (Nepal et al., 2013).
The approach of Abanda et al. combines ontologies with a 3D BIM model to facilitate information extraction from BIM models based on the New Rules of Measurement (NRM) for cost estimation during the tendering stages. BIM-based ontologies using NRM can be re-used, shared and used for other intelligent applications. Abanda et al. (Abanda et al., 2017) found through a literature search that the vast majority of other BIM-based cost-estimation techniques are not based on ontologies. Nevertheless, the area of BIM cost estimation ontologies is constantly evolving, and below is a collection of peer-reviewed literature from the area:
• Abanda et al. (Abanda et al., 2011) developed an ontology to estimate the cost of labor in construction projects.
• Lee et al. (Lee et al., 2014) proposed an ontology-based BIM approach for construction cost estimation.
• Ma and Liu (Ma and Liu, 2014) developed a BIM-based system for cost estimation of building projects, without exploiting the concepts of ontologies.
• Lawrence et al. (Lawrence et al., 2014) proposed a generic approach to cost estimation using flexible mapping between a building model and a cost estimate.
• Choi et al. (Choi et al., 2015) developed a methodology connecting volume and area BIM data with unit costs and developed a quantity takeoff prototype system.

Areas for linking data
There are many relevant domains for linking BIM data, and this article looks at only a few. Pauwels et al. (Pauwels et al., 2017) have conducted a comprehensive literature study of the development and application of SW technologies in the AEC domains. Their survey and analyses provide a good strategic map and basis for future research on the use of SW technologies in the AEC domains. The results show that SW technology plays an important role in logic-based applications and in applications that require information from multiple application areas (such as BIM, GIS, energy, infrastructure, product manufacturer data). The article points to challenging research opportunities related to the creation and maintenance of links between different datasets, as well as the development of well-functioning implementation methods related to current programming techniques, different types of user input, and automation of procedures.
Projects within AEC-FM involve many participants in all parts of the construction process. These practitioners must exchange and combine information in the design, construction and operational phases. There is very often a need to combine different domains (figure 4). Niknam and Karshenas (Niknam and Karshenas, 2017) organized building elements according to the UNIFORMAT II classification system (ASTM standard, 2015) and defined a shared BIM ontology that defines design properties in a knowledge base. In their project, the BIM knowledge base was linked with schedule and cost knowledge bases, and information was obtained using queries. The results show that mapping of shared ontologies is efficient and transferable to related areas.
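The principle of linking a design knowledge base to a cost knowledge base through a shared classification can be sketched as a simple join. The example below does not reproduce the ontology or queries of Niknam and Karshenas; the element names, quantities, unit costs and the function name are hypothetical, and plain Python dictionaries stand in for the knowledge bases. Only the idea of using a shared classification code as the linking key is taken from the text.

```python
# Hypothetical design knowledge base: each element carries a classification code.
bim_kb = [
    {"element": "Wall_01", "uniformat": "B2010", "area_m2": 40.0},
    {"element": "Wall_02", "uniformat": "C1010", "area_m2": 25.0},
    {"element": "Wall_03", "uniformat": "C1010", "area_m2": 15.0},
]

# Hypothetical cost knowledge base keyed by the same codes (unit cost per m2).
cost_kb = {"B2010": 120.0, "C1010": 60.0}


def cost_by_class(elements, unit_costs):
    """Join both knowledge bases on the shared code and total the cost per class."""
    totals = {}
    for element in elements:
        code = element["uniformat"]
        rate = unit_costs.get(code)
        if rate is None:
            continue  # no link exists for this class, so it cannot be priced
        totals[code] = totals.get(code, 0.0) + element["area_m2"] * rate
    return totals


totals = cost_by_class(bim_kb, cost_kb)
print(totals)  # {'B2010': 4800.0, 'C1010': 2400.0}
```

The join only works because both sources agree on the classification codes, which is the role a shared ontology plays when the sources are RDF knowledge bases queried with SPARQL.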
In the following, the article focuses on linking BIM and GIS and on linking product manufacturer data.

BIM and GIS
In particular, the combination of BIM and GIS has gained more and more interest among standardization bodies and researchers in the last 3-4 years, and several activities are ongoing in this connection. GIS is not a particular focus of this article, but it is mentioned because BIM projects rely on GIS data, and this becomes more relevant as modeling maturity increases (Niknam and Karshenas, 2017).
BIM and GIS (based on CityGML) have different perspectives and levels of maturity. Nevertheless, there are a number of overlapping areas, and the gaps between the two disciplines are decreasing. Both limitations and potentials exist with regard to better integration of the two fields. GIS and BIM have an additional challenge in converting between semantic levels of detail. BIM models contain many geometrical and topological errors, which need to be properly handled and often fixed (Ohori et al., 2017). These errors may not be problematic within a BIM environment, partly because BIM software usually supports many more geometry types than GIS software, which also often makes data flow from GIS to BIM the easier direction.
In recent years, research and standardization efforts on the integration of BIM and GIS from a semantic point of view have increased. However, the exchange of information remains difficult. This is partly due to the different development objectives of the two systems (Choi et al., 2015).
In the area of building management, Donnel et al. (Donnel et al., 2013) illustrated how the combination of scenario modeling, linked data and complex event management can significantly improve the available information. Future building managers will benefit from having access to all the data they require from an organization. This requires the development of unique adapters for each relevant domain. Once established, linked data systems can be scaled further to portfolios of buildings while being exploited by multiple types of users.
Many different data domains (e.g. BIM, GIS, heritage, sensor data, simulation data, smart cities) are highly relevant for linking into one web of linked building data (Pauwels et al, 2017).

DISCUSSION OF RESULTS
The literature search shows many initiatives spread across many domains in the AEC-FM area. This industry is a large field, and it is not possible to study all research areas in depth.
The research question is rather wide, but the purpose has been to provide an overview. The review was deliberately kept general in order to see the major lines of development of SW usage associated with BIM. It seems important to remember the original ideas, principles and intentions of SW (Berners-Lee et al, 2001, Gao et al, 2015).
The main impression from the literature search is that many of the SW initiatives directed at the BIM area are struggling to become properly established. The reasons given are largely repeated in most of the studied articles. Limitations in the IFC format and a lack of functionality for exploiting the semantic possibilities are recurring themes. Linked data initiatives are largely highlighted as simpler, and it is pointed out that many of the projects involve too much human interpretation. It is also difficult to achieve good precision in searches (Venugopal et al., 2015). Progress towards machine learning and automation of SW processes is largely considered to have been relatively limited, but there are several good initiatives. It is still difficult to find useful BIM resources that match the user's needs (Venugopal et al., 2015).
Many actors in the construction industry have little knowledge about how to exploit SW technology. At the same time, the literature review has found few initiatives that are published as open and available. Two research prototypes were found, BIMSO/BIMDO (Choi et al., 2015) and BIMSeek/BIMSeek+ (a search engine for retrieving online BIM product resources); both are open and accessible on the web and allow semantic retrieval of BIM resources based on IFC (ASTM standard, 2015, Venugopal et al., 2015).
The large stakeholders want to use BIM throughout the whole lifetime of their buildings and wish to enrich the product features of their BIM models. The leading companies in the supplier market try to prepare their products for SW solutions, and the volume of SW-based building product libraries on the World Wide Web is growing rapidly. However, BIM resources usually originate from heterogeneous systems from a variety of manufacturers. The data are often ambiguous, which leads to uncertain categorization of product descriptions (Gao et al, 2017). This in turn makes it difficult to provide effective support for retrieving and categorizing the information.
For that reason, and to reduce ambiguity, there is a great need for semantic annotation of BIM information in natural language.

CONCLUSIONS AND FURTHER RESEARCH
The literature study clearly shows that there is great focus on, and belief in, the need for the BIM subject area to be supported by SW and linked data. Articles from a wide range of applications have been read and evaluated, and the literature study and technological development show that there is a great need to work with and connect data across disciplines. The study mostly found smaller and limited SW projects in the AEC industry, and most of the reviewed articles cover only a part of a regular BIM project. Many of the projects thus have the character of pilot projects, and rather few initiatives build further on previous initiatives. There is also insufficient focus on SW technology related to the challenges and possibilities in the facility management phase, even though there seems to have been a positive change in the last few years.
The literature study did not find any systematic overview of existing SW resources and their accessibility that could help new and existing users within the BIM field. This is considered a significant need, and the lack of such an overview will probably slow down development if it is not addressed. A good overview of existing standards, formats and libraries will enhance further development. Most articles in the literature study have a strong technological focus, where the semantic possibilities are tested in a chosen case. Much of the focus is on the weaknesses and limitations of the STEP-based IFC format, which was established before SW originated and is not designed on a platform adapted to easy utilization of all the possibilities within the SW and linked data field. Even with the weaknesses of the IFC format and the associated interoperability problems that have been highlighted, most of the studied articles show benefits from their projects. Nevertheless, it seems that the SW approach has not been fully adopted within the AEC industry.
An important intention of the web is to make information open and machine-readable. Several of the projects described in the articles involve a significant degree of human interpretation. This may help explain why the implementation of SW technology is relatively slow.
In order to get a better overall (life cycle) perspective, more attention should be given to stakeholders' needs and benefits from SW development, and this should therefore be an important basis for further research. The amount of attribute data increases during a building's life cycle, which means there is a need for more SW research related to operation and maintenance. Figure 1 (BIM-Venge) and Figure 2 (Roadmap) indicate that linking SW technology to BIM becomes more important at higher levels of BIM development. Only one article (Pauwels et al, 2017) has been found that describes the roadmap. In addition, very few projects were found that have reached level 3 in the figures, and it is from this level that the major possibilities for exploiting semantics on the web arise, for example web storage and sharing of models. As more projects reach this level, increasing pressure is expected to achieve good and well-functioning SW solutions. In this context, it is advisable to channel parts of the research into a more comprehensive and strategic track in the construction industry. One question that should be thoroughly considered is whether the current IFC format has too many limitations to meet future wishes and needs. Several researchers want a restart (Pauwels et al, 2017, Radulovic et al., 2015, Venugopal et al., 2015), even though they realize that it is very difficult to change a fragmented and conservative AEC industry.
However, the focus on such assessments should be intensified. Rule-based inferencing for semantic enrichment of BIM models has many applications, and some are being considered for automation (machine learning). The term "space" is the most primitive functional element in a building.
The survey shows a clear wish for further development of the "space" concept, which has great potential in the operation and maintenance phase. Often it is sufficient to connect much of the information to a specific room. This "space" information could be further improved by SW techniques such as linked data and reasoning.
The reviewed articles pay little attention to IFC4, which was released in 2013. For a number of technical reasons, however, this version has been implemented only to a limited extent. At the same time, it is necessary to look at the coming improvements in order to optimize the SW potential of the forthcoming IFC5 version.

Figure 2. The technical roadmap for process support by BuildingSMART, showing web technologies on the right side of the graph (BuildingSMART, 2014).

and the object. A collection of RDF statements basically represents a labeled, directed multi-graph. In theory, this makes an RDF data model better suited to certain types of knowledge representation than other relational or ontological models. In practice, RDF data is often stored
(Allemang and Hendler, 2011)ted by Sir Tim Berners-Lee in 1989 (Berners-Lee, 2001). SW links facts: instead of connecting to a particular document or program, you can refer to a specific piece of information contained in the document or program. If this information is updated, you can automatically take advantage of the update. SW does not make the data smarter, because that is not what SW needs (Allemang and Hendler, 2011). SW only needs the right data in the right place, so that different smart applications can do their work. But we need consistent data!
Linked Data is a set of design principles for sharing machine-readable interlinked data on the Web. Open Data, on the other hand, is data that can be freely used and distributed by anyone, subject at most to the requirement to attribute and share alike. Datasets that are both open and linked are Linked Open Data. There are some general guidelines for Linked Data generation, but researchers argue (buildingSMART, 2016) was just connecting external data. To realize this idea, Berners-Lee (Berners-Lee, 2006) released 4 basic principles for publishing data with a view to linking it on the web. These have later evolved into 5 basic principles.