FROM POINT CLOUD TO BIM: A SURVEY OF EXISTING APPROACHES

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
N. Hichri, Chiara Stefani, Livio De Luca, Philippe Veron, Gael Hamon


INTRODUCTION
In recent years there has been an increasing need for structured and semantically enriched 3D digital models of historical buildings in order to handle maintenance, restoration, conservation or modification projects more efficiently. To acquire accurate data on existing buildings, various survey techniques are adopted, such as laser scanning, which produces raw 3D point clouds of buildings. It is then necessary to find an efficient way to move from this raw 3D data to a complete and semantically enriched CAD building model.
The concept of Building Information Modeling (BIM), together with its expansion and democratization among professionals in the field of AED (Architecture, Engineering and Design), makes it essential in this quest for the semantization of digital mock-ups. BIM can be defined both as a technology and as a methodology. It is a technology because it is a digital representation of the physical and functional characteristics of a building, and it is a methodology because it enables collaboration between the various actors in the different phases of the building life cycle. It is also based on a set of structured architectural information on buildings, concerning components, their characteristics and the relations between them, and it makes it possible to complete and enrich the purely geometric description of a digital mock-up by associating semantic features with it.

Fundamental problem
However, in architecture there is no efficient software ensuring this direct shift from point clouds to complete enriched CAD models, even if some software companies (such as Autodesk with Revit) are proposing new tools for exporting point clouds. In practice, no dedicated software helps to semantically structure or efficiently segment point clouds of historical buildings. The specificity of historical components makes this task very difficult.
In addition, in order to build an efficient digital representation of historical buildings, it is essential to analyze and understand the entire chain that goes from the acquired 3D point clouds to well-structured and semantically enriched 3D digital models. Such a process should take into account three main steps: data acquisition, data segmentation and the enriched 3D model (BIM). It is therefore essential to detail the BIM approach, which is becoming widely used in architectural design studies but not yet sufficiently in cultural heritage. Our research focuses on the study of BIM techniques applied to existing buildings, the so-called "as-built" BIM or "as-is" BIM. The "as-built" BIM process consists in converting measurements of the geometry and appearance of existing buildings into a semantically rich representation. This process is based on a first phase of survey and data collection, followed by a second phase of data processing leading to the final semantically enriched model. Previous works have proposed several approaches to produce "as-built" BIM using different techniques and trying to automate its generation.

Aim and structure
This paper proposes a review of the existing approaches on the three main topics mentioned above: acquisition, segmentation and "as-built" BIM. Section two gives a quick review of 3D acquisition techniques. Section three briefly presents some point cloud segmentation approaches. Section four gives an overview of "as-built" BIM characterization, classifying methods of component representation according to shapes, relations and attributes. This classification is followed by a review of various "as-built" BIM approaches. Finally, a critical analysis of these approaches is carried out before the conclusion.

APPROACHES OF DATA COLLECTION
The main data acquisition techniques are topometry, photogrammetry and lasergrammetry.
Topometry includes traditional survey methods such as the use of an optical telescopic sight and a system measuring the angular direction of sight. These techniques give highly precise results, but they require a large amount of work to identify significant object structures that facilitate post-processing. Topometry is time-consuming and becomes more tedious as objects become more complex (Deveau, 2006). Photogrammetry is the technique that uses images taken from different points of view in order to build a 3D restitution of scenes and buildings (Guarnieri et al., 2004) (Grussenmeyer et al., 2001). An advantage of this technique is that the resulting point cloud is enriched with color information, which can help document the state of conservation of historical buildings. This technique is also less expensive than lasergrammetry.
Lasergrammetry is the easiest and fastest technique (Fuchs et al., 2004). It is a real-time, direct acquisition solution that proceeds by projecting a laser beam onto the surface to be measured (Boehler et al., 2002). There are different kinds of scanners. Long-range scanners measure angles (horizontal and vertical) and distances, either by calculating the time of flight or by comparing the phase shift between the transmitted and received waves of a modulated signal (Marbs et al., 2001). Triangulation scanners include a base and calculate the impact point of the laser beam using one or two CCD cameras (Marbs et al., 2001). Laser scanning technologies are in constant evolution and now deliver point clouds of better quality, with a higher point density and a reduced error margin. Moreover, in some hybrid approaches (De Luca, 2006), photos can be processed in a second phase to complete missing parts of the point cloud.
The result of these techniques is an unstructured point cloud. Even if some hybrid approaches make it possible to complete the missing parts by combining different survey techniques, there is currently no way of structuring the cloud during the acquisition phase.
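The two ranging principles mentioned for long-range scanners can be written down as a short sketch. The formulas are the standard time-of-flight and phase-shift relations; the function names and the angle convention (vertical angle measured above the horizontal plane) are assumptions made for illustration and are not taken from the cited works.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, in m/s


def tof_distance(round_trip_seconds):
    """Range from a time-of-flight measurement: the pulse travels out and back."""
    return C * round_trip_seconds / 2.0


def phase_shift_distance(phase_shift_rad, modulation_hz):
    """Range from the phase shift of a modulated signal.

    Only unambiguous within half a modulation wavelength, c / (2 f).
    """
    return C * phase_shift_rad / (4.0 * math.pi * modulation_hz)


def polar_to_cartesian(distance, horizontal_rad, vertical_rad):
    """Turn one measurement (range and two angles) into a 3D point.

    Assumed convention: vertical angle measured above the horizontal plane.
    """
    x = distance * math.cos(vertical_rad) * math.cos(horizontal_rad)
    y = distance * math.cos(vertical_rad) * math.sin(horizontal_rad)
    z = distance * math.sin(vertical_rad)
    return (x, y, z)
```

Repeating the last conversion for every measured direction yields the raw, unstructured point cloud discussed in this section.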

APPROACHES OF POINT CLOUD SEGMENTATION
In order to obtain a structured point cloud, a segmentation method is applied; this method can be manual, automated or semi-automated. Research in this field is in constant progress. For this reason, and because the main focus of this article is the "as-built" BIM approaches, we list only some methods that have been applied in the architectural field to facilitate the next step of shape recognition.
One of these methods is based on color similarity and spatial proximity (Zhana et al., 2009): it uses a region growing algorithm that finds the nearest neighbors of each seed point, creating regions that are then merged and refined on the basis of colorimetric and spatial relations.
Another method is based on shape detection (Ning et al, 2010): in a first step, an algorithm based on region growing and normal vectors is used to segment each planar region. Architectural components are then extracted through an analysis of planar residuals.
A further method is based on a distance measured between planar faces (Dorninger et al., 2007). It is inspired by the 2.5D segmentation approach introduced by (Pottman et al., 1999) and uses this distance to determine seed clusters on which a region growing algorithm is performed. A connected component analysis is then carried out in object space in order to merge similar seed clusters.
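As a rough illustration of region growing driven by spatial proximity and color similarity, the following minimal sketch labels points with a brute-force neighbor search. It is not the algorithm of the cited work: the thresholds, the Euclidean color distance and the omission of the merging and refinement stage are all simplifying assumptions.

```python
from collections import deque


def _distance(a, b):
    """Euclidean distance between two tuples of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def region_growing(points, colors, max_dist=0.1, max_color_diff=30.0):
    """Assign a region label to every point.

    A point joins the region of a neighbor when it is both close in space
    (max_dist) and similar in color (max_color_diff). Both thresholds are
    hypothetical values chosen for the example.
    """
    labels = [-1] * len(points)
    region = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue  # already absorbed by an earlier region
        labels[seed] = region
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for j in range(len(points)):
                if (labels[j] == -1
                        and _distance(points[i], points[j]) <= max_dist
                        and _distance(colors[i], colors[j]) <= max_color_diff):
                    labels[j] = region
                    queue.append(j)
        region += 1
    return labels


points = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.08, 0.0, 0.0), (1.0, 0.0, 0.0)]
colors = [(200, 0, 0), (205, 0, 0), (0, 0, 200), (200, 0, 0)]
labels = region_growing(points, colors)
```

In this toy input the third point is spatially close to the first two but has a different color, and the fourth has a similar color but is far away, so neither joins the first region. A practical implementation would replace the brute-force search with a spatial index such as a k-d tree.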
Existing point cloud segmentation methods are limited to the segmentation of surfaces. In the field of cultural heritage, such studies are scarce and of limited relevance. However, in the field of industry, many studies have focused on this issue and presented interesting results (Golovinskiy et al., 2009), (Rabbani et al., 2006).

"AS-BUILT" BIM APPROACHES
The concept of BIM is a new paradigm for the design and the management of buildings. It is a digital representation of both the physical and functional characteristics of buildings and constitutes the most efficient representation for obtaining a semantically enriched model. It is essentially used for the design and the management of new buildings, and only a few studies have focused on its possible application in the field of cultural heritage (Fai et al. 2013), (Arayici et al, 2008).
"As-built" BIM is a term used to describe the BIM representation of a building in its state at the moment of survey. It can therefore document the state of conservation of historic buildings. Its creation is usually a manual process that involves three aspects: first, the geometrical modeling of the components, then the attribution of categories and material properties to the components and, finally, the establishment of relations between them.

"As-built" BIM characterization
The characterization of "as-built" BIM involves the characterization of object shapes, relations and attributes. These aspects will be detailed below.

Representing the shape of the object
According to (Tang et al., 2010), the shape of an object can be classified along three axes: parametric or nonparametric, global or local, explicit or implicit.
 Parametric Vs. nonparametric representation
Parametric representation describes the model using a set of parameters such as the height, the length, the radius, etc. (Campbell et al., 2001), while nonparametric representation uses other means of characterization such as triangular meshes. For example, in a parametric representation a cylinder is described by its axis and its radius, whereas in a nonparametric representation it is represented by a triangular mesh (Tang et al., 2010).
 Global Vs. local representation
In a global representation the entire object is described, while in a local one only a portion of the object is characterized. For example, parametric representations are mostly considered local representations. Complex shapes are also often considered local when they are decomposed into parts; in this case CSG, for example, is used to represent each part. On the other hand, nonparametric representations, such as triangle meshes, are flexible enough to represent the whole object and can be considered global representations (Tang et al., 2010).
 Explicit Vs. implicit representation
This last axis is the most significant for distinguishing the shape of the object. The explicit representation encodes the shape of the object directly (e.g. triangular meshes), whereas the implicit representation encodes the shape indirectly through an intermediate representation (e.g. a histogram of surface normals).
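The parametric versus nonparametric distinction can be made concrete with the cylinder example. In the sketch below, the same shape is held once as two parameters and once as a list of triangles; the class name, the choice of the z axis as cylinder axis and the number of segments are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Cylinder:
    """Parametric description: two numbers are enough (axis assumed along z)."""
    radius: float
    height: float


def tessellate(cylinder, segments=16):
    """Nonparametric description: the lateral surface as a triangle mesh."""
    ring = [
        (cylinder.radius * math.cos(2 * math.pi * k / segments),
         cylinder.radius * math.sin(2 * math.pi * k / segments))
        for k in range(segments)
    ]
    triangles = []
    for k in range(segments):
        x0, y0 = ring[k]
        x1, y1 = ring[(k + 1) % segments]
        bottom0, bottom1 = (x0, y0, 0.0), (x1, y1, 0.0)
        top0, top1 = (x0, y0, cylinder.height), (x1, y1, cylinder.height)
        # Each rectangular facet of the side is split into two triangles.
        triangles.append((bottom0, bottom1, top1))
        triangles.append((bottom0, top1, top0))
    return triangles


column_shaft = Cylinder(radius=1.0, height=2.0)
mesh = tessellate(column_shaft, segments=16)
```

Going from parameters to a mesh is straightforward, as shown here; recovering the radius and height from a scanned mesh or point cloud is the harder, inverse problem that recognition methods have to solve.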
The B-Rep (boundary representation) is used for surface representation. It describes shapes using a set of surface components that constitute the boundary of the object (Baumgart et al., 1972). Volumetric representations describe shapes with geometric solids, as in CSG (Constructive Solid Geometry), which consists in building complex shapes from simple geometric primitives (such as cubes, cylinders, spheres, etc.) by combining them with Boolean operators like union or intersection (Chen et al., 1988). Compared to the B-Rep, CSG is more intuitive but less flexible because of its limited library of primitives (Kemper et al. 1987) (Rottensteiner et al., 2000). In addition, the B-Rep allows an efficient representation of partial objects, such as partially occluded objects, which are very frequent in "as-built" BIM creation (Walker et al., 1989).
Even if explicit representations allow the precise description of geometry required for modeling the "as-built" BIM, they are not well suited to recognition and automatic segmentation algorithms. For this reason, alternative representations are often used.
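The CSG principle of combining primitives with Boolean operators can be sketched with point-membership tests, where each solid is a function answering whether a point lies inside it. This is only an illustrative encoding; the primitive constructors, the difference operator and the wall dimensions are assumptions and do not come from the cited works.

```python
def box(min_corner, max_corner):
    """Axis-aligned box primitive, returned as a point-membership test."""
    return lambda p: all(lo <= v <= hi
                         for v, lo, hi in zip(p, min_corner, max_corner))


def sphere(center, radius):
    """Sphere primitive, returned as a point-membership test."""
    return lambda p: sum((v - c) ** 2 for v, c in zip(p, center)) <= radius ** 2


def union(a, b):
    return lambda p: a(p) or b(p)


def intersection(a, b):
    return lambda p: a(p) and b(p)


def difference(a, b):
    return lambda p: a(p) and not b(p)


# A wall 4 m long, 0.3 m thick and 3 m high, with a 1 m x 1 m window opening.
wall = box((0.0, 0.0, 0.0), (4.0, 0.3, 3.0))
# The opening is made slightly thicker than the wall so that it cuts through.
opening = box((1.0, -0.1, 1.0), (2.0, 0.4, 2.0))
wall_with_window = difference(wall, opening)
```

Querying `wall_with_window((0.5, 0.15, 0.5))` answers whether a point inside the masonry belongs to the solid, while a point in the middle of the opening does not. The sketch also shows the limitation noted above: every shape must be expressible with the available primitives.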

Representing relations between objects
In a BIM context, it is necessary to represent relations between objects. Relations are required to describe the positions and displacements of components (e.g. diagnosis of defects and failures in tubes and pipelines, navigation inside a building, etc.) (Nüchter et al., 2008) (Cantzler et al. 2003).
Different spatial relations can be described in the BIM: aggregation, topological and directional relationships. Aggregation (e.g. part of, belongs to, etc.) can be modeled with a hierarchical tree representation that describes the composition in a local-to-global way. For example, nodes can represent objects or primitives and arcs can represent the aggregation relations linking them (Fitzgibbon et al., 1997). Topological relationships (e.g. connected to, inside, outside of, over, etc.) and directional relationships (e.g. above, below, etc.) can be represented by a graph-based representation. However, it is possible to represent all these spatial relationships by using a B-Rep representation.
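A minimal data structure for these relations might combine a tree for aggregation with labeled graph edges for topological and directional relationships. The class, method and component names below are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


class SceneRelations:
    """Aggregation as a tree (child -> parent), other relations as labeled edges."""

    def __init__(self):
        self.parent = {}
        self.edges = defaultdict(set)  # relation name -> set of (a, b) pairs

    def add_part(self, child, parent):
        """Record an aggregation relation: child is part of parent."""
        self.parent[child] = parent

    def relate(self, a, relation, b):
        """Record a topological or directional relation from a to b."""
        self.edges[relation].add((a, b))

    def ancestors(self, node):
        """Walk the aggregation tree from local to global."""
        chain = []
        while node in self.parent:
            node = self.parent[node]
            chain.append(node)
        return chain

    def related(self, a, relation):
        """All objects b such that (a relation b) was recorded."""
        return sorted(b for (x, b) in self.edges[relation] if x == a)


scene = SceneRelations()
scene.add_part("storey_0", "building")
scene.add_part("wall_1", "storey_0")
scene.add_part("window_1", "wall_1")
scene.relate("wall_1", "connected_to", "wall_2")
scene.relate("wall_1", "connected_to", "floor_0")
scene.relate("roof_0", "above", "wall_1")
```

With this structure, `scene.ancestors("window_1")` follows the part-of chain up to the building, and `scene.related("wall_1", "connected_to")` lists the components a wall touches.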

Representing objects attributes
Unlike relations and shapes, which are well described, attributes have been the subject of few studies. Attributes characterize objects in order to enrich the final 3D representation. They include information about materials (texture, age, cost, etc.) and can also document the state of conservation and the history of a historic building, for instance whether an object has been replaced or restored.
Attributes or object classes can be graphical or alphanumerical (Solamen, 2009). Graphical attributes include the properties required for 3D modeling (Cartesian points, numerical values, limited spaces, etc.). Alphanumerical attributes include all additional information concerning dimensions, composition, economic data, etc.
Attributes are also structured as a set of classes (Ben Osman, 2011): every object is characterized by the semantic information defining it. Classes can be tangible (e.g. wall, floor, ceiling, etc.) or abstract (cost, manufacturing process, relationships between classes, etc.).
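The split between graphical and alphanumerical attributes can be sketched as a small record type. All field names, keys and values below are hypothetical examples, not a schema taken from the cited works.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    category: str  # tangible class such as "wall", "floor" or "ceiling"
    graphical: dict = field(default_factory=dict)       # data needed for 3D modeling
    alphanumerical: dict = field(default_factory=dict)  # material, age, cost, etc.


def select(components, **criteria):
    """Names of the components whose alphanumerical attributes match all criteria."""
    return [
        c.name for c in components
        if all(c.alphanumerical.get(key) == value for key, value in criteria.items())
    ]


components = [
    Component("wall_1", "wall",
              graphical={"origin": (0.0, 0.0, 0.0), "height": 3.0},
              alphanumerical={"material": "stone", "restored": False}),
    Component("capital_3", "capital",
              graphical={"origin": (2.0, 0.0, 2.5)},
              alphanumerical={"material": "marble", "restored": True}),
]
```

A query such as `select(components, restored=True)` then returns the components flagged as restored, which is the kind of conservation question the attributes are meant to answer.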

Review of "as-built" BIM approaches
The creation of an "as-built" BIM is mainly a manual process that can be tedious, labor-intensive and subjective. Manual modeling of even simple primitives is time-consuming, and modeling a historical building can be very difficult and may require thousands of primitives.
Moreover, automating the process is very challenging for several reasons. First, digital models of buildings can be very complex and contain components that are not linked to the building itself. Such components are known as clutter and must not appear in the final BIM. Second, input data can be insufficient, and the resulting data can vary according to the level of modeling detail and to user expectations. All these difficulties become more pronounced in the case of historical buildings, which are very complex because they are characterized by a huge number of varied shapes.
Current literature proposes automatic "as-built" BIM approaches that could be classified into four main categories: heuristic approaches, approaches based on context, approaches based on prior knowledge and approaches based on ontologies.

 Heuristic approaches
In this field, studies are at an early stage, and most methods, including heuristic approaches, rely on a prior segmentation of the scene. Heuristic approaches encode human knowledge belonging to the architectural field. For example, doors and windows are always embedded in the wall class, and roofs are always "hierarchically above" walls. Walls and roofs can also be distinguished by their orientation: walls are always vertical, while roofs may have various inclinations. Among these works, an algorithm has been developed to extract windows from building façades (Pu et al., 2007). It is based on three steps: a first step of segmentation using the method of (Vosselman et al., 2004), then a step of constraint definition (position, size, topology, direction, etc.) and, finally, a step of recognition using a heuristic table. Other algorithms allow the automatic extraction of building features (Pu et al., 2006), and the algorithm of (Rusu et al., 2009) uses heuristics to detect elements in a kitchen environment.
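The two rules quoted above (walls are vertical, roofs lie above walls) can be turned into a small rule-based classifier for planar segments. This is a sketch of the general idea, not the heuristic table of the cited works; the inputs, the angular tolerance and the "unknown" fallback are assumptions.

```python
import math


def classify_segment(normal, centroid_z, wall_top_z, angle_tol_deg=10.0):
    """Label a planar segment from its normal vector and its height.

    normal:      normal vector of the fitted plane (need not be unit length)
    centroid_z:  height of the segment centroid
    wall_top_z:  hypothetical known height of the top of the walls
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    # Tilt of the surface: 0 degrees = horizontal, 90 degrees = vertical.
    tilt = math.degrees(math.acos(min(1.0, abs(nz) / length)))
    if tilt >= 90.0 - angle_tol_deg:
        return "wall"      # rule: walls are vertical
    if centroid_z >= wall_top_z:
        return "roof"      # rule: roofs are above walls, at any inclination
    return "unknown"       # no rule applies, left for other heuristics
```

For instance, a segment with a horizontal normal is labeled as a wall, and a segment inclined at 30 degrees whose centroid lies above the wall tops is labeled as a roof. Such fixed rules are exactly what breaks down on tilted historical walls.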

 Approaches based on context
Following the same heuristic logic, some context-based modeling approaches use the relations between components. For example, (Xiong et al., 2010) use this approach to model the interior of a room. A first step of voxelization encodes the input point cloud into a voxel structure in order to minimize variations in point density. Planar patches are then detected by grouping neighboring points with a region growing method. These patches are then classified, according to their contextual relationships, as wall, ceiling, floor or clutter patches. For example, a planar patch surrounded by walls, adjacent to the floor at the bottom and to the ceiling at the top, is more likely to be a wall patch than a clutter patch. Finally, a last step of patch intersection and clutter removal is performed.
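The voxelization step can be sketched as follows: points are binned into cubic cells and each occupied cell is replaced by one representative point, which evens out the point density. The centroid as representative and the voxel size are assumptions; the cited work may encode voxels differently.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Map each occupied voxel index to the centroid of the points it contains."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in buckets.items()
    }


cloud = [(0.01, 0.02, 0.0), (0.03, 0.04, 0.0), (0.95, 0.95, 0.95)]
voxels = voxelize(cloud, voxel_size=0.1)
```

In this toy cloud the two points near the origin fall into the same voxel and are merged into their centroid, while the isolated point keeps a voxel of its own.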

 Approaches based on prior knowledge
Another "as-built" modeling approach is recognition based on prior knowledge. This approach relies on detecting the differences between the "as-built" and "as-designed" conditions. Here the recognition problem is reduced to a simpler problem of fitting or matching the entities of the designed scene to the point cloud. This kind of approach is used by (Yue et al., 2006) to detect construction defects on some sites.
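A very reduced version of this comparison is to measure how far scanned points lie from the plane of an as-designed element and to flag those beyond a tolerance. The sketch below assumes that the designed wall is given as a point and a normal vector and that the tolerance is 2 cm; both are illustrative choices rather than values from the cited work.

```python
import math


def signed_distance(point, plane_point, plane_normal):
    """Signed distance from a point to a plane given by a point and a normal."""
    norm = math.sqrt(sum(n * n for n in plane_normal))
    return sum((p - q) * n
               for p, q, n in zip(point, plane_point, plane_normal)) / norm


def find_deviations(points, plane_point, plane_normal, tolerance=0.02):
    """Indices of scanned points lying farther than tolerance from the designed plane."""
    return [
        i for i, p in enumerate(points)
        if abs(signed_distance(p, plane_point, plane_normal)) > tolerance
    ]


# As-designed wall in the plane x = 0; the normal need not be unit length.
scan = [(0.005, 1.0, 1.0), (0.05, 2.0, 1.0), (-0.001, 0.0, 2.0)]
deviating = find_deviations(scan, plane_point=(0.0, 0.0, 0.0),
                            plane_normal=(2.0, 0.0, 0.0))
```

Only the second point, 5 cm off the designed plane, is reported. The approach presupposes that an as-designed model exists, which is rarely the case for historical buildings.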

 Approaches based on ontologies
A last modeling approach is based on ontologies. This method, introduced by (Hmida et al., 2012), relies on a knowledge ontology inspired by the model of the semantic web and uses a priori knowledge of objects and their environment. This knowledge is extracted from databases, CAD drawings, GIS, technical reports or expert knowledge belonging to particular fields. It then forms the basis of a knowledge-based selective detection and recognition of objects in point clouds. In such a scenario, the knowledge of these objects must include detailed information on the geometry of the object structure, 3D algorithms, etc.
All the approaches mentioned above identify some or all of the characteristic elements of a scene. Their performance and efficiency are probably related to the complexity of the scene.

Critical analysis of "as-built" BIM approaches
The approaches mentioned above may provide satisfactory results in the recognition of the elements composing a scene. But in a BIM context, and in order to semantically enrich point clouds, it is not sufficient to detect their sub-parts as architectural components (walls, windows, doors, etc.). An important requirement is also to define the relations between components, in particular spatial relations (topological, directional, etc.), and to link components to their attributes. For example, if a wall is detected, it should be specified that it is connected to the ground, located in a specific position and adjacent to other walls, which themselves have other positions, etc. It is also necessary to specify whether the wall is made of stone or brick. Attributes can vary according to the field, to the needs of management and to the use of the building. Consequently, in the field of historical buildings it may also be necessary to qualify other kinds of attributes such as material, color, conservation state, etc.
The "as-built" approaches listed above are most efficient for flat surfaces and simple scenes, which is not the situation encountered in heritage building modeling.
In fact, historical buildings are characterized by very complex and varied shapes, most of which do not follow classical geometrical laws. For example, walls are not always vertical and are tilted in many cases. Some elements are even more complex, such as capitals, which have specific characteristics and different architectural styles. Modeling them becomes even harder because of their deterioration over time: due to degradation, elements that share semantic features lose their similarity in shape. This is, for instance, the case of capitals and their details (acanthus leaf, volute, etc.). In this context, a study (Murphy M. 2011) sought to create a library of parametric objects based on historic data, called HBIM (Historical Building Information Modeling).

CONCLUSION
The previous sections presented techniques for the acquisition and segmentation of point clouds and current methods for semantically enriching data. With the aim of obtaining enriched 3D models, these approaches are complementary and are used consecutively: the acquisition step produces unstructured point clouds, which are then segmented into regions by various segmentation algorithms, and finally the 3D model is constructed and enriched using different recognition techniques (Figure 1). This panorama of research shows that, even if this chain can lead to satisfactory results for modern buildings, it is not well adapted to the field of cultural heritage, where there is a lack of solutions focusing on the particularities and the complexity of historical buildings. Other approaches could therefore be considered for enriching data collection and segmentation, in order to find an appropriate way to link the first step of acquisition and the final "as-built" one. For this reason, we propose an approach that starts enriching the 3D model at the early stages of data collection and segmentation.
This approach proposes to link the first step of acquisition and the final "as-built" BIM. Semantic features will be assigned to historic objects directly in the survey and segmentation stages, on the basis of IFC classes.