THREE-DIMENSIONAL RECONSTRUCTION OF LARGE CULTURAL HERITAGE OBJECTS BASED ON UAV VIDEO AND TLS DATA

This paper investigates the synergetic use of an unmanned aerial vehicle (UAV) and a terrestrial laser scanner (TLS) in 3D reconstruction of cultural heritage objects. Rather than capturing still images, the UAV, equipped with a consumer digital camera, is used to collect video to overcome its limited endurance. A 3D point cloud is then generated from the video image sequences using automated structure-from-motion (SfM) and patch-based multi-view stereo (PMVS) methods. The TLS is used to collect information that is beyond the reach of UAV imaging, e.g., partial building facades. A coarse-to-fine method is introduced to integrate the two sets of point clouds, from UAV image reconstruction and TLS scanning, for a complete 3D reconstruction. For increased reliability, a variant of the ICP algorithm is introduced that uses local terrain invariant regions in the combined design. The experimental study was conducted on the Tulou cultural heritage buildings in Fujian province, China, focusing on one of the Tulou clusters built several hundred years ago. Results show a digital 3D model of the Tulou cluster with complete coverage and textural information. This paper demonstrates the usability of the proposed method for efficient 3D reconstruction of heritage objects based on UAV video and TLS data.


INTRODUCTION
3D reconstruction of cultural heritage objects is increasingly required for different purposes, e.g., preservation (Remondino, 2011), reconstruction (El-Hakim et al., 2004), and as-built documentation (Yastikli, 2007). Several instruments can be used to construct high-quality 3D point clouds. Among others, terrestrial laser scanner (TLS) systems and unmanned aerial vehicles (UAV) have been increasingly used for 3D reconstruction of heritage objects. Good examples can be found in the literature, where 3D reconstructions of cultural heritage objects were obtained from TLS data and UAV photogrammetry (Remondino, 2011). Because no single sensor can acquire complete information for cultural object reconstruction, even over several surveys, some studies have integrated these two platforms for 3D reconstruction of large and complex scenes (Remondino and Rizzi, 2010; Xu et al., 2014). In addition, the rapid development of advanced image-based reconstruction techniques makes it possible to obtain 3D reconstructions from both images and video frames. As a result, diverse combinations of multi-scale data are available for reconstruction. Nevertheless, few studies have considered the viewing differences between UAV imaging and TLS scanning when combining the two. To address this, this work introduces a coarse-to-fine registration method using terrain invariant regions for combining multi-source point clouds. With that, this paper provides a pipeline for efficient 3D reconstruction of cultural heritage objects based on UAV video and TLS data.

STUDY AREA AND DATA ACQUISITION

Study Area
The methodology was applied to an area in Fujian province, China. More than 3000 Tulou clusters built several hundred years ago are located in the study area and are regarded as great cultural heritage. In this study, we focused on the Zhencheng Tulou (Figure 1), built in 1912, which is considered the best preserved and most typical one. It is designed as a combination of two concentric rings in the center and two units arranged at the sides. It has four floors and covers an area of around 5000 m².

UAV Data Acquisition
An octocopter named BNU D8-1 (Xu et al., 2014) was employed for aerial video shooting. It was flown under manual control at an average height of 80 m above the ground. A Sony PJ 790 digital video camera was used to shoot video over the Zhencheng Tulou. In total, 234 frames were continuously extracted from the video, each with a size of 1920×1080 pixels.

Figure 1. Example video frame of the study area in Longyan, Fujian Province, China

TLS Data Acquisition
In this case study, close-range TLS instruments have too short a range to measure such a large building. Thus, we used the long-range scanner Riegl VZ-4000 to capture the point cloud of the Zhencheng Tulou. The exterior facades of the building can be captured by the scanner, while the roof, which is beyond the reach of the Riegl VZ-4000, can be complemented by UAV data. In total, seven stations were scanned around the Tulou. The multi-scan registration procedure was performed using the RiSCAN PRO 1.1.7 software.

METHODOLOGY
This work aimed at providing a practical solution for efficient 3D reconstruction of cultural heritage objects based on both UAV video and TLS data. The viewing differences between image shooting and TLS scanning were considered in particular.
Figure 2 shows the framework of the proposed methodology.
The procedure contains three components: 1) 3D point cloud generation from UAV video image sequences, 2) coarse-to-fine registration of UAV-reconstructed point clouds and TLS data using invariant regions, and 3) a textural model of the cultural heritage object.

Figure 2. Framework of the proposed method (flowchart labels: image sequences, blur-free images; TLS surveying, multi-scan registration; coarse-to-fine registration, final point cloud; 3D reconstruction)

3D point cloud generation from UAV Video frames
This step started by extracting frames from the videos, resulting in a set of highly overlapping image sequences. Blurred images were then removed to ensure that the remaining images were of high quality. Because it is efficient and quick to implement, the method developed by Crete et al. (2007) was used for this purpose. The automated structure-from-motion (SfM) method (Snavely, 2009) was used to process the remaining set of blur-free images to generate a sparse point cloud. Further, Patch-based Multi-view Stereo (PMVS2) (Furukawa and Ponce, 2010) was used to increase the density of the SfM point cloud. Because it is fully automated, the VisualSFM software package (Wu, 2013), which integrates both SfM and PMVS2, was used. The final outputs include the coordinates of a set of dense points with their corresponding normals and RGB colors.
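The blur-removal step can be illustrated with a small sketch. Note that this is not the Crete et al. (2007) metric used in the paper: as a simpler stand-in, it scores each frame by the variance of its Laplacian, a common sharpness heuristic. The function names and the threshold are hypothetical, and frames are assumed to be small grayscale images stored as lists of rows.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a 2D grayscale image given as a list of equally long rows.
    Sharp images have strong local intensity changes and score high;
    blurred images score low. (Stand-in heuristic, not Crete et al.)
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            lap = (gray[r - 1][c] + gray[r + 1][c]
                   + gray[r][c - 1] + gray[r][c + 1]
                   - 4 * gray[r][c])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


def keep_sharp_frames(frames, threshold):
    """Return the indices of frames whose sharpness reaches `threshold`.

    The threshold is scene dependent and must be tuned; it is an
    assumption of this sketch, not a value from the paper.
    """
    return [i for i, frame in enumerate(frames)
            if laplacian_variance(frame) >= threshold]
```

In practice the retained indices would select the frames passed on to SfM; a real pipeline would read frames with an image library rather than nested lists.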

Coarse-to-Fine Registration of Multi-Source Point Clouds Using Invariant Regions
The aim of this step is to estimate the Euclidean motion between the two point clouds. To achieve this, we treated the TLS point cloud as the base and oriented the other one using a coarse-to-fine method. For coarse registration, a group of common features manually selected from the two sets of point clouds is used to estimate the transformation parameters. The well-known Bursa model was used in this procedure, as shown in Formula (1).
where (x, y, z) and (x₀, y₀, z₀) represent the points in the TLS and image-reconstructed point clouds, respectively.
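As a hedged illustration of the coarse step, the sketch below estimates a seven-parameter similarity transformation (scale s, rotation R, translation t) such that (x, y, z)ᵀ ≈ s·R·(x₀, y₀, z₀)ᵀ + t from manually matched point pairs. It assumes the general similarity form rather than the small-angle linearization often used with the Bursa model, and it uses the closed-form SVD solution of Umeyama instead of whatever solver the authors applied. The function name and the use of NumPy are assumptions of this sketch.

```python
import numpy as np


def estimate_similarity(src, dst):
    """Least-squares s, R, t such that dst ≈ s * R @ src + t.

    src: (N, 3) matched points from the image-reconstructed cloud.
    dst: (N, 3) corresponding points from the TLS cloud (the base).
    Requires at least three non-collinear correspondences.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt

    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t
```

The estimated parameters would then be applied to every image-reconstructed point, for example `aligned = s * points @ R.T + t`, before the fine registration.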
In general, the Iterative Closest Point (ICP) algorithm (Besl and McKay, 1992) or one of its variants can be used to fine-tune the Euclidean motion in the subsequent step. Nevertheless, as the point clouds from UAV image reconstruction and TLS measurement have different densities and different viewpoints, directly applying the ICP algorithm may lead to convergence to a local minimum. Hence, rather than using the whole set of points for the estimation, we used only the potential invariant regions. Specifically, the terrain points were regarded as invariant between the two clouds. We then used the estimated parameters to apply the Euclidean transformation to the remaining set of points. For increased speed, we used the Iterative Closest Compatible Point algorithm (Bae and Lichti, 2008) for the fine registration, where the Euclidean motion between the two point clouds is iteratively computed by minimizing the cost function C with a least-squares method, see Formula (2).
where (x, y, z) and (x₀, y₀, z₀) represent the points from the two sets, n and n₀ represent the corresponding normals within the search radius d, and [-a, a] is the threshold of the normal constraint.
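The compatibility test described above can be sketched as follows. This is a minimal illustration, not the Bae and Lichti (2008) implementation: it assumes that a pair is compatible when the nearest neighbour lies within the search radius d and the angle between the two unit normals is at most a, and it assumes that C is the sum of squared distances over the compatible pairs. The function names are hypothetical, and the brute-force neighbour search is only suitable for small examples.

```python
import math


def compatible_pairs(src_pts, src_normals, dst_pts, dst_normals, d, a_deg):
    """Match each source point to its nearest destination point.

    A pair (i, j) is kept only if the distance is at most `d` and the
    angle between the unit normals is at most `a_deg` degrees.
    """
    pairs = []
    for i, (p, n) in enumerate(zip(src_pts, src_normals)):
        best_j, best_dist = None, float("inf")
        for j, q in enumerate(dst_pts):
            dist = math.dist(p, q)
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is None or best_dist > d:
            continue  # no neighbour inside the search radius
        cos_angle = sum(u * v for u, v in zip(n, dst_normals[best_j]))
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard acos domain
        if math.degrees(math.acos(cos_angle)) <= a_deg:
            pairs.append((i, best_j))
    return pairs


def registration_cost(src_pts, dst_pts, pairs):
    """Sum of squared distances over the compatible pairs (assumed form of C)."""
    return sum(math.dist(src_pts[i], dst_pts[j]) ** 2 for i, j in pairs)
```

In an iterative scheme, the pairs would be recomputed after each motion update until the cost stops decreasing; restricting `src_pts` and `dst_pts` to terrain points corresponds to the invariant-region idea in the text.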

Textural Model of the Cultural Heritage Object
Note that digitization was not the main focus of this study.
Here, we followed traditional methods of digital reconstruction. With the complete point cloud from both UAV image reconstruction and TLS measurement, further steps of digitization, reconstruction, and texturing are carried out to obtain a high-resolution photorealistic 3D reconstruction.

RESULTS AND DISCUSSIONS
Figure 3 presents the reconstructed point clouds from the UAV video image sequences. The results from the full set of 234 images and from only the 115 blur-free images are shown in Figure 3a and 3b, respectively. Note that the point cloud from the blur-free images is more complete than that from the full set of image sequences, particularly on the roof of the Tulou. Moreover, because the complexity of the SfM algorithm is O(n²), the time cost of using only the 115 blur-free images dropped to about a quarter of that of the full set of 234 images. The experimental results demonstrate that the completeness of image-reconstructed point clouds depends on image quality rather than on the number of images involved. This also confirms findings in the literature (Alsadik et al., 2015).
Hence, for increased speed, removing redundant images is worthwhile (Alsadik et al., 2013; Alsadik et al., 2015). Furthermore, in our experience, decreasing the number of matched image pairs also saves time in the image reconstruction process without loss of quality in the 3D scene geometry (Xu et al., 2014). From the image reconstruction point of view, reducing both redundant images and matched image pairs is a worthwhile strategy for real-world applications.
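The "about a quarter" figure follows directly from the quadratic cost argument and can be checked with a few lines of arithmetic. The sketch assumes that the dominant cost is exhaustive pairwise image matching, so the work is proportional to the number of image pairs; the helper name is hypothetical.

```python
def pair_count(n):
    """Number of unordered image pairs among n images."""
    return n * (n - 1) // 2


full_set, blur_free = 234, 115

pairs_full = pair_count(full_set)    # 27261 pairs
pairs_sharp = pair_count(blur_free)  # 6555 pairs

# Relative cost of the blur-free set under the pairwise-matching assumption.
ratio_pairs = pairs_sharp / pairs_full
# The same ratio under a pure n^2 cost model.
ratio_quadratic = (blur_free / full_set) ** 2

print(round(ratio_pairs, 2), round(ratio_quadratic, 2))  # 0.24 0.24
```

Both estimates are close to 0.24, which is consistent with the reported reduction to roughly a quarter of the original time.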
Figure 4 shows the 3D reconstruction of the Zhencheng Tulou with complete coverage and textural information. Although digitization was outside the scope of this paper, the experimental results demonstrate that the integrated points from UAV image reconstruction and TLS measurement meet the demands of digitization and texturing. However, density differences should be considered in the integration procedure. Generally, the points produced by SfM reconstruction alone are too sparse to provide enough common features for reliable registration. Hence, further processing should be considered to increase the density of the SfM outputs in order to obtain a point cloud with a density comparable to that of the TLS measurement.

CONCLUSIONS
This paper presented a framework for 3D reconstruction of cultural heritage objects. Attention was focused on 3D reconstruction of large or complex objects, which impose constraints on data acquisition with any single sensor. A novel registration method was introduced using invariant regions in a coarse-to-fine configuration. UAV video frames were used for image-based 3D reconstruction, where around half of the images were removed from the full image set as blurred. The remaining set of images yielded more complete information than the full image set. Moreover, the time cost of using the blur-free images decreased by around 3 times compared with that of the full image set. A complete 3D model was obtained from the integrated points from both UAV image reconstruction and TLS measurement. Further work can investigate the use of invariant features for rapid and automatic feature matching to enable efficient point cloud registration.

Figure 3. 3D point clouds from UAV full-set video image sequences