JOINT PROCESSING OF UAV IMAGERY AND TERRESTRIAL MOBILE MAPPING SYSTEM DATA FOR VERY HIGH RESOLUTION CITY MODELING

ABSTRACT: Both unmanned aerial vehicle (UAV) technology and Mobile Mapping Systems (MMS) are important techniques for surveying and mapping. In recent years, UAV technology has attracted tremendous interest, both in the mapping community and in many other fields of application. Carrying off-the-shelf digital cameras, a UAV can collect high quality aerial optical images for city modeling using photogrammetric techniques. In addition, an MMS can acquire high density point clouds of ground objects along the roads. A UAV, if operated in aerial mode, has difficulties in acquiring information on ground objects under trees and along building façades. In contrast, an MMS collects accurate point clouds of objects from the ground, together with stereo images, but it suffers from system errors due to loss of GPS signals and lacks information on the roofs. The two technologies are therefore complementary. This paper focuses on the integration of UAV images, MMS point cloud data and terrestrial images to build very high resolution 3D city models. The work we show is a practical modeling project of the National University of Singapore (NUS) campus, which includes buildings, some of them very high, roads and other man-made objects, dense tropical vegetation and a DTM. This is an intermediate report. We present work in progress.


INTRODUCTION
Today 3D city models are important base datasets for many applications, some traditional, others quite new. The efficient generation of high resolution structured models (not just unstructured surface models for the whole area) is both a relevant research topic and an important issue for professional practice. Both Unmanned Aerial Vehicles (UAV) and Mobile Mapping Systems (MMS) are important techniques for surveying and mapping. UAV technology, if used in aerial mode, and terrestrial MMS are complementary technologies. While aerial UAV images are ideally suited to model the roof landscape and part of the terrain, terrestrial point clouds are able to look under the trees and also to represent the façades. If these two datasets are supplemented by terrestrial images, we have most of the primary information needed to generate a complete 3D city model.
Integration of multiple sensor data can add information, reduce uncertainties in data processing, and allow for a higher degree of automation. Most multi-source data fusion research is built on data from similar view directions, such as airborne LiDAR point clouds and images from aerial and satellite platforms. For city model reconstruction, the integration of complementary data sources can be helpful for both building detection and model reconstruction. However, only a few researchers have worked on integrating data from airborne and ground-based sensors. Von Hansen (2008) proposed a method using extracted straight lines to register airborne and terrestrial laser scanning data. Stewart et al. (2009) combined terrestrial laser scanning data with airborne LiDAR data, and positioned the former relative to the latter, to detect ground displacements caused by natural hazards. Rutzinger et al. (2009) proposed a method to extract vertical walls from terrestrial mobile and airborne laser scanning data. Al-Manasir and Fraser (2006) georeferenced terrestrial laser-scan data with the help of terrestrial images via signalized common points.
On the other hand, most past UAV applications were performed in rural or suburban areas, such as cultural heritage site modeling, suburban mapping, agriculture monitoring, etc. (Eisenbeiss, 2009; Gruen et al., 2012). Only a few projects on mapping urban areas with UAVs have been reported, and the combination of UAV images with MMS point clouds for city modeling is even rarer. How to combine these data and how to use them are some of the questions to be answered in this paper. This paper focuses on the integration of UAV images and MMS point clouds to build a very high-resolution 3D city model. The work shows a practical modeling project of the campus of the National University of Singapore (NUS), which includes DTM, buildings and other man-made objects, roads, and dense tropical vegetation. Our contributions are: (a) very high resolution roof landscape modeling from UAV image data, (b) accuracy evaluation of the georeferencing of point clouds by using UAV image data, (c) modeling buildings with their façades from laser point clouds and terrestrial images, (d) setting an example of the integration of these three data sources for 3D city modeling, which, to our knowledge, has not been done before. Fusion is done both on the level of data processing and data integration.

Main Process Workflow
The inputs to our work are: (1) raw point clouds from MMS; (2) UAV images; (3) a few Ground Control Points (GCPs); (4) terrestrial images for geometric modeling; and (5) optionally, aerial and terrestrial images for texture mapping. Our working procedure can be summarized in several steps, including: (a) UAV images aerial triangulation (b)

Figure 1: Flowchart of the data processing steps.

UAV Image Aerial Triangulation
The aerial images were collected with the AscTec Falcon 8 octocopter system, developed by Ascending Technologies GmbH, carrying an off-the-shelf Sony Nex-5 camera. It is a two-beam octocopter with 4 rotors on each side, powered by battery. It has a built-in GPS/IMU, a barometer, an electronic compass and a stabilizing system for both the camera and the platform. It has a remote control range of up to 300 meters and a maximum operation time of 20 minutes. Limited by the flying time, the image data collection over the NUS campus area (2.2 km²) had to be done with 43 starts and landings in 3 days, because each flight collected only a small block of 4x4 or 5x5 images. After data cleaning, we retained 857 of the 929 raw images, with corresponding GPS/IMU records. The layout of the block is shown in Figure 2 (left). The right part shows a sample image with 5 cm ground resolution.
The Sony Nex-5 is a mirrorless interchangeable-lens camera, with an image dimension of and a pixel size of in both x and y directions. We use its original lens with a fixed focal length of 16 mm. The camera calibration was done in our lab with the software package IWitness, using the point cloud calibration method (Remondino, Fraser, 2006). The calibration yielded a re-projection error of 0.24 pixels.
For the bundle adjustment, we acquired 39 ground control points using Trimble GPS, with an accuracy of 2 cm in the X and Y directions and 3 cm in height. The bundle adjustment was conducted using APERO, an open source tool developed by the French National Geographic Institute (IGN). Given the approximate GPS position of each image, the software can determine the relationships between images. The tie points are then extracted automatically with the SIFT algorithm, followed by an exhaustive search strategy to deal with matching failures. The result of the bundle adjustment using APERO is shown in Table 1. The RMSEs computed from 11 check points give 7 cm planimetric and 6.5 cm height accuracy. These values, slightly above the one-pixel mark (5 cm), are acceptable considering that we are dealing here with natural control and check points.
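The check-point statistics quoted above can be reproduced with a few lines of code. The following is a minimal sketch, assuming that the planimetric RMSE is taken over the combined X/Y residual of each check point; the residual values and the function names are hypothetical and are not taken from Table 1.

```python
import math


def rmse(values):
    """Root mean square of a sequence of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def check_point_rmse(residuals):
    """Return (planimetric RMSE, height RMSE) in metres.

    residuals: list of (dx, dy, dz) tuples in metres, i.e. adjusted
    coordinates minus surveyed coordinates at each check point.
    Assumption: the planimetric residual is the 2D length of (dx, dy).
    """
    planimetric = [math.hypot(dx, dy) for dx, dy, _ in residuals]
    heights = [dz for _, _, dz in residuals]
    return rmse(planimetric), rmse(heights)


# Hypothetical residuals for three check points (illustration only).
example = [(0.03, 0.04, -0.06), (-0.06, 0.08, 0.05), (0.0, -0.05, 0.07)]
rmse_xy, rmse_z = check_point_rmse(example)

gsd = 0.05  # ground resolution of 5 cm, as stated in the text
print(f"planimetric RMSE: {rmse_xy:.3f} m ({rmse_xy / gsd:.1f} pixels)")
print(f"height RMSE:      {rmse_z:.3f} m")
```

Expressing the RMSE in units of the ground resolution makes the comparison with the one-pixel mark direct.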

Georeferencing of Point Cloud Data
The Mobile Mapping System we used is the RIEGL VMX-250, which consists of two RIEGL VQ-250 laser scanners, an IMU/GNSS unit, a distance measurement indicator, and two calibrated optical cameras. The system can collect time-stamped images and dense point clouds with a measurement rate of up to 600 kHz and 200 scan lines per second.
Figure 3 shows the system installed on a car (left) and a sample of the point cloud data of the CREATE area (middle), located inside NUS campus, rendered according to the intensity value.

Point Cloud Data Adjustment
As there are many high buildings and dense trees which partially block the GPS signals, the coordinate accuracy of GPS is not very high, which degrades the accuracy of the trajectory after fusion of GPS and IMU. The GPS signal losses happen frequently and unpredictably, so many control points are needed to re-georeference the point clouds.
The control points for this are measured from georeferenced stereo UAV images. Altogether, 169 control points have been measured manually from the stereo UAV images. To verify the accuracy of the manual measurements, a group of points was measured four times and the differences in coordinate values between the four measurements were evaluated. As a result, the average measurement errors in the X, Y and Z directions are 2.8 cm, 3.7 cm and 4.8 cm.
Compared to the accuracy of the point cloud data before adjustment, these errors are relatively small, so the measured points can be used as control for georeferencing our mobile mapping point cloud data.
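The repeatability check described above can be illustrated as follows. This is only a sketch: the text does not state how the differences between the four measurements were summarized, so the mean absolute deviation from the per-axis mean is used here as an assumption, and the readings are hypothetical.

```python
from statistics import mean


def repeatability(measurements):
    """Per-axis spread of repeated measurements of one point.

    measurements: list of (x, y, z) readings in metres.
    Returns a tuple with the mean absolute deviation from the
    per-axis mean (an assumed summary, not the paper's definition).
    """
    spreads = []
    for axis in zip(*measurements):  # regroup readings by axis
        centre = mean(axis)
        spreads.append(mean([abs(v - centre) for v in axis]))
    return tuple(spreads)


# Four hypothetical readings of the same point (illustration only).
readings = [
    (10.00, 20.00, 5.00),
    (10.04, 20.00, 5.10),
    (9.98, 20.06, 5.00),
    (10.02, 20.02, 5.10),
]
spread_x, spread_y, spread_z = repeatability(readings)
print(f"X: {spread_x:.3f} m, Y: {spread_y:.3f} m, Z: {spread_z:.3f} m")
```

Averaging these per-point spreads over the whole group of points would give one figure per axis that can be compared with the pre-adjustment deviations of the point cloud.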
The software used for trajectory adjustment was RIEGL's RiProcess, designed for managing, processing, and analyzing data acquired with airborne and terrestrial mobile laser scanning systems. A two-step procedure was applied to adjust the data using control points identified in the point clouds. The locations of these control points were chosen according to criteria such as spatial distribution, with a preference for crossroads, where point clouds from different passages overlap. In these overlapping areas the adjustment was done in two steps, first relative, then absolute to the control points. After the trajectory has been adjusted to the given control points, the new trajectory is applied to re-calculate the coordinates of the point clouds.

Accuracy Evaluation
The data accuracy check is conducted on newly measured points, rather than on the control points used in the adjustment. We manually measured 16 points in both the UAV stereo images and the point cloud data. The check points are evenly distributed over the whole area along the roads, as shown in Figure 4. The check points were measured carefully at the corners of landmarks or at places where we could easily identify the corresponding position.
Both data sets used in our evaluation experiment are based on the UTM projection system. However, the height values of the UAV image data are WGS84 ellipsoidal heights, while the point cloud data carries orthometric heights. As the test area is quite small (only 2.2 km²) and there is no large mountain causing gravity anomalies within it, we can assume that the difference between orthometric and ellipsoidal heights can be replaced by a constant value.
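The constant-offset assumption can be sketched as follows. The text does not say how the constant was obtained, so estimating it as the mean height difference over common points is an assumption made here for illustration; the height values and function names are hypothetical.

```python
from statistics import mean


def constant_height_offset(ellipsoidal, orthometric):
    """Mean of (ellipsoidal - orthometric) height over common points, in metres."""
    return mean([h - H for h, H in zip(ellipsoidal, orthometric)])


def height_residuals(ellipsoidal, orthometric, offset):
    """Height differences left after removing the constant offset."""
    return [(h - offset) - H for h, H in zip(ellipsoidal, orthometric)]


# Hypothetical heights of three common points (illustration only).
h_uav = [30.0, 42.5, 18.2]  # ellipsoidal heights from UAV stereo images
h_mms = [22.0, 34.3, 10.3]  # orthometric heights from the point cloud

offset = constant_height_offset(h_uav, h_mms)
residuals = height_residuals(h_uav, h_mms, offset)
print(f"offset: {offset:.3f} m")
print("residuals:", [round(r, 3) for r in residuals])
```

Note that an offset estimated from the same points that are later used for checking absorbs any systematic height bias between the two data sets; a value from an independent source avoids this.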
Georeferencing the point cloud data greatly reduces the inaccuracy caused by GPS signal losses. Before georeferencing the point clouds to the control points, we get an average deviation of 0.4 m in planimetry and 0.6 m in height, with a maximal height deviation of 1.3 m. The inset in Figure 4 (right) shows the distribution of the residuals in the X and Y directions after adjustment. For the 16 check points, the accuracy analysis of the georeferencing resulted in RMSE values of 11 cm for planimetry and 20 cm for height. These are values which could be expected given the cumulative error budget of UAV images and laser-scan point clouds.

Building Model Reconstruction using UAV Imagery Data
The building models have been created using the semi-automatic modeling software CyberCity Modeler (Gruen, Wang, 1998). Given weakly ordered key points of roofs measured in a Digital Workstation, and following a set of criteria, it automatically generates roof faces and wall faces, so that only a small amount of post-editing is needed. It greatly reduces the operation time for constructing building models and can generate thousands of buildings with a fairly small work force. It is also invariant to model resolution and is able to generate fine details on building roofs such as air-conditioning boxes, water tanks, etc. We used ERDAS StereoAnalyst as Digital Workstation and implemented a converter between StereoAnalyst and CyberCity Modeler. Part of the building and terrain modeling result is shown in Figure 5.

Façade Modeling
After georeferencing of the point clouds using control points from UAV data, the two datasets can be registered together. The roof models from stereo UAV images can serve as building contours, which compensates for the incompleteness of the MMS point cloud and makes it easier to model the façade from occluded point clouds. The façade modeling in this project is done manually in 3ds Max. To make the modeling procedure easier, the point clouds are wrapped into a surface mesh model and imported into 3ds Max as reference. The roof model from photogrammetric measurement of the same building is also imported into 3ds Max; it should match the point cloud mesh model. Sometimes the boundary of the façades from the point clouds shows a small deviation from the boundary of the roofs from UAV images. For parts of the façade without point clouds, the complete façade structure can be deduced from the regular pattern of windows. Moreover, terrestrial or oblique images of façades can provide further reference for this deduction. Figure 6 shows an example of a complete building model with roof structures and façades.
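The deviation between façade boundaries and roof boundaries mentioned above can be quantified in the horizontal plane. The following sketch is not part of the workflow described in this paper; it assumes that the roof outline is available as a closed 2D polygon and that façade points have been projected to the ground plane. All names and coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest 2D distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def outline_deviation(points, outline):
    """Distance of each façade point to the nearest roof outline edge.

    outline: list of (x, y) vertices of a closed polygon; the closing
    edge from the last vertex back to the first is added here.
    """
    edges = list(zip(outline, outline[1:] + outline[:1]))
    return [min(point_segment_distance(p, a, b) for a, b in edges)
            for p in points]


# Hypothetical 10 m x 10 m roof outline and three façade points.
roof = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
facade_points = [(5.0, -0.1), (10.2, 5.0), (11.0, 11.0)]
deviations = outline_deviation(facade_points, roof)
print([round(d, 3) for d in deviations])
```

Statistics of these distances per façade could indicate where a manual check of the match between roof and façade is needed.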
We also took terrestrial stereo images of an area called "University Town" for 3D façade modeling.
Figure 7 shows the sample data and the modeling results of the most complicated building model in University Town, the CREATE building. This building has a large volume and very complex structures, including solar panels on the roof, bars on the façades to stop the hot wind going up, etc. In fact, the building is so large that one image cannot cover the whole building. We have to measure each part of the roof and then combine the roof parts into a complete model. The façades are modeled in 3ds Max from wrapped point clouds. For the detailed structures which cannot be seen in the UAV images or Mobile Mapping point clouds, we took terrestrial pictures using a NIKON D7000 camera. With the help of these images, we can model the inside structures. As shown in Figure 7, the detailed structures can be created with the help of integrated multiple data sources.

CONCLUSIONS
This paper addresses a project of very high resolution 3D city modeling by integration of UAV images, terrestrial images and MMS data. This pilot project of creating a model of the NUS campus, Singapore, yields valuable experience for future applications and research topics: (1) the accuracy of multiple data fusion is carefully evaluated; (2) the experiment shows that using the UAV data to georeference MMS data is feasible and successful and can save a lot of surveying field work; (3) complete building models are created by integration of UAV data, terrestrial images and MMS data.
However, the problems we met in this project also show the directions of our future work: (1) how to improve the level of automation in georeferencing of MMS data? (2) how to improve the accuracy of data registration by utilizing features from both data sources? (3) how to do the micro-adjustment to achieve a perfect match between building roofs and the façade model? The wall extraction method in Rutzinger et al. (2009) provided some insights for us to further investigate this problem. (4) how to improve and speed up the façade modeling procedure? From our working experience in this project we also found that convenient tools for multi-sensor integrated modeling are still not available, which indicates a topic that deserves more effort.
The project is work in progress. We still have to process most of the terrestrial images, and we also plan to acquire oblique UAV images for a certain section of the area as another useful information channel. These data will be fed into an automated georeferencing and point cloud generation system in order to compare the results with manual and semi-automated approaches.

Figure 2: Layout of the NUS image block (left) and one sample image of a high-rise building (right).

Figure 3: Mobile Mapping System (left), example point cloud data at CREATE buildings (middle) and MMS campus trajectory (right).

Altogether, 34.4 GB of raw point cloud data covering 16 km of road sides were collected within 3 hours, together with 5.25 GB of stereo street image sequences. The point cloud data is stored separately in 54 files. Figure 3 (right) shows the overview of the colored point cloud data, with 700 points/m² in the middle part of the road. The left part of Figure 4 shows the interpolated intensity data along the trajectory. We interpolated the intensity data with 2.5 cm sample spacing and normalized the intensity values to grey levels for visualization, as shown in the upper part of the left image in Figure 4. The intensity data of the point clouds have very stable quality along the road, showing clear corners and edges of road marks, and thus offer great potential for precisely measuring corresponding points between images and point clouds.
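The resampling and grey-level normalization of the intensity data can be illustrated with a simplified sketch. The interpolation method actually used is not specified in the text; here, as a stand-in, the intensities of all points falling into the same 2.5 cm cell are averaged, and the result is scaled linearly to 8-bit grey levels. Function names and sample points are hypothetical.

```python
import math


def rasterize_intensity(points, cell=0.025):
    """Average point intensities per grid cell.

    points: iterable of (x, y, intensity), with x and y in metres.
    cell: grid spacing in metres (2.5 cm, as in the text).
    Returns {(col, row): mean intensity}. Cell averaging is a
    simplification standing in for the unspecified interpolation.
    """
    sums = {}
    for x, y, intensity in points:
        key = (math.floor(x / cell), math.floor(y / cell))
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + intensity, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}


def to_grey(grid):
    """Scale cell values linearly to integer grey levels 0..255."""
    lo, hi = min(grid.values()), max(grid.values())
    span = hi - lo
    if span == 0:  # constant image: map everything to 0
        return {key: 0 for key in grid}
    return {key: round(255 * (v - lo) / span) for key, v in grid.items()}


# Hypothetical points: two fall into cell (0, 0), one each into (1, 0) and (2, 0).
sample = [(0.01, 0.01, 110.0), (0.02, 0.02, 150.0),
          (0.03, 0.01, 50.0), (0.06, 0.01, 250.0)]
grid = rasterize_intensity(sample)
grey = to_grey(grid)
print(grey)
```

Cells without any points are simply absent from the result; a real workflow would have to fill such gaps by interpolation from neighboring cells.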

Figure 4: Left: Overview of NUS campus point cloud intensity data. Right: Distribution of check points and their coordinate differences in the X and Y directions (unit: meter).

Figure 5: Results of NUS campus modeling: (a) one of the UAV images; (b) measured terrain; (c) zoom-in result of a textured building model.
Figure 6: Example of a complete building model: (a) roof model from UAV images, (b) wrapped mesh from MMS point clouds, (c) complete model by integrating roofs and façades, (d) façade image (top) and 3D model (bottom).

Figure 7: Example of complex CREATE building, (a) an image from UAV; (b) sample point cloud from Mobile Mapping system; (c) terrestrial image sample collected by NIKON D7000; (d)(e)(f) views onto the 3D model of the CREATE building, produced by integrating the data from UAV, point cloud and terrestrial images.

Table 1: Result of bundle adjustment of the full block with APERO. Residuals at check points.