IMPROVING VERTICAL ACCURACY OF UAV DIGITAL SURFACE MODELS BY INTRODUCING TERRESTRIAL LASER SCANS ON A POINT-CLOUD LEVEL

Digital Surface Models (DSM) generated by image-based scene reconstruction from Unmanned Aerial Vehicle (UAV) imagery and from Terrestrial Laser Scanning (TLS) point clouds differ markedly in resolution and accuracy. Users therefore have to choose the product that best fulfils their needs. In the current study, these techniques no longer compete but complement each other. Experiments were implemented to verify the improvement of vertical accuracy obtained by introducing different amounts and configurations of terrestrial laser scans into the photogrammetric Structure from Motion (SfM) workflow for high-resolution 3D-scene reconstruction. Results show that it is possible to significantly improve (~49%) the vertical accuracy of DSMs by introducing TLS point clouds. However, the accuracy improvement is strongly associated with the number of Ground Control Points (GCP) introduced in the SfM workflow.


INTRODUCTION
The intense development of unmanned aerial vehicle (UAV) photogrammetry has led to new applications in the close-range aerial domain and has introduced low-cost alternatives to classical manned aerial photogrammetry (Colomina, et al., 2007; Eisenbeiss, 2009). This is connected to the spread of low-cost UAV platforms with integrated inexpensive commercial cameras, Global Navigation Satellite System (GNSS) receivers and Inertial Measurement Units (IMU) for precise and flexible mission acquisition with high platform stability. In addition, advances in computational processing power and machine-vision algorithms have enabled 3D object reconstruction from nadir and oblique 2D imagery by the so-called Structure from Motion (SfM) photogrammetry approach and Multi-View Stereo (MVS) methods (Westoby, et al., 2012; Carrivick, et al., 2016; Wallace, et al., 2016; Mlambo, et al., 2017), hereafter referred to simply as "SfM". This led to a significant improvement over the results achieved with the previously used image-stitching procedure. The SfM results comprise a 3D point cloud, a Digital Surface Model (DSM) and an orthorectified map (ortho-photomap) with a Ground Sampling Distance (pixel resolution) in the sub-decimeter range. Applying such a method requires highly redundant, fine-spatial-resolution (>5 megapixels) aerial photographs with a large overlap (>80%), preferably acquired in a "double-grid" mission structure (Mlambo, et al., 2017).
With these acquisition requirements met, the SfM method solves camera calibration and image geometry by automatically identifying matching features visible in as many images as possible. Bundle-block adjustment is used to transform measured image coordinates into 3D points covering the Area of Interest (AoI) (Micheletti, et al., 2015). Afterward, a Dense Point Cloud is created by applying multi-view dense point-cloud generation based on Automatic Tie Points. The last stage involves rendering a continuous mesh model with texture and an ortho-rectified mosaic. SfM is widely used for large-scale forestry reconstruction (Swetnam, et al., 2018), marine environments (Burns & Delparte, 2017), archeology (Remondino & Campana, 2014), and man-made structures (Irschara, et al., 2010).
For large-scale scene reconstruction, although considerable effort has been devoted to making image-derived point-cloud data denser and more accurate, such data cannot substitute for a laser-scanning point cloud (Shao, et al., 2016). Terrestrial Laser Scanning (TLS) technology can also provide a high-resolution and accurate surface/bathymetry model of the landscape, as well as of surface and shallow underwater structures in the surveyed area (Bewley, 2003; Collin, et al., 2018; Bandini, et al., 2018). In this study, the aim is to improve spatial accuracy, especially its vertical component, by introducing TLS measurements into point clouds generated from airborne UAV imagery georeferenced with Ground Control Points (GCP). The focus is to achieve improved vertical accuracy of high-resolution mapping products for shallow underwater structures and surrounding elevations above Mean Sea Level (MSL).
Until now, photogrammetric products have been compared with, and have competed against, TLS products. This was mainly due to cost-effectiveness ratios and the fact that both techniques provide similar services (3D point clouds) with different resolutions and accuracy (mainly absolute accuracy). Users could simply choose the most beneficial product to meet their requirements. In the current study, these techniques no longer compete but complement each other: TLS is known for its high, millimetre-level accuracy; photogrammetry is known for its visualization (Remondino & Campana, 2014) and relatively large survey areas. The data is integrated into the photogrammetric workflow, producing high-resolution and fine-scale surface/bathymetric models for measuring and mapping underwater structures/objects. This study suggests a new technique for surface and shallow-water bathymetry reconstruction that integrates data from sensors that penetrate the water (UAV RGB camera) with data from those that do not (Terrestrial Laser Scanner operating at 785 nm). The scene reconstruction involves transforming UAV RGB data from 2D imagery into 3D cartometric models while using a TLS point cloud (millions of physical/ground points) as sets of tie or corresponding points (normally features that can be clearly identified in two or more images) in the photogrammetric workflow.

MATERIALS AND METHODOLOGY
2.1 Study Area and Data Collection
The Prosika Canal is located on the south-eastern coast of Vrana Lake (43.843857°N, 15.624078°E), a designated nature park in central Dalmatia, Croatia. The canal was dug during the 18th century to connect Vrana Lake with the Adriatic Sea to reclaim new agricultural fields in Vrana and protect them from seasonal flooding. In the late 19th century, the canal was quite shallow. The 800 m-long canal has been broadened and deepened several times. In 1948, it gained its final dimensions: an 8 m width and its lowest point at about 0.35 m Mean Sea Level (MSL) (Katalinić, et al., 2008).
The Area of Interest (AoI) covers around 0.05 km² at the outlet of the canal (Figure 1). The site is topographically flat and includes anthropogenic structures (canal, bridge, docks, and houses) and sparse natural foliage (trees, shrubs). The canal is a particularly interesting object for investigating the advantages of the proposed technique for shallow-water observations. The upper part of its bed is concrete with high, steep banks. The lower part has a stone bed and natural, gradually sloping banks. The outlet (lower) part of the canal is shallow with natural underwater structures, while the upper part is elevated to prevent seawater from entering Vrana Lake and has a smoother bottom texture.
The area was surveyed by UAV. Terrestrial measurements were also conducted using TLS and a Differential Global Positioning System (DGPS) service.

Figure 1. Area of Interest: outlet of the Prosika Canal
A DJI Phantom 4 Pro UAV (DJI Development Team, 2017) was flown in a nadir, double-grid mission at an altitude of 40 m Above Ground Level (AGL), with the mechanical shutter enabled and 80% overlap, to ensure the quality of the SfM-derived ortho-mosaic and of the subsequent DSM and dense 3D point cloud. A total of 524 photos were collected while surveying the AoI. The Terrestrial Laser Scanner, a FARO Photon 120/20 (FARO Photon Development Team, 2009), was used to scan 4 locations along the canal, with scans overlapping by 30%-50%. Two scans were made from each side of the canal bank on its lower part (outlet), with distances between scans of approximately 15 m to 21 m. Within the AoI, coordinates of 18 geometrically well-defined objects were collected. This ground-truth data was used as GCPs and for validation purposes (Figure 2). For this purpose, the CROPOS (Croatian Positioning Service) VPPS high-precision real-time positioning service was used, with a declared accuracy of ±2 cm for horizontal (2D) and ±4 cm for vertical (3D) positioning (Marjanović & Link, 2009). For all collected Validation Points (VP) and GCPs, the following reference coordinate system and coordinates were used: HTRS96/TM, ellipsoid GRS80; N, E, H (Transverse Mercator projection); orthometric height H = h - N (HVRS71), where h is the ellipsoidal height and N in this expression is the geoid undulation (Official Gazette, 2004).

2.2 Methodology
2.2.1 Pre-processing
TLS data was pre-processed in FARO SCENE (FARO SCENE Development Team, 2017). Every data set was inspected for outliers to assure data quality: depending on the point distribution of the data set, distant sparse points were removed. The UAV imagery underwent a quality inspection of its radiometric performance, and an image was excluded if its radiometric values differed significantly from those of the rest of the data set.

2.2.2 Processing TLS data
In FARO SCENE software, point clouds were produced from the raw TLS data. Since it was important for this task to determine how different amounts of TLS data and their configuration affect the result, several combinations of point clouds were co-registered. In particular, we were interested in how they affected DSM vertical accuracy and spatial resolution. For that purpose, co-registered point clouds with different TLS configurations were produced:
a) 2 co-registered scans located on the same canal bank (L2_S1);
b) 2 co-registered scans on the different canal banks (L2_S2);
c) 4 co-registered scans on the different canal banks (L4), 2 per side.
Co-registration was implemented as a cloud-to-cloud registration in SCENE software, with an average Mean Scan Point Distance Error of around 5-7 mm. Cloud-to-cloud registration proved to be the appropriate processing procedure due to the presence of geometric structures on the site and the high percentage of overlap. Point clouds were exported in LAS file format, and a moderate distance threshold was applied to eliminate duplicate points in the co-registered TLS data sets. Horizontal and vertical alignment of the corresponding point clouds and a Sparse Point Cloud generated from UAV imagery was checked and verified using CloudCompare (CloudCompare Development Team, 2019).
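The duplicate-point elimination mentioned above can be illustrated with a minimal Python sketch. It is not the procedure implemented in SCENE or CloudCompare; it assumes a simple greedy rule in which a point is dropped when an already-kept point lies closer than the threshold. The function name `thin_duplicates` and the 5 mm threshold in the usage note are hypothetical, since the threshold actually applied in the study is not reported.

```python
import math
from collections import defaultdict


def thin_duplicates(points, threshold):
    """Greedy duplicate removal (illustrative assumption, not the SCENE/CloudCompare tool).

    points    : iterable of (x, y, z) tuples in metres
    threshold : distance in metres below which two points count as duplicates
    Returns the list of kept points, in input order.
    """
    cells = defaultdict(list)  # grid cell index -> kept points inside that cell
    kept = []
    for p in points:
        key = tuple(math.floor(c / threshold) for c in p)
        is_duplicate = False
        # With cell size == threshold, any kept point closer than the
        # threshold must lie in one of the 27 neighbouring cells.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    neighbour = (key[0] + dx, key[1] + dy, key[2] + dz)
                    for q in cells.get(neighbour, ()):
                        if math.dist(p, q) < threshold:
                            is_duplicate = True
        if not is_duplicate:
            cells[key].append(p)
            kept.append(p)
    return kept
```

For example, `thin_duplicates(scan_a + scan_b, threshold=0.005)` would merge two co-registered scans (lists of coordinate tuples) and discard every point that falls within 5 mm of a point already kept.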

2.2.3 Processing UAV imagery
UAV imagery was processed in Pix4Dmapper Pro, v. 4.4.12. This software has an automated processing procedure based on SfM algorithms. SfM includes searching for and matching identical points (key points) by analyzing the provided imagery with different descriptors, one of the most common being SIFT (Lowe, 2004). The number of identical points is highly related to the image texture, color gradient, and image resolution. Key points matched with the provided auxiliary image data (approximate coordinates and orientation angles) were used to calculate the exact orientation and camera position in the iterative process of bundle-block adjustment. This reconstruction enables validation of the identical points and calculation of their 3D coordinates. The result is a Sparse Point Cloud whose density is then enhanced in the next step by implementing different algorithms such as Clustering Views for Multi-View Stereo and Patch-based Multi-View Stereo (Furukawa & Ponce, 2010). Afterward, the processing procedure entails rendering of a continuous mesh model with texture. The generated digital elevation model is used to project every image pixel and to calculate the georeferenced ortho-mosaic as a final processing result (Strecha, et al., 2008). In general, Pix4D software has integrated SfM in three processing stages with the following outputs: (S1) Initial Processing, which calculates a sparse 3D point cloud based on key points; (S2) Point Cloud and Mesh, which generates a Dense Point Cloud that provides a very accurate basis for distance, surface and volume measurements, and a 3D Textured Mesh that represents the shape of the model through vertices, edges, faces, and texture projected onto it from the imagery; (S3) DSM and ortho-mosaic, which generates a Digital Surface Model (i.e., a model of the mapped area) and a 2D georeferenced ortho-mosaic map as the main result (Pix4Dmapper Support Team, 2019).
The proposed novel approach integrates the TLS data into the Pix4D SfM processing procedure. The processing was performed in three different scenarios.

2.2.3.1 First scenario
The first processing scenario is a 2-stage process starting with (S1) Initial Processing, with 4 GCPs imported and marked using the Croatian coordinate system (HTRS96/TM). After the first stage was completed, the previously produced TLS point clouds were introduced into the project. The second stage (S2) was skipped, and the processing continued with the (S3) DSM and ortho-mosaic calculation step. The Inverse Distance Weighting method was used for DSM calculation without applying any noise or smoothing filter. The procedure was repeated for the 3 different TLS point clouds with the previously explained configurations.
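As an illustration of the Inverse Distance Weighting step, the following minimal Python sketch estimates the height of one DSM cell from surrounding point-cloud samples. It shows the general IDW principle only; the exponent and neighbourhood used internally by Pix4D are not stated in this study, so the default `power=2.0` and the function name `idw_height` are assumptions.

```python
import math


def idw_height(x, y, samples, power=2.0):
    """Inverse Distance Weighting estimate of the height at (x, y).

    samples : iterable of (xs, ys, zs) points, e.g. point-cloud points near the cell
    power   : distance exponent (2.0 is a common choice, assumed here)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xs, ys, zs in samples:
        d = math.hypot(x - xs, y - ys)
        if d == 0.0:
            # The query location coincides with a sample: return its height directly.
            return zs
        w = 1.0 / d ** power
        weighted_sum += w * zs
        weight_total += w
    return weighted_sum / weight_total
```

Because the weights fall off with distance, the estimate at a cell is dominated by its nearest points, which is consistent with dense TLS patches influencing the DSM locally where they are introduced.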

2.2.3.2 Second scenario
The second processing scenario was conducted as a three-stage processing procedure. The initial processing was carried out in the same way as in the first processing scenario, with the same 4 GCPs selected. After the TLS point cloud was introduced, the processing continued with the second (S2) and then the third (S3) processing step. The goal was to evaluate the final DSM based on a Dense Point Cloud generated from Automatic Tie Points and the introduced TLS point cloud. This processing procedure was carried out on the data set with the introduced point cloud of 2 co-registered TLS scans on the different canal banks (L2_S2). For DSM calculation, the same processing parameters were used as in the first processing scenario.

2.2.3.3 Third scenario
To assess the improvement of vertical accuracy obtained by introducing a larger number of GCPs, the following processing procedure was applied: a 3-stage processing scenario with 12 GCPs introduced and co-registered TLS point clouds on the different canal banks (L2_S2). For evaluation purposes, additional DSMs were produced based only on UAV imagery and two different GCP configurations, with 4 GCPs and with 12 GCPs (without introduced TLS point clouds). The processing was carried out by applying the automatic 3-stage Pix4D processing procedure with the same parameters as in all other scenarios.

Post-processing
The first goal of the post-processing was to establish how the heights are affected by different amounts and configurations of introduced TLS point clouds. This was assessed on selected Validation Points by comparing coordinates measured by DGPS with the average coordinate values of a 3 × 3 pixel area on the models, with and without introduced TLS point clouds. The second goal was to quantify volumetric changes between models with and without introduced TLS data sets. For that purpose, DSMs of Difference (DoD) were calculated between models with provided GCPs and models with the same GCPs and introduced TLS point clouds. The third goal was to compare heights within models with different introduced TLS point clouds on structures on land and in shallow water. For this purpose, profiles of concrete, parts of the canal, and a bridge were selected. The fourth goal was to inspect surface behavior after TLS data introduction via roughness values. Roughness was calculated on a point-cloud level for each point as the distance between that point and the best-fitting plane computed on its nearest neighbors, with the neighborhood radius set equal to the average Ground Sampling Distance (GSD).
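The post-processing measures described above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch rather than the software actually used: the function names are hypothetical, the DSMs are assumed to be co-registered arrays of equal shape, and the roughness plane is assumed to be a least-squares plane fitted to all points within the radius (including the point itself).

```python
import numpy as np


def window_mean(dsm, row, col, half=1):
    """Mean DSM height in the (2*half+1) x (2*half+1) window around a pixel (3 x 3 by default)."""
    window = dsm[max(row - half, 0):row + half + 1,
                 max(col - half, 0):col + half + 1]
    return float(np.nanmean(window))


def dsm_of_difference(dsm_with_tls, dsm_without_tls):
    """Per-pixel height difference between two co-registered DSMs."""
    return dsm_with_tls - dsm_without_tls


def roughness(points, index, radius):
    """Distance from points[index] to the least-squares plane of its neighbours.

    points : (n, 3) array of x, y, z coordinates
    radius : neighbourhood radius (the text sets it to the average GSD)
    Returns NaN when fewer than 3 neighbours are available.
    """
    p = points[index]
    distances = np.linalg.norm(points - p, axis=1)
    neighbours = points[distances <= radius]
    if len(neighbours) < 3:
        return float("nan")
    centroid = neighbours.mean(axis=0)
    # The plane normal is the right singular vector belonging to the
    # smallest singular value of the centred neighbourhood.
    _, _, vt = np.linalg.svd(neighbours - centroid)
    normal = vt[-1]
    return float(abs(np.dot(p - centroid, normal)))
```

A brute-force neighbour search is used for brevity; for point clouds with millions of points a spatial index would be needed in practice.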

RESULTS AND DISCUSSION
The achieved average GSD for all projects and processing procedures is 1.37 cm. Vertical accuracy of the first, second and third processing scenario results was inspected on the same 6 Validation Points through mean value, standard deviation, and coefficient of variation.
The first processing scenario results are presented in Table 1. The best mean-vertical accuracy was achieved with the introduced TLS point cloud that had the highest point density and area coverage; i.e., 4 co-registered TLS scans (L4). The total improvement of mean-vertical accuracy compared with the 4 GCPs model is ~8%. This improvement is largely related to the fact that DSMs in the first processing scenario were produced solely from the Sparse Point Cloud of UAV imagery and patches of the introduced dense TLS point clouds. Thus, the ratio of TLS point-cloud coverage to the AoI could affect the results. The results were also affected by the distribution of the points, which is related to the TLS scan configuration.
The results for the second processing scenario showed that vertical accuracy improves when conducting a 3-stage processing procedure instead of a 2-stage one with the same TLS point-cloud configuration (Table 2).

Model parameters              4 GCP & L2_S2 & (S1)-(S3)
Mean-vertical accuracy [m]    0.2142
Standard Deviation            0.1721
Coefficient of Variation      0.8034

Table 2. Achieved vertical accuracy of the second processing scenario model

Comparing the results of these two processing scenarios for the model with the introduced L2_S2 TLS configuration shows an improvement in vertical accuracy of ~4%, with the mean-vertical accuracy reaching about 15 times the project's average GSD.
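The three statistics reported for the Validation Points can be computed as in the following Python sketch. It assumes that the inputs are absolute height differences between DGPS and model heights and that the sample standard deviation is used; neither detail is stated explicitly in this study, and the function name `summarize` is hypothetical.

```python
import statistics


def summarize(abs_errors_m):
    """Return (mean, standard deviation, coefficient of variation) of vertical errors.

    abs_errors_m : absolute height differences |H_DGPS - H_model| in metres
                   (assumed input; sample standard deviation is also an assumption)
    """
    mean = statistics.fmean(abs_errors_m)
    std = statistics.stdev(abs_errors_m)
    return mean, std, std / mean


# Consistency check on the values printed in Table 2:
# the coefficient of variation is the standard deviation divided by the mean.
cv_table2 = 0.1721 / 0.2142  # close to the tabulated 0.8034, up to input rounding
```

The coefficient of variation normalizes the spread by the mean, which is why it allows the spread of offsets to be compared between scenarios with very different mean errors.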
Vertical accuracy depends strongly on the number of GCPs introduced in the Bundle Adjustment (BA), and the use of 3 or more GCPs per 100 images is suggested to obtain greater accuracy (Sanz-Ablanedo, et al., 2018). The number of GCPs was therefore increased in the third processing scenario: 12 GCPs were included for the 524-photo data set.
Validation results presented in Table 3 show a great improvement in vertical accuracy, especially for the model with introduced TLS point clouds, which reached about 4 times the project's average GSD, an overall improvement of ~49% compared with the model without TLS data.

Table 3. Vertical accuracy of models obtained with 12 GCPs

This is especially important knowing that the declared vertical positioning accuracy of the positioning service employed is ±4 cm. It suggests that the proposed processing procedure can dramatically improve DSM vertical accuracy. It also emphasizes the importance of a larger number of well-distributed GCPs for high-resolution 3D scene reconstruction with the proposed approach. For both the first and second scenarios, the values of standard deviation and coefficient of variation confirm that, although the vertical accuracy achieved is lower, the correspondence between points is quite high, with a small spread in offset values. In contrast, the third processing scenario gave very high mean-vertical accuracy, although the spread is greater and confidence drops. This type of trade-off between accuracy and confidence was anticipated to some extent.
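The GSD multiples and percentage improvements quoted in this section follow from simple ratios, sketched below in Python. The function names are hypothetical, and the improvement is assumed to be defined relative to the error of the model without TLS data. The 5.6 cm value is the mean-vertical accuracy of the 12 GCPs model with TLS as stated in the Conclusion.

```python
GSD_M = 0.0137  # average GSD of all projects: 1.37 cm


def gsd_multiple(error_m, gsd_m=GSD_M):
    """Express a vertical error as a multiple of the Ground Sampling Distance."""
    return error_m / gsd_m


def relative_improvement(error_without_tls_m, error_with_tls_m):
    """Fractional reduction of the mean vertical error after introducing TLS data.

    Assumed definition: (error without TLS - error with TLS) / error without TLS.
    """
    return (error_without_tls_m - error_with_tls_m) / error_without_tls_m


# Second scenario (Table 2): 0.2142 m, about 15.6 GSDs
second_scenario = gsd_multiple(0.2142)
# Third scenario with TLS (value from the Conclusion): 0.056 m, about 4.1 GSDs
third_scenario = gsd_multiple(0.056)
```

As a purely illustrative input (not a reported result), `relative_improvement(0.10, 0.05)` gives 0.5; i.e., halving the mean error corresponds to a 50% improvement.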
Horizontal (x and y) uncertainties of the first and second processing scenario models averaged -23.46 cm for the X coordinate and 9.75 cm for the Y coordinate. For the third processing scenario, the horizontal uncertainty was 8.61 cm for the X coordinate and 9.87 cm for the Y coordinate. All horizontal uncertainties were estimated using the same 6 Validation Points as for vertical accuracy. Comparison of all DoDs shows that the overall height difference between models with and without introduced TLS data is quite small, less than 5 cm, as shown in Figure 3.

Figure 3. [...] introduced 4 GCPs and DSM with introduced 4 GCPs and L2_S2 TLS configuration and processing procedure (S1)-(S3); e) DoD of DSM with introduced 4 GCPs and DSM with introduced 4 GCPs and L4 TLS configuration; f) DoD of DSM with introduced 12 GCPs and DSM with introduced 12 GCPs and L2_S2 TLS configuration and processing procedure (S1)-(S3).

On flat areas and areas with rougher texture, height differences are almost non-existent, but areas with smoother texture and lower object-to-background color contrast exhibit larger height differences. This effect is clearly visible for shallow underwater structures, as shown in Figure 3. Reconstruction of underwater structures improves with the number of introduced TLS scans, especially for areas with lower object-to-background color contrast. Introducing TLS provides a more realistic 3D reconstruction with a smoother gradient at the introduced locations, as visible in the middle part of the canal profile in Figure 4 b).
The main sources of error for scenes reconstructed with the SfM approach were saturated pixels and noisy pixels due to shadow, which are reduced when introducing TLS, as can be seen in Figure 5. Improvements are also evident in edge reconstruction and in the differentiation of anthropogenic objects such as walls and fences.

CONCLUSION
In this paper we proposed a novel approach that introduces a TLS point cloud in the photogrammetric SfM workflow for obtaining high-resolution 3D scene reconstruction. We were particularly focused on the challenging area of underwater structures/objects and producing fine-scale surface/bathymetric data. Also, one of the main goals was estimating the improvement in vertical accuracy in different processing scenarios.
Our experimental results show that it is possible to obtain very satisfactory results, a mean-vertical accuracy of 5.6 cm, for models produced by introducing TLS point clouds together with a larger number of GCPs (12). This confirms the established importance of introducing a greater number of GCPs in the BA process for achieving higher positioning accuracy. When comparing models with introduced TLS and the corresponding model with the same GCP configuration but without TLS, we see an overall improvement in vertical accuracy of ~8% for the 4 GCPs model and ~49% for the 12 GCPs model.
Since SfM relies on texture within an illuminated image pair to identify features, while TLS depends on the density of point capture models, introducing TLS provided better results for areas with lower object-to-background color contrast, and saturated and noisy pixels.
The significant improvement in vertical accuracy and the subtle improvements in the representation of underwater structures can be of considerable importance for diverse applications.
For future research, it would be interesting to test this proposed approach on drone imagery with a lower number of GSDs and higher mission altitude, and to estimate the impact on resolution and vertical accuracy.