COMPARING ACCURACY OF ULTRA-DENSE LASER SCANNER AND PHOTOGRAMMETRY POINT CLOUDS

Massive point clouds have become a common product of surveys using passive (photogrammetry) or active (laser scanning) technologies. A common question is how the accuracy and precision of the different technologies and processing options compare. In this work four ultra-dense point clouds (PCs) from drone surveys are compared. Two PCs were created from imagery using a photogrammetric workflow, with and without ground control points (GCPs). The two laser scanning PCs were created from two drone flights, one with a Riegl MiniVUX-3 lidar sensor, resulting in a point cloud with ~300 million points, and one with a Riegl VUX-120 lidar sensor, resulting in a point cloud with ~1 billion points. Relative differences between all pairs of the four PCs are analysed by calculating point-to-point distances over nearest neighbours. Eleven clipped PC subsets are used for this task. GCPs are also used to assess residuals in the two photogrammetric point clouds, in order to quantify the improvement gained from using GCPs when processing the images. Comparing the two photogrammetric point clouds shows that using GCPs improves the average absolute position error from 0.12 m to 0.05 m and the RMSE from 0.03 m to 0.01 m. Point-to-point distances over the PC pairs show that the closest point clouds are the two lidar clouds, with mean absolute distance (MAD), median absolute distance (MdAD) and standard deviation of distances (RMSE) of 0.031 m, 0.025 m and 0.019 m respectively; the largest difference involves the photogrammetric PC with GCPs, with 0.208 m, 0.206 m and 0.116 m respectively, with the Z component providing most of the difference. Photogrammetry without GCPs was more consistent with the lidar point clouds, with a MAD of 0.064 m, an MdAD of 0.048 m and an RMSE of 0.114 m.


INTRODUCTION
Ultra-dense point clouds have become a common product of surveys using passive (photogrammetry) or active (laser scanning, lidar) technologies based on close-range sensing with unmanned aerial vehicles (UAVs). Processing massive point clouds to extract products raises many research questions about balancing accuracy and processing speed. One key question is what accuracy to expect from these two technologies, as each has different pros and cons. Accurate UAV position and orientation from a Global Navigation Satellite System (GNSS) and Inertial Measurement Units (IMU) can now directly provide high-quality point clouds from both photogrammetry and laser scanning (Masiero et al., 2015; Stöcker et al., 2017). Nevertheless, Ground Control Points (GCPs) and check points (CPs) improve the overall accuracy and give information about residual errors (Guarnieri et al., 2013). Surveying targets to use as GCPs and check points is time consuming and can be a problem in large areas with vegetation cover and/or low accessibility, such as forests (Pirotti et al., 2017, 2014; Vaglio Laurin et al., 2016). For this reason it is important to know what accuracy to expect from lidar and photogrammetric processing without GCPs, particularly when considering very dense point clouds such as those in this investigation, where in some cases points are spaced less than 1 cm apart.
The definition of "ultra-dense" is subjective, as technology is enabling higher point density through higher quality cameras, lower flight altitude and slower flight speed of UAVs, and faster measurement rates for laser scanners. Cramer et al. (2018) use the term "ultra-high precision" when comparing imagery with 0.5 cm ground sampling distance (GSD) and lidar point clouds with 800 points per square meter. In our case we have a larger GSD (~2 cm), but much denser point clouds reaching more than 20,000 points (see Figure 1). This is due to advances in technology: those authors used a VUX-1 sensor, whereas a VUX-120 was used in this work. The accuracy in Cramer et al. (2018) was estimated at around 3 cm in the Z axis.
Assessing the accuracy of a survey product is a key process in geomatic sciences. The gold standard is to compare measurements with other measurements that are about one order of magnitude (10x) more accurate. GNSS with differential correction in post-processing or in real time (RTK) is commonly used. Another approach is to use a total station, ideally with least squares adjustment to minimize random measurement errors. A state-of-the-art survey with these positioning technologies can provide an accuracy of around 1-2 cm. This poses a limitation: to be one order of magnitude better than the expected accuracy of the sensors tested here, and thus be a valid reference, an accuracy of < 1 cm would be ideal. The term "accuracy" used here is determined by comparing with GNSS reference measurements. It is important to clarify that reference measurements have their own error budget due to the technology that is used (GNSS). In this study the accuracy of the GNSS measurements is two-fold, rather than the ideal ten-fold, with respect to the expected accuracy of the drone data.

Study area
The study area is located in Castelfranco Veneto (TV) in the Veneto Region, in north-eastern Italy. It consists of Villa Revedin Bolasco with a historical garden of approximately 8 ha, with a lake in the middle and several heritage elements. The vegetation varies from dense evergreen trees to broadleaves. Figure 1 below gives an overview of its position and composition.

Ground survey
A total of 30 targets were placed on the premises; 12 targets are larger bit-encoded targets from Metashape (60 cm x 60 cm and 3 cm thick), and the rest are smaller 10 cm x 20 cm topographic targets.

Figure 2. Left: smaller targets; right: larger targets used for the
The ground survey was carried out with a GNSS receiver in RTK mode. The surveyed points had an average accuracy (RMSE) of 2.20 cm horizontally and 2.73 cm vertically. Eleven points were also surveyed with a Leica TC702 total station, to measure distances and compare them with the resulting products. Figure 1 (bottom left) shows the distribution of the points and of the distances measured with the total station.

Methods
Four point clouds were created: two using active lidar sensors, and two from imagery processed via a photogrammetric workflow in Metashape©, with a pipeline based on structure from motion (SfM) and dense image matching (DIM). The two photogrammetric point clouds were produced with and without using the targets as ground control points (GCPs). The one without GCPs used only the camera position and orientation from the GNSS and inertial measurement unit (IMU).
The UAV carrier was a Soleon LasCo X8 multicopter equipped with one of the three sensors (one camera and two laser scanners) for each flight. Position and orientation were measured via GNSS with RTK corrections and an IMU (Applanix APX-20).

Photogrammetric point cloud.
The two point clouds produced via photogrammetry consisted of 1 billion points each.
A total of 1068 images were acquired, with an average relative flight height of ~100 m and an average baseline of ~10 m. The camera was a SONY ILCE-7RM3 with a ZEISS Loxia 2.8/21 lens (focal length 21 mm); the image size is 7952 x 5304 pixels (width x height).
The CMOS sensor has a physical size of 35.90 mm x 24.00 mm which, for 7952 x 5304 pixels, gives an average pixel size of approximately 4.52 µm. The corresponding field of view (FOV) is ~79.6°, ~59.49° and ~90.59° in width, height and diagonal respectively. At the average relative flight height of 100 m this corresponds to an image footprint of ~166.67 m, ~93.75 m and ~191.22 m respectively and, at nadir, to a pixel size (GSD, ground sampling distance) of ~2 cm.
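The pixel size and GSD figures above can be checked with a short calculation. The following is a minimal sketch assuming a simple pinhole camera model at nadir; the function and constant names are illustrative and not part of the original workflow, and only the sensor size, image size, focal length and flight height are taken from the text.

```python
# Pinhole-camera sketch: pixel pitch and nadir GSD from the stated camera specs.
# All names below are illustrative assumptions; only the numbers come from the text.
SENSOR_W_MM, SENSOR_H_MM = 35.90, 24.00   # physical sensor size
IMAGE_W_PX, IMAGE_H_PX = 7952, 5304       # image size in pixels
FOCAL_MM = 21.0                           # focal length
FLIGHT_HEIGHT_M = 100.0                   # average relative flight height


def pixel_pitch_um(sensor_mm: float, pixels: int) -> float:
    """Physical size of one pixel along one sensor axis, in micrometres."""
    return sensor_mm / pixels * 1000.0


def nadir_gsd_m(pitch_um: float, focal_mm: float, height_m: float) -> float:
    """Ground sampling distance at nadir: pixel pitch scaled by height / focal length."""
    return (pitch_um * 1e-6) * height_m / (focal_mm * 1e-3)


# Average the pitch along width and height, as done in the text.
pitch = (pixel_pitch_um(SENSOR_W_MM, IMAGE_W_PX)
         + pixel_pitch_um(SENSOR_H_MM, IMAGE_H_PX)) / 2
gsd = nadir_gsd_m(pitch, FOCAL_MM, FLIGHT_HEIGHT_M)
print(f"pixel pitch ~ {pitch:.2f} um, nadir GSD ~ {gsd * 100:.1f} cm")
```

The GSD grows linearly with flight height, so halving the height would roughly halve the GSD under the same assumptions.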

Lidar point clouds.
Lidar point clouds were obtained from the Riegl MiniVUX-3UAV and VUX-120 sensors, which can provide a pulse repetition rate of 200,000 and 1.8 million measurements per second, respectively.

Methods
The objective of this work is to evaluate the differences between the four point clouds. There are several methods for assessing the difference between point clouds. The distance between each point and its nearest neighbour in the reference point cloud is a very common approach (Lague et al., 2013). The distance between each point and the nearest facet of a mesh is also an option, but it is efficient only when regular surfaces are scanned. In specific cases, such as indoor scans with clean unobstructed walls, an Extended Gaussian Image and a histogram z-cluster can be used to align scans to walls represented as reference lines (Chen et al., 2018). This is not exactly our scenario: there are walls that could be used, but they are not regular enough with respect to the magnitude of the errors that we are trying to detect. In our case there are only a few surfaces such as walls and roofs, so the point-to-point approach was used. When computing the difference between point clouds, or between a point cloud and a mesh, one point cloud must be defined as the reference. In this case the Riegl VUX-120 point cloud can be considered the reference, as it has the highest point density and we can also expect a lower error budget, of ±2-5 cm (Bin et al., 2008; Habib et al., 2009; Petrie and Toth, 2008; Thiel and Wehr, 2004). In a very similar work with VUX-1 and photogrammetry, checking against precise ground control points gave a standard deviation in the Z axis of 0.029 m and 0.030 m after adjustment (Cramer et al., 2018).
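To make the point-to-point approach concrete, the following is a minimal sketch of a nearest-neighbour distance computation. It uses a brute-force search for clarity, which is an assumption made for illustration and not the implementation used in this work; for clouds of the size discussed here a spatial index such as a k-d tree would be needed. The function name and the toy coordinates are hypothetical.

```python
import math

Point = tuple[float, float, float]


def nearest_neighbour_distances(compared: list[Point], reference: list[Point]) -> list[float]:
    """For each point in `compared`, Euclidean distance to its closest point in `reference`."""
    return [min(math.dist(p, q) for q in reference) for p in compared]


# Hypothetical toy clouds (metres); real subsets hold far more points.
reference_cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
compared_cloud = [(0.0, 0.0, 0.03), (1.0, 0.04, 0.0)]

distances = nearest_neighbour_distances(compared_cloud, reference_cloud)
print(distances)
```

Note that the result is not symmetric: swapping the compared and reference clouds generally gives a different set of distances, which is why one cloud must be chosen as the reference.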
We will refer to differences in position as residuals and treat them like errors, even if they are formally errors only when compared to a measurement that is one order of magnitude more accurate. As mentioned, the closest to a reference is the VUX-120 point cloud; differences can be considered errors with respect to that point cloud. To check for random, systematic and gross residuals, the distribution of distances in the X, Y and Z directions between each point and its nearest neighbour, over pairs of point clouds, is used. Due to the large size of the clouds, eleven small subsets were clipped. The subsets consisted of six of the large targets, four buildings at the corners of the study area, and one concrete bench at the center of the area (very close to target 12 in Figure 1). These last five subsets were chosen because of their regular shape and the absence of grass or other vegetation.
All pairwise combinations of the four point clouds, without repetition, were processed, resulting in 6 pairs.
The coordinates of the single targets were also compared with the GNSS coordinates, but only for the VUX-120 and the two photogrammetry clouds, as the miniVUX-3UAV points were not dense enough to enable detection of the target center (see Figure 3).
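The count of six follows from choosing 2 clouds out of 4 without regard to order. A minimal sketch, with illustrative labels for the four point clouds (the label strings are assumptions, not names used in the original processing):

```python
from itertools import combinations

# Illustrative labels for the four point clouds compared in this work.
clouds = [
    "photogrammetry_with_GCP",
    "photogrammetry_without_GCP",
    "miniVUX-3UAV",
    "VUX-120",
]

# Unordered pairs without repetition: C(4, 2) pairs.
pairs = list(combinations(clouds, 2))
for first, second in pairs:
    print(f"{first} vs {second}")
print(len(pairs))
```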
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B1-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France
Figure 3. 1 x 1 m point cloud clips with targets, respectively (from left to right) from the photogrammetric, VUX-120 and miniVUX-3 surveys. The blue-green-red color scale is intensity from the lowest to the highest value.
Statistics calculated were the mean of absolute differences (MAD), the root mean square of differences (RMSE) and, as a metric more robust to gross errors, the median absolute difference (MdAD) (Höhle and Höhle, 2009), each computed from Δdxyz, the distance between nearest neighbours.
Table 2. Values of difference metrics over the eleven subsets.
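The exact expressions for the three statistics are not reproduced in the text, so the following is a sketch under the usual textbook definitions, which is an assumption: MAD as the mean of the absolute distances, RMSE as the square root of the mean squared distance, and MdAD as the median of the absolute distances. The function name and the sample values are hypothetical.

```python
import math
import statistics


def difference_metrics(distances: list[float]) -> dict[str, float]:
    """MAD, RMSE and MdAD of a list of distances (assumed textbook definitions)."""
    abs_d = [abs(d) for d in distances]
    squared = [d * d for d in distances]
    return {
        "MAD": statistics.fmean(abs_d),               # mean of absolute differences
        "RMSE": math.sqrt(statistics.fmean(squared)),  # root mean square of differences
        "MdAD": statistics.median(abs_d),              # median absolute difference
    }


# Hypothetical distances in metres; the last value mimics a gross error.
sample = [0.02, 0.03, 0.04, 0.50]
metrics = difference_metrics(sample)
print(metrics)
```

With a single gross error in the sample, MAD and RMSE are pulled well above the typical distance while MdAD stays close to it, which is the robustness property the text refers to.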

DISCUSSION
Results from the two photogrammetric products, with and without GCPs, show unsurprisingly that residuals in the XY plane are much lower when using GCPs. Figure 5 below shows an example of GCPs used. Results show an improvement of average absolute position error from 0.12 m to 0.05 m and of RMSE from 0.03 m to 0.01 m. As mentioned in the introduction, the GNSS RTK accuracy is around 0.02 m, therefore the improvement can be quantified only up to the GNSS RTK accuracy and the GSD of the orthoimages. Military Geographic service (IGM) which has a 10 cm accuracy (Barbarella and Ronci, 2005). It is likely that the orthometric height of the camera centers was calculated using a different correction source, thus producing this offset, which can be quantified at about 17 cm (Table 2). When using GCPs, their coordinates must be in the same reference as the camera center coordinates. Mismatches in the Z coordinate lead to a biased position in the Z component (Figure 6). This is the case for target no. 5, which has a blunder in the Z direction, likely due to human error.

Figure 6. Top: the PCs from photogrammetry with GCPs and from VUX-120 for target no. 5. Bottom: the PCs from photogrammetry with and without GCPs, and from VUX-120, over a wall segment. The point clouds are colored in RGB for photogrammetric with GCP, in blue without GCP and in grey for VUX-120.

Figure 6 above shows the situation over two of the eleven subsets used for comparing the point clouds. It is also quite evident that the photogrammetric point clouds are smoother, as they do not have the noise that laser scanners have. This noise stems from the limited precision of the laser beam orientation and from the distance measurement error, whose effect depends on the incidence angle with the surface that is hit.
The lidar point clouds provide the highest overall consistency with each other, with the VUX-120 sensor having the highest density and providing the most information, also below dense canopies. This should be taken into consideration in contexts where vegetation is an important factor (Mozzato et al., 2018) and should be distinguished from the terrain or labelled to discriminate urban objects (Pirotti et al., 2019).
Results similar to those of this investigation were found by Hugenholtz et al. (2016), where low-cost GNSS gave residuals in the vertical Z component twice as high as high-grade GNSS. It must be noted, though, that in our case the higher Z residual component was due to a blunder and not to the GNSS receiver quality. The take-home message is that automatic camera position and orientation via on-board GNSS with RTK might be improved by GCPs, but surveying GCPs can add blunders, as human error must always be taken into consideration. When the error is obvious, it can be removed with typical outlier-detection methods, but if it is not, it might bring unexpected results.
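As an illustration of what a typical outlier-detection step could look like, the following sketch flags gross errors in a set of GCP residuals using a robust score based on the median and the normalized median absolute deviation. This is one common rule chosen here as an assumption, not the procedure applied in this work; the function name, the threshold of 3 and the residual values are all hypothetical.

```python
import statistics


def flag_gross_errors(residuals: list[float], threshold: float = 3.0) -> list[bool]:
    """Flag residuals whose robust score exceeds `threshold`.

    The score is |r - median| / NMAD, where NMAD = 1.4826 * median(|r - median|).
    """
    med = statistics.median(residuals)
    nmad = 1.4826 * statistics.median([abs(r - med) for r in residuals])
    if nmad == 0:
        # All residuals (or most of them) are identical: nothing can be flagged reliably.
        return [False] * len(residuals)
    return [abs(r - med) / nmad > threshold for r in residuals]


# Hypothetical Z residuals in metres for seven targets; one mimics a blunder.
z_residuals = [0.02, -0.01, 0.03, 0.00, 0.21, -0.02, 0.01]
flags = flag_gross_errors(z_residuals)
print(flags)
```

A blunder of a few centimetres, comparable to the spread of the other residuals, would not be flagged by such a rule, which is the less obvious case the text warns about.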

CONCLUSIONS
In this work we compared four ultra-dense point clouds, using GCPs to check the photogrammetric products and point-to-point distances over all point cloud pairs. Results show that direct georeferencing, using camera center positions measured with GNSS RTK and IMU without GCPs, provides lower accuracy than using GCPs, but the error is limited to 12 cm in this case, which might be acceptable for some applications. Using GCPs improves planimetric accuracy, but the error in the Z direction can increase because of how ellipsoidal heights are converted to orthometric heights for the camera centers and for the GCPs. Human error is also a factor to take into consideration.